Is there a way of checking the content of an email and copying all the hyperlinks within the text to a single variable?
I’ve written a script that searches the content and subject of a selected email for the values I need. However, I am not sure how to approach the hyperlinks.
Hyperlinks in the emails will always start with the https:// prefix, but an email may contain several of them or none at all.
I want all hyperlinks stored in a single variable so I can drop them into a database using the finished script.
This is my first time using text item delimiters in AppleScript, so this might be a bit messy. The script below already searches for and stores several other values I need to pull from these emails:
set subjectSearch to "-"
set bSearch to "List:"
set qSearch to "Name"
set bkbSearch to "Age"
set qkbSearch to "Birth"
set refSearch to "#"
set theResults to {}

tell application "Mail"
	try
		set theMessages to selection
		if theMessages is {} then
			display alert "No Messages Selected" message "Select the messages you want to collect before running this script."
			return
		end if
		set dateList to {}
		set reportText to ""
		repeat with theMessage in theMessages
			set reportText to reportText & (content of theMessage) as string
		end repeat
		-- Placeholder for the links: for now this holds the whole content of the last selected message
		set theurls to (content of theMessage) as string
		repeat with tMsg in (get selection)
			set end of dateList to the short date string of (get date received of tMsg)
		end repeat
		repeat with aMessage in theMessages
			tell aMessage
				set theSubject to subject
				set theContent to content
			end tell
			-- Each pass overwrites these variables, so they end up holding the values from the last message
			set refNum to my getFirstWordAfterSearchText(refSearch, theSubject)
			set refB to my getFirstWordAfterSearchText(bSearch, theContent)
			set refQ to my getFirstWordAfterSearchText(qSearch, theContent)
			set refBkb to my getFirstWordAfterSearchText(bkbSearch, theContent)
			set refQkb to my getFirstWordAfterSearchText(qkbSearch, theContent)
		end repeat
	on error theError
		display dialog theError buttons {"OK"} default button 1
		return
	end try
end tell
return {dateList, refNum, refB, refQ, refBkb, refQkb}

-- Returns the first word that follows searchString in theText, or "" if searchString is not found
on getFirstWordAfterSearchText(searchString, theText)
	try
		set {tids, AppleScript's text item delimiters} to {AppleScript's text item delimiters, searchString}
		set textItems to text items of theText
		set AppleScript's text item delimiters to tids
		return (first word of (item 2 of textItems))
	on error theError
		return ""
	end try
end getFirstWordAfterSearchText
If this isn’t possible, would it be simpler to copy all the text that occurs after a certain word into a variable? The URLs I’m trying to grab always appear at the end of the email, so I could copy everything that appears after the word “URL’s”.
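For reference, that fallback can be sketched with the same text item delimiters technique used in getFirstWordAfterSearchText. This is only a minimal sketch: the handler name textAfterMarker and the sample text are made up for illustration, and it assumes the marker is typed exactly as it appears in the email (a curly apostrophe in “URL’s” would not match a straight one).

```applescript
-- Hypothetical helper: returns everything that follows the first occurrence of marker in theText,
-- or "" when the marker is absent
on textAfterMarker(marker, theText)
	if theText does not contain marker then return ""
	set {tids, AppleScript's text item delimiters} to {AppleScript's text item delimiters, marker}
	set theParts to text items of theText
	-- Rejoin with the marker as delimiter so that later occurrences of it are kept in the result
	set theTail to (items 2 thru -1 of theParts) as text
	set AppleScript's text item delimiters to tids
	return theTail
end textAfterMarker

-- Made-up sample text standing in for the content of a message
set sampleContent to "Name Adam" & linefeed & "URL's" & linefeed & "https://example.com/a"
textAfterMarker("URL's", sampleContent)
```

The result is the raw tail of the message, including the line break that follows the marker, so it would still need trimming or splitting before going into a database.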
Hello Adam
You may use this as a starting point.
You will not be surprised to read that most of the code was borrowed from Shane STANLEY.
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

script o
	property theSources : {}
	property theMessages : {}
end script

# Standalone handler returning the links found in a single string; it is not called below
on findURLsIn:theString
	set theNSString to current application's NSString's stringWithString:theString
	set theNSDataDetector to current application's NSDataDetector's dataDetectorWithTypes:(current application's NSTextCheckingTypeLink) |error|:(missing value)
	set theURLsNSArray to theNSDataDetector's matchesInString:theNSString options:0 range:{location:0, |length|:theNSString's |length|()}
	return (theURLsNSArray's valueForKeyPath:"URL.absoluteString") as list
end findURLsIn:

# Gather the raw source of every selected message
set o's theSources to {} # start empty, because property values may persist between runs
tell application "Mail"
	set o's theMessages to the selection
	repeat with aMessage in o's theMessages
		set end of o's theSources to source of aMessage
	end repeat
end tell

set theNSDataDetector to current application's NSDataDetector's dataDetectorWithTypes:(current application's NSTextCheckingTypeLink) |error|:(missing value)
set theArray to current application's NSMutableArray's array()
repeat with aSource in o's theSources
	set aSource to (current application's NSString's stringWithString:aSource) # Thanks to Nigel GARVEY, who pointed out this omission
	set theURLsNSArray to (theNSDataDetector's matchesInString:aSource options:0 range:{location:0, |length|:aSource's |length|()})
	set anArray to (theURLsNSArray's valueForKeyPath:"URL.absoluteString")
	(theArray's addObject:anArray)
end repeat
# theArray is an array of arrays: flatten it
set theArray to (theArray's valueForKeyPath:"@unionOfArrays.self")
# drop duplicates and sort the array
set theSet to current application's NSOrderedSet's orderedSetWithArray:theArray
set theArray to (theSet's array())'s sortedArrayUsingSelector:"localizedStandardCompare:"
# keep only the https links
set thePred to current application's NSPredicate's predicateWithFormat:"self BEGINSWITH 'https:'"
set theArray to theArray's filteredArrayUsingPredicate:thePred
# one link per line, in a single string
set theNSString to theArray's componentsJoinedByString:linefeed
theNSString as text
Edited according to Nigel GARVEY’s comment
Yvan KOENIG running Sierra 10.12.3 in French (VALLAURIS, France) Tuesday 14 March 2017 13:56:58
As I’m not sure what you really need, here is an alternative version.
A property lets it return either
– a list of strings, one per mail, each of which is empty or contains the links found in that mail separated by the substring " | ", or
– a list of lists, each holding the links found in one mail.
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

property buildListOfStrings : true
# true = return a list of strings, one per mail, each built by joining that mail's links with " | "
# false = return a list of lists, one list of links per mail

script o
	property theMessages : {}
	property theLinks : {}
end script

set o's theLinks to {} # start empty, because property values may persist between runs
tell application "Mail"
	set o's theMessages to the selection
	repeat with aMessage in o's theMessages
		set aSource to source of aMessage
		set end of o's theLinks to (my extractLinks:aSource)
	end repeat
end tell
o's theLinks

on extractLinks:aSource
	set theNSDataDetector to current application's NSDataDetector's dataDetectorWithTypes:(current application's NSTextCheckingTypeLink) |error|:(missing value)
	set aSource to (current application's NSString's stringWithString:aSource) # Thanks to Nigel GARVEY, who pointed out this omission
	set theURLsNSArray to (theNSDataDetector's matchesInString:aSource options:0 range:{location:0, |length|:aSource's |length|()})
	set theArray to (theURLsNSArray's valueForKeyPath:"URL.absoluteString")
	# drop duplicates and sort the array
	set theSet to current application's NSOrderedSet's orderedSetWithArray:theArray
	set theArray to (theSet's array())'s sortedArrayUsingSelector:"localizedStandardCompare:"
	# keep only the https links
	set thePred to current application's NSPredicate's predicateWithFormat:"self BEGINSWITH 'https://'"
	set theArray to theArray's filteredArrayUsingPredicate:thePred
	if buildListOfStrings then
		set theNSString to theArray's componentsJoinedByString:" | " # You may define another separator. If you use a comma you will be unable to know which link belongs to which mail
		return theNSString as text
	else
		return theArray as list
	end if
end extractLinks:
Edited according to Nigel GARVEY’s comment.
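If a single variable is wanted in the end, as the original question asks, the list of strings can be joined afterwards with text item delimiters. The snippet below is only a sketch: linksPerMessage is a made-up stand-in for the list the script returns when buildListOfStrings is true, and the example.com URLs are placeholders.

```applescript
-- Made-up stand-in for the result of the script above (one string per mail, empty when a mail has no link)
set linksPerMessage to {"https://example.com/a | https://example.com/b", "", "https://example.com/c"}

-- Join the per-mail strings into one text, one mail per line
set {tids, AppleScript's text item delimiters} to {AppleScript's text item delimiters, linefeed}
set allLinks to linksPerMessage as text
set AppleScript's text item delimiters to tids
allLinks
```

A mail without links contributes an empty line, which keeps the lines aligned with the selected messages; filter the empty strings out first if that is not wanted.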
Yvan KOENIG running Sierra 10.12.3 in French (VALLAURIS, France) Tuesday 14 March 2017 14:25:37
Hi Yvan.
I’m not sure it’s relevant here, but just to point out that ‘length of aSource’ returns the number of characters in the AppleScript text ‘aSource’, whereas the |length| of an NSString is measured in 16-bit (UTF-16) code units. These aren’t always the same, so ideally the latter should be used for the range value:
set aSource to current application's NSString's stringWithString:aSource
set theURLsNSArray to (theNSDataDetector's matchesInString:aSource options:0 range:{location:0, |length|:aSource's |length|()})
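A small sketch of the difference, assuming a sample string that contains one character outside the Basic Multilingual Plane (an emoji here); the variable names are made up for illustration:

```applescript
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

-- One ordinary letter followed by one emoji, which needs two UTF-16 code units
set sampleText to "a😀"

-- AppleScript's own character count
set asLength to length of sampleText

-- Number of UTF-16 code units, as NSString measures it
set nsLength to (current application's NSString's stringWithString:sampleText)'s |length|()

return {asLength, nsLength}
```

The two numbers should differ for this string, which is why a range built from the AppleScript length can stop short of the end of the NSString.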
What an ass.
Thanks Nigel. I knew that but failed to take care of it.
I edited my two posts.
Yvan KOENIG running Sierra 10.12.3 in French (VALLAURIS, France) Tuesday 14 March 2017 15:46:49