Sunday, May 28, 2017

#1 2017-03-14 05:48:22 am

Adam239
Member
Registered: 2016-10-13
Posts: 51

Pulling All Hyperlinks from Email Text & Storing in a Variable

Is there a way of checking the content of an email and copying all the hyperlinks within the text to a single variable?
I've written a script that searches a selected email content and subject for values that I need. However, I am not sure how to approach the hyperlinks.

Hyperlinks in the emails will always start with the HTTPS:// prefix but there may be several of them, or there may be none of them.

I want all hyperlinks stored in a single variable so I can drop them into a database using the finished script.

This is my first time using text item delimiters in AppleScript, so this might be a bit messy. The script below already searches for and stores several other values I need to pull from these emails:

Applescript:

set subjectSearch to "-"
set bSearch to "List:"
set qSearch to "Name"
set bkbSearch to "Age"
set qkbSearch to "Birth"
set refSearch to "#"
set theResults to {}

tell application "Mail"
   try
       set theMessages to selection
       if theMessages is {} then
           display alert "No Messages Selected" message "Select the messages you want to collect before running this script."
           return
       end if
       
       set dateList to {}
       set reportText to ""
       repeat with theMessage in theMessages
           set reportText to reportText & (content of theMessage) as string
       end repeat
       
       set theurls to (content of theMessage) as string
       
       repeat with tMsg in (get selection)
           set end of dateList to the short date string of (get date received of tMsg)
       end repeat
       
       repeat with aMessage in theMessages
           tell aMessage
               set theSubject to subject
               set theContent to content
           end tell
           set refNum to my getFirstWordAfterSearchText(refSearch, theSubject)
           set refB to my getFirstWordAfterSearchText(bSearch, theContent)
           set refQ to my getFirstWordAfterSearchText(qSearch, theContent)
           set refBkb to my getFirstWordAfterSearchText(bkbSearch, theContent)
           set refQkb to my getFirstWordAfterSearchText(qkbSearch, theContent)
       end repeat
   on error theError
       display dialog theError buttons {"OK"} default button 1
       return
   end try
end tell

on getFirstWordAfterSearchText(searchString, theText)
   try
       set {tids, AppleScript's text item delimiters} to {AppleScript's text item delimiters, searchString}
       set textItems to text items of theText
       set AppleScript's text item delimiters to tids
       return (first word of (item 2 of textItems))
   on error theError
       return ""
   end try
end getFirstWordAfterSearchText


return {dateList, refNum, refB, refQ, refBkb, refQkb}



running Sierra 10.12.4

Offline

 

#2 2017-03-14 05:54:02 am

Adam239
Member
Registered: 2016-10-13
Posts: 51

Re: Pulling All Hyperlinks from Email Text & Storing in a Variable

If this isn't possible, maybe it would be simpler to copy all the text that occurs after a certain word into a variable? The URLs I'm trying to grab always appear at the end of the email, so I could copy everything that appears after the word "URL's".
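
A minimal sketch of that fallback, using the same text item delimiters technique as getFirstWordAfterSearchText. The handler name getTextAfterMarker and the sample text are hypothetical, and the handler returns an empty string when the marker is absent.

Applescript:

on getTextAfterMarker(marker, theText)
   -- Split the text on the marker, then rejoin everything after its first occurrence
   set tids to AppleScript's text item delimiters
   set AppleScript's text item delimiters to marker
   set textItems to text items of theText
   if (count of textItems) < 2 then
       set AppleScript's text item delimiters to tids
       return "" -- marker not found
   end if
   -- Rejoining with the marker keeps any later occurrences of it intact
   set trailingText to (items 2 thru -1 of textItems) as text
   set AppleScript's text item delimiters to tids
   return trailingText
end getTextAfterMarker

-- Hypothetical sample standing in for the content of a message
set sampleText to "Name Bob" & linefeed & "URL's" & linefeed & "https://example.com/a" & linefeed & "https://example.com/b"
set theURLs to getTextAfterMarker("URL's", sampleText)

Note that this grabs everything after the marker, including any signature or footer text, so it only suits mails where the links really are the last thing in the body.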


running Sierra 10.12.4

Offline

 

#3 2017-03-14 07:57:56 am

Yvan Koenig
Member
Registered: 2006-09-14
Posts: 3007

Re: Pulling All Hyperlinks from Email Text & Storing in a Variable

Hello Adam

You may use this as a draft.
You will not be surprised to read that most of the code was borrowed from Shane STANLEY.

Applescript:

use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

script o
   property theSources : {}
   property theMessages : {}
end script

on findURLsIn:theString
   # Convert to NSString first so that the range length is measured the way NSDataDetector expects
   set theNSString to current application's NSString's stringWithString:theString
   set theNSDataDetector to current application's NSDataDetector's dataDetectorWithTypes:(current application's NSTextCheckingTypeLink) |error|:(missing value)
   set theURLsNSArray to theNSDataDetector's matchesInString:theNSString options:0 range:{location:0, |length|:theNSString's |length|()}
   return (theURLsNSArray's valueForKeyPath:"URL.absoluteString") as list
end findURLsIn:


set linksList to {}
tell application "Mail"
   set o's theMessages to the selection
   repeat with aMessage in o's theMessages
       set end of o's theSources to source of aMessage
   end repeat
end tell
--its findURLsIn:theContent
o's theSources as list

set theNSDataDetector to current application's NSDataDetector's dataDetectorWithTypes:(current application's NSTextCheckingTypeLink) |error|:(missing value)

set theArray to current application's NSMutableArray's array()
repeat with aSource in o's theSources
   set aSource to (current application's NSString's stringWithString:aSource) # Thanks to Nigel GARVEY, who pointed out this omission
   set theURLsNSArray to (theNSDataDetector's matchesInString:aSource options:0 range:{location:0, |length|:aSource's |length|()})
   set anArray to (theURLsNSArray's valueForKeyPath:"URL.absoluteString")
   (theArray's addObject:anArray)
end repeat
# theArray is an array of arrays
set theArray to (theArray's valueForKeyPath:"@unionOfArrays.self")
# drop duplicates and sort the array
set theSet to current application's NSOrderedSet's orderedSetWithArray:theArray
set theArray to (theSet's array())'s sortedArrayUsingSelector:"localizedStandardCompare:"

set thePred to current application's NSPredicate's predicateWithFormat:"self BEGINSWITH 'https:'"
set theArray to theArray's filteredArrayUsingPredicate:thePred
set theNSString to theArray's componentsJoinedByString:linefeed
theNSString as text

Edited according to Nigel GARVEY's comment

Yvan KOENIG running Sierra 10.12.3 in French (VALLAURIS, France) Tuesday 14 March 2017 13:56:58

Last edited by Yvan Koenig (2017-03-14 10:04:26 am)

Offline

 

#4 2017-03-14 08:25:41 am

Yvan Koenig
Member
Registered: 2006-09-14
Posts: 3007

Re: Pulling All Hyperlinks from Email Text & Storing in a Variable

As I'm not sure what you really need, here is an alternate version.
A property lets it return either:
-- a list of strings, one per mail, each of which is empty or contains the found links separated by the substring " | "
-- a list of lists, each holding the links available in one mail.

Applescript:

use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

property buildListOfStrings : true
# true = return a list of strings, each built by concatenating the links available in one mail using the string " | " as separator
# false = return a list of lists, each holding the links available in one mail

script o
   property theMessages : {}
   property theLinks : {}
end script

tell application "Mail"
   set o's theMessages to the selection
   repeat with aMessage in o's theMessages
       set aSource to source of aMessage
       set end of o's theLinks to (my extractLinks:aSource)
   end repeat
end tell

o's theLinks

on extractLinks:aSource
   set theNSDataDetector to current application's NSDataDetector's dataDetectorWithTypes:(current application's NSTextCheckingTypeLink) |error|:(missing value)
   set aSource to (current application's NSString's stringWithString:aSource) # Thanks to Nigel GARVEY, who pointed out this omission
   set theURLsNSArray to (theNSDataDetector's matchesInString:aSource options:0 range:{location:0, |length|:aSource's |length|()})
   set theArray to (theURLsNSArray's valueForKeyPath:"URL.absoluteString")
   # drop duplicates and sort the array
   set theSet to current application's NSOrderedSet's orderedSetWithArray:theArray
   set theArray to (theSet's array())'s sortedArrayUsingSelector:"localizedStandardCompare:"
   
   set thePred to current application's NSPredicate's predicateWithFormat:"self BEGINSWITH 'https://'"
   set theArray to theArray's filteredArrayUsingPredicate:thePred
   if buildListOfStrings then
       set theNSString to theArray's componentsJoinedByString:" | " # You may define another separator. If the list of strings is later joined with the same separator (a comma, for example), you will be unable to know which link belongs to which mail
       return theNSString as text
   else
       return theArray as list
   end if
end extractLinks:

Edited according to Nigel GARVEY's comment.
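
For anyone who wants to try the detection and filtering steps without Mail, here is a minimal sketch that runs them on a literal string. The sample text and the variable names are hypothetical, and the [c] modifier on BEGINSWITH is an addition of this sketch (the scripts above use a case-sensitive test).

Applescript:

use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

# Hypothetical sample text standing in for the source of a mail
set sampleText to "See https://example.com/a and http://example.org/b for details."
set sampleString to current application's NSString's stringWithString:sampleText
set theDetector to current application's NSDataDetector's dataDetectorWithTypes:(current application's NSTextCheckingTypeLink) |error|:(missing value)
set theMatches to theDetector's matchesInString:sampleString options:0 range:{location:0, |length|:sampleString's |length|()}
set allLinks to theMatches's valueForKeyPath:"URL.absoluteString"
# [c] makes the prefix test case-insensitive, so a link typed as "HTTPS://" would pass as well
set thePred to current application's NSPredicate's predicateWithFormat:"self BEGINSWITH[c] 'https://'"
set httpsLinks to (allLinks's filteredArrayUsingPredicate:thePred) as list

With this sample, httpsLinks should keep only the first link. A text with no links at all should simply give an empty list.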

Yvan KOENIG running Sierra 10.12.3 in French (VALLAURIS, France) Tuesday 14 March 2017 14:25:37

Last edited by Yvan Koenig (2017-03-14 10:07:00 am)

Offline

 

#5 2017-03-14 09:12:45 am

Nigel Garvey
Moderator
From: Warwickshire, England
Registered: 2002-11-19
Posts: 4285

Re: Pulling All Hyperlinks from Email Text & Storing in a Variable

Yvan Koenig wrote:

Applescript:

set theURLsNSArray to (theNSDataDetector's matchesInString:aSource options:0 range:{location:0, |length|:length of aSource})

Hi Yvan.

I'm not sure it's relevant here, but note that 'length of aSource' returns the number of characters in the AppleScript text 'aSource', whereas the |length| of an NSString is measured in 16-bit (UTF-16) units. These aren't always the same (an emoji, for example, is one character but takes two 16-bit units), so ideally the latter should be used for the range value:

Applescript:

set aSource to current application's NSString's stringWithString:aSource
set theURLsNSArray to (theNSDataDetector's matchesInString:aSource options:0 range:{location:0, |length|:aSource's |length|()})
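
A small sketch to see the difference for yourself. The sample text is hypothetical; character id 128578 is an emoji, which lies outside the range a single 16-bit unit can represent.

Applescript:

use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

-- Hypothetical sample: one emoji followed by a link
set sampleText to (character id 128578) & " https://example.com"
-- Length as AppleScript counts it, in characters
set asLength to length of sampleText
-- Length as Foundation counts it, in 16-bit units
set nsLength to (current application's NSString's stringWithString:sampleText)'s |length|()
return {asLength, nsLength}

The two numbers are expected to differ for a text like this. A range built from the smaller one would stop short of the end of the NSString, so a link at the very end of a mail could be cut off or missed.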

Last edited by Nigel Garvey (2017-03-14 09:14:17 am)


NG

Offline

 

#6 2017-03-14 09:46:54 am

Yvan Koenig
Member
Registered: 2006-09-14
Posts: 3007

Re: Pulling All Hyperlinks from Email Text & Storing in a Variable

What an ass.

Thanks Nigel. I knew that but failed to take care of it.

I edited my two posts.


Yvan KOENIG running Sierra 10.12.3 in French (VALLAURIS, France) Tuesday 14 March 2017 15:46:49

Last edited by Yvan Koenig (2017-03-14 10:07:25 am)

Offline

 
