Tuesday, September 17, 2019

#1 2019-04-18 11:11:44 pm

vivikun
Member
Registered: 2019-04-19
Posts: 3

Extract information from signature within body of email

I'm a beginner at AppleScript and would like some help understanding how parts of a script work. I found a solution for extracting the sender, date, and subject of each message and pasting them into a spreadsheet at https://southcoastweb.co.uk/export-mess … -to-excel/. What I am trying to accomplish next is to extract information from the body of an email and place it into an Excel file. A forum search turned up https://macscripter.net/viewtopic.php?id=45974, which is helpful, but in my case I would like to extract more information.

I found a post describing a similar problem at https://discussions.apple.com/thread/6777748, but I am not familiar enough with the syntax to adapt it, in particular the text quoted below:

/^Student Name:\\s*(.+?)\\s*$/o           && do { $data[$i]{name}  = $1; next; };
    /^Student eMail Address:\\s*(.+?)\\s*$/o  && do { $data[$i]{email} = $1; next; };
    /with a final grade of ([0-9.]+%)/o       && do { $data[$i]{score} = $1; next; };
    /^ \\s* ([0-9]+) \\s* correct    \\s* \\( [0-9.]+% \\)/ox && do { $data[$i]{correct}    = $1; next; };
    /^ \\s* ([0-9]+) \\s* incorrect  \\s* \\( [0-9.]+% \\)/ox && do { $data[$i]{incorrect}  = $1; next; };
    /^ \\s* ([0-9]+) \\s* unanswered \\s* \\( [0-9.]+% \\)/ox && do { $data[$i]{unanswered} = $1; next; };



What I am trying to achieve is to extract data from a signature within the body of an email, formatted as follows:

Sincerely,
John Doe
San Francisco, CA 94103
johndoe@email.com



Resulting in the following delimited data:
name, email, city, state,
John Doe, johndoe@email.com, San Francisco, CA,

This format is consistent across the majority of the emails, so I was thinking of filling in any gaps by matching against the Excel data I generated earlier. I would appreciate it if you could point me to an appropriate resource or help me out with the code.

Thanks!


#2 2019-04-19 12:05:24 pm

vivikun
Member
Registered: 2019-04-19
Posts: 3

Re: Extract information from signature within body of email

I made do with what I could follow and adapted the script below from the source I cited earlier. From what I gather, extracting additional information from a signature embedded in the body of an email is not, for lack of a better term, in the dictionary, and so will need a workaround. I will keep reading through the resources available here, which are amazingly helpful, by the way! I will post a follow-up when I make progress. Thank you!

Applescript:

tell application "Microsoft Excel"
   set LinkRemoval to make new workbook
   set theSheet to active sheet of LinkRemoval
   set formula of range "F1" of theSheet to "To"
   set formula of range "E1" of theSheet to "Reply to"
   set formula of range "D1" of theSheet to "Message"
   set formula of range "C1" of theSheet to "Subject"
   set formula of range "B1" of theSheet to "From"
   set formula of range "A1" of theSheet to "Date"
end tell

tell application "Mail"
   set theRow to 2
   get account
   set theMessages to messages of mailbox "Mailbox1"
   --to get the name of the mailbox run the following script below without the "--"
   -- tell application "Mail"
   -- get mailboxes
   --end tell
   repeat with aMessage in theMessages
       my SetDate(date received of aMessage, theRow, theSheet)
       my SetFrom(sender of aMessage, theRow, theSheet)
       my SetSubject(subject of aMessage, theRow, theSheet)
       my SetMessage(content of aMessage, theRow, theSheet)
       my SetReply(reply to of aMessage, theRow, theSheet)
       my SetRecipient(address of first recipient of aMessage, theRow, theSheet)
       set theRow to theRow + 1
   end repeat
end tell

on SetDate(theDate, theRow, theSheet)
   tell application "Microsoft Excel"
       set theRange to "A" & theRow
       set formula of range theRange of theSheet to theDate
   end tell
end SetDate

on SetFrom(theSender, theRow, theSheet)
   tell application "Microsoft Excel"
       set theRange to "B" & theRow
       set formula of range theRange of theSheet to theSender
   end tell
end SetFrom

on SetSubject(theSubject, theRow, theSheet)
   tell application "Microsoft Excel"
       set theRange to "C" & theRow
       set formula of range theRange of theSheet to theSubject
   end tell
end SetSubject

on SetMessage(theMessage, theRow, theSheet)
   tell application "Microsoft Excel"
       set theRange to "D" & theRow
       set formula of range theRange of theSheet to theMessage
   end tell
end SetMessage

on SetReply(theReply, theRow, theSheet)
   tell application "Microsoft Excel"
       set theRange to "E" & theRow
       set formula of range theRange of theSheet to theReply
   end tell
end SetReply

on SetRecipient(theRecipient, theRow, theSheet)
   tell application "Microsoft Excel"
       set theRange to "F" & theRow
       set formula of range theRange of theSheet to theRecipient
   end tell
end SetRecipient

Last edited by vivikun (2019-04-19 12:07:15 pm)


#3 2019-04-19 06:07:30 pm

Marc Anthony
Member
From: Dallas, TX
Registered: 2006-04-27
Posts: 893

Re: Extract information from signature within body of email

The syntax you reference in post #1 is a rather complicated regular expression pattern. As a novice scripter, the basic text manipulation concepts to learn first are how to use offsets and text item delimiters, and how to specify text ranges. There are some old tutorial articles in MacScripter's "unScripted" section, and the AppleScript Language Guide is also a good resource.

https://developer.apple.com/library/arc … -CH208-SW1
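As a quick illustration of the three concepts named above (offsets, text ranges, and text item delimiters), here is a minimal sketch applied to the city line of the sample signature from post #1. The variable names are only illustrative.

```applescript
-- Sample line taken from the signature in post #1.
set cityLine to "San Francisco, CA 94103"

-- 1. offset: the 1-based position of the first match (0 if there is none).
set commaPos to offset of "," in cityLine

-- 2. text ranges: slice by character position.
set city to text 1 thru (commaPos - 1) of cityLine
set stateCode to text (commaPos + 2) thru (commaPos + 3) of cityLine

-- 3. text item delimiters: split on a separator, then restore the old value.
set savedDelimiters to AppleScript's text item delimiters
set AppleScript's text item delimiters to ", "
set parts to text items of cityLine
set AppleScript's text item delimiters to savedDelimiters

{commaPos, city, stateCode, parts}
-- result: {14, "San Francisco", "CA", {"San Francisco", "CA 94103"}}
```

Saving and restoring the delimiters matters because they are global to the script: a value left behind by one handler silently changes how later list-to-text coercions behave.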

Here is some example code to get you started.

Applescript:

set thing to "whatever
something else
Sincerely,
John Doe
San Francisco, CA 94103
johndoe@email.com"



set counter to 1
repeat until thing's paragraph counter begins with "Sincerely,"
   set counter to counter + 1
end repeat

tell thing to set {nom, email, cityLine} to {paragraph (counter + 1), paragraph (counter + 3), paragraph (counter + 2)}
set setoff to offset of "," in cityLine
set city to cityLine's text 1 thru (setoff - 1)
{nom, city, email}
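
The same approach can be extended to pull out the state as well and to build the delimited line asked for in post #1. The sketch below assumes the signature always has the layout shown there (closing line, name, "City, ST ZIP", email); signatureFields is a hypothetical handler name, and it will raise an error if the closing line never appears.

```applescript
-- Hypothetical helper: returns {name, email, city, state} for a signature
-- laid out as: closing line, name, "City, ST 12345", email address.
on signatureFields(bodyText, closingLine)
	set counter to 1
	repeat until paragraph counter of bodyText begins with closingLine
		set counter to counter + 1
	end repeat
	set nom to paragraph (counter + 1) of bodyText
	set cityLine to paragraph (counter + 2) of bodyText
	set email to paragraph (counter + 3) of bodyText
	set commaPos to offset of "," in cityLine
	set city to text 1 thru (commaPos - 1) of cityLine
	set stateCode to text (commaPos + 2) thru (commaPos + 3) of cityLine
	return {nom, email, city, stateCode}
end signatureFields

set sample to "whatever
something else
Sincerely,
John Doe
San Francisco, CA 94103
johndoe@email.com"

set signatureParts to signatureFields(sample, "Sincerely,")

-- Join the list into one comma-separated line.
set savedDelimiters to AppleScript's text item delimiters
set AppleScript's text item delimiters to ", "
set csvLine to signatureParts as text
set AppleScript's text item delimiters to savedDelimiters

csvLine
-- result: "John Doe, johndoe@email.com, San Francisco, CA"
```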


#4 2019-04-25 06:10:16 pm

vivikun
Member
Registered: 2019-04-19
Posts: 3

Re: Extract information from signature within body of email

Thank you for sharing the references! They were extremely helpful. I have put together the code below to extract more information using the delimiters available in the emails sent in. The example below has a doubled "Sincerely," and uses a double space to separate the street from the city name.

Applescript:

set thing to "Thank you for the opportunity to express my support
Sincerely,

Sincerely,
John Doe
123 Road Dr  City Name, ST 12345-6789
jdoe@email.com"


set counter to 1
repeat until thing's paragraph counter begins with "Sincerely,"
   set counter to counter + 1
end repeat

tell thing to set {nom, email, cityLine} to {paragraph (counter + 3), paragraph (counter + 5), paragraph (counter + 4)}
set setoff to offset of "," in cityLine
set streetAndCity to cityLine's text 1 thru (setoff - 1) -- "123 Road Dr  City Name"
set savedDelimiters to AppleScript's text item delimiters -- save them for later
set AppleScript's text item delimiters to "  " -- two spaces separate the street from the city
set city3 to text item 2 of streetAndCity
set AppleScript's text item delimiters to savedDelimiters -- back to the original values
set state to cityLine's text (setoff + 2) thru (setoff + 3)

{nom, email, city3, state}

This works pretty well for the most part when wrapped in a try block. However, when I embed it in my other script, some parts of it change, mainly text being changed to rich text, which causes the error "Can't get text 1 thru 13 of..."

Applescript:

tell application "Mail"
   set theRow to 2
   get account
   set theMessages to messages of mailbox "Inbox"
       repeat with aMessage in theMessages
       
       set thing to content of aMessage as rich text
       set counter to 1
       repeat until thing's paragraph counter begins with "Sincerely," --change this to match the closing salutation
           set counter to counter + 1
       end repeat
       tell thing to set {nom, email, cityLine} to {paragraph (counter + 1), paragraph (counter + 3), paragraph (counter + 2)}
       set setoff to offset of "," in cityLine
       set city to cityLine's rich text 1 thru (setoff + 3)
       my SetName(nom, theRow, theSheet)
       my SetCity(city, theRow, theSheet)
       my SetEmail(email, theRow, theSheet)
       
       my SetDate(date received of aMessage, theRow, theSheet)
       my SetFrom(sender of aMessage, theRow, theSheet)
       my SetSubject(subject of aMessage, theRow, theSheet)
       my SetMessage(content of aMessage, theRow, theSheet)
       my SetReply(reply to of aMessage, theRow, theSheet)
       my SetRecipient(address of first recipient of aMessage, theRow, theSheet)
       set theRow to theRow + 1
   end repeat
end tell

I think this is because of how I am extracting the message from the inbox, and my lack of understanding of the different classes in scripting. See the version below, where the extraction is wrapped in try ... end try.

Applescript:

tell application "Mail"
   set theRow to 2
   get account
   set theMessages to messages of mailbox "Inbox"
       repeat with aMessage in theMessages
       try
           set thing to content of aMessage as rich text
           set counter to 1
           repeat until thing's paragraph counter begins with "Sincerely," --change this to match the closing salutation
               set counter to counter + 1
           end repeat
           tell thing to set {nom, email, cityLine} to {paragraph (counter + 3), paragraph (counter + 5), paragraph (counter + 4)}
           set setoff to offset of "," in cityLine
           set city to cityLine's rich text 1 thru (setoff + 3)
           my SetName(nom, theRow, theSheet)
           my SetCity(city, theRow, theSheet)
           my SetEmail(email, theRow, theSheet)
       end try
       my SetDate(date received of aMessage, theRow, theSheet)
       my SetFrom(sender of aMessage, theRow, theSheet)
       my SetSubject(subject of aMessage, theRow, theSheet)
       my SetMessage(content of aMessage, theRow, theSheet)
       my SetReply(reply to of aMessage, theRow, theSheet)
       my SetRecipient(address of first recipient of aMessage, theRow, theSheet)
       set theRow to theRow + 1
   end repeat
end tell
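
One way to sidestep the text / rich text substitution is to keep the string handling in a handler outside the tell application "Mail" block and call it with "my". This is a sketch under the assumption that the cause is Mail's dictionary taking over the word "text" inside its tell block; cityFromLine is a hypothetical handler name.

```applescript
-- Hypothetical handler. It lives outside any tell block, so "text" here is
-- AppleScript's own text class and is not rewritten to Mail's "rich text".
on cityFromLine(cityLine)
	set commaPos to offset of "," in cityLine
	if commaPos is 0 then return missing value -- no comma: not an address line
	return text 1 thru (commaPos - 1) of cityLine
end cityFromLine

-- Inside the Mail loop the call would look like this (it needs Mail, so it
-- is left commented out here):
--   set city to my cityFromLine(cityLine)

cityFromLine("San Francisco, CA 94103")
-- result: "San Francisco"
```

Returning missing value for a line without a comma lets the caller skip messages whose signature does not match, instead of relying on a try block to swallow the error.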

I'm running into a similar issue with the code below. What I do not understand is why my script stops working if I embed this handler in it. For some reason it doesn't like the "to" keyword being in there.

Edit: I just checked the dictionary and there is a "to" property, although it is part of the Standard Suite. Maybe I just don't have a clue.

Applescript:

to extractBetween(SearchText, startText, endText)
   set tid to AppleScript's text item delimiters -- save them for later.
   set AppleScript's text item delimiters to startText -- find the first one.
   set endItems to text of text item -1 of SearchText -- everything after the last occurrence of startText.
   set AppleScript's text item delimiters to endText -- find the end one.
   set beginningToEnd to text of text item 1 of endItems -- get the first part.
   set AppleScript's text item delimiters to tid -- back to original values.
   return beginningToEnd -- pass back the piece.
end extractBetween
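
A likely cause, offered as an assumption since the failing script is not shown: a handler definition ("to ..." or "on ...") has to sit at the top level of the script and cannot be placed inside a tell block. From inside a tell application block, the handler is then called with "my". A minimal sketch, with a compact copy of the handler so that it runs on its own:

```applescript
-- Top level of the script: handler definitions go here, never inside a tell block.
to extractBetween(searchText, startText, endText)
	set tid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to startText
	set afterStart to text item -1 of searchText
	set AppleScript's text item delimiters to endText
	set middlePart to text item 1 of afterStart
	set AppleScript's text item delimiters to tid
	return middlePart
end extractBetween

set sample to "Sincerely," & linefeed & "John Doe" & linefeed & "San Francisco, CA 94103"

-- "my" is required when the call sits inside a tell application block
-- and is harmless at the top level, as here.
set nom to my extractBetween(sample, "Sincerely," & linefeed, linefeed)
nom
-- result: "John Doe"
```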

Last edited by vivikun (2019-04-25 06:17:51 pm)


#5 2019-04-26 03:08:49 am

Yvan Koenig
Member
Registered: 2006-09-14
Posts: 3576

Re: Extract information from signature within body of email

Here this code behaves flawlessly:

Applescript:

tell application "Mail"
   tell account "iCloud"
       set theMessages to messages of mailbox "INBOX"
       
       set theRow to 2
       repeat with aMessage in theMessages
           
           set thing to content of aMessage -- no need for "as rich text": the content already IS a rich text object
           set counter to 1
           try
               repeat until thing's paragraph counter begins with "Sincerely"
                   --change this to match the closing salutation
                   set counter to counter + 1
               end repeat
               set theDate to date received of aMessage
               set theSender to sender of aMessage
               set theSubject to subject of aMessage
               --set theMessage to content of aMessage # it's already in thing
               set theReply to reply to of aMessage
               set theRecipient to address of first recipient of aMessage
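               -- theRow and theSheet are local to the run handler, so pass them in explicitly (theSheet comes from the Excel block in post #2)
               my extractor(counter, thing, theDate, theSender, theSubject, theReply, theRecipient, theRow, theSheet)
           end try
           
           
           set theRow to theRow + 1
       end repeat
   end tell
end tell

on extractor(counter, thing, theDate, theSender, theSubject, theReply, theRecipient, theRow, theSheet)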
   tell thing to set {nom, email, cityLine} to {paragraph (counter + 1), paragraph (counter + 3), paragraph (counter + 2)}
   set setoff to offset of "," in cityLine
   set city to cityLine's text 1 thru (setoff + 3)
   
   my SetName(nom, theRow, theSheet)
   my SetCity(city, theRow, theSheet)
   my SetEmail(email, theRow, theSheet)
   
   my SetDate(theDate, theRow, theSheet)
   my SetFrom(theSender, theRow, theSheet)
   my SetSubject(theSubject, theRow, theSheet)
   my SetMessage(thing, theRow, theSheet)
   my SetReply(theReply, theRow, theSheet)
   my SetRecipient(theRecipient, theRow, theSheet)
   
end extractor

Yvan KOENIG running High Sierra 10.13.6 in French (VALLAURIS, France) Friday, 26 April 2019 11:08:42


#6 2019-05-07 01:50:37 am

LucreziaG
Member
From: Geneve - Switzerland
Registered: 2019-05-07
Posts: 1

Re: Extract information from signature within body of email

vivikun wrote:
(the script from post #2, which copies messages from Mail into a new Excel workbook)

I had the same problem and tested this script to copy emails from Mail.app to Excel. It works fine, so thank you so much for this great help. Just one question: it opens a new, unnamed Excel workbook (WB1, WB2, ...). I would like the script to open it with a specific name (e.g. ORDERS) in a specific folder (e.g. WAREHOUSE), so that its path, which is always the same, can be returned to an Excel macro.
Can you tell me whether this is possible and which instruction I should add? Thanks in advance for your courtesy.
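
One possible approach is Excel's "save workbook as" command, called right after the workbook is created. This is only a sketch: it assumes a WAREHOUSE folder already exists inside the Documents folder, that the file should be called ORDERS.xlsx, and that the Excel version in use picks a suitable file format when none is given. The names savePath, newBook, and savedBook are illustrative.

```applescript
-- Assumption: a folder named WAREHOUSE already exists in the Documents folder.
set savePath to (path to documents folder as text) & "WAREHOUSE:ORDERS.xlsx"

tell application "Microsoft Excel"
	set newBook to make new workbook
	save workbook as newBook filename savePath
	-- Saving renames the workbook, so fetch fresh references afterwards
	-- rather than reusing the ones made before the save.
	set savedBook to active workbook
	set theSheet to active sheet of savedBook
end tell

-- POSIX form of the same path, for example to hand to an Excel macro.
POSIX path of savePath
```

The rest of the script from post #2 can then keep using theSheet as before, and the path stays the same from run to run.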

Model: PowerBook late 2009
AppleScript: Script editor V. 2.8.1 (183.1)
Browser: Chrome v. 73.0.3683.103
Operating System: macOS 10.11


I love sailing, biking & Macintoshing
