Extract information from signature within body of email

I’m a beginner in AppleScript and would like some help understanding how parts of the scripting work. I found a solution for extracting the sender, date, and subject and pasting them into a spreadsheet at https://southcoastweb.co.uk/export-messages-from-mail-to-excel/. What I am trying to accomplish next is to extract information from the body of an email and place it into an Excel file. A forum search yielded https://macscripter.net/viewtopic.php?id=45974, which is helpful, but in my case I would like to extract more information.

I found a post that is similar to my problem, https://discussions.apple.com/thread/6777748, but I am not familiar enough with the syntax to replicate it, in particular the text quoted below.

What I try to achieve is to extract data from a signature within the body of an e-mail formatted as follows:

Resulting in the following delimited data:
name, email, city, state,
John Doe, johndoe@email.com, San Francisco, CA,

This format is consistent for the majority of the emails, so I was thinking of filling in the gaps by matching it against the Excel data I generated earlier. I would appreciate it if you could point me to an appropriate resource or help me out with the code.

Thanks!

I made do with what I could follow and adapted the following script from the source I cited earlier. From what I gather, extracting additional information from the signature embedded in the body of the email is not, for lack of a better term, in the dictionary, and thus will need a workaround. I will keep reading through the sources available here, which are amazingly helpful by the way! I will post a follow-up when I make progress. Thank you!

tell application "Microsoft Excel"
	set LinkRemoval to make new workbook
	set theSheet to active sheet of LinkRemoval
	set formula of range "F1" of theSheet to "To"
	set formula of range "E1" of theSheet to "Reply to"
	set formula of range "D1" of theSheet to "Message"
	set formula of range "C1" of theSheet to "Subject"
	set formula of range "B1" of theSheet to "From"
	set formula of range "A1" of theSheet to "Date"
end tell

tell application "Mail"
	set theRow to 2
	get account
	set theMessages to messages of mailbox "Mailbox1"
	--to get the name of the mailbox run the following script below without the "--"
	-- tell application "Mail"
	-- get mailboxes
	--end tell
	repeat with aMessage in theMessages
		my SetDate(date received of aMessage, theRow, theSheet)
		my SetFrom(sender of aMessage, theRow, theSheet)
		my SetSubject(subject of aMessage, theRow, theSheet)
		my SetMessage(content of aMessage, theRow, theSheet)
		my SetReply(reply to of aMessage, theRow, theSheet)
		my SetRecipient(address of first recipient of aMessage, theRow, theSheet)
		set theRow to theRow + 1
	end repeat
end tell

on SetDate(theDate, theRow, theSheet)
	tell application "Microsoft Excel"
		set theRange to "A" & theRow
		set formula of range theRange of theSheet to theDate
	end tell
end SetDate

on SetFrom(theSender, theRow, theSheet)
	tell application "Microsoft Excel"
		set theRange to "B" & theRow
		set formula of range theRange of theSheet to theSender
	end tell
end SetFrom

on SetSubject(theSubject, theRow, theSheet)
	tell application "Microsoft Excel"
		set theRange to "C" & theRow
		set formula of range theRange of theSheet to theSubject
	end tell
end SetSubject

on SetMessage(theMessage, theRow, theSheet)
	tell application "Microsoft Excel"
		set theRange to "D" & theRow
		set formula of range theRange of theSheet to theMessage
	end tell
end SetMessage

on SetReply(theReply, theRow, theSheet)
	tell application "Microsoft Excel"
		set theRange to "E" & theRow
		set formula of range theRange of theSheet to theReply
	end tell
end SetReply

on SetRecipient(theRecipient, theRow, theSheet)
	tell application "Microsoft Excel"
		set theRange to "F" & theRow
		set formula of range theRange of theSheet to theRecipient
	end tell
end SetRecipient

The syntax you reference in post #1 is a rather complicated regular expression pattern. As a novice scripter, the basic text manipulation concepts to learn first are how to use offsets and text item delimiters, and how to specify text ranges. There are some old tutorial articles in MacScripter’s “unScripted” section, and the AppleScript Language Guide is also a good resource.

https://developer.apple.com/library/archive/documentation/AppleScript/Conceptual/AppleScriptLangGuide/introduction/ASLR_intro.html#//apple_ref/doc/uid/TP40000983-CH208-SW1

Here is some example code to get you started.

set thing to "whatever
something else
Sincerely,
John Doe
San Francisco, CA 94103
johndoe@email.com"


set counter to 1
repeat until thing's paragraph counter begins with "Sincerely,"
	set counter to counter + 1
end repeat

tell thing to set {nom, email, cityLine} to {paragraph (counter + 1), paragraph (counter + 3), paragraph (counter + 2)}
set setoff to offset of "," in cityLine
set city to cityLine's text 1 thru (setoff - 1)
{nom, city, email}
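
The example above uses an offset and a text range. As a complement, here is a minimal sketch of the other technique mentioned, text item delimiters, applied to the same kind of city line. The sample string and the variable names (savedTIDs, stateAndZip, zipCode) are only illustrative, and the sketch assumes the line always has the form "City, ST 12345".

```applescript
-- Hypothetical sample line; real messages may be formatted differently.
set cityLine to "San Francisco, CA 94103"

set savedTIDs to AppleScript's text item delimiters -- remember the current setting
set AppleScript's text item delimiters to ", "
set {city, stateAndZip} to text items of cityLine -- split at the comma
set AppleScript's text item delimiters to space
set {state, zipCode} to text items of stateAndZip -- split at the space
set AppleScript's text item delimiters to savedTIDs -- put the delimiters back

{city, state, zipCode} -- {"San Francisco", "CA", "94103"}
```

Saving and restoring the delimiters keeps later list-to-text coercions in the same script from picking up a leftover delimiter.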

Thank you for sharing the references! They were extremely helpful. I have put together the code below to extract more information using the delimiters available in the emails sent in. The example below has a doubled “Sincerely,” and uses two spaces to separate the street from the city name.

set thing to "Thank you for the opportunity to express my support
Sincerely,

Sincerely,
John Doe
123 Road Dr  City Name, ST 12345-6789
jdoe@email.com"

set counter to 1
repeat until thing's paragraph counter begins with "Sincerely,"
	set counter to counter + 1
end repeat

tell thing to set {nom, email, cityLine} to {paragraph (counter + 3), paragraph (counter + 5), paragraph (counter + 4)}
set setoff to offset of "," in cityLine
set city to cityLine's text 1 thru (setoff + 3)
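set savedTIDs to AppleScript's text item delimiters -- remember the current delimiters
set AppleScript's text item delimiters to ","
set city2 to text item 1 of city -- street and city, without the state
set AppleScript's text item delimiters to "  " -- two spaces separate street from city
set city3 to text item 2 of city2
set AppleScript's text item delimiters to savedTIDs -- restore the original delimiters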
set state to cityLine's text (setoff + 2) thru (setoff + 3)

{nom, email, city3, state}

This works pretty well for the most part when wrapped in a try block. However, when I embed it in my other script, parts of it change on compile, mainly text becoming rich text, which causes the error “Can’t get text 1 thru 13 of…”

tell application "Mail"
	set theRow to 2
	get account
	set theMessages to messages of mailbox "Inbox"
	repeat with aMessage in theMessages
		
		set thing to content of aMessage as rich text
		set counter to 1
		repeat until thing's paragraph counter begins with "Sincerely," --change this according to the closing salutation
			set counter to counter + 1
		end repeat
		tell thing to set {nom, email, cityLine} to {paragraph (counter + 1), paragraph (counter + 3), paragraph (counter + 2)}
		set setoff to offset of "," in cityLine
		set city to cityLine's rich text 1 thru (setoff + 3)
		my SetName(nom, theRow, theSheet)
		my SetCity(city, theRow, theSheet)
		my SetEmail(email, theRow, theSheet)
		
		my SetDate(date received of aMessage, theRow, theSheet)
		my SetFrom(sender of aMessage, theRow, theSheet)
		my SetSubject(subject of aMessage, theRow, theSheet)
		my SetMessage(content of aMessage, theRow, theSheet)
		my SetReply(reply to of aMessage, theRow, theSheet)
		my SetRecipient(address of first recipient of aMessage, theRow, theSheet)
		set theRow to theRow + 1
	end repeat
end tell

I think this is because of how I am extracting the message from the inbox, and my lack of understanding of the different classes in scripting. See the part of the script between try and end try.

tell application "Mail"
	set theRow to 2
	get account
	set theMessages to messages of mailbox "Inbox"
	repeat with aMessage in theMessages
		try
			set thing to content of aMessage as rich text
			set counter to 1
			repeat until thing's paragraph counter begins with "Sincerely," --change this according to the closing salutation
				set counter to counter + 1
			end repeat
			tell thing to set {nom, email, cityLine} to {paragraph (counter + 3), paragraph (counter + 5), paragraph (counter + 4)}
			set setoff to offset of "," in cityLine
			set city to cityLine's rich text 1 thru (setoff + 3)
			my SetName(nom, theRow, theSheet)
			my SetCity(city, theRow, theSheet)
			my SetEmail(email, theRow, theSheet)
		end try
		my SetDate(date received of aMessage, theRow, theSheet)
		my SetFrom(sender of aMessage, theRow, theSheet)
		my SetSubject(subject of aMessage, theRow, theSheet)
		my SetMessage(content of aMessage, theRow, theSheet)
		my SetReply(reply to of aMessage, theRow, theSheet)
		my SetRecipient(address of first recipient of aMessage, theRow, theSheet)
		set theRow to theRow + 1
	end repeat
end tell
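
A side note on the try block above: it also hides the error raised when a message has no line starting with the closing at all, because the repeat loop then runs past the last paragraph. One alternative, sketched here under the assumption that the body is available as plain text, is a bounded search that returns 0 when nothing matches. The handler name findClosing and its parameters are hypothetical, not part of the scripts in this thread.

```applescript
-- Hypothetical helper: returns the index of the first paragraph that begins
-- with closingText, or 0 when no paragraph matches.
on findClosing(bodyText, closingText)
	set allParagraphs to paragraphs of bodyText
	repeat with i from 1 to count of allParagraphs
		if item i of allParagraphs begins with closingText then return i
	end repeat
	return 0
end findClosing

-- Hypothetical sample standing in for the content of a message.
set thing to "Thank you for the opportunity to express my support
Sincerely,
John Doe"

findClosing(thing, "Sincerely,") -- 2 for this sample
```

A caller can then skip the signature extraction when the result is 0 instead of relying on try to swallow the error.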

I’m running up against a similar issue with the code below. What I do not understand is why my code stops working when I embed the handler below. For some reason it doesn’t like the to being in there.

Edit: I just checked the dictionary and there is a to property, although it is part of the standard suite. Maybe I just don’t have a clue.

to extractBetween(SearchText, startText, endText)
	set tid to AppleScript's text item delimiters -- save them for later.
	set AppleScript's text item delimiters to startText -- find the first one.
	set endItems to text of text item -1 of SearchText -- everything after the first.
	set AppleScript's text item delimiters to endText -- find the end one.
	set beginningToEnd to text of text item 1 of endItems -- get the first part.
	set AppleScript's text item delimiters to tid -- back to original values.
	return beginningToEnd -- pass back the piece.
end extractBetween
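
For context, in a handler definition to is simply a synonym for on, so it is not a Mail property here. A likely cause of the failure, assuming the handler was pasted inside the tell application "Mail" block, is that handlers can only be defined at the top level of a script, and a call made from inside a tell block needs my in front of it. The sketch below restates the handler with that layout; sampleBody and theCity are hypothetical names used only for illustration.

```applescript
-- Defined at the top level of the script, outside any tell block.
-- "to" and "on" are interchangeable in a handler definition.
to extractBetween(searchText, startText, endText)
	set tid to AppleScript's text item delimiters -- save the current delimiters
	set AppleScript's text item delimiters to startText
	set afterStart to text item -1 of searchText -- text after the last startText
	set AppleScript's text item delimiters to endText
	set betweenText to text item 1 of afterStart -- text before the next endText
	set AppleScript's text item delimiters to tid -- restore the delimiters
	return betweenText
end extractBetween

-- Hypothetical sample text standing in for a message body.
set sampleBody to "Name: John Doe; City: San Francisco; State: CA;"

tell application "Mail"
	-- "my" sends the call to the script itself rather than to Mail.
	set theCity to my extractBetween(sampleBody, "City: ", ";")
end tell

theCity -- "San Francisco"
```

Note that text item -1 picks up whatever follows the last occurrence of startText, so the start marker should be unique in the text being searched.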

Here this code behaves flawlessly:

tell application "Mail"
	tell account "iCloud"
		set theMessages to messages of mailbox "INBOX"
		
		set theRow to 2
		repeat with aMessage in theMessages
			
			set thing to content of aMessage -- no "as rich text" coercion needed: the content is already a rich text object
			set counter to 1
			try
				repeat until thing's paragraph counter begins with "Sincerely"
					--change this according to the closing salutation
					set counter to counter + 1
				end repeat
				set theDate to date received of aMessage
				set theSender to sender of aMessage
				set theSubject to subject of aMessage
				--set theMessage to content of aMessage # it's already in thing
				set theReply to reply to of aMessage
				set theRecipient to address of first recipient of aMessage
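				-- theRow and theSheet are passed along because a handler cannot see the caller's variables;
				-- theSheet is expected to come from the Excel block at the top of the full script
				my extractor(counter, thing, theDate, theSender, theSubject, theReply, theRecipient, theRow, theSheet)
			end try
			
			
			set theRow to theRow + 1
		end repeat
	end tell
end tell

on extractor(counter, thing, theDate, theSender, theSubject, theReply, theRecipient, theRow, theSheet)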
	tell thing to set {nom, email, cityLine} to {paragraph (counter + 1), paragraph (counter + 3), paragraph (counter + 2)}
	set setoff to offset of "," in cityLine
	set city to cityLine's text 1 thru (setoff + 3)
	
	my SetName(nom, theRow, theSheet)
	my SetCity(city, theRow, theSheet)
	my SetEmail(email, theRow, theSheet)
	
	my SetDate(theDate, theRow, theSheet)
	my SetFrom(theSender, theRow, theSheet)
	my SetSubject(theSubject, theRow, theSheet)
	my SetMessage(thing, theRow, theSheet)
	my SetReply(theReply, theRow, theSheet)
	my SetRecipient(theRecipient, theRow, theSheet)
	
end extractor

Yvan KOENIG running High Sierra 10.13.6 in French (VALLAURIS, France) Friday, 26 April 2019 11:08:42
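
The script above moves the text range out of the tell application "Mail" block and into a handler, which is where the earlier report of text being recompiled as rich text no longer applies. A minimal sketch of that idea in isolation follows, assuming the city line has the form "City, ST 12345". The handler name cityAndState is hypothetical.

```applescript
-- Hypothetical helper, defined at the top level so that "text" keeps its
-- standard AppleScript meaning rather than Mail's terminology.
on cityAndState(cityLine)
	set commaPos to offset of "," in cityLine
	-- No comma, or too little text after it: not a usable city line.
	if commaPos is 0 or (commaPos + 3) > (length of cityLine) then return missing value
	return cityLine's text 1 thru (commaPos + 3) -- up to the two-letter state
end cityAndState

cityAndState("San Francisco, CA 94103") -- "San Francisco, CA"
```

From inside the Mail block the call would be written as my cityAndState(cityLine), so that only the message access runs under Mail's terminology.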

I had the same problem and tested this script to copy emails from Mail.app to Excel. It works fine, so thank you so much for this great help. Just one question: it opens a new, unnamed Excel workbook (WB1, 2 …). I would like the script to create it with a specific name (e.g. ORDERS) in a specific folder (e.g. WAREHOUSE), so that its path, which is always the same, can be returned to an Excel macro.
Can you tell me whether this is possible and which instruction I should add? Thanks in advance for your help.

Model: PowerBook late 2009
AppleScript: Script editor V. 2.8.1 (183.1)
Browser: Chrome v. 73.0.3683.103
Operating System: macOS 10.11