I’m a script dunce who’s wondering if Applescript can solve a specific problem I have - I’m running OE5.06 in Classic on OS X 10.2.6, and I’m trying to extract data from incoming emails generated by a formmail.cgi. The fields include name, address 1/2, email etc.
I’m aiming to output the data in tab-delimited text format.
Am I stupid or is this possible? If so is there someone who likes a challenge willing to write it for me?
javascript:emoticon(‘:D’)
You certainly can do it, I think everyone should have a script that processes incoming emails.
Processing each line kind of depends on how your CGI formats the email though. I have one that uses the common structure of:
fieldOne=some data typed in a field
fieldTwo=some data selected from a menu
fildThree=some data from a radio button
So that is what I used in the following script. You can change the way it processes each line to fit how your CGI formats your emails.
--this is the file we'll write the extracted data to.
property logFile : alias "Macintosh HD:Desktop Folder:text file.txt"
--set up a variable for the carraige return
property carraigeReturn : (ASCII character 13)
tell application "Outlook Express"
--get a list of the current messages
--will return messages currently manually selected, or all messages being received
--when the RULE is triggered. (see below)
set lisCurrentMsgs to get the current messages
--repeat with every current message
--(in case more than one match the RULE on receipt)
repeat with alsThisMsg in lisCurrentMsgs
set strBody to the content of alsThisMsg --get the body of this message
--store the current TID's
set oldDelims to AppleScript's text item delimiters
--set the TID's to return
set AppleScript's text item delimiters to return
--get the text items (every item delimited by return) of the body of this email
set submittedItems to the text items of strBody
--reset the TID's
set AppleScript's text item delimiters to oldDelims
repeat with thisItem in submittedItems
--repeat with every text item of the body of this email (we split it up with return as a delimiter)
if thisItem contains "Who:=" then --then this line contains the data for who
set charCount to the (count of characters in (thisItem as string)) --get the character count for this line
set whoField to characters 6 thru charCount of thisItem as string
else if thisItem contains "Email:=" then --this this line contains the email address
set charCount to the (count of characters in (thisItem as string)) --get the character count for this line
set emailField to characters 8 thru charCount of thisItem as string
--add as many else if's as you need and edit the appendText line below to accomodate
end if
end repeat
--now create a tab delim'ed record from the harvested data and write to a file
set appendText to whoField & tab & emailField as string --concatenate the data to form the record
set getEOF to (get eof logFile) + 1 --get the end of the log file and add one (this is where we'll start writing)
write appendText & carriageReturn to logFile starting at getEOF --write to the file
end repeat
end tell
Save the script in OE’s Script Menu Items folder and set up a rule to match some criteria from the emails your CGI send. If the email address is unique, use that, or use the subject line. Make sure the rule action performs the script you just saved, and maybe moves the email or deletes it.
I’ve done my newbie best to compile this from what you supplied and below is where I got to. However I’m getting an execution error: "the variable ‘nameField’ is not defined’.
Can you diagnose this and any other glaring errors? Much appreciated.
set lisCurrentMsgs to get the current messages
repeat with alsThisMsg in lisCurrentMsgs
set strBody to the content of alsThisMsg
set oldDelims to AppleScript's text item delimiters
set AppleScript's text item delimiters to return
set submittedItems to the text items of strBody
set AppleScript's text item delimiters to oldDelims
repeat with thisItem in submittedItems
if thisItem contains "Name:=" then
set charCount to the (count of characters in (thisItem as string))
set nameField to characters 6 thru charCount of thisItem as string
else if thisItem contains "Address1:=" then
set charCount to the (count of characters in (thisItem as string))
set address1Field to characters 8 thru charCount of thisItem as string
else if thisItem contains "Suburb:=" then
set charCount to the (count of characters in (thisItem as string))
set suburbField to characters 8 thru charCount of thisItem as string
else if thisItem contains "Town/City:=" then
set charCount to the (count of characters in (thisItem as string))
set towncityField to characters 8 thru charCount of thisItem as string
else if thisItem contains "Country:=" then
set charCount to the (count of characters in (thisItem as string))
set countryField to characters 8 thru charCount of thisItem as string
else if thisItem contains "Email:=" then
set charCount to the (count of characters in (thisItem as string))
set emailField to characters 8 thru charCount of thisItem as string
else if thisItem contains "Phone:=" then
set charCount to the (count of characters in (thisItem as string))
set phoneField to characters 8 thru charCount of thisItem as string
else if thisItem contains "findout method:=" then
set charCount to the (count of characters in (thisItem as string))
set findoutField to characters 8 thru charCount of thisItem as string
else if thisItem contains "Other:=" then
set charCount to the (count of characters in (thisItem as string))
set otherField to characters 8 thru charCount of thisItem as string
else if thisItem contains "PRODUCT CHOICE:=" then
set charCount to the (count of characters in (thisItem as string))
set productchoiceField to characters 8 thru charCount of thisItem as string
end if
end repeat
set appendText to nameField & tab & address1Field & tab & suburbField & tab & towncityField & tab & countryField & tab & emailField & tab & phoneField & tab & findoutField & tab & otherField & tab & productchoiceField as string
set getEOF to (get eof logFile) + 1
write appendText & carriageReturn to logFile starting at getEOF
end repeat
end tell
The error appears to be generated when you start setting up the text to append. And it is due to the script never entering the if thisItem contains “Name:=” then block. Apparently the email that is generated by the CGI does not supply a line that contains “Name:=” so it never sets the variable “nameField”. I used that format only as an example, (Field Name:=). You will have to use what the CGI actually supplies as the data heading, and adjust the character count to trim it down.
So, if the CGI sending the email formats it like:
Your code would be:
if thisItem contains "Name=" then
set charCount to the (count of characters in (thisItem as string))
set nameField to characters 5 thru charCount of thisItem as string
You will have to open an email sent by the CGI to figure out how it is being formatted.
I’m not entirely sure on your property either. I’ve just never seen the path to a Mac file portrayed with “/”. Usually you would use a colon “:” but maybe you have a reason for that. Otherwise your code looks good.
If you are still having problems, post the actual email the CGI is sending. Feel free to substitute any sensitive information but leave the text the CGI sends as the data headers.
Ok, I think I’m close to getting this - but now I’m stumped again. I’ve copied a CGI email here as suggested:
Here is where I got to with the code, if I understood you correctly, but when I run it in ScriptEditor I get 'OE got an error: can’t make ‘MacintoshHD:Desktop:mailoutput.txt’ into a file. The filename looks ok to me - any ideas?
property logFile : "MacintoshHD:Desktop:mailoutput.txt"
property carriageReturn : (ASCII character 13)
on run
tell application "Outlook Express"
set lisCurrentMsgs to get the current messages
repeat with alsThisMsg in lisCurrentMsgs
set strBody to the content of alsThisMsg
set oldDelims to AppleScript's text item delimiters
set AppleScript's text item delimiters to return
set submittedItems to the text items of strBody
set AppleScript's text item delimiters to oldDelims
repeat with thisItem in submittedItems
if thisItem contains "Name:" then
set charCount to the (count of characters in (thisItem as string))
set nameField to characters 5 thru charCount of thisItem as string
else if thisItem contains "Address1:" then
set charCount to the (count of characters in (thisItem as string))
set address1Field to characters 9 thru charCount of thisItem as string
else if thisItem contains "Suburb:" then
set charCount to the (count of characters in (thisItem as string))
set suburbField to characters 7 thru charCount of thisItem as string
else if thisItem contains "Town/City:" then
set charCount to the (count of characters in (thisItem as string))
set towncityField to characters 10 thru charCount of thisItem as string
else if thisItem contains "Country:" then
set charCount to the (count of characters in (thisItem as string))
set countryField to characters 8 thru charCount of thisItem as string
else if thisItem contains "Email:" then
set charCount to the (count of characters in (thisItem as string))
set emailField to characters 6 thru charCount of thisItem as string
else if thisItem contains "Phone:" then
set charCount to the (count of characters in (thisItem as string))
set phoneField to characters 6 thru charCount of thisItem as string
else if thisItem contains "findout method:" then
set charCount to the (count of characters in (thisItem as string))
set findoutField to characters 15 thru charCount of thisItem as string
else if thisItem contains "Other:" then
set charCount to the (count of characters in (thisItem as string))
set otherField to characters 6 thru charCount of thisItem as string
else if thisItem contains "PRODUCT CHOICE:" then
set charCount to the (count of characters in (thisItem as string))
set productchoiceField to characters 15 thru charCount of thisItem as string
end if
end repeat
set appendText to nameField & tab & address1Field & tab & suburbField & tab & towncityField & tab & countryField & tab & emailField & tab & phoneField & tab & findoutField & tab & otherField & tab & productchoiceField as string
set getEOF to (get eof logFile) + 1
write appendText & carriageReturn to logFile starting at getEOF
end repeat
end tell
end run
The logFile property in the script you posted had the word “Folder” missing after “Desktop” And, unless your drive name is MacintoshHD, (without a space between Macintosh & HD) then your script would have a problem finding the file to write to it. Make sure the property correctly points to the file. Jonn’s suggestion for coding the property is a good one, you can use either. I see no reason your script wouldn’t work after fixing that property.
Thanks for clarifying the logfile point, I’ve addressed that - but it’s not quite resolved yet and I think it might be to do with the CGI headers? Forgive my ignorance, but you may have guessed I’m not a CGI expert either! The first set of sample code I posted didn’t show all the headers - e.g, on examining, I note in each case there’s a unique ‘id’ number. Is this relevant? I’ve pasted some complete code as an example here:
Return-Path: <xxxxxxxx@phosphor.host4u.net>
Received: from phosphor.host4u.net ([64.33.14.39])
by pollux.host4u.net (8.11.6/8.11.6) with ESMTP id h6T0rZ715368
for <info@xxxxxxxx.com>; Mon, 28 Jul 2003 19:53:35 -0500
Received: (from xxxxxxxx@localhost)
by phosphor.host4u.net (8.11.6/8.11.6) id h6T0rUs06314;
Mon, 28 Jul 2003 19:53:30 -0500
Date: Mon, 28 Jul 2003 19:53:30 -0500
Message-Id: <200307290053.h6T0rUs06314@phosphor.host4u.net>
To: info@xxxxxxxx.com
From: ()
Subject: Sample Request From
Status: RO
Below is the result of your feedback form. It was submitted by
() on Monday, July 28, 2003 at 19:53:30
---------------------------------------------------------------------------
Name: xxxxx xxxxxxxx
Address1: xxxxx xxxx xxxxxx
Suburb: xxxxxxx
Town/City: Auckland
Country: New Zealand
Email: xxxxxxxxxxxxx
Phone: xxxxxxxxxxxxx
findout method: Word of mouth
Other: (specify)
PRODUCT CHOICE: xxxxxxxx
The headers shouldn’t make any difference. When you say it “isn’t resolved yet”, what, exactly, do you mean? What is not working? What is the output you are getting? When you get the content of the message, does it include all of the headers?
Sorry Jon, ignore my last post, this is the situation.
I’ve saved the script into OE Script Menu items. I tried selecting three Sample Request emails from the relevant folder, and ran the script, but got ‘Script could not be run. End of file was reached.’
However, I also tried highlighting these same scripts, and then running the script from Script Editor. This generated a mailoutput.txt file on the desktop, content of which is:
Very close, but not quite what I need I’m after the data but not the field names if poss. The name field is missing, ‘address1’, ‘Town/City’ and ‘Phone’ shows ok, but the rest just show fieldnames, not content.
I’m hoping to achieve a tab delimited format that will just open up in Excel in columns so I can edit it to use as a mailing list.
I hope this makes sense, and apologise about the ‘handholding’…
Jon and I are approaching this differently, I just want you to be aware of that. Either way will work fine but I don’t think the time savings is all that significant. Here is my version again, which is different from Jon’s slightly, just so you know. I’ve updated it to work with what your CGI emails send you - you should be able to just copy and paste this into a new script - compile and save. I copied your text and emailed it to myself and ran this script. The result is a tab delimited text file with the cgi form components and without headers.
Warmest regards~
--this is the file we'll write the extracted data to.
property logFile : ((path to desktop) as string) & "mailoutput.txt"
--set up a variable for the carraige return
property carraigeReturn : (ASCII character 13)
tell application "Outlook Express"
--get a list of the current messages
--will return messages currently manually selected, or all messages being received
--when the RULE is triggered.
set lisCurrentMsgs to get the current messages
--repeat with every current message
--(in case more than one match the RULE on receipt)
repeat with alsThisMsg in lisCurrentMsgs
set strBody to the content of alsThisMsg --get the body of this message
--store the current TID's
set oldDelims to AppleScript's text item delimiters
--set the TID's to return
set AppleScript's text item delimiters to return
--get the text items (every item delimited by return) of the body of this email
set submittedItems to the text items of strBody
--reset the TID's
set AppleScript's text item delimiters to oldDelims
repeat with thisItem in submittedItems
set thisItem to thisItem as string
--repeat with every text item of the body of this email (we split it up with return as a delimiter)
if thisItem contains "Name: " then --then this line contains the data for who
set charCount to the (count of characters in (thisItem as string)) --get the character count for this line
set whoField to characters 6 thru charCount of thisItem as string
else if thisItem contains "Address1: " then --this this line contains the email address
set charCount to the (count of characters in (thisItem as string)) --get the character count for this line
set addressField to characters 10 thru charCount of thisItem as string
else if thisItem contains "Suburb: " then --this this line contains the email address
set charCount to the (count of characters in (thisItem as string)) --get the character count for this line
set suburbField to characters 8 thru charCount of thisItem as string --now get the characters after the hader
else if thisItem contains "Town/City: " then --this this line contains the email address
set charCount to the (count of characters in (thisItem as string)) --get the character count for this line
set townCityField to characters 11 thru charCount of thisItem as string --now get the characters after the hader
else if thisItem contains "Country: " then --this this line contains the email address
set charCount to the (count of characters in (thisItem as string)) --get the character count for this line
set countryField to characters 9 thru charCount of thisItem as string --now get the characters after the hader
else if thisItem contains "Email: " then --this this line contains the email address
set charCount to the (count of characters in (thisItem as string)) --get the character count for this line
set emailField to characters 7 thru charCount of thisItem as string --now get the characters after the hader
else if thisItem contains "Phone: " then --this this line contains the email address
set charCount to the (count of characters in (thisItem as string)) --get the character count for this line
set phoneField to characters 7 thru charCount of thisItem as string --now get the characters after the hader
else if thisItem contains "findout method: " then --this this line contains the email address
set charCount to the (count of characters in (thisItem as string)) --get the character count for this line
set findoutField to characters 16 thru charCount of thisItem as string --now get the characters after the hader
else if thisItem contains "Other: " then --this this line contains the email address
set charCount to the (count of characters in (thisItem as string)) --get the character count for this line
set otherfield to characters 7 thru charCount of thisItem as string --now get the characters after the hader
else if thisItem contains "PRODUCT CHOICE: " then --this this line contains the email address
set charCount to the (count of characters in (thisItem as string)) --get the character count for this line
set productField to characters 16 thru charCount of thisItem as string --now get the characters after the hader
--add as many else if's as you need and edit the appendText line below to accomodate
end if
end repeat
--now create a tab delim'ed record from the harvested data and write to a file
set appendText to whoField & tab & addressField & tab & suburbField & tab & addressField & tab & townCityField & tab & countryField & tab & emailField & tab & phoneField & tab & findoutField & tab & otherfield & tab & productField as string --concatenate the data to form the record
try
set getEOF to (get eof logFile) + 1 --get the end of the log file and add one (this is where we'll start writing)
write appendText & carraigeReturn to logFile starting at getEOF --write to the file
on error --if the file does not exist yet, the last couple of lines will fail
set logFile to open for access logFile with write permission --make the file with write permission
write appendText & carraigeReturn to logFile starting at 0 --write to the file
close access logFile --close access
end try
end repeat
end tell
And to be even more clear, Mytzlscript’s solution is better if there might be changes in the format of the email (although there’s still room for optimization ). So, its up to you: speed or accuracy if the format changes.
One more newbie question - not sure if it’s a scriptable solution or not.
Now that I’ve got my huge text file of data, I need to merge with it another one to create a master list. I can do this ok in either Excel or Filemaker - but either way I’m getting loads of duplicate records showing up.
Surely there’s a simple way to sort/filter by duplicates in either app - or do I need to run a script? I just want to delete the duplicates as easily as possible.
To sort, just use the sort command in Excel, that’ll be faster than anything else. In terms of deleting duplicates, after you’ve sorted it in Excel, save the file as a text file and then you can use this script: