Guys,
This will take the emails in all folders, except those that logically are uninteresting (Junk, Deleted, Sent, …) and establishes links to the sender’s contact information in the Address Book, if an entry exists. I use this weekly as I am constantly gaining new contacts and many times emails have come from them prior to me creating an address book entry.
So, OMM this runs somewhere between 560 and 700 seconds to handle 22000 emails and about 2000 contacts in each an Exchange Contact list and an Address Book (nearly all duplicated entries). It creates a log file to check your timing. Notice that the number of emails is not the determining factor for how long a folder takes it is the number of links per contact and the number of addresses found that belong to contacts in a folder.
I tweaked it to squeeze out max speed without resorting to one final trick… getting this to run in parallel as separate sub-applications of a single application. This may eke out a faster result but will be pushing Entourage to its limits, I believe. One trick used, however, was the ordering of found addresses by occurrence and not in alpha order, as my logic was that I get more than the majority of my emails from a finite number of people and so their related email addresses ought to be checked first. It did speed it up a bit.
-- a derivative script based on one from Jolly Roger <jollyroger@pobox.com> that I found some time ago
--author of this version: John Egan
--version 1.0.0 - revamped how it works... tweaked for speed
-- OMM (early 2009 MacBook with 4 Gig RAM, 2.4 GHz processors) over 22K emails with about 2000 contacts in each of the regular Address Book and Exchange Contacts Address Book, this runs a varying amount of time (averages ~700 seconds, with spikes up to 1000 seconds) due to other processes running and Entourage offline or not looking for emails from 7 accounts every 5 minutes
-- tested with Entourage 2008
property pLogEventsOn : true -- generate a log file, add to an existing one
property pReportContacts : false -- if the contacts should be listed in the log (false as default, set to true to verify operation)
global gFolderList -- the list of folders in Entourage, checked each run for validity
global gLogFolder -- where the log file is stored
global gLogFileName -- the log file name
global gUnixEOL -- use this line feed character in logs, Unix end of line character
global gStartTime -- when the script began
global gSenderMatchList -- sender matches already processed
global gContactIDRecords -- the contacts for a particular email address, note that this spans contacts and address book
global gEmailAddressList -- list of all email addresses in Address Book [check adding/toggling with Contacts in Exchange environment]
set gEmailAddressList to {}
set gContactIDRecords to {}
set gSenderMatchList to {}
set gUnixEOL to ASCII character 10 -- use this line feed character in logs, Unix end of line character
set gLogFileName to "Link Contact to Email Message.log"
set gFolderList to {}
set LogFolderLocation to the path to the desktop -- where log and archive folders are, default
set LogFolderName to "Entourage Maintenance Log Messages" -- where I store all logs of Entourage maiontenance
set gLogFolder to ((LogFolderLocation as text) & LogFolderName as text) & ":"
set FoldersToSkip to {"Drafts", "Outbox", "Sent Items", "Deleted Items", "Junk E-mail"} -- saves time on wasted efforts, leave off top level folders only with these names in case user uses same names in lower level folders
-- ========== Initial entry to text file
if pLogEventsOn is true then my LogEntry("Starting up ...", 1) -- the 1 means to start the timer to track how long this took
if pLogEventsOn is true then my LogEntry("contacts followed by ' *' signifies links to multiple records", 0)
-- ========== Get a list of all of the email addresses in the Address Book
set text item delimiters to ","
tell application "Microsoft Entourage"
ignoring application responses
set working offline to true
end ignoring
set gEmailAddressList to every email address of every contact as text
ignoring application responses
set working offline to false
end ignoring
end tell
set text item delimiters to ""
if pLogEventsOn is true then my LogEntry("Email addresses placed in a list...", 0)
-- ========== Get a list of all folders
tell application "Microsoft Entourage" to set TopFolderList to the folders
repeat with nextFolder in TopFolderList
if (name of nextFolder) as text is not in FoldersToSkip then
my GetFolderListing(nextFolder)
end if
end repeat
if pLogEventsOn is true then my LogEntry("Completed folder list creation...", 0)
-- ================ Process the folders 1 by 1 and establish links between emails and contacts
repeat with nextFolder in gFolderList
if pLogEventsOn is true then my LogEntry("................. Starting folder: " & the name of nextFolder, 0)
ProcessFolder(nextFolder)
end repeat
if pLogEventsOn is true then my LogEntry("Finished", 2)
-- Handlers
-- ========== Get the list of all email storage folders in Entourage plus the top level standard ones
on GetFolderListing(someFolder) --a recursive handler, finds nested email folders in Entourage
set end of gFolderList to someFolder
tell application "Microsoft Entourage" to set subFolderList to {} & someFolder's folders
if subFolderList is not {} then -- this folder has subfolders
repeat with SubFolder in subFolderList -- process each subfolder
my GetFolderListing(SubFolder)
end repeat
end if
end GetFolderListing
-- ========== Process each item of list of folders
on ProcessFolder(someFolder)
set senderList to {}
script AScript
property alist : senderList
end script
tell application "Microsoft Entourage"
set theMsgCount to count of messages of someFolder
ignoring application responses
set working offline to true
end ignoring
set AScript's alist to (address of sender of every message of someFolder) as list -- if list is blank returns {}
ignoring application responses
set working offline to false
end ignoring
end tell
if pLogEventsOn is true then my LogEntry("Number of msgs: " & theMsgCount, 0)
if theMsgCount > 0 then --number of messages in the folder
set AScript's alist to my CutDupesNonContactsAndSort(AScript's alist)
if pLogEventsOn is true then my LogEntry("Contact email addresses: " & (count of AScript's alist), 0)
my LinkMessages(someFolder, AScript's alist) -- takes folder and the sender list and scans for matches to Address Book contacts
end if
if pLogEventsOn is true then my LogEntry("Completed folder", 0)
end ProcessFolder
-- ===========
on LinkMessages(someFolder, thePassedSenderList) -- thePassedSenderList only contains email addresses associated with contacts
set ContactsFound to {}
script AddBook
property senderMatchList : gSenderMatchList
property ContactRecs : gContactIDRecords
property FolderContacts : ContactsFound
property theSenderList : thePassedSenderList
end script
ignoring application responses
tell application "Microsoft Entourage" to set working offline to true
end ignoring
repeat with emailSender in AddBook's theSenderList
set emailSenderText to emailSender as text
if emailSenderText is not in AddBook's senderMatchList then -- checks to see if this email address was encountered alreaady
tell application "Microsoft Entourage"
set senderContact to find emailSender
set messageList to (every message of someFolder whose address of sender is emailSender)
ignoring application responses
link messageList to senderContact
end ignoring
end tell
set theRecord to {TheSender:"", TheContact:"", TheCount:1, TheDisplayName:""}
set the end of AddBook's senderMatchList to emailSenderText
set matchTrack to 0
repeat with IDItem in senderContact
if AddBook's FolderContacts does not contain id of IDItem as text then
set the end of AddBook's FolderContacts to id of IDItem as text
else if pReportContacts is true then
set matchTrack to matchTrack + 1
end if
end repeat
if pReportContacts is true then
set LengthSC to length of senderContact
if matchTrack ≠LengthSC then
if the LengthSC > 1 then
set DisplayName to ""
repeat with theContactID in senderContact
tell application "Microsoft Entourage" to set CurrentDisplayName to display name of theContactID
if DisplayName ≠"" and DisplayName ≠CurrentDisplayName then
set DisplayName to DisplayName & ", " & CurrentDisplayName
else if DisplayName = "" then
set DisplayName to CurrentDisplayName
end if
end repeat
set DisplayName to DisplayName & " *"
else
tell application "Microsoft Entourage"
try
set DisplayName to display name of senderContact
on error -- Entourage is offline and contact in Exchange AB only or no display name exists
set DisplayName to emailSenderText
end try
end tell
end if
if pLogEventsOn is true and pReportContacts is true then my LogEntry("linked msgs from: " & DisplayName, 0)
end if
copy DisplayName to TheDisplayName of theRecord
end if
copy emailSenderText to TheSender of theRecord
copy senderContact to TheContact of theRecord
set the end of AddBook's ContactRecs to theRecord
else -- there is a match to the email address meaning its related info is already found and in a record
set ListItemID to FindSenderAndIncrementRecord(emailSender, emailSenderText)
set matchTrack to 0
if class of ListItemID is integer then
set senderContact to {} & (TheContact of (record ListItemID of AddBook's ContactRecs))
set DisplayName to (TheDisplayName of (record ListItemID of AddBook's ContactRecs))
else
set senderContact to ListItemID
set DisplayName to emailSenderText
end if
tell application "Microsoft Entourage"
set messageList to (every message of someFolder whose address of sender is emailSender)
ignoring application responses
link messageList to senderContact
end ignoring
end tell
repeat with IDItem in senderContact
if AddBook's FolderContacts does not contain (id of IDItem as text) then
set the end of AddBook's FolderContacts to id of IDItem as text
else if pReportContacts is true then
set matchTrack to matchTrack + 1 -- tracks how many matches were found
end if
end repeat
if pReportContacts is true then
if matchTrack ≠length of senderContact then
if pLogEventsOn is true then my LogEntry("linked msgs from: " & DisplayName, 0)
end if
end if
end if
end repeat
ignoring application responses
tell application "Microsoft Entourage" to set working offline to false
end ignoring
set gSenderMatchList to AddBook's senderMatchList
set gContactIDRecords to AddBook's ContactRecs
end LinkMessages
-- ==========
on CutDupesNonContactsAndSort(thePassedList) --remove duplicates & email addresses not associated with contacts, then sort list
set theModifiedList to {}
set theReturnList to {}
script j
property EmailList : gEmailAddressList
property PassedList : thePassedList
property ModifiedList : theModifiedList
end script
ignoring case
repeat with theItem in j's PassedList
set theItem to theItem as text
if j's ModifiedList does not contain theItem then -- cuts out dupes
if theItem is in j's EmailList then -- cuts out any email address not associated with a contact, reduces sort time
set the end of j's ModifiedList to theItem
end if
end if
end repeat
end ignoring
set theLength to the length of j's ModifiedList
if theLength > 10 then
set theReturnList to AdamSort(j's ModifiedList, 1, theLength)
else
set theReturnList to bubbleSwapSort(j's ModifiedList)
end if
return theReturnList
end CutDupesNonContactsAndSort
-- ==========
on AdamSort(array, leftEnd, rightEnd) -- Hoare's QuickSort Algorithm
script A
property L : array
end script
set {i, j} to {leftEnd, rightEnd}
set v to item ((leftEnd + rightEnd) div 2) of A's L -- pivot in the middle
repeat while (j > i)
repeat while (item i of A's L < v)
set i to i + 1
end repeat
repeat while (item j of A's L > v)
set j to j - 1
end repeat
if (not i > j) then
tell (a reference to A's L) to set {item i, item j} to {item j, item i} -- swap
set {i, j} to {i + 1, j - 1}
end if
end repeat
if (leftEnd < j) then AdamSort(A's L, leftEnd, j)
if (rightEnd > i) then AdamSort(A's L, i, rightEnd)
return A's L
end AdamSort
-- ========== bubble sort for when there are 10 or less items in list to sort, seems to be faster than AdamSort at small lists
on bubbleSwapSort(theList)
script Bscript
property alist : theList
end script
set TheCount to length of Bscript's alist
if TheCount < 2 then return Bscript's alist
set swaps to true
repeat while swaps
set swaps to false
repeat with theIndex from 1 to TheCount - 1
if item theIndex of Bscript's alist > item (theIndex + 1) of Bscript's alist then
set temp to item theIndex of Bscript's alist
set item theIndex of Bscript's alist to item (theIndex + 1) of Bscript's alist
set item (theIndex + 1) of Bscript's alist to temp
set swaps to true
end if
end repeat
set TheCount to TheCount - 1
end repeat
return Bscript's alist
end bubbleSwapSort
-- ========== this will find the sender info in the gContactIDRecords
on FindSenderAndIncrementRecord(emailSender, emailSenderText)
script G
property GotRec : gContactIDRecords
end script
set theListCount to the length of G's GotRec
repeat with listItem from 1 to theListCount
if (TheSender of record listItem of G's GotRec) is equal to emailSenderText then
set TheCount of record listItem of G's GotRec to ((TheCount of record listItem of G's GotRec) + 1)
if listItem ≠1 then -- shifts the most encountered senders to the start of the list for faster processing, not into alpha order
repeat with itemPos from listItem to 2 by -2
set theRecCount to record (itemPos - 1) of G's GotRec
if (TheCount of record itemPos of G's GotRec) > (TheCount of theRecCount) then
if itemPos - 2 > 0 and (TheCount of record itemPos of G's GotRec) > (TheCount of record (itemPos - 2) of G's GotRec) then
set theRecCount to record (itemPos - 2) of G's GotRec
end if
-- swaps records if latest record's count is higher than one before it
tell (a reference to G's GotRec) to set {record itemPos of G's GotRec, theRecCount} to {theRecCount, record itemPos of G's GotRec}
set listItem to (itemPos - 1)
else
exit repeat
end if
end repeat
end if
set gContactIDRecords to G's GotRec
return listItem
else -- no match somehow
if listItem = theListCount then
tell application "Microsoft Entourage" to set senderContact to find emailSender
return senderContact
end if
end if
end repeat
end FindSenderAndIncrementRecord
---==================== Log data to text file ==========================
on LogEntry(someText, timingAct)
stop log
--create text entry to log file, append to end of file to maintain long record
set logFile to (gLogFolder as text) & gLogFileName
set logRef to 0
try
set logRef to open for access file logFile with write permission
set fileContents to (read logRef)
end try
ignoring application responses
if logRef ≠0 then
if someText is "Starting up ..." then write gUnixEOL starting at eof to logRef
write FormatDateTime(current date) & ": " & someText & gUnixEOL starting at eof to logRef
if timingAct is 1 then
set gStartTime to current date
else if timingAct is 2 then
write "Time to complete: " & ((current date) - gStartTime) & gUnixEOL starting at eof to logRef
end if
close access logRef
end if
end ignoring
start log
end LogEntry
---================== places date and time in format for time stamp ======
on FormatDateTime(theDate)
set theDate to theDate as date
set dd to text -2 thru -1 of ("0" & theDate's day)
copy theDate to tempDate
set the month of tempDate to January
set mm to text -2 thru -1 of ("0" & 1 + (theDate - tempDate + 1314864) div 2629728)
set yy to text -1 thru -4 of ((year of theDate) as text)
set hh to time string of theDate
return (yy & "/" & mm & "/" & dd & " " & hh as text)
end FormatDateTime
I look forward to any feedback or improvements.