As promised here is my script inclusive of a lot of Excel related code to to produce a nice looking spreadsheet listing the duplicate contact records…that said, please note that additional commentary appears after the script which I think is interesting…
tell application "Contacts"
-- Set variables
set firstNames to {} -- List of all first names from all Contacts
set lastNames to {} -- List of all last names from all Contacts
set orgNames to {} -- List of all organization names from all Contacts
set phoneNumbers to {} -- List of all work telephone numbers from all Contacts
set myList to {} -- List of all first names, last names, organization names, work telephone numbers and "first name last name" combinations from all Contacts
set myListNames to {} -- List of all "first name last name" combinations from all Contacts
set myListNamesUnique to {} -- List of myListNames WITH DUPLICATE ENTRIES REMOVED
set myListUnique to {} -- List of myList WITH DUPLICATE ENTRIES REMOVED
set myListNamesDuplicate to {} -- List of myListNames DUPLICATES
set myListDuplicate to {} -- List of myList DUPLICATES ENTRIES
set myListPhantom to {} -- List of myList PHANTOM ENTRIES
-- Set / extract the Contact first names, last names, organization names and work telephone numbers
set {firstNames, lastNames, orgNames, phoneNumbers} to {first name, last name, organization, value of every phone whose label is "work"} of every person
repeat with i from 1 to count lastNames
if item i of firstNames is equal to missing value then set item i of firstNames to " "
if item i of lastNames is equal to missing value then set item i of lastNames to " "
if item i of orgNames is equal to missing value then set item i of orgNames to " "
if item i of phoneNumbers is equal to missing value then set item i of phoneNumbers to " "
end repeat
repeat with i from 1 to (count lastNames)
set end of myList to {i, item i of firstNames, item i of lastNames, item i of orgNames, item i of phoneNumbers, item i of firstNames & " " & item i of lastNames}
set end of myListNames to (item i of firstNames & " " & item i of lastNames)
end repeat
-- Set / create the list of duplicate entries and unique entries noting that the test is based on matching "first name last name" as an "extended test" [i.e a test that also includes other contact properties] would not produce the correct results [i.e. two contact records that identical in every sense except the first record has the organization as ACME while the second record has the organization as ACME Inc. would not be identified as being a duplicate].
repeat with i from 1 to (count myListNames)
if (((count of item 2 of item i of myList) < 2) and ((count of item 3 of item i of myList) < 2) and ((count of item 4 of item i of myList) < 2)) then
set end of myListPhantom to item i of myList -- Ignore / kickout the phantom contacts
else
if ((item i of myListNames is not in myListNamesUnique) or (((count of item 3 of item i of myList) < 2) and (item 4 of item i of myList > 1))) then -- The "or" portion of the test is to include retailers, restaurants, etc. where i) there is no first name or last name but ii) there is an organization name. WIthout this added test these contact records would eb considered duplicates because there "first nane last name" combination [i.e. " "] would be in myListNamesUnique
set end of myListNamesUnique to item i of myListNames
set end of myListUnique to item i of myList
else
set end of myListNamesDuplicate to item i of myListNames
set end of myListDuplicate to item i of myList
end if
end if
end repeat
end tell
tell application "Microsoft Excel"
open
-- Get current date in YYYYMMDD format
set currentDate to current date
set currentDateYear to year of currentDate
set currentDateMonth to month of currentDate as integer
if ((day of currentDate < 10) is true) then
set currentDateDay to "0" & day of currentDate
else
set currentDateDay to day of currentDate
end if
set currentDateYYYYMMDD to currentDateYear & currentDateMonth & currentDateDay as string
-- Make and name a new workbook
make new workbook
set theBook to the active workbook
set theSheet to the active sheet of the theBook
-- Set the magniifcation / zoom percentage
set zoom of the active window to 100
-- Set the column widths in the new workbook for readability
set column width of the first column of theSheet to 3
set column width of the second column of theSheet to 5
set column width of the third column of theSheet to 20
set column width of the fourth column of theSheet to 70
set column width of the fifth column of theSheet to 3
-- Set and format the column heading and titles
set myRangeCells to range ("B2:D2") of theSheet
set weight of (get border of myRangeCells which border edge top) to border weight thick
set borderWeightLog to get weight of (get border of myRangeCells which border edge top)
set insertedTextTop to "Contact Records Which Appear Two or More Times" as string
set insertedTextBottom to "Selected Contact Source: Work's Exchange Server"
set myRangeTop to range ("B3:B3") of theSheet
set myRangeBottom to range ("B4:B4") of theSheet
set value of myRangeTop to insertedTextTop
set value of myRangeBottom to insertedTextBottom
set myRangeCells to range ("B7:D7") of theSheet
set weight of (get border of myRangeCells which border edge top) to border weight thin
set borderWeightLog to get weight of (get border of myRangeCells which border edge top)
set myRangeCells to range ("B3:B3")
set font size of font object of myRangeCells to 16
set font style of font object of myRangeCells to "Bold"
set fontStyleLog to (get font style of font object of myRangeCells)
set myRangeCells to range ("B4:B4")
set font style of font object of myRangeCells to "Bold"
-- set fontStyleLog to (get font style of font object of myRangeCells)
-- log fontStyleLog
set InsertedTextBottom1 to "Count"
set InsertedtextBottom2 to "Name"
set InsertedTextBottom3 to "Organization"
set myRangeBottom1 to range ("B9:B9") of theSheet
set myRangeBottom2 to range ("C9:C9") of theSheet
set myRangeBottom3 to range ("D9:D9") of theSheet
set value of myRangeTop to insertedTextTop
set value of myRangeBottom1 to InsertedTextBottom1
set value of myRangeBottom2 to InsertedtextBottom2
set value of myRangeBottom3 to InsertedTextBottom3
set myRangeCells to range ("B9:D9") of theSheet
set horizontal alignment of myRangeCells to horizontal align center
set myRangeCells to range ("B11:C100000") of theSheet
set horizontal alignment of myRangeCells to horizontal align center
set myRangeCells to range ("B9:D9")
-- set font size of font object of myRangeCells to 16
set font style of font object of myRangeCells to "Bold"
set myRangeCells to range ("B9:D9") of theSheet
set weight of (get border of myRangeCells which border edge bottom) to border weight thin
set borderWeightLog to get weight of (get border of myRangeCells which border edge bottom)
-- Import myListDuplicates
repeat with i from 1 to count of myListDuplicate
set value of cell ("B" & (10 + i) as string) to i
set value of cell ("C" & (10 + i) as string) to item 6 of item i of myListDuplicate
if item 4 of item i of myListDuplicate < 2 then set item 4 of item i of myListDuplicate to "Organization field is blank"
set value of cell ("D" & (10 + i) as string) to item 4 of item i of myListDuplicate
end repeat
sort range ("C11:D" & (10 + (count of myListDuplicate) as string)) of worksheet theSheet key1 (range "C11" of worksheet theSheet) key2 (range "D11" of worksheet theSheet)
-- Set the name of the spreadsheet to be saved
set userName to do shell script "whoami"
set fileNameSaved to "Macintosh HD:Users:" & userName & ":Desktop:" & currentDateYYYYMMDD & "_contact records which appear two or more times.xls"
-- Save the workbook / spreadsheet
tell theBook
save workbook as theBook filename fileNameSaved overwrite yes
end tell
-- Test, and depending on the test result, close the spreadsheet
display dialog "Do you want to close the Excel spreadsheet which lists those contact records which appear two or more times?" buttons {"Yes", "No"} default button 2 giving up after 5
set spreadsheetClose to button returned of result
if ((spreadsheetClose = "Yes") is true) then quit
end tell
The intersecting commentary:
-
Appreciate the help from everyone particularly for point me in a direction that did not require the need to sort the data noting that the script runs fairly quickly on my mid-2012 i7 8GB MBA [I wish it had more RAM]!
-
In testing the script I noted that there were a few problems that need to be sorted:
a) Comparing records based on multiple entries such as a list consisting of {first name, last name, organization, phone number} does not work…consider the case where the same person is entered twice but the first record has his organization as ACME while the second record has his organization as ACME Inc.
b) Comparing records based on name alone did not work because records which had no name entries but did have organization entries [i.e. restaurants, stores, etc.] were defaulting to the duplicates list because there were multiple occurences with no names!
c) Due to 2a and 2b the coding got a little more complicated that I had hoped but at least it is “done” in that i) it functions and ii) it needs to be optimized in terms of code and performance [but, at least it is a start]!
Would appreciate any and all feedback / suggestions you may have!
Thanks!