The need to sort AppleScript numeric or text lists arises frequently. Many scripting solutions are available, notably Nigel Garvey and Arthur Knapp’s efficient “qsort” algorithm for numeric lists (http://macscripter.net/viewtopic.php?id=17340), and Kai Edwards and Adam Bell’s solution for co-sorting multiple lists (http://macscripter.net/viewtopic.php?id=21534). Another option is the Unix “sort” command available in Terminal, which combines fast execution with several options to customize the sort.

The following handler is an AppleScript wrapper for the “sort” command (called through “do shell script”) that incorporates most of its functionality. In addition, the handler can co-sort secondary lists based on the sort order of the primary list, a feature that I’ve frequently found useful. It compares favorably with “qsort” in execution speed: on my old PowerBook G4, it sorts a list of 10,000 random numbers from 0 to 4000 in about 30% less time than “qsort” (1.5 vs 2.2 seconds). One important point to note is that the handler returns the sorted list (or the sorted list plus co-sorted secondary lists) as a new, independent list, in contrast with “qsort”, which sorts the original list in place.
I welcome any suggestions, enhancements, criticisms, corrections, etc.
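For anyone unfamiliar with the approach, here is a minimal sketch of the core idea, stripped of all options and validation: join the list with linefeeds, pipe it through the shell "sort" command, and split the result back into a list. The variable names (the_items, joined_text, sorted_items) are illustrative only and do not appear in the handler.

```applescript
-- Minimal sketch (not the full handler): sort a small text list with the shell "sort" command
set the_items to {"pear", "apple", "fig"}

-- Join the items with linefeeds, restoring the previous delimiters afterwards
set saved_delimiters to AppleScript's text item delimiters
set AppleScript's text item delimiters to ASCII character 10
set joined_text to the_items as text
set AppleScript's text item delimiters to saved_delimiters

-- "quoted form" protects the text from shell interpretation; "paragraphs" splits the sorted output into a list
set sorted_items to paragraphs of (do shell script "echo " & quoted form of joined_text & " | sort")
--> {"apple", "fig", "pear"}
```

Note that the_items itself is left unchanged; the sorted result is a new list, just as with the handler below.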
-- Examples:
set list_1 to {2, 1, 3, 3, 3, 1, 4}
set list_2 to {"a", "C", "B", "c", "C", "d", "A"}
set list_3 to {"11", "3333", "3", "111", "222", "2", "22"}
sort_lists(list_1) --> {1, 1, 2, 3, 3, 3, 4}
sort_lists({list_1, "remove duplicates"}) --> {1, 2, 3, 4}
sort_lists({list_1, "remove duplicates", "descending sort"}) --> {4, 3, 2, 1}
sort_lists(list_2) --> {"A", "B", "C", "C", "a", "c", "d"}
sort_lists({list_2, "remove duplicates"}) --> {"A", "B", "C", "a", "c", "d"}
sort_lists({list_2, "ignore case"}) --> {"A", "a", "B", "C", "C", "c", "d"}
sort_lists({list_2, "ignore case", "remove duplicates"}) --> {"a", "B", "C", "d"}
sort_lists(list_3) --> {"11", "111", "2", "22", "222", "3", "3333"}
sort_lists({list_3, "numeric sort"}) --> {"2", "3", "11", "22", "111", "222", "3333"}
sort_lists({list_1, list_2}) --> {{1, 1, 2, 3, 3, 3, 4}, {"C", "d", "a", "B", "c", "C", "A"}}
sort_lists({list_1, list_2, list_3}) --> {{1, 1, 2, 3, 3, 3, 4}, {"C", "d", "a", "B", "c", "C", "A"}, {"3333", "2", "11", "3", "111", "222", "22"}}
-- Note that in the last two examples, list_2 and list_3 are sorted in the same sort order as list_1
on sort_lists(input_argument)
(*USAGE NOTES:
input_argument =
• Primary list to be sorted -OR-
• {Primary list to be sorted, Optional item #1, Optional item #2, ...}
where:
Primary list to be sorted (must be the first input argument) = list of 2 or more items, either all text items or all numeric items
Optional items (may be entered in any order after the primary list) =
• Secondary lists to be co-sorted in the same sort order as the primary list (must have the same number of items as the primary list) -AND/OR-
• Any of the following sort options:
- For text and numeric lists:
"descending sort" -> sort in descending order (default is ascending)
"remove duplicates" -> remove duplicate items from the sorted primary list and corresponding items in co-sorted secondary lists (default is not to delete duplicate items)
- For text lists only:
"ignore case" -> case-insensitive sort (default is case-sensitive)
"keep leading blanks" -> incorporate leading blanks in the text sort (default is to remove leading blanks prior to sorting)
"alphanumerics and blanks sort" -> sort by alphanumeric characters and blanks only (default is to sort by all characters)
"ascii 32 to 126 sort" -> sort by ASCII characters 32 to 126 only (default is to sort by all characters)
"month sort" -> case-sensitive sort by the month value incorporated in the first three characters of each text item: "Jan", "Feb", "Mar", ...
"numeric sort" -> sort by the integer or real numeric value incorporated in the beginning characters of each text item (default is to sort by the ASCII value of the beginning characters, ie, a text sort) (Note: this option is not required if the primary list is of numeric type)
value returned by handler =
• If no secondary list is present in the input argument -> sorted primary list
• If secondary lists are present in the input argument -> {sorted primary list, co-sorted secondary list #1, co-sorted secondary list #2, ...}
*)
-- Set property, constant, and default values
script s -- script properties are used to enhance execution speed of "repeat" loops
property primary_list : {}
property secondary_lists : {}
property output_value : {}
end script
set valid_input_argument_format to return & return & "A valid input argument is a list in which the first item is the primary list to be sorted, optionally followed in any order by (1) secondary lists to be co-sorted and/or (2) any of the following valid sort options:" & return & return & " \"descending sort\"" & return & " \"remove duplicates\"" & return & " \"ignore case\"" & return & " \"keep leading blanks\"" & return & " \"alphanumerics and blanks sort\"" & return & " \"ascii 32 to 126 sort\"" & return & " \"month sort\"" & return & " \"numeric sort\""
set ascii_1 to ASCII character 1
set {sort_in_descending_order, remove_duplicate_items, ignore_case, keep_leading_blanks, sort_by_alphanumerics_and_blanks, sort_by_ascii_32_to_126, sort_by_month, sort_numerically} to {false, false, false, false, false, false, false, false}
try
-- Validate and parse input argument
tell input_argument
if class is not list then error "*** Invalid input argument: The input argument is not a list. ***" & valid_input_argument_format
if length = 0 then error "*** Invalid input argument: The input argument is an empty list. ***" & valid_input_argument_format
tell item 1 to if class is not list then set input_argument to {input_argument} -- handles the case where the input argument consists solely of the primary list to be sorted
end tell
tell input_argument
-- Process the primary list
tell item 1
if length = 0 then error "*** Invalid input argument: The primary sort list is empty. ***" & valid_input_argument_format
if it contains "descending sort" or it contains "remove duplicates" or it contains "ignore case" or it contains "keep leading blanks" or it contains "alphanumerics and blanks sort" or it contains "ascii 32 to 126 sort" or it contains "month sort" or it contains "numeric sort" then error "*** Invalid input argument: A sort option was found within the primary sort list. ***" & valid_input_argument_format
tell its item 1 to if class is in {string, text, Unicode text} then -- the first item of the primary sort list determines the sort type ("text" vs "numeric")
set primary_list_type to "text"
else if class is in {integer, real} then
set primary_list_type to "numeric"
else
error "*** Invalid input argument: The primary sort list is neither a text list nor a numeric list. ***" & valid_input_argument_format
end if
copy it to s's primary_list
end tell
-- Process the optional items
set s's secondary_lists to {}
repeat with i in rest
tell i's contents to if class is list then
if length ≠ s's primary_list's length then error "*** Invalid input argument: A secondary list differs in size from the primary sort list. ***" & valid_input_argument_format
copy it to end of s's secondary_lists
else if it is "descending sort" then
set sort_in_descending_order to true
else if it is "remove duplicates" then
set remove_duplicate_items to true
else if it is "ignore case" then
set ignore_case to true
else if it is "keep leading blanks" then
set keep_leading_blanks to true
else if it is "alphanumerics and blanks sort" then
set sort_by_alphanumerics_and_blanks to true
else if it is "ascii 32 to 126 sort" then
set sort_by_ascii_32_to_126 to true
else if it is "month sort" then
set sort_by_month to true
else if it is "numeric sort" then
set sort_numerically to true
else
error "*** Invalid input argument: An optional item is neither a secondary list to be co-sorted nor a valid sort option. ***" & valid_input_argument_format
end if
end repeat
set number_of_secondary_lists to s's secondary_lists's length
end tell
-- Create an options string for the Terminal "sort" command
set sort_options to ""
if sort_in_descending_order then set sort_options to sort_options & " -r"
set use_sort_command_to_remove_duplicate_items to number_of_secondary_lists = 0 or primary_list_type is "numeric" or sort_numerically or sort_by_month
if remove_duplicate_items and use_sort_command_to_remove_duplicate_items then set sort_options to sort_options & " -u"
if primary_list_type is "text" then
if ignore_case then set sort_options to sort_options & " -f"
if not keep_leading_blanks then set sort_options to sort_options & " -b"
if sort_by_alphanumerics_and_blanks then set sort_options to sort_options & " -d"
if sort_by_ascii_32_to_126 then set sort_options to sort_options & " -i"
if sort_by_month then set sort_options to sort_options & " -M"
end if
if primary_list_type is "numeric" or sort_numerically then
set sort_options to sort_options & " -n"
end if
-- Sort lists
if number_of_secondary_lists = 0 then
-- If there are no secondary lists, sort the primary list alone
set AppleScript's text item delimiters to ASCII character 10
set s's output_value to (do shell script "echo " & (s's primary_list as string)'s quoted form & " | sort" & sort_options)'s paragraphs -- converts the primary list into a sorted list of text items
set AppleScript's text item delimiters to ""
-- If the original primary list was numeric, convert the sorted primary list from numbers in string format back to a numeric list
if primary_list_type is "numeric" then repeat with i from 1 to s's output_value's length
set s's output_value's item i to (s's output_value's item i) as number
end repeat
else
-- If there are secondary lists, sort a transformed version of the primary list (each primary list item appended with an ASCII 1 separator character followed by the item's index in the list); then, using the appended indexes, co-sort the secondary lists
if primary_list_type is "numeric" then
set AppleScript's text item delimiters to ASCII character 10
set s's primary_list to paragraphs of (s's primary_list as string) -- if the primary list is numeric, converts it to a list of numbers in string format
set AppleScript's text item delimiters to ""
end if
repeat with i from 1 to s's primary_list's length
set s's primary_list's item i to s's primary_list's item i & ascii_1 & i -- appends to each primary list item an ASCII 1 separator followed by the item's index; the indexes will be used to co-sort the secondary lists
end repeat
set AppleScript's text item delimiters to ASCII character 10
set s's primary_list to (do shell script "echo " & (s's primary_list as string)'s quoted form & " | sort" & sort_options & " | tr '" & (ASCII character 10) & "' '" & ascii_1 & "'")'s text 1 thru -2 -- sorts the transformed primary list, and converts line separators to ASCII 1 separators; "thru -2" removes the final trailing ASCII 1 separator
set AppleScript's text item delimiters to ascii_1
set s's primary_list to s's primary_list's text items -- places the sorted primary list items into the 1st, 3rd, 5th, ... positions, and their indexes in the original primary list immediately following in the 2nd, 4th, 6th, ... positions
set AppleScript's text item delimiters to ""
repeat (1 + number_of_secondary_lists) times
set end of s's output_value to {} -- s's output_value will take the form {sorted primary list, sorted secondary list #1, sorted secondary list #2, ...}
end repeat
-- Extract the sorted primary list, co-sort the secondary lists, and optionally remove duplicate items
set {remove_duplicate_items_here, previous_value} to {remove_duplicate_items and not use_sort_command_to_remove_duplicate_items, null}
repeat with i from 1 to ((s's primary_list's length) - 1) by 2
set include_current_item to true
if remove_duplicate_items_here then
if ignore_case then
ignoring case
set include_current_item to s's primary_list's item i is not previous_value
end ignoring
else
considering case
set include_current_item to s's primary_list's item i is not previous_value
end considering
end if
set previous_value to s's primary_list's item i
end if
if include_current_item then
set end of s's output_value's item 1 to s's primary_list's item i -- extracts sorted primary list item
set j to s's primary_list's item (i + 1) -- sorted primary list item's index in original list
repeat with k from 1 to number_of_secondary_lists -- co-sorts secondary lists using the sorted primary list item's index in original list
set end of s's output_value's item (1 + k) to s's secondary_lists's item k's item j
end repeat
end if
end repeat
-- If the original primary list was numeric, convert the sorted primary list from numbers in string format back to a numeric list
if primary_list_type is "numeric" then repeat with i from 1 to s's output_value's item 1's length
set s's output_value's item 1's item i to (s's output_value's item 1's item i) as number
end repeat
end if
on error error_message number error_number
set AppleScript's text item delimiters to ""
set n to ""
if error_number ≠ -2700 then set n to (error_number as string) & ": "
activate
display dialog "Problem encountered during execution of handler \"sort_lists\":" & return & return & n & error_message buttons "OK" default button "OK" cancel button "OK"
end try
return s's output_value
end sort_lists
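The co-sorting branch of the handler relies on tagging each primary item with its original index before sorting, then using the tags to reorder the secondary lists. Seen in isolation, the idea looks roughly like the sketch below. This is a simplified illustration under my own assumptions rather than an extract from the handler: it handles one secondary list, no sort options, and uses hypothetical variable names (primary_items, secondary_items, tag_separator, and so on).

```applescript
-- Sketch of the index-tagging technique outside the handler (names are illustrative)
set primary_items to {"pear", "apple", "fig"}
set secondary_items to {10, 20, 30}
set tag_separator to ASCII character 1 -- a character unlikely to occur in the data

-- Tag each primary item with its original position, e.g. "pear" & separator & "1"
set tagged_items to {}
repeat with i from 1 to (count primary_items)
	set end of tagged_items to (item i of primary_items) & tag_separator & i
end repeat

-- Join the tagged items with linefeeds and sort them in the shell
set saved_delimiters to AppleScript's text item delimiters
set AppleScript's text item delimiters to ASCII character 10
set joined_text to tagged_items as text
set sorted_lines to paragraphs of (do shell script "echo " & quoted form of joined_text & " | sort")

-- Split each sorted line into its value and original index, then pick the matching secondary item
set AppleScript's text item delimiters to tag_separator
set sorted_primary to {}
set sorted_secondary to {}
repeat with a_line in sorted_lines
	set {the_value, the_index} to text items of a_line
	set end of sorted_primary to the_value
	set end of sorted_secondary to item (the_index as integer) of secondary_items
end repeat
set AppleScript's text item delimiters to saved_delimiters

set co_sorted_result to {sorted_primary, sorted_secondary}
--> {{"apple", "fig", "pear"}, {20, 30, 10}}
```

The handler does the same thing with less overhead: it converts the line separators to ASCII 1 with "tr" so that a single "text items" split yields values and indexes in alternating positions, instead of splitting line by line as this sketch does.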