OK. To merge sort “on” anything, you need a customisable merge sort. Here’s the customisable version of the above, which I don’t think I’ve posted before:
13th September 2010: I’ve uploaded a new version of this script with improved customisation and different names for the “slave move” handlers, as part of a collection, to ScriptBuilders.
27th November 2013: Since ScriptBuilders is now defunct, I’ve replaced the code originally posted here with the latest version. There are differences in both the code and the form of the customisation parameter, so I’ve updated the entire post.
14th June 2015: The amount of waste has been further reduced (after eight years!) by only extracting the left halves of the merge ranges to separate lists. Comments and variable names have been changed and the former mutually recursive sortHalf() handler has been replaced with a couple of bits of in-line code. A reposition() hander has been added to the slave script object to cover for the fact that the right half of each merge now comes directly from the original list instead of from an auxiliary.
(* Merge sort ” customisable version
Merge sort algorithm: John von Neumann, 1945.
AppleScript implementation: Nigel Garvey, 2007/2015.
Parameters: (list, range index 1, range index 2, record with optional 'comparer' and 'slave' properties). The 'comparer' value is a script object containing an isGreater(a, b) handler which determines if object a is "greater" than object b. The 'slave' value is a script object containing an extract(a, b) handler, which is called when the sort has extracted a copy of range a thru b of the list being sorted; a reposition(a, b) handler, called when item b in the main list has been moved to position b; a merge(a, b) handler, called when item a in an auxiliary list has been moved to position b in the main list; and a swap(a, b) handler, called when items a and b in the main list have been swapped.
Where the 'comparer' or 'slave' properties are omitted, or the customisation parameter isn't a record, the sort has default handlers which respectively compare items directly and do nothing.
*)
on CustomMergeSort(theList, l, r, customiser) -- Sort items l thru r of theList.
script o
property comparer : me
property slave : me
property lst : theList
property auxList : missing value
on msrt(l, r)
set leftR to (l + r) div 2 -- Left half end index.
set rx to leftR + 1 -- Right half start/tracking index.
-- Sort the left half by recursion if it has more than two items or by swapping as necessary if there are two. If one, do nothing.
if (leftR - l > 1) then
msrt(l, leftR)
else if (leftR > l) then
set lv to item l of my lst
set rv to item leftR of my lst
if (comparer's isGreater(lv, rv)) then
set item l of my lst to rv
set item leftR of my lst to lv
slave's swap(l, leftR)
end if
end if
-- Sort the right half similarly.
if (r - rx > 1) then
msrt(rx, r)
else if (r > rx) then
set lv to item rx of my lst
set rv to item r of my lst
if (comparer's isGreater(lv, rv)) then
set item rx of my lst to rv
set item r of my lst to lv
slave's swap(rx, r)
end if
end if
-- Extract a separate list of the left half's items and set tracking and end indices for it.
set auxList to items l thru leftR of my lst
slave's extract(l, leftR)
set lx to 1
set leftR to rx - l
-- Iterate through the range, merging the two halves by comparing the lowest unassigned value from auxList with that from the right half of the range and assigning the lower of the two (or the left if they're equal) to the current slot.
-- Begin with the first (lowest) value from each half.
set lv to beginning of my auxList
set rv to item rx of my lst
repeat with dx from l to r
if (comparer's isGreater(lv, rv)) then
-- The right value's less than the left. Assign it to this slot.
set item dx of my lst to rv
slave's reposition(rx, dx)
-- If no more right-half values, assign the remaining left values to the remaining slots and exit this repeat and recursion level.
if (rx is r) then
repeat with dx from (dx + 1) to r
set item dx of my lst to item lx of my auxList
slave's merge(lx, dx)
set lx to lx + 1
end repeat
exit repeat
end if
-- Otherwise get the next right-half value.
set rx to rx + 1
set rv to item rx of my lst
else
-- The left value's less than or equal to the right.
set item dx of my lst to lv
slave's merge(lx, dx)
-- If no more left-half values, simply exit this repeat and recursion level as the remaining right values are in place anyway.
if (lx is leftR) then exit repeat
-- Otherwise get the next left-half value
set lx to lx + 1
set lv to item lx of my auxList
end if
end repeat
end msrt
-- Default comparison and slave handlers for an ordinary sort.
on isGreater(a, b)
(a > b)
end isGreater
on extract(a, b)
end extract
on reposition(a, b)
end reposition
on merge(a, b)
end merge
on swap(a, b)
end swap
end script
-- Process the input parameters.
set listLen to (count theList)
if (listLen > 1) then
-- Negative and/or transposed range indices.
if (l < 0) then set l to listLen + l + 1
if (r < 0) then set r to listLen + r + 1
if (l > r) then set {l, r} to {r, l}
-- Supplied or default customisation scripts.
if (customiser's class is record) then set {comparer:o's comparer, slave:o's slave} to (customiser & {comparer:o, slave:o})
-- Do the sort.
o's msrt(l, r)
end if
return -- nothing
end CustomMergeSort
property sort : CustomMergeSort
-- (* Contrived demo:
-- Reverse sort a list of random integers, sorting a list of matching strings in parallel with it.
script backwards
-- Returns the opposite of the truth, for a reversed sort.
on isGreater(a, b)
(a < b) -- (a ≤ b) to reverse the stability too!
end isGreater
end script
script parallel
-- Duplicates moves from a merge sort in its own list.
property lst : missing value
property auxList : missing value
on extract(a, b)
set auxList to items a thru b of my lst
end extract
on reposition(a, b)
set item b of my lst to item a of my lst
end reposition
on merge(a, b)
set item b of my lst to item a of my auxList
end merge
on swap(a, b)
tell item a of my lst
set item a of my lst to item b of my lst
set item b of my lst to it
end tell
end swap
end script
set l to {}
set l2 to {}
repeat 1000 times
set end of my l to (random number 1000)
set end of my l2 to result as text
end repeat
set parallel's lst to l2
sort(l, 1, -1, {comparer:backwards, slave:parallel})
l2
-- *)
Like CustomQsort() (a combined QuickSort/insertion sort by Arthur Knapp and myself), CustomMergeSort() takes a fourth, customisation parameter. Originally, this was a script object containing both comparison and “slave” handlers (which I’ll explain in a moment). But the new version above is more flexible. Its customisation parameter is a record with ‘comparer’ and ‘slave’ properties which separately specify script objects for the comparison and slave actions. The two kinds of action can thus be in the same or different script objects, allowing a more mix-and-match approach. Furthermore, both script objects are now optional, the sort having internal defaults to use when they’re omitted.
Comparisons: Instead of comparing the list items itself, CustomMergeSort() passes each pair to an isGreater() handler in the ‘comparer’ script object. This handler is expected to return a boolean indicating whether or not the value of the first item is “greater” than that of the second by the criteria the handler’s designed to employ. For instance, if the items are lists, the handler may be written to see if item x of list a is greater than item x of list b. Or if list a is longer than list b. Or if list a contains a certain kind of value and list b doesn’t. etc. If the items are records, the values of certain properties may need to be compared, eg.:
script compareRecordsByAardvark
on isGreater(a, b)
return (a's aardvark > b's aardvark)
end isGreater
end script
Slave actions: Every time the sort moves items in the list, it sends information about the moves to relevant handlers in the ‘slave’ script object. The slave handler(s) will normally either reproduce the moves in another list ” thus sorting it in parallel with the first ” or simply be blank and do nothing at all. Unfortunately, you have to know how the sort moves things around to be able to write functional slave handlers yourself. The merge sorts here have three kinds of action and CustomMergeSort() expects to be able to call extract(), merge(), and swap() handlers in the ‘slave’ script object. To sort another list in parallel with the first, a “slave” script object might look like this:
script parallel
property lst : missing value -- Set this property to the list to be sorted in parallel with the main list.
property auxList : missing value
on extract(a, b)
-- The range a thru b of the main list (the "left half" for a merge) has been extracted for merging back into it. Do the same in respect of the slave list.
set auxList to items a thru b of my lst
end extract
on reposition(a, b)
-- Item b of the main list has been set to item a of the main list.
set item b of my lst to item a of my lst
end reposition
on merge(a, b)
-- Item b of the main list has been set to item a of the extracted left half.
set item b of my lst to item a of my auxList
end merge
on swap(a, b)
-- Items a and b of the main list have been swapped over.
tell item a of my lst
set item a of my lst to item b of my lst
set item b of my lst to it
end tell
end swap
end script
Here are some example calls using the above script objects:
-- Sort items 1 thru -1 of a list of records by their 'aardvark' properties, simultaneously sorting another list in parallel.
-- The "slave" (parallel) list must have at least as many items as the main list.
set sortInParallel's lst to myOtherList
CustomMergeSort(myListOfRecords, 1, -1, {comparer:compareRecordsByAardvark, slave:parallel})
-- Sort the list of records, but without a parallel sort.
CustomMergeSort(myListOfRecords, 1, -1, {comparer:compareRecordsByAardvark})
-- Sort a list values which don't need custom comparisons, simultaneously sorting another list in parallel.
set sortInParallel's slaveList to myOtherList
CustomMergeSort(myListOfDates, 1, -1, {slave:parallel})
-- Do a non-custom sort! ie. compare values directly and perform no slave action.
CustomMergeSort(myListOfDates, 1, -1, {})
McUsr pointed out above that Merge Sort is a “stable” sort:
For my own use, I sometimes need to sort the rows of a spreadsheet on column F, subsorting on columns B, C, and E in turn. (In the script, the “rows” are a list of lists whose items are the cell values at the intersections with the columns.)
I use CustomMergeSort() for this. But instead of sorting on one column at a time and exploiting merge sort’s stability to keep the rest in relative order, I sort on all four columns at once using an isGreater() handler which compares items 6, 2, 3, and 5 of the rows in that order each time it’s called. It only needs to work through the column sequence until the two rows are found to have different values in one of the columns, and the entire process is accomplished with one sort instead of four. It’s analogous to the way strings are compared when they’re sorted.
Here’s a demo of sorting on columns. You have to supply your own list of lists and your own path to the CustomMergeSort script.
-- Sort rows from a spreadsheet range, sorting and subsorting on four of its columns.
-- This code only demos the use of the custom sort.
-- It uses CustomMergeSort, but the sortOnCols script object works with any of my custom sorts.
-- This handler controls the sort.
on sortOnColumns(theRows, sortColumns)
script sortOnCols
property rows : theRows
property sortCols : sortColumns
property sortColumnCount : (count sortColumns)
-- Compare the values at the given sort columns in row list a with those in row list b.
on isGreater(a, b)
script p
property row_a : a
property row_b : b
end script
-- Repeat with each of the sort columns, in the given order.
repeat with i from 1 to sortColumnCount
set col to item i of o's sortCols
-- Get the cell values from the intersection of this sort column with these two rows.
set val_a to p's row_a's item col
set val_b to p's row_b's item col
if (val_a = val_b) then
-- Keep going if the cell values are equal.
else
-- Otherwise return whether or not cell value a is greater cell value b.
return (val_a > val_b)
end if
end repeat
return false -- a's sort-column values are all the same as b's, so a's not greater than b.
end isGreater
end script
-- Assuming one's CustomMergeSort() script is kept in the following location, load it and do the sort.
set sorter to (load script file ((path to scripts folder from user domain as Unicode text) & "Libraries:Sorts:Custom Merge Sort.scpt"))
sorter's CustomMergeSort(theRows, 1, -1, {comparer:sortOnCols})
end sortOnColumns
set theRows to aListOfListsExtractedFromNumbers
set sortColumns to {6, 2, 3, 5} -- Columns F, B, C, & E.
sortOnColumns(theRows, sortColumns)