Thursday, December 13, 2018

#1 2018-12-02 02:43:30 pm

Nigel Garvey
Moderator
From:: Warwickshire, England
Registered: 2002-11-20
Posts: 4741

timsort and nigsort

The two posts following this one contain a couple of customisable merge sorts on which I've been working recently, mainly for my own interest. One's an implementation of Tim Peters's timsort, the other my own iterative ternary merge sort. They're pretty fast, sort lists in place, and can sort in ways that macOS's built-in sorts can't — or at least currently don't.

Their main handlers take four positional parameters:

1. The list to be sorted.
2. The index (positive or negative) of one end of the sort range.
3. The index (ditto) of the other end of the sort range. The order of the indices doesn't matter, but the range must exist in the list and in any "slave" lists to be rearranged in parallel with it.
4. The customisation parameter. This is a record containing either, both, or neither of the following properties:
comparer: If this is included, its value must be a script object containing a handler labelled isGreater which takes two positional parameters. The handler must compare the parameter values (items from the list, passed to it by the sort) and return a boolean indicating whether or not the first has to go after the second when sorted. The idea is that you write your own handler to compare items in the way needed for the sort: say, records by certain properties or lists by certain items, or returning the opposite of the truth to get a descending sort. There are loads of possibilities once you get into them. If the comparer property's omitted, the sort defaults to an internal handler which compares items directly.
slave: If this is included, its value should be a list containing one or more lists whose items you want reordered in the same way as those in the list being sorted.
Both properties are optional, but the record itself isn't, so pass {} for a normal sort.

The main handler in each script has its own descriptive label for clarity. But the scripts also have properties labelled sort which are set to the handlers, so you can use sort instead of the handler labels in calls. Here are some example calls:

Applescript:

(* Sample comparison script objects. *)

script recordsByA
   -- Compare records by their 'a' properties.
   on isGreater(a, b)
       return (a's a > b's a)
   end isGreater
end script

script listsByItem1
   -- Compare lists by their first items.
   on isGreater(a, b)
       return (a's beginning > b's beginning)
   end isGreater
end script

script descending
   -- Compare items directly, lying about the result in order to get a reversed (descending) sort.
   on isGreater(a, b)
       return (a < b) -- (a ≤ b) to reverse stability too!
   end isGreater
end script

(* Sample calls. *)

sort(myList, 1, -1, {}) -- Sort an entire list normally.
sort(myList, 1, 10, {comparer:descending}) -- Reverse sort items 1 thru 10 of a list.
sort(myList, 1, -1, {slave:{anotherList, yetAnotherList}}) -- Sort an entire list, replicating the moves in two other lists.
sort(myListOfRecords, 1, -1, {comparer:recordsByA}) -- Sort a list of records on the records' 'a' properties.
sort(myListoflists, 1, -1, {comparer:listsByItem1}) -- Sort a list of lists on the lists' first items.
sort(myListoflists, 1, -1, {comparer:listsByItem1, slave:{anotherList}}) -- Sort a list of lists on the lists' first items, replicating the moves in another list.


NG

Offline

 

#2 2018-12-02 02:44:41 pm

Nigel Garvey
Moderator
From:: Warwickshire, England
Registered: 2002-11-20
Posts: 4741

Re: timsort and nigsort

Back in August, kerflooey directed my attention to a sort algorithm called timsort which was supposed to be very fast — so of course I had to look into it and have a go at implementing it in AppleScript! The only documentation I could find for it was the inventor's description, which doesn't include any sample or pseudocode, so I've had to work out much of the implementation from scratch.

timsort's described by its inventor Tim Peters as adaptive, stable, natural, and modestly named. It's essentially a one-pass iterative binary merge sort with numerous modifications to take advantage of any fortuitously placed order already present in the array. The AppleScript implementation of it here turns out to be a mixed bag speedwise. With lists that are already largely ordered — and I mean largely — and whose items take a relatively long time to compare, it can be up to twice as fast as my implementation of Yaroslavskiy's dual-pivot Quicksort and one-and-a-half times as fast as my own iterative ternary merge sort (see below). But with lists which start out randomly ordered, it can take up to twice as long as the other two sorts — although admittedly even that's not very long.

Although it's a merge sort, timsort uses insertion sorting to create its initial runs. (cf. versions of Quicksort which switch to insertion sorting for partitions below a certain length.) A minimum run length is calculated based on the total number of items being sorted and the comfort zone of the insertion sort. This length goes into the sort range length either exactly a power-of-two times or just under a power-of-two times with or without a large remainder. The idea is that merges throughout the sort will thereby be between runs of similar lengths and thus maximally efficient. This is somewhat academic in light of what follows, but Peters does it anyway and so does this script.

Each insertion sort is preceded by a check for the existing ascending or descending order where an initial run's to begin. This order will possibly be only two items long, but otherwise it's followed for as far as it goes. If it's descending order, the ordered group is simply reversed into ascending order. (Equal values count as ascending order, so stability isn't compromised.) If the group's shorter than the minimum run length, an insertion sort sorts in more items up to the minimum length as if it had arranged the already-ordered items itself. (Insertion sorts are stable too, so again stability's preserved.) Otherwise, the entire group counts as a run. This can lead to somewhat unequal run lengths, but the imbalance tends to be mitigated later in the sort.

As each initial run's created, its range details are stored. Where appropriate, there's then some intermediate merging of existing runs to maintain a situation whereby each run so far is both longer than the one which follows it and longer than the combined lengths of the two which follow it. Once the final initial run has been mixed into this process, the assembled runs are merged from last to first in increasing order of length.

A great inefficiency in merge sorts is that runs have to be either merged to other lists or extracted as lists in their own right and their items merged back. Timsort extracts just the shorter run each time (or the left if they're the same length), which leaves enough wiggle room in the original list to allow merging back to it not to overwrite any pending items in the unextracted "home" run. If it's the left run that's extracted, merging progresses from left to right; otherwise it has to be from right to left.

Merges begin conventionally. In a left-to-right merge, the lesser of the current values from each run (or the one from the left if they're equal) goes into the current merge slot. In a right-to-left merge, the greater value (or from the right if equal) is used. But if either run produces a certain number of "winning" items in a row, timsort switches to "galloping" mode. In this, a binary search is done of each run to see how many items from it must be used before the current item from the other. The relevant items are then moved en masse with hopefully fewer comparisons having been needed. If galloping starts not to pay off, the merge switches back to "normal" mode (which term I fancy is slightly snappier than Peters's "one pair at a time" mode.) Each time the merge stays in galloping mode, the threshold is lowered by one to make it easier to get back into galloping mode later. The threshold is raised by one on each switch back to normal mode. This makes practically no difference at all, but it's a nice touch.  wink

Since there are four possible merge styles — left-to-right or right-to-left in either normal mode or galloping mode — the amount of code required is potentially huge. I've telescoped the left-to-right and right-to-left aspects to a large extent by using variables to control the merge direction. But the left/right significance of comparisons isn't reversible, so references and "plug-in" handlers are used to manage the associations.

Known deviations from Peters's description:
• The minimum run lengths are shorter than Peters recommends because insertion sorting in AppleScript is more expensive than in Python. But Peters's formula has been slightly modified to avoid the lowest coefficient figure it might otherwise produce. So if the script's upper limit of 16 goes into the sort range length exactly a power-of-two times, 16 is the length used, not 8.
• Reciprocally, the shorter minimum run lengths make a "linear" insertion sort preferable to the binary one advocated by Peters.
• Peters allows the last run in the range to be just one item long if things work out that way. This script extends the preceding run instead.
• It's proved too slow in AppleScript to pre-check runs before extracting them in order to leave out items which would just go back to the slots from which they came.
• Similarly, a refinement for reducing the "area of uncertainty" before a binary search adds more time than it saves and has been omitted.
• The refinement for lowering and raising the galloping mode threshold has been kept, but I'm not sure how closely it conforms to the original idea. Peters seems to suggest that the threshold be raised or lowered indefinitely, but I've found it less damaging to performance to restrict the movement to within certain narrow limits.
• It's not clear from the description if galloping mode is supposed to be able to carry over between merges. In this script, it doesn't.

Applescript:

(* Timsort — customisable version

Merge sort algorithm: John von Neumann, 1945.
Insertion sort algorithm: unknown author.
Timsort algorithm: Tim Peters, 2002.
AppleScript implementation: Nigel Garvey, 2018, based on earlier implementations of merge and binary insertion sorts.

Parameters: (the list, range index 1, range index 2, customisation object)
   The range index parameters are integers specifying the range to be sorted in the list. They can be positive or negative and don't need to be in a particular order. The values 1 and -1 can be used to indicate the entire list.
   The customisation object is a record with optional 'comparer' and/or 'slave' properties.
       If given, the 'comparer' value must be a script object containing an isGreater(item1, item2) handler which compares two items passed to it from the list and returns a boolean indicating whether or not the first is "greater than" the second according to its criteria.
       If given, the 'slave' value can be a list of lists in which the moves of the main sort are to be echoed. These lists must of course be long enough to allow the same absolute range indices as for the main list. Alternatively, the value can be a script object containing its own slave handlers and any necessary preset properties. This is for compatibility with scripts written to use earlier versions of my customisable sorts and might conceivably be used for other purposes, such as transposing the sort range in the slave list(s) or counting item movements. The same script object can be used for both the 'comparer' and 'slave' values if it contains both sets of handlers.
       Where the 'comparer' and/or 'slave' properties are omitted, or the customisation parameter isn't a record or a script object, the defaults are direct comparisons and no 'slave' actions. Because of the customisation hooks, a "straight" sort with this code isn't as fast as with the non-customisable version. But subjectively, this isn't noticeable.
       
Result returned: None. The passed lists are sorted in place.
*)

on customTimsort(theList, rangeIndex1, rangeIndex2, customiser)
   script main
       -- Customisation properties.
       property comparer : me
       property slave : me
       property slaveSorting : missing value
       
       -- Main list properties.
       property lst : theList
       property externalRun : missing value
       property rangeStack : {}
       
       -- Properties for the current left and right items from two runs. They'll be accessed both directly and via references in variables labelled homeValue and externalValue.
       property leftValue : missing value
       property rightValue : missing value
       
       -- Properties to be set to the relevant binary search handlers.
       property searchHomeRun : missing value
       property searchExternalRun : missing value
       
       -- Tuning controls.
       property maxMinRun : 16.0
       property maxMinGallop : 7
       property minMinGallop : 4
       
       -- Slave sorting properties.
       property slaveList : missing value
       property slaveExternalRun : missing value
       property multipleSlaveLists : {}
       property multipleSlaveExternalRuns : {}
       property slaveListCount : missing value
       
       (* Timsort process: essentially a one-pass iterative binary merge sort. *)
       on tsrt(rangeStart, rangeEnd)
           -- Calculate the optimum minimum run length for a merge sort of this many items.
           set minimumRunLength to calculateMinimumRunLength(rangeEnd - rangeStart + 1)
           
           (* Starting at the beginning of the sort range:
   Begin each new run by skipping past items which are already in a continuous ascending or descending order until the order breaks or the end of the range is reached.
   If the skipped items are in descending order, reverse them.
   If they number the minimum run length or more, count them all as a run.
   Otherwise include more items up to the minimum run length (or to the end of the sort range if sooner) and "complete" an insertion sort of the group.
   Stack the run's range details ({start index, length}).
   Merge existing runs as necessary so that each run existing so far is both longer than the one which follows it and longer than the combined lengths of the two which follow it.
   Begin another new run at the next item in the sort range and repeat the above until the end of the range is reached.
   Merge the resulting runs in increasing order of length.
           *)

           set runStart to rangeStart
           set runCount to 0
           repeat while (runStart < rangeEnd) -- For each new run.
               set previousValue to my lst's item (runStart + 1) -- This *will* be the "previous" value in a moment!
               set k to runStart + 2
               -- Test the order of the first two items here. (Equality = ascending order.)
               if (my comparer's isGreater(my lst's item runStart, previousValue)) then
                   -- If it's descending order, follow it for as far as it goes, then reverse the group.
                   repeat until (k > rangeEnd)
                       set thisValue to my lst's item k
                       if (my comparer's isGreater(previousValue, thisValue)) then
                           set previousValue to thisValue
                           set k to k + 1
                       else
                           exit repeat
                       end if
                   end repeat
                   -- Reverse the group into ascending order.
                   set l to runStart
                   set r to k - 1
                   if (slaveSorting) then slave's |reverse|(l, r)
                   repeat while (l < r)
                       tell my lst's item l
                           set my lst's item l to my lst's item r
                           set my lst's item r to it
                       end tell
                       set l to l + 1
                       set r to r - 1
                   end repeat
               else
                   -- Otherwise just follow the ascending order for as far as it goes.
                   repeat until (k > rangeEnd)
                       set thisValue to my lst's item k
                       if (my comparer's isGreater(previousValue, thisValue)) then exit repeat
                       set previousValue to thisValue
                       set k to k + 1
                   end repeat
               end if
               
               -- Make the run the calculated mininum length unless it would end thereby within the initial order, beyond the sort range, or just one item from the end of the range. Where relevant, "complete" an insertion sort of it from where the order breaks.
               set runEnd to runStart + minimumRunLength - 1
               if (runEnd < k) then
                   -- Within the initial order. Count the entire ordered group as the run.
                   set runEnd to k - 1
               else
                   -- If beyond the end of the sort range, or only one item from the end, end the run at the end of the range.
                   if (runEnd + 2 > rangeEnd) then set runEnd to rangeEnd
                   -- Complete an insertion sort from k.
                   isrt(runStart, k, runEnd)
               end if
               -- Store the run's start index and length on the range stack.
               set end of my rangeStack to {runStart, runEnd - runStart + 1}
               set runCount to runCount + 1
               
               -- Merge existing runs as necessary to maintain "invariants" (a Python term?) whereby each run is longer than the one which follows it and longer than the combined lengths of the two which follow it.
               repeat -- until the invariants are satisfied and in equilibrium.
                   set endRunLength to my rangeStack's end's end
                   if ((runCount < 3) or (my rangeStack's item -3's end > ((my rangeStack's item -2's end) + endRunLength))) then
                       -- A third-from-last run either doesn't exist or is longer than the last two put together.
                       -- If a second-from-last doesn't exist either, or is longer than the last, the invariants are met.
                       if ((runCount < 2) or (my rangeStack's item -2's end > endRunLength)) then exit repeat
                       -- Otherwise prepare to merge the second-from-last and last runs.
                       set stackPointer to -1
                   else if (my rangeStack's item -3's end < endRunLength) then
                       -- A third-from-last run exists and is not longer than the last two put together and is in fact shorter than the last. Prepare to merge it and the second-from-last.
                       set stackPointer to -2
                   else
                       -- A third-from-last run exists and is not longer than the last two put together, but is not shorter than the last. Prepare to merge the second-from-last and last.
                       set stackPointer to -1
                   end if
                   -- Merge the selected runs.
                   set leftRunRange to my rangeStack's item (stackPointer - 1)
                   set rightRunRange to my rangeStack's item stackPointer
                   merge(leftRunRange, rightRunRange)
                   -- Increase the left range's length value to that of the now merged runs and remove the redundant right range from the stack.
                   set leftRunRange's item 2 to (leftRunRange's end) + (rightRunRange's end)
                   set my rangeStack's item stackPointer to missing value
                   set my rangeStack to my rangeStack's lists
                   set runCount to runCount - 1
               end repeat
               
               -- Prepare to start another run.
               set runStart to runEnd + 1
           end repeat
           
           -- When we have a full, invariant-conforming set of runs, merge them serially in increasing order of length.
           repeat with stackPointer from runCount to 2 by -1
               set leftRunRange to my rangeStack's item (stackPointer - 1)
               set rightRunRange to my rangeStack's item stackPointer
               merge(leftRunRange, rightRunRange)
               set leftRunRange's item 2 to (leftRunRange's end) + (rightRunRange's end)
           end repeat
       end tsrt
       
       (* Return the optimum minimum run length for a merge sort of n items. It should go into n either exactly a power of 2 times or just under a power of 2 times with or without a large remainder. Peters reckons n itself if < 64, otherwise a length between 32 and 64. This script performs best with lengths ≤ 16 or between 9 and 16. *)
       on calculateMinimumRunLength(n)
           -- Keep halving n while the result's > maxMinRun (16) then round up if it's not a whole number.
           repeat while (n > maxMinRun)
               set n to n / 2.0
           end repeat
           set minimumRunLength to n div 1
           if (minimumRunLength < n) then set minimumRunLength to minimumRunLength + 1
           
           return minimumRunLength
       end calculateMinimumRunLength
       
       (* Complete an insertion sort of a range where items rangeStart thru (k - 1) are known to be in order already. Peters recommends a binary insertion sort, but a "linear" one's faster with the shorter minimum run lengths in this script. *)
       on isrt(rangeStart, k, rangeEnd)
           set highestValueSoFar to my lst's item (k - 1)
           repeat with k from k to rangeEnd
               set currentValue to my lst's item k
               if (my comparer's isGreater(highestValueSoFar, currentValue)) then
                   repeat with insertionIndex from (k - 2) to rangeStart by -1
                       tell my lst's item insertionIndex
                           if (my comparer's isGreater(it, currentValue)) then
                               set my lst's item (insertionIndex + 1) to it
                           else
                               set insertionIndex to insertionIndex + 1
                               exit repeat
                           end if
                       end tell
                   end repeat
                   set my lst's item insertionIndex to currentValue
                   if (slaveSorting) then slave's rotate(insertionIndex, k)
               else
                   set my lst's item (k - 1) to highestValueSoFar
                   set highestValueSoFar to currentValue
               end if
           end repeat
           set my lst's item rangeEnd to highestValueSoFar
       end isrt
       
       (* Decide how to merge two runs (if at all) and do it. *)
       on merge(leftRunRange, rightRunRange)
           -- Merging's unnecessary if the right run follows on from the left anyway.
           set rightRunStart to rightRunRange's beginning
           set leftRunEnd to rightRunStart - 1
           set leftValue to my lst's item leftRunEnd
           set rightValue to my lst's item rightRunStart
           if (my comparer's isGreater(leftValue, rightValue)) then
               (* Merge required. *)
               -- Firstly compare the runs' lengths to identify the shorter and thus the merge direction required.
               set leftLength to leftRunRange's end
               set rightRunLength to rightRunRange's end
               set mergingRightToLeft to (leftLength > rightRunLength) -- True if the right run's shorter, otherwise merge from left to right.
               -- The same code's used for both left-to-right and right-to-left merges. Set variables to control the direction.
               if (mergingRightToLeft) then
                   -- Get the right run's items as a separate list, initialise traversal and limit indices for it, and get its last item.
                   set rightRunEnd to leftRunEnd + rightRunLength
                   set externalRun to my lst's items rightRunStart thru rightRunEnd
                   if (slaveSorting) then slave's extract(rightRunStart, rightRunEnd)
                   set externalIndex to rightRunLength
                   set externalLimit to 1
                   set rightValue to my externalRun's end
                   -- Set a reference to associate being an external-run value with being a right-run value.
                   set externalValue to a reference to rightValue
                   -- Initialise traversal and limit indices for the left run in the main list. We already have its last item.
                   set homeIndex to leftRunEnd
                   set homeLimit to leftRunRange's beginning
                   -- Set a reference to associate being a home-run value with being a left-run value.                    
                   set homeValue to a reference to leftValue
                   -- Plug in the relevant binary search handlers for a right-to-left merge,
                   set searchExternalRun to my firstSlotWGEV
                   set searchHomeRun to my firstSlotWGV
                   -- Use a negative increment value.
                   set step to -1
                   -- Initialise the merge traversal index to just beyond the merge range.
                   set k to rightRunEnd + 1
               else
                   -- Get the left run's items as a separate list, initialise traversal and limit indices for it, and get its first item
                   set leftRunStart to leftRunRange's beginning
                   set externalRun to my lst's items leftRunStart thru leftRunEnd
                   if (slaveSorting) then slave's extract(leftRunStart, leftRunEnd)
                   set externalIndex to 1
                   set externalLimit to leftLength
                   set leftValue to my externalRun's beginning
                   -- Set a reference to associate being an external-run value with being a left-run value.
                   set externalValue to a reference to leftValue
                   -- Initialise traversal and limit indices for the right run in the main list. We already have its first item
                   set homeIndex to rightRunStart
                   set homeLimit to leftRunEnd + rightRunLength
                   -- Set a reference to associate being a home-run value with being a right-run value.                    
                   set homeValue to a reference to rightValue
                   -- Plug in the relevant binary search handlers for a left-to-right merge.
                   set searchExternalRun to my lastSlotWLEV
                   set searchHomeRun to my lastSlotWLV
                   -- Use a positive increment value.
                   set step to 1
                   -- Initialise the merge traversal index to just before the merge range.
                   set k to leftRunStart - 1
               end if
               
               -- Begin the merge normally, switching to "galloping mode" if a "streak" (my term for the number of times in succession that a run supplies the value used) reaches the minGallop threshold and switching back again if both streaks drop below the threshold.
               set homeStreak to 0
               set externalStreak to 0
               set minGallop to maxMinGallop
               set normalMode to true
               repeat
                   set k to k + step
                   if (normalMode) then
                       (* If the current value from the left run is greater than that from the right, move the one from the "home" run to the next merge slot and get the next value from the home run. Otherwise use the value from the external run and get the next value from there. Finish when either run is used up, otherwise update the streaks and switch to galloping mode if appropriate. *)
                       if (my comparer's isGreater(leftValue, rightValue)) then
                           set my lst's item k to homeValue's contents
                           if (slaveSorting) then slave's reposition(homeIndex, k)
                           if (homeIndex = homeLimit) then
                               set k to k + step
                               if (slaveSorting) then slave's massPlace(externalIndex, externalLimit, k, step)
                               set my lst's item k to externalValue's contents
                               repeat with externalIndex from externalIndex + step to externalLimit by step
                                   set k to k + step
                                   set my lst's item k to my externalRun's item externalIndex
                               end repeat
                               exit repeat
                           end if
                           set homeStreak to homeStreak + 1
                           set externalStreak to 0
                           set normalMode to (homeStreak < minGallop) -- Galloping mode if false.
                           set homeIndex to homeIndex + step
                           set homeValue's contents to my lst's item homeIndex
                       else
                           set my lst's item k to externalValue's contents
                           if (slaveSorting) then slave's place(externalIndex, k)
                           if (externalIndex = externalLimit) then exit repeat
                           set homeStreak to 0
                           set externalStreak to externalStreak + 1
                           set normalMode to (externalStreak < minGallop) -- Galloping mode if false.
                           set externalIndex to externalIndex + step
                           set externalValue's contents to my externalRun's item externalIndex
                       end if
                   else
                       (* Galloping mode. Do a binary search of the remaining items in the "home" run for the last which must be used before the current item from the external run. If any, use all the home run items up to that point and get the next (if any) as the current one. Then search the remaining items in the external run for the last to use before the current home run item and use the relevant items from there. Update streaks and mode as appropriate. *)
                       -- Search the home run:
                       set x to searchHomeRun(externalValue's contents, homeIndex, homeLimit)
                       if (x > 0) then
                           -- Move the home run values(s), up to and including the one at the returned index, to the next merge slot(s).
                           set homeStreak to (x - homeIndex) * step + 1
                           if (slaveSorting) then slave's massReposition(homeIndex, x, k, step)
                           set my lst's item k to homeValue's contents
                           repeat with homeIndex from homeIndex + step to x by step
                               set k to k + step
                               set my lst's item k to my lst's item homeIndex
                           end repeat
                           set k to k + step
                           -- If the home run's used up, place the remaining external run item(s) and exit the merge.
                           if (homeIndex = homeLimit) then
                               if (slaveSorting) then slave's massPlace(externalIndex, externalLimit, k, step)
                               set my lst's item k to externalValue's contents
                               repeat with externalIndex from externalIndex + step to externalLimit by step
                                   set k to k + step
                                   set my lst's item k to my externalRun's item externalIndex
                               end repeat
                               exit repeat
                           end if
                           -- Otherwise get the next value from the home run as the current home run value.
                           set homeIndex to homeIndex + step
                           set homeValue's contents to my lst's item homeIndex
                       else
                           -- No hits in the home run.
                           set homeStreak to 0
                       end if
                       
                       -- Search the external run. Coming after the previous search, there's always at least one hit:
                       set x to searchExternalRun(homeValue's contents, externalIndex, externalLimit)
                       -- Move the external run values(s), up to and including the one at the returned index, to the next merge slot(s) …
                       set externalStreak to (x - externalIndex) * step + 1
                       if (slaveSorting) then slave's massPlace(externalIndex, x, k, step)
                       set my lst's item k to externalValue's contents
                       repeat with externalIndex from externalIndex + step to x by step
                           set k to k + step
                           set my lst's item k to my externalRun's item externalIndex
                       end repeat
                       -- … followed by the current home run value.
                       set k to k + step
                       if (slaveSorting) then slave's reposition(homeIndex, k)
                       set my lst's item k to homeValue's contents
                       
                       -- If either run is now used up, exit the merge. Otherwise get the next value from each.
                       if (externalIndex = externalLimit) then exit repeat
                       set externalIndex to externalIndex + step
                       set externalValue's contents to my externalRun's item externalIndex
                       
                       if (homeIndex = homeLimit) then
                           set k to k + step
                           if (slaveSorting) then slave's massPlace(externalIndex, externalLimit, k, step)
                           set my lst's item k to externalValue's contents
                           repeat with externalIndex from externalIndex + step to externalLimit by step
                               set k to k + step
                               set my lst's item k to my externalRun's item externalIndex
                           end repeat
                           exit repeat
                       end if
                       set homeIndex to homeIndex + step
                       set homeValue's contents to my lst's item homeIndex
                       
                       -- If both streaks were less than the galloping mode threshold, switch back to normal mode and raise the threshold. Otherwise lower it.
                       if ((externalStreak < minGallop) and (homeStreak < minGallop)) then
                           set externalStreak to 0
                           set homeStreak to 1
                           set normalMode to true
                           if (minGallop < maxMinGallop) then set minGallop to minGallop + 1
                       else
                           if (minGallop > minMinGallop) then set minGallop to minGallop - 1
                       end if
                   end if
               end repeat
           end if
       end merge
       
       (* Binary search handlers. *)
       -- Get the index of the last value in range l thru r of the home run that's less than the one given. 0 if none.
       on lastSlotWLV(value, l, r)
           if (comparer's isGreater(value, my lst's item l)) then
               -- At least one lesser value.
               repeat until (l = r)
                   set m to r - (r - l) div 2
                   if (comparer's isGreater(value, my lst's item m)) then
                       set l to m
                   else
                       set r to m - 1
                   end if
               end repeat
               return r
           else
               return 0
           end if
       end lastSlotWLV
       
       -- Get the index of the last value in range l thru r of the external run that's less than or equal to the one given. Never none in this script.
       on lastSlotWLEV(value, l, r)
           (*if comparer's isGreater(my externalRun's item l, value)) then
               return 0
           else*)

           -- At least one lesser or equal value.
           repeat until (l = r)
               set m to r - (r - l) div 2
               if (comparer's isGreater(my externalRun's item m, value)) then
                   set r to m - 1
               else
                   set l to m
               end if
           end repeat
           return l
           --end if
       end lastSlotWLEV
       
       -- Get the index of the first value in range l thru r of the home run (ie. last from the right in a right-to-left merge) that's greater than the one given. 0 if none.
       on firstSlotWGV(value, r, l)
           if (comparer's isGreater(my lst's item r, value)) then
               -- At least one greater value.
               repeat until (l = r)
                   set m to (l + r) div 2 -- = l + (r - l) div 2.
                   if (comparer's isGreater(my lst's item m, value)) then
                       set r to m
                   else
                       set l to m + 1
                   end if
               end repeat
               return l
           else
               return 0
           end if
       end firstSlotWGV
       
       -- Get the index of the first value in range l thru r of the external run (ie. last from the right in a right-to-left merge) that's greater than or equal to the one given. Never none in this script.
       on firstSlotWGEV(value, r, l)
           (*if (comparer's isGreater(value, my externalRun's item r)) then
               return 0
           else*)

           -- At least one greater or equal value.
           repeat until (l = r)
               set m to (l + r) div 2
               if (comparer's isGreater(value, my externalRun's item m)) then
                   set l to m + 1
               else
                   set r to m
               end if
           end repeat
           return r
           --end if
       end firstSlotWGEV
       
       (* Default comparison handler. *)
       on isGreater(a, b)
           return (a > b)
       end isGreater
       
       (* Default slave handlers. *)
       -- Extract a "run" from the slave list.
       on extract(a, b)
           set my slaveExternalRun to my slaveList's items a thru b
       end extract
       
       -- Set the slave list's item b to its extracted run's item a.
       on place(a, b)
           set my slaveList's item b to my slaveExternalRun's item a
       end place
       
       -- Set the slave list's item b to its item a.
       on reposition(a, b)
           set my slaveList's item b to my slaveList's item a
       end reposition
       
       -- Set the slave list's items c-onwards to its extracted run's items a thru b in the direction governed by step (1 or -1).
       on massPlace(a, b, c, step)
           repeat with a from a to b by step
               set my slaveList's item c to my slaveExternalRun's item a
               set c to c + step
           end repeat
       end massPlace
       
       -- Set the slave list's items c-onwards to its items a thru b in the direction governed by step (1 or -1).
       on massReposition(a, b, c, step)
           repeat with a from a to b by step
               set my slaveList's item c to my slaveList's item a
               set c to c + step
           end repeat
       end massReposition
       
       -- Reverse the order of the slave list's items a thru b.
       on |reverse|(a, b)
           repeat ((b - a + 1) div 2) times
               tell my slaveList's item a
                   set my slaveList's item a to my slaveList's item b
                   set my slaveList's item b to it
               end tell
               set a to a + 1
               set b to b - 1
           end repeat
       end |reverse|
       
       -- Rotate item b of the slave list down to slot a.
       on rotate(a, b)
           tell my slaveList's item b
               repeat with rotationIndex from (b - 1) to a by -1
                   set my slaveList's item (rotationIndex + 1) to my slaveList's item rotationIndex
               end repeat
               set my slaveList's item a to it
           end tell
       end rotate
       
       (* Alternative slave handlers for multiple slave lists. *)
       on extractMultiple(a, b)
           set my multipleSlaveExternalRuns to {}
           repeat with i from 1 to slaveListCount
               set end of my multipleSlaveExternalRuns to my multipleSlaveLists's item i's items a thru b
           end repeat
       end extractMultiple
       
       on placeMultiple(a, b)
           repeat with i from 1 to slaveListCount
               set my multipleSlaveLists's item i's item b to my multipleSlaveExternalRuns's item i's item a
           end repeat
       end placeMultiple
       
       on repositionMultiple(a, b)
           repeat with i from 1 to slaveListCount
               set my multipleSlaveLists's item i's item b to my multipleSlaveLists's item i's item a
           end repeat
       end repositionMultiple
       
       on massPlaceMultiple(a, b, c, step)
           repeat with i from 1 to slaveListCount
               set k to c
               repeat with j from a to b by step
                   set my multipleSlaveLists's item i's item k to my multipleSlaveExternalRuns's item i's item j
                   set k to k + step
               end repeat
           end repeat
       end massPlaceMultiple
       
       on massRepositionMultiple(a, b, c, step)
           repeat with i from 1 to slaveListCount
               set k to c
               repeat with j from a to b by step
                   set my multipleSlaveLists's item i's item k to my multipleSlaveLists's item i's item j
                   set k to k + step
               end repeat
           end repeat
       end massRepositionMultiple
       
       on reverseMultiple(a, b)
           repeat with i from 1 to slaveListCount
               set l to a
               set r to b
               repeat ((r - l + 1) div 2) times
                   tell my multipleSlaveLists's item i's item l
                       set my multipleSlaveLists's item i's item l to my multipleSlaveLists's item i's item r
                       set my multipleSlaveLists's item i's item r to it
                   end tell
                   set l to l + 1
                   set r to r - 1
               end repeat
           end repeat
       end reverseMultiple
       
       on rotateMultiple(a, b)
           repeat with i from 1 to slaveListCount
               tell my multipleSlaveLists's item i's item b
                   repeat with rotationIndex from (b - 1) to a by -1
                       set my multipleSlaveLists's item i's item (rotationIndex + 1) to my multipleSlaveLists's item i's item rotationIndex
                   end repeat
                   set my multipleSlaveLists's item i's item a to it
               end tell
           end repeat
       end rotateMultiple
   end script
   
   (* Process the input parameters. *)
   set listLen to (count theList)
   if (listLen < 2) then return
   -- Negative and/or transposed range indices.
   if (rangeIndex1 < 0) then set rangeIndex1 to listLen + rangeIndex1 + 1
   if (rangeIndex2 < 0) then set rangeIndex2 to listLen + rangeIndex2 + 1
   if (rangeIndex1 > rangeIndex2) then set {rangeIndex1, rangeIndex2} to {rangeIndex2, rangeIndex1}
   
   (* If the sort range contains more than one item, set up the customisation and do the sort. *)
   if (rangeIndex2 > rangeIndex1) then
       if ((customiser is {}) or (customiser's class is record)) then
           -- Use the passed or default comparer script. Get the passed or default slave parameter.
           set {comparer:main's comparer, slave:slaveParam} to customiser & {comparer:main, slave:main}
           if (slaveParam's class is script) then
               -- Passed or default slave script. Use it.
               set main's slave to slaveParam
               set main's slaveSorting to (slaveParam is not main)
           else if (slaveParam's class is list) then
               -- Passed list of slave lists. Set the default 'slave' script object's multipleSlaveLists property to it.
               set main's multipleSlaveLists to slaveParam
               -- Configure the default 'slave' script object to use the best handlers for the number of slave lists. This makes no difference with a non-default slave.
               set main's slaveListCount to (count main's multipleSlaveLists)
               set main's slaveSorting to (main's slaveListCount > 0)
               if (main's slaveSorting) then
                   if (main's slaveListCount is 1) then
                       set main's slaveList to beginning of main's multipleSlaveLists
                   else
                       set main's extract to main's extractMultiple
                       set main's place to main's placeMultiple
                       set main's reposition to main's repositionMultiple
                       set main's massPlace to main's massPlaceMultiple
                       set main's massReposition to main's massRepositionMultiple
                       set main's |reverse| to main's reverseMultiple
                       set main's rotate to main's rotateMultiple
                   end if
               end if
           end if
       end if
       
       tell main to tsrt(rangeIndex1, rangeIndex2)
   end if
   
   return -- nothing.
end customTimsort

property sort : customTimsort


NG

Offline

 

#3 2018-12-02 02:46:45 pm

Nigel Garvey
Moderator
From:: Warwickshire, England
Registered: 2002-11-20
Posts: 4741

Re: timsort and nigsort

This is a ternary merge sort of my own devising from a few years ago. It was my fastest sort until deposed by my dual-pivot Quicksort implementation earlier this year. But it's now back in contention thanks to the inclusion of a ternary version of timsort's minimum-run-length calculation and an idea I had recently for reducing the number of times the list items are compared. It's now roughly the same speed as the dual-pivot Quicksort when sorting randomly ordered lists — usually slightly faster, but sometimes not quite as fast — and up to twice as fast as timsort. With lists that are already largely ordered, it's intermediate in speed between the slower dual-pivot Quicksort and the faster timsort. Being a merge sort, it's also stable. So for general use, it's pretty all right!

It's a ternary sort (runs are merged three at a time instead of two) and iterative (ie. it has repeats rather than recursion). Up till now, I've never been sure whether to call it a ternary iterative merge sort or an iterative ternary merge sort. But since the former abbreviates to "timsort", which would never do, I've finally settled on the latter — although maybe "nigsort" would be less of a mouthful….  roll

It's remarkably simple compared with timsort, despite its ternary complexity. Instead of individual runs being extracted as separate lists to be merged back to the original, the entire sort range is extracted just once and runs are merged back and forth between the two lists on alternate passes, with the last pass always merging back to the original list. So fewer additional lists are generated (!), fewer items are extracted to them, and each item only travels one way per merge. No time's wasted looking for possibly non-existent order to exploit, although of course any which is there consequently won't be actively exploited. Run lengths are fixed and optimal, so they don't need to be stored, retrieved, or updated and merges are guaranteed to be balanced thoughout the sort.

Applescript:

(* Ternary merge sort — custom iterative version (or Iterative merge sort — custom ternary version)
Merge sort algorithm: John von Neumann, 1945.
AppleScript implementation: Nigel Garvey, 2007. Ternary iterative version 2012 & 2018.

Parameters: (the list, range index 1, range index 2, customisation object)
   The range index parameters are integers specifying the range to be sorted in the list. They can be positive or negative and don't need to be in a particular order. The values 1 and -1 can be used to indicate the entire list.
   The customisation object is a record with optional 'comparer' and/or 'slave' properties.
       If given, the 'comparer' value must be a script object containing an isGreater(item1, item2) handler which compares two items passed to it from the list and returns a boolean indicating whether or not the first is "greater than" the second according to its criteria.
       If given, the 'slave' value can be a list of lists in which the moves of the main sort are to be echoed. These lists must of course be long enough to allow the same absolute range indices as for the main list. Alternatively, the value can be a script object containing its own slave handlers and any necessary preset properties. This is for compatibility with scripts written to use earlier versions of my customisable sorts and might conceivably be used for other purposes, such as transposing the sort range in the slave list(s) or counting item movements. The same script object can be used for both the 'comparer' and 'slave' values if it contains both sets of handlers.
       Where the 'comparer' and/or 'slave' properties are omitted, or the customisation parameter isn't a record or a script object, the defaults are direct comparisons and no 'slave' actions. Because of the customisation hooks, a "straight" sort with this code isn't as fast as with the non-customisable version. But subjectively, this isn't noticeable.
       
Result returned: None. The passed lists are sorted in place.
*)


on customIterativeTernaryMergeSort(theList, rangeIndex1, rangeIndex2, customiser)
   -- Auxiliary script object. Alternates with the main one below as the source and destination for merges. Its list(s) will only be as long as the sort range, so it has its own range indices.
   script aux
       property lst : missing value
       property rangeStart : missing value
       property rangeEnd : missing value
       
       property slaveList : missing value
       property multipleSlaveLists : missing value
   end script
   
   -- Main script object. Contains the sort code, the list being sorted and its sort range indices, any slave lists being sorted in-house, and the default comparison and slave handlers.
   script main
       property comparer : me
       property slave : me
       
       property source : missing value
       property destination : missing value
       
       property lst : theList
       property rangeStart : missing value
       property rangeEnd : missing value
       property slaveSorting : missing value
       
       property slaveList : missing value
       property multipleSlaveLists : missing value
       property slaveListCount : missing value
       
       property maxMinRun : 9.0
       
       on nsrt(rangeStart, rangeEnd)
           set rangeLength to rangeEnd - rangeStart + 1
           -- Calculate the optimum initial run length for a ternary merge sort of this many items. Also the number of merging passes that will needed.
           set {initialRunLength, mergingPassesNeeded} to calculateInitialRunLengthAndMergingPassesNeeded(rangeLength)
           
           -- Use insertion sorting to set up initial runs of the initial length.
           set diff to initialRunLength - 1
           repeat with runStart from rangeStart to rangeEnd - 1 by initialRunLength
               set runEnd to runStart + diff
               if (runEnd > rangeEnd) then set runEnd to rangeEnd
               isrt(runStart, runEnd)
           end repeat
           -- If the sort range isn't longer than the initial run length, it's now sorted!
           if (mergingPassesNeeded is 0) then return
           
           -- Set the auxiliary script object's lst to the partly sorted range of the main list and set relevant range indices for both scripts. Items will be merged back and forth between the two lists on alternate passes.
           set aux's lst to my lst's items rangeStart thru rangeEnd
           if (slaveSorting) then slave's extract(rangeStart, rangeEnd)
           set aux's rangeStart to 1
           set aux's rangeEnd to rangeLength
           set my rangeStart to rangeStart
           set my rangeEnd to rangeEnd
           
           -- Set the initial source and destination scripts so that the final pass will merge back to the original list.
           set {source, destination} to item (mergingPassesNeeded mod 2 + 1) of {{me, aux}, {aux, me}}
           
           -- Perform the merge passes, each merging groups of three runs from the source list into longer runs in the destination list. The last group in each pass will often have a shorter last run and/or fewer than three runs.
           set groupLength to initialRunLength
           repeat mergingPassesNeeded times
               set runLength to groupLength
               set groupLength to runLength * 3
               
               -- Traverse the source range a group at a time.
               set k to (destination's rangeStart) - 1 -- Merge destination index.
               repeat with leftIndex from source's rangeStart to source's rangeEnd by groupLength
                   set destinationEnd to k + groupLength -- Merge destination end index.
                   if (destinationEnd > destination's rangeEnd) then set destinationEnd to destination's rangeEnd
                   
                   set leftRunEnd to leftIndex + runLength - 1
                   if (leftRunEnd < source's rangeEnd) then
                       -- Two or three runs in this group.
                       set middleRunEnd to leftRunEnd + runLength
                       if (middleRunEnd < source's rangeEnd) then
                           -- Three runs, in fact. Set up for a ternary merge.
                           set middleIndex to leftIndex + runLength
                           set rightIndex to middleIndex + runLength
                           set rightRunEnd to middleRunEnd + runLength
                           if (rightRunEnd > source's rangeEnd) then set rightRunEnd to source's rangeEnd
                           -- Get the first (lowest) value from each run.
                           set leftValue to source's lst's item leftIndex
                           set middleValue to source's lst's item middleIndex
                           set rightValue to source's lst's item rightIndex
                           -- Note the relationships between them. Updating these only as required saves recomparing items unnecessarily in the repeat.
                           set |lv>rv| to (comparer's isGreater(leftValue, rightValue))
                           set |mv>rv| to (comparer's isGreater(middleValue, rightValue))
                           set |lv>mv| to (comparer's isGreater(leftValue, middleValue))
                           repeat with k from (k + 1) to destinationEnd
                               if (|lv>rv|) then
                                   if (|mv>rv|) then
                                       -- The right value's the lowest. Assign it to the destination slot.
                                       set destination's lst's item k to rightValue
                                       if (slaveSorting) then slave's place(rightIndex, k)
                                       -- If no more right-run values, recast the middle run as the right and exit to the binary-merge repeat below.
                                       if (rightIndex = rightRunEnd) then
                                           set rightIndex to middleIndex
                                           set rightValue to middleValue
                                           set rightRunEnd to middleRunEnd
                                           exit repeat
                                       end if
                                       -- Otherwise get the next right-run value.
                                       set rightIndex to rightIndex + 1
                                       set rightValue to source's lst's item rightIndex
                                       -- Note if the middle-run value's greater than it.
                                       set |mv>rv| to (comparer's isGreater(middleValue, rightValue))
                                       -- If so and the left-run value's > the middle-run value, the left-run value must also be > the right-run value.
                                       -- If not and the left-run value's not > the middle-run value, the left-run value can't be > the right-run value.
                                       -- Otherwise actually compare the left- and right-run values.
                                       if ((|mv>rv|) = (|lv>mv|)) then
                                           set |lv>rv| to (|mv>rv|)
                                       else
                                           set |lv>rv| to (comparer's isGreater(leftValue, rightValue))
                                       end if
                                   else
                                       -- The middle value's the lowest or co-lowest with the right. Assign it to the destination slot.
                                       set destination's lst's item k to middleValue
                                       if (slaveSorting) then slave's place(middleIndex, k)
                                       -- If no more middle-run values, exit to the "two runs" repeat.
                                       if (middleIndex = middleRunEnd) then exit repeat
                                       -- Otherwise get the next middle-run value.
                                       set middleIndex to middleIndex + 1
                                       set middleValue to source's lst's item middleIndex
                                       -- Note if it's greater than the right-run value.
                                       set |mv>rv| to (comparer's isGreater(middleValue, rightValue))
                                       -- If not, no relationships have changed. Otherwise compare the left- and middle-run values.
                                       if (|mv>rv|) then set |lv>mv| to (comparer's isGreater(leftValue, middleValue))
                                   end if
                               else if (|lv>mv|) then
                                   -- The middle value's the lowest. Assign it to the destination slot.
                                   set destination's lst's item k to middleValue
                                   if (slaveSorting) then slave's place(middleIndex, k)
                                   -- If no more middle-run values, exit to the "two runs" repeat.
                                   if (middleIndex = middleRunEnd) then exit repeat
                                   -- Otherwise get the next middle-run value.
                                   set middleIndex to middleIndex + 1
                                   set middleValue to source's lst's item middleIndex
                                   -- Note if the left-run value's greater than it.
                                   set |lv>mv| to (comparer's isGreater(leftValue, middleValue))
                                   -- If so, no relationships have changed. Otherwise compare the middle- and right-run values.
                                   if (|lv>mv|) then
                                   else
                                       set |mv>rv| to (comparer's isGreater(middleValue, rightValue))
                                   end if
                               else
                                   -- The left value's the lowest or co-lowest. Assign it to the destination slot.
                                   set destination's lst's item k to leftValue
                                   if (slaveSorting) then slave's place(leftIndex, k)
                                   -- If no more left-run values, recast the middle run as the left and exit to the "two runs" repeat.
                                   if (leftIndex = leftRunEnd) then
                                       set leftIndex to middleIndex
                                       set leftValue to middleValue
                                       set leftRunEnd to middleRunEnd
                                       exit repeat
                                   end if
                                   -- Otherwise get the next left-run value
                                   set leftIndex to leftIndex + 1
                                   set leftValue to source's lst's item leftIndex
                                   -- Note if it's greater than the middle-run value.
                                   set |lv>mv| to (comparer's isGreater(leftValue, middleValue))
                                   -- If so and and the middle-run value's > the right-run value, the left-run value must also be > the right-run value.
                                   -- If not and the middle-run value's not > the right-run value, the left-run value can't be > the right-run value.
                                   -- Otherwise actually compare the left- and right-run values.
                                   if ((|lv>mv|) = (|mv>rv|)) then
                                       set |lv>rv| to (|lv>mv|)
                                   else
                                       set |lv>rv| to (comparer's isGreater(leftValue, rightValue))
                                   end if
                               end if
                           end repeat
                       else
                           -- Only two runs in this group. Set up for a binary merge.
                           set rightIndex to leftIndex + runLength
                           set rightRunEnd to middleRunEnd
                           if (rightRunEnd > source's rangeEnd) then set rightRunEnd to source's rangeEnd
                           set leftValue to source's lst's item leftIndex
                           set rightValue to source's lst's item rightIndex
                       end if
                       
                       -- Do a binary merge of the (remaining) two runs.
                       repeat with k from (k + 1) to destinationEnd
                           if (comparer's isGreater(leftValue, rightValue)) then
                               -- The right value's less than the left. Assign it to the destination slot.
                               set destination's lst's item k to rightValue
                               if (slaveSorting) then slave's place(rightIndex, k)
                               -- If no more right-run values, exit to the "one run" repeat below.
                               if (rightIndex = rightRunEnd) then exit repeat
                               -- Otherwise get the next right-run value.
                               set rightIndex to rightIndex + 1
                               set rightValue to source's lst's item rightIndex
                           else
                               -- The left value's less than or equal to the right. Assign it to the destination slot.
                               set destination's lst's item k to leftValue
                               if (slaveSorting) then slave's place(leftIndex, k)
                               -- If no more left-run values, recast the right run as the left and exit to the "one run" repeat.
                               if (leftIndex = leftRunEnd) then
                                   set leftIndex to rightIndex
                                   exit repeat
                               end if
                               -- Otherwise get the next left-run value.
                               set leftIndex to leftIndex + 1
                               set leftValue to source's lst's item leftIndex
                           end if
                       end repeat
                   end if
                   
                   -- Copy over the (remaining) one run as is.
                   if (slaveSorting) then slave's massPlace(leftIndex, k + 1, destinationEnd)
                   repeat with k from (k + 1) to destinationEnd
                       set destination's lst's item k to source's lst's item leftIndex
                       set leftIndex to leftIndex + 1
                   end repeat
               end repeat
               
               -- Swap the script objects' source and destination roles for the next pass.
               tell source
                   set source to destination
                   set destination to it
               end tell
               if (slaveSorting) then slave's switch()
           end repeat
       end nsrt
       
       (* Return the optimum initial run length for a ternary merge sort of this many items. Also the number of merging passes that will be needed. *)
       on calculateInitialRunLengthAndMergingPassesNeeded(n)
           -- The length is n itself if ≤ maxMinRun, otherwise a figure between (maxMinRun / 3 + 1) and maxMinRun which goes into n exactly a power-of-3 times or just under a power-of-3 times with or without a large remainder. For this, keep dividing by 3 while the result's > maxMinRun then round up if it's not a whole number. The number of divisions performed is also the number of merging passes that will be needed.
           set mergingPassesNeeded to 0
           repeat while (n > maxMinRun)
               set n to n / 3.0
               set mergingPassesNeeded to mergingPassesNeeded + 1
           end repeat
           set initialRunLength to n div 1
           if (initialRunLength < n) then set initialRunLength to initialRunLength + 1
           
           return {initialRunLength, mergingPassesNeeded}
       end calculateInitialRunLengthAndMergingPassesNeeded
       
       (* Insertion sort handler. *)
       on isrt(rangeLeft, rangeRight)
           -- Presort the first two items to set up a minor optimisation whereby the most recent instance of the highest value so far doesn't go back into the list until it's superseded or the end of the sort is reached.
           set highestValueSoFar to my lst's item rangeLeft
           set currentValue to my lst's item (rangeLeft + 1)
           if (comparer's isGreater(highestValueSoFar, currentValue)) then
               set my lst's item rangeLeft to currentValue
               if (slaveSorting) then slave's swap(rangeLeft, rangeLeft + 1)
           else
               set highestValueSoFar to currentValue
           end if
           -- Work through the rest of the range, rotating values back into the sorted group as necessary.
           repeat with k from (rangeLeft + 2) to rangeRight
               set currentValue to my lst's item k
               if (comparer's isGreater(highestValueSoFar, currentValue)) then
                   repeat with insertionIndex from (k - 2) to rangeLeft by -1
                       tell my lst's item insertionIndex
                           if (comparer's isGreater(it, currentValue)) then
                               set my lst's item (insertionIndex + 1) to it
                           else
                               set insertionIndex to insertionIndex + 1
                               exit repeat
                           end if
                       end tell
                   end repeat
                   set my lst's item insertionIndex to currentValue
                   if (slaveSorting) then slave's rotate(insertionIndex, k)
               else
                   set my lst's item (k - 1) to highestValueSoFar
                   set highestValueSoFar to currentValue
               end if
           end repeat
           -- At the end, ensure that the main list's highest value goes back into it.
           set my lst's item rangeRight to highestValueSoFar
       end isrt
       
       (* Default comparison handler. *)
       on isGreater(a, b)
           (a > b)
       end isGreater
       
       (* Default slave handlers. *)
       -- Swap items a and b in the slave list. (Used in the insertion sort.)
       on swap(a, b)
           tell my slaveList's item a
               set my slaveList's item a to my slaveList's item b
               set my slaveList's item b to it
           end tell
       end swap
       
       -- Rotate item b of the slave list down to slot a. (Insertion sort.)
       on rotate(a, b)
           tell my slaveList's item b
               repeat with rotationIndex from (b - 1) to a by -1
                   set my slaveList's item (rotationIndex + 1) to my slaveList's item rotationIndex
               end repeat
               set my slaveList's item a to it
           end tell
       end rotate
       
       -- Get an auxiliary list consisting of just the items in the range to be sorted. (Used to prepare the merge sort.)
       on extract(a, b)
           set aux's slaveList to my slaveList's items a thru b
       end extract
       
       -- Set item b of the current destination list to item a of the current source list. (Used in the merge sort.)
       on place(a, b)
           set destination's slaveList's item b to source's slaveList's item a
       end place
       
       -- Set items b thru c of the current destination list to items a thru (a + (c - b)) of the current source list. (Used in the merge sort.)
       on massPlace(a, b, c)
           repeat with b from b to c
               set destination's slaveList's item b to source's slaveList's item a
               set a to a + 1
           end repeat
       end massPlace
       
       -- Switch the source and destination roles of the slave list and its auxiliary. (By default, they're in script objects whose roles are switched anyway.)
       on switch()
       end switch
       
       (* Alternative slave handlers for multiple slave lists. *)
       on swapMultiple(a, b)
           repeat with i from 1 to slaveListCount
               tell my multipleSlaveLists's item i's item a
                   set my multipleSlaveLists's item i's item a to my multipleSlaveLists's item i's item b
                   set my multipleSlaveLists's item i's item b to it
               end tell
           end repeat
       end swapMultiple
       
       on rotateMultiple(a, b)
           repeat with i from 1 to slaveListCount
               tell my multipleSlaveLists's item i's item b
                   repeat with rotationIndex from (b - 1) to a by -1
                       set my multipleSlaveLists's item i's item (rotationIndex + 1) to my multipleSlaveLists's item i's item rotationIndex
                   end repeat
                   set my multipleSlaveLists's item i's item a to it
               end tell
           end repeat
       end rotateMultiple
       
       on extractMultiple(a, b)
           set aux's multipleSlaveLists to {}
           repeat with i from 1 to slaveListCount
               set end of aux's multipleSlaveLists to my multipleSlaveLists's item i's items a thru b
           end repeat
       end extractMultiple
       
       on placeMultiple(a, b)
           repeat with i from 1 to slaveListCount
               set destination's multipleSlaveLists's item i's item b to source's multipleSlaveLists's item i's item a
           end repeat
       end placeMultiple
       
       on massPlaceMultiple(a, b, c)
           repeat with i from 1 to slaveListCount
               set x to a
               repeat with k from b to c
                   set destination's multipleSlaveLists's item i's item k to source's multipleSlaveLists's item i's item x
                   set x to x + 1
               end repeat
           end repeat
       end massPlaceMultiple
   end script
   
   -- Process the input parameters.
   set listLen to (count theList)
   if (listLen < 2) then return
   -- Negative and/or transposed range indices.
   if (rangeIndex1 < 0) then set rangeIndex1 to listLen + rangeIndex1 + 1
   if (rangeIndex2 < 0) then set rangeIndex2 to listLen + rangeIndex2 + 1
   if (rangeIndex1 > rangeIndex2) then set {rangeIndex1, rangeIndex2} to {rangeIndex2, rangeIndex1}
   
   if (rangeIndex2 > rangeIndex1) then -- More than one item to sort.
       -- The customisation parameter is a record containing comparer and/or slave properties, whose values are script objects containing the comparison and slave handlers respectively. If either property is omitted from the record, or if the parameter isn't a record, use the appropriate default handers in the script object above. NEW: THE SLAVE PARAMETER CAN NOW BE JUST A LIST OF SLAVE LISTS, IN WHICH CASE THE SCRIPT ITSELF WILL APPLY THE NORMALLY REQUIRED SLAVE ACTIONS TO EACH ONE.
       if ((customiser is {}) or (customiser's class is record)) then
           -- Use the passed or default comparer script. Get the passed or default slave parameter.
           set {comparer:main's comparer, slave:slaveParam} to customiser & {comparer:main, slave:main}
           if (slaveParam's class is script) then
               -- Passed or default slave script. Use it.
               set main's slave to slaveParam
               set slaveSorting to (slaveParam is not main)
           else if (slaveParam's class is list) then
               -- Passed list of slave lists. Set the default 'slave' script object's multipleSlaveLists property to it.
               set main's multipleSlaveLists to slaveParam
               set slaveSorting to false
           end if
           -- Configure the default 'slave' script object to use the best handlers for the number of slave lists. This makes no difference with a non-default slave.
           if (main's multipleSlaveLists is not missing value) then
               set main's slaveListCount to (count main's multipleSlaveLists)
               if (main's slaveListCount > 0) then
                   set slaveSorting to true
                   if (main's slaveListCount is 1) then
                       set main's slaveList to beginning of main's multipleSlaveLists
                   else
                       set main's extract to main's extractMultiple
                       set main's place to main's placeMultiple
                       set main's massPlace to main's massPlaceMultiple
                       set main's swap to main's swapMultiple
                       set main's rotate to main's rotateMultiple
                   end if
               end if
           end if
           set main's slaveSorting to slaveSorting
           
           -- Do the sort.
           tell main to nsrt(rangeIndex1, rangeIndex2)
       end if
   end if
   
   return -- nothing
end customIterativeTernaryMergeSort

property sort : customIterativeTernaryMergeSort


NG

Offline

 

Board footer

Powered by FluxBB

RSS (new topics) RSS (active topics)