Sunday, September 22, 2019

#1 2018-03-24 09:18:25 am

Nigel Garvey
Moderator
From:: Warwickshire, England
Registered: 2002-11-20
Posts: 5013

Dual-pivot Quicksort

I recently came across Vladimir Yaroslavskiy's dual-pivot Quicksort algorithm — which he claims is even faster than a Quicksort optimised in the ways suggested by Robert Sedgewick and others — and have spent the past month or so trying to get it to be faster in AppleScript instead of slower!

As the name suggests, a dual-pivot Quicksort uses two pivot values for partitioning instead of one. The Java code in Yaroslavskiy's 2009 write-up of his idea includes analogues of three of the optimisations recommended for single-pivot sorts (which makes its claimed superiority over single-pivot sorts with similar optimisations somewhat less impressive.  wink )  These optimisations — two-medians-of-five pivot selection, five-way partitioning, and insertion sorting of short partitions — are all inefficiently coded when translated directly into AppleScript (the insertion sort eyewateringly so), but it may be that the code was simplified to clarify the method. Once these inefficiencies are fixed, the result's a very fast vanilla sort.

Yaroslavskiy's pivot selection involves pre-sorting five evenly-spaced items in the current range, using a sequence of nine compare-and-swap actions. The second and fourth sorted items are then swapped with the items at the ends of the range and used as pivot values — with the minor optimisation that they're not actually put back into the list at that stage. My version of the idea includes the two end items among the five pivot candidates, uses the insertion sort algorithm, sorts the values between variables instead of in the list, and returns the unsuccessful candidates directly to the appropriate slots afterwards. This takes just eight list accesses, four to ten comparisons, and zero to fourteen variable reassignments.

The partitioning scheme in Yaroslavskiy's code initially divides each range into values less than the lesser pivot, an instance of the lesser pivot, values between or equal to the pivots, an instance of the greater pivot, and values greater than the greater pivot. The central partition's then partitioned further to move any other pivot instances it contains to the appropriate ends of it. The idea is that once all instances of the two pivot values are in place, they needn't be touched again for the rest of the sort. In AppleScript, it's faster to do the whole thing in one pass instead of two, which involves more comparisons and moves, but actually takes less time. However, it's still not fast enough to bring sort times down to those obtained with my implementation of the equivalent single-pivot sort. Full five-way partitioning only pays for itself in AppleScript where a pathologically huge proportion of the values is duplicates. It's better to stick with Yaroslavskiy's initial separation and leave any pivot instances in the central partition to be sorted out incidentally later. This does makes the dual-pivot sort faster than my single-pivot implementation in the vast majority of cases!

A less critical, but philosophically more challenging decision involves a related optimisation whereby the middle partition is ignored if the pivots are equal. If they are equal, the partition only contains more instances of that one value and is therefore fully sorted. The problem is that the pivots rarely are equal, so the time spent comparing them is usually wasted. But when the number of duplicates in the list is such that there are fewer than about fifteen different values per hundred items, the sort begins to suffer a speed penalty if it doesn't have this optimisation. (Quicksorts are very inefficient at sorting values that are all the same.) Since the penalty with the optimisation is much less than the worst-case penalty without it, I've kept it in.

The "straight" version of the dual-pivot Quicksort here vies with a couple of my other sorts as my fastest vanilla sort implementation. The customisable version is easily my fastest customisable sort. While testing them, I also tried iterative versions, which are remarkably identical in speed. For some reason, the iterative version of the customisable sort is consistently a few milliseconds faster than its recursive sibling with lists containing more than around ten thousand items. For this reason, and to make the code more interesting (hopefully!), the iterative version of the customisable sort is the one given in the following post. Meanwhile, here's the recursive "straight" version. The script includes a demo handler at the end:

Applescript:

(* Quicksort — dual-pivot version

Quicksort algorithm: S.A.R. (Tony) Hoare, 1960.
Insertion sort algorithm: unknown author.
AppleScript implementations: Arthur J. Knapp and Nigel Garvey, 2003. Minor modifications by Nigel Garvey 2007, 2010, and 2018.
Dual-pivot Quicksort algorithm: Vladimir Yaroslavskiy, 2008/2009.
AppleScript implementation: Nigel Garvey 2018.

Parameters: (list, range index 1, range index 2)
   The range index parameters are integers specifying the range to be sorted in the list. They can be positive or negative and don't need to be in a particular order. The values 1 and -1 can be used to indicate the entire list.
Result returned: None. The passed list is sorted in place.
*)


on dualPivotQuicksort(theList, rangeIndex1, rangeIndex2)
   script o
       property cutoff : 16 -- Use an insertion sort or swap with ranges of this length or less.
       property lst : theList
       
       on qsrt(rangeLeft, rangeRight) -- The dual-pivot Quicksort handler.
           (* TWO-MEDIANS-OF-5 PIVOT SELECTION *)
           -- Select the second lowest and second highest of five values from the current range to use as pivot values.
           set slot3 to (rangeLeft + rangeRight) div 2 -- Roughly the middle of the range
           set slot2 to (rangeLeft + slot3) div 2 -- Roughly halfway between the middle and the beginning.
           set slot4 to (slot3 + rangeRight) div 2 -- Roughly halfway between the middle and the end.
           set v1 to my lst's item rangeLeft
           set pivot1 to my lst's item slot2 -- 'pivot1' rather than 'v2', in anticipation of the eventual contents.
           set v3 to my lst's item slot3
           set pivot2 to my lst's item slot4 -- 'pivot2' rather than 'v4', ditto.
           set v5 to my lst's item rangeRight
           -- Sort the five values between the variables using an insertion sort algorithm.
           if (v1 > pivot1) then
               tell pivot1
                   set pivot1 to v1
                   set v1 to it
               end tell
           end if
           if (pivot1 > v3) then
               tell v3
                   set v3 to pivot1
                   if (v1 > it) then
                       set pivot1 to v1
                       set v1 to it
                   else
                       set pivot1 to it
                   end if
               end tell
           end if
           if (v3 > pivot2) then
               tell pivot2
                   set pivot2 to v3
                   if (pivot1 > it) then
                       set v3 to pivot1
                       if (v1 > it) then
                           set pivot1 to v1
                           set v1 to it
                       else
                           set pivot1 to it
                       end if
                   else
                       set v3 to it
                   end if
               end tell
           end if
           if (pivot2 > v5) then
               tell v5
                   set v5 to pivot2
                   if (v3 > it) then
                       set pivot2 to v3
                       if (pivot1 > it) then
                           set v3 to pivot1
                           if (v1 > it) then
                               set pivot1 to v1
                               set v1 to it
                           else
                               set pivot1 to it
                           end if
                       else
                           set v3 to it
                       end if
                   else
                       set pivot2 to it
                   end if
               end tell
           end if
           -- Put the non-pivot values back into the list at the appropriate points. The pivot values nominally go to the ends of the range, but don't physically need to be there.
           set my lst's item slot2 to v1
           set my lst's item slot3 to v3
           set my lst's item slot4 to v5
           
           (* PARTIAL FIVE-WAY PARTITIONING *)
           -- Group the items which aren't the two pivot instances into values less than pivot1, values between or equal to the pivots, and values greater than pivot2. Then "swap" the pivot instances into place between the middle and outer groups.
           set leftPointer to rangeLeft + 1
           set rightPointer to rangeRight - 1
           set k to leftPointer
           repeat until (k > rightPointer)
               set leftValue to my lst's item k
               if (pivot1 > leftValue) then
                   -- The current value's less than pivot1. Swap it with the leftmost value in the middle partition and cede that slot to the less-than-pivot1 partition. The item may be swapped with itself the first few times, but it's best not to worry about it.
                   set my lst's item k to my lst's item leftPointer
                   set my lst's item leftPointer to leftValue
                   set leftPointer to leftPointer + 1
               else if (leftValue > pivot2) then
                   -- The value's greater than pivot2. Cede any values greater than pivot2 at the end of the middle partition to the greater-than-pivot2 partition until a value less than or equal to pivot2 shows up or the end pointer meets the current index.
                   set rightValue to my lst's item rightPointer
                   repeat while ((rightValue > pivot2) and (rightPointer > k))
                       set rightPointer to rightPointer - 1
                       set rightValue to my lst's item rightPointer
                   end repeat
                   -- Swap the current value with the one just found. If this is less than pivot1, make it a three-way swap with the item at the beginning of the partition. Extend the outer partition(s) accordingly.
                   set my lst's item rightPointer to leftValue
                   if (pivot1 > rightValue) then
                       set my lst's item k to my lst's item leftPointer
                       set my lst's item leftPointer to rightValue
                       set leftPointer to leftPointer + 1
                   else
                       set my lst's item k to rightValue
                   end if
                   set rightPointer to rightPointer - 1
               end if
               set k to k + 1
           end repeat
           -- "Swap" the pivot instances into place between the middle and outer partitions.
           set item rangeLeft of my lst to item (leftPointer - 1) of my lst
           set item (leftPointer - 1) of my lst to pivot1
           set item rangeRight of my lst to item (rightPointer + 1) of my lst
           set item (rightPointer + 1) of my lst to pivot2
           -- At this point:
           -- The left partition (rangeLeft thru (leftPointer - 2)) contains values less than pivot1 or is empty.
           -- Item (leftPointer - 1) is an instance of pivot1.
           -- The middle partition (leftPointer thru rightPointer) contains at least one value, which is greater than or equal to pivot1 and less than or equal to pivot2.
           -- Item (rightPointer + 1) is an instance of pivot2.
           -- The right partition ((rightPointer + 2) thru rangeRight) contains values greater than pivot2 or is empty.
           -- The two known pivot instances are in the right places in the list and can be left alone for the rest of the sort.
           
           (* RECURSION OR OTHERWISE *)
           -- Deal with partitions containing more than one item by recursion, insertion sorting, or simple swapping as appropriate.
           set leftPartitionLength to (leftPointer - 1) - rangeLeft
           if (leftPartitionLength > cutoff) then
               qsrt(rangeLeft, leftPointer - 2)
           else if (leftPartitionLength > 2) then
               isrt(rangeLeft, leftPointer - 2)
           else if (leftPartitionLength is 2) then
               set leftValue to my lst's item rangeLeft
               set rightValue to my lst's item (leftPointer - 2)
               if (leftValue > rightValue) then
                   set my lst's item rangeLeft to rightValue
                   set my lst's item (leftPointer - 2) to leftValue
               end if
           end if
           if (pivot2 > pivot1) then -- (If the pivots are equal, the middle partition only contains instances of that value and doesn't need any further action.)
               set middlePartitionLength to rightPointer - leftPointer + 1
               if (middlePartitionLength > cutoff) then
                   qsrt(leftPointer, rightPointer)
               else if (middlePartitionLength > 2) then
                   isrt(leftPointer, rightPointer)
               else if (middlePartitionLength is 2) then
                   set leftValue to my lst's item leftPointer
                   set rightValue to my lst's item rightPointer
                   if (leftValue > rightValue) then
                       set my lst's item leftPointer to rightValue
                       set my lst's item rightPointer to leftValue
                   end if
               end if
           end if
           set rightPartitionLength to rangeRight - (rightPointer + 1)
           if (rightPartitionLength > cutoff) then
               qsrt(rightPointer + 2, rangeRight)
           else if (rightPartitionLength > 2) then
               isrt(rightPointer + 2, rangeRight)
           else if (rightPartitionLength is 2) then
               set leftValue to my lst's item (rightPointer + 2)
               set rightValue to my lst's item rangeRight
               if (leftValue > rightValue) then
                   set my lst's item (rightPointer + 2) to rightValue
                   set my lst's item rangeRight to leftValue
               end if
           end if
       end qsrt
       
       on isrt(rangeLeft, rangeRight) -- The insertion sort handler.
           -- Presort the first two items to set up a minor optimisation whereby the most recent instance of the highest value so far doesn't go back into the list until it's superseded or the end of the sort is reached.
           set highestValueSoFar to my lst's item rangeLeft
           set currentValue to my lst's item (rangeLeft + 1)
           if (highestValueSoFar > currentValue) then
               set my lst's item rangeLeft to currentValue
           else
               set highestValueSoFar to currentValue
           end if
           -- Work through the rest of the range, rotating values back into the sorted group as necessary.
           repeat with k from (rangeLeft + 2) to rangeRight
               set currentValue to my lst's item k
               if (highestValueSoFar > currentValue) then
                   -- The current value's less than the highest so far and must be inserted into the sorted group. Shift previously sorted values greater than it up a slot (except for the highest so far, which is in a variable) until the appropriate insertion location's found.
                   repeat with insertionIndex from (k - 2) to rangeLeft by -1
                       tell my lst's item insertionIndex
                           if (it > currentValue) then
                               -- Greater value. Move it up.
                               set my lst's item (insertionIndex + 1) to it
                           else
                               -- Lesser or equal value. Set the just vacated slot above it as the insertion location and stop looking.
                               set insertionIndex to insertionIndex + 1
                               exit repeat
                           end if
                       end tell
                   end repeat
                   -- Insert the current value at the determined location.
                   set my lst's item insertionIndex to currentValue
               else
                   -- The current value's greater than or equal to the highest so far and simply inherits that role.
                   set my lst's item (k - 1) to highestValueSoFar
                   set highestValueSoFar to currentValue
               end if
           end repeat
           -- At the end, ensure that the highest value goes back into the list.
           set my lst's item rangeRight to highestValueSoFar
       end isrt
   end script
   
   -- Process the input parameters.
   set listLen to (count theList)
   if (listLen > 1) then
       -- Negative and/or transposed range indices.
       if (rangeIndex1 < 0) then set rangeIndex1 to listLen + rangeIndex1 + 1
       if (rangeIndex2 < 0) then set rangeIndex2 to listLen + rangeIndex2 + 1
       if (rangeIndex1 > rangeIndex2) then set {rangeIndex1, rangeIndex2} to {rangeIndex2, rangeIndex1}
       
       -- Do the sort.
       if (rangeIndex2 - rangeIndex1 + 1 > o's cutoff) then
           tell o to qsrt(rangeIndex1, rangeIndex2)
       else
           tell o to isrt(rangeIndex1, rangeIndex2)
       end if
   end if
   
   return -- nothing.
end dualPivotQuicksort

property sort : dualPivotQuicksort

(*
on demo()
   script o
       property lst : {}
   end script
   
   repeat 1000 times
       set end of o's lst to (random number 1000)
   end repeat
   
   -- Sort the entire list (items 1 thru -1).
   sort(o's lst, 1, -1)
   
   return o's lst
end demo

demo()
*)

12th August 2018: Demo at end commented out for clarity.

Last edited by Nigel Garvey (2018-08-19 08:30:57 am)


NG

Offline

 

#2 2018-03-24 09:21:26 am

Nigel Garvey
Moderator
From:: Warwickshire, England
Registered: 2002-11-20
Posts: 5013

Re: Dual-pivot Quicksort

Following on from the previous post, here's the iterative customisable version. The script includes a couple of demo handlers at the end. Don't worry about its length. It is quite fast.  wink

Applescript:

(* Quicksort — customisable iterative dual-pivot version

Quicksort algorithm: S.A.R. (Tony) Hoare, 1960.
Insertion sort algorithm: unknown author.
AppleScript implementations: Arthur J. Knapp and Nigel Garvey, 2003. Minor modifications by Nigel Garvey 2007, 2010, and 2018.
Dual-pivot Quicksort algorithm: Vladimir Yaroslavskiy, 2008/2009.
AppleScript implementation: Nigel Garvey 2018.

Parameters: (the list, range index 1, range index 2, customisation object)
   The range index parameters are integers specifying the range to be sorted in the list. They can be positive or negative and don't need to be in a particular order. The values 1 and -1 can be used to indicate the entire list.
   The customisation object is a record with optional 'comparer' and/or 'slave' properties.
       If given, the 'comparer' value must be a script object containing an isGreater(item1, item2) handler which compares two items passed to it from the list and returns a boolean indicating whether or not the first is "greater than" the second according to its criteria.
       If given, the 'slave' value can be a list of lists in which the moves of the main sort are to be echoed. These lists must of course be long enough to allow the same absolute range indices as for the main list. Alternatively, the value can be a script object containing its own slave handlers and any necessary preset properties. This is for compatibility with scripts written to use earlier versions of my customisable sorts and might conceivably be used for other purposes, such as transposing the sort range in the slave list(s) or counting item movements. The same script object can be used for both the 'comparer' and 'slave' values if it contains both sets of handlers.
       Where the 'comparer' and/or 'slave' properties are omitted, or the customisation parameter isn't a record or a script object, the defaults are direct comparisons and no 'slave' actions. Because of the customisation hooks, a "straight" sort with this code isn't as fast as with the non-customisable version. But subjectively, this isn't noticeable.
       
Result returned: None. The passed lists are sorted in place.
*)


on customIterativeDualPivotQuicksort(theList, rangeIndex1, rangeIndex2, customiser)
   script o
       property cutoff : 16 -- Use an insertion sort or swap with ranges of this length or less.
       property comparer : me
       property slave : me
       property slaveSorting : true
       
       property lst : theList
       property stack : {} -- For the range indices of partitions waiting to be processed.
       property slaveLists : {}
       property slaveListCount : missing value
       property singleSlaveList : missing value
       
       on qsrt(rangeLeft, rangeRight)
           set stackLength to 0
           set stackIndex to 1
           repeat -- Iteration repeat.
               
               (* TWO-MEDIANS-OF-5 PIVOT SELECTION *)
               -- Select the second lowest and second highest of five values from the current range to use as pivot values.
               set slot3 to (rangeLeft + rangeRight) div 2 -- Roughly the middle of the range
               set slot2 to (rangeLeft + slot3) div 2 -- Roughly halfway between the middle and the beginning.
               set slot4 to (slot3 + rangeRight) div 2 -- Roughly halfway between the middle and the end.
               set v1 to my lst's item rangeLeft
               set pivot1 to my lst's item slot2 -- 'pivot1' rather than 'v2', in anticipation of the eventual contents.
               set v3 to my lst's item slot3
               set pivot2 to my lst's item slot4 -- 'pivot2' rather than 'v4', ditto.
               set v5 to my lst's item rangeRight
               -- Sort the five values between the variables using an insertion sort algorithm. In any slave sort, the sorting takes place in the slave list(s).
               if (comparer's isGreater(v1, pivot1)) then
                   tell pivot1
                       set pivot1 to v1
                       set v1 to it
                   end tell
                   if (slaveSorting) then tell slave to swap(slot2, rangeLeft)
               end if
               if (comparer's isGreater(pivot1, v3)) then
                   tell v3
                       set v3 to pivot1
                       if (comparer's isGreater(v1, it)) then
                           set pivot1 to v1
                           set v1 to it
                           if (slaveSorting) then tell slave to swap3(slot3, slot2, rangeLeft)
                       else
                           set pivot1 to it
                           if (slaveSorting) then tell slave to swap(slot3, slot2)
                       end if
                   end tell
               end if
               if (comparer's isGreater(v3, pivot2)) then
                   tell pivot2
                       set pivot2 to v3
                       if (comparer's isGreater(pivot1, it)) then
                           set v3 to pivot1
                           if (comparer's isGreater(v1, it)) then
                               set pivot1 to v1
                               set v1 to it
                               if (slaveSorting) then
                                   tell slave
                                       swap3(slot4, slot3, slot2)
                                       swap(slot2, rangeLeft)
                                   end tell
                               end if
                           else
                               set pivot1 to it
                               if (slaveSorting) then tell slave to swap3(slot4, slot3, slot2)
                           end if
                       else
                           set v3 to it
                           if (slaveSorting) then tell slave to swap(slot4, slot3)
                       end if
                   end tell
               end if
               if (comparer's isGreater(pivot2, v5)) then
                   tell v5
                       set v5 to pivot2
                       if (comparer's isGreater(v3, it)) then
                           set pivot2 to v3
                           if (comparer's isGreater(pivot1, it)) then
                               set v3 to pivot1
                               if (comparer's isGreater(v1, it)) then
                                   set pivot1 to v1
                                   set v1 to it
                                   if (slaveSorting) then
                                       tell slave
                                           swap3(rangeRight, slot4, slot3)
                                           swap3(slot3, slot2, rangeLeft)
                                       end tell
                                   end if
                               else
                                   set pivot1 to it
                                   if (slaveSorting) then
                                       tell slave
                                           swap3(rangeRight, slot4, slot3)
                                           swap(slot3, slot2)
                                       end tell
                                   end if
                               end if
                           else
                               set v3 to it
                               if (slaveSorting) then tell slave to swap3(rangeRight, slot4, slot3)
                           end if
                       else
                           set pivot2 to it
                           if (slaveSorting) then tell slave to swap(rangeRight, slot4)
                       end if
                   end tell
               end if
               -- Put the non-pivot values back into the list at the appropriate points. The pivot values nominally go to the ends of the range, but don't physically need to be there (except in any slave lists).
               set my lst's item slot2 to v1
               set my lst's item slot3 to v3
               set my lst's item slot4 to v5
               if (slaveSorting) then
                   tell slave
                       swap(rangeLeft, slot2)
                       swap(slot4, rangeRight)
                   end tell
               end if
               
               (* PARTIAL FIVE-WAY PARTITIONING *)
               -- Group the items which aren't the two pivot instances into values less than pivot1, values between or equal to the pivots, and values greater than pivot2. Then "swap" the pivot instances into place between the middle and outer groups.
               set leftPointer to rangeLeft + 1
               set rightPointer to rangeRight - 1
               set k to leftPointer
               repeat until (k > rightPointer)
                   set leftValue to my lst's item k
                   if (comparer's isGreater(pivot1, leftValue)) then
                       -- The current value's less than pivot1. Swap it with the leftmost value in the middle partition and cede that slot to the less-than-pivot1 partition. The item may be swapped with itself the first few times, but it's best not to worry about it.
                       set my lst's item k to my lst's item leftPointer
                       set my lst's item leftPointer to leftValue
                       if (slaveSorting) then tell slave to swap(leftPointer, k)
                       set leftPointer to leftPointer + 1
                   else if (comparer's isGreater(leftValue, pivot2)) then
                       -- The value's greater than pivot2. Cede any values greater than pivot2 at the end of the middle partition to the greater-than-pivot2 partition until a value less than or equal to pivot2 shows up or the end pointer meets the current index.
                       set rightValue to my lst's item rightPointer
                       repeat while ((comparer's isGreater(rightValue, pivot2)) and (rightPointer > k))
                           set rightPointer to rightPointer - 1
                           set rightValue to my lst's item rightPointer
                       end repeat
                       -- Swap the current value with the one just found. If this is less than pivot1, make it a three-way swap with the item at the beginning of the partition. Extend the outer partition(s) accordingly.
                       set my lst's item rightPointer to leftValue
                       if (comparer's isGreater(pivot1, rightValue)) then
                           set my lst's item k to my lst's item leftPointer
                           set my lst's item leftPointer to rightValue
                           if (slaveSorting) then tell slave to swap3(k, leftPointer, rightPointer)
                           set leftPointer to leftPointer + 1
                       else
                           set my lst's item k to rightValue
                           if (slaveSorting) then tell slave to swap(k, rightPointer)
                       end if
                       set rightPointer to rightPointer - 1
                   end if
                   set k to k + 1
               end repeat
               -- "Swap" the pivot instances into place between the middle and outer partitions. (Real swaps in any slave list.)
               set item rangeLeft of my lst to item (leftPointer - 1) of my lst
               set item (leftPointer - 1) of my lst to pivot1
               if (slaveSorting) then tell slave to swap(rangeLeft, leftPointer - 1)
               set item rangeRight of my lst to item (rightPointer + 1) of my lst
               set item (rightPointer + 1) of my lst to pivot2
               if (slaveSorting) then tell slave to swap(rightPointer + 1, rangeRight)
               -- At this point:
               -- The left partition (rangeLeft thru (leftPointer - 2)) contains values less than pivot1 or is empty.
               -- Item (leftPointer - 1) is an instance of pivot1.
               -- The middle partition (leftPointer thru rightPointer) contains at least one value, which is greater than or equal to pivot1 and less than or equal to pivot2.
               -- Item (rightPointer + 1) is an instance of pivot2.
               -- The right partition ((rightPointer + 2) thru rangeRight) contains values greater than pivot2 or is empty.
               -- The two known pivot instances are in the right places in the list and can be left alone for the rest of the sort.
               
               (* ITERATION CONTROL *)
               -- Deal with non-empty partitions in subsequent repeats or by insertion sorting or simple swapping as appropriate.
               set leftPartitionLength to (leftPointer - 1) - rangeLeft
               if (comparer's isGreater(pivot2, pivot1)) then
                   set middlePartitionLength to rightPointer - leftPointer + 1
               else -- The pivots are equal, so the middle partition only contains instances of that value and doesn't need any further action.
                   set middlePartitionLength to 0
               end if
               set rightPartitionLength to rangeRight - (rightPointer + 1)
               -- If the right partition's to be handled in a repeat after either or both of the other partitions, store its range indices for then. Otherwise, if it's to be insertion sorted or its items swapped, do that now.
               if (rightPartitionLength > cutoff) then
                   if ((leftPartitionLength > cutoff) or (middlePartitionLength > cutoff)) then
                       if (stackIndex > stackLength) then
                           set end of my stack to rightPointer + 2
                           set end of my stack to rangeRight
                           set stackLength to stackLength + 2
                       else
                           set my stack's item stackIndex to rightPointer + 2
                           set my stack's item (stackIndex + 1) to rangeRight
                       end if
                       set stackIndex to stackIndex + 2
                   end if
               else if (rightPartitionLength > 2) then
                   isrt(rightPointer + 2, rangeRight)
               else if (rightPartitionLength is 2) then
                   set leftValue to my lst's item (rightPointer + 2)
                   set rightValue to my lst's item rangeRight
                   if (comparer's isGreater(leftValue, rightValue)) then
                       set my lst's item (rightPointer + 2) to rightValue
                       set my lst's item rangeRight to leftValue
                       if (slaveSorting) then tell slave to swap(rightPointer + 2, rangeRight)
                   end if
               end if
               -- Similarly with the middle partition, except that it won't be handled after the right.
               if (middlePartitionLength > cutoff) then
                   if (leftPartitionLength > cutoff) then
                       if (stackIndex > stackLength) then
                           set end of my stack to leftPointer
                           set end of my stack to rightPointer
                           set stackLength to stackLength + 2
                       else
                           set my stack's item stackIndex to leftPointer
                           set my stack's item (stackIndex + 1) to rightPointer
                       end if
                       set stackIndex to stackIndex + 2
                   end if
               else if (middlePartitionLength > 2) then
                   isrt(leftPointer, rightPointer)
               else if (middlePartitionLength is 2) then
                   set leftValue to my lst's item leftPointer
                   set rightValue to my lst's item rightPointer
                   if (comparer's isGreater(leftValue, rightValue)) then
                       set my lst's item leftPointer to rightValue
                       set my lst's item rightPointer to leftValue
                       if (slaveSorting) then tell slave to swap(rightPointer + 2, rangeRight)
                   end if
               end if
               -- If the left partition's to be handled in a repeat, set the repeat's right range index to the partition's. Otherwise, use an insertion sort or swap (or do nothing) and set the repeat's range indices to those of one of the other partitions. If neither qualifies, retrieve a pair of indices from the stack and use those. If there are none left, finish.
               if (leftPartitionLength > cutoff) then
                   set rangeRight to leftPointer - 2
               else
                   if (leftPartitionLength > 2) then
                       isrt(rangeLeft, leftPointer - 2)
                   else if (leftPartitionLength is 2) then
                       set leftValue to my lst's item rangeLeft
                       set rightValue to my lst's item (leftPointer - 2)
                       if (comparer's isGreater(leftValue, rightValue)) then
                           set my lst's item rangeLeft to rightValue
                           set my lst's item (leftPointer - 2) to leftValue
                           if (slaveSorting) then tell slave to swap(rightPointer + 2, rangeRight)
                       end if
                   end if
                   if (middlePartitionLength > cutoff) then
                       set rangeLeft to leftPointer
                       set rangeRight to rightPointer
                   else if (rightPartitionLength > cutoff) then
                       set rangeLeft to rightPointer + 2
                   else
                       set stackIndex to stackIndex - 2
                       if (stackIndex < 1) then exit repeat -- The stacked partition indices have been used up. There's nothing more to do.
                       set rangeLeft to my stack's item stackIndex
                       set rangeRight to my stack's item (stackIndex + 1)
                   end if
               end if
               
           end repeat -- End of iteration repeat.
       end qsrt
       
       on isrt(rangeLeft, rangeRight) -- The insertion sort handler.
           -- Presort the first two items to set up a minor optimisation whereby the most recent instance of the highest value so far doesn't go back into the list until it's superseded or the end of the sort is reached.
           set highestValueSoFar to my lst's item rangeLeft
           set currentValue to my lst's item (rangeLeft + 1)
           if (comparer's isGreater(highestValueSoFar, currentValue)) then
               set my lst's item rangeLeft to currentValue
               if (slaveSorting) then tell slave to swap(rangeLeft, rangeLeft + 1)
           else
               set highestValueSoFar to currentValue
           end if
           -- Work through the rest of the range, rotating values back into the sorted group as necessary.
           repeat with k from (rangeLeft + 2) to rangeRight
               set currentValue to my lst's item k
               if (comparer's isGreater(highestValueSoFar, currentValue)) then
                   -- The current value's less than the highest so far and must be inserted into the sorted group. Shift previously sorted values greater than it up a slot (except for the highest so far, which is in a variable) until the appropriate insertion location's found.
                   repeat with insertionIndex from (k - 2) to rangeLeft by -1
                       tell my lst's item insertionIndex
                           if (comparer's isGreater(it, currentValue)) then
                               -- Greater value. Move it up.
                               set my lst's item (insertionIndex + 1) to it
                           else
                               -- Lesser or equal value. Set the just vacated slot above it as the insertion location and stop looking.
                               set insertionIndex to insertionIndex + 1
                               exit repeat
                           end if
                       end tell
                   end repeat
                   -- Insert the current value at the determined location and reproduce the entire rotation in any slave sort.
                   set my lst's item insertionIndex to currentValue
                   if (slaveSorting) then tell slave to rotate(insertionIndex, k)
               else
                   -- The current value's greater than or equal to the highest so far and simply inherits that role.
                   set my lst's item (k - 1) to highestValueSoFar
                   set highestValueSoFar to currentValue
               end if
           end repeat
           -- At the end, ensure that the main list's highest value goes back into it.
           set my lst's item rangeRight to highestValueSoFar
       end isrt
       
       -- Default comparison and slave handlers.
       on isGreater(a, b)
           (a > b)
       end isGreater
       
       on swap(a, b)
           tell my singleSlaveList's item a
               set my singleSlaveList's item a to my singleSlaveList's item b
               set my singleSlaveList's item b to it
           end tell
       end swap
       
       on swap3(a, b, c)
           tell my singleSlaveList's item a
               set my singleSlaveList's item a to my singleSlaveList's item b
               set my singleSlaveList's item b to my singleSlaveList's item c
               set my singleSlaveList's item c to it
           end tell
       end swap3
       
       on rotate(a, b)
           tell my singleSlaveList's item b
               repeat with rotationIndex from (b - 1) to a by -1
                   set my singleSlaveList's item (rotationIndex + 1) to my singleSlaveList's item rotationIndex
               end repeat
               set my singleSlaveList's item a to it
           end tell
       end rotate
       
       -- Alternative slave handlers for when there are multiple slave lists.
       on swapMultiple(a, b)
           repeat with i from 1 to slaveListCount
               tell my slaveLists's item i's item a
                   set my slaveLists's item i's item a to my slaveLists's item i's item b
                   set my slaveLists's item i's item b to it
               end tell
           end repeat
       end swapMultiple
       
       on swap3Multiple(a, b, c)
           repeat with i from 1 to slaveListCount
               tell my slaveLists's item i's item a
                   set my slaveLists's item i's item a to my slaveLists's item i's item b
                   set my slaveLists's item i's item b to my slaveLists's item i's item c
                   set my slaveLists's item i's item c to it
               end tell
           end repeat
       end swap3Multiple
       
       on rotateMultiple(a, b)
           repeat with i from 1 to slaveListCount
               tell my slaveLists's item i's item b
                   repeat with rotationIndex from (b - 1) to a by -1
                       set my slaveLists's item i's item (rotationIndex + 1) to my slaveLists's item i's item rotationIndex
                   end repeat
                   set my slaveLists's item i's item a to it
               end tell
           end repeat
       end rotateMultiple
   end script
   
   -- Process the input parameters.
   set listLen to (count theList)
   if (listLen > 1) then
       -- Negative and/or transposed range indices.
       if (rangeIndex1 < 0) then set rangeIndex1 to listLen + rangeIndex1 + 1
       if (rangeIndex2 < 0) then set rangeIndex2 to listLen + rangeIndex2 + 1
       if (rangeIndex1 > rangeIndex2) then set {rangeIndex1, rangeIndex2} to {rangeIndex2, rangeIndex1}
       
       if (rangeIndex2 > rangeIndex1) then
           -- The customisation parameter for the earliest version of Custom Quicksort was a script object containing both comparison and slave handlers. Accept that, if presented; otherwise expect a record containing comparer and/or slave properties, whose values are script objects containing the comparison and slave handlers respectively. If either property is omitted from the record, or if the parameter isn't a script object or a record, use the appropriate default handers in the script object above. NEW: THE SLAVE PARAMETER CAN NOW BE JUST A LIST OF SLAVE LISTS, IN WHICH CASE THE SCRIPT ITSELF WILL APPLY THE APPROPRIATE SLAVE ACTIONS TO EACH ONE.
           if (customiser's class is script) then
               try -- Use the customising script's isGreater handler, if it has one.
                   customiser's isGreater
                   set o's comparer to customiser
               end try
               try -- Use the customising script's slave handler and properties, if it has them.
                   customiser's swap
                   set o's slave to customiser
               on error
                   set o's slaveSorting to false
               end try
           else if ((customiser is {}) or (customiser's class is record)) then
               -- Use the passed or default comparer script. Get the passed or default slave parameter.
               set {comparer:o's comparer, slave:slaveParam} to customiser & {comparer:o, slave:o}
               if (slaveParam's class is script) then
                   -- Passed or default slave script. Use it.
                   set o's slave to slaveParam
                   set o's slaveSorting to (slaveParam is not o)
               else if (slaveParam's class is list) then
                   -- Passed list of slave lists. Set the default 'slave' script object's slaveLists property to it.
                   set o's slaveLists to slaveParam
                   set o's slaveSorting to false
               end if
               -- Configure the default 'slave' script object to use the best handlers for the number of slave lists. This makes no difference with a non-default slave.
               set o's slaveListCount to (count o's slaveLists)
               if (o's slaveListCount > 0) then
                   set o's slaveSorting to true
                   if (o's slaveListCount is 1) then
                       set o's singleSlaveList to beginning of o's slaveLists
                   else
                       set o's swap to o's swapMultiple
                       set o's swap3 to o's swap3Multiple
                       set o's rotate to o's rotateMultiple
                   end if
               end if
           end if
           
           -- Do the sort.
           if (rangeIndex2 - rangeIndex1 + 1 > o's cutoff) then
               tell o to qsrt(rangeIndex1, rangeIndex2)
           else
               tell o to isrt(rangeIndex1, rangeIndex2)
           end if
       end if
   end if
   
   return -- nothing.
end customIterativeDualPivotQuicksort

property sort : customIterativeDualPivotQuicksort

(*
on demo1() -- Sort copies of a list in various ways.
   -- Custom comparer. Returns the opposite of the truth for a reverse sort.
   script descending
       on isGreater(a, b)
           return (a < b)
       end isGreater
   end script
   
   -- Set up five recognisably similar lists for demo purposes.
   set {list1, list2, list3, list4, list5} to {{}, {}, {}, {}, {}}
   repeat 25 times
       set n to (random number 1000)
       set end of list1 to n
       set end of list2 to n
       set end of list3 to n
       set end of list4 to n as text
       set end of list5 to n as real
   end repeat
   
   -- Sort list1 normally.
   sort(list1, 1, -1, {})
   
   -- Reverse sort items 1 thru 20 of list2, leaving items 21 thru 25 unsorted.
   sort(list2, 1, 20, {comparer:descending})
   
   -- Reverse sort list3, copying the moves in list4 and list5.
   sort(list3, 1, -1, {comparer:descending, slave:{list4, list5}})
   
   return {ascending:list1, |1thru20descending21thru25unsorted|:list2, descendingWithSlaveSorts:{list3, list4, list5}}
end demo1

on demo2() -- Sort a list of records in descending order by 'age' with ascending subsorts by 'surname' and 'christianName'.
   -- Custom comparer.
   script descendingByAgeAscendingByName
       on isGreater(a, b)
           if (a's age = b's age) then
               if (a's surname = b's surname) then
                   return (a's christianName > b's christianName)
               else
                   return (a's surname > b's surname)
               end if
           else
               return (a's age < b's age)
           end if
       end isGreater
   end script
   
   -- Set up a sample list of records.
   set surnames to {"Smith", "Jones", "Aardvark", "AppleScript", "Walker", "Diplodocus", "Klein", "Zozzle", "Lazenby"}
   set christianNames to {"John", "Bert", "James", "Fred", "Ella", "Mabel", "Wayne", "Peter", "Mabel", "Avarice", "Verisimilitude", "Dinsdale", "Isambard", "Henrietta", "Stephen", "Philip", "Ringo", "Sid", "Genghis", "Hayley", "Caligula"}
   set lst to {}
   repeat until ((count lst) is 25)
       set r to {age:(random number from 21 to 35), surname:(some item of surnames), christianName:(some item of christianNames)}
       if (lst does not contain r) then set end of lst to r
   end repeat
   
   -- Do the sort.
   sort(lst, 1, -1, {comparer:descendingByAgeAscendingByName})
   
   return lst
end demo2

demo1()
-- demo2()
*)

8th April 2018: Replaced the dummy slave handlers, which were used when other lists weren't being sorted in parallel with the main one, with conditional calls to the non-dummies. This makes sorts with parallel sorting very slightly less fast, but speeds up sorts without it (which are probably more usual) by a greater amount.
12th August 2018: Demos at end commented out for clarity.

Last edited by Nigel Garvey (2018-08-19 08:31:49 am)


NG

Offline

 

#3 2018-03-24 02:52:42 pm

Adam Bell
Administrator
From:: Nova Scotia, Canada
Registered: 2005-10-04
Posts: 4666

Re: Dual-pivot Quicksort

Wow!! It really is very fast. Thanks, Nigel.


Mac mini running 10.14.6, 2011 27" iMac as display.

Offline

 

#4 2018-08-17 09:30:04 am

kerflooey
Member
Registered: 2011-07-07
Posts: 165

Re: Dual-pivot Quicksort

Hi Nigel: I saw this article on timsort and thought you might be interested. I searched but didn't find anything on macscripter.net about it.

https://hackernoon.com/timsort-the-fast … b28417f399

Offline

 

#5 2018-08-17 02:54:57 pm

Nigel Garvey
Moderator
From:: Warwickshire, England
Registered: 2002-11-20
Posts: 5013

Re: Dual-pivot Quicksort

Hi kerflooey.

Thanks for the link! I've downloaded a couple of descriptions of timsort and will no doubt become engrossed in it when I get a moment or three.  smile


NG

Offline

 

#6 2018-08-17 04:44:54 pm

DJ Bazzie Wazzie
Member
From:: the Netherlands
Registered: 2004-10-20
Posts: 2809
Website

Re: Dual-pivot Quicksort

kerflooey wrote:

Hi Nigel: I saw this article on timsort and thought you might be interested. I searched but didn't find anything on macscripter.net about it.



Have been trying to speed up the natural merge sort, known as timsort in Python, but with no avail for unsorted data. Natural merge sort is of course the fastest when you have two ordered lists, even in AppleScript but I didn't bother to post it here (iirc, I think I've mentioned natural merge sort somewhere).

It is true that natural merger sort (timsort) is insanely fast in programming languages like C. It makes use of the fact that unsorted data doesn't exists, only worse case sorting. It works most efficiently using linked lists, a type of list AppleScript doesn't support (anymore).

Offline

 

#7 2018-08-18 08:18:09 am

Nigel Garvey
Moderator
From:: Warwickshire, England
Registered: 2002-11-20
Posts: 5013

Re: Dual-pivot Quicksort

Well. I've done an initial literal translation of the Timsort code in the hackernoon article into AppleScript, implementing a bug fix suggested by someone on GitHub, where it had also been posted. As one would expect, the result's a disaster speedwise, given all the repeated list accesses and concatentations. It's also nowhere near a full implementation of Timsort, since all it does is to grab runs of items which are already in order, sort each run (!) with a very inefficient insertion sort, and then merge the runs. If I've correctly understood Tim Peters's own description of the sort, it should also grab and reverse any runs of items which are exactly in reversed order, extend and presort runs to get optimal lengths for merging (hence the need for the insertion sort), and implement fiendishly complex optimisations of the merges themselves.  yikes

I may be gone some time….  wink

Last edited by Nigel Garvey (2018-08-18 08:18:47 am)


NG

Offline

 

Board footer

Powered by FluxBB

RSS (new topics) RSS (active topics)