Of Scripts and Script Libraries and Sorting Algorithms

Over the years Nigel Garvey and others have published some scripts with very fast and sophisticated sorting algorithms.
Over the last few years I’ve had several tries at incorporating his 2010 combined QuickSort+InsertionSort script into my other works but have previously had to give it up due strange behaviour and lack of debugging time.
Now that I am retired and have more time and focus, I have identified the issue.

In 2015 I started using a script library which I implemented as follows:

use U : script “Utilities_1501.scpt”

The script that I copied from Nigel Garvey has a “qsort()” handler (of course), within that handler there is a Script “o”, and within that script several handlers that use the variable “u” and assign various things to it.

 on qsort(theList, l, r) 	script o
 		property cutoff : 10 -- The Quicksort will ignore ranges containing this number of items or fewer.
 		property lst : theList
 		
		on qsrt(l, r) -- The Quicksort handler.
 			set pivot to my lst's item ((l + r) div 2)
 			
 			set i to l
 			set j to r
 			repeat until (i > j)
 				set u to my lst's item i   
-----               ^ here is the problem

Once qsort() has run, the handlers in the Utilities Library script aren’t accessible any more, and for example executing U’s GetTick_Now() results in the bizarre error message:
“5298 doesn’t understand GetTick_Now()”

On careful reading, it seems that this use of use statement creates a property

The fix is simple, a “local U” statement at the beginning of qsort() - I took it two steps further.

 on qsort(theList, l, r)
 	local M, C, G, l, P, U -- hide my Script Library Designators from qsort()
 	local cutoff, lst -- hide script o's properties from outside
	
	script o
 		property cutoff : 10 -- The Quicksort will ignore ranges containing this number of items or fewer.
 		property lst : theList
 		
		on qsrt(l, r) -- The Quicksort handler.
 			local pivot, i, j, u, w --protect the local variables
 			set pivot to my lst's item ((l + r) div 2)
1 Like

I probably should have all of the documentation. If I had followed Nigel’s advice below the problem would have never occured.

– Say you’ve decided to use qsort and it’s in a folder called “Sorts” in your user “Scripts” folder.

set sorter to (load script file ((path to scripts folder as text) & "Sorts:qsort.scpt"))
tell sorter to sort(myList, 1, -1) -- ie. sort items 1 thru -1 of myList.
1 Like

Hi @Ericnepean.

Thanks for your interest in Arthur’s and my Quicksort. I’m glad you were able to work out the cures for the problem you were having by yourself. Nowadays, of course, it’s possible to load the script with the “Libraries” system and a use command in the same way as you load your “Utilities_1501.scpt”, but I still use load script myself occasionally. By the way, the “.scpt” suffix can be omitted from the script name in a use line.

The link to ScriptBuilders in the 2010 addendum to my 2006 post is, as you may already have discovered, no longer valid. ScriptBuilders was a sister site to MacScripter where one could upload actual working files instead of just posting source code as here in Code Exchange. But literally two weeks after I uploaded my sort scripts to it, MacScripter’s then owner discontinued it and the scripts ended up (without my foreknowledge or permission) on some third-party software download site — of which I’ve long since lost track!

Hi Nigel
I belive that after Scriptbuilder went offline, you did update your Macscripters Quicksort post; I got that and I was also lucky enough to download and keep your “Dose of Sorts” compendium before it vanished.

Regarding the third-party software site with your stuff from ScriptBuilders, even Google can’t find that site, which I think is the modern day version of oblivion. :smiley: Serves them right.

1 Like

Is your qsort routine recursive. If so it will run into a recursion limit at around 510, so it would have problems with large lists.
I have a version that doesn’t use recursion and can handle lists of any size.

Hi Robert.

The version referred to above is indeed recursive, but leaves out the deepest few recursion levels, finishing off instead with an insertion sort of the entire sort range. An alternative, which may be slightly faster, is to do individual insertion sorts of the ranges for which recursion isn’t performed. Another alternative is to use recursion only on one of the two partitions at each level (preferably the shorter) and simply to repeat with changed range values for the other.

Quicksort’s recursion doesn’t go straight down, of course. It’s a “divide and conquer” algorithm, so recursion only covers part of the sort range at any one time. There’s a lot of to-ing and fro-ing between levels, but the depth of the deepest level isn’t necessarily as great as one might suppose. I’ve tested all my sorts with lists containing many thousands of items and none of the fully recursive ones have ever hit a recursion limit. The main sticking point has been generating such huge lists in the first place!

But please post your version here in Code Exchange if you think it’s any good. I’d be interested to see it if it’s both in-place and a fully iterative Quicksort as I was never able to produce such a thing myself! :slightly_smiling_face:

Here it is…

on quickSort for blist -- Non-Recursive FASTEST
	local px, stack, lo, hi, L, H, sw -- px means 'Pivot Index'
	script mL
		property alist : blist
		property stack : {{1, count blist}}
	end script
	repeat until (count mL's stack) = 0
		set lo to item 1 of item 1 of mL's stack
		set hi to item 2 of item 1 of mL's stack
		set px to item ((hi + lo) div 2) of mL's alist -- start partitionHoare
		set L to lo
		set H to hi
		repeat
			repeat while item L of mL's alist < px
				set L to L + 1
			end repeat
			repeat while item H of mL's alist > px
				set H to H - 1
			end repeat
			if L ≥ H then exit repeat
			set sw to item L of mL's alist
			set item L of mL's alist to item H of mL's alist
			set item H of mL's alist to sw
			set L to L + 1
			set H to H - 1
		end repeat
		set px to H -- end of partitionHoare
		set mL's stack to rest of mL's stack --items 2 thru -1 of stack
		if px + 1 < hi then set beginning of mL's stack to {px + 1, hi}
		if lo < px then set beginning of mL's stack to {lo, px}
	end repeat
end quickSort

Hi @Fredrik71.

For speed maybe, yes. The main reasons for using an AppleScript sort nowadays would be:

  1. You need an in-place sort.
  2. You need a customised sort that can’t be done with ASObjC.
  3. You just find the algorithm fascinating. :nerd_face: :wink:
1 Like

Hi @robertfern.

Yes. That’s certainly an in-place iterative Quicksort! :sunglasses: — although it’s arguably recursive by proxy, since it implements its own LIFO stack to store range indices instead of using the program stack. :wink: It’s not as fast as the Quicksort to which Ericnepean linked above, probably because of the building and breaking up of the stack and the fact it doesn’t have the insertion sort optimisation. But since your purpose is mainly to avoid any possible recursion limit, that’s probably not important to you.

1 Like

You are correct, SIR! ( to be read in the voice of Ed McMahon)

I also have a version that will sort a list of lists where you send it a sort list parameter that contains the item indexes to sort on and also add a minus sign to the ones you want sorted in reverse order. (I.e. descending)

I’ve been fooling with the iterative Quicksort @robertfern posted for the past couple of days, having just rediscovered it on my desktop where it’s been since November.

I’ve reworked it in my own “style” (for want of a better word) and have speeded it up a bit with insertion sorting for short partitions and a further reduction of stack work by simply resetting a range variable where possible instead of pushing a list onto the stack and immediately pulling it off again. I did think of adding median-of-3 pivot selection as well, but the code’s long and spindly enough as it is! :wink:

(*
	Iterative (ie. non-recursive) in-place Quicksort with insertion sorting of short partitions.
	Quicksort algorithm: Tony Hoare
	Insertion sort idea: probably Robert Sedgewick
*)

-- Sort items l thru r of theList in place.
on quicksort(theList, l, r)
	-- Check input.
	set listLen to (count theList)
	-- Negative and/or transposed range indices.
	if (l < 0) then set l to listLen + l + 1
	if (r < 0) then set r to listLen + r + 1
	if (l > r) then set {l, r} to {r, l}
	if ((l < 1) or (r > listLen)) then error "Quicksort: range parameter(s) out of — er — range."
	if (listLen < 2) then return
	
	script o -- Contains the list and the stack for speed and the insertion sort handler for convenience.
		property lst : theList
		property stack : {{l, r}}
		
		on isrt(l, r) -- Insertion sort.
			-- Set up a minor optimisation whereby the most recent instance of any value that's the highest so far
			-- in the insertion sort is only put back into the list when superseded or to complete the sort.
			set highestSoFar to my lst's item l
			set v to my lst's item (l + 1)
			if (highestSoFar > v) then
				set my lst's item l to v
			else
				set highestSoFar to v
			end if
			
			repeat with i from (l + 2) to r
				set v to my lst's item i
				if (highestSoFar > v) then
					repeat with j from (i - 2) to l by -1
						tell my lst's item j
							if (it > v) then
								set my lst's item (j + 1) to it
							else
								set j to j + 1
								exit repeat
							end if
						end tell
					end repeat
					set my lst's item j to v
				else
					set my lst's item (i - 1) to highestSoFar
					set highestSoFar to v
				end if
			end repeat
			set my lst's item r to highestSoFar
		end isrt
	end script
	
	set cutoff to 10 -- Partitions with this or fewer items are insertion sorted.
	set getFromStack to true -- Whether or not to pull a pair of range indices from the stack.
	repeat
		if (getFromStack) then
			if ((count o's stack) = 0) then exit repeat -- Finished when stack exhausted.
			set {l, r} to o's stack's beginning
			set o's stack to o's stack's rest
		else
			set getFromStack to true
		end if
		set pivot to o's lst's item ((l + r) div 2)
		set i to l
		set j to r
		-- Partition the current range into values ≤ 'pivot' and values ≥ 'pivot'.
		repeat while (j > i)
			set iv to o's lst's item i
			repeat while (iv < pivot)
				set i to i + 1
				set iv to o's lst's item i
			end repeat
			
			set jv to o's lst's item j
			repeat while (jv > pivot)
				set j to j - 1
				set jv to o's lst's item j
			end repeat
			
			if (i > j) then
			else
				set o's lst's item i to jv
				set o's lst's item j to iv
				set i to i + 1
				set j to j - 1
			end if
		end repeat
		-- Decide how to handle the partitions.
		if (r - i < cutoff) then -- 'cutoff' or fewer items in the hi partition.
			-- If more than 1 item, insertion sort the partition. Otherwise leave it.
			if (r > i) then o's isrt(i, r)
			if (j - l < cutoff) then
				-- Likewise with the lo partition if it has 'cutoff' or fewer items.
				if (j > l) then o's isrt(l, j)
			else
				-- Otherwise reset the right index variable ready to subpartition the lo partition next time round the repeat.
				set r to j
				set getFromStack to false
			end if
		else -- *More* than 'cutoff' items in the hi partition.
			if (j - l < cutoff) then -- 'cutoff' or fewer items in the lo partition.
				-- Insertion sort or leave the lo partition .
				if (j > l) then o's isrt(l, j)
				-- Reset the left index variable to subpartition the hi partition in the next iteration. 
				set l to i
			else -- More than 'cutoff' items in the lo partition.
				-- Stack the hi partition's range indices.
				set o's stack's beginning to {i, r}
				-- Reset the right index variable to subpartition the lo partition in the next iterationt.
				set r to j
			end if
			set getFromStack to false
		end if
	end repeat
	
	return -- nothing.
end quicksort

-- Demo:
set lst to {}
repeat 100 times
	set my lst's end to (random number 500)
end repeat
-- Sort items 1 thru -1 of lst in place.
quicksort(lst, 1, -1)
return lst

If anyone’s interested, one way to include median-of-3 pivot selection is to replace these four lines:

set pivot to o's lst's item ((l + r) div 2)
set i to l
set j to r
-- Partition the current range into values ≤ 'pivot' and values ≥ 'pivot'.

… with this:

(* MEDIAN-OF-3 PIVOT SELECTION *)
-- Reduce the chances of picking this range's lowest or highest value as the
-- pivot value by using the median of the range's first, middle, and last values.
-- Combine finding this with ordering the values in those slots so that
-- they won't need to be tested again in the partitioning repeat.
set m to (l + r) div 2
set pivot to o's lst's item m -- The default unless changed below.
set lv to o's lst's item l
set rv to o's lst's item r
if (lv > rv) then
	if (pivot > lv) then -- (pivot > lv > rv)
		set o's lst's item l to rv
		set o's lst's item m to lv
		set o's lst's item r to pivot
		set pivot to lv
	else if (rv > pivot) then -- (lv > rv > pivot)
		set o's lst's item l to pivot
		set o's lst's item m to rv
		set o's lst's item r to lv
		set pivot to rv
	else
		set o's lst's item l to rv
		set o's lst's item r to lv
	end if
else if (pivot > rv) then -- (pivot > rv ≥ lv)
	set o's lst's item m to rv
	set o's lst's item r to pivot
	set pivot to rv
else if (lv > pivot) then -- (rv ≥ lv > pivot)
	set o's lst's item l to pivot
	set o's lst's item m to lv
	set pivot to lv
end if
(* END *)
-- Partition the range into values ≤ 'pivot' and values ≥ 'pivot'. (End values already done.)
set i to l + 1
set j to r - 1

The speed improvement’s only marginal and is statistical rather than assured in any particular case.

Hi Nigel, thanks for the update. I do use your algorithms (especially the ones that implement parallel sorting) in improving the speed of some of my applications for CaptureOne.