How to sort a list of records by a specific property

Ah, the dreaded bubble sort. Thanks for the help, Mike. I’ll give this a go.

-Tim

Bubble sort is one of the least efficient sort algorithms ever invented and most AppleScript implementations I’ve seen are themselves sub-optimal, but it does the job and is fine for short lists.

Here, for the sake of it, is a customisable bubble sort. It does the actual sorting, but delegates the comparison of the items to a handler you pass to it in a script object. So instead of having to write separate sorts to arrange Tim’s records by age or by name, you just have one sort to which you pass the appropriate comparison method:

on customBubbleSort(thelist, compObj)
	script o
		property lst : thelist
	end script
	
	repeat with i from (count thelist) to 2 by -1
		set a to beginning of o's lst
		repeat with j from 2 to i
			set b to item j of o's lst
			if (compObj's isGreater(a, b)) then
				set item (j - 1) of o's lst to b
				set item j of o's lst to a
			else
				set a to b
			end if
		end repeat
	end repeat
end customBubbleSort

script byAge
	on isGreater(a, b)
		(a's age > b's age)
	end isGreater
end script

script byName
	on isGreater(a, b)
		(a's name > b's name)
	end isGreater
end script

set myList to {{age:23, name:"Bob"}, {age:14, name:"Susie"}, {age:37, name:"Jim"}}

customBubbleSort(myList, byAge)
myList --> {{age:14, name:"Susie"}, {age:23, name:"Bob"}, {age:37, name:"Jim"}}

customBubbleSort(myList, byName)
myList --> {{age:23, name:"Bob"}, {age:37, name:"Jim"}, {age:14, name:"Susie"}}

By the way, name is a reserved word and shouldn’t really be used for user record labels. Something like |name| would be “better AppleScript”.

13th September 2010: I’ve uploaded an improved version of this sort, as part of a collection, to ScriptBuilders.


tell application "Quicksilver" to show large type "Thank You!"

Hi Mikerickson, this is a short and sweet method which I am looking for. Thanks a lot!

However, I am stuck in creating the list with property: {{abc:value1, xyz:value2}, {abc:value3, xyz:value4}, …}, instead, I got {“abc:value1, xyz:value2”, “abc:value3, xyz:value4”, …} and not be able to sort as yours.

Please advise! TIA!

The property in the test needs to match the list

set myList to {{abc:"value44", xyz:"value2"}, {abc:"value3", xyz:"value4"}}

repeat with i from 1 to (count of myList) - 1
	repeat with j from i + 1 to count of myList
		if abc of item j of myList < abc of item i of myList then
			set temp to item i of myList
			set item i of myList to item j of myList
			set item j of myList to temp
		end if
		log myList
	end repeat
end repeat

return myList

It’s quite difficult to turn records into text by accident, so presumably you’re starting with text anyway. Is that the case? And what do you want when you’ve finished? Text or records?

A Quicksort handler with labeled property on which sort is done passed as a parameter (Thanks to Nigel Garvey for inspiration). Highly reusable code! Works more efficiently than BubbleSort, even more on long lists (less than 1/4 second for a list of 1000 records on a 2008 MacBook Pro, about 4.5 seconds for 10,000).Can also sort on descending order (reverse sort). Need Yosemite or later to compile the GetTick script.

on QuickSortLR(aList, Le, Ri, compObj) --> QuickSort for a list of records, based on original Tony Hoare quicksort 1959
	script Sal --> script object aList
		property Liszt : aList
	end script
	set {I, J} to {Le, Ri}
	set Piv to item ((Le + Ri) div 2) of Sal's Liszt --> pivot is choosen in the middle
	repeat while J > I
		repeat while compObj's isSmaller(item I of Sal's Liszt, Piv)
			set I to I + 1
		end repeat
		repeat while compObj's isLarger(item J of Sal's Liszt, Piv)
			set J to J - 1
		end repeat
		if not I > J then
			tell (a reference to Sal's Liszt) to set {item I, item J} to {item J, item I} --> let's swap (waltz for 2!)
			set {I, J} to {I + 1, J - 1}
		end if
	end repeat
	if Le < J then QuickSortLR(Sal's Liszt, Le, J, compObj)
	if Ri > I then QuickSortLR(Sal's Liszt, I, Ri, compObj)
end QuickSortLR

script byNickName --> one script needed for every labeled property on which sort is required
	on isLarger(x, y)
		(x's NickName > y's NickName)
	end isLarger
	on isSmaller(x, y)
		(x's NickName < y's NickName)
	end isSmaller
end script

script byAge --> one script needed for every labeled property on which sort is required
	on isLarger(x, y)
		x's Age > y's Age
	end isLarger
	on isSmaller(x, y)
		x's Age < y's Age
	end isSmaller
end script

set HowMany to 1000
set StartTick to GetTick's Now()
set MyList to MakeList(HowMany)
set MakeTime to (GetTick's Now()) - StartTick

set StartTick to GetTick's Now()
QuickSortLR(MyList, 1, HowMany, byNickName)
set QsTime to (GetTick's Now()) - StartTick

{MyList, MakeTime, QsTime} --> results

on MakeList(HowMany)
	set MyList to {}
	set UpCaseConsons to {"B", "C", "D", "F", "G", "H", "J", "K", "L", "M", "N", "P", "Q", "R", "S", "T", "V", "W", "X", "Y", "Z"}
	set LoCaseVoyels to {"a", "e", "i", "o", "u"}
	set LoCaseLetters to {"a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"}
	repeat with I from 1 to HowMany
		set the end of MyList to {NickName:item (random number from 1 to 21) of UpCaseConsons & ¬
			item (random number from 1 to 5) of LoCaseVoyels & ¬
			item (random number from 1 to 26) of LoCaseLetters, Age:random number from 1 to 100}
	end repeat
	return MyList
end MakeList

script GetTick --> for more precise timer calculations
	use framework "Foundation"
	on Now()
		return (current application's NSDate's timeIntervalSinceReferenceDate) as real
	end Now
end script

→ results

→ few records shown here (byByAge)
→ {{{NickName:“Yew”, Age:1}, {NickName:“Nas”, Age:1}, .{NickName:“Dap”, Age:39}, {NickName:“Hut”, Age:39},.{NickName:“Yik”, Age:100}, {NickName:“Zan”, Age:100}, {NickName:“Ruk”, Age:100}, {NickName:“Ded”, Age:100}}, 1.586159050465, 0.225466012955}

→ few records shown here (by NickName)
→ {{{NickName:“Bac”, Age:22}, {NickName:“Bai”, Age:54}, {NickName:“Bap”, Age:79}, {NickName:“Baq”, Age:37}, .{NickName:“Hur”, Age:9}, {NickName:“Hur”, Age:72}, {NickName:“Hus”, Age:10}, {NickName:“Hut”, Age:39}, {NickName:“Huw”, Age:7}, {NickName:“Huy”, Age:69}, .{NickName:“Zur”, Age:66}, {NickName:“Zut”, Age:1}, {NickName:“Zut”, Age:66}}, 1.64743500948, 0.213027954102}

Doing a reverse can be done by an extra compObj in this manner:

script byNickNameReverse --> one script needed for every label on which sort is required
	on isLarger(x, y)
		(x's NickName < y's NickName)
	end isLarger
	on isSmaller(x, y)
		(x's NickName > y's NickName)
	end isSmaller
end script

or by simply reversing the list after sort, AppleScript is very efficient at doing this (0.069 second for 10,000 items):

set MyList to every item of reverse of MyList

Hi.

Since isSmaller(x, y) is directly equivalent to isLarger(y, x), you don’t need both handlers in the script objects. Every sort algorithm I’ve encountered can be written to rely solely on isLarger() comparisons (or isGreater() in my own versions), which greatly simplifies the writing of customising script objects, which in turn can be used with any kind of sort algorithm. I found the hardest thing to be deciding whether to standardise on isGreater() or isLess()! :wink: I finally went for isGreater().

-- .

repeat while compObj's isLarger(Piv, item I of Sal's Liszt) -- instead of compObj's isSmaller(item I of Sal's Liszt, Piv)
	set I to I + 1
end repeat
repeat while compObj's isLarger(item J of Sal's Liszt, Piv)
	set J to J - 1
end repeat

-- .

script byNickName --> one script needed for every label on which sort is required¨
	on isLarger(x, y)
		(x's NickName > y's NickName)
	end isLarger
end script

script byAge --> one script needed for every label on which sort is required
¨	on isLarger(x, y)
		x's Age > y's Age
	end isLarger
end script

There are of course a few ways to optimise the Quicksort itself, both in the AppleScript implementation and as an algorithm, but the resulting code is pretty long!

Edit: By the way, to sort on age, subsorting on nickname:

script byAgeThenByNickName
	on isLarger(x, y)
		set ageX to x's Age
		set ageY to y's Age
		(ageX > ageY) or ((ageX = ageY) and (x's NickName > y's NickName))
	end isLarger
end script

In my archives I retrieved two interesting sort codes.

(* Tri par interclassement 
--> renvoie une copie triée de la liste originale.
Implémentation: L. Sebilleau & D. Varlet
*)

on intersort(laListe, fcmp)
	-- le deuxième paramètre est un pointeur sur la fonction de comparaison à employer
	
	script lstock -- l'intérêt de ce script est d'accélérer le traitement des listes
		property listeorg : laListe
		property liste1 : {}
		property liste2 : {}
		property listea : {}
		property listeb : {}
		property lister : {}
		property Compare : fcmp -- ce pointeur de fonction est stocké là 
	end script
	
	-- répartition des éléments à trier en sous-listes de longueur deux
	
	tell lstock
		set ct to (its listeorg)'s length
		set flag to false
		if (ct mod 2) = 1 then
			set ct to ct - 1
			set flag to true
		end if
		repeat with i from 1 to ct by 2
			set x to item i of its listeorg
			set y to item (i + 1) of its listeorg
			if Compare(x, y) then -- la fonction de comparaison est employée ici
				{x, y}
			else
				{y, x}
			end if
			set the end of its liste1 to result
		end repeat
		if flag then set the end of its liste1 to {item -1 of its listeorg}
		
		-- interclassement des listes	contenues dans liste1
		repeat
			set ct to (its liste1)'s length
			if ct = 1 then exit repeat -- tant qu'il me reste plusieurs listes, c'est pas fini
			set flag to false
			if (ct mod 2) = 1 then
				set ct to ct - 1
				set flag to true
			end if
			set its liste2 to {}
			repeat with i from 1 to ct by 2 -- on prend les listes 2 par 2	
				set its listea to item i of its liste1
				set its listeb to item (i + 1) of its liste1
				set its lister to {} -- pour n'en faire qu'une seule dans celle-là 
				set cta to (its listea)'s length
				set ctb to (its listeb)'s length
				set ya to 1
				set yb to 1
				repeat
					
					-- la fonction de comparaison est aussi employée ici:
					if Compare(item yb of its listeb, item ya of its listea) then -- comparaison des premiers éléments
						set the end of its lister to (item yb of its listeb) -- "extraction" du premier
						set yb to yb + 1
						if yb > ctb then -- si cette liste est maintenant vide, on arrête là .
							if ya ≤ cta then -- il peut rester des éléments dans l'une des listes
								set its lister to (its lister) & (items ya thru -1 of its listea)
							end if
							exit repeat
						end if
					else -- symétrique dans le cas inverse
						set the end of its lister to (item ya of its listea)
						set ya to ya + 1
						if ya > cta then
							if yb ≤ ctb then -- il peut rester des éléments dans l'une des listes
								set its lister to (its lister) & (items yb thru -1 of its listeb)
							end if
							exit repeat
						end if
					end if
				end repeat
				-- la nouvelle liste produite par la fusion est ajoutée en tant que liste au résultat
				set the end of its liste2 to its lister
			end repeat -- et on recommence tant qu'il reste des paires de listes
			-- sans oublier l'orpheline si le nombre de listes est impair: on l'ajoute simplement à la fin
			if flag then set the end of its liste2 to {} & item -1 of its liste1
			set its liste1 to its liste2 -- et c'est reparti pour un tour
		end repeat
		return first item of its liste1 -- parce que c'est une liste de listes, même si elle n'en contient plus qu'une seule
	end tell
end intersort

----------- les fonctions de comparaison ------------
on cmpasc(n1, n2) -- pour le tri ascendant des nombres et des chaînes
	return n1 < n2
end cmpasc

on cmpdesc(n1, n2) -- tri descendant des nombres et des chaînes
	return n1 > n2
end cmpdesc

on recmp(n1, n2) -- tri par noms
	return (nom of n1 < nom of n2)
end recmp

on agecmp(n1, n2) -- tri par âges
	return (Age of n1 < Age of n2)
end agecmp

---------------- Pour tester ----------------

set maListe to {1, 5, 15, 2, 7, 3, 23, 125, 4, 89, 15, 8, 7, 6, 5, 27, 35, 12, 22, 160}


repeat 7 times -- 2 560 éléments
	set maListe to maListe & maListe
end repeat

set aTime0 to current date
set maListe to intersort(maListe, my cmpasc)
set aTime1 to current date
set laps to aTime1 - aTime0

log maListe
log "Durée: " & laps & " secondes"

return

set maListe to {¬
	{id:1, nom:"totor", Age:20, sexe:"M", list_IDconjoint:{}, IDPlanete:1}, ¬
	{id:2, nom:"Zézette", Age:100, sexe:"F", list_IDconjoint:{}, IDPlanete:1}, ¬
	{id:3, nom:"Lulu", Age:256, sexe:"M", list_IDconjoint:{}, IDPlanete:2}, ¬
	{id:4, nom:"Minouchet", Age:201, sexe:"F", list_IDconjoint:{}, IDPlanete:2}, ¬
	{id:5, nom:"Grmblwz", Age:12, sexe:"N", list_IDconjoint:{}, IDPlanete:2} ¬
		}

log intersort(maListe, my recmp)
log maListe
log intersort(maListe, my agecmp)

(* tri par tas (arbre binaire complet ordonné)
Ce tri figure parmi les plus efficaces et n'exige aucune mémoire supplémentaire.
Implémentation: L. Sebilleau
*)

on heapsort(laListe, fcmp)
	set n to the count of laListe
	if n ≤ 1 then return laListe
	
	script lstock -- l'intérêt de ce script est d'accélérer le traitement des listes
		property liste : laListe
		property Compare : fcmp -- pointeur de fonction de comparaison
	end script
	
	-- création du tas  par insertion fictive des éléments présents dans la liste: ils sont
	--"phagocytés" un à un par la croissance du tas dont la dimension est i.
	
	repeat with i from 2 to n
		set j to i div 2 -- le premier parent du nouvel inséré
		set k to i
		set elem to item k of lstock's liste
		repeat while k > 1
			if lstock's Compare(item j of lstock's liste, elem) then
				set item k of lstock's liste to item j of lstock's liste
				set k to j
				set j to j div 2
			else
				exit repeat
			end if
		end repeat
		set item k of lstock's liste to elem -- installé dans sa position finale
	end repeat
	
	(* destruction du tas: l'extraction répétée donne la liste triée à l'envers:
 comme le tas diminue par la fin de la liste, on peut réinsérer les éléments extraits à rebours
dans les emplacements libérés. Ce retournement explique pourquoi un utilise un tas max pour
trier en ordre croissant et vice-versa. *)
	
	repeat with i from n to 2 by -1
		set elem to item i of lstock's liste -- diminution du tas
		set item i of lstock's liste to item 1 of lstock's liste -- extraction de la racine qui est
		-- recopiée dans la position libérée
		
		-- réorganisation du tas en réinsérant l'élément de queue à la racine
		set j to 1
		set k to j * 2
		repeat while k < i
			if (k + 1) < i then
				if lstock's Compare(item k of lstock's liste, item (k + 1) of lstock's liste) then
					set k to k + 1
				end if
			end if
			if lstock's Compare(elem, item k of lstock's liste) then
				set item j of lstock's liste to item k of lstock's liste
				set j to k
				set k to k * 2
			else
				exit repeat
			end if
		end repeat
		set item j of lstock's liste to elem
	end repeat
	--return lstock's liste -- c'est facultatif
end heapsort

----------- les fonctions de comparaison ------------
on cmpasc(n1, n2) -- pour le tri ascendant des nombres et des chaînes
	return n1 < n2
end cmpasc

on cmpdesc(n1, n2) -- tri descendant des nombres et des chaînes
	return n1 > n2
end cmpdesc

on recmp(n1, n2) -- tri par noms
	return (nom of n1 < nom of n2)
end recmp

on agecmp(n1, n2) -- tri par âges
	return (Age of n1 < Age of n2)
end agecmp

---------------- Pour tester ----------------

set maListe to {1, 5, 15, 2, 7, 3, 23, 125, 4, 89, 15, 8, 7, 6, 5, 27, 35, 12, 22, 160}


repeat 7 times -- 2 560 éléments
	set maListe to maListe & maListe
end repeat

set aTime0 to current date
heapsort(maListe, my cmpasc)
set aTime1 to current date
set laps to aTime1 - aTime0

log maListe
log "Durée: " & laps & " secondes"

return

set maListe to {¬
	{id:1, nom:"totor", Age:20, sexe:"M", list_IDconjoint:{}, IDPlanete:1}, ¬
	{id:2, nom:"Zézette", Age:100, sexe:"F", list_IDconjoint:{}, IDPlanete:1}, ¬
	{id:3, nom:"Lulu", Age:256, sexe:"M", list_IDconjoint:{}, IDPlanete:2}, ¬
	{id:4, nom:"Minouchet", Age:201, sexe:"F", list_IDconjoint:{}, IDPlanete:2}, ¬
	{id:5, nom:"Grmblwz", Age:12, sexe:"N", list_IDconjoint:{}, IDPlanete:2} ¬
		}

heapsort(maListe, my recmp)
log maListe
heapsort(maListe, my agecmp)
log maListe

Yvan KOENIG running Sierra 10.12.3 in French (VALLAURIS, France) mercredi 25 janvier 2017 11:52:09

Wow! Does planet 2 have a very healthy climate or a very short year? :slight_smile:

Hi!
Thanks for improving compactness of code. My script was a direct transcript of an older quicksort AppleScript for lists. Wanted to make it more readable, but dit not need to.
(From older script:)
repeat while Piv > item i of Sal’s Liszt is in fact equivalent to repeat while item i of Sal’s Liszt < Piv
and thus only “IsLarger” is required.
Of course “Greater” feels even bigger than “Larger”. I chose “isLarger” for legibility. Thanks for the subsort trick!

I read about improving Quicksort by choosing different pivot, number of pivots, etc. But this is already an overkill to sort 500 records. It is very efficient and quite small, I will keep it in my script library. Sorting 100,000 real numbers in 23 seconds is already amazingly fast for me. No surprise Quicksort has been a favorite alrounder for so long.

If you’re sorting large lists and speed matters, you’ll do better with AppleScriptObjC where possible. For example:

use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

-- build list of 100,000 reals
set theList to {}
repeat 10000 times
	set end of theList to random number from 0.0 to 100.0
end repeat
set newList to {}
repeat 10 times
	set newList to newList & theList
end repeat

-- sort them
set theStart to current application's NSDate's |date|()
set anArray to current application's NSArray's arrayWithArray:newList
set newList to (anArray's sortedArrayUsingSelector:"compare:") as list
set timeTaken to -(theStart's timeIntervalSinceNow())

That takes <0.4 seconds, and most of that time is spent converting to and from arrays/lists.

Hello Shane

Would be useful to explain how we may apply ASObjC to sort a list of records upon one of the properties included in the records.
After all it’s the original question of this thread.

Yvan KOENIG running SIerra 10.12.3 in French (VALLAURIS, France) jeudi 26 janvier 2017 11:44:17

ASObjC didn’t exist when the original question was asked :wink:

It’s pretty simple, but there are some provisos:

  • The record labels can’t be reserved words;

  • The values must be strings, numbers, or booleans, or dates in 10.11 and above.

Then it’s just a matter of making an array and thing called a sort descriptor, which describes what key/label you want to sort on, and how to sort them. So for the NickName/Age sample posted here yesterday, you would replace the call to QuickSortLR() with:

set anArray to current application's NSArray's arrayWithArray:MyList
set theDesc to current application's NSSortDescriptor's sortDescriptorWithKey:"NickName" ascending:true selector:"caseInsensitiveCompare:"
set newList to (anArray's sortedArrayUsingDescriptors:{theDesc}) as list

You can also sort by multiple keys. For example, if you were sorting by Age, you could have the entries where theAge matches be sorted by NickName:

set anArray to current application's NSArray's arrayWithArray:MyList
set theDesc to current application's NSSortDescriptor's sortDescriptorWithKey:"Age" ascending:true selector:"compare:"
set theDesc2 to current application's NSSortDescriptor's sortDescriptorWithKey:"NickName" ascending:true selector:"caseInsensitiveCompare:"
set newList to (anArray's sortedArrayUsingDescriptors:{theDesc, theDesc2}) as list

For all but very long lists, it’s probably going to be a bit slower. The reason is because converting between records and dictionaries is time-consuming, which in turn is because AS records are very inefficient. The actual sorting is lightning fast.

My feeling is that, although records/dictionaries are a good way to work with data, large numbers of them tend to be so slow in AS that it’s often better to stick to lists of lists in such cases, and so I’m not sure the need to sort large numbers of them happens much in the real world. But that’s probably generalizing.

Thanks Shane.

I missed that the thread was a really old one.

Reading your message refreshed my memory.
It seems that I already read such pieces of code.
I will search in my archives some of these days.

Yvan KOENIG running Sierra 10.12.3 in French (VALLAURIS, France) jeudi 26 janvier 2017 14:01:02

That’s the reason why list sorting in the first versions of AST was done without using an list but building an array of descriptors to make use of quick sort which can be found in stdlib.h. The overhead of building an array and re-create an AEList was way smaller to keep the performance up.

Version 2.0 and up uses CoreFoundation where sorting is actually a bit slower for large lists than earlier versions of AST. But sorting 40k fields with an average string length of 50 characters still takes 0.2 seconds (2.3mb large CSV file) or sorting 100k real numbers costs the same in execution time.

Yes, I tried various optimizations in ASObjC Runner early on, avoiding unpacking descriptors where possible. In the end, the complexity made it harder to maintain, and there didn’t seem enough benefit.

With ASObjC, though, you often avoid a lot of the overhead. Sure, converting a list of records, sorting them, and converting back takes time. But in real situations it’s often not like that – you might convert once, then perform several operations before converting back. The transaction overhead often comes right down.

So I am trying this with dates on 10.13 using SD and I get the following error:

[<__NSCFString 0x600000e9dbf0> valueForUndefinedKey:]: this class is not key value coding-compliant for the key aDate.

Here is a sample of my list (myList), which are selected messages in Mail:
{{“aDate:Wednesday, November 6, 2019 at 6:03:24 AM”, “Subject:Take control of your life”}, {“aDate:Wednesday, November 6, 2019 at 5:18:49 AM”, “Subject:Spam!!!”}, {“aDate:Wednesday, November 6, 2019 at 9:58:21 AM”, “Subject:NEW Savings Unwrapped for the Holidays!”}, {“aDate:Wednesday, November 6, 2019 at 7:23:20 AM”, “Subject:Fall Savings Going On NOW”}, {“aDate:Wednesday, November 6, 2019 at 5:45:05 AM”, “Subject:Check out these special offers”}, {“aDate:Tuesday, November 5, 2019 at 5:06:56 AM”, “Subject:Thanks for your recent purchases!”}, {“aDate:Wednesday, November 6, 2019 at 3:30:12 PM”, “Subject:Apple TV+ Review: ‘Dickinson’”}, {“aDate:Wednesday, November 6, 2019 at 9:00:28 AM”, “Subject:RFML Digest, Vol 6, Issue 106”}}

And here is the code:

set anArray to (current application’s NSArray’s arrayWithArray:myList)
set theDesc to (current application’s NSSortDescriptor’s sortDescriptorWithKey:“aDate” ascending:true selector:“compare:”)
set newList to (anArray’s sortedArrayUsingDescriptors:{theDesc}) as list

Does anyone have any idea what is going wrong? I did a google search with the error text but that pointed me at a bunch of ObjectiveC errors and fixes.

Model: 2018 Macbook Pro
AppleScript: 2.6
Browser: Safari 605.1.15
Operating System: macOS 10.13

Your data is a list of lists, not records.

Yep, records worked.

Thanks Shane!