That “optimized pivot” idea looks very interesting. Thanks for posting it. I’m going to have a happy couple of days trying to analyse it to see if Arthur’s and my effort can benefit from it!
Meanwhile, I’ve taken the liberty of applying a couple of optimisations to your Qpartition() handler. Basically, they use existing variable values to reduce the number of times the list is accessed. There’s also an exploitation of an AppleScript quirk whereby “simple” comparisons are faster than “compound” ones. It’s faster to do this .
if a > b then
else
-- Do something.
end if
. than this:
if a ≤ b then
-- Do something.
end if
Similarly with ≥ and not. It’s faster to test their antitheses than to test them. The difference is minimal, but every little helps in a large sort!
on Qpartition(leftIndex, rightIndex, pivot)
--private routine called by quickSort.
--do not call from your script!
set pivotValue to item pivot of bs's alist
--set temp to item pivot of bs's alist -- We already have this as pivotValue.
set item pivot of bs's alist to item rightIndex of bs's alist
--set item rightIndex of bs's alist to temp -- No need to put it back in the list.
set tempIndex to leftIndex
repeat with pointer from leftIndex to (rightIndex - 1)
set temp to item pointer of bs's alist -- temp set here to save getting the item twice.
if temp > pivotValue then --≤ pivotValue then
else
--set temp to item pointer of bs's alist -- Moved up.
set item pointer of bs's alist to item tempIndex of bs's alist
set item tempIndex of bs's alist to temp
set tempIndex to tempIndex + 1
end if
end repeat
--set temp to item rightIndex of bs's alist -- Already in pivotValue.
set item rightIndex of bs's alist to item tempIndex of bs's alist
set item tempIndex of bs's alist to pivotValue --temp
return tempIndex
end Qpartition
Funny you should mention. As I was writing this (and the other sorts - bubble, insertion, and merge) I found myself wondering if there was a difference, but I didn’t code it up to see if there was.
Hey everybody
as I’m not the perfect scripter as you guys, I’m having a hard time to transform the qsort routine into a qsort the would match my needs.
I’ve been using Kevins’s pivot version at the moment - and was also unsuccesfully trying to implement Nigel’s superfast version (I’m working in a tell application routine and I couldn’t call Nigel’s routine through that)
What I actually want to achieve is removing the duplicates, while sorting the list.
As I didn’t get that done, for the moment, I’m rearranging the sorted list like that:
set srtLst to my quickSort(Lst)
set lgthsrtLst to srtLst's length
set Lst to {}
repeat with x1 from 1 to lgthsrtLst
set TmpItm to item x1 of srtLst
if x1 < (lgthsrtLst) then
if TmpItm = item (x1 + 1) of srtLst then
else
set Lst to Lst & TmpItm
end if
end if
if x1 = (lgthsrtLst) then set Lst to Lst & TmpItm
end repeat
It would probably be much faster, if I’d erase the duplicates directly in the qsort handler.
could you give me a hint on how to get that done?
Thank you so very much!
And most important thank you so much for posting your wisdom in general!!
ekel75
PS. do you know which is faster, using a variable for 3 to 4 instances in a script
set lgthCalcNum to CalcNum's length
if (lgthCalcNum) > 6 then set CalcNum to (text 1 thru ((lgthCalcNum) - 6) of CalcNum) & "." & text ((lgthCalcNum) - 5) thru (lgthCalcNum) of CalcNum
or not!? ( like this )
if (CalcNum's length) > 6 then set CalcNum to (text 1 thru ((CalcNum's length) - 6) of CalcNum) & "." & text ((CalcNum's length) - 5) thru (CalcNum's length) of CalcNum
Model: Desktop G4 800MHz
AppleScript: AppleScript D1-1.9.3
Browser: Firefox 1.5.0.7
Operating System: Mac OS X (10.4)
I don’t think that combining a sort with removing the duplicates would actually be faster. Sorts do a large number of item comparisons “ often a very large number. The comparisons check if one item is greater or less than another. It would greatly increase the workload if each stage of the sort also included a check to see if the items were equal, and making a new list every time a duplicate was found would increase it even more. There may be an effective way to reduce the slow-down, but I can’t see it.
Your process of sorting the list first and then building another without the duplicates is probably the best way. I’d code it something like this:
-- A script object for speed of access to the lists.
script o
property sortedList : missing value
property prunedList : missing value
end script
-- Do the sort.
set o's sortedList to quickSort(myList)
-- Since the list has been sorted, equal values will be together, not randomly distributed.
-- Initialise a check value to the first item in the sortedList
set checkValue to item 1 of o's sortedList
-- Start off the pruned list with that value.
set o's prunedList to {checkValue}
repeat with i from 2 to (count o's sortedList)
set thisValue to item i of o's sortedList
if (thisValue is checkValue) then
-- If this value's a duplicate, skip it.
else
-- Otherwise, we've reached (a possible run of) a new value.
-- Append this value to the pruned list and make it the new check value.
set end of o's prunedList to thisValue
set checkValue to thisValue
end if
end repeat
set myList to o's prunedList
Another way, if the items in the list are all of the same class and you know when you write the script what that will be, is this:
-- A script object for speed of access to the list.
script o
property sortedList : missing value
end script
-- Do the sort.
set o's sortedList to quickSort(myList)
-- Since the list has been sorted, equal values will be together, not randomly distributed.
-- Initialise a check value to the first item in the sortedList
set checkValue to item 1 of o's sortedList
repeat with i from 2 to (count o's sortedList)
set thisValue to item i of o's sortedList
if (thisValue is checkValue) then
-- If this value's a duplicate, replace it with 'missing value'
set item i of o's sortedList to missing value
else
-- Otherwise, we've reached (a possible run of) a new value.
-- Make it the new check value.
set checkValue to thisValue
end if
end repeat
-- Assuming that this is a list of integers, return the integers (not the 'missing values').
set myList to o's sortedList's integers
I’ve just seen the addendum to your post. The version with the variable should be slightly faster, but the difference will be miniscule with just those two lines. Basically, retrieving a value from a variable is faster than calling a function or using a reference to work it out again.
thank you very much for your quick response and all your effort!
I like your routine very much. My approach has the same basic principle but your’s just got that neatly programmed twist to it.
Simply pro! Thank you very much for making me understand Applescript + programming more and more
So now I’m trying to implement your second version, since it’s all the same class.
I tried it with an empty script and worked like a charm.
Trying to implement it doesn’t work too well though
I’m in a < tell application > routine, and some of the stuff from Applescript has to be handled differently from within < tell application, >
which I haven’t fully figured out yet. (Which is why I’m not using your quicksort ) Calling subroutines works fine though,
so I tried to put your script into a new handler and calling it up like that:
tell application "Adobe InDesign CS2"
activate
tell document 1
...
set Lst to my pruneList(Lst)
...
end tell
end tell
--
--Subroutines
on quickSort(theLst)
...
end quickSort
on pruneList(thePruneLst)
script o
property sortedList : missing value
end script
set o's sortedList to my quickSort(thePruneLst)
-- Initialise a check value to the first item in the sortedList
set checkValue to item 1 of o's sortedList
repeat with i from 2 to (count o's sortedList)
set thisValue to item i of o's sortedList
if (thisValue is checkValue) then
-- If this value's a duplicate, replace it with 'missing value'
set item i of o's sortedList to missing value
else
-- Otherwise, we've reached (a possible run of) a new value.
-- Make it the new check value.
set checkValue to thisValue
end if
end repeat
return o's sortedList's text
end pruneList
But somehow I’m not getting it right. Am I not allowed to call the quickSort handle from within a second handler ?
Are some values not transported right!?!?
thank you very much for your help!
ekel75
–
– FOUND IT !?!
I guess I found it myself, though I still don’t get why it worked before!
the list “Lst” is not actual text but a reference to text within an Indesign Document…
it looks like this
{text from character 1 to character 7 of cell id 0 of table id 853847 of cell id 26 of table id 853696 of cell id 2 of table id 853446 of text frame id 852504 of page id 852502 of spread id 852497 of document “testdoc.indd” of application “Adobe InDesign CS2”, …}
so I included this:
set Lst to (search with find attributes {applied character style:CharStyle})
repeat with x from 1 to (count of Lst)
set item x of Lst to item x of Lst as text
end repeat
set Lst to my pruneList(Lst)
what do you think? you probably have some slick coded idea up your sleeve right as you’re reading this
I’m afraid I don’t have InDesign and am not familiar with scripting it. But it looks as though what you’re doing is the way to do it: get the references, coerce to text, sort, prune. (The pruning method relies on the fact that the texts are sorted.) If Lst contains a large number of items, a script object might speed up the coercion loop a little too:
script o
property Lst : missing value
end script
set o's Lst to (search with find attributes {applied character style:CharStyle})
repeat with x from 1 to (count o's Lst)
set item x of o's Lst to item x of o's Lst as text
end repeat
set theList to my quickSort(o's Lst)
set theList to my pruneList(theList)
thank you very much for your reply! Just found it after posting my last question… (** edit: deleted my last post**)
I changed the script in that way… every bit of a second counts with large lists!
your quicksort is a perfect example (still can’t call it from within the main script :()
my prune script is now calling the quicksort so I only call prunelist from the mainscript.
(cause I can’t prune without sorting first anyway)
… do you have a hint for my last question?! (** edit: deleted my last post**)
helplessy yours,
ekel75
** edit
* @Nigel:
I just noticed that your CustomQSort works as a handler, with which I’m not having problems calling from my main script!
So hoooooray! All my sort problems are gone again! Thank you sooo much!
Happy me!
*
** end of edit
*
This algorithm and its implementation is very nice !
I am using it often, but unfortunately I am not clever enough (okay, maybe too lazy …) to understand its structure.
It should be very easy to use this algorithm as a searching routine as well. As I understood the QuickSort algortihm, it is more less nothing other like a searching routine, or it should be quite easy to modify it for this use.
Is there anybody able and so kind to tell me how to use this QuickSort Algorith as a searching routine, i.e. searching for an item, giving out the (one and only, or first) index containig it ?
I wonder if Jens is thinking of a binary search, which is distantly related to QuickSort in that it’s a “divide and conquer” process. In AppleScript, you use ‘is in’ to determine which half of the list the item’s in, then which half of that half, and so on. It can be faster than checking each item from the beginning of the list, but not by very much. And (if I remember correctly) it depends on the length of the list and where the item actually is in it. An item near the beginning might be found more quickly by a linear search. And of course a binary search can use a lot of memory generating the half lists for checking.
on binarySearch(theList, value)
script o
property lst : theList
end script
set valueAsList to {value}
set l to 1
set r to (count theList)
if (valueAsList is in theList) then
repeat until (value is item l of o's lst)
set l to l + 1
set m to (l + r) div 2
if (valueAsList is in items l thru m of o's lst) then
set r to m - 1
else
set l to m + 1
end if
end repeat
return l
else
return 0
end if
end binarySearch
Any speed advantage comes from the fact that ‘is in’ loops through the list(s) and compares the items by means of its underlying C code, which is faster than the equivalent instructions at the AppleScript level. It has to be sufficiently faster here that, even when it’s done several times, and with all the other AppleScript commands involved, it’s still faster than looping through the list and comparing the items with AppleScript.