Monday, July 26, 2021

You are not logged in.

**Kevin Bradley****Administrator**- From:: Independence, MO
- Registered: 2006-03-13
- Posts: 548
- Website

**Finding What Sort Of Scripter You Are**

In my previous columns in this series, I showed you a technique to simplify your script creation (pseudocode) and how to use Applescript to create three of the basic data structures that are used in computer science (stacks, queues, and priority queues). The next usual topic in most computer science courses is sorting. Sorting a list of items is one of the most common actions in any programming language, and it is often given sort shrift by most programming language creators. While some languages provide a built-in sort function, many (including Applescript) do not. But when you are showing the user of your script a long "choose from list" dialog it is much easier for the user to select something if the list is sorted.**A Bubbly Personality**

Perhaps the oldest and easiest to understand sorting method is the bubble sort. Simply put, the bubble sort compares the first two items in the (unsorted) list. If the first item is greater than the second, the items are swapped. Then the second item (which was the first if there was a swap last time) and the third item are compared and swapped if necessary. The process continues until you have worked through all the times, finally placing the largest item in the last position.

However, we're not nearly done! Now, we perform the same operation again, sorting the item next-to-last in the list, and then again for the item before next-to-last; you can see that with each pass, 1 item finds its proper place, but if the list has n items, you must loop through the list (n-1)+(n-2)+(n-3)...down to..1 times! So to sort 10 items, you have to do 9+8+7+6+5+4+3+2+1=45 comparisons. In AppleScript, the bubble sort looks like this:

## Applescript:

on bubbleSort(theList)

-- defining an internal script makes for faster run times!

script bs

property alist : theList

end script

set theCount to length of bs's alist

if theCount < 2 then return bs's alist

repeat with indexA from theCount to 1 by -1

repeat with indexB from 1 to indexA - 1

if item indexB of bs's alist > item (indexB + 1) of bs's alist then

set temp to item indexB of bs's alist

set item indexB of bs's alist to item (indexB + 1) of bs's alist

set item (indexB + 1) of bs's alist to temp

end if

end repeat

end repeat

return bs's alist

end bubbleSort

While the bubble sort is okay for small lists (less than 20 items), it begins to bog down and get very slow for large numbers of items. The reason is that to sort the entire list (no matter how large it is) you have to do roughly (n*(n+1))/2 comparisons. To computer scientists and mathematicians, the important part of that equation is n*n or n squared.

Now, for 10 items we only did 45 comparisons, but 45 is "in the neighborhood" of 10 squared (or 100). For 100 items, we would do 99+98+97+...+1 comparisons, which works out to 5000 comparisons, which is "in the neighborhood" of 100 squared (or 10,000).

When comparing sorting algorithms or other processes, computer scientists look for the "order" of the operation, also called "the Big O" (for "order"). By this they mean, is the amount of processing required related to the number of inputs by simply **n** itself, **n** squared, or some other factor? In the case of the bubble sort, the operation is "big O of **n** squared." We can do better. Although the bubble sort is good for learning to sort, it is basically useless for any real sorting. There is a variation of the bubble sort where you keep track of whether you've had to swap items during a loop through the data, and if not, you're done, but even this doesn't help the bubble sort much.

Any time you have a repeat loop inside another repeat loop, the total number of times the program will loop is the number of times through the outer loop MULTIPLIED BY the number of times through the inner loop. This is why bubble sort is Big O **n** squared. Keep this in mind any time you are writing nested repeats!**Insert Tab A into Slot B**

I don't know about you, but when I actually sort something by hand, I start a sorted stack (initially empty) and pull items from the unsorted bunch and insert the new item in its proper place in the sorted stack. If I'm short on workspace, I might even just arrange the items in order in the same stack of items. These are both examples of an insertion sort. For worst case (totally random sort), the bubble sort (**n** squared over 2) and the insertion sort (**n** squared over 4) are of about the same order. But in the situation where the data is basically in some order, then the insertion sort performs fewer swaps and is of the order (**n**+**d**) where **d** is the number of swaps performed. This is considerably better than **n** squared! Add to that the feature that you can add items spontaneously and they will be inserted in the proper place, and you have a much stronger sort than the bubble sort, which must operate on a fixed number of items.

Notice also that even though this sort has nested repeat loops, the inner loop only looks until it finds the spot for the sorted item, not through the entire list of items like bubble sort!

## Applescript:

on insertSort(theList)

--Insertion sort using the same list

--(as opposed to using a second list)

script bs

property alist : theList

end script

set theLength to length of bs's alist

if theLength > 1 then

set insertHere to 2

repeat while insertHere â‰¤ theLength

if last item of bs's alist â‰¤ first item of bs's alist then

set bs's alist to {last item of bs's alist} & items 1 thru (theLength - 1) of bs's alist

else

set lookfor to 1

repeat while lookfor < insertHere and item lookfor of bs's alist â‰¤ last item of bs's alist

set lookfor to lookfor + 1

end repeat

set bs's alist to items 1 thru (lookfor - 1) of bs's alist & last item of bs's alist & items (lookfor) thru (theLength - 1) of bs's alist

end if

set insertHere to insertHere + 1

end repeat

end if

return bs's alist

end insertSort

**The Urge to Merge**

The merge sort is based on a totally different idea. You first compare item 1 and item 2 and put them in the correct order, forming a sublist of 2 items. You do the same for items 3 and 4, making another sublist, and so on until you reach the end, creating two-item sublists that are in order within. Then you begin comparing the first item of two adjacent sublists. So you compare item 1 to item 3 (the first item of list A to first item of list B) and if they need to swap, do so. If they don't, then compare item 3 to item 2, swapping if necessary. When you are done with item 3, then do item 4 and now you've merged the two small two-item lists into a larger, four-item list. You continue this logic for all items, making larger sublists until you have two sublists containing half of the total items each, and then you merge those two lists.

The Applescript implementation of the merge sort requires us to create a handler ** inside** the main sort handler, since this is a recursive algorithm. While the merge() handler has a repeat loop, it does NOT have any nested repeats. That alone should tell us that this sort is faster! In fact, the merge sort is of the order

## Applescript:

on mergeSort(theList)

--the public routine, to be called from your script

script bs

property alist : theList

on merge(leftSide, rightSide)

--private routine called by mergeSort.

--do not call from your script!

set newList to {}

set theLeft to leftSide

set theRight to rightSide

set leftLength to length of theLeft

set rightLength to length of theRight

repeat while leftLength > 0 and rightLength > 0

if first item of theLeft â‰¤ first item of theRight then

set newList to newList & first item of theLeft

set theLeft to (rest of theLeft)

else

set newList to newList & first item of theRight

set theRight to rest of theRight

end if

set leftLength to length of theLeft

set rightLength to length of theRight

end repeat

if leftLength > 0 then set newList to newList & theLeft

if rightLength > 0 then set newList to newList & theRight

return newList

end merge

end script

set midList to 0

set leftList to {}

set rightList to {}

set listLength to length of bs's alist

if listLength â‰¤ 1 then

return bs's alist

else

set midList to listLength div 2

repeat with pointer from 1 to midList

set leftList to leftList & item pointer of bs's alist

end repeat

repeat with pointer from (midList + 1) to listLength

set rightList to rightList & item pointer of bs's alist

end repeat

set leftList to mergeSort(leftList)

set rightList to mergeSort(rightList)

end if

return bs's merge(leftList, rightList)

end mergeSort

**The Quick and the Dead**

In an amazing display of common sense, the fastest sort that is widely used is called quicksort. The quicksort algorithm is a bit more complex than the other sorts, but the results are nothing short of amazing. Using quicksort, large lists of items that are unsortable in an amount of time that a user is willing to wait are sortable in a fraction of the time.

The idea of the quicksort is to choose a midpoint item and all the items larger than that item (called the "pivot") go to the right of it and the smaller items go to the left. These two sublists are then sorted the same way: choose a pivot, sort the items smaller to one side, larger to the other. It sounds simple enough, but there have been all kinds of methods for choosing the pivot item. The simplest example of quicksort just chooses the item in the middle of the list. The version I've listed below uses an internal handler to optimize the selection of the pivot value:

## Applescript:

on quickSort(theList)

--public routine, called from your script

script bs

property alist : theList

on Qsort(leftIndex, rightIndex)

--private routine called by quickSort.

--do not call from your script!

if rightIndex > leftIndex then

set pivot to ((rightIndex - leftIndex) div 2) + leftIndex

set newPivot to Qpartition(leftIndex, rightIndex, pivot)

set theList to Qsort(leftIndex, newPivot - 1)

set theList to Qsort(newPivot + 1, rightIndex)

end if

end Qsort

on Qpartition(leftIndex, rightIndex, pivot)

--private routine called by quickSort.

--do not call from your script!

set pivotValue to item pivot of bs's alist

set temp to item pivot of bs's alist

set item pivot of bs's alist to item rightIndex of bs's alist

set item rightIndex of bs's alist to temp

set tempIndex to leftIndex

repeat with pointer from leftIndex to (rightIndex - 1)

if item pointer of bs's alist â‰¤ pivotValue then

set temp to item pointer of bs's alist

set item pointer of bs's alist to item tempIndex of bs's alist

set item tempIndex of bs's alist to temp

set tempIndex to tempIndex + 1

end if

end repeat

set temp to item rightIndex of bs's alist

set item rightIndex of bs's alist to item tempIndex of bs's alist

set item tempIndex of bs's alist to temp

return tempIndex

end Qpartition

end script

if length of bs's alist > 1 then bs's Qsort(1, length of bs's alist)

return bs's alist

end quickSort

Some versions of quicksort just grab the first item for the pivot, but if the list is already somewhat sorted, then the benefits of quicksort evaporate and you're stuck with a sort that is Big O **n** squared again! To get around this fault in quicksort, some prefer to choose the middle item in the list or the middle value of the first, last and middle items. This version starts with the middle value and adjusts it when creating the partitions to find an optimal value. By doing this, this sort is almost always on the order of **n***log(**n**) even in the worst possible case.**Out of Sorts**

Sorting is something all of us use (or want to use) at one time or another. I hope this has helped show the pitfalls of sorting so you can select a sort that is appropriate for your use. I originally started working on sorting because I had an iTunes script that I wanted to display a list of my albums and the initial list was an unsorted mess. In researching sorts, I found that some are so slow that the are completely unusable. Others were only slow because I'd coded them in such a way that Applescript was dragging its feet, and recoding them speeded them up greatly. The quicksort above is VERY usable, even on an old G3 iMac (like mine) with an iTunes library of about 1000 songs. The "choose from list" dialog pops up very fast!

There is also a great thread in the MacScripter bbs' Code Exchange forum on sorting, and I'd like to thank everyone who participated in that discussion, as it not only helped me to write these sort routines, they also helped me improve these routines for speed! All the sort routines in this article, plus the versions of quicksort optimized by Adam Bell and by Nigel Garvey that are in the Code Exchange thread, are in my Nite Flite library's sortLib on ScriptBuilders.net as well.**More Information**

For more information about sorting:

Wikipedia - Sorting Algorithms

Wikipedia - Bubble Sort

Wikipedia - Insertion Sort

Wikipedia - Merge Sort

Wikipedia - Quicksort

Nitewing '98

--

I distrust morning people, largely because I suspect them of getting together early one day while the rest of us were asleep and setting up the rules of civilization.

Offline