Saturday, November 18, 2017

#1 2015-08-21 06:13:15 pm

Nigel Garvey
Moderator
From:: Warwickshire, England
Registered: 2002-11-20
Posts: 4420

Two FULLY in-place merge sorts

One of the "disadvantages" often quoted for merge sorts is that they use a lot of memory. This is because they're usually implemented recursively and because each recursion step involves the creation of an additional list into which (or from which) that particular merge is done.

But iterative merge sorts are very obvious and simple to implement and, since they perform all the merges of a particular size in the same pass, they can use just one additional list for an entire sort, merging back and forth between it and the original. (See this, for example.)

Here though, just for the hell of it, are two merge sorts which use no additional lists at all. Obviously the problem to be solved has been what to do with values currently sitting in slots where other values have to be inserted during the merge.

The first script uses the simple expedient of right-shifting all the "left" values between a "right" value and its target slot. In this respect, it resembles an insertion sort, but it has merge sort's binary structure and comparison regime and is slightly faster than an insertion sort. (Faster than my own insertion sort, that is, and given an initially chaotic list).

Applescript:


(* Merge sort
Merge sort algorithm: John von Neumann, 1945.
AppleScript implementation: Nigel Garvey, 2007. Iterative version without auxiliary list: 2015.

Parameters: (list, range index 1, range index 2)
*)


on sort(theList, l, r)
   script o
       property lst : theList
   end script
   
   -- Process the input parmeters.
   set listLen to (count theList)
   if (listLen < 2) then return
   -- Negative and/or transposed range indices.
   if (l < 0) then set l to listLen + l + 1
   if (r < 0) then set r to listLen + r + 1
   if (l > r) then set {l, r} to {r, l}
   
   -- The sort.
   set sortRangeLen to r - l + 1
   set mergeSize to 1
   -- Repeat with the merge size doubling on each pass.
   repeat while (mergeSize < sortRangeLen)
       set leftLen to mergeSize
       set mergeSize to mergeSize * 2
       -- Iterate through the sort range's merge divisions. Odd items at the end are a division if there are enough for a full "left" component and at least one "right" item.
       repeat with leftL from l to (r - leftLen) by mergeSize
           -- Initialise tracking indices for this division's "left" and "right" components and an end index for the right.
           set i to leftL
           set j to i + leftLen
           set rightR to leftL + mergeSize - 1
           if (rightR > r) then set rightR to r
           
           set lv to item i of o's lst
           -- Repeat until the right index reaches the end of the right items.
           repeat with j from j to rightR
               set rv to item j of o's lst
               -- Advance through the left values until either the left index catches up with the right or a left value is found which is greater than the current right value.
               repeat until ((i = j) or (lv > rv))
                   set i to i + 1
                   set lv to item i of o's lst
               end repeat
               -- If the left index has caught up, we've run out of left values, so finish early.
               if (i = j) then exit repeat
               -- Otherwise shift all the intervening values one slot to the right and plonk the right value where the left value was.
               repeat with x from j to (i + 1) by -1
                   set item x of o's lst to item (x - 1) of o's lst
               end repeat
               set item i of o's lst to rv
               -- Adjust the left index to the left value's new position.
               set i to i + 1
           end repeat
       end repeat
   end repeat
   
   return -- nothing.
end sort

-- (* Demo:
set lst to {}
repeat 1000 times
   set end of my lst to (random number 1000)
end repeat
log lst

-- Sort items 1 thru -1 of lst.
sort(lst, 1, -1)
lst
-- *)

The second script is more complex, but quite a bit faster. It treats the area between the placed values on the left and the not-yet-used "right" values as a buffer containing the not-yet-used "left" values and two "vacant" slots whose contents have been assigned to variables for comparison. The buffer migrates to the right as the merge progresses and the queue of unused left values inside it loops round on itself, the leftmost value being moved to the other end where possible instead of the entire queue being moved aside. The complex dynamic of the two ends of the buffer and different parts of the queue advancing at different rates and at different times means there's often no free space to the right of the queue and it's still necessary to shift part of the queue aside. However, it's only a part of the queue  which gets moved — not the whole thing as in the first script — and the script always chooses the shorter part.

Although considerably faster than the first script, this one's not particuarly exciting speedwise, although it's certainly usable. I've had lots of great ideas for speeding it up, but this seems to be the best compromise between speed and the sort actually working. If you need a stable vanilla sort which only uses the same amount of memory as an insertion sort, but which is much faster given an initially chaotic list, this could be for you. Just don't expect a ternary version any time soon.  wink

Applescript:


(* Merge sort
Merge sort algorithm: John von Neumann, 1945.
AppleScript implementation: Nigel Garvey, 2007. Iterative version without auxiliary list: 2015.

Parameters: (list, range index 1, range index 2)
*)


on sort(theList, l, r)
   script o
       property lst : theList
   end script
   
   -- Process the input parmeters.
   set listLen to (count theList)
   if (listLen < 2) then return
   -- Negative and/or transposed range indices.
   if (l < 0) then set l to listLen + l + 1
   if (r < 0) then set r to listLen + r + 1
   if (l > r) then set {l, r} to {r, l}
   
   -- The sort.
   set sortRangeLen to r - l + 1
   set mergeSize to 1
   -- Repeat with the merge size doubling on each pass.
   repeat while (mergeSize < sortRangeLen)
       set leftLen to mergeSize
       set mergeSize to mergeSize * 2
       -- Iterate through the sort range's merge divisions. The items at the end are a division if there are enough for a full "left" component and at least one "right" item.
       repeat with leftL from l to (r - leftLen) by mergeSize
           -- Derive indices for the beginning of this division's "right" component and for the ends of both.
           set rightL to leftL + leftLen
           set leftR to rightL - 1
           set rightR to leftR + leftLen
           if (rightR > r) then set rightR to r
           
           -- Any values on the right greater than or equal to the last on the left are already in place for this merge. If they're all greater or equal, the merge isn't needed at all.
           set lv to item leftR of o's lst
           repeat until ((rightR = leftR) or (lv > item rightR of o's lst))
               set rightR to rightR - 1
           end repeat
           if (rightR > leftR) then
               -- A merge is happening. Any values on the left which are less than or equal to the first on the right are already in place too.
               set lv to item leftL of o's lst
               set rv to item rightL of o's lst
               repeat until (lv > rv)
                   set leftL to leftL + 1
                   set lv to item leftL of o's lst
               end repeat
               -- So now the first value actually placed in the merge will be the first from the right and the last will be the last from the left.
               
               -- Initialise the left queue indices.
               set ql to leftL + 1 -- Left of queue.
               set qe to rightL -- End of queue + 1.
               set qr to rightL -- Right of queue + 1.
               set i to ql -- Front of queue.
               -- And the right queue index.
               set j to rightL + 1
               -- The current lowest left and right values are in the variables lv and rv, leaving two buffer slots (at leftL and rightL) available for writing.
               
               set k to leftL -- k is the traversal index for the merge.
               -- Immediately insert the first right value into the first slot.
               set item k of o's lst to rv
               -- If more than one value remains to be inserted, do the business below. Otherwise skip to the placing of the remaining left value.
               if (rightR - leftL > 1) then
                   -- Get the next right value. Are there any more?
                   set rightsLeft to (j ≤ rightR)
                   if (rightsLeft) then
                       -- Yes. Get it from the right in the normal way.
                       set rv to item j of o's lst
                       set j to j + 1
                   else
                       -- No. Start to use up the remaining left values. Set rv to the current left value and get a new left value from the queue.
                       set rv to lv
                       set lv to item i of o's lst
                       set ql to ql + 1
                       set i to i + 1
                   end if
                   
                   -- Traverse the merge division to at most two slots before the end. The last two items will be set from rv and lv, in that order.
                   set kMax to rightR - 2
                   repeat while (k < kMax)
                       set k to k + 1
                       
                       -- Move the leftmost queued value out of harm's way if necessary.
                       if (k = ql) then
                           -- There are three possibilities:
                           if (qr < j) then
                               -- There's a vacant buffer slot to the right of the queue. Move the value there.
                               set item qr of o's lst to item ql of o's lst
                               if (k = i) then set i to qr -- Change front index if relevant.
                               set ql to ql + 1
                               set qr to qr + 1
                           else if (qr - i < qe - ql) then
                               -- No vacant slot to the right of the queue and the right queue segment's shorter than the left. Shift the entire right segment two places to the left …
                               repeat with x from i to (qr - 1)
                                   set item (x - 2) of o's lst to item x of o's lst
                               end repeat
                               set i to i - 2
                               set qr to qr - 2
                               -- … then move the leftmost value to the first of the now-vacant slots on the right.
                               set item qr of o's lst to item ql of o's lst
                               set ql to ql + 1
                               set qr to qr + 1
                           else
                               -- No vacant slot to the right and the left segment's shorter than or the same length as the right. Shift the entire left segment two places to the right, leaving two vacant spaces to the left.
                               repeat with x from (qe - 1) to ql by -1
                                   set item (x + 2) of o's lst to item x of o's lst
                               end repeat
                               set ql to ql + 2
                               set qe to qe + 2
                           end if
                       end if
                       
                       -- Select and insert the next value to be placed in the merge.
                       if (rightsLeft) then
                           if (lv > rv) then
                               -- The right candidate's less than the left. Use it.
                               set item k of o's lst to rv
                               -- If it's the last right value, start to fast-track the remaining lefts through the queue-management process.
                               set nextValueFromLeft to (j > rightR)
                               if (nextValueFromLeft) then
                                   set rightsLeft to false
                                   set rv to lv
                               end if
                           else
                               -- The left candidate's less than or equal to the right. Use it.
                               set item k of o's lst to lv
                               set nextValueFromLeft to true
                           end if
                       else
                           -- All the right values were previously used up. Insert another left without comparing it and request another.
                           set item k of o's lst to rv
                           set rv to lv
                           set nextValueFromLeft to true
                       end if
                       
                       -- Fetch the next value from the side decided above.
                       if (nextValueFromLeft) then
                           -- From the buffered queue.
                           set lv to item i of o's lst
                           if (i = ql) then set ql to ql + 1
                           set i to i + 1
                           if (i = qr) then
                               -- The right queue segment's now used up. The left's the entire queue.
                               set i to ql
                               set qr to qe
                               if (ql = qr) then
                                   -- The entire queue's gone. Shift any unfetched right values one slot to the left and exit early to have rv (lowest remaining value) and lv (highest) inserted at either end of them.
                                   repeat with x from (k + 3) to rightR
                                       set item (x - 1) of o's lst to item x of o's lst
                                   end repeat
                                   exit repeat
                               end if
                           end if
                       else
                           -- From the right.
                           set rv to item j of o's lst
                           set j to j + 1
                       end if
                       
                       -- If the left queue segment's been used up, the right's the entire queue.
                       if (ql < qe) then
                       else
                           set ql to i
                           set qe to qr
                       end if
                   end repeat
                   
                   -- After leaving the merge repeat, insert the remaining right value (or penultimate fast-tracked left) into the slot after k.
                   set item (k + 1) of o's lst to rv
               end if
               -- Insert the remaining left value into the end slot.
               set item rightR of o's lst to lv
           end if
       end repeat
   end repeat
   
   return -- nothing.
end sort

-- (* Demo:
set lst to {}
repeat 1000 times
   set end of my lst to (random number 1000)
end repeat
log lst

-- Sort items 1 thru -1 of lst.
sort(lst, 1, -1)
lst
-- *)


NG

Offline

 

Board footer

Powered by FluxBB

RSS (new topics) RSS (active topics)