# [AHK_L]Rosetta Code - Merge Sort

11 replies to this topic

### #1 rbrtryn

rbrtryn
• Members
• 790 posts

Posted 15 October 2011 - 06:34 PM

Most of the Rosetta Code examples for AHK are extremely dated, so I decided update one of the easier ones.

This code implements the Merge Sort.

I'll post this at Rosetta Code, after it's vetted here.

```#NoEnv

Test := []
Loop 100 {
Random n, 1, 1000
Test.Insert( n )
}
Result := MergeSort( Test )
Loop % Result.MaxIndex() {
MsgBox, 1, , % Result[A_Index]
IfMsgBox Cancel
Break
}
Return

/*
Function MergeSort
Sorts an array by first recursively splitting it down to its
individual elements and then merging those elements in their
correct order.

Parameters
Array   The array to be sorted

Returns
The sorted array
*/
MergeSort( Array )
{
; Return single element arrays
If ( ! Array.HasKey( 2 ) )
Return Array

; Split array into Left and Right halfs
Left := []
Right := []
Middle := Array.MaxIndex() // 2
While ( Array.MaxIndex() )
If ( A_Index <= Middle )
Left.Insert( Array.Remove( 1 ) )
Else
Right.Insert( Array.Remove( 1 ) )

; Recurse into Left and Right
Left  := MergeSort( Left )
Right := MergeSort( Right )

Return ___Merge( Left, Right )
}

/*
Function ___Merge
Merges two sorted arrays so that the result is sorted

Parameters
Left    The first array
Right   The second array

Returns
The sorted merging of Left and Right
*/
___Merge(Left, Right)
{
Result := []

; If all the Right values are greater than all the
; Left values, just append Right at the end of Left.
If ( Left[Left.MaxIndex()] <= Right[1] ) {
While ( Right.MaxIndex() )
Left.Insert( Right.Remove( 1 ) )
Return Left
}

; Loop until one of the arrays is empty
While( Left.MaxIndex() and Right.MaxIndex() )
If ( Left[1] <= Right[1] )
Result.Insert( Left.Remove( 1 ) )
Else
Result.Insert( Right.Remove( 1 ) )

While ( Left.MaxIndex() )
Result.Insert( Left.Remove( 1 ) )

While ( Right.MaxIndex() )
Result.Insert( Right.Remove( 1 ) )

Return Result
}```

Hopefully this will inspire others to update Rosetta Code using AHK_L syntax.
_________________________
Composed/Posted with WYSIWYG BBCode Editor

### #2 Uberi

Uberi
• Fellows
• 1046 posts

Posted 17 October 2011 - 08:11 PM

Here's a single function version that's a bit faster:

```SortObject(InputObject)
{
MaxIndex := ObjMaxIndex(InputObject), (MaxIndex = "") ? (MaxIndex := 0) : ""
If (MaxIndex < 2)
Return, InputObject
Middle := MaxIndex >> 1, SortLeft := Object(), SortRight := Object()
Loop, %Middle%
ObjInsert(SortLeft,InputObject[A_Index]), ObjInsert(SortRight,InputObject[Middle + A_Index])
If (MaxIndex & 1)
ObjInsert(SortRight,InputObject[MaxIndex])
SortLeft := SortObject(SortLeft), SortRight := SortObject(SortRight), MaxRight := MaxIndex - Middle, LeftIndex := 1, RightIndex := 1, Result := Object()
Loop, %MaxIndex%
{
If (LeftIndex > Middle)
ObjInsert(Result,SortRight[RightIndex]), RightIndex ++
Else If ((RightIndex > MaxRight) || (SortLeft[LeftIndex] < SortRight[RightIndex]))
ObjInsert(Result,SortLeft[LeftIndex]), LeftIndex ++
Else
ObjInsert(Result,SortRight[RightIndex]), RightIndex ++
}
Return, Result
}```

### #3 rbrtryn

rbrtryn
• Members
• 790 posts

Posted 17 October 2011 - 09:36 PM

How much faster is it? I don't have any reliable means to measure this at my disposal.

The extra speed may come at a cost. If I am reading your algorithim correctly, your function requires 3n locations to implement. This is conciderably worse than the 2n locations normally required to implement this sort.

Merge sort also has some demerits. One is its use of 2n locations; the additional n locations were needed because one couldn't reasonably merge two sorted sets in place

MergeSort uses AHK's feature of releasing array space at run time to only use n locations.

### #4 Uberi

Uberi
• Fellows
• 1046 posts

Posted 17 October 2011 - 09:53 PM

Benchmark results:

```Benchmark:       Merge Sort implementations
Iterations:      10000
Control Run:     4.566790 ms.  (0.000457 ms. per run)
Original:        41.607291 ms. (0.004161 ms. per run)
Uberi's version: 27.878923 ms. (0.002788 ms. per run)```

Summary: the one I posted is about 1.4924677188 times faster.

Edit: Forgot to mention I used an array with 100 items as in the included example.

### #5 rbrtryn

rbrtryn
• Members
• 790 posts

Posted 18 October 2011 - 03:07 PM

Summary: the one I posted is about 1.4924677188 times faster.

Well that is significant. Could you try this version to see if it is any faster?

[*:24f8fcak]Borrowed your idea of only looping to the middle when splitting.
[*:24f8fcak]Changed most of the While statements to Loop.
[*:24f8fcak]Combined the two functions into oneCan you see any other improvements I can make, while still maintaining the {1n} space requirement?
```#NoEnv

Test := []
Loop 100 {
Random n, 0, 999
Test.Insert(n)
}
Result := MergeSort(Test)
Loop % Result.MaxIndex() {
MsgBox, 1, , % Result[A_Index]
IfMsgBox Cancel
Break
}
Return

/*
Function MergeSort
Sorts an array by first recursively splitting it down to its
individual elements and then merging those elements in their
correct order.

Parameters
Array   The array to be sorted

Returns
The sorted array
*/
MergeSort(Array)
{
; Return single element arrays
Return Array

; Split array into Left and Right halfs
Left := [], Right := [], Middle := Array.MaxIndex() // 2
Loop % Middle
Right.Insert(Array.Remove(Middle-- + 1)), Left.Insert(Array.Remove(1))
If (Array.MaxIndex())
Right.Insert(Array.Remove(1))

Left := MergeSort(Left), Right := MergeSort(Right)

; If all the Right values are greater than all the
; Left values, just append Right at the end of Left.
If (Left[Left.MaxIndex()] <= Right[1]) {
Loop % Right.MaxIndex()
Left.Insert(Right.Remove(1))
Return Left
}
; Loop until one of the arrays is empty
While(Left.MaxIndex() and Right.MaxIndex())
Left[1] <= Right[1] ? Array.Insert(Left.Remove(1))
: Array.Insert(Right.Remove(1))

Loop % Left.MaxIndex()
Array.Insert(Left.Remove(1))

Loop % Right.MaxIndex()
Array.Insert(Right.Remove(1))

Return Array
}
```

### #6 Uberi

Uberi
• Fellows
• 1046 posts

Posted 19 October 2011 - 02:08 AM

Benchmarked again:

```Benchmark:       Merge sort 100 item array
Iterations:      1000
Control Run:     4.100883 ms.    (0.004101 ms. per run)
Original:        3456.203593 ms. (3.456204 ms. per run)
Uberi's version: 2741.191417 ms. (2.741191 ms. per run)```

Seems like the last test was faulty sure to MergeSort() modifying the source object. I've added ObjClone(Test) to ensure that does not affect results. It is actually only 1.2608402698 times faster. (Edit: I got it up as high as 1.3547413737 times faster, but it was so terse as to be unreadable. I don't think that would be very suitable for RosettaCode, so I won't post it here)

I realize that memory usage is very important, but all those array manipulations have a significant cost proportional to O(log(n)) since arrays are binary trees.

My version would gain a significant speed advantage given "real" arrays, contiguous memory that can be accessed by an offset. However, your code is much more space efficient. It's a tradeoff.

### #7 rbrtryn

rbrtryn
• Members
• 790 posts

Posted 20 October 2011 - 07:20 PM

Sorry for not getting back to you sooner. I want to thank for your time and help with this

What benchmarking software do you use?

I realize that memory usage is very important, but all those array manipulations have a significant cost proportional to O(log(n)) since arrays are binary trees.

:oops: I should have realised that was how they were implemented. In that case adding/deleting nodes is going to be a real time suck. This version avoids that as much as possible and it reuses the input array as the output. No sense in building a whole new array just to contain the result. It uses 2n locations, which is typical for this type of sort.
```/*
Function MergeSort
Sorts an array by first recursively splitting it down to its
individual elements and then merging those elements in their
correct order.

Parameters
Array   The array to be sorted

Returns
The sorted array

Reference:
http://www.sorting-algorithms.com/merge-sort
*/
MergeSort(Array)
{
; Base case
If ( !Array.MaxIndex() or Array.MaxIndex() = 1)
Return Array

; Split array into Left and Right halfs
Left := [], Right := [], Middle := Array.MaxIndex() >> 1
Loop % Middle
Left[A_Index] := Array[A_Index], Right[A_Index] := Array[Middle + A_Index]
If (Array.MaxIndex() & 1)
Right.Insert(Array[Array.MaxIndex()])

Left := MergeSort(Left), Right := MergeSort(Right)

Index := LIndex := RIndex := 1
While (LIndex <= Left.MaxIndex() and RIndex <= Right.MaxIndex())
Array[Index++] := Left[LIndex] <= Right[RIndex] ? Left[LIndex++] : Right[RIndex++]

While (LIndex <= Left.MaxIndex())
Array[Index++] := Left[LIndex++]

While (RIndex <= Right.MaxIndex())
Array[Index++] := Right[RIndex++]
Return Array
}```

### #8 Uberi

Uberi
• Fellows
• 1046 posts

Posted 21 October 2011 - 01:58 AM

What benchmarking software do you use?

It's a custom script that I haven't released since it's quite similar to Leo Xiong/Nameless-exe's Script Benchmarker. I can post it here if you like, but the script I linked to works quite nicely as well.

If you're going for maximum time and memory efficiency, I think this may be it:

```SortArray(InputArray)
{
MaxIndex := ObjMaxIndex(InputArray), (MaxIndex = "") ? (MaxIndex := 0) : ""
If (MaxIndex < 2)
Return, InputArray
Middle := MaxIndex >> 1
SortLeft := ObjClone(InputArray), SortRight := ObjClone(InputArray)
ObjRemove(SortLeft,Middle + 1,MaxIndex), ObjRemove(SortRight,1,Middle)
SortArray(SortLeft), SortArray(SortRight), MaxRight := MaxIndex - Middle, LeftIndex := 1, RightIndex := 1
Loop, %MaxIndex%
InputArray[A_Index] := (LeftIndex <= Middle && (RightIndex > MaxRight || SortLeft[LeftIndex] < SortRight[RightIndex])) ? SortLeft[LeftIndex ++] : SortRight[RightIndex ++]
}```

It's 1.883936 times faster than the version you have posted immediately above:

```Uberi's version: 867.160864
Original:        1528.096053```

Also, it is an in-place sort now:

```Array := ["A","B","C"]
SortArray(Array)
;array is now sorted```

However, since it is somewhat less readable than the previous examples, I'd understand if you want to use your version, since RosettaCode entries are more for educational purposes anyways.

### #9 Guests

• Guests

Posted 03 November 2011 - 07:36 AM

Uberi, excellent. I tried your function with huge amount of data and it runs fine. I wonder why your function does not cause stack overflow in spite of the recursive function calls although AHK has a limitation with it.

### #10 Uberi

Uberi
• Fellows
• 1046 posts

Posted 14 January 2012 - 01:30 AM

I've come across this code again recently, and having learned more about sorting in the meantime, I've made it faster once again:

```SortArray(InputArray)
{
MaxIndex := ObjMaxIndex(InputArray), (MaxIndex = "") ? (MaxIndex := 0) : ""
If MaxIndex <= 8
{
If MaxIndex <= 1
Return, InputArray
Loop, % MaxIndex - 1
{
Index := A_Index
While, Index > 0 && InputArray[Index] > InputArray[Index + 1]
Value := InputArray[Index + 1], InputArray[Index + 1] := InputArray[Index], InputArray[Index] := Value, Index --
}
}
Else
{
Middle := MaxIndex >> 1
SortLeft := ObjClone(InputArray), ObjRemove(SortLeft,Middle + 1,MaxIndex)
SortRight := ObjClone(InputArray), ObjRemove(SortRight,1,Middle)
SortArray(SortLeft), SortArray(SortRight), MaxRight := MaxIndex - Middle, LeftIndex := 1, RightIndex := 1
Loop, %MaxIndex%
InputArray[A_Index] := (LeftIndex <= Middle && (RightIndex > MaxRight || SortLeft[LeftIndex] < SortRight[RightIndex])) ? SortLeft[LeftIndex ++] : SortRight[RightIndex ++]
}
}```

Now the function switches to the insertion sort algorithm if the number of elements is less than or equal to 8, which is the threshold after which insertion sort is faster than merge sort.

While insertion sort is known to be quadratic in time complexity, for small arrays it is actually more efficient than many other algorithms. Switching to the right algorithm for the given number of elements combines the advantages of both.

### #11 Uberi

Uberi
• Fellows
• 1046 posts

Posted 14 January 2012 - 01:36 AM

Oh and here are the benchmarks:

```Benchmark:        Sorting functions
Iterations:       1000
Control Run:      0.863407 ms.    (0.000863 ms. per run)
Original version: 3004.968792 ms. (3.004969 ms. per run)
Uberi's version:  1383.388197 ms. (1.383388 ms. per run)```

It is now 2.172180 times faster than the original version.

### #12 Deo

Deo
• Members
• 199 posts

Posted 16 January 2012 - 09:10 AM

cool, thanks for your great work