Sorting

Squirm
05-15-2003, 05:57 PM
One of the most frequently used procedures we need to implement in our programs is sorting. This is an area of computing which has been intensively studied, and there are many sorting algorithms to choose between. These next few posts will attempt to explain and demonstrate some of the most common:


Bubble sort (http://www.visualbasicforum.com/showpost.php?postid=386990&postcount=2)
Selection sort (http://www.visualbasicforum.com/showpost.php?postid=386991&postcount=3)
Insertion sort (http://www.visualbasicforum.com/showpost.php?postid=386992&postcount=4)
Quick sort (http://www.visualbasicforum.com/showpost.php?postid=386993&postcount=5)
Merge sort (http://www.visualbasicforum.com/showpost.php?postid=386994&postcount=6)
Heap sort (http://www.visualbasicforum.com/showpost.php?postid=386995&postcount=7)
Preliminary Results (http://www.visualbasicforum.com/showpost.php?postid=386996&postcount=8)
BubbleExit Sort (http://www.visualbasicforum.com/showpost.php?postid=388008&postcount=9)
Comb Sort (http://www.visualbasicforum.com/showpost.php?postid=388009&postcount=10)
Shell Sort (http://www.visualbasicforum.com/showpost.php?postid=388012&postcount=11)
Shaker Sort (http://www.visualbasicforum.com/showpost.php?postid=388014&postcount=12)
Radix Sort (http://www.visualbasicforum.com/showpost.php?postid=388016&postcount=13)
Results (http://www.visualbasicforum.com/showpost.php?postid=388023&postcount=14)
Project Files (http://www.visualbasicforum.com/showpost.php?postid=388040&postcount=15)
Visual Sorting (http://www.visualbasicforum.com/showpost.php?postid=403676&postcount=16)

Squirm
05-15-2003, 05:57 PM
Bubble (or Sinking) Sort
This is perhaps the most widely known and used sort, and is certainly one of the simplest. The algorithm consists of a series of passes through the data, with the larger values bubbling up to one end and at the same time smaller values sinking to the other end.

Example (underline shows where values are being compared):
Array of data [6,7,9,3,2,5]

Bubbling pass 1
[6,7,9,3,2,5]
[6,7,9,3,2,5]
[6,7,9,3,2,5]
[6,7,3,9,2,5]
[6,7,3,2,9,5]

Bubbling pass 2
[6,7,3,2,5,9]
[6,7,3,2,5,9]
[6,3,7,2,5,9]
[6,3,2,7,5,9]

Bubbling pass 3
[6,3,2,5,7,9]
[3,6,2,5,7,9]
[3,2,6,5,7,9]

Bubbling pass 4
[3,2,5,6,7,9]
[2,3,5,6,7,9]

Bubbling pass 5
[2,3,5,6,7,9]

Finish [2,3,5,6,7,9]

The code to implement this is fairly simple. As you might have already spotted, a nested loop of some kind is in order. I choose to use 2 nested For..Next loops:

Public Sub BubbleSort(ByRef lngArray() As Long)
Dim iOuter As Long
Dim iInner As Long
Dim iLBound As Long
Dim iUBound As Long
Dim iTemp As Long

iLBound = LBound(lngArray)
iUBound = UBound(lngArray)

'Which bubbling pass
For iOuter = iLBound To iUBound - 1
'Which comparison
For iInner = iLBound To iUBound - iOuter - 1

'Compare this item to the next item
If lngArray(iInner) > lngArray(iInner + 1) Then
'Swap
iTemp = lngArray(iInner)
lngArray(iInner) = lngArray(iInner + 1)
lngArray(iInner + 1) = iTemp
End If

Next iInner
Next iOuter
End Sub

Squirm
05-15-2003, 05:58 PM
Selection Sort
This sort works on a simple procedure - take an array position and then look through the remaining items to see which one fits. This item then swaps places with the one which is already there. Do this for every position in the array and by the time we've run out of positions, the array is fully sorted.

Example:
Array of data [6,7,9,3,2,5]

[6,7,9,3,2,5]
^ Which element goes here?
[6,7,5,3,2,9]
^ Which element goes here?
[6,2,5,3,7,9]
^ Which element goes here?
[3,2,5,6,7,9]
^ Which element goes here?
[3,2,5,6,7,9]
^ Which element goes here?

Finish [2,3,5,6,7,9]

Again nested loops are in order, one for each iteration, and one to find the largest remaining value:

Public Sub SelectionSort(ByRef lngArray() As Long)
Dim iOuter As Long
Dim iInner As Long
Dim iLBound As Long
Dim iUBound As Long
Dim iTemp As Long
Dim iMax As Long

iLBound = LBound(lngArray)
iUBound = UBound(lngArray)

For iOuter = iUBound To iLBound + 1 Step -1

iMax = 0

'Find the largest value in the subarray
For iInner = iLBound To iOuter
If lngArray(iInner) > lngArray(iMax) Then iMax = iInner
Next iInner

'Swap with last slot of the subarray
iTemp = lngArray(iMax)
lngArray(iMax) = lngArray(iOuter)
lngArray(iOuter) = iTemp

Next iOuter
End Sub

Squirm
05-15-2003, 05:58 PM
Insertion Sort
The insertion sort is the exact opposite of the selection sort. Instead of selecting a position then finding the value which fits, we instead select a value and then find a position for it within the other sorted values. Any values which are in the way are shifted one by one. The number of sorted elements gradually increases until the whole array is sorted.

Example:
Array of data [6,7,9,3,2,5]

[6][7,9,3,2,5]
^ Where does this value go in [6]?
[6,7][9,3,2,5]
^ Where does this value go in [6,7]?
[6,7,9][3,2,5]
^ Where does this value go in [6,7,9]?
[3,6,7,9][2,5]
^ Where does this value go in [3,6,7,9]?
[2,3,6,7,9][5]
^ Where does this value go in [2,3,6,7,9]?

Finish [2,3,5,6,7,9]

Once more, nested loops are the order of the day.

Public Sub InsertionSort(ByRef lngArray() As Long)
Dim iOuter As Long
Dim iInner As Long
Dim iLBound As Long
Dim iUBound As Long
Dim iTemp As Long

iLBound = LBound(lngArray)
iUBound = UBound(lngArray)

For iOuter = iLBound + 1 To iUBound

'Get the value to be inserted
iTemp = lngArray(iOuter)

'Move along the already sorted values shifting along
For iInner = iOuter - 1 To iLBound Step -1
'No more shifting needed, we found the right spot!
If lngArray(iInner) <= iTemp Then Exit For

lngArray(iInner + 1) = lngArray(iInner)
Next iInner

'Insert value in the slot
lngArray(iInner + 1) = iTemp
Next iOuter
End Sub

Squirm
05-15-2003, 05:59 PM
This is where things start getting complicated - and a lot faster. The main problem with the first 3 are that almost every value must be compared with almost every other, resulting in a lot of comparisons. If we break the sorts down into a number of smaller sorts and then combine them, we might speed things up.

Quick Sort
The quick sort works by gradually dividing each sort into 2 smaller sorts. The first thing we do is select a pivot value from our values. Using this, we swap elements over so that all the values less than the pivot are on the left and all the values greater than the pivot are on the right. The individual left and right subgroups are still unsorted within themselves, but in relation to each other the groups are correct. Once we have done this, we call the procedure recursively on the left group and the right group.

Example (pivot values underlined):
Array of data [2,8,4,9,3,5,7,1,0,6]


[5,8,3,9,4,1,7,0,2,6]
| Recursive step
[2,3,0,4,1] [5] [7,9,8,6]
| | Recursive step
[1,0][2][4,3] [6][7][8,9]
| | | | Recursive step
[0,1] [3,4] [6] [8,9]
| | | |
[0,1,2,3,4] [6,7,8,9]
| |
[0,1,2,3,4,5,6,7,8,9]


Finish [0,1,2,3,4,5,6,7,8,9]

Yes I have chosen nice pivot values which create balanced divisions between the left and right, but that was really just so I could illustrate the workings of the function. My code uses 2 procedures, the inner recursive sort and the outer wrapper function.

Public Sub QuickSort(ByRef lngArray() As Long)
Dim iLBound As Long
Dim iUBound As Long
Dim iTemp As Long
Dim iOuter As Long
Dim iMax As Long

iLBound = LBound(lngArray)
iUBound = UBound(lngArray)

'Dont want to sort array with only 1 value
If (iUBound - iLBound) Then

'Move the largest value to the rightmost position, otherwise
'we need to check that iLeftCur does not exceed the bounds of the
'array on EVERY pass (time consuming)

For iOuter = iLBound To iUBound
If lngArray(iOuter) > lngArray(iMax) Then iMax = iOuter
Next iOuter

iTemp = lngArray(iMax)
lngArray(iMax) = lngArray(iUBound)
lngArray(iUBound) = iTemp

'Start quicksorting
InnerQuickSort lngArray, iLBound, iUBound
End If
End Sub

Private Sub InnerQuickSort(ByRef lngArray() As Long, ByVal iLeftEnd As Long, ByVal iRightEnd As Long)
Dim iLeftCur As Long
Dim iRightCur As Long
Dim iPivot As Long
Dim iTemp As Long

If iLeftEnd >= iRightEnd Then Exit Sub

iLeftCur = iLeftEnd
iRightCur = iRightEnd + 1
iPivot = lngArray(iLeftEnd)

'Arrange values so that < pivot are on the left and > pivot are on the right
Do
'Find >= value on left side
Do
iLeftCur = iLeftCur + 1
Loop While lngArray(iLeftCur) < iPivot

'Find <= value on right side
Do
iRightCur = iRightCur - 1
Loop While lngArray(iRightCur) > iPivot

'No more swapping to do
If iLeftCur >= iRightCur Then Exit Do

'Swap
iTemp = lngArray(iLeftCur)
lngArray(iLeftCur) = lngArray(iRightCur)
lngArray(iRightCur) = iTemp
Loop

'Call quicksort recursively on left and right subarrays
lngArray(iLeftEnd) = lngArray(iRightCur)
lngArray(iRightCur) = iPivot

InnerQuickSort lngArray, iLeftEnd, iRightCur - 1
InnerQuickSort lngArray, iRightCur + 1, iRightEnd
End Sub

Squirm
05-15-2003, 05:59 PM
Merge Sort
At a first glance the merge sort appears to be identical to the quick sort. However, rather than separating the values into distinct left and right groups, we simply sort them straight away with no pivot values. The value comparisons are done when we come to merge the two sorts together, but since the groups are ordered, we need only compare the smallest of one group with the smallest of the other.

Example:
Array of data [2,8,4,9,3,5,7,1,0,6]

[2,8,4,9,3,5,7,1,0,6]
| | Recursive step
[2,8,4,9,3] [5,7,1,0,6]
| | | | Recursive step
[2,8] [4,9,3] [5,7] [1,0,6]
| | | | | | Recursive step
[2,8] [4][3,9] [5,7] [1][0,6]
| | | | | | Merge step
[2,8] [3,4,9] [5,7] [0,1,6]
| | | | Merge step
[2,3,4,8,9] [0,1,5,6,7]
| | Merge step
[0,1,2,3,4,5,6,7,8,9]

Finish [0,1,2,3,4,5,6,7,8,9]

The merge sort is different in other ways. For one, it requires a second array to hold the data, as we merge from one into the other. Another difference is that it can easily be implemented non-recursively, as I have done in the code below. The MergeSort procedure calls the InnerMergePass sub with larger and larger segment sizes.

Public Sub MergeSort(ByRef lngArray() As Long)
Dim arrTemp() As Long
Dim iSegSize As Long
Dim iLBound As Long
Dim iUBound As Long

iLBound = LBound(lngArray)
iUBound = UBound(lngArray)

ReDim arrTemp(iLBound To iUBound)

iSegSize = 1
Do While iSegSize < iUBound - iLBound

'Merge from A to B
InnerMergePass lngArray, arrTemp, iLBound, iUBound, iSegSize
iSegSize = iSegSize + iSegSize

'Merge from B to A
InnerMergePass arrTemp, lngArray, iLBound, iUBound, iSegSize
iSegSize = iSegSize + iSegSize

Loop
End Sub

Private Sub InnerMergePass(ByRef lngSrc() As Long, ByRef lngDest() As Long, ByVal iLBound As Long, iUBound As Long, ByVal iSegSize As Long)
Dim iSegNext As Long

iSegNext = iLBound

Do While iSegNext <= iUBound - (2 * iSegSize)
'Merge 2 segments from src to dest
InnerMerge lngSrc, lngDest, iSegNext, iSegNext + iSegSize - 1, iSegNext + iSegSize + iSegSize - 1

iSegNext = iSegNext + iSegSize + iSegSize
Loop

'Fewer than 2 full segments remain
If iSegNext + iSegSize <= iUBound Then
'2 segs remain
InnerMerge lngSrc, lngDest, iSegNext, iSegNext + iSegSize - 1, iUBound
Else
'1 seg remains, just copy it
For iSegNext = iSegNext To iUBound
lngDest(iSegNext) = lngSrc(iSegNext)
Next iSegNext
End If

End Sub

Private Sub InnerMerge(ByRef lngSrc() As Long, ByRef lngDest() As Long, ByVal iStartFirst As Long, ByVal iEndFirst As Long, ByVal iEndSecond As Long)
Dim iFirst As Long
Dim iSecond As Long
Dim iResult As Long
Dim iOuter As Long

iFirst = iStartFirst
iSecond = iEndFirst + 1
iResult = iStartFirst

Do While (iFirst <= iEndFirst) And (iSecond <= iEndSecond)

'Select the smaller value and place in the output
'Since the subarrays are already sorted, only one comparison is needed
If lngSrc(iFirst) <= lngSrc(iSecond) Then
lngDest(iResult) = lngSrc(iFirst)
iFirst = iFirst + 1
Else
lngDest(iResult) = lngSrc(iSecond)
iSecond = iSecond + 1
End If

iResult = iResult + 1
Loop

'Take care of any leftover values
If iFirst > iEndFirst Then
'Got some leftover seconds
For iOuter = iSecond To iEndSecond
lngDest(iResult) = lngSrc(iOuter)
iResult = iResult + 1
Next iOuter
Else
'Got some leftover firsts
For iOuter = iFirst To iEndFirst
lngDest(iResult) = lngSrc(iOuter)
iResult = iResult + 1
Next iOuter
End If
End Sub

Squirm
05-15-2003, 06:00 PM
Heap Sort
The heap sort is very different to all the others, and before I explain the sort itself, we need to know what exactly a heap is. A heap is a balanced binary tree where every parent node is greater than its children. The result is that the value at the top of any subtree will be the largest value in the tree.

The sort itself comprises of 2 steps - Firstly, the elements of the array are reordered so that it becomes a heap. This involves checking each child against its parent and moving values up the tree as needed. Once the heap is ready, we can start to read off values from it one at a time, so an extra array is required. Once each value has been removed, we restore the heap properties by moving the last element in the array to the front and then shifting it down as required.

Example:
Array of data [2,8,4,9,3,5,7,1,0,6]

As it stands, the data, when viewed as a tree, looks like this:
2
/ \
8 4
/ \ / \
9 3 5 7
/ \ /
1 0 6

First, we restore the heap properties by shuffling children up and parents down:

9
/ \
8 7
/ \ / \
6 3 5 4
/ \ /
1 0 2

Now we read off the top value and remove it, placing the last value at the top:

2
/ \
8 7
/ \ / \
6 3 5 4
/ \
1 0

Restore the heap properties:

8
/ \
6 7
/ \ / \
2 3 5 4
/ \
1 0

Continue until the heap is empty.

Finish [0,1,2,3,4,5,6,7,8,9] (place the items in the array in reverse order)

And here is the code:

Public Sub HeapSort(ByRef lngArray() As Long)
Dim iLBound As Long
Dim iUBound As Long
Dim iArrSize As Long
Dim iRoot As Long
Dim iChild As Long
Dim iElement As Long
Dim iCurrent As Long
Dim arrOut() As Long

iLBound = LBound(lngArray)
iUBound = UBound(lngArray)
iArrSize = iUBound - iLBound

ReDim arrOut(iLBound To iUBound)

'Initialise the heap
'Move up the heap from the bottom
For iRoot = iArrSize \ 2 To 0 Step -1

iElement = lngArray(iRoot + iLBound)
iChild = iRoot + iRoot

'Move down the heap from the current position
Do While iChild < iArrSize

If iChild < iArrSize Then
If lngArray(iChild + iLBound) < lngArray(iChild + iLBound + 1) Then
'Always want largest child
iChild = iChild + 1
End If
End If

'Found a slot, stop looking
If iElement >= lngArray(iChild + iLBound) Then Exit Do

lngArray((iChild \ 2) + iLBound) = lngArray(iChild + iLBound)
iChild = iChild + iChild
Loop

'Move the node
lngArray((iChild \ 2) + iLBound) = iElement
Next iRoot

'Read of values one by one (store in array starting at the end)
For iRoot = iUBound To iLBound Step -1

'Read the value
arrOut(iRoot) = lngArray(iLBound)
'Get the last element
iElement = lngArray(iArrSize + iLBound)

iArrSize = iArrSize - 1
iCurrent = 0
iChild = 1

'Find a place for the last element to go
Do While iChild <= iArrSize

If iChild < iArrSize Then
If lngArray(iChild + iLBound) < lngArray(iChild + iLBound + 1) Then
'Always want the larger child
iChild = iChild + 1
End If
End If

'Found a position
If iElement >= lngArray(iChild + iLBound) Then Exit Do

lngArray(iCurrent + iLBound) = lngArray(iChild + iLBound)
iCurrent = iChild
iChild = iChild + iChild

Loop

'Move the node
lngArray(iCurrent + iLBound) = iElement
Next iRoot

'Copy from temp array to real array
For iRoot = iLBound To iUBound
lngArray(iRoot) = arrOut(iRoot)
Next iRoot
End Sub

Squirm
05-15-2003, 06:01 PM
Still with me? The end is in sight. :)

The question remains : Which sort is the fastest?
Well after benchmarking, here are the results (bold = best, italic = worst):


100 unsorted numbers - 10000 iterations
Bubble sort total : 1373.5787648989 avg : 0.13735787648989
Selection sort total : 809.819125056374 avg : 0.0809819125056374
Insertion sort total : 487.648290495019 avg : 0.0487648290495019
Quick sort total : 172.716364789376 avg : 0.0172716364789376
Merge sort total : 324.510238033039 avg : 0.0324510238033039
Heap sort total : 361.514941144754 avg : 0.0361514941144754

100 sorted numbers - 10000 iterations
Bubble sort total : 693.941827802071 avg : 0.0693941827802071
Selection sort total : 723.852612552755 avg : 0.0723852612552755
Insertion sort total : 32.7346073313766 avg : 0.00327346073313766
Quick sort total : 416.318833818243 avg : 0.0416318833818243
Merge sort total : 247.43479967424 avg : 0.024743479967424
Heap sort total : 373.118929919866 avg : 0.0373118929919866

1000 unsorted numbers - 1000 iterations
Bubble sort total : 13516.7103640267 avg : 13.5167103640267
Selection sort total : 7099.89593649473 avg : 7.09989593649473
Insertion sort total : 4447.33339013758 avg : 4.44733339013758
Quick sort total : 220.904332813249 avg : 0.220904332813249
Merge sort total : 384.005229714951 avg : 0.384005229714951
Heap sort total : 477.081305026198 avg : 0.477081305026198

1000 sorted numbers - 1000 iterations
Bubble sort total : 6719.69482154855 avg : 6.71969482154855
Selection sort total : 6814.77757647969 avg : 6.81477757647969
Insertion sort total : 24.1133998874158 avg : 0.0241133998874158
Quick sort total : 3320.86284709369 avg : 3.32086284709369
Merge sort total : 283.286842322139 avg : 0.283286842322139
Heap sort total : 483.537153465036 avg : 0.483537153465036

10000 unsorted numbers - 10 iterations
Bubble sort total : 13713.7742366697 avg : 1371.37742366697
Selection sort total : 7200.55705403899 avg : 720.055705403899
Insertion sort total : 4632.18006757842 avg : 463.218006757842
Quick sort total : 28.4362956744502 avg : 2.84362956744502
Merge sort total : 53.1626226238251 avg : 5.31626226238251
Heap sort total : 62.5252650825733 avg : 6.25252650825733

10000 sorted numbers - 10 iterations
Bubble sort total : 6728.83704493169 avg : 672.883704493169
Selection sort total : 7171.04632013287 avg : 717.104632013287
Insertion sort total : 2.34918125068968 avg : 0.234918125068968
Quick sort total : 3166.00545600069 avg : 316.600545600069
Merge sort total : 40.2176812974833 avg : 4.2176812974833
Heap sort total : 63.0778492797269 avg : 6.30778492797269


So it seems for elements which are already quite well sorted, the Insertion Sort takes the prize, whereas the Quick Sort is best at sorting unsorted data (but chokes when its already well ordered). The Merge Sort and Heap Sort are good all-rounders and are better with large amounts of data. Avoid the Selection Sort and Bubble Sort where possible.

Well, there you have it. Find the project files attached.

Thanks to OnErr0r for pointing out my obvious blunder

Squirm
05-17-2003, 10:58 AM
Not more sorting you cry! :eek:

BubbleExit Sort
This isnt really another sort, it is a simple alteration to the basic bubble sort. It allows us to exit the sort prematurely if the array becomes sorted. This will mean better performance on almost-sorted lists, but will usually hinder performance on totally unordered lists, due to the extra checks involved.

Public Sub BubbleExitSort(ByRef lngArray() As Long)
Dim iOuter As Long
Dim iInner As Long
Dim iLBound As Long
Dim iUBound As Long
Dim iTemp As Long
Dim iFinished As Long

iLBound = LBound(lngArray)
iUBound = UBound(lngArray)

'Which bubbling pass
For iOuter = iLBound To iUBound - 1

iFinished = 1

'Which comparison
For iInner = iLBound To iUBound - iOuter - 1

'Compare this item to the next item
If lngArray(iInner) > lngArray(iInner + 1) Then
'Swap
iTemp = lngArray(iInner)
lngArray(iInner) = lngArray(iInner + 1)
lngArray(iInner + 1) = iTemp

'Not finished
iFinished = 0
End If

Next iInner

If iFinished Then Exit For
Next iOuter
End Sub

Squirm
05-17-2003, 10:59 AM
Comb Sort
The main problem with the bubblesort is something called the rabbit-turtle effect. Large values at the bottom bubble up very quickly (rabbits) but small values at the top take many many passes to sink (turtles). The comb sort overcomes this by comparing non-adjacent values. This means turtles have the chance to jump down the list much quicker. We repeat the process with smaller and smaller spacings until we finish with a simple bubblesort.

Example:
Array of data [6,7,9,3,2,5]

[6,7,9,3,2,5]
^ Potential turtle

Combing pass with a spacing of 4
[6,7,9,3,2,5]
[2,7,9,3,6,5]

Combing pass with a spacing of 3
[2,5,9,3,6,7]
[2,5,9,3,6,7]
[2,5,9,3,6,7]

Combing pass with a spacing of 2
[2,5,7,3,6,9]
[2,5,7,3,6,9]
[2,3,7,5,6,9]
[2,3,6,5,7,9]

Combing pass with a spacing of 1
[2,3,6,5,7,9]
[2,3,6,5,7,9]
[2,3,6,5,7,9]
[2,3,5,6,7,9]
[2,3,5,6,7,9]

Finish [2,3,5,6,7,9]

If you trawl the web for a bit you'll find this sort, like every other, has been well studied. It has been found that if the spacing ever reaches 11, the remaining few passes are much more efficient than if 9 or 10 are used. The result is often called the Comb11 Sort:

Public Sub CombSort(ByRef lngArray() As Long)
Dim iSpacing As Long
Dim iOuter As Long
Dim iInner As Long
Dim iTemp As Long
Dim iLBound As Long
Dim iUBound As Long
Dim iArrSize As Long
Dim iFinished As Long

iLBound = LBound(lngArray)
iUBound = UBound(lngArray)

'Initialise comb width
iSpacing = iUBound - iLBound

Do
If iSpacing > 1 Then
iSpacing = Int(iSpacing / 1.3)

If iSpacing = 0 Then
iSpacing = 1 'Dont go lower than 1
ElseIf iSpacing > 8 And iSpacing < 11 Then
iSpacing = 11 'This is a special number, goes faster than 9 and 10
End If
End If

'Always go down to 1 before attempting to exit
If iSpacing = 1 Then iFinished = 1

'Combing pass
For iOuter = iLBound To iUBound - iSpacing
iInner = iOuter + iSpacing

If lngArray(iOuter) > lngArray(iInner) Then
'Swap
iTemp = lngArray(iOuter)
lngArray(iOuter) = lngArray(iInner)
lngArray(iInner) = iTemp

'Not finished
iFinished = 0
End If
Next iOuter

Loop Until iFinished
End Sub

Squirm
05-17-2003, 11:00 AM
Shell Sort
Suppose we apply this variable-spacing idea to other algorithms. For example, the Insertion Sort. Since this sort is like a special case of the bubble sort (each value is selected then bubbled into place), we can apply the same procedure. The result is the Shell Sort.

Example:
Array of data [6,7,9,3,2,5]

Shell with spacing of 4
[6,7,9,3,2,5]
^ Where does this fit in [6]?

Shell with spacing of 3
[2,7,9,3,6,5]
^ Where does this fit in [2]?

Shell with spacing of 2
[2,7,9,3,6,5]
^ Where does this fit in [2]?
[2,7,9,3,6,5]
^ Where does this fit in [2,9]?

Shell with spacing of 1
[2,7,6,3,9,5]
^ Where does this fit in [2]?
[2,7,6,3,9,5]
^ Where does this fit in [2,7]?
[2,6,7,3,9,5]
^ Where does this fit in [2,6,7]?
[2,3,6,7,9,5]
^ Where does this fit in [2,3,6,7]?
[2,3,6,7,9,5]
^ Where does this fit in [2,3,6,7,9]?

Finish [2,3,5,6,7,9]

Of course with the actual sort we dont use so many spacings, and the amount the array changes in each pass is more dramatic. Here is the code:

Public Sub ShellSort(ByRef lngArray() As Long)
Dim iSpacing As Long
Dim iOuter As Long
Dim iInner As Long
Dim iTemp As Long
Dim iLBound As Long
Dim iUBound As Long
Dim iArrSize As Long

iLBound = LBound(lngArray)
iUBound = UBound(lngArray)

'Calculate initial sort spacing
iArrSize = (iUBound - iLBound) + 1
iSpacing = 1

If iArrSize > 13 Then
Do While iSpacing < iArrSize
iSpacing = (3 * iSpacing) + 1
Loop

iSpacing = iSpacing \ 9
End If

'Start sorting
Do While iSpacing

For iOuter = iLBound + iSpacing To iUBound

'Get the value to be inserted
iTemp = lngArray(iOuter)

'Move along the already sorted values shifting along
For iInner = iOuter - iSpacing To iLBound Step -iSpacing
'No more shifting needed, we found the right spot!
If lngArray(iInner) <= iTemp Then Exit For

lngArray(iInner + iSpacing) = lngArray(iInner)
Next iInner

'Insert value in the slot
lngArray(iInner + iSpacing) = iTemp
Next iOuter

'Reduce the sort spacing
iSpacing = iSpacing \ 3
Loop

End Sub

Squirm
05-17-2003, 11:01 AM
Shaker Sort
The Comb Sort improved upon the Bubble Sort and the Shell Sort improved upon the Insertion Sort. What can we do to the Selection Sort? The Shaker Sort is a very very simple variation. As we move through the array selecting the largest value, we also select the smallest value, and then move that into position at the start. Simple, eh?

Example:
Array of data [6,7,9,3,2,5]

[6,7,9,3,2,5]
Which element goes here? ^ ^ Which element goes here?

[2,7,5,3,6,9]
Which element goes here? ^ ^ Which element goes here?

[2,3,5,6,7,9]
Which element goes here? ^ ^ Which element goes here?

Finish [2,3,5,6,7,9]

Pretty simple, yes? Here's the code:

Public Sub ShakerSort(ByRef lngArray() As Long)
Dim iLower As Long
Dim iUpper As Long
Dim iInner As Long
Dim iLBound As Long
Dim iUBound As Long
Dim iTemp As Long
Dim iMax As Long
Dim iMin As Long

iLBound = LBound(lngArray)
iUBound = UBound(lngArray)

iLower = iLBound - 1
iUpper = iUBound + 1

Do While iLower < iUpper

iLower = iLower + 1
iUpper = iUpper - 1

iMax = iLower
iMin = iLower

'Find the largest and smallest values in the subarray
For iInner = iLower To iUpper
If lngArray(iInner) > lngArray(iMax) Then
iMax = iInner
ElseIf lngArray(iInner) < lngArray(iMin) Then
iMin = iInner
End If
Next iInner

'Swap the largest with last slot of the subarray
iTemp = lngArray(iMax)
lngArray(iMax) = lngArray(iUpper)
lngArray(iUpper) = iTemp

'Swap the smallest with the first slot of the subarray
iTemp = lngArray(iMin)
lngArray(iMin) = lngArray(iLower)
lngArray(iLower) = iTemp

Loop
End Sub

Squirm
05-17-2003, 11:02 AM
Radix (or Count or Histogram) Sort
This a completely new sort, pointed out to me by OnErr0r. It is quite different to the other sorts, but its a very very cool algorithm. What we do is allocate a series of 'buckets' into which we put values. We sort them into buckets by checking the value of a specific digit in the number. We start at the rightmost digit and repeat the sort through to the leftmost digit. By the end, the array is fully sorted.

Example:
Array of data [5,530,64,203,99,331,4,692,362,34]

Radix with right-most digit
Bucket [ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ][ 6 ][ 7 ][ 8 ][ 9 ]
530 331 692 203 64 5 99
362 4
34

Array now [530,331,692,362,203,64,4,34,5,99]

Radix with middle digit
Bucket [ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ][ 6 ][ 7 ][ 8 ][ 9 ]
203 530 362 692
4 331 64 99
5 34

Array now [203,4,5,530,331,34,362,64,692,99]

Radix with left-most digit
Bucket [ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ][ 6 ][ 7 ][ 8 ][ 9 ]
4 203 331 530 692
5 362
34
64
99

Array now [4,5,34,64,99,203,331,362,530,692]

Finish [4,5,34,64,99,203,331,362,530,692]

That was a demonstration. The actual code will not work with decimal digits but with groups of 2 hexadecimal digits (8 bits). This requires 256 buckets labelled 0 to FF (255). We do as many sorts as needed, which will be no more than 4 for Long values. The actual radix sort comprises 3 steps - first, count the elements going into each bucket, then work out where the buckets should be in the array, then finally move the values into their buckets.

Public Sub RadixSort(ByRef lngArray() As Long)
Dim arrTemp() As Long
Dim iLBound As Long
Dim iUBound As Long
Dim iMax As Long
Dim iSorts As Long
Dim iLoop As Long

iLBound = LBound(lngArray)
iUBound = UBound(lngArray)

'Create swap array
ReDim arrTemp(iLBound To iUBound)

iMax = &H80000000
'Find largest
For iLoop = iLBound To iUBound
If lngArray(iLoop) > iMax Then iMax = lngArray(iLoop)
Next iLoop

'Calculate how many sorts are needed
Do While iMax
iSorts = iSorts + 1
iMax = iMax \ 256
Loop

iMax = 1

'Do the sorts
For iLoop = 1 To iSorts

If iLoop And 1 Then
'Odd sort -> src to dest
InnerRadixSort lngArray, arrTemp, iLBound, iUBound, iMax
Else
'Even sort -> dest to src
InnerRadixSort arrTemp, lngArray, iLBound, iUBound, iMax
End If

'Next sort factor
iMax = iMax * 256
Next iLoop

'If odd number of sorts we need to swap the arrays
If (iSorts And 1) Then
For iLoop = iLBound To iUBound
lngArray(iLoop) = arrTemp(iLoop)
Next iLoop
End If
End Sub

Private Sub InnerRadixSort(ByRef lngSrc() As Long, ByRef lngDest() As Long, ByVal iLBound As Long, ByVal iUBound As Long, ByVal iDivisor As Long)
Dim arrCounts(255) As Long
Dim arrOffsets(255) As Long
Dim iBucket As Long
Dim iLoop As Long

'Count the items for each bucket
For iLoop = iLBound To iUBound
iBucket = (lngSrc(iLoop) \ iDivisor) And 255
arrCounts(iBucket) = arrCounts(iBucket) + 1
Next iLoop

'Generate offsets
For iLoop = 1 To 255
arrOffsets(iLoop) = arrOffsets(iLoop - 1) + arrCounts(iLoop - 1) + iLBound
Next iLoop

'Fill the buckets
For iLoop = iLBound To iUBound
iBucket = (lngSrc(iLoop) \ iDivisor) And 255
lngDest(arrOffsets(iBucket)) = lngSrc(iLoop)
arrOffsets(iBucket) = arrOffsets(iBucket) + 1
Next iLoop
End Sub

Squirm
05-17-2003, 11:13 AM
And the benchmarking. A few changes from the first benchmark:

Exactly the same start array for each sort, so they all get the same data - more accurate comparions
4 of the Advanced Compiler Optimisations checked
Scrapped the ordered list sorting - its not a good method of benchmarking sorting algorithms


And the results:
Sort Total Time Best Time Worst Time Average Time Standard Deviation

10 numbers - 100000 iterations
Bubble sort 152.96469244 0.00111746 3.78316239 0.00152965 0.01336597
BubbleExit sort 152.61800033 0.00083810 4.94643872 0.00152618 0.01678431
Comb sort 176.76213038 0.00139683 2.33046379 0.00176762 0.00807599
Selection sort 161.91359516 0.00111746 1.84353039 0.00161914 0.01007209
Shaker sort 152.79120670 0.00111746 1.57533988 0.00152791 0.00668775
Insertion sort 132.75374384 0.00083810 3.45714330 0.00132754 0.01305081
Shell sort 201.02862235 0.00111746 50.44300323 0.00201029 0.16445773
Quick sort 161.20205222 0.00111746 0.99118743 0.00161202 0.00699075
Merge sort 282.97562958 0.00251429 1.26887635 0.00282976 0.00854858
Heap sort 291.11185919 0.00223492 4.95901015 0.00291112 0.02393678
Radix sort 589.53190978 0.00530794 2.23380346 0.00589532 0.01503699

100 numbers - 10000 iterations
Bubble sort 487.72371908 0.04190477 1.87817167 0.04877237 0.03345189
BubbleExit sort 495.88201852 0.04078731 11.25450302 0.04958820 0.11487248
Comb sort 148.41942202 0.01201270 9.07573449 0.01484194 0.09514210
Selection sort 399.61671106 0.02626032 108.24364549 0.03996167 1.08312136
Shaker sort 238.25849375 0.02234921 2.44416539 0.02382585 0.02610057
Insertion sort 294.96598031 0.01257143 124.98096825 0.02949660 1.24997237
Shell sort 240.92140202 0.01145397 109.40217262 0.02409214 1.09396767
Quick sort 111.51780464 0.00977778 2.21787965 0.01115178 0.02350238
Merge sort 315.22721463 0.01480635 148.60603792 0.03152272 1.48662178
Heap sort 185.44703307 0.01732064 0.71210168 0.01854470 0.01226964
Radix sort 138.20276041 0.01257143 2.17038758 0.01382028 0.02847120

1000 numbers - 1000 iterations
Bubble sort 5302.19259710 4.61371487 109.40915675 5.30219260 6.61005244
BubbleExit sort 5015.94699885 4.64360694 107.50137238 5.01594700 3.39177600
Comb sort 227.38839713 0.19164447 19.90504380 0.22738840 0.62692812
Selection sort 2787.95461434 2.16591774 162.10663646 2.78795461 7.76453592
Shaker sort 1952.84791782 1.80050817 104.74292124 1.95284792 3.25793136
Insertion sort 1480.41524831 1.26887635 106.66858497 1.48041525 3.32987222
Shell sort 212.30547458 0.19890796 0.78166359 0.21230547 0.03312238
Quick sort 160.96738552 0.12515557 23.60076490 0.16096739 0.74193239
Merge sort 191.77856404 0.17655875 2.01813359 0.19177856 0.08281318
Heap sort 236.91055707 0.22963812 0.80010169 0.23691056 0.03317844
Radix sort 265.12755113 0.15420954 105.92351821 0.26512755 3.34292368

10000 numbers - 100 iterations
Bubble sort 50333.40773758 469.18365323 629.35484817 503.33407738 32.63603076
BubbleExit sort 50254.40635612 474.52148248 648.95706019 502.54406356 33.35413768
Comb sort 295.44118037 2.84142258 4.78636251 2.95441180 0.28131880
Selection sort 22926.43282875 211.54560147 427.96835911 229.26432829 33.55725020
Shaker sort 18732.40090570 175.40916513 279.57519741 187.32400906 17.63665114
Insertion sort 14423.08980611 132.25340092 283.51927410 144.23089806 20.54782472
Shell sort 346.19232333 3.20292104 13.82214779 3.46192323 1.05030563
Quick sort 177.20408599 1.67842561 3.38590519 1.77204086 0.18563264
Merge sort 257.45478825 2.48858444 3.36439408 2.57454788 0.10004760
Heap sort 316.26505603 3.05262261 5.96416584 3.16265056 0.31633640
Radix sort 168.49264362 1.54796210 4.22986720 1.68492644 0.38273633

100000 numbers - 10 iterations
Bubble sort 584843.81283096 56844.45062152 60948.44400615 58484.38128310 1232.66414597
BubbleExit sort 593540.00578286 57006.79890880 61420.89232011 59354.00057829 1507.49203488
Comb sort 568.36553249 52.60026065 82.00260089 56.83655325 8.53268726
Selection sort 300871.93316469 29088.35508424 32581.69599768 30087.19331647 917.47691053
Shaker sort 211152.33332728 20092.12679265 22482.32722315 21115.23333273 790.97190654
Insertion sort 165414.38115738 15595.91177091 18918.27480867 16541.43811574 866.39375119
Shell sort 530.32857528 50.22426035 56.98126438 53.03285753 2.01733692
Quick sort 230.06974350 22.02849804 24.76013013 23.00697435 0.96052316
Merge sort 375.08705715 35.73275374 41.81173864 37.50870572 1.91683785
Heap sort 507.13847710 48.37095217 56.82118817 50.71384771 2.44218241
Radix sort 275.80432709 26.36396525 29.33333706 27.58043271 1.14522550

Squirm
05-17-2003, 11:58 AM
Project files attached.

Squirm
06-06-2003, 08:27 PM
To finalise this sorting demonstration I have created the attached application which allows you to view the sorts in action - up to 4 simultaneously. The code in the sorting class is very different from that posted above, to allow this many-sorts-at-once effect to be produced. The code is not commented, nor is it particularly fast, and the relative speeds do not reflect those shown above. However, its purpose is merely as a demonstration, to provide a clearer picture of how these sorts work. Some are more easily visualised than others, the radix and heap sorts dont 'look' very good.

Thats about it, have fun. :)

EZ Archive Ads Plugin for vBulletin Copyright 2006 Computer Help Forum