iNET Interactive - Online Advertising Agency
          
Go Back  Xtreme Visual Basic Talk > Legacy Visual Basic (VB 4/5/6) > General > Sorting Algorithm


Reply
 
Thread Tools Display Modes
  #1  
Old 01-24-2003, 07:58 PM
Optikal's Avatar
Optikal Optikal is offline
Codeaholic

Preferred language:
Retired Leader
* Guru *
 
Join Date: Oct 2002
Location: Winnipeg, MB, Canada
Posts: 4,543
Default Sorting Algorithm

I got bored at work, so I decided to implement a sorting algorithm in VB6 and see how fast I could get it running. I actually implemented a whole bunch of common sorting algorithms, and tried to optimize them all. My optimized version of QuickSort turned out to be the fastest by a long shot. I pasted the code below, I was hoping somebody could try and optimize it a bit more, or come up with a sort algorithm that is faster. I am attempting to sort a large array of random integers.

Code:
Public Sub QuickSort(ByRef MyArray() As Long, ByVal lngLBound As Long, ByVal lngUBound As Long) Dim lngPivot As Long Dim k As Long Dim p As Long Dim lngTemp As Long If lngLBound >= lngUBound Then Exit Sub End If k = lngLBound + 1 If lngUBound = k Then If MyArray(lngLBound) > MyArray(lngUBound) Then 'swap MyArray(lngLBound) and MyArray(lngUBound) lngTemp = MyArray(lngLBound) MyArray(lngLBound) = MyArray(lngUBound) MyArray(lngUBound) = lngTemp End If Exit Sub End If lngPivot = MyArray(lngLBound) p = lngUBound Do Until (MyArray(k) > lngPivot) Or (k >= lngUBound) k = k + 1 Loop Do Until MyArray(p) <= lngPivot p = p - 1 Loop Do While k < p 'swap MyArray(k) and MyArray(p) lngTemp = MyArray(k) MyArray(k) = MyArray(p) MyArray(p) = lngTemp Do k = k + 1 Loop Until MyArray(k) > lngPivot Do p = p - 1 Loop Until MyArray(p) <= lngPivot Loop 'swap MyArray(p) and MyArray(lngLBound) lngTemp = MyArray(p) MyArray(p) = MyArray(lngLBound) MyArray(lngLBound) = lngTemp QuickSort MyArray, lngLBound, p - 1 QuickSort MyArray, p + 1, lngUBound End Sub


The times I got for an array of 100,000 random integers (on a PII 233 computer) are:
Selection Sort: 7min 18sec
Insertion Sort: 7min 19sec
Bubble Sort: 15min 55sec
Merge Sort: 3min 28sec
Quick Sort: 0.2sec
Reply With Quote
  #2  
Old 01-24-2003, 08:38 PM
diver diver is offline
Senior Contributor

Preferred language:
* Expert *
 
Join Date: Jun 2001
Location: Illinois
Posts: 865
Default

wow, this blows away the bubble sort.
I think the bubble sort was always somewhat popular because it was so simple and easy to understand.

diver
__________________
Sometimes I do not get notification of your post from the forum, which causes a delay. I am apologizing in advance!
Reply With Quote
  #3  
Old 01-24-2003, 08:55 PM
noRulez's Avatar
noRulez noRulez is offline
Contributor
 
Join Date: Aug 2002
Location: Kansas
Posts: 502
Default

pretty much everything blows away the bubble sort. If you want to get fired from your job quickly, then implement a bubble sort to sort 500,000 items

I have my own quicksort class that i wrote in c++, maybe I'll benchmark yours beside mine.

Edit: Can you give me the driver you used for this so I don't have to write one...I'm very lazy
__________________
"As far as the laws of mathematics refer to reality, they are not certain; as far as they are certain, they do not refer to reality."--Albert Einstein

Last edited by noRulez; 01-24-2003 at 09:04 PM.
Reply With Quote
  #4  
Old 01-24-2003, 09:03 PM
Deadalus Deadalus is offline
Promising Talent

Preferred language:
Retired Moderator
* Guru *
 
Join Date: May 2002
Location: Brussels
Posts: 3,598
Default

The function really is fast for an array of random integers, but it seems to get a whole lot slower when the array is already sorted or almost sorted. Is that normal?
__________________
I do not endorse any advertisements that appear in my contribution and detest their placement against my will.
Reply With Quote
  #5  
Old 01-24-2003, 09:21 PM
Machaira's Avatar
Machaira Machaira is offline
Jedi Coder

Preferred language:
* Expert *
 
Join Date: Aug 2002
Location: Abingdon, MD
Posts: 3,438
Default

Here's some code I had from an old book called Ready-to-Run VB Algorithms.

The CountingSort came out fastest at .16 secs for 100K items on my PIII 600.
Attached Files
File Type: zip sorting.zip (130.6 KB, 139 views)
Reply With Quote
  #6  
Old 01-24-2003, 10:19 PM
BillSoo's Avatar
BillSoo BillSoo is offline
Code Meister

Preferred language:
Retired Moderator
* Guru *
 
Join Date: Aug 2000
Location: Vancouver, BC, Canada
Posts: 10,441
Default

The basic algorithm behind quicksort is to pick a pivot value, then split the array in two parts: those values ABOVE the pivot value and those values BELOW the pivot value.

Ideally, the array should be split into two sub arrays of the same size for efficiency. If the numbers are random, then odds are fairly good that any value will be near the median. If the numbers are sorted, or nearly sorted, then picking a RANDOM number in the array as a pivot value should work well too.

But it seems to me from a quick glance at the code, that you use a value at the end of the array as a pivot value. If the array is already sorted, this will be the worst case performance for Quicksort; ie every split will be one number on one side and all the rest on the other. It would then have performance about equal to insertion sort (which it would resemble).
__________________
"I have a plan so cunning you could put a tail on it and call it a weasel!" - Edmund Blackadder
Reply With Quote
  #7  
Old 01-25-2003, 07:01 AM
Optikal's Avatar
Optikal Optikal is offline
Codeaholic

Preferred language:
Retired Leader
* Guru *
 
Join Date: Oct 2002
Location: Winnipeg, MB, Canada
Posts: 4,543
Default

I was trying to squeeze some more speed out of my function, and I thought that by making the 4 local variables (lngPivot, k, p, lngTemp) Static it should speed up performance. Because then each time it recurses there should be 4 less stack allocations. However, it ended up being more than twice as slow. Does anybody know why it would be slower? I also tried making them public in a code module (.bas) and it had the same effect.
__________________
There are 10 types of people in this world, those that understand binary, and those that don't.
Reply With Quote
  #8  
Old 01-25-2003, 07:40 AM
Flyguy's Avatar
Flyguy Flyguy is offline
Lost Soul

Preferred language:
Super Moderator
* Guru *
 
Join Date: May 2001
Location: Vorlon
Posts: 17,571
Default

The sort algo you haven't tested is the ShellSort algo.
This is algorithm is also fast when the data is already (almost) sorted, which gives performance problems with Quicksort.
Code:
Public Sub ShellSortNumbers(vArray() As Long) Dim lLoop1 As Long Dim lHold As Long Dim lHValue As Long Dim lTemp As Long Dim lUBound As Long, lLBound As Long lUBound = UBound(vArray) lLBound = LBound(vArray) lHValue = lLBound Do lHValue = 3 * lHValue + 1 Loop Until lHValue > lUBound Do lHValue = lHValue / 3 For lLoop1 = lHValue + lLBound To lUBound lTemp = vArray(lLoop1) lHold = lLoop1 Do While vArray(lHold - lHValue) > lTemp vArray(lHold) = vArray(lHold - lHValue) lHold = lHold - lHValue If lHold < lHValue Then Exit Do Loop vArray(lHold) = lTemp Next lLoop1 Loop Until lHValue = LBound(vArray) End Sub
__________________
Ego quoque nescio quid hic scriptum est
Reply With Quote
  #9  
Old 01-25-2003, 09:00 AM
C64's Avatar
C64 C64 is offline
Regular
 
Join Date: Jan 2003
Location: Persian Gulf
Posts: 50
Default

I guess it would be more optimized for space (not sure if it will speed up the sort ) if you tried to implement it in non-recursive (iterative) fashion

also, if you are sorting number, here is a way you SWAB two numbers without the need of a third variable

a = a + b
b = a - b
a = a - b
Reply With Quote
  #10  
Old 01-25-2003, 10:43 AM
OnErr0r's Avatar
OnErr0r OnErr0r is offline
Obsessive OPtimizer

Preferred language:
Administrator
* Guru *
 
Join Date: Jun 2002
Location: Debug Window
Posts: 13,226
Default

Another interesting sorting algo is the RadixSort. It takes a little more memory than the other routines as a temporary array the same size as the array to be sorted is needed. Its equally as fast QuickSort in C++ (slightly slower in VB due to lack of various operators)

Random Data (100K items)
QuickSort 33 ms
RadixSort 38 ms
ShellSort 93ms

Sorted Data (100K items)
RadixSort 39ms
ShellSort 48ms
QuickSort 68000ms

One nice thing about RadixSort is it performs almost the same with data in a sorted state. QuickSort ran for several minutes and I eventually killed it when testing with 1M items.

Since VB lacks true shifts, its makes the code a bit longer to be fast. A lookup table could probably be used with decent performance to combine radix0-3 into one prodecure.

Code:
Option Explicit Public Declare Sub RtlZeroMemory Lib "KERNEL32" (dest As Any, ByVal numBytes As Long) Const ITEMS As Long = 256 Private count(ITEMS - 1) As Long Private index(ITEMS - 1) As Long Private Sub ZeroArrays() RtlZeroMemory count(0), ITEMS * 4 RtlZeroMemory index(0), ITEMS * 4 End Sub Public Sub RadixSort(ByRef lSrc() As Long, ByRef lDest() As Long, ByVal lNumber As Long) radix0 lNumber, lSrc, lDest radix1 lNumber, lDest, lSrc radix2 lNumber, lSrc, lDest radix3 lNumber, lDest, lSrc End Sub Public Sub radix0(ByVal lNumber As Long, ByRef lSrc() As Long, ByRef lDest() As Long) Dim i As Long ZeroArrays For i = 0 To lNumber - 1 count(lSrc(i) And &HFF) = count(lSrc(i) And &HFF) + 1 Next i index(0) = 0 For i = 1 To 255 index(i) = index(i - 1) + count(i - 1) Next i For i = 0 To lNumber - 1 lDest(index(lSrc(i) And &HFF)) = lSrc(i) index(lSrc(i) And &HFF) = index(lSrc(i) And &HFF) + 1 Next i End Sub Public Sub radix1(ByVal lNumber As Long, ByRef lSrc() As Long, ByRef lDest() As Long) Dim i As Long ZeroArrays For i = 0 To lNumber - 1 count((lSrc(i) \ 256) And &HFF) = count((lSrc(i) \ 256) And &HFF) + 1 Next i index(0) = 0 For i = 1 To 255 index(i) = index(i - 1) + count(i - 1) Next i For i = 0 To lNumber - 1 lDest(index((lSrc(i) \ 256) And &HFF)) = lSrc(i) index((lSrc(i) \ 256) And &HFF) = index((lSrc(i) \ 256) And &HFF) + 1 Next i End Sub Public Sub radix2(ByVal lNumber As Long, ByRef lSrc() As Long, ByRef lDest() As Long) Dim i As Long ZeroArrays For i = 0 To lNumber - 1 count((lSrc(i) \ 65536) And &HFF) = count((lSrc(i) \ 65536) And &HFF) + 1 Next i index(0) = 0 For i = 1 To 255 index(i) = index(i - 1) + count(i - 1) Next i For i = 0 To lNumber - 1 lDest(index((lSrc(i) \ 65536) And &HFF)) = lSrc(i) index((lSrc(i) \ 65536) And &HFF) = index((lSrc(i) \ 65536) And &HFF) + 1 Next i End Sub Public Sub radix3(ByVal lNumber As Long, ByRef lSrc() As Long, ByRef lDest() As Long) Dim i As Long ZeroArrays For i = 0 To lNumber - 1 count((lSrc(i) \ 16777216) And &HFF) = count((lSrc(i) \ 16777216) And &HFF) + 1 Next i index(0) = 0 For i = 1 To 255 index(i) = index(i - 1) + count(i - 1) Next i For i = 0 To lNumber - 1 lDest(index((lSrc(i) \ 16777216) And &HFF)) = lSrc(i) index((lSrc(i) \ 16777216) And &HFF) = index((lSrc(i) \ 16777216) And &HFF) + 1 Next i End Sub

There is good information about RadixSort at:

http://www.cubic.org/~submissive/sourcerer/radix.htm

(I ported his sample code to VB)
__________________
Quis custodiet ipsos custodues.
Reply With Quote
  #11  
Old 01-25-2003, 10:46 AM
OnErr0r's Avatar
OnErr0r OnErr0r is offline
Obsessive OPtimizer

Preferred language:
Administrator
* Guru *
 
Join Date: Jun 2002
Location: Debug Window
Posts: 13,226
Default

Btw, if you'd rather not use memory manipulation to clear the arrays, just Dim the count and index arrays in Radix0-3 instead of a module wide scope.
__________________
Quis custodiet ipsos custodues.
Reply With Quote
  #12  
Old 01-25-2003, 10:51 AM
Thinker Thinker is offline
Iron-Fisted Programmer

Preferred language:
Retired Moderator
* Guru *
 
Join Date: Jul 2001
Location: Fayetteville Arkansas USA
Posts: 18,127
Default

Quote:
Originally posted by Optikal
I was trying to squeeze some more speed out of my function, and I thought that by making the 4 local variables (lngPivot, k, p, lngTemp) Static it should speed up performance. Because then each time it recurses there should be 4 less stack allocations. However, it ended up being more than twice as slow. Does anybody know why it would be slower? I also tried making them public in a code module (.bas) and it had the same effect.
IIRC, static variables are stored on the heap, which is slower
than the local stack. But I didn't think it would be that much
slower.
__________________
Posting Guidelines
Reply With Quote
  #13  
Old 01-25-2003, 11:34 AM
ICeMaN_179 ICeMaN_179 is offline
Restricted
 
Join Date: Oct 2002
Location: UK
Posts: 682
Default

how do you time how quick your Algorithm is ?
Reply With Quote
  #14  
Old 01-25-2003, 01:12 PM
Iceplug's Avatar
Iceplug Iceplug is offline
MetaCenturion

Preferred language:
Retired Moderator
* Guru *
 
Join Date: Aug 2001
Location: Hawaii, USA
Posts: 16,584
Default

Using your favorite timing API function (whatever that might be)...

Code:
Dim T1 As Long T1 = timeGetTime 'call whatever you want Debug.Print timeGetTime - T1 'This will be the time length of whatever you called. 'if you called a function, sub, a function inside of a loop, or just 'decided to put a sleep call in here... this should be okay to some 'accuracy... however, [api]QueryPerformanceCounter[/api] works a bit 'faster.
__________________

Iceplug, USN
Quadrill 1 Quadrill 2 (full) Quadrill 3 JumpCross .NET Website is ALIVE! - DL Platform Tour for VB.NET! Posting Guidelines Hint: Specify your location in your user cp profile if you want compassion!
Reply With Quote
  #15  
Old 01-25-2003, 08:29 PM
Optikal's Avatar
Optikal Optikal is offline
Codeaholic

Preferred language:
Retired Leader
* Guru *
 
Join Date: Oct 2002
Location: Winnipeg, MB, Canada
Posts: 4,543
Default

Daedalus: There is a change you can make to my qsort to make it run fast on already sorted (or almost sorted) lists. My array picks the first item in the array as the pivot. If you change it to pick the middle value in the array as the pivot, that will ensure good performance on already sorted arrays (but will add an extra divide operation to each recursion).

I did some benchmarks with OnErrors RadixSort and found that it runs almost twice as fast as my QSort. I made some code optimization to his code (just use some temp vars to precalculate stuff, and juggled around some of the for loops) and managed to squeeze another 30% speed boost out of it.

machaira: I looked at the counting sort in that zip, and the problem with that is, that unless the range of numbers to be sorted is relatively small, it is much too memory intensive. If I wished to use it on an array of longs, where there is no restrictions on the value of the array elements (anywhere from -2^31 to +2^31) then the CountingSort would need approx. 16GB of memory to operate. There are however some sorts in there that I forgot about, such as heapsort.

I'm gonna add HeapSort, and ShellSort to my little benchmark program, and see if I can optimize the hell out of them. I'll post my benchmarks results when I get it all done (yes, I know I'm a dork).

Also, I'm gonna try making a hybrid QSort that will call use another sort algorithm when the subarray size gets small (I looked at the code machaira posted and it does that, and it made me remember my prof mentioning that in my data structs. and algo. class). I'm also gonna try re-implementing my qsort as iterative and see if that is any faster (at the least it will prevent stack errors for large arrays).
__________________
There are 10 types of people in this world, those that understand binary, and those that don't.
Reply With Quote
  #16  
Old 01-25-2003, 08:32 PM
Optikal's Avatar
Optikal Optikal is offline
Codeaholic

Preferred language:
Retired Leader
* Guru *
 
Join Date: Oct 2002
Location: Winnipeg, MB, Canada
Posts: 4,543
Default

On a side note, I'm currently just using good ol' GetTickCount() to do my benchmarking. What api's do I need to look into so that I can get a true 'cpu time' for my code rather than just the elapsed time (the same time that taskmon in NT displays)?
__________________
There are 10 types of people in this world, those that understand binary, and those that don't.
Reply With Quote
  #17  
Old 01-25-2003, 08:48 PM
John's Avatar
John John is offline
Bit Flipper
 
Join Date: Feb 2002
Location: The Inner Loop
Posts: 5,550
Default

I think the QueryPerformanceCounter along with QueryPerformanceFrequency will be about as accurate as you will find. Iceplug mentioned it above.

Orbity
__________________
Subclassing|Magnetic Forms|Operator Overloading (VB2K5)|QuickSnip.NET

"These Patriot playoff wins are like Ray Charles songs, Nantucket sunsets, and hot fudge sundaes. Each one is better than the last." - Dan Shaughnessy
Reply With Quote
  #18  
Old 01-25-2003, 09:20 PM
darkforcesjedi's Avatar
darkforcesjedi darkforcesjedi is offline
Trust me, I'm an

Preferred language:
* Expert *
 
Join Date: Apr 2001
Location: In ur base, pwnin d00dz
Posts: 1,961
Default

If you use the Format function to set your random integers to fixed length text (Number(1) = Format(Number(1), "0000000000")), then this seems to be faster (it's also alot simpler):

Code:
Public Sub SortArray(ByRef iArray() As String) On Error GoTo errs Dim temp() As String, GreaterThans As Long ReDim temp(UBound(iArray)) For i = 0 To UBound(iArray) GreaterThans = 0 For j = 0 To UBound(iArray) If StrComp(iArray(i), iArray(j), vbTextCompare) = 1 Then GreaterThans = GreaterThans + 1 Next Do If temp(GreaterThans) = "" Then temp(GreaterThans) = iArray(i) Exit Do Else GreaterThans = GreaterThans + 1 End If Loop Next For i = 0 To UBound(temp) iArray(i) = temp(i) Next errs: End Sub
__________________
To err is human; to debug, divine.
Reply With Quote
  #19  
Old 01-25-2003, 09:22 PM
Optikal's Avatar
Optikal Optikal is offline
Codeaholic

Preferred language:
Retired Leader
* Guru *
 
Join Date: Oct 2002
Location: Winnipeg, MB, Canada
Posts: 4,543
Default

Orbity: yes I know about that, but that does not account for the cpu-time actually spent running my code. For example, if I have another process running at the same time as my benchmark, and this other process sucks up some of the cpu it will affect my benchmarks results. If this other process sucks up cpu-time only during my quicksort benchmark, but not during the bublesort benchmark, it will make it look as if the quicksort was slower than it actually was.

I know its possible to retrieve the actual cpu-time spent executing my code, because the taskmon in NT does just this. I just don't know the appropriate api's...
__________________
There are 10 types of people in this world, those that understand binary, and those that don't.
Reply With Quote
  #20  
Old 01-25-2003, 09:49 PM
Optikal's Avatar
Optikal Optikal is offline
Codeaholic

Preferred language:
Retired Leader
* Guru *
 
Join Date: Oct 2002
Location: Winnipeg, MB, Canada
Posts: 4,543
Default

darkforces: I benchmarked that algorithm you posted, and unfortunately it is much slower than the other ones.

With 10,000 random items on a PII 233 machine:
Your string algorithm: 154.2 secs
Bubble Sort: 8.7 secs
Insertion Sort: 4.2 secs
Selection Sort: 4.2 secs
Merge Sort: 2.2 secs
Quick Sort: 0.2 secs
Radix Sort: 0.1 secs

* the time for the string sort does not include the time needed to create and format the string array from the array of longs.
__________________
There are 10 types of people in this world, those that understand binary, and those that don't.
Reply With Quote
Reply


Currently Active Users Viewing This Thread: 1 (0 members and 1 guests)
 
Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off

Forum Jump

Advertisement: