
12202003, 04:17 PM


Algorithms 'R' Us
Forum Leader * Guru *


Join Date: Jun 2002
Location: Canberra
Posts: 4,130


Combinatorial Formula (optimising)
C(n, k)  A Suitable case for Optimisation
The number of ways to choose K objects from N objects is given by the Combinatorial Formula:  C(n, k) = n! / (k! * (nk)!)
This formula is most convenient for analysis, but not for computation. With a VB "Long" variable, the highest N whose factorial you can calculate is just 12! If you have 13 objects, bad luck (13?? now there's a coincidence!)
Let's say I want to know the chances in a lotto game with 100 balls. Now I can tell you that choosing 6 lucky numbers from 100 has exactly 1 chance in  C(100, 6) = 1,192,052,400
That result just fits in a 32bit Long, but how can we calculate it? Especially when 13! can't be calculated, let alone 100! ?
Factorising
This formula is ideal for factorising, as it can be expressed as the ratio of two products, each having the same number of terms (N  K):
Code:
C(n, k) = A / B
Where
A = K+1 * K+2 * K+3 * ... * N1 * N
B = 1 * 2 * 3 * ... * NK
The formula always yields an integer result, so we know that all the B factors MUST divide into A, whose factors we also know.
So all we need to do is to go through list B, and "cancel" it from list A.
Cancelling Process  Example
Code:
C(49,6) = (44 x 45 x 46 x 47 x 48 x 49) / 6!
A = list of products remaining in numerator
B = " factors " denominator
Start =>
A: 49 48 47 47 45 44
B: 1 2 3 4 5 6
Cancel 6
A: 49 8 47 46 45 44
B: 2 3 4 5
Cancel 5
A: 49 8 47 46 9 44
B: 2 3 4
Cancel 4
A: 49 8 47 46 9 11
B: 2 3
Cancel 3
A: 49 8 47 46 3 11
B: 2
Cancel 2
A: 49 x 8 x 47 x 23 x 3 x 22
Result = 13,983,816
After a single pass through list B, we have eliminated ALL terms in the B list without having to do any multiplications.
To get our answer we just multiply the reduced list A terms together. This can only overflow if the answer itself is too big.
GCD function vs Mod operator
When we are looking for terms in A that we can divide by a factor in list B, we are not guaranteed a match every time.
Example:
A: 48, 3, 30
B: 9
The problem here is that reducing "9" can only be done if we multiply the A list together, as it is partly in "3" and partly in "30".
But we don't want to do the multiplication before we eliminate the B list completely, or we risk overflow preventing us getting a valid answer.
Clearly, a simple test of divisibility like
Code:
If A Mod B = 0 Then Cancel A, B
is not good enough.
The answer is to use the GCD function (greatest common divisor) for each A,B combination. If the two terms have any common factors, this gives us the best one, and we can reduce the A term, and also the B term. If the B term is greater than 1 it remains in the loop as an outstanding factor. It's residual value will now match some other term in A (or have a common factor).
It's guaranteed!
Extending the Precision of the Result
Our list of values to multiply can yield results much bigger than our Long variable type  we can make it a Decimal (Variant, CDec) so it can return a 96bit value, up to the maximum:  Max VB Decimal integer = 79,228,162,514,264,337,593,543,950,335
. This lets us compute the following values without overflow!
 C(98,49) = 25,477,612,258,980,856,902,730,428,600
 C(99,49) = 50,445,672,272,782,096,667,406,248,628
 C(100,46) = 73,470,998,190,814,997,343,905,056,800 ' close to MAX
 C(100,47) = overflow! (can't do much about this one!)
 C(1000,6) = 1,953,840,414,726,664,053,684,327,000
 C(2000,11) = 49,912,598,548,413,742,767,706,482,000
 C(10000, 8) = 2,473,222,266,965,964,208,068,123,750
 C(100000, 6) = 1,388,680,567,360,798,614,916,650,000
 C(100000, 7) = you guessed it! kaboom!
Efficiency?
I won't bore you with the details, but it's fast  very, very fast, as you'll see!

__________________
Cogito, ergo codo

12202003, 04:22 PM


Algorithms 'R' Us
Forum Leader * Guru *


Join Date: Jun 2002
Location: Canberra
Posts: 4,130


Here's the code, selfcontained. The function returns a Decimal variant value. The GCD function only needs to operate on integer values, it uses Euclid's algorithm, very fast.
Code:
Function Comb(ByVal N As Long, ByVal K As Long) As Variant
'==============================================================
'
' Comb(N, K) = (K+1 x K+2 x .... N) / C!
' where C = N  K
'
' [email]MathImagics@yahoo.co.uk[/email] Dec 2003
'
' Method:
' . Make lists of the numerator (A) and denominator (B) terms
' . Eliminate B factors by cancellation
' . Multiply residual A terms
'
' RETURNS: Variant (Decimal, 96bits)
'
' Max capacity:
' Limited only by magnitude of final result, regardless
' of input parameters
'
' Decimal max = 79,228,162,514,264,337,593,543,950,335
'
' Comb(100, 46) = 73,470,998,190,814,997,343,905,056,800
' Comb(1000, 6) = 1,953,840,414,726,664,053,684,327,000
' Comb(2000,11) = 49,912,598,548,413,742,767,706,482,000
' Comb(10000, 8) = 2,473,222,266,965,964,208,068,123,750
' Comb(100000, 6) = 1,388,680,567,360,798,614,916,650,000
'
'==============================================================
Dim C As Long
C = N  K
If C > K Then C = K ' swap so K > C, reduces number of factors
K = N  C
ReDim A(C) As Long, B(C) As Long
Dim I As Integer
Dim Verified As Boolean ' when true, all denominator factors
' have been eiliminated
'
' Set factor lists, A = Numerator, B = Denominator
'
For I = 1 To C
B(I) = I ' 1 2 3 ... NK = C terms
A(I) = K + I ' K+1, K+2, K+3 ... N = C terms
Next
'===========================================================
' Reduce B factors wherever a B factor has GCD(B,A) > a
'
' Uses Dr memory's Decimalversion GCD function test
'
'===========================================================
Dim d As Integer, j As Integer
Dim M As Long, Q As Long
Verified = True
For d = C To 2 Step 1
If B(d) > 1 Then
For j = 1 To C
M = GCD(B(d), A(j)) ' can terms be reduced?
If M > 1 Then
A(j) = A(j) / M
B(d) = B(d) / M
End If
Next
End If
If B(d) > 1 Then Verified = False ' this factor still pending
Next
If Not Verified Then
MsgBox "Failed reduction of factors!", vbExclamation
Exit Function
End If
On Error GoTo Overflow ' final result can overflow
Comb = CDec(A(1)) ' do final product in decimal mode
For I = 2 To C
If A(I) > 1 Then Comb = Comb * A(I)
Next
Exit Function
Overflow:
If Verified Then
MsgBox "Answer Overflows", vbExclamation
Else
MsgBox "Answer unobtainable", vbExclamation
End If
End Function
Function GCD(ByVal X As Long, ByVal Y As Long) As Long
'
' [email]MathImagics@yahoo.co.uk[/email]
'
' Euclid's method to find GCD(x,y)
'
Dim M As Long
If X > Y Then M = X: X = Y: Y = M ' ensure X < Y to begin with
Do
M = Y Mod X
If M = 0 Then GCD = X: Exit Function
Y = X
X = M
Loop
End Function

__________________
Cogito, ergo codo

05162004, 08:38 AM


Algorithms 'R' Us
Forum Leader * Guru *


Join Date: Jun 2002
Location: Canberra
Posts: 4,130


Minor bug in Combinations() function
Erratum
The Combinations() function above does not correctly handle the case where N = K.
Insert the following line at the beginning of the function:
Code:
If K = N Then Combinations = CDec(1): Exit Function ' C(a, a) = 1


05162004, 09:09 AM


Algorithms 'R' Us
Forum Leader * Guru *


Join Date: Jun 2002
Location: Canberra
Posts: 4,130


CombinationtoIndex, IndextoCombination
Knowing that there are 17.3 billion ways to choose 10 balls from 100 different ones, is one thing, but clearly we can't list them all (in our lifetimes).
If you are doing Lotto game analysis (a task which may provide you with intellectual stimulation, but is unlikely to make you any money), you may want instead to list a SECTION of the list of possible combinations.
That's done with an Index2Combination function  for example, Index2Combination(100, 10, 42) returns the 42nd member of an ordered list of combinations of 10 numbers from 100 (i.e. "1, 2, 3, 4, 5, 6, 7, 8, 9, 51")
You'll also want the INVERSE function, Combination2Index, that returns the list index number of a given combination. That is,
Combination2Index(100,10, "1, 2, 3, 4, 5, 6, 7, 8, 9, 51") returns 42.
Here are the first and last 3 members of the (100, 10) combination list:
Code:
#1 = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
#2 = 1, 2, 3, 4, 5, 6, 7, 8, 9, 11
#3 = 1, 2, 3, 4, 5, 6, 7, 8, 9, 12
....
....
#17,310,309,456,438 = 90, 91, 93, 94, 95, 96, 97, 98, 99, 100
#17,310,309,456,439 = 90, 92, 93, 94, 95, 96, 97, 98, 99, 100
#17,310,309,456,440 = 91, 92, 93, 94, 95, 96, 97, 98, 99, 100


05162004, 10:45 AM


Algorithms 'R' Us
Forum Leader * Guru *


Join Date: Jun 2002
Location: Canberra
Posts: 4,130


Index2Combination
The most common use of Index2Combination is to "select" (or list) a small section of combinations chosen from some part of the list.
Let's say I want to print a block of 100 combinations (from the N=100, K=10 list), starting with combination #123,456,789,001. Here's the code:
Code:
Dim CNO As Variant, i As Long
CNO = 123456789000
Debug.Print "Combinations List extract, start at #" & format(CNO, "#,0")
For i = 1 to 100
CNO = CNO + 1
Debug.Print " "; Index2Combination(100, 10, CNO)
Next
Note that the function returns a string, with the values separated by commas. It's declaration is:
Code:
Function Index2Combination(ByVal N As Long, ByVal K As Long, ByVal CNO) As String
Here's the results:
Code:
Combinations List extract, start at #123,456,789,000
1, 2, 19, 27, 32, 38, 57, 59, 93, 96
1, 2, 19, 27, 32, 38, 57, 59, 93, 97
1, 2, 19, 27, 32, 38, 57, 59, 93, 98
1, 2, 19, 27, 32, 38, 57, 59, 93, 99
1, 2, 19, 27, 32, 38, 57, 59, 93, 100
1, 2, 19, 27, 32, 38, 57, 59, 94, 95
....
....
How it works
We first ask "what must the first value be?". Is it 1, or 2, or 3, etc??? The number of combinations beginnng with 1 is C(99, 9), because there are 9 places left to fill, and 99 values to choose from.
That means the index value of the first combination beginning with 2 must be C(99,9) + 1 = 1,731,030,945,644 . Our test index value is less than that so the first number must be one.
Now we look at the second value, and work it out the same way  we find the highest value it could be. Note that, at position P, the number of combinations that begin with the current set of P values is given by:
Combinations(N  V(P), K  P)
where V(P) is the P'th value.
So it is simply a matter of converging on the desired value. Here is the function:
Code:
Function Index2Combination(ByVal N As Long, ByVal K As Long, ByVal CNO) As String
'
' by Dr Memory (MathImagics) May 2004
'
' N = # of values (balls)
' K = # selected to combine
' CNO = index number of combination to return
' 1 <= CNO <= Combinations(N, K)
'
'=============================================
Dim NC, V, cCount ' Variants (Decimal)
Dim Wheel As Long, W As Long, Combo As String
NC = Ctable(N, K)
If CNO < 1 Or CNO > NC Then Exit Function
Wheel = 1
cCount = 0# ' running count of combinations
For W = 1 To K  1
Do
V = Ctable(N  Wheel, K  W) ' how many till next value?
If V = 0 Then Exit Do
If cCount + V >= CNO Then Exit Do
cCount = cCount + V
Wheel = Wheel + 1
Loop
If W = 1 Then Combo = Wheel Else Combo = Combo & ", " & Wheel
Wheel = Wheel + 1
Next
Index2Combination = Combo & ", " & Wheel + (CNO  cCount  1)
End Function
For super efficiency, we prestore all the required Combinations() values in a table that we initialize in Form_Load:
Code:
Dim i As Long, K As Long
For i = 2 To N
For K = 1 To i
If K > MAXK Then Exit For
Ctable(i, K) = Combinations(i, K) ' ** Note! needs fix (see Post #3 above!!!)
Next
Next
Global declarations (put in the module with the Combn functions):
Code:
Public Const N = 45 ' example only
Public Const MAXK = 6 ' max value of K
Public Ctable(N, MAXK) As Variant

Last edited by MathImagics; 05162004 at 11:29 AM.

05162004, 11:22 AM


Algorithms 'R' Us
Forum Leader * Guru *


Join Date: Jun 2002
Location: Canberra
Posts: 4,130


Combination2Index
Here's the inverse function  I'll leave it as an exercise to figure out how it works!
Code:
Function Combination2Index(ByVal N As Long, ByVal K As Long, Combn As String) As Variant
'
' by Dr Memory (MathImagics) May 2004
'
' N = # of values (balls)
' K = # selected to combine
' Combn= list of values, e.g. "2, 7, 9, 23, 40, 45"
' must be in ascending order
'=============================================
ReDim token(K) As String
token = Split(Combn, ",")
If UBound(token) <> K  1 Then Exit Function
Dim NC, wcount, Wheel() As Long, W As Long
Dim V As Long, msg As String
ReDim Wheel(K)
Combination2Index = CDec(0)
For W = 1 To K
Wheel(W) = Val(token(W  1))
If Wheel(W) <= Wheel(W  1) Then Exit Function
If Wheel(W) > N Then Exit Function
Next
For W = 1 To K  1
For V = Wheel(W  1) + 1 To Wheel(W)  1
Combination2Index = Combination2Index + Ctable(N  V, K  W)  1
Next
Next
Combination2Index = Combination2Index + Wheel(K)  K + 1
End Function
What index number (in 100,10) would "10,20,30,40,50,60,70,80,90,100" be? The answer is 11,347,920,542,199
Code:
?Combination2Index(100,10,"10,20,30,40,50,60,70,80,90,100" )
11347920542199
?Index2Combination(100,10, 11347920542199 ) ' Verify
10, 20, 30, 40, 50, 60, 70, 80, 90, 100
Cheers!
Dr Memory


05162004, 11:24 AM


Algorithms 'R' Us
Forum Leader * Guru *


Join Date: Jun 2002
Location: Canberra
Posts: 4,130


For Litazen:
here is some verification for the case N = 3000, K = 10
Code:
?Format(Combinations(3000,10),"#,0")
16,029,803,915,594,860,452,884,242,200
?Index2Combination(3000,10, cdec("16029803915594860452884242200"))
2991, 2992, 2993, 2994, 2995, 2996, 2997, 2998, 2999, 3000
?Combination2Index(3000,10, "2991, 2992, 2993, 2994, 2995, 2996, 2997, 2998, 2999, 3000")
16029803915594860452884242200


Currently Active Users Viewing This Thread: 1 (0 members and 1 guests)


Thread Tools 

Display Modes 
Linear Mode

Posting Rules

You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off





