Array-UDF weirdness

Cas
05-20-2008, 08:11 AM
I'm experimenting with UDFs receiving and returning arrays. The receiving part is no problem, once one understands how Excel represents row and column vectors. But I'm having a problem with the returning part in a certain situation. The following illustrates it with complexity reduced as far as possible:

=SUM({1,2})
produces 3, as expected
=SUM({1,2}+1)
produces 5, as expected
=SUM(udfAssign({1,2}))
produces 3, as expected
=SUM(udfAssign({1,2})+1)
produces 2, NOT as expected
=SUM(udfAssign({1,2}+1))
produces 5, as expected

Now, at this point, I should probably show what udfAssign does. The answer is that it doesn't do anything:
Public Function udfAssign(whichInput As Variant) As Variant
udfAssign = whichInput
End Function

I tried changing both the input and output Variant to variant arrays, but neither combination changes the results in any of the cases.

So, to elaborate, the unexpected behaviour occurs when

returning an array from a UDF, then
inputting this into a worksheet function like "+" which ordinarily should return an array if either parameter is an array, and finally
inputting the result of this into a worksheet function which aggregates the array into a result.

Somewhere between the second and third step, the array is silently replaced by its first element. If any of the three steps are left out, the problem doesn't occur, i.e.

=SUM({1,2}+1)
=SUM(udfAssign({1,2}))
=udfAssign({1,2})+1

all work fine, where the last one is entered as an array formula in a 1-by-2 range.

Further, either evaluating the problematic expression manually using F9 or entering it as an array formula correct the problem. That last part is especially peculiar - entering it as a normal formula makes it behave like the first element of an array formula (returns sum(1+1)=2), while entering it as an array formula makes it behave like a normal formula (returns sum({1,2}+1)=5 in each cell).

I'm pretty convinced that this behaviour can't be by design, but maybe I'm overlooking something? Or maybe there's a type of return array that fixes this?

What this means, in effect, is that it isn't safe to daisy-chain non-aggregating UDFs and worksheet functions. The obvious workaround is to provide UDF-alternatives for all worksheet functions that are of interest in this scenario, possibly using Application.Evaluate (yes, this fixes the problem). Rather annoying. :mad:

ETA: Oh, the above applies to Excel 2002. I'd be very interested to hear if later versions produce identical results. :)

Colin Legg
05-20-2008, 12:36 PM
Hi Cas,

Now, at this point, I should probably show what udfAssign does. The answer is that it doesn't do anything:
Public Function udfAssign(whichInput As Variant) As Variant
udfAssign = whichInput
End Function


Actually, I think it does a lot....let me explain:

You are passing a horizontal array constant into your UDF: your UDF is returning a multiple result array and not an array constant.
This completely changes the dynamics of the formula from that point onwards: Excel will not iterate through the multiple-result array as is unless the formula is CSE entered.

To use a couple of examples:
In this formula (a version of example 4 you gave above so let's call it (4a)):

=SUM(udfassign({4,8})+1)

The array constant is passed directly into the UDF. The UDF returns a multiple-result array of 4,8.
You then want Excel to iterate through the multiple-result array and add 1 to each element.
It won't do it as is unless it is array-entered: you will just get the SUM of 4+1 = 5


In this formula (let's call this one (5a)):

=SUM(udfassign({4,8}+1))

Here the iteration through the array constant will be performed because the array constant is still intact when the additive operation is assessed: the result is 5,9.
This is then passed into the UDF and the UDF then returns a multiple-result array of 5,9 which is then summed to give 14. Here no iteration is required on the mulitple-result array returned from your UDF. That is the difference between the two.



To correct the problem with 4a, you have 3 obvious choices:

=SUM(udfassign({4,8})+1) 'C+S+E entered.



=SUMPRODUCT(udfassign({4,8})+1)



=SUM(N(udfassign({4,8}))+1) 'nasty - avoid this.


HTH! :)
Colin

Cas
05-20-2008, 01:30 PM
Thanks for the reply Colin, I was hoping you as the resident Array-Formula guru would let yourself be dragged into this quagmire, because - alas - I'm not convinced yet that it isn't one. :)

your UDF is returning a multiple result array and not an array constant.
This completely changes the dynamics of the formula from that point onwards
Yes, I'm with you so far. That Excel makes a difference between arrays and array constants becomes evident already when one enters "={1,2}+1" and "=A1:B1+2" without CSE - the former returns an array, the latter returns an error.
Excel will not iterate through the multiple-result array as is unless the formula is CSE entered.
That, however, isn't correct, or I'm not clear on the terminology here. Compare these two:

=SUM(SQRT({1,4})+1)
=SUM(udfAssign({1,2})+1)

The first produces 5, entered with or without CSE, the second only when entered with CSE. Surely the result of the worksheet SQRT() falls under "multiple result array"? So there has to be more to it than that.

=SUMPRODUCT(udfassign({4,8})+1)

Good call, I never thought to try that. So, what does this mean? Presumably that the problem isn't that udfAssign()+1 doesn't produce an array, but that SUM doesn't aggregate the array for some reason. See also the following post.

=SUM(N(udfassign({4,8}))+1) 'nasty - avoid this.

This is also very interesting, however nasty. Should one interpret this as N() in effect producing an array constant?
HTH!
Absolutely, but more is needed! :p

Cas
05-20-2008, 02:00 PM
Okay, I just revisited Colin's excellent Array Formula KB thread and came across something that I don't get at all and that may well be related to the issue described above.

Set up the following: cell A1 with value 1, cell A2 with value 2.
Now, try out the following formulas, each both with (in a 1-by-2 range) and without CSE:

=OFFSET(A1,0,{0,1})
=SUM(OFFSET(A1,0,{0,1}))
=SUMPRODUCT(OFFSET(A1,0,{0,1}))
=SUBTOTAL(9,OFFSET(A1,0,{0,1}))
=SUM(SUBTOTAL(9,OFFSET(A1,0,{0,1})))
=SUMPRODUCT(SUBTOTAL(9,OFFSET(A1,0,{0,1})))

SubTotal(9) is the sum-setting, max or min will do just as well.

For me, this yields the following:

Offset on its own doesn't work with CSE at all.
Sum() can't aggregate Offset(), but works as a multi-result formula with CSE.
SumProduct() doesn't work with Offset().
Wrapping Offset() with Subtotal() makes everything peachy - multi-result with CSE, aggregation with Sum() and SumProduct().

So, somehow, Offset produces an array type that can't be aggregated, even with CSE. And somehow Subtotal turns this into an array type that can be aggregated, even without CSE.

I tried to repeat the same thing with Index() in place of Offset(), but Index({1,2},1,{1,2}) can't be used with Subtotal() and Index(A1:A2,1,{1,2}) simply doesn't produce an array at all, so all subsequent operations just give 1. Doesn't add anything to the Offset() experiment, in other words.

I'm sure I don't know what to make of this, but maybe it'll spark thoughts in someone else's head.

This really seems to be a special case, see final footnote here (http://www.tushar-mehta.com/excel/tips/array_formulas.htm). One less mystery to be solved, one less clue to follow up. :-\

Colin Legg
05-21-2008, 03:20 AM
Hello again Cas,

Thanks for the reply Colin, I was hoping you as the resident Array-Formula guru would let yourself be dragged into this quagmire,
You're doing a disservice to a couple of XVBT members there... but thank you for the compliment. ;)

because - alas - I'm not convinced yet that it isn't one. :)

No problem, I'm sure we can clear it up for you.... to a certain extent, anyway.


That, however, isn't correct, or I'm not clear on the terminology here. Compare these two:

=SUM(SQRT({1,4})+1)
=SUM(udfAssign({1,2})+1)

The first produces 5, entered with or without CSE, the second only when entered with CSE. Surely the result of the worksheet SQRT() falls under "multiple result array"? So there has to be more to it than that.
Okay, I think you've misunderstood me / confused the terminology but thank you for keeping this in the context of the SUM function. Your example actually supports what I said and the output of the SQRT function is different to your UDF; maybe things will be clearer if you examine the output of these functions yourself:

=TYPE({1,4})
=TYPE(SQRT({1,4})+1)
=TYPE(udfassign({1,2})+1) 'entered normally.
=TYPE(udfassign({1,2})+1) 'entered as an array formula.





=SUM(N(udfassign({4,8}))+1) 'nasty - avoid this.

Should one interpret this as N() in effect producing an array constant?

Well, again you can see for yourself:

=TYPE(N(udfassign({4,8}))+1)
=TYPE(udfassign({4,8})+1)


As far as I'm aware there is no documentation on this kind of thing which is why I'm being very cagey about giving my own interpretation / understanding as a generic rule.


=SUMPRODUCT(udfassign({4,8})+1)Good call, I never thought to try that. So, what does this mean? Presumably that the problem isn't that udfAssign()+1 doesn't produce an array, but that SUM doesn't aggregate the array for some reason.
The syntax on SUMPRODUCT is SUMPRODUCT(Array1, Array2, .....) so I think there is an implicit data type conversion occurring behind the scenes when you pass your UDF result into a SUMPRODUCT parameter - similar to the explicit conversion I showed you with the N worksheetfunction. This is why you don't have to press C+S+E with SUMPRODUCT but you do with SUM. Again, I am not aware of any documentation on this so it is my interpretation.

=TYPE(udfassign({1,2})+1) '---> 1
=SUM(TYPE(udfassign({1,2})+1)) '---> 1 (entered as standard formula)
=SUMPRODUCT(TYPE(udfassign({1,2})+1)) '---> 64



HTH some more...!

Colin

EZ Archive Ads Plugin for vBulletin Copyright 2006 Computer Help Forum