[PD] variance from mapping library
Mathieu Bouchard
matju at artengine.ca
Thu Mar 19 17:50:26 CET 2009
On Thu, 19 Mar 2009, Oded Ben-Tal wrote:
>> abs(a)+abs(b), or something like that. But 100000*100000 = 10000000000, and
>> if you divide that by 2^24 = 16777216, you get about 596, which is an upper
>> bound for the amount of error: so, the error is surely between -596 and
>> +596.
> I trust your math here but just notice that your example converges to -691.
That's because 100000*100000 is only one value. Then, the second [mean_n]
has to process + 90000*90000 + 80000*80000 + 70000*70000 + ... so the
theoretical error maximum is much more than 596 but much less than 596*10.
In practice, many of the individual errors are not that big, and perhaps
some of them cancel each other out.
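To illustrate the scale of those individual errors, here is a small sketch of my own (in Python, with NumPy's float32 standing in for Pd's single-precision floats — not anything from the mapping library itself). It sums the squares 100000*100000 + 90000*90000 + ... + 10000*10000 both exactly and in 32-bit floats:

```python
import numpy as np

# Squares of 100000, 90000, ..., 10000, summed exactly (Python ints)
# and in 32-bit floats, the precision Pd uses internally.
values = range(100000, 0, -10000)
exact = sum(v * v for v in values)   # 38500000000

s = np.float32(0.0)
for v in values:
    sq = np.float32(v) * np.float32(v)  # square rounded to a 24-bit mantissa
    s = s + sq                          # each addition rounds again

print(float(s) - exact)   # nonzero rounding error, a few hundred on this run

# Near 1e10 the step between adjacent float32 values is 1024, so any
# single result in that range can be off by up to half that step.
print(np.spacing(np.float32(1e10)))   # 1024.0
```

The spacing line shows the "step proportional to the number itself" effect directly: near 1e10 consecutive floats are 1024 apart, so errors of a few hundred per operation are exactly what one should expect.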
But finding the reason why it is -691 precisely would take a long time and
would not be very insightful.
> But if I understand you correctly 'filtering' the input data through [int]
> should make variance error free (we hope).
No, it won't, because all of the other objects still process floats. The
reason why ints wouldn't have that problem is that they have fixed
precision: the step between two adjacent numbers is 1, whereas for floats
it is roughly proportional to the numbers themselves. With integers you
will hit an overflow problem quite quickly instead: for example, if you
remake that abstraction using 32-bit integers (e.g. using the GridFlow
library), you can already get an overflow from random 5-digit numbers. But
at least it goes back to normal when given a more modest sequence, whereas
with floats it gets stuck and needs to be reset (recreated).
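For a concrete picture of the integer case, here is a sketch of my own (simulating 32-bit two's-complement wraparound in plain Python, rather than using GridFlow itself). The square of a single 5-digit number can already exceed 2^31 - 1, but the damage is transient: once the oversized operands are gone, the arithmetic is exact again.

```python
def wrap32(n):
    """Reduce an integer to signed 32-bit two's complement,
    the way int32 overflow would."""
    return (n + 2**31) % 2**32 - 2**31

# A single 5-digit square already overflows a signed 32-bit integer:
print(99999**2)          # 9999800001, far above 2**31 - 1 = 2147483647
print(wrap32(99999**2))  # 1409865409, the wrapped (wrong) value

# But with a more modest sequence the same arithmetic is exact again:
print(wrap32(sum(v * v for v in [300, 400, 500])))  # 500000, correct
```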
Using int64 you could get perfect results for most uses, but I don't
recall whether the bugs in GridFlow's int64 support were fixed or not... I
never quite had a use for int64 in the end.
For the "mapping" library, there isn't much of a choice but to remake it
with a slower algorithm, unless someone knows a magic trick for cancelling
almost all of the error without running so slowly. Actually, it already
runs in linear time, so it wouldn't be such a big loss if the complete sum
were recomputed at every step, because it would still be linear. It would
be a big loss if it could run in constant time (e.g. using an array for
the queue) and had to be switched to linear time.
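As a sketch of that trade-off (my own illustration in Python, with NumPy float32 standing in for Pd's floats — not the mapping library's actual code): a constant-time running sum gets permanently corrupted once large values pass through it, while recomputing the window sum at every step costs linear time per step but recovers as soon as the large values leave.

```python
from collections import deque
import numpy as np

def running_mean(xs, n):
    """Constant-time update: add the new value, subtract the one leaving."""
    q, s, out = deque(), np.float32(0.0), []
    for x in xs:
        q.append(np.float32(x))
        s = s + q[-1]
        if len(q) > n:
            s = s - q.popleft()
        out.append(float(s) / len(q))
    return out

def recomputed_mean(xs, n):
    """Linear per step: recompute the whole sum from the queue every time."""
    q, out = deque(), []
    for x in xs:
        q.append(np.float32(x))
        if len(q) > n:
            q.popleft()
        out.append(float(sum(q, np.float32(0.0))) / len(q))
    return out

# Ten huge values, then twenty small ones, window of 10.  While the huge
# values dominate the sum, each +1.0 is rounded away entirely, so the
# running sum ends up stuck at 0.0 after they leave the window.
xs = [1e8] * 10 + [1.0] * 20
print(running_mean(xs, 10)[-1])     # 0.0 -- the +1.0s were swallowed for good
print(recomputed_mean(xs, 10)[-1])  # 1.0 -- correct once the huge values left
```

This is the "gets stuck and needs to be reset" behaviour from above: the incremental sum carries its history of rounding errors forever, while the recomputed sum only ever sees what is currently in the queue.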
| Mathieu Bouchard - tél:+1.514.383.3801, Montréal, Québec