[PD] CPU cost II

Marius Schebella marius.schebella at gmail.com
Mon Nov 13 03:03:14 CET 2006


You are right, the big CPU eaters are graphics and FFT objects (which 
depend on the window size).
But, for example, when I plan to use a Mac mini, I would like to know 
how many videos I can add, and at what resolution.
I also know that with a good graphics card I can render more GEM 
objects than with a big CPU but a crappy graphics card.
I was also thinking of having a [dsp] object interact with the 
framerate within a patch.
Thanks anyway for thinking about that problem!
marius.


Mathieu Bouchard schrieb:
> On Thu, 9 Nov 2006, Marius Schebella wrote:
> 
>> for me that's a really important topic; I often run into problems 
>> with slow machines that aren't fast enough to play patches.
> 
> With video this happens often, even on fast machines, and especially 
> with GridFlow: e.g. it's not possible to use [#fft] at 30fps unless your 
> resolution is really small.
> 
>> I wonder if it is possible to calculate something like FLOPs 
>> (FLOating-Point OPerations) per object
> 
> It wouldn't be just a count of flops; that's a rather useless unit of 
> measure unless you know that all your flops take the same amount of 
> time, and it's the time you actually care about. In numerical 
> analysis, multiplications and additions are usually counted 
> separately, because they are expected to fall into two different 
> speed classes.
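To make the point concrete, here is a sketch in C of a cost model that counts multiplies and adds separately (the struct, function names, and cost weights are illustrative, not part of Pd):

```c
#include <stddef.h>

/* Hypothetical cost model: tally multiplies and adds separately,
 * as in numerical analysis, since they may run at different
 * speeds on a given CPU. */
typedef struct { size_t muls; size_t adds; } op_count;

/* Cost of an n-point dot product: n multiplies, n-1 adds. */
op_count dot_product_cost(size_t n) {
    op_count c = { n, n > 0 ? n - 1 : 0 };
    return c;
}

/* Convert an op count to an estimated time, given per-op costs
 * (e.g. in nanoseconds) measured on the target machine. */
double estimated_time(op_count c, double mul_cost, double add_cost) {
    return c.muls * mul_cost + c.adds * add_cost;
}
```

On a machine where a multiply and an add cost the same, the two weights collapse into a single flop count; the separate tally only matters when they differ.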
> 
>> and have a list for all the pd objects.
> 
> This would have to be parametrized by several things: the length of 
> list arguments, the block size, and possibly many other parameters.
> 
> Objects like [fft~] do more work per sample when the block size is 
> larger; I suspect that fiddle's situation is at least somewhat 
> similar, but I haven't tested it.
> 
> GEM/PDP would be harder due to framesize differences and to how the !@#$ 
> one is supposed to measure time spent on the GPU.
> 
> I expect GridFlow to be a lot harder to measure: e.g. while 
> [pix_convolve] takes time roughly proportional to the size of the 
> picture (in pixels) times the size of the kernel (in pixels), in 
> GridFlow you should only count the nonzero entries of the kernel 
> (!!). And then [#convolve] has special options like "op" and "fold" 
> which aren't in any other implementation of convolution that I've 
> seen in Pd, and which can change the run time radically. And then 
> [#convolve] supports *any* number of channels, while [pix_convolve] 
> supports at most 4. And so on...
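The dense-versus-sparse difference can be sketched as two cost formulas (hypothetical functions; the unit is multiply-accumulates, and the real implementations do more than this):

```c
/* Dense 2-D convolution, as in [pix_convolve]: every kernel
 * entry is applied at every pixel. */
unsigned long dense_conv_macs(unsigned long image_pixels,
                              unsigned long kernel_pixels) {
    return image_pixels * kernel_pixels;
}

/* Sparse convolution, as described for [#convolve]: only the
 * nonzero kernel entries cost anything. */
unsigned long sparse_conv_macs(unsigned long image_pixels,
                               unsigned long nonzero_kernel_entries) {
    return image_pixels * nonzero_kernel_entries;
}
```

For a 320x240 image with a 5x5 kernel that has only 5 nonzero entries, the sparse count is five times smaller; that is the kind of radical run-time difference described above.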
> 
>> it really would be great to know the benchmarks of different 
>> hardware systems. marius.
> 
> Even though it's impossible to get a complete picture of the speed of 
> each class, I think it's worth trying. However, this may require some 
> modifications to Pd. It's possible to build benchmarks in pure Pd, 
> but that would require a big mess of [timer] and [t] objects in order 
> to prevent sent messages from being counted as part of the object's 
> running time. If it were done in C in a similar way, it would be much 
> faster, which would be important in order to get sufficiently 
> accurate figures.
> 
> Even then, I fear that it wouldn't be that accurate when many short 
> operations are involved. In that case, a statistical profiler would 
> be more appropriate.
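The core idea of a statistical profiler, sketched with POSIX setitimer/SIGPROF (all names here are illustrative; real profilers record full stack samples rather than a single flag): instead of timing each short operation, where timer overhead swamps the measurement, interrupt the program at regular intervals and count which phase it was in.

```c
#include <signal.h>
#include <sys/time.h>

/* Counts how many profiling ticks landed while "DSP" code ran. */
static volatile sig_atomic_t samples_in_dsp = 0;
/* Set to 1 by the program while it is inside the code of interest. */
static volatile sig_atomic_t current_phase = 0;

static void on_profile_tick(int sig) {
    (void)sig;
    if (current_phase) samples_in_dsp++;
}

/* Fire SIGPROF every 1 ms of consumed CPU time. */
void start_sampling(void) {
    struct itimerval iv = { {0, 1000}, {0, 1000} };
    signal(SIGPROF, on_profile_tick);
    setitimer(ITIMER_PROF, &iv, 0);
}
```

The fraction samples_in_dsp / total_samples then estimates the share of CPU time spent in that phase, with accuracy that improves the longer the program runs.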
> 
>  _ _ __ ___ _____ ________ _____________ _____________________ ...
> | Mathieu Bouchard - tél:+1.514.383.3801 - http://artengine.ca/matju
> | Freelance Digital Arts Engineer, Montréal QC Canada
