[PD] Making a Realtime Convolution External

Tue Apr 5 05:08:37 CEST 2011

Dear Seth,

Seth Nickell wrote:
> Another question on similar lines...
> 
> Are the DSP calls liable to vary t_signal->s_n (block size) without
> notification? 64 samples, apparently the default on pd-extended, is
> doable without buffering for partitioned convolution on a modern
> computer, but it exacts a pretty high CPU toll, and if I have to
> handle random blocksize changes, it gets more expensive.
They cannot vary by themselves. What is usually done (e.g. with
FFTs) is to place a signal (tilde ~) object in a subpatch and set
the block size for that subpatch using the [switch~] or [block~]
objects. You might consider using this very approach.
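
As a rough illustration of why this is cheap to handle: a Pd external learns
the block size in its dsp method (sp[0]->s_n), which Pd calls when the DSP
graph is rebuilt, so buffers only need resizing there and not in the perform
routine. A minimal sketch in plain C, where the conv_state struct and the
conv_prepare function are hypothetical stand-ins for the external's own code
(neither is part of Pd's API):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-object state; not a Pd type. */
typedef struct {
    int    block;    /* block size the buffer was allocated for */
    float *work;     /* scratch buffer holding `block` samples */
    int    reallocs; /* how many times the buffer was (re)allocated */
} conv_state;

/* Intended to be called where the block size becomes known (in a Pd
 * external that would be the dsp method). Re-allocates only when the
 * block size actually changed.
 * Returns 0 on success, -1 if the allocation failed. */
int conv_prepare(conv_state *x, int n)
{
    assert(x != NULL && n > 0);
    if (x->work != NULL && n == x->block)
        return 0;                              /* unchanged: nothing to do */

    float *p = realloc(x->work, (size_t)n * sizeof *p);
    if (p == NULL)
        return -1;                             /* old buffer is still valid */

    memset(p, 0, (size_t)n * sizeof *p);       /* start from silence */
    x->work = p;
    x->block = n;
    x->reallocs++;
    return 0;
}
```

With the object inside a subpatch containing [block~ 512], such a function
would simply be called with n = 512 instead of 64.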
> 
> Also, since convolution is much more efficient around block sizes of
> 256 or 512, perhaps I should default to one of these, buffer a little,
> and have a "runatpdblocksize" message or somesuch?
I still have not understood if/how the user can set the duration of the
first partition of your partitioned convolution, and how these partitions
are structured in their (possibly increasing) sizes. Since this first
parameter defines the latency-vs-CPU tradeoff, it should not be preset
by the developers.
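
To make the tradeoff concrete, here is a small sketch in plain C. The
doubling schedule and both function names are my assumptions for
illustration only; I do not know how your engine actually lays out its
partitions. It shows that the first partition size sets the minimum latency
(roughly size / sample rate), while larger later partitions reduce the number
of partitions to process.

```c
#include <assert.h>

/* Latency in milliseconds contributed by buffering `first` samples
 * at sample rate `sr` (in Hz). */
double first_partition_latency_ms(int first, double sr)
{
    assert(first > 0 && sr > 0.0);
    return 1000.0 * (double)first / sr;
}

/* Hypothetical schedule: partition sizes start at `first` and double
 * until they reach `max_part`, then stay constant, until `ir_len`
 * samples of impulse response are covered or `cap` entries are used.
 * Writes the sizes to `sizes` and returns how many were written. */
int partition_schedule(int first, int max_part, long ir_len,
                       int *sizes, int cap)
{
    assert(first > 0 && max_part >= first && ir_len > 0);
    assert(sizes != 0 && cap > 0);

    int  count = 0;
    int  size = first;
    long covered = 0;

    while (covered < ir_len && count < cap) {
        sizes[count++] = size;
        covered += size;
        if (size * 2 <= max_part)
            size *= 2;              /* grow until the cap is reached */
    }
    return count;
}
```

At 44.1 kHz a first partition of 64 samples corresponds to about 1.45 ms,
and one of 512 samples to about 11.6 ms, which is why the user should be
able to choose it.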

P.

PS: Pd and Pd-extended use the same core audio engine. You might want
to consider Pd-extended as vanilla Pd with a folder full of precompiled
externals.

> 
> On Mon, Apr 4, 2011 at 7:48 PM, Seth Nickell <seth at meatscience.net> wrote:
>>>> 2) Anyone have requests for features/api? It's currently simplistic:
>>>>   - takes a "read FILENAME" message, loads the file, does a test
>>>> convolution against pink noise to normalize the gain to something sane
>>> Is this done within the main Pd audio thread?
>> The convolution engine has support for doing it either on the calling
>> thread, or a background thread. I'm thinking of defaulting to a
>> background thread. That seem like the right move?
>>
>>>>   - caches the last N impulse responses, as the test convolution
>>>> takes a little time
>>>>   - allows setting the cache size with a "cachesize N" message
>>> To make sure I understood this: cachesize is not the size of the first
>>> partition of the partitioned convolution, but the cache that tries to avoid
>>> audio dropouts when performing the test convolution?
>> The convolution engine can swap-in a pre-loaded ('cached') IR in
>> realtime without glitching... but it means keeping 2x the Impulse
>> Response data in RAM. To keep the default API simple but useful, I'm
>> defaulting to caching only the last 5 impulse responses in RAM.
>> "cachesize N" lets you increase that number.... let's say in a
>> performance you wanted to use 30 different impulse responses and you
>> have 2GB of ram... should be nbd.
>>
>>>>   - disable normalization with "normalize 0" or "normalize 1"
>>> Yes, disabling this could be a good idea! You could also add a "gain 0-1"
>>> message for manual control.
>> It's worth noting that impulse responses are usually whack without gain
>>  normalization.... like factors of hundreds to millions off a usable
>> signal.
>>
>>>>  Features I'm considering (let me know if they sound useful):
>>>>    - load from an array instead of from disk (no gain normalization?)
>>> Very good.
>>>>    - It wouldn't be hard to enable MxN convolution if that floats
>>>> somebody's boat.
>>> I am sure if you come up with a convolution as efficient and flexible as
>>> jconv by Fons within Pd, then soon a multichannel use and hence request will
>>> come up fast.
>> I'd be interested in what flexibility means in this context, it might
>> give me some good ideas for features to add. Efficiency-wise, last
>> time I benchmarked it's more efficient than jconv, but the difference
>> is offset by less graceful degradation under CPU load (I convolve in
>> background threads to preserve realtime in the main thread while
>> avoiding an irritating patent that's going to expire soon...).
>>
>> WRT Pd's audio scheduling... are Pd signal externals held to
>> realtime or can my dsp call vary the number of cycles it takes by 100%
>> from call to call? VST seems to do ok with this, but AudioUnits get
>> scheduled to run at the very last instant they possibly could. If Pd
>> can have some variance, I can drop the threads and improve the
>> external's degradation under high CPU load.
>>
>> thanks for the feedback (also, is this the best list for this kind of feedback?),
>>
>> -Seth
>>