[PD] is this a spectral gate?

Frank Barknecht fbar at footils.org
Tue Feb 20 12:54:21 CET 2007


Hello,
Kevin McCoy wrote:

> I am still pretty new at FFT things but I am having a lot of fun.  I know
> Tom Erbe's soundhack has something called a "spectral gate" so I thought I'd
> give it a shot and try to make my own in Pd after reading about it.  Doesn't
> sound all that great, it actually ends up sounding like a really low quality
> wma file or something :)
> 
> Is this technically a spectral gate?  I'm using [>~] from zexy which in my
> mind says, "Look at all of the frequencies in the block and only allow those
> which are above value x to pass through."  I've attached the patch here -
> any info or guidance is much appreciated.  Any sound that goes through it
> pretty much loses all definition and clarity - is there a fix for this?

I'm not really sure what a spectral gate does, but you've probably seen
doc/3.audio.examples/I03.resynthesis.pd, which is a kind of equalizer
or multiband filter. 

Maybe you've taken this patch as a model for yours. But then some
things went wrong in your adaptation. First: [tabreceive~ $0-hann] will
not receive anything, because [table $0-hann] is missing. So it will
only output zeros. Either remove the multiplication with $0-hann, or
add the [pd Hann-window] subpatch to the patch.

But the real (and imaginary) mistake is the actual "gating" with [>~].
If you open the fft~-help.pd file (Help on rfft~), and print~ what
comes out of rfft~'s outlets, you will see something like this: 

real:
0.00016968 -4.6019e-07 -2.1632e-07 -3.2469e-07 -9.2026e-07 32       -7.499e-07 -3.2501e-07
-3.1411e-07 -9.6657e-07 -6.2494e-05 -1.4388e-06 -5.035e-07 -5.0472e-07 -1.665e-06 -2.3571e-05
-5.8313e-07 -7.5066e-08 -6.2174e-07 -2.1764e-06 -1.0339e-05 -1.8991e-06 -3.5305e-07 1.376e-07
-2.2529e-06 -4.486e-06 -4.1878e-07 -1.9073e-06 -5.2361e-07 -1.7626e-06 -2.8447e-06 -2.943e-07
-7.6485e-08 0        0        0        0        0        0        0
0        0        0        0        0        0        0        0
0        0        0        0        0        0        0        0
0        0        0        0        0        0        0        0
imaginary:
0        -2.7462e-09 1.9217e-07 -1.4215e-06 -1.0745e-07 -0.00013828 1.4264e-07 -2.5163e-07
1.0069e-07 3.7871e-07 -5.0489e-06 2.0548e-06 -1.8181e-08 -4.5923e-07 1.5607e-07 -6.6416e-06
-2.0862e-07 -5.8076e-07 6.2139e-08 -3.0756e-07 -4.1732e-06 6.2431e-07 -2.9043e-07 -5.2873e-07
-2.2713e-07 -3.1568e-06 -1.1066e-07 -0       -2.0931e-07 -3.162e-07 -6.2198e-07 -2.3119e-07
0        0        0        0        0        0        0        0
0        0        0        0        0        0        0        0
0        0        0        0        0        0        0        0
0        0        0        0        0        0        0        0

Now if you send this through [>~ 0.8], each of these numbers is
replaced by a 0 or a 1, depending on whether it is greater than 0.8.
It might sound funky, but it's not what you want to achieve.

So let's first have a look at what [rfft~] does: It will give you two
signals. One is called the real, the other the imaginary part, but
let's forget about this for now and look at it from a bit afar:

Generally an FFT will do a spectral analysis. It will calculate what
sine waves you need to add up to get the same signal as that played in
the current signal block. Basically it will tell you the frequencies
and phases (first and second inlets) and amplitudes of a lot of [osc~]
objects that, if you add them all up, would resynthesize your current
signal. (You cannot directly use these [osc~] objects to resynthesize
what comes from rfft~ but let's for a moment assume that we could.)

How many [osc~] objects you can control, will depend on the
block-size: The FFT will generate control data for blocksize/2
oscillators. So with a blocksize of 64, you get frequencies, phase and
amplitudes for 32 osc~s.

Now for some deep mathematical reasons all these [osc~] objects have
fixed tunings: They all are multiples (harmonics) of
Samplerate/Blocksize.  So it starts at f0 = 0 Hertz, the next [osc~]
would have a frequency f1= 1 * SR/BS, the next at f2=2*SR/BS up to the
final one: f_final = (BS/2) * SR/BS == SR/2 or the Nyquist-frequency.
For a blocksize of 16 and a samplerate of 48000 Hz this would be: 

f0: 0
f1: 1 * 48000/16 = 3000
f2: 2 * 48000/16 = 6000
...
f8: 8 * 48000/16 = 24000

(Actually of course these are bs/2 + 1 frequencies, but 0 and Nyquist
are special anyway so I thought I could cheat a bit. ;))
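
To make the bin spacing concrete, here is a small sketch in Python
(not Pd) that lists the fixed frequencies for a given samplerate and
blocksize. The function name bin_frequencies is made up for this
illustration.

```python
def bin_frequencies(sample_rate, block_size):
    """Return the frequency in Hz of every FFT bin from 0 Hz up to Nyquist."""
    spacing = sample_rate / block_size  # distance between neighbouring bins
    # blocksize/2 + 1 bins: 0 Hz and Nyquist are both included
    return [k * spacing for k in range(block_size // 2 + 1)]

# Same numbers as in the list above: SR = 48000, blocksize = 16
freqs = bin_frequencies(48000, 16)
for k, f in enumerate(freqs):
    print("f%d: %g" % (k, f))
```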

Because the frequencies are fixed and known, the rfft~ object doesn't
need to specify them explicitly. It only needs to calculate the
amplitude and the phase of every partial [osc~].

Now the tricky parts to understand are these: 

[rfft~] will not directly output the amplitudes and the phases, but
this strange thing called real and imaginary part. These carry exactly
the same information about amplitude and phase, but encoded a bit
differently than you are probably used to from working with [osc~]:

Amplitude and phase are specified in a kind of polar coordinate
system, where the amplitude is the radius (or distance from origin)
and the phase is the angle of the polar coordinates. Re and Im however
are cartesian coordinates (in the complex plane).

You can convert re/im pairs to amplitude and phase using these
formulas: 

 amp = sqrt(re^2 + im^2) 
 phs = arctan(im/re)   (taking the quadrant into account, like atan2(im, re))

This is a standard cartesian to polar conversion.
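
As a quick illustration outside of Pd, the conversion and its inverse
can be sketched in Python. The helper names to_polar and to_cartesian
are invented for this example, and math.atan2(im, re) is used for the
phase so that all four quadrants are handled.

```python
import math

def to_polar(re, im):
    """Cartesian (re, im) -> polar (amplitude, phase in radians)."""
    return math.hypot(re, im), math.atan2(im, re)

def to_cartesian(amp, phs):
    """Polar (amplitude, phase in radians) -> cartesian (re, im)."""
    return amp * math.cos(phs), amp * math.sin(phs)

# An arbitrary re/im pair, converted to polar and back again
amp, phs = to_polar(3.0, 4.0)
re, im = to_cartesian(amp, phs)
print(amp, phs, re, im)
```

Going to polar and back should return the re/im pair you started
with, up to rounding errors.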

Most of the time you can skip calculating the phase, but more on that
later.

The amplitude calculation in Pd lingo looks like this:

amp:

 [rfft~]
 |\    |\
 [*~]  [*~]
 |    /
 |   /
 [+~ ] <= just inserted for clarity, you can also directly go to sqrt~
 |
 [sqrt~] or [q8_sqrt~], which is much faster.


The real and imaginary part (or the phase and amplitudes) are encoded
inside the signal blocks that [rfft~] outputs. The first pair of
samples of the left and right outlet~s of [rfft~] contains the info
about amplitude and phase for the first [osc~] in our big oscillator
bank, that has frequency f0. The second sample pair contains info for
the next osc~ with frequency f1 and so on up to the sample pair number
"blocksize/2" (counting from 0), which contains the amp and phase for
the final oscillator at Nyquist frequency. The rest of the block
always is zero, as we don't have oscillators for that.

Some real world data might be useful: Assume we have a blocksize of 8.
Then a block of samples might look like this, when print~ed:

orig:
0.13004  0.26951  0.40352  0.52934  0.64446  0.74649  0.83341  0.90344 

If you send this through [rfft~] you will get this: 

img:
0        1.0317   0.41679  0.17191  0        0        0        0       

re:
4.4602   -0.58717 -0.46243 -0.44167 -0.43737 0        0        0 

Sending these two to [rifft~] and dividing by blocksize 8 will give
you the original signal block back. 

You can also calculate the amplitudes like above, which of course is
easy for our first sample: amp = sqrt(4.4602^2 + 0^2) = 4.4602

Actually to get the correct amplitudes you would need to normalize the
re/im pairs here as well by dividing them by 8, I just skipped that.

Here's the full scoop:

amp:
4.4602   1.1871   0.62254  0.47394  0.43737  0        0        0

See attached "fft-up-close.pd" to try this on your own.
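
If you would rather check these numbers outside of Pd, the following
Python sketch runs a naive (slow) DFT on the same block of 8 samples.
It assumes the usual exp(-i*2*pi*k*n/N) sign convention, and the
function name rdft is made up for this example. The results should
come out close to the printed re, img and amp values above.

```python
import math

def rdft(block):
    """Naive DFT of a real block; returns (re, im) lists for bins 0..N/2."""
    n = len(block)
    re, im = [], []
    for k in range(n // 2 + 1):
        angles = [2 * math.pi * k * i / n for i in range(n)]
        re.append(sum(x * math.cos(a) for x, a in zip(block, angles)))
        im.append(-sum(x * math.sin(a) for x, a in zip(block, angles)))
    return re, im

# The block printed as "orig" above
orig = [0.13004, 0.26951, 0.40352, 0.52934, 0.64446, 0.74649, 0.83341, 0.90344]
re, im = rdft(orig)
amp = [math.hypot(r, i) for r, i in zip(re, im)]  # not normalized by N

for k in range(len(amp)):
    print(k, round(re[k], 4), round(im[k], 4), round(amp[k], 4))
```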

This means, that resynthesizing this signal at SR=48000 would be
similar to using oscillators like this: 

  [osc~ 0]
  |
  [*~ 4.4602]

  [osc~ 6000]
  |
  [*~ 1.1871]

  [osc~ 12000]
  |
  [*~ 0.62254]

  [osc~ 18000]
  |
  [*~ 0.47394]
  
  ...

and so on. (Note that without normalizing, these values are too loud.)

However: All these oscillators would also need to have their phases
set accordingly, so you cannot just use above oscillator bank directly
in real life.

The inverse FFT objects like [rifft~] will accept the amplitude and
phase information in the real/imaginary format directly. This means,
you can think of the [rifft~] as a resynthesis bank of blocksize/2
oscillators like above with real and imaginary inputs instead of
amplitude and phase input, and every oscillator inside [rifft~] is
spaced Samplerate/Blocksize Hertz apart.

As fft~-help.pd and my calculation above show, connecting an [rfft~]
to a [rifft~] will just pass the signal practically unchanged (it's
just a bit louder afterwards, that's why you normally normalize it by
dividing the output by the blocksize like [/~ 64]). Depending on
windowing and overlap you need to use a different normalization
factor.
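
The "louder by a factor of blocksize" behaviour can be reproduced with
a short Python sketch. This is only an illustration: it uses a naive
full complex DFT instead of Pd's half-spectrum [rfft~]/[rifft~] pair,
and the function names are invented here.

```python
import cmath

def dft(block):
    """Naive forward DFT (no normalization)."""
    n = len(block)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, x in enumerate(block))
            for k in range(n)]

def unnormalized_idft(spectrum):
    """Naive inverse DFT that, like the round trip above, skips the 1/N."""
    n = len(spectrum)
    return [sum(c * cmath.exp(2j * cmath.pi * k * i / n)
                for k, c in enumerate(spectrum)).real
            for i in range(n)]

orig = [0.13004, 0.26951, 0.40352, 0.52934, 0.64446, 0.74649, 0.83341, 0.90344]
louder = unnormalized_idft(dft(orig))      # N times too loud
restored = [v / len(orig) for v in louder] # divide by blocksize to normalize

print(louder[0] / orig[0])  # ratio is the blocksize, up to rounding
print(restored)
```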

Of course it will only get interesting if we wreak havoc with the re/im
frequency data in the meantime. 

A simple FFT-based modification is shown in I03.resynthesis.pd: Here
every re/im sample pair (or every amplitude/phase-info for the
respective "oscillator" in rifft~) is multiplied by some value
retrieved from the gain table through tabreceive~. If this table has a
1 at a certain sample, this data is passed unchanged; if it has a 0 at
another sample, then that oscillator is muted. This is a filter
operation, and it only affects the amplitudes of the internal
oscillators. 

You might ask: "Why only the amplitudes? What about the phases? You
said, they are also encoded in the re/im data? Are you cheating
again?!" Read on.

If we scale the re/im pair by a value x (a gain, so x >= 0), then the
amplitudes will be scaled by x as well: 

 amp(x*re,x*im) = sqrt((x * re)^2 + (x * im)^2)
                = sqrt (x^2 * (re^2+im^2))
                = sqrt(x^2) * sqrt(re^2+im^2)
                = x * amp(re,im)

However the phases will stay the same! Proof:

 phs(x*re,x*im) = atan((x*im)/(x*re)) = atan(im/re) = phs(re,im)

Get it? That's why for such modifications you can omit the phase
calculation with atan etc.
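
The same argument can be checked numerically. The Python sketch below
scales one re/im pair (the second pair from the blocksize-8 example
above) by an arbitrarily chosen gain x = 0.25 and compares amplitude
and phase before and after.

```python
import math

re, im = -0.58717, 1.0317  # second re/im pair of the example block
x = 0.25                   # arbitrary positive gain, chosen for illustration

amp_before = math.hypot(re, im)
phs_before = math.atan2(im, re)

amp_after = math.hypot(x * re, x * im)
phs_after = math.atan2(x * im, x * re)

print(amp_after / amp_before)  # ratio equals the gain x, up to rounding
print(phs_after - phs_before)  # difference is zero, up to rounding
```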

Note that you need to do this multiplication *on every block* again
and again, because the data coming out of [rfft~] is constantly
updated - it still is an audio signal! That's why a [tabreceive~] is
used: Although the table received is not changing all the time, we
still need to read it again on every block and make a signal out of
it.

Now for a simple, amplitude-dependent gating or filtering, you first
need to calculate the actual amplitude using the formula above. Then
compare it to a value and multiply the original re/im-pairs with 0 or
1 depending on the result to change the amplitudes used in the
resynthesis. 
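
The steps just described can be sketched for a single block in Python.
This is not a translation of the attached patch, only a minimal
illustration under some assumptions: a naive full complex DFT, no
windowing or overlap, amplitudes normalized by dividing by N before
the comparison, and made-up names such as spectral_gate.

```python
import cmath
import math

def dft(block):
    """Naive forward DFT (no normalization)."""
    n = len(block)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, x in enumerate(block))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse DFT, normalized by 1/N."""
    n = len(spectrum)
    return [sum(c * cmath.exp(2j * cmath.pi * k * i / n)
                for k, c in enumerate(spectrum)).real / n
            for i in range(n)]

def spectral_gate(block, threshold):
    """Mute every bin whose normalized amplitude is not above threshold."""
    n = len(block)
    spectrum = dft(block)
    # abs(c) is sqrt(re^2 + im^2); the comparison yields a gain of 0 or 1
    # that multiplies the original re/im pair, so the phase is untouched.
    gated = [c * (1.0 if abs(c) / n > threshold else 0.0) for c in spectrum]
    return idft(gated)

# Test block: a loud partial in bin 2 plus a quiet partial in bin 5
N = 16
loud = [math.cos(2 * math.pi * 2 * i / N) for i in range(N)]
quiet = [0.1 * math.cos(2 * math.pi * 5 * i / N) for i in range(N)]
mixed = [a + b for a, b in zip(loud, quiet)]

gated = spectral_gate(mixed, 0.2)
print(max(abs(g - l) for g, l in zip(gated, loud)))  # tiny: only the loud partial is left
```

With this normalization a cosine of amplitude A shows up with
amplitude A/2 in its bin, so a threshold of 0.2 lets the loud partial
pass and mutes the quiet one.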

Attached specgate.pd illustrates this and also has a comparison of the
windowed and unwindowed FFT. The windowing affects the quality of the
result and also your normalization factors.
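
The windowed subpatch in specgate.pd divides by 3N/2 and notes in a
comment that 3/2 is the gain of overlap-4 reconstruction when the Hann
window is applied twice. A small Python sketch can check that factor.
It assumes a Hann window of the form 0.5 - 0.5*cos(2*pi*n/N) with
N = 512 and a hop of N/4.

```python
import math

N = 512       # block size, as in [block~ 512 4]
HOP = N // 4  # overlap 4 means a new block starts every N/4 samples

# Hann window: 0.5 - 0.5 * cos(2*pi*n/N)
hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]

# Each output sample is covered by 4 overlapping blocks, and every block
# was multiplied by the window twice (before the FFT and after the inverse
# FFT), so the total gain is the sum of the squared window at 4 offsets.
gain = [sum(hann[(n + k * HOP) % N] ** 2 for k in range(4))
        for n in range(HOP)]

print(min(gain), max(gain))  # both close to 1.5
```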

Ciao
-- 
 Frank Barknecht                 _ ______footils.org_ __goto10.org__
-------------- next part --------------
#N canvas 0 0 819 706 10;
#X obj 376 122 block~ 8;
#X msg 411 189 bang;
#X obj 71 175 rfft~;
#X obj 71 110 tabreceive~ \$0-8;
#X msg 390 75 bang;
#N canvas 0 0 450 300 (subpatch) 0;
#X array \$0-8 8 float 0;
#X coords 0 1 7 -1 200 140 1;
#X restore 503 49 graph;
#X obj 376 95 tabwrite~ \$0-8;
#X obj 376 51 osc~ 1000;
#X obj 411 219 print~ orig;
#X msg 408 302 bang;
#X obj 408 332 print~ re;
#X msg 410 246 bang;
#X obj 410 276 print~ img;
#X msg 405 438 bang;
#X obj 71 324 rifft~;
#X obj 405 471 print~ after-fft;
#X obj 72 360 /~ 8;
#X obj 164 279 *~;
#X obj 193 279 *~;
#X obj 163 309 sqrt~;
#X msg 404 370 bang;
#X obj 404 407 print~ amp;
#X connect 1 0 8 0;
#X connect 2 0 10 0;
#X connect 2 0 14 0;
#X connect 2 0 17 0;
#X connect 2 0 17 1;
#X connect 2 1 12 0;
#X connect 2 1 14 1;
#X connect 2 1 18 1;
#X connect 2 1 18 0;
#X connect 3 0 2 0;
#X connect 3 0 8 0;
#X connect 4 0 6 0;
#X connect 7 0 6 0;
#X connect 9 0 10 0;
#X connect 11 0 12 0;
#X connect 13 0 15 0;
#X connect 14 0 16 0;
#X connect 16 0 15 0;
#X connect 17 0 19 0;
#X connect 18 0 19 0;
#X connect 19 0 21 0;
#X connect 20 0 21 0;
-------------- next part --------------
#N canvas 315 82 813 656 12;
#N canvas 131 161 845 700 fft-analysis 0;
#X obj 110 84 inlet~;
#X obj 110 133 rfft~;
#X obj 110 455 rifft~;
#X obj 110 597 outlet~;
#X obj 208 388 >~ 0.8;
#X obj 207 179 *~;
#X obj 242 182 *~;
#X obj 111 424 *~;
#X obj 158 427 *~;
#X obj 208 240 q8_sqrt~;
#X obj 257 342 inlet;
#X obj 543 348 print~;
#X msg 544 322 bang;
#X text 298 243 calculate amplitudes;
#X obj 109 482 /~ 512;
#X obj 209 291 /~ 512;
#X obj 471 35 block~ 512;
#X obj 208 209 +~ 1e-20;
#X text 295 210 protect against divide-by-zero;
#X text 175 483 divide by N to normalize;
#X text 277 389 only let amps > x pass;
#X text 288 295 normalize: divide by N.;
#X connect 0 0 1 0;
#X connect 1 0 7 0;
#X connect 1 0 5 0;
#X connect 1 0 5 1;
#X connect 1 1 8 0;
#X connect 1 1 6 0;
#X connect 1 1 6 1;
#X connect 2 0 14 0;
#X connect 4 0 8 1;
#X connect 4 0 7 1;
#X connect 5 0 17 0;
#X connect 6 0 17 0;
#X connect 7 0 2 0;
#X connect 8 0 2 1;
#X connect 9 0 15 0;
#X connect 10 0 4 1;
#X connect 12 0 11 0;
#X connect 14 0 3 0;
#X connect 15 0 4 0;
#X connect 15 0 11 0;
#X connect 17 0 9 0;
#X restore 158 340 pd fft-analysis;
#X obj 158 479 dac~;
#N canvas 35 66 592 433 Hann-window 0;
#N canvas 0 0 450 300 (subpatch) 0;
#X array \$0-hann 512 float 0;
#X coords 0 1 511 0 200 120 1;
#X restore 293 249 graph;
#X msg 171 263 0;
#X obj 65 312 osc~;
#X obj 65 264 samplerate~;
#X obj 65 335 *~ -0.5;
#X obj 65 358 +~ 0.5;
#X obj 57 383 tabwrite~ \$0-hann;
#X text 279 241 1;
#X text 272 359 0;
#X text 288 372 0;
#X obj 65 288 / 512;
#X obj 57 241 bng 15 250 50 0 empty empty empty 0 -6 0 8 -262144 -1
-1;
#X text 336 221 Hann window;
#X text 113 310 period 512;
#X text 90 215 recalculate Hann;
#X text 125 230 window table;
#X obj 57 146 loadbang;
#X msg 79 179 \; pd dsp 1;
#X text 40 27 The Hann window is now recomputed on 'loadbang' to make
the file smaller (it doesn't have to be saved with the array.);
#X text 474 375 511;
#X connect 1 0 2 1;
#X connect 2 0 4 0;
#X connect 3 0 10 0;
#X connect 4 0 5 0;
#X connect 5 0 6 0;
#X connect 10 0 2 0;
#X connect 11 0 3 0;
#X connect 11 0 1 0;
#X connect 11 0 6 0;
#X connect 16 0 11 0;
#X connect 16 0 17 0;
#X restore 462 86 pd Hann-window;
#X obj 193 404 hsl 128 15 0 127 0 0 empty empty empty -2 -8 0 10 -262144
-1 -1 0 1;
#X obj 190 426 dbtorms;
#X obj 291 295 hsl 128 15 0 127 0 0 empty empty empty -2 -8 0 10 -262144
-1 -1 5500 1;
#X obj 288 317 dbtorms;
#X floatatom 312 343 5 0 0 0 - - -;
#X obj 90 392 env~;
#X floatatom 90 418 5 0 0 0 - - -;
#N canvas 131 161 845 700 fft-windowed 0;
#X obj 111 49 inlet~;
#X obj 110 133 rfft~;
#X obj 110 455 rifft~;
#X obj 110 597 outlet~;
#X obj 208 388 >~ 0.8;
#X obj 207 179 *~;
#X obj 242 182 *~;
#X obj 111 424 *~;
#X obj 158 427 *~;
#X obj 208 240 q8_sqrt~;
#X obj 257 342 inlet;
#X obj 543 348 print~;
#X msg 544 322 bang;
#X text 298 243 calculate amplitudes;
#X obj 208 209 +~ 1e-20;
#X text 295 210 protect against divide-by-zero;
#X text 277 389 only let amps > x pass;
#X obj 137 85 tabreceive~ \$0-hann;
#X obj 112 107 *~;
#X obj 471 35 block~ 512 4;
#X obj 135 538 tabreceive~ \$0-hann;
#X obj 110 560 *~;
#X text 189 476 divide by 3N/2 (factor of N because rfft and rifft
aren't normalized \, and 3/2 is the gain of overlap-4 reconstruction
when Hann window function is applied twice.);
#X obj 109 492 /~ 768;
#X obj 209 291 /~ 768;
#X text 288 295 normalize;
#X connect 0 0 18 0;
#X connect 1 0 7 0;
#X connect 1 0 5 0;
#X connect 1 0 5 1;
#X connect 1 1 8 0;
#X connect 1 1 6 0;
#X connect 1 1 6 1;
#X connect 2 0 23 0;
#X connect 4 0 8 1;
#X connect 4 0 7 1;
#X connect 5 0 14 0;
#X connect 6 0 14 0;
#X connect 7 0 2 0;
#X connect 8 0 2 1;
#X connect 9 0 24 0;
#X connect 10 0 4 1;
#X connect 12 0 11 0;
#X connect 14 0 9 0;
#X connect 17 0 18 1;
#X connect 18 0 1 0;
#X connect 20 0 21 1;
#X connect 21 0 3 0;
#X connect 23 0 21 0;
#X connect 24 0 4 0;
#X connect 24 0 11 0;
#X restore 436 272 pd fft-windowed;
#X obj 567 226 hsl 128 15 0 127 0 0 empty empty empty -2 -8 0 10 -262144
-1 -1 5400 1;
#X obj 564 248 dbtorms;
#X obj 434 453 dac~;
#X obj 469 378 hsl 128 15 0 127 0 0 empty empty empty -2 -8 0 10 -262144
-1 -1 10500 1;
#X obj 466 400 dbtorms;
#X floatatom 586 274 5 0 0 0 - - -;
#X text 39 342 unwindowed:;
#X obj 435 425 *~ 0;
#X obj 159 451 *~ 0;
#X obj 157 86 tabplay~ \$0-sample;
#X obj 173 61 spigot 1;
#X obj 240 38 tgl 15 0 empty empty loop 17 7 0 10 -262144 -1 -1 0 1
;
#X msg 157 35 bang;
#N canvas 0 0 506 290 sample 0;
#X obj 76 84 openpanel;
#X obj 76 161 soundfiler;
#X obj 75 189 table \$0-sample;
#X msg 76 135 read -resize \$1 \$2-sample;
#X obj 76 110 pack s \$0;
#X obj 76 57 inlet;
#X connect 0 0 4 0;
#X connect 3 0 1 0;
#X connect 4 0 3 0;
#X connect 5 0 0 0;
#X restore 346 86 pd sample;
#X obj 346 61 bng 20 250 50 0 empty empty open-sample 0 -6 0 8 -262144
-1 -1;
#X obj 455 305 env~;
#X floatatom 455 331 5 0 0 0 - - -;
#X connect 0 0 8 0;
#X connect 0 0 19 0;
#X connect 3 0 4 0;
#X connect 4 0 19 1;
#X connect 5 0 6 0;
#X connect 6 0 0 1;
#X connect 6 0 7 0;
#X connect 8 0 9 0;
#X connect 10 0 18 0;
#X connect 10 0 26 0;
#X connect 11 0 12 0;
#X connect 12 0 10 1;
#X connect 12 0 16 0;
#X connect 14 0 15 0;
#X connect 15 0 18 1;
#X connect 18 0 13 0;
#X connect 18 0 13 1;
#X connect 19 0 1 0;
#X connect 19 0 1 1;
#X connect 20 0 0 0;
#X connect 20 0 10 0;
#X connect 20 1 21 0;
#X connect 21 0 20 0;
#X connect 22 0 21 1;
#X connect 23 0 20 0;
#X connect 25 0 24 0;
#X connect 26 0 27 0;

