[PD] noise floor: median in audio signals, for peak extraction

Thu Dec 22 16:46:55 CET 2011

On 2011-12-20 at 21:29:00, Alexandre Torres Porres wrote:

> this would be kinda like using the [median] or [median_n] objects, but 
> over audio blocks and not number lists.

Ok, lots of heavy details below...

The way [median] and [median_n] are built, they take a lot of time. 
Sorting each window of 32 elements is slow with any full sort. Even the 
usually slow process of putting elements one by one in a sorted array is 
faster than any full sort like qsort() or [sort], in this kind of 
situation, because you need to look at the result of the sort after each 
insertion/removal.
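
To make the "one by one in a sorted array" idea concrete, here is a minimal sketch in Python (not Pd, and not how [median] or [median_n] are actually written). The name sliding_median and the use of a deque to remember arrival order are my own assumptions for illustration. Each new sample is inserted into an already-sorted list, the oldest one is removed, and the median is read off the middle, so no full sort ever happens.

```python
from bisect import bisect_left, insort
from collections import deque

def sliding_median(samples, window=32):
    """Yield the median of each full window, keeping the window sorted incrementally."""
    recent = deque()   # samples in arrival order, so we know which one leaves
    ordered = []       # the same samples, kept sorted at all times
    for x in samples:
        recent.append(x)
        insort(ordered, x)  # binary search, then an O(window) shift
        if len(recent) > window:
            old = recent.popleft()
            # remove one copy of the oldest sample from the sorted list
            del ordered[bisect_left(ordered, old)]
        if len(recent) == window:
            mid = window // 2
            if window % 2:
                yield ordered[mid]
            else:
                yield 0.5 * (ordered[mid - 1] + ordered[mid])

print(list(sliding_median([5, 1, 4, 2, 3], window=3)))
```

Each step still costs time proportional to the window size because of the shifting, but for a window of 32 that is much less work than re-sorting the whole window for every sample.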

But a sort based on binary-trees will make your median filter able to 
compete with the speed of FFT, for example. You keep your sorted «array» 
as a sorted binary tree and this makes it fast to insert, delete, and find 
the middle. But you can't do that with C++'s std::map because it offers 
no way to find the middle quickly... you'd need something special.

There may be ways to bend quicksort so that it can do many similar sorts 
quickly, but you can't do that with qsort() nor anything based on it.

I don't know exactly how zexy's [sort] works. Apparently it doesn't use the quicksort 
method that I assumed it did, it uses the shellsort method instead, which 
is sometimes slower, sometimes faster. I wonder whether it could speed up 
your task if you made a kind of [median_n] that would reuse the 
already-sorted list to speed up the next sort. (This would not improve a 
quicksort, but I wonder whether it'd improve [sort]).

Generally, median-based methods are harder to work with, as they involve 
lots of comparisons and swaps and such, by sorting or by doing things that 
are like sorting; whereas mean-based methods involve simple fast passes 
of addition and multiplication. But the two give quite different results in 
many situations.
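
A small made-up example of how the two can differ: the numbers below are hypothetical bin magnitudes that I invented for illustration, with a flat floor near 1.0 and a single strong peak. They are not measurements from any real spectrum.

```python
from statistics import mean, median

# Hypothetical magnitudes of seven neighbouring spectrum bins:
# a noise floor around 1.0 with one strong partial in the middle.
bins = [1.0, 1.2, 0.9, 50.0, 1.1, 0.8, 1.0]

floor_mean = mean(bins)      # one pass of additions, but pulled up by the peak
floor_median = median(bins)  # needs the values in order, but ignores the peak

print(floor_mean, floor_median)
```

Here the mean comes out at about 8 because the single peak dominates the sum, while the median stays at 1.0. Whether that difference matters for a given patch depends on how many bins in each window are peaks.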

> Since there's the need of calculating this in and using the result back 
> in the same block round into the audio chain, I can't put the spectrum 
> into a table, and then calculate the median over bits of it.

With [tabsend~] and a [bang~], you can send a whole block into the 
message domain, compute on it, and get it back as a signal, one block later.

> But then, how to do it? Should I be able to pull this out only if I 
> write a "median~" or [noise_floor~] external?
> 
> Or somehow there's another way to do this with some existing external, 
> or a similar technique, or even some audio math trick using [fexpr~] or 
> something?

I don't think you can do any reasonable sort using [fexpr~]... except 
perhaps a strange undecipherable network of a hundred [fexpr~] or so. I 
think it's easier to write an external.

> This has to do with the other post I did about a project that attempts 
> to isolate notes into a chord in a spectrum, something like melodyne 
> does.

So, why is a mean (average) not good enough, while a median would be good 
enough? It's possible, but I'd like to hear an explanation.

  ______________________________________________________________________
| Mathieu BOUCHARD ----- téléphone : +1.514.383.3801 ----- Montréal, QC