[PD] bit crusher abstraction

Fri Apr 30 03:56:15 CEST 2010

>> Reduced bit depth (which is what I think 'bit-crushing' means)  can be
>> achieved by dividing the signal by x, pass it through something like
>> [int~], multiply it by x again. An [int~] can be implemented by using
>> [wrap~] and [-~], which are both vanilla.
>
> It is also worth considering adding a DC offset. Your [wrap~] solution
> implements a floor-function, that is, rounding downwards. You can make it
> round to closest, by adding x/2 before rounding downwards.
>
> [expr int($v1)] is rounding zerowards (thus output zero corresponds to a
> twice bigger input range as any other input). Thus it will behave in a
> weird unequal way.
>
> When the volume gets low, the _effective_ bits-per-sample gets very low,
> because the relative precision of integers is proportional to amplitude
> (which is not the case with floats). Therefore, even with 16-bit audio, if
> your amplitude is 0.0001 times the max, it will feel as if it were 3-bit
> audio. In such circumstances, the differences between the possible
> roundings will become quite audible.

Bit-depth transitions are really complicated things, especially when
going from normalized float to int (and among different utilities
which do this, e.g. libsndfile or ecasound, I've found several
different standards).  The biggest problem is what to do with the
asymmetry about zero with int representations -- there are more
negative numbers than positive in a two's complement int system.

Here are three formulas I've used for noise-shaping dither in csound
(the first one is more compatible with the bit-depth transitions that
occur when csound writes a file).

let X = the input signal and N = target bits:

1)  Y=(floor([2^(N-1)]*X)+0.5)/[2^(N-1)]
2)  Y=floor([2^(N-1)]*X+0.5)/[2^(N-1)]
3)  Y=floor([2^(N-1)]*X)/[2^(N-1)]

The only difference is whether the 0.5 is added, and whether it's
added before or after the floor function.
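
For anyone who wants to poke at the numbers, here is a rough Python sketch
of the three formulas. It is not csound code; the function name quantize
and its formula argument are made up for illustration, and X is assumed to
be a float nominally in -1 to +1.

```python
import math

def quantize(x, n_bits, formula):
    """Quantize float sample x to n_bits using formula 1, 2 or 3 above."""
    scale = 2 ** (n_bits - 1)
    if formula == 1:   # 0.5 added after the floor
        return (math.floor(scale * x) + 0.5) / scale
    if formula == 2:   # 0.5 added before the floor
        return math.floor(scale * x + 0.5) / scale
    if formula == 3:   # no 0.5 at all
        return math.floor(scale * x) / scale
    raise ValueError("formula must be 1, 2 or 3")

# Columns: formula, output for 0.0, for 0.3, for -0.3 (8-bit target)
for f in (1, 2, 3):
    print(f, quantize(0.0, 8, f), quantize(0.3, 8, f), quantize(-0.3, 8, f))
# 1 0.00390625 0.30078125 -0.30078125
# 2 0.0 0.296875 -0.296875
# 3 0.0 0.296875 -0.3046875
```

Only the first formula turns a zero input into a nonzero output, and only
the first gives outputs of equal magnitude for +0.3 and -0.3.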

So let's say you're moving from floats (-1 to +1) to 8-bits:

In the first, one multiplies the signal by 2^7, getting a signal which
varies from -128 to +128 (floats).  Flooring that will result in ints
in the same range (+128 after the floor will be a relative rarity in
real signals, though, so the effective range is -128 to 127).  Adding
0.5 to all of that gives a range of -127.5 to 127.5 (with the very
rarely encountered 128.5).  Then division by 128 gives a signal which
varies between two values equally spaced from 0, but a constant 0 in
will result in a DC offset out of half a quantization step, 0.5/128
(it's a mid-rise function with 256 possible values).
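
A quick Python check of that description, as a sketch only: the name
mid_rise and the sweep grid are my own choices, and the grid spacing is a
power of two so the float arithmetic stays exact.

```python
import math

SCALE = 2 ** 7  # 8-bit target, so the scalar is 128

def mid_rise(x):
    """First formula: floor, then add 0.5, then rescale."""
    return (math.floor(SCALE * x) + 0.5) / SCALE

# Sweep -1.0 up to (but not including) +1.0 in steps of 1/2048.
inputs = [-1 + i / 2048 for i in range(4096)]
levels = sorted({mid_rise(x) * SCALE for x in inputs})

print(len(levels), levels[0], levels[-1])   # 256 -127.5 127.5
print(mid_rise(1.0) * SCALE)                # 128.5, only for an input of exactly +1.0
print(mid_rise(0.0))                        # 0.00390625, the DC offset for silence in
```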

In the second, one multiplies the signal by 2^7, getting a signal
which varies from -128 to +128 (floats).  Adding 0.5 to this will
result in a signal that varies from -127.5 to +128.5, and flooring
this gives -128 to +128 (ints).  This time, +128 will not be as rare.
This also is really just a rounding function, with x.5 values rounded
up.  It's a mid-tread function, so zeros in will result in zeros out.
At first glance it may seem as though this has one more than 2^8
values (128 negative values, 128 positive values, and 0 = 257 total
possible values).  However, if one were to graph the rounding
function, one would find that anything that started out between -127.5
and +127.5 will be rounded to -127 to +127 (255 values), while
anything lower than -127.5 will round to -128, and anything higher
than 127.5 will round to +128 -- the extremes of the staircase
function have half-quantization-step mappings, so the total effect is
still 8-bit.  This one has the advantage that both zero in and maxamp
(positive or negative) give "correct" output values as well, but the
disadvantage that one of the quantized steps is divided between the
top and the bottom of the total range.  Also, its graph is a
reflection of that of the first function about a diagonal.
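
The half-width steps at the extremes can be counted directly. This is a
sketch under my own assumptions: mid_tread_level is a made-up helper that
returns the integer level before the final division, and the input grid
sits on midpoints of 1/2048-wide cells so that no sample lands exactly on
a step boundary.

```python
import math
from collections import Counter

SCALE = 2 ** 7  # 8-bit target

def mid_tread_level(x):
    """Second formula, stopping before the division by SCALE."""
    return math.floor(SCALE * x + 0.5)

# 4096 evenly spaced inputs strictly inside -1..+1.
inputs = [-1 + (2 * i + 1) / 4096 for i in range(4096)]
counts = Counter(mid_tread_level(x) for x in inputs)

print(len(counts))                # 257 distinct levels, -128 through +128
print(counts[0], counts[1])       # 16 16  (a full step catches 16 grid points)
print(counts[-128], counts[128])  # 8 8    (each extreme catches only half as many)
```

So 255 full steps plus two half steps cover the same input range as 256
full steps would.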

In the third, one starts with the same math as the first, up to the
floor function, which gives an effective range of -128 to 127.  Then
one simply divides by the original scalar, which just gives the
mid-tread version of the function from the first formula (the first
formula minus half the quantization step), with zeros in, zeros out, but
one more negative value than positive.
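
The relationship between the first and third formulas is easy to confirm
numerically. Again just a sketch: formula1 and formula3 are names I made
up, and the sweep deliberately leaves out an input of exactly +1.0.

```python
import math

SCALE = 2 ** 7  # 8-bit target

def formula1(x):
    return (math.floor(SCALE * x) + 0.5) / SCALE

def formula3(x):
    return math.floor(SCALE * x) / SCALE

inputs = [-1 + i / 2048 for i in range(4096)]   # -1.0 up to just under +1.0
half_step = 0.5 / SCALE

# Formula 3 is formula 1 shifted down by half a quantization step.
print(all(formula1(x) - formula3(x) == half_step for x in inputs))  # True

levels = sorted({math.floor(SCALE * x) for x in inputs})
negatives = sum(1 for k in levels if k < 0)
positives = sum(1 for k in levels if k > 0)
print(levels[0], levels[-1], negatives, positives)  # -128 127 128 127
print(formula3(0.0))                                # 0.0
```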

Maybe sometime later I could post some pictures for two-bit audio.
I'm sure there are plenty on the web.  There's also the question of
whether, even for a bit-crushing effect, one would add dither.

Matt