[PD-dev] Pd Strings
Mathieu Bouchard
matju at artengine.ca
Fri Nov 23 11:17:24 CET 2007
On Wed, 14 Nov 2007, Hans-Christoph Steiner wrote:
> Using arrays as strings is an interesting idea. I don't think non-
> ascii charsets should be too big a deal, they are decently supported
> right now, without even trying :). The Pd floats should store UTF-16
> fine, which really covers basically everything. By the time UTF-32
> is used much, Pd will be using 64-bit floats.
UTF-8 and UTF-16 are just encodings over a variable number of bytes.
Storing as Pd floats is more like UCS-4, which is a Unicode encoding over
a fixed number of bytes, the difference being that UCS-4 uses uint32
instead of float32, or instead of float32 plus a type tag that is always
set to say "this is a float", or that plus padding because of 64-bit mode.
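To make the comparison concrete, here is a rough sketch in Python (not Pd code) of keeping one code point per float cell. The helper names are made up for illustration, and the round trip through a 4-byte IEEE 754 float is only a stand-in for a 32-bit Pd float:

```python
import struct

def as_float32(value):
    """Round-trip a number through a 4-byte float (assumed stand-in for a Pd float)."""
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]

def string_to_floats(text):
    """One code point per cell, fixed width, like UCS-4 but with float32 cells."""
    return [as_float32(ord(ch)) for ch in text]

def floats_to_string(cells):
    """Inverse mapping: each float cell is read back as an integer code point."""
    return "".join(chr(int(cell)) for cell in cells)

text = "tél"
cells = string_to_floats(text)
print(cells)                    # [116.0, 233.0, 108.0]
print(floats_to_string(cells))  # tél
```

Every cell has the same size regardless of the character, which is what makes this closer to UCS-4 than to a variable-width encoding.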
float32 represents every integer from 0 to 16777216 (2^24) exactly, so it
includes anything that uint24 can do. I don't think you need to go beyond
18 bits, let alone 24. (UTF-8 needs extra bits per byte to say whether the
character continues in the next byte; those don't count here, as we'd be
using a fixed size.)
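The 16777216 limit can be checked with a small Python sketch. Again this is only an illustration: the function names are hypothetical, and packing into a 4-byte IEEE 754 float is assumed to behave like a 32-bit Pd float:

```python
import struct

def as_float32(value):
    """Round-trip a number through a 4-byte float (assumed stand-in for a Pd float)."""
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]

def first_inexact_integer(start, stop):
    """Return the first integer in [start, stop) that float32 cannot hold exactly."""
    for n in range(start, stop):
        if as_float32(n) != n:
            return n
    return None

LIMIT = 2 ** 24  # 16777216

print(as_float32(LIMIT) == LIMIT)                        # True
print(as_float32(LIMIT + 1) == LIMIT + 1)                # False
print(first_inexact_integer(LIMIT - 1000, LIMIT + 1000)) # 16777217
print(as_float32(0x10FFFF) == 0x10FFFF)                  # True
```

The last line checks U+10FFFF, the highest Unicode code point, which sits well below the limit.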
Afaik, the difference between UTF-8 and UTF-16 is only that the latter
tends to take somewhat less space if most of your characters are not in
the printable ASCII range (32 to 126). This is because of the extra bits
I'm talking about. In practice, all you can do with UTF-16 can already be
done with UTF-8.
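For a feel of the sizes involved, here is a small Python sketch comparing byte counts. The sample strings and the function name are just illustrations, and "utf-16-le" is used so that no byte-order mark is counted:

```python
def encoded_sizes(text):
    """Byte counts for the same text in UTF-8, UTF-16 and a fixed 4-byte layout."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),  # little-endian, no BOM
        "fixed-4": 4 * len(text),                 # one 32-bit cell per character
    }

samples = ["Pd strings", "Montréal", "日本語のテキスト"]
for sample in samples:
    print(sample, encoded_sizes(sample))

# Both encodings carry the same information: decoding gives back the same text.
for sample in samples:
    assert sample.encode("utf-8").decode("utf-8") == sample
    assert sample.encode("utf-16-le").decode("utf-16-le") == sample
```

With these samples, UTF-16 only comes out smaller for the Japanese text (16 bytes against 24). For accented Latin letters such as "é" both encodings use two bytes, so the saving really shows up for characters from U+0800 upward.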
_ _ __ ___ _____ ________ _____________ _____________________ ...
| Mathieu Bouchard - tél:+1.514.383.3801, Montréal QC Canada