[PD-dev] strings
Bryan Jurish
moocow at ling.uni-potsdam.de
Sat Dec 16 10:55:57 CET 2006
morning,
On 2006-12-16 01:40:03, Mathieu Bouchard <matju at artengine.ca> appears to
have written:
> On Fri, 15 Dec 2006, Hans-Christoph Steiner wrote:
>
> An advantage using the list-of-bytes approach is that because each
> character can be represented by a rather large integer, it can be
> extended to work on lists-of-characters meaning quickly, if there is a
> [utf8decode] and [utf8encode] to turn bytes into characters and back;
> also it's a method that is available now and reuses the existing list
> objects; and it's a method that supports \0 (NUL) characters.
>
> Disadvantages are that it takes more time to convert to C strings and
> back, it takes more space in .pd files, it isn't readable as text in .pd
> files, it takes up to 4 times more space to represent in .pd files, and
> exactly 4 times more space in RAM (in the case that just iso-latin-1 is
> used), and also that you can't make lists of strings like that.
i count (sizeof(int)+sizeof(float)-1)*strlen(message) wasted bytes per
string object, not counting the selector. as i think we've discussed
before, using ieee floats, which should be able to losslessly encode a
24 bit integer, that can be tweaked down to
(sizeof(int)+sizeof(float)-1)*strlen(message)/3 on average, but on my
system (32 bit floats), that still amounts to one wasted byte per
character for the representation, and it's hellishly cryptic to boot.
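
to make the arithmetic above concrete, here is a rough sketch in python (not pd code). it assumes an atom is an int type tag plus a 32 bit float payload, as in the estimate above; the names wasted_bytes, pack3, unpack3, to_floats and from_floats are made up for illustration and exist in no pd library. it evaluates the overhead formula and checks that three bytes really do survive a round trip through a 32 bit float.

```python
import struct

# Assumed layout: an atom holds an int type tag and a float payload.
SIZEOF_INT = struct.calcsize("i")    # typically 4
SIZEOF_FLOAT = struct.calcsize("f")  # 4 for a 32 bit float

def wasted_bytes(message, bytes_per_atom=1):
    """Evaluate (sizeof(int)+sizeof(float)-1)*strlen(message)/bytes_per_atom."""
    per_atom = SIZEOF_INT + SIZEOF_FLOAT - 1
    return per_atom * len(message) / bytes_per_atom

def pack3(b0, b1, b2):
    """Pack three bytes into one 24 bit integer stored as a 32 bit float."""
    n = (b0 << 16) | (b1 << 8) | b2
    # Force the value through a real 32 bit float to show nothing is lost.
    return struct.unpack("f", struct.pack("f", float(n)))[0]

def unpack3(x):
    """Recover the three bytes from a float produced by pack3."""
    n = int(x)
    return (n >> 16) & 0xFF, (n >> 8) & 0xFF, n & 0xFF

def to_floats(message):
    """Encode a byte string as floats, three bytes each, zero-padded at the end."""
    padded = message + b"\0" * (-len(message) % 3)
    return [pack3(*padded[i:i + 3]) for i in range(0, len(padded), 3)]

def from_floats(floats, length):
    """Decode floats back to bytes; the original length is passed separately."""
    out = bytearray()
    for x in floats:
        out.extend(unpack3(x))
    return bytes(out[:length])

if __name__ == "__main__":
    msg = b"hello\x00pd"
    print(wasted_bytes(msg), wasted_bytes(msg, 3))
    print(from_floats(to_floats(msg), len(msg)) == msg)
```

note that the zero padding means the true length has to travel separately if \0 is to remain a legal character, which is part of why this packing is so cryptic.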
> (By the time we can have real strings, we can have nested-lists, and the
> other way around, because they'd use the same mechanisms. whether it's
> better to make them two types or one type, is a good question.)
... but then again, what else are ascii 0x1c-0x1f (28-31 =
{fs,gs,rs,us}) for? it's another ugly hack, would reserve some of the
ascii range, and would require additional parsing objects (potentially
constructable with [list]), but it's a possibility, should anyone
actually need nested lists as strings...
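
a minimal sketch of that separator hack, again in python rather than as pd objects: it assumes only two levels of nesting, using rs (0x1e) between inner lists and us (0x1f) between the items of one inner list. the helper names split_on, flatten and unflatten are hypothetical. the payload must not contain the reserved values itself, and empty lists cannot be told apart from a list holding one empty string.

```python
RS = 0x1E  # record separator: between inner lists
US = 0x1F  # unit separator: between items of one inner list

def split_on(values, sep):
    """Split a flat list of byte values at every occurrence of sep."""
    parts = [[]]
    for v in values:
        if v == sep:
            parts.append([])
        else:
            parts[-1].append(v)
    return parts

def flatten(nested):
    """Turn a list of lists of byte strings into one flat list of byte values."""
    out = []
    for i, inner in enumerate(nested):
        if i:
            out.append(RS)
        for j, item in enumerate(inner):
            if j:
                out.append(US)
            out.extend(item)
    return out

def unflatten(flat):
    """Rebuild the two-level structure from the flat list of byte values."""
    return [[bytes(unit) for unit in split_on(record, US)]
            for record in split_on(flat, RS)]

if __name__ == "__main__":
    nested = [[b"foo", b"bar"], [b"baz"]]
    flat = flatten(nested)
    print(flat)
    print(unflatten(flat) == nested)
```

deeper nesting would need gs and fs as well, which is where the additional parsing objects come in.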
please don't get me wrong: i'm all in favor of "real" strings, nested
lists, and associative arrays - i wrote [pdstring] because i needed to
send some generated text over OSC to someone who could only interpret
ascii values: i'm glad if it's helpful to anyone besides myself, and i
don't see much difficulty in adding support for low-level c-type string
operations ([toupper], [tolower], at some later point maybe even
regexes), but i can't bring myself to believe that the list-of-bytes
approach is really the "right" way to do it, although i don't have a
better idea at the moment...
marmosets,
Bryan
--
Bryan Jurish "There is *always* one more bug."
jurish at ling.uni-potsdam.de -Lubarsky's Law of Cybernetic Entomology