[PD-dev] strings

Sun Dec 17 21:37:33 CET 2006

Mathieu Bouchard wrote:
> On Sat, 16 Dec 2006, Martin Peach wrote:
>
>>> What if strings could be automatically cast to symbols for externals 
>>> that would rather have symbols, and vice-versa?
>> I have written an external asc2sym that takes lists of bytes and 
>> splits them into symbols based on the argument(s) which are characters.
>> But it seems important to avoid symbols as much as possible to avoid 
>> filling up the symbol table with symbols that are referenced only once.
>
> Yes, but my reason for wanting this, is that all externals currently 
> available understand symbols but not strings. So, what if you want to 
> make strings as widely used as possible, as easily as possible, and 
> working with all externals currently available in Pd?
>
> You make them work as strings when they can, and
> You make them work as symbols when they must.
There would be two objects, [stringtosymbol] and [symboltostring] that 
you could put between string and symbol objects. Of course some strings 
would get impossibly mangled this way but that's because of the way 
symbols work.
>
>> A string could be considered unused when its length is set to 0.
>
> If you want to use a string as a mutable buffer, then you want to be 
> able to have 0-length strings, as a boundary condition: you start with 
> nothing and then add to it. You don't want to have to start with 
> something just because setting the length to 0 would delete it.
>
Yes, there's no reason not to have 0-length strings. And no reason to 
trash them when they are unused either, since they don't take up more 
space than any other object.
> It seems that you are suggesting that the deallocation would be 
> user-controlled? Then how do you prevent the user from crashing pd?
I'm suggesting that a [string] be like any other object and be 
deallocated when the patcher is closed. It's basically a variable-length 
list of bytes. It would contain methods to allocate and deallocate 
memory via malloc() or pd's getbytes(), which uses calloc().
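
A minimal sketch of such a byte container, under stated assumptions: the
names bytestring, bytestring_init and bytestring_free are hypothetical and
not part of pd, and plain calloc() stands in for pd's getbytes(). It shows
that a 0-length string is a valid state and that memory is released only
by the owner.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical string type: a counted run of bytes, so 0 and CR are
   legal content and no terminator is needed. */
typedef struct {
    unsigned char *data;  /* NULL when length is 0 */
    size_t length;        /* number of bytes held */
} bytestring;

/* Set up a zero-filled string of n bytes. calloc() stands in for pd's
   getbytes(). Returns 1 on success, 0 if allocation fails. */
static int bytestring_init(bytestring *s, size_t n)
{
    assert(s != NULL);
    s->data = NULL;
    s->length = 0;
    if (n == 0)
        return 1;  /* an empty string is valid and needs no buffer */
    s->data = (unsigned char *)calloc(n, 1);
    if (s->data == NULL)
        return 0;
    s->length = n;
    return 1;
}

/* Release the buffer; the owner would call this when it is destroyed,
   for example when the patcher is closed. */
static void bytestring_free(bytestring *s)
{
    assert(s != NULL);
    free(s->data);
    s->data = NULL;
    s->length = 0;
}
```

Setting the length to 0 here leaves an empty but usable string, so a
buffer can start empty and grow later.
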
> If you use a weak-pointer as an intermediate (like t_gpointer or 
> t_gfxstub), then you still have to manage reference counts. Whatever 
> you do for the user, you have to know more about externals' behaviour 
> than what they tell you now, because right now they don't deallocate 
> atoms explicitly.
>
> But if strings are going to be deallocated explicitly and there is not 
> going to be any checks, why not instead make something that will allow 
> users to deallocate symbols. It's about as safe as that and you don't 
> need to introduce a string type.
Symbols are difficult to work with because their content gets 
interpreted. For example, if I write a comment "MP 20061214" it gets 
converted into "MP 2.00612e+007", and if I want a symbol to have spaces 
or carriage returns in it, it won't get created, which is very annoying 
when a lot of serial hardware wants to see a CR before it processes a 
message.
Also every time I change a symbol, it gets added to the global symbol 
table. So adding one character at a time to a string would result in 
that many symbols being created.
A string as I see it is closer to a list, and could be operated on with 
objects like the list objects -- append, split, etc.
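
As a rough illustration of a list-style operation on raw bytes, here is a
sketch of a split at an index, loosely modeled on what a list split does.
The type bytes and the functions bytes_copy and bytes_split are
hypothetical names, not pd API. Nothing is interpreted and nothing is
added to a symbol table; bytes such as 0, CR and 255 pass through intact.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted byte string. */
typedef struct {
    unsigned char *data;
    size_t length;
} bytes;

/* Copy n bytes from src into a freshly allocated dst. Returns 1 on
   success, 0 on allocation failure (dst is left empty). */
static int bytes_copy(bytes *dst, const unsigned char *src, size_t n)
{
    dst->data = NULL;
    dst->length = 0;
    if (n == 0)
        return 1;
    dst->data = (unsigned char *)malloc(n);
    if (dst->data == NULL)
        return 0;
    memcpy(dst->data, src, n);
    dst->length = n;
    return 1;
}

/* Split src at index 'at': head receives the first 'at' bytes, tail the
   rest. An index past the end is clamped, giving an empty tail. */
static int bytes_split(const bytes *src, size_t at, bytes *head, bytes *tail)
{
    const unsigned char *rest;
    assert(src != NULL && head != NULL && tail != NULL);
    tail->data = NULL;
    tail->length = 0;
    if (at > src->length)
        at = src->length;
    rest = (at < src->length) ? src->data + at : NULL;
    if (!bytes_copy(head, src->data, at))
        return 0;
    if (!bytes_copy(tail, rest, src->length - at)) {
        free(head->data);
        head->data = NULL;
        head->length = 0;
        return 0;
    }
    return 1;
}
```
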

>
>> Memory would need to be dynamically allocated in small blocks.
>
> What do you mean "in small blocks" ?
Whatever is most efficient. If malloc is better at allocating blocks of 
256 bytes than blocks of 1 then it's better to work with multiples of 
256. It seems inefficient to allocate 65536 bytes for every string at 
creation time.
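
One way to sketch this block-wise growth, assuming a block size of 256
purely for illustration (the best value depends on the allocator). The
names BLOCK, growstring, round_up and growstring_append are hypothetical.
The buffer starts empty and is reallocated only when an append crosses a
block boundary, so adding one character at a time does not mean one
allocation per character.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK 256  /* assumed granularity, not a measured optimum */

typedef struct {
    unsigned char *data;
    size_t length;    /* bytes in use */
    size_t capacity;  /* bytes allocated, always a multiple of BLOCK */
} growstring;

/* Round n up to the next multiple of BLOCK (0 stays 0). */
static size_t round_up(size_t n)
{
    return (n + BLOCK - 1) / BLOCK * BLOCK;
}

/* Append n bytes, growing the buffer only when the current capacity is
   exceeded. Returns 1 on success, 0 if allocation fails, in which case
   the string is left unchanged. */
static int growstring_append(growstring *s, const unsigned char *src, size_t n)
{
    size_t needed;
    assert(s != NULL && (src != NULL || n == 0));
    needed = s->length + n;
    if (needed > s->capacity) {
        size_t newcap = round_up(needed);
        unsigned char *p = (unsigned char *)realloc(s->data, newcap);
        if (p == NULL)
            return 0;
        s->data = p;
        s->capacity = newcap;
    }
    if (n > 0)
        memcpy(s->data + s->length, src, n);
    s->length = needed;
    return 1;
}
```

With these assumptions, appending 300 single bytes triggers two
allocations (at 1 byte and at 257 bytes) instead of 300.
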
>
>> The API should return "no method for string" if the external doesn't 
>> implement strings.
>
> That's aiming low. Why shouldn't there be any automatic casts between 
> the two?
Because it would require rewriting more of the pd core, and because a 
lot of strings can't be made into symbols (strings can contain any 
integer in [0...255] but symbols cannot). Having the two converter 
objects [stringtosymbol] and [symboltostring] is easier. The "no method 
for string" message would come from pd, not the external, so the 
external doesn't need to implement any string methods.
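
The idea that the core, not the external, reports the missing method can
be sketched with a toy dispatcher. This is not pd's actual dispatch code;
toy_class, toy_entry, toy_dispatch and toy_count are hypothetical names,
and the error text format is only an approximation. An existing external
registers its methods as before, and a "string" message simply finds no
entry.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef void (*toy_method)(void *self);

typedef struct {
    const char *selector;  /* e.g. "symbol", "float", "string" */
    toy_method fn;
} toy_entry;

typedef struct {
    const char *name;          /* object name as shown in the patch */
    const toy_entry *methods;  /* methods the external registered */
    size_t count;
} toy_class;

/* Example method: counts how often it was called. */
static void toy_count(void *self)
{
    ++*(int *)self;
}

/* Look up selector in the class's method table and call it. On a miss,
   the dispatcher itself writes the error text into err and returns 0,
   so the external needs no string-specific code. */
static int toy_dispatch(const toy_class *c, void *self, const char *selector,
                        char *err, size_t errsize)
{
    size_t i;
    assert(c != NULL && selector != NULL && err != NULL && errsize > 0);
    for (i = 0; i < c->count; i++) {
        if (strcmp(c->methods[i].selector, selector) == 0) {
            c->methods[i].fn(self);
            err[0] = '\0';
            return 1;
        }
    }
    snprintf(err, errsize, "%s: no method for %s", c->name, selector);
    return 0;
}
```
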

Martin