[PD] standard encoding of pd files in each system?

Sat Jan 29 17:12:02 CET 2011

On Fri, 28 Jan 2011, Bryan Jurish wrote:

> iirc, Miller has indicated in the past that he feels this sort of thing 
> should be done using arrays.

But a feeling is but a feeling. Now, how about a justification ?
But that's not the sort of thing one gets from Miller often.

>  (B) you must scale all size attributes (e.g. for re-allocation) by
> 1.0/sizeof(t_float), so to get an accurate byte length that is not a
> multiple of sizeof(t_float), you need to actually store that length
> additionally somewhere else

sizeof(t_float) is always a power of two, isn't it ? I haven't heard of 
anyone using 80-bit or 96-bit floats as t_float or t_sample.

Thus a size stored as a 32-bit float will be exact up to 16777216 (2^24).

This holds regardless of whether you store size*4, size, or size/4 : floats 
are largely scale-independent, and perfectly so when the scale factors are 
powers of two (provided you don't overflow by scaling by pow(2,128) or so).
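
A minimal sketch of that claim, assuming t_float is a 32-bit IEEE-754 
float (the helper names are made up for illustration, they are not part 
of pd) :

```c
#include <stdint.h>

/* Returns 1 if the integer n survives a round trip through a 32-bit float.
   Assumption: float is IEEE-754 single precision (24-bit significand). */
static int fits_in_float(uint32_t n)
{
    volatile float f = (float)n;   /* volatile: force a real 32-bit store */
    return (uint32_t)f == n;
}

/* Returns 1 if dividing a byte count by 4 and multiplying back is lossless.
   Scaling by a power of two only changes the exponent, not the significand. */
static int pow2_scaling_is_exact(uint32_t nbytes)
{
    volatile float scaled = (float)nbytes / 4.0f;  /* size in 4-byte units */
    volatile float back = scaled * 4.0f;
    return (uint32_t)back == nbytes;
}
```

With these, fits_in_float(16777216) holds but fits_in_float(16777217) 
does not, and pow2_scaling_is_exact(13) holds even though 13 is not a 
multiple of 4.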

I think you could read a bit about the IEEE-754 standard :
   http://en.wikipedia.org/wiki/IEEE_754

Better still, read a short, direct tutorial that makes it obvious what 
will be rounded and what won't :
   http://kipirvine.com/asm/workbook/floating_tut.htm

>  (C) saving array data with a patch and re-loading can cause data loss
> (float truncation may mess up raw byte values)

For integers, all values from -1000000 to 1000000 will be saved correctly 
(those two bounds will be written as -1e+06 and 1e+06, and all the rest 
will look like plain integers).
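
A small sketch of where that limit comes from, assuming floats are 
written to the patch file with a printf-style "%g" (six significant 
digits by default). That is my assumption about the save format, and the 
function name is hypothetical :

```c
#include <stdio.h>
#include <stdlib.h>

/* Format v with "%g" (6 significant digits by default), parse the text
   back, and report whether the value survived the round trip.
   Assumption: the save format behaves like "%g". */
static int survives_text_roundtrip(double v)
{
    char buf[64];
    snprintf(buf, sizeof buf, "%g", v);
    return strtod(buf, NULL) == v;
}
```

Under that assumption, 999999 and 1000000 come back intact, while 
1000001 comes back as 1000000 because it needs a seventh digit.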

>  (D) it's not really portable (byte order problems with load/save)

Byte order problems won't happen with floats saved as text. They will 
happen with floats saved as binary. They will also happen with UCS-2 text 
saved as two floats per code point (no matter how you save the floats), 
but if you use UTF-8 instead, or if you use one float per code point, that 
aspect will be safe.
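
To illustrate the binary case, here is a sketch that simulates what a 
machine of the opposite byte order sees when it reads a raw float dump. 
It assumes sizeof(float) is 4, and the function name is made up :

```c
#include <string.h>

/* Reinterpret a float's four bytes in reverse order, which is what a
   machine of the opposite endianness gets from a raw binary dump.
   Assumption: sizeof(float) == 4. */
static float read_with_swapped_bytes(float x)
{
    unsigned char b[4], r[4];
    float out;
    memcpy(b, &x, 4);
    r[0] = b[3]; r[1] = b[2]; r[2] = b[1]; r[3] = b[0];
    memcpy(&out, r, 4);
    return out;
}
```

The character 'A' stored as 65.0f does not read back as 65.0f once its 
bytes are swapped, whereas the text "65" is the same sequence of bytes 
on every machine.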

> 2) If otoh you let the array remain a t_float* and just assign the
> floats byte ((unsigned) char) or even wide character (wchar) values, then:
>  (A) you potentially waste a lot of memory
> (strlen(str)*(sizeof(float)-1) bytes)

In 2011, wasting a lot of RAM is not a problem. Wasting too much RAM can 
be a problem, and that's very relative, as quite often, the solution is to 
wait until RAM is less expensive. I like the idea of not wasting any RAM, 
but I recognise that this is because I got used to thinking about ways to 
reduce waste, not because it's always good to worry about it.

Text is usually a lot smaller than video. It's not uncommon for me to 
store a buffer of 64 frames of video in colour. In 640x480, that's over 55 
megs, and that's tiny compared to the total amount of RAM the computer 
has. How often do you need that much text at once in RAM ?
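
The arithmetic, as a sketch : I am assuming 3 channels at one byte each 
for the colour video, and one float per character for the text. The 
function names are only for illustration.

```c
#include <stddef.h>

/* Bytes used by an uncompressed video buffer at one byte per channel.
   Assumption: 3 channels (e.g. RGB) for "colour". */
static size_t video_buffer_bytes(size_t width, size_t height,
                                 size_t channels, size_t frames)
{
    return width * height * channels * frames;
}

/* Bytes used by nchars characters stored as one float each. */
static size_t text_as_floats_bytes(size_t nchars)
{
    return nchars * sizeof(float);
}
```

video_buffer_bytes(640, 480, 3, 64) gives 58982400 bytes, about 56 megs. 
With 4-byte floats, the same amount of RAM holds over 14 million 
characters at one float each.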

>  (C) if you really want to store your string data in an array, you can
> use [str] or [pdstring] together with e.g. [tabdump] and [tabset] from
> zexy, which just makes the conversion overhead explicit.

GridFlow's grids support the byte format (unsigned char). This is one of 
the six allowed grid formats, and perhaps the 2nd most used (after signed 
int).

> I think there are workarounds for both techniques, but not without 
> patching the pd core code, and if we're going to patch the core code, we 
> might as well take a patch that does the job "right" (i.e. Martin's)...

If all of this can work as externals without hairy workarounds, then you 
don't need to obsess over patching pd's core code, and that's a good 
thing, especially if the core you aim to patch is vanilla's.

  _______________________________________________________________________
| Mathieu Bouchard ---- tél: +1.514.383.3801 ---- Villeray, Montréal, QC