[PD-dev] float handling in [binfile] for UTF-8 handling

Mathieu Bouchard matju at artengine.ca
Tue Jan 31 17:42:15 CET 2012


On 2012-01-31 at 16:18:00, IOhannes m zmoelnig wrote:

> and btw, isn't utf-8 agnostic of byte-endianness? at least [2] suggests 
> this.

UTF-8, despite the existence of something named BOM (Byte Order Mark), 
has only a single byte order, so there is only one version of it.

The BOM was invented to distinguish the endianness of the two UTF-16 
encodings: the fully big-endian one, and the "little-endian" one (which 
is actually middle-endian, because beyond the byte order of the 16-bit 
units, UTF-16 is always big-endian: the high surrogate of a pair always 
comes first).
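
To make the "middle-endian" point concrete, here is a small Python sketch 
(my own illustration, not anything from [binfile] or Pd). It encodes one 
character outside the BMP, U+1F600, picked arbitrarily, in both UTF-16 
byte orders and compares the result with what a fully byte-reversed 
layout would look like.

```python
# One code point above U+FFFF, so UTF-16 needs a surrogate pair for it.
ch = "\U0001F600"

be = ch.encode("utf-16-be")   # big-endian 16-bit units
le = ch.encode("utf-16-le")   # "little-endian" 16-bit units

# Hypothetical fully little-endian layout: every byte reversed, which
# would put the low surrogate first. No real encoding does this.
fully_reversed = be[::-1]

print(be.hex())              # d83dde00 : high surrogate D83D, then low DE00
print(le.hex())              # 3dd800de : bytes swapped inside each unit only
print(fully_reversed.hex())  # 00de3dd8 : differs from UTF-16LE
```

In UTF-16LE the bytes inside each 16-bit unit are swapped, but the high 
surrogate still precedes the low one, so the order of the units stays 
big-endian.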

However, encoding the BOM as the first character of a document or string 
also makes it possible to unambiguously distinguish UTF-8, both kinds of 
UTF-32, UTF-7, UTF-1, and the four weirder variations of UTF. So it's 
useful well beyond the original idea (name) of BOM.
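
As a rough sketch of that kind of detection, the following Python checks 
the leading bytes of a buffer against the BOM constants from the standard 
`codecs` module. The function name `sniff_bom` and the table `_BOMS` are 
made up for this example, and it only covers UTF-8, UTF-16 and UTF-32; 
UTF-7, UTF-1 and the rarer variants are left out.

```python
import codecs

# UTF-32 is listed before UTF-16 because the UTF-32LE BOM (FF FE 00 00)
# begins with the UTF-16LE BOM (FF FE). Consequently a UTF-16LE text whose
# first character after the BOM is U+0000 would be reported as UTF-32LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding_name, bom_length), or (None, 0) if no BOM matches."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


# Usage: detect the encoding, then decode everything after the BOM.
raw = codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")
encoding, skip = sniff_bom(raw)
if encoding is not None:
    print(encoding, raw[skip:].decode(encoding))
```

A buffer without a BOM returns `(None, 0)`, and the caller then has to 
fall back on some other default.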

  ______________________________________________________________________
| Mathieu BOUCHARD ----- telephone: +1.514.383.3801 ----- Montréal, QC

