[PD-dev] UTF-8 for pd-devel (again)

Bryan Jurish moocow at ling.uni-potsdam.de
Fri Mar 20 23:16:12 CET 2009


morning all,

Of course I never really like to see my code wither away in the bit
bucket, but I personally don't have any pressing need for UTF-8 symbols,
comments, etc. in Pd -- I'm a native English speaker, after all ;-)

Also, my changes are by no means the only way to do it (or even the best
way); we could gain a little speed by slapping on some more buffers
(mostly and possibly only in rtext_senditup()), but since this seems to
affect only GUI/editing stuff, I think we can live with a smidgeon of
additional CPU time ... after all, it's all O(n) anyways.
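
To illustrate the O(n) point: with UTF-8 as the internal encoding, a
character index no longer equals a byte offset, so mapping one to the
other means scanning the string. Below is a minimal sketch in C, assuming
valid NUL-terminated UTF-8 input. The names u8_sketch_strlen() and
u8_sketch_offset() are hypothetical and are not taken from the patch.

```c
#include <stddef.h>

/* Hypothetical helper (not from the patch): number of code points in a
 * NUL-terminated string, assuming valid UTF-8.  Continuation bytes have
 * the bit pattern 10xxxxxx, so count every byte that is not one. */
static size_t u8_sketch_strlen(const char *s)
{
    size_t n = 0;
    for (; *s; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}

/* Hypothetical helper: byte offset of the charnum-th code point.
 * Linear scan from the start of the string; returns the string's byte
 * length if charnum is past the end. */
static size_t u8_sketch_offset(const char *s, size_t charnum)
{
    size_t i = 0;
    while (s[i] && charnum > 0) {
        i++;                                    /* step over the lead byte */
        while (((unsigned char)s[i] & 0xC0) == 0x80)
            i++;                                /* skip continuation bytes */
        charnum--;
    }
    return i;
}
```

Both functions make a single pass over the bytes, which is where the
linear cost comes from; caching offsets in a buffer would trade memory
for fewer rescans.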

Really I just wanted to see how easy (or difficult) it would be to get
Pd to use UTF-8 as its internal encoding... turned out to be harder than
I had thought, but (ever so slightly) easier than I had feared :-/

marmosets,
	Bryan

On 2009-03-20 18:39:06, Hans-Christoph Steiner <hans at eds.org> appears to
have written:
> 
> I wonder what the best approach is to getting it included.  I also think
> it's a very valuable contribution.  I think we need to first get the
> Tcl/Tk only changes done, since that was the mandate of the pd-devel
> 0.41 effort.  Once Miller has accepted those changes, we can
> start with the C modifications there.  So how to proceed next, I think,
> depends on how eager you are, Bryan, to get this into a regular build.
> 
> One option is making a pd-devel-utf8 branch, another is posting these
> patches to the patch tracker and waiting for Miller to make his next
> update with the Pd-devel Tcl-Tk code.
> 
> Maybe we can get Miller to chime in on this topic.
> 
> .hc
> 
> On Mar 13, 2009, at 12:00 AM, dmotd wrote:
> 
>> hey bryan,
>>
>> just a quick note of appreciation for getting this one out.. i hope it
>> gets picked up in miller's build soon.. a very useful and necessary
>> modification.
>>
>> well done!
>>
>> dmotd
>>
>> On Thursday 12 March 2009 08:07:50 Bryan Jurish wrote:
>>> moin folks,
>>>
>>> I believe I've finally got pd-devel 0.41-4 using UTF-8 across the board.
>>> So far, I've tested message boxes & comments (g_rtext), as well as
>>> symbol atoms, and all seems good.  I think we can still expect goofiness
>>> if someone names an abstraction using a multibyte character when the
>>> filesystem isn't UTF-8 encoded (raw 8-bit works for me here too), but I
>>> really don't want to open that particular can of worms.
>>>
>>> So I guess I have 2 questions:
>>>
>>> (1) what should I call the generic UTF-8 source files? (see my other
>>> post)
>>>
>>> (2) shall I commit these changes to pd-devel/0.41-4, or somewhere else,
>>> or just post a diff (ca. 33k, ought to be easier to read now; I've tried
>>> to follow the indentation conventions of the source files I modified)?
>>>
>>> marmosets,
>>>     Bryan

-- 
Bryan Jurish                           "There is *always* one more bug."
jurish at ling.uni-potsdam.de      -Lubarsky's Law of Cybernetic Entomology



