[PD-dev] UTF-8 for pd-devel (again)

Bryan Jurish jurish at uni-potsdam.de
Tue Jan 19 22:16:23 CET 2010

morning all,

attached is a UTF-8 support patch against branches/pd-gui-rewrite/0.43
revision 13051 (HEAD as of an hour or so ago).  most of the bulk is new
files (s_utf8.c, s_utf8.h), most other changes are in g_rtext.c.  It's
not too monstrous, and I've tested it again here briefly with some utf-8
test patches (see other attachment), and things appear to be working as
expected.  if desired, I can check this in; otherwise feel free to do it
for me ;-)
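
For readers wondering why text-editing code such as g_rtext.c needs
touching at all: with UTF-8, a character index and a byte offset are no
longer the same thing, so editing positions typically have to be converted
with a linear scan.  What follows is only a minimal sketch of that idea in
C, assuming valid UTF-8 input; the function names (sketch_u8_charcount,
sketch_u8_offset) are hypothetical and are not taken from the attached
patch or from s_utf8.c.

```c
#include <stddef.h>

/* Hypothetical helper names for illustration; not from the attached patch. */

/* A byte starts a new code point unless it is a continuation byte
 * (bit pattern 10xxxxxx).  Assumes the input is valid UTF-8. */
static int sketch_u8_is_lead(unsigned char c)
{
    return (c & 0xC0) != 0x80;
}

/* Count the code points in the first nbytes bytes of s: one O(n) pass. */
size_t sketch_u8_charcount(const char *s, size_t nbytes)
{
    size_t i, n = 0;
    for (i = 0; i < nbytes; i++)
        if (sketch_u8_is_lead((unsigned char)s[i]))
            n++;
    return n;
}

/* Byte offset at which code point number charnum (0-based) starts.
 * Returns nbytes if charnum is at or past the end of the buffer. */
size_t sketch_u8_offset(const char *s, size_t nbytes, size_t charnum)
{
    size_t i = 0;
    while (i < nbytes && charnum > 0) {
        i++;  /* step over the lead byte */
        while (i < nbytes && !sketch_u8_is_lead((unsigned char)s[i]))
            i++;  /* step over any continuation bytes */
        charnum--;
    }
    return i;
}
```

With the 6-byte string "J\xc3\xa4ger" ("Jäger"), sketch_u8_charcount()
gives 5, and sketch_u8_offset() for character index 2 (the 'g') gives byte
offset 3.  Both calls walk the buffer once, so the cost stays linear in the
text length.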

2 annoying things here during testing (I don't see how my patches could
have caused this, but you never know):

(1) all loaded patch windows appear at +0+0 (upper left corner), which
with my wm (windowmaker) means the title bar is off the screen, and I
have to resort to keyboard shortcuts to get them mouse-draggable, which
is a major pain in the wazoo: is this a known bug?

(2) I can't figure out how to get at the properties dialog for number,
number2, or any other gui-atom objects: should these be working already?


On 2010-01-18 23:09:34, Hans-Christoph Steiner <hans at eds.org> appears to
have written:
> Awesome!  If it's big and complicated, I say post it to the list first;
> if it's not too bad, then just commit.
> .hc
> On Jan 18, 2010, at 4:47 AM, Bryan Jurish wrote:
>> moin Hans, moin list,
>> I think perhaps I never actually did post the cleaned-up patch anywhere
>> (bad programmer, no biscuit);  I guess I'll check out
>> branches/pd-gui-rewrite/0.43 and try patching my changes in; then I can
>> either commit or just post the (updated) patch.  Hopefully no major
>> additional changes will be required, so it ought to go pretty fast.
>> marmosets,
>>     Bryan
>> On 2010-01-17 22:57:33, Hans-Christoph Steiner <hans at eds.org> appears to
>> have written:
>>> Hey Bryan,
>>> I'd like to try to get your UTF-8 code into pd-gui-rewrite.  You mention
>>> in this posting back in May that you had the whole thing working.  I
>>> couldn't find the diff/patch for this.  Is it posted anywhere?  Do you
>>> want to try to check it in yourself directly to the pd-gui-rewrite/0.43
>>> branch?
>>> .hc
>>> On Mar 20, 2009, at 6:16 PM, Bryan Jurish wrote:
>>>> morning all,
>>>> Of course I never really like to see my code wither away in the bit
>>>> bucket, but I personally don't have any pressing need for UTF-8
>>>> symbols, comments, etc. in Pd -- I'm a native English speaker, after
>>>> all ;-)
>>>> Also, my changes are by no means the only way to do it (or even the
>>>> best way); we could gain a little speed by slapping on some more
>>>> buffers (mostly and possibly only in rtext_senditup()), but since this
>>>> seems to affect only GUI/editing stuff, I think we can live with a
>>>> smidgeon of additional cpu time ... after all, it's all O(n) anyways.
>>>> Really I just wanted to see how easy (or difficult) it would be to get
>>>> Pd to use UTF-8 as its internal encoding... turned out to be harder
>>>> than I had thought, but (ever so slightly) easier than I had feared :-/
>>>> marmosets,
>>>>    Bryan
>>>> On 2009-03-20 18:39:06, Hans-Christoph Steiner <hans at eds.org>
>>>> appears to have written:
>>>>> I wonder what the best approach is to getting it included.  I also
>>>>> think it's a very valuable contribution.  I think we need to first
>>>>> get the Tcl/Tk only changes done, since that was the mandate of the
>>>>> pd-devel 0.41 effort.  Once Miller has accepted those changes, we can
>>>>> start with the C modifications there.  So how to proceed next, I
>>>>> think, depends on how eager you are, Bryan, to get this into a
>>>>> regular build.
>>>>> One option is making a pd-devel-utf8 branch, another is posting these
>>>>> patches to the patch tracker and waiting for Miller to make his next
>>>>> update with the Pd-devel Tcl-Tk code.
>>>>> Maybe we can get Miller to chime in on this topic.
>>>>> .hc
>>>>> On Mar 13, 2009, at 12:00 AM, dmotd wrote:
>>>>>> hey bryan,
>>>>>> just a quick note of appreciation for getting this one out.. i hope
>>>>>> it gets picked up in miller's build soon.. a very useful and
>>>>>> necessary modification.
>>>>>> well done!
>>>>>> dmotd
>>>>>> On Thursday 12 March 2009 08:07:50 Bryan Jurish wrote:
>>>>>>> moin folks,
>>>>>>> I believe I've finally got pd-devel 0.41-4 using UTF-8 across the
>>>>>>> board.  So far, I've tested message boxes & comments (g_rtext), as
>>>>>>> well as symbol atoms, and all seems good.  I think we can still
>>>>>>> expect goofiness if someone names an abstraction using a multibyte
>>>>>>> character when the filesystem isn't UTF-8 encoded (raw 8-bit works
>>>>>>> for me here too), but I really don't want to open that particular
>>>>>>> can of worms.
>>>>>>> So I guess I have 2 questions:
>>>>>>> (1) what should I call the generic UTF-8 source files? (see my other
>>>>>>> post)
>>>>>>> (2) shall I commit these changes to pd-devel/0.41-4, or somewhere
>>>>>>> else, or just post a diff (ca. 33k, ought to be easier to read now;
>>>>>>> I've tried to follow the indentation conventions of the source files
>>>>>>> I modified)?
>>>>>>> marmosets,
>>>>>>>   Bryan
>>>> -- 
>>>> Bryan Jurish                           "There is *always* one more bug."
>>>> jurish at ling.uni-potsdam.de      -Lubarsky's Law of Cybernetic Entomology
>>> ----------------------------------------------------------------------------
>>> The arc of history bends towards justice.   - Dr. Martin Luther King, Jr.
>> -- 
>> ***************************************************
>> Bryan Jurish
>> Deutsches Textarchiv
>> Berlin-Brandenburgische Akademie der Wissenschaften
>> Jägerstr. 22/23
>> 10117 Berlin
>> Tel.:      +49 (0)30 20370 539
>> E-Mail:    jurish at bbaw.de
>> ***************************************************
> ----------------------------------------------------------------------------
> As we enjoy great advantages from inventions of others, we should be
> glad of an opportunity to serve others by any invention of ours; and
> this we should do freely and generously.         - Benjamin Franklin

Bryan Jurish                       "There is *always* one more bug."
jurish at uni-potsdam.de       -Lubarsky's Law of Cybernetic Entomology
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: pd-gui-rewrite-0.43.utf8-moo-2010-01-19.full.diff
URL: <http://lists.puredata.info/pipermail/pd-dev/attachments/20100119/3eda225d/attachment.txt>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test-utf8.pd
Type: application/puredata
Size: 686 bytes
Desc: not available
URL: <http://lists.puredata.info/pipermail/pd-dev/attachments/20100119/3eda225d/attachment.bin>
