[PD-dev] [tcpserver]: bad performance of new version

Martin Peach martin.peach at sympatico.ca
Tue May 5 01:41:35 CEST 2009


Roman Haefeli wrote:
> On Fri, 2009-05-01 at 18:48 -0400, Martin Peach wrote:
>> Roman Haefeli wrote:
>>> On Fri, 2009-05-01 at 09:16 -0400, Martin Peach wrote:
>>>> Roman Haefeli wrote:
>>>>> On Thu, 2009-04-30 at 10:17 -0400, Martin Peach wrote:
>>>>>> Roman Haefeli wrote:
>>>>>>> i've been testing the new netpd-server based on the new
>>>>>>> [tcpserver]/[tcsocketserver FUDI] now for a while and could definitely
>>>>>>> solve some problems, but some new ones were introduced. 
>>>>>>>
>>>>>>> i found that the most recent version of [tcpserver] performs quite badly
>>>>>>> cpu-wise. this has some side-effects. in netpd, when a certain number of
>>>>>>> users are logged in (let's say 16), it can happen, that the traffic of
>>>>>>> those clients makes the netpd-server use more than the available
>>>>>>> cpu-time. i made some tests and checked, if all messages come through
>>>>>>> and if messages delivered by the server are still intact. under normal
>>>>>>> circumstances, there is no problem at all. but under heavy load, when
>>>>>>> the pd process is demanding more than available cpu time, some messages
>>>>>>> are corrupted or lost completely; in the worst case the pd process
>>>>>>> segfaults, at the moment of a client connecting or disconnecting. i
>>>>>>> guess, this is due to some buffer under- or overrun between pd and the
>>>>>>> tcp stack, but i don't really know.
>>>>>> Hi Roman,
>>>>>> Did you try using the new [timeout( message? The latest version of 
>>>>>> tcpserver defaults to a 1ms timeout, so if you have a bunch of 
>>>>>> disconnected clients, Pd will hang for 1ms each, which will quickly add 
>>>>>> up to more than the audio block time and then Pd will start thrashing 
>>>>>> and eventually die or become comatose, as it were.
>>>>> no, i haven't tried this parameter yet. but i sure will do and report
>>>>> back, when i can tell more about how it behaves. 
>>>>>
>>>>> i haven't fully understood, what it does and what it can be used for.
>>>>> could you elaborate that a bit more? yet it sounds a bit strange to me,
>>>>> that i need to tweak a networking object with a time value for correct
>>>>> operation.
>>>>>
>>>> When you send some message through tcpserver, the send routine first 
>>>> checks to see if it can be sent. The call to do this is a function known 
>>>> as "select", which has a timeout parameter. The select call returns as 
>>>> soon as the socket is available or the timeout expires, whichever comes 
>>>> first. If the socket is blocked, select would never return if there was 
>>>> no timeout. So I gave the call a default 1ms timeout.
>>> ok. i think, i understand. thanks for the explanation.
>>>
>>>> This could all be done using threads as well but I just don't know when 
>>>> I'll have time to do it.
>>> no hurry. it's not that i know threading would help with the
>>> issues i am experiencing. i just wanted to have my troubles
>>> reported. and i think i read somewhere that server implementations
>>> often use a separate thread for each socket.
>>>
>>>> I still don't see that it would solve your 
>>>> problem anyway, if your application insists on sending to disconnected 
>>>> clients, you would have lots of threads sitting around, and still get no 
>>>> feedback about the connection.
>>> the only feedback needed: was something actually sent or not? if you (or
>>> the patch) _know_, that messages are not received by the other end, then
>>> you (the patch) can handle the situation somehow.
>>> anyway, that is the part that seems to be already working. by using the
>>> current [tcpserver], you notice, if the other end vanished or is still
>>> listening.
>>> the problems i currently encounter are coming from the fact, that the
>>> performance of the new version is probably 20 times worse than the
>>> version included in current stable pd-extended. for me it's a problem,
>>> since with a certain sane number of clients connected (let's say 16), it
>>> already overloads the cpu of a 1.7GHz pentium m processor. why the big
>>> difference to the previous version?
>>>
>> If you set the sending timeout to zero (by sending a [timeout 0( message 
>> to [tcpserver]), then the performance should be the same as the older 
>> version. AFAIK that's all I changed. Did you try that yet?
>> If not, something else is causing the slowdown.
>> If it works better, maybe set the timeout to 10 instead of 1000.
> 
> there is no difference in performance, no matter what value i use for
> 'timeout'. on my box, sending the message (in byte representation) from
> the benchmark test 1000 times takes ~90ms for [tcpserver]. the same (in
> ascii representation) sent with [netserver] takes around 8ms. 
> the only difference i can see with lower (< 10us) timeout value is, that
> messages on the receiving side (client) are messed up, completely lost,
> partially cut or concatenated together. 
> on my box, the new [tcpserver] with 'timeout' set to 0 performs much
> worse than the old version with the buffer overrun problem.
> 

Maybe just calling select slows everything down then. It seems to be a 
trade-off between speed and reliability. You should really send udp 
packets, then nothing hangs if the other end doesn't receive them. You 
could still have a low-bandwidth tcp connection open to test the connection.

> have you tested on windows only? i haven't tried windows yet. how did
> you test?

I didn't test for speed at all, I just checked that it worked on WinXP 
and Debian.

Martin




