[PD-dev] [tcpserver]: bad performance of new version

Roman Haefeli reduzierer at yahoo.de
Tue May 5 02:13:57 CEST 2009


On Mon, 2009-05-04 at 19:41 -0400, Martin Peach wrote:
> Roman Haefeli wrote:
> > On Fri, 2009-05-01 at 18:48 -0400, Martin Peach wrote:
> >> Roman Haefeli wrote:
> >>> On Fri, 2009-05-01 at 09:16 -0400, Martin Peach wrote:
> >>>> Roman Haefeli wrote:
> >>>>> On Thu, 2009-04-30 at 10:17 -0400, Martin Peach wrote:
> >>>>>> Roman Haefeli wrote:
> >>>>>>> I've been testing the new netpd-server based on the new
> >>>>>>> [tcpserver]/[tcsocketserver FUDI] for a while now. It definitely
> >>>>>>> solved some problems, but it also introduced some new ones.
> >>>>>>>
> >>>>>>> I found that the most recent version of [tcpserver] performs quite
> >>>>>>> badly CPU-wise, and this has some side effects. In netpd, when a
> >>>>>>> certain number of users are logged in (let's say 16), the traffic
> >>>>>>> from those clients can make the netpd-server use more than the
> >>>>>>> available CPU time. I ran some tests and checked whether all
> >>>>>>> messages come through and whether messages delivered by the server
> >>>>>>> are still intact. Under normal circumstances there is no problem at
> >>>>>>> all, but under heavy load, when the Pd process demands more than
> >>>>>>> the available CPU time, some messages are corrupted or lost
> >>>>>>> completely; in the worst case the Pd process segfaults at the
> >>>>>>> moment a client connects or disconnects. I guess this is due to
> >>>>>>> some buffer under- or overrun between Pd and the TCP stack, but I
> >>>>>>> don't really know.
> >>>>>> Hi Roman,
> >>>>>> Did you try using the new [timeout( message? The latest version of
> >>>>>> tcpserver defaults to a 1ms timeout, so if you have a bunch of
> >>>>>> disconnected clients, Pd will hang for 1ms each, which quickly adds
> >>>>>> up to more than the audio block time; then Pd will start thrashing
> >>>>>> and eventually die or become comatose, as it were.
> >>>>> No, I haven't tried this parameter yet, but I surely will, and I'll
> >>>>> report back when I can tell more about how it behaves.
> >>>>>
> >>>>> I haven't fully understood what it does and what it can be used for.
> >>>>> Could you elaborate on that a bit more? It still sounds a bit
> >>>>> strange to me that I need to tweak a networking object with a time
> >>>>> value for correct operation.
> >>>>>
> >>>> When you send a message through tcpserver, the send routine first
> >>>> checks whether it can be sent. The call that does this is a function
> >>>> known as "select", which has a timeout parameter. The select call
> >>>> returns as soon as the socket is available or the timeout expires,
> >>>> whichever comes first. If the socket is blocked, select would never
> >>>> return if there were no timeout, so I gave the call a default 1ms
> >>>> timeout.
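(For my own understanding: in Python terms, the send path described
above would look roughly like the sketch below. This is just an
illustration, not the actual C code of [tcpserver], and the names are
made up.)

    import select, socket

    def try_send(sock, data, timeout=0.001):
        """Send only if the socket becomes writable within `timeout` s."""
        # select() returns as soon as the socket is writable or the
        # timeout expires, whichever comes first.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            return 0            # socket blocked: nothing was sent
        return sock.send(data)

(So with a bunch of blocked or disconnected clients, those 1ms waits
add up once per client per message, as you say.)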
> >>> OK, I think I understand. Thanks for the explanation.
> >>>
> >>>> This could all be done using threads as well, but I just don't know
> >>>> when I'll have time to do it.
> >>> No hurry. It's not that I know threading would help with the issues
> >>> I am experiencing; I just wanted to have my troubles reported. Also,
> >>> I think I read somewhere that server implementations often use a
> >>> separate thread for each socket (roughly the pattern sketched below).
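(A minimal Python sketch of that thread-per-client pattern, just as an
illustration; the port and the echo behaviour are made up:)

    import socket, threading

    def handle_client(conn):
        # Each client gets its own thread, so a blocked send() stalls
        # only this client instead of the whole server.
        try:
            while True:
                data = conn.recv(4096)
                if not data:
                    break              # client disconnected
                conn.sendall(data)     # echo back, as a placeholder
        finally:
            conn.close()

    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(('', 3000))               # made-up port
    srv.listen(5)
    while True:
        conn, _addr = srv.accept()
        threading.Thread(target=handle_client, args=(conn,)).start()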
> >>>
> >>>> I still don't see that it would solve your problem anyway: if your
> >>>> application insists on sending to disconnected clients, you would
> >>>> have lots of threads sitting around and still get no feedback about
> >>>> the connection.
> >>> The only feedback needed is: was something actually sent or not? If
> >>> you (or the patch) _know_ that messages are not received by the
> >>> other end, then you (or the patch) can handle the situation somehow.
> >>> Anyway, that is the part that already seems to be working: with the
> >>> current [tcpserver], you notice whether the other end has vanished
> >>> or is still listening.
> >>> The problems I currently encounter come from the fact that the
> >>> performance of the new version is probably 20 times worse than that
> >>> of the version included in the current stable Pd-extended. For me
> >>> it's a problem, since with a sane number of clients connected (let's
> >>> say 16), it already overloads the CPU of a 1.7GHz Pentium M
> >>> processor. Why the big difference from the previous version?
> >>>
> >> If you set the sending timeout to zero (by sending a [timeout 0(
> >> message to [tcpserver]), the performance should be the same as in the
> >> older version. AFAIK that's all I changed. Did you try that yet?
> >> If not, something else is causing the slowdown.
> >> If it works better, maybe set the timeout to 10 instead of 1000.
> > 
> > There is no difference in performance, no matter what value I use for
> > 'timeout'. On my box, sending the message from the benchmark test (in
> > byte representation) 1000 times takes ~90ms with [tcpserver]; the same
> > message (in ASCII representation) sent with [netserver] takes around
> > 8ms.
> > The only difference I can see with a lower (< 10us) timeout value is
> > that messages on the receiving side (client) are messed up: completely
> > lost, partially cut, or concatenated together.
> > On my box, the new [tcpserver] with 'timeout' set to 0 performs much
> > worse than the old version with the buffer overrun problem.
> > 
> 
> Maybe just calling select slows everything down, then. It seems to be a
> trade-off between speed and reliability. You should really send UDP
> packets; then nothing hangs if the other end doesn't receive them. You
> could still keep a low-bandwidth TCP connection open to test the
> connection.
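(In Python terms, what Martin suggests amounts to fire-and-forget sends
like the sketch below: sendto() on a UDP socket hands the datagram to
the OS and never waits for the receiver. The address and payload are
made up.)

    import socket

    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # never blocks on a slow or vanished peer; delivery is not guaranteed
    udp.sendto(b'foo 1 2 3;\n', ('localhost', 3000))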

UDP is not an option for me (see previous mails). I really do need a
working netpd-server, and the good thing is that the server doesn't
necessarily need to be written in Pd. I think I'll try the Python road
(see the sketch below): I know a little Python, whereas C is definitely
too low-level for me, although it would probably be much more performant
for what I want.
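(Something like this minimal select()-based relay server is what I have
in mind. It's only a sketch under made-up assumptions: the port is
invented, and it ignores partial FUDI messages and blocked clients.)

    import select, socket

    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(('', 3000))                  # made-up port
    srv.listen(5)
    clients = []

    while True:
        # block until the listener or any client is readable
        readable, _, _ = select.select([srv] + clients, [], [])
        for s in readable:
            if s is srv:
                conn, _addr = srv.accept()
                clients.append(conn)
            else:
                data = s.recv(4096)
                if not data:              # client disconnected
                    clients.remove(s)
                    s.close()
                    continue
                for c in clients:         # relay to all other clients
                    if c is not s:
                        c.sendall(data)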

Besides my situation, [tcpserver] generally isn't yet fully usable
under real-world conditions, although it has been improved a lot
(thanks for all your efforts!). For serious use, I think the
performance issue is a real problem, but I also encountered other
troubles.

In particular, there are certain situations where the Pd process
running the [tcpserver]-based netpd-server segfaults. This usually
happens when:
a) there is some net traffic going on, and
b) a client connects or disconnects.
I wasn't able to track the problem down, so I am not really sure where
it comes from, but the fact that it only happens on connects or
disconnects lets me assume that it is somehow related to [tcpserver].
Now I wonder: is it safe at any time to send whatever message to
[tcpserver]? Could it be that [tcpserver] gets 'confused' when a client
disconnects while [tcpserver] is sending data to that particular
client?
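(The situation I mean, again sketched in Python rather than in the C of
[tcpserver]: a send to a peer that has just vanished raises an error
that has to be handled explicitly, otherwise it takes the process down.
The function name is made up.)

    import errno, socket

    def safe_send(sock, data):
        """Send, but report a vanished peer instead of blowing up."""
        try:
            sock.sendall(data)
            return True
        except socket.error as e:
            if e.errno in (errno.EPIPE, errno.ECONNRESET):
                return False       # peer disconnected mid-send
            raise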

This problem doesn't exist with the [netserver]-based netpd-server.
Actually, that patch/external combo never segfaulted, as far as I
remember; the only problem was a hanging Pd process due to a full
buffer.


> 
> > Have you tested on Windows only? I haven't tried Windows yet. How did
> > you test?
> 
> I didn't test for speed at all; I just checked that it worked on WinXP
> and Debian.

I posted a benchmark patch in the first mail of this thread, in case
you're interested.

Roman
