[PD-dev] Refactoring Pure Data

Tue Sep 12 20:32:46 CEST 2006

On 9/12/06, Miller Puckette <mpuckett at imusic1.ucsd.edu> wrote:
>
> Hi all,
>
> Sorry not to jump in before now...

Thanks for taking the time to answer then :)

> My top priority for Pd is to finish getting the 'language' defined.  I
> don't see this as an open-ended pursuit; another year or so of fooling
> with 'data structures' seems to be the main remaining thing.

No problem : this work is compatible with the development philosophy that
I talked about earlier (and which works on many other open source
projects) :
=>  You want to focus on the "language and data structures" for the coming
year and that's your right (!)

=> Some want to focus on GUI and that's their right. It is independent from
data structures & language, since no matter what the objects do, drawing
boxes, connecting them and talking to the "engine" is generic no matter how
the language evolves, unless you tell us so.
This could be done with any tool, be it Tcl/Tk or whatever fits the
developers' needs and wishes. Again, if it's good, it'll stay and be
improved, if it's not, it'll die; no need to argue about what language, what
method, or anything. As long as it connects to the engine and provides what
we expect from it, no problem.

=> Some want to focus on realtime issues and that's their right : once the
objects, no matter what they are, are loaded into chains of computing
(separate or tightly sequenced), they still need to be scheduled efficiently
no matter what they do.
Some specifics can be taken into account, but there are not many of them,
and they can be clearly specified and understood.

=> Some wish to work on unit testing and error handling, and that works as
long as you distinguish realtime from non-realtime tasks : for example, the
setup of the program should report errors heavily, since that's where
problems are most likely to occur. Once everything is set up, only a few
runtime issues (like xruns or a failure in an external) need to be
reported, and error reports should even be avoided so as not to slow down
the realtime processing.

Right now, even pd_init() reports absolutely nothing and it's called only
once... !
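
To make the setup / runtime split concrete, here is a minimal sketch in C.
Every name in it (engine_setup, engine_note_xrun, engine_report,
runtime_stats) is hypothetical and does not come from the Pd sources; it
only assumes that setup may print freely while the realtime path should do
no more than bump a counter.

```c
#include <stdio.h>

/* Hypothetical status codes; not taken from the Pd sources. */
typedef enum {
    SETUP_OK = 0,
    SETUP_ERR_NO_AUDIO,
    SETUP_ERR_NO_MIDI
} setup_status;

/* Hypothetical runtime statistics, updated from the realtime path. */
typedef struct {
    unsigned long xruns;
} runtime_stats;

/* Setup runs once, outside realtime, so it can afford verbose reports. */
setup_status engine_setup(int audio_available, int midi_available, FILE *log)
{
    if (!audio_available) {
        fprintf(log, "setup: no audio device available\n");
        return SETUP_ERR_NO_AUDIO;
    }
    if (!midi_available) {
        fprintf(log, "setup: no MIDI device available\n");
        return SETUP_ERR_NO_MIDI;
    }
    return SETUP_OK;
}

/* Realtime path: no I/O at all, just count the event for later. */
void engine_note_xrun(runtime_stats *stats)
{
    stats->xruns++;
}

/* Called from a non-realtime context to report what was counted. */
int engine_report(const runtime_stats *stats, FILE *log)
{
    return fprintf(log, "runtime: %lu xrun(s)\n", stats->xruns);
}
```

The caller checks the value returned by engine_setup() and refuses to start
on anything other than SETUP_OK, while the xrun counter is only read and
printed once the realtime work is out of the way.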

=> Some others on documentation
=> Some on portability
=> Some on making the code safer (no overflow)
=> Some on bug fixing
...

We can't just "wait" until one task is done to work on the others, that's
just insane and inefficient (apparently some of us know quite a bit about
that). It'd be a shame if we were just "clients" of Pd, external developers
and Pd bug killers.

So, let's say you "lock" the few files you intend to work on / extend /
improve, or you isolate the files & data you need and put them all together
as one software component that can be integrated with the rest (basically, a
bunch of .c & .h files together in a directory with comments and Doxygen
doc), and keep all of us updated as often as possible (and commit to CVS) so
we can work on other parts in parallel.
If other people's work is going to interfere with yours, it'll be added to a
TODO list and negotiated to see how feasible it is. But the more segmented
the work is, the fewer issues like this there will be.
I imagine you won't touch s_main.c too much for instance, so one can work on
command line parsing, signal handling, watchdog launching, setup of the
subsystems, error checking, logs, current state structure (what objects are
loaded), load/unload of externals, ...
I also imagine you won't rework the audio & MIDI handling, so that can be
worked on to be separated as externals which are to be manipulated as any
other object.
And someone on the GUI can work on a way to interact with externals so they
can be configured and set up through the GUI, maybe with a dynamic menu
structure...
Lots could happen while you sleep as long as we can work on different and
independent pieces without having to go through the energy-wasting &
discouraging merge / diff process !

We'll then know that you're working on this and just this, some others can
help you on this and just this, and we can keep going on the other tasks.
The end result will be that Pd is going to evolve much faster, and that new
features appearing today in other branches will be easier to integrate.

Could you then tell us what you intend to work on exactly, and say "Ok,
don't touch THIS and THIS, because I'll be doing something like THIS, and
the rest is up to you guys" ?

> Until this is fairly well under control I don't see much point in rewriting
> existing code;

Apparently, some of us do :) ... since there are a few forks here and there
and people are already rewriting some parts.

> in particular, I want to do some heavy run-time profiling to find
> out what really needs improvement.

That can be done per function, per module, and then together with other
components all running together, and again, it isn't incompatible with
others working in parallel on Pd.

> But this is pointless until I know
> what typical patterns of usage will look like, particularly as regards
> to 'data' traversal, which I think has some severe inefficiencies now.

Mmh... the delay at launch time can be analyzed easily without even
launching or using any patch.
Finding a smarter way to handle the cmdline and possibly avoid too many
strcmp or memcpy calls can be done right now too.
Checking if another pd "engine" is running / alive at launch time can be
done without knowing anything about the rest of the way the program is going
to be used...
And the way we can separate the "engine" from the GUI (GUI launching the
engine, or connecting to it if it crashed ) can also be done right now.
LinuxSampler, Qjackctl, and SooperLooper are IMO good examples to follow in
terms of "engine / GUI" separation.
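
On the command line point above, one possible way to cut down on long chains
of strcmp is a sorted option table searched with the standard bsearch(). This
is only a sketch under assumptions: opt_id, opt_entry, opt_table and
opt_lookup are hypothetical names, the three flags are picked purely for
illustration, and none of this reflects how s_main.c currently parses its
arguments.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical option identifiers, for illustration only. */
typedef enum { OPT_UNKNOWN = 0, OPT_NOGUI, OPT_RATE, OPT_VERBOSE } opt_id;

typedef struct {
    const char *name;
    opt_id id;
} opt_entry;

/* Must stay sorted by name (strcmp order), as bsearch() requires. */
static const opt_entry opt_table[] = {
    { "-nogui",   OPT_NOGUI },
    { "-r",       OPT_RATE },
    { "-verbose", OPT_VERBOSE }
};

/* bsearch() comparator: the key is the argument string itself. */
static int opt_cmp(const void *key, const void *entry)
{
    return strcmp((const char *)key, ((const opt_entry *)entry)->name);
}

/* Look an argument up in O(log n) comparisons instead of one per option. */
opt_id opt_lookup(const char *arg)
{
    const opt_entry *e = bsearch(arg, opt_table,
                                 sizeof opt_table / sizeof opt_table[0],
                                 sizeof opt_table[0], opt_cmp);
    return e ? e->id : OPT_UNKNOWN;
}
```

Adding an option then means adding one table row in the right place rather
than another branch in the parser, and unknown flags fall out naturally as
OPT_UNKNOWN so they can be reported.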

If by "run time" you mean actual computing time, this can be done by
thinking about the scheduler ONLY, and possibly rewriting it (avoiding
gotos maybe ?), and profiling it on a set of objects independently from GUI
& MIDI (if we're talking about audio).

I think Vincent is right on about the need for better error handling.
> This is also, partly, a design issue, since patches themselves sometimes
> need a mechanism for detecting errors.

C already provides these mechanisms for the program itself as C functions
can already return errors, but right now they're mostly not returning
anything (void), and if they do, we rarely listen to what they have to say
:)
Let's make them talk and tell us what's going on !
Let's make them short and doing just one thing so the error is meaningful !
The mechanism exists, and I tried to show with my silly example (see
previous post) how we can do this easily.
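
For readers without the earlier post at hand, here is a separate minimal
sketch of the same idea (not the example from that post). The names
path_status, path_copy and path_status_str are hypothetical and are not part
of Pd: a short function that does one thing and returns a status the caller
can actually act on, instead of returning void.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes; not taken from the Pd sources. */
typedef enum {
    PATH_OK = 0,
    PATH_ERR_NULL,
    PATH_ERR_TOO_LONG
} path_status;

/* Copy src into dst, which holds dst_size bytes. One job only, and the
   return value says exactly why it failed. Never writes past dst. */
path_status path_copy(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    if (dst == NULL || src == NULL)
        return PATH_ERR_NULL;
    len = strlen(src);
    if (len >= dst_size)
        return PATH_ERR_TOO_LONG;
    memcpy(dst, src, len + 1); /* +1 copies the terminating '\0' */
    return PATH_OK;
}

/* Turn a status into a human-readable message for logs. */
const char *path_status_str(path_status st)
{
    switch (st) {
    case PATH_OK:           return "ok";
    case PATH_ERR_NULL:     return "null argument";
    case PATH_ERR_TOO_LONG: return "path too long for buffer";
    }
    return "unknown error";
}
```

Because the function is short and single-purpose, each status maps to exactly
one cause, and the same check also covers the "no overflow" concern.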

Also, we have so many fprintf(someoutput, "hardcoded message") calls that
localization and the creation of log files are hard to do right now.
Handling potential errors in the code, instead of automatically assuming how
the program behaves, will even help debugging without opening gdb
;)
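
One way to get rid of the scattered hardcoded messages, sketched here under
assumptions, is to keep every message in a single table indexed by an
identifier and format through one function. The names msg_id, messages_en and
msg_format are hypothetical, and the two messages are made up for the
example; vsnprintf() requires C99.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical message identifiers, for illustration only. */
enum {
    MSG_AUDIO_OPEN_FAILED = 0,
    MSG_EXTERNAL_LOAD_FAILED,
    MSG_COUNT
};

/* All user-visible text lives in one table; a translation would be a
   second table with the same layout. */
static const char *const messages_en[MSG_COUNT] = {
    "could not open audio device %d",
    "could not load external '%s'"
};

/* Format message `id` into buf. Returns the length vsnprintf() reports,
   or -1 if the identifier is out of range. */
int msg_format(char *buf, size_t size, int id, ...)
{
    va_list ap;
    int n;

    if (id < 0 || id >= MSG_COUNT)
        return -1;
    va_start(ap, id);
    n = vsnprintf(buf, size, messages_en[id], ap);
    va_end(ap);
    return n;
}
```

With the formatting in one place, sending the result to stderr, to a log
file or to the GUI becomes a decision made once instead of at every call
site.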

As for patches handling their own errors, it could be done by improving
objects so they are more "foolproof", so that you can't do stuff that has
been identified as potentially dangerous. That way, we can eliminate errors
at the source. Just my 2 cents ... Other than that I don't understand what
you're talking about : any example ?
Maybe we could think about this too and propose solutions ...

> I've spent some time thinking about this and eventually I want to make a
> formal structure for flagging and inquiring about errors... but not just
> yet; the data traversal objects need to get finalized first.

Could you elaborate on this a little ?

I know I'm being very open about my own views, but I do it here so we can
find a solution to improve Pd, speed up developments and focus more on
quality.
So any comments / constructive criticism are more than welcome ! :)

++

vincent