[PD-dev] Refactoring Pure Data

Tue Sep 12 16:05:50 CEST 2006

Hi again ;)

>> So, supposing you want to work with Miller
> > Why not?
>
> Because you want to refactor Pd and because, I suppose that you've read
> pd-list and/or pd-dev for some time.

I did. And I've seen a little of how this project works.
The thing is, it doesn't work as a truly open community, but this can
change.

> To me, I shouldn't have to "work with Miller" or "work with Mathieu" or
> > anything like this,
>
> So you are going to work alone? Or you're going to reunite the family?

At first, a little splitting here and there could be done by a single
person, and I can do that.
Now, as I said, the real issue is "where do I cut" and "how". That's
architecture and design, and it has to be done with the whole "family".
And as you pointed out, changing anything today means lots of merging
work.

But if we don't want to do merging, we need to break down vanilla into
pieces, and put these pieces together with others, like DD's GUI.

To me, making DesireData (or any other project) a viable solution starts
with splitting vanilla into "what you want to work on" and "the rest".
Then, natural selection will come and we'll pick which solution is best.

> simply because we are wasting way too much energy doing theses diffs /
> > merge stuff. Having different branches is fine and even desirable (!),
>
> If there's not going to be merging then what do you want the branches to
> do? split further apart?
>
> > as long as it leads to more generic modules that can be tested and
> > refined in more diverse cases.
>
> do you do unit-testing?

Here's my point of view : to be able to do unit testing, we need functions
that are actually "testable", and that means they are :
- small
- not complex (low cyclomatic complexity)
- doing one single thing
- handling errors
- easily isolated from the rest of the program
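
To make the list above concrete, here is a minimal sketch of a function
that ticks those boxes, plus its test. Everything in it is hypothetical:
pd_pick_priority and PD_PRIORITY_INVALID are names made up for this example
and are not in the Pd sources, and using -1 as the error value assumes that
valid priorities are never negative.

```c
#include <assert.h>

/* Hypothetical error value, not taken from the Pd sources. Assumes that
 * valid priorities are never negative. */
#define PD_PRIORITY_INVALID (-1)

/*
 * Pick a priority "bump" steps above priority_min, clamped to priority_max.
 * Pure function: no system calls, no globals, one single job.
 * Returns PD_PRIORITY_INVALID when the range or the bump makes no sense.
 */
static int pd_pick_priority(int priority_min, int priority_max, int bump)
{
    int wanted;

    if (priority_min > priority_max || bump < 0)
        return PD_PRIORITY_INVALID;

    wanted = priority_min + bump;
    return (wanted > priority_max) ? priority_max : wanted;
}

/* The unit test is then just a handful of assertions. */
static void test_pd_pick_priority(void)
{
    assert(pd_pick_priority(1, 99, 5) == 6);    /* normal case */
    assert(pd_pick_priority(1, 99, 7) == 8);
    assert(pd_pick_priority(1, 4, 7) == 4);     /* clamped to the maximum */
    assert(pd_pick_priority(10, 1, 5) == PD_PRIORITY_INVALID);
    assert(pd_pick_priority(1, 99, -2) == PD_PRIORITY_INVALID);
}
```

Because the function never touches the scheduler itself, the test can run
on any machine, without special privileges.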

I believe more in "good design leads to fewer bugs", and I'm not a big
fan of unit testing myself. But writing unit tests is an honorable goal.

This is how we could avoid quite a few stupid bugs :
=> just look at callgraphs using Doxygen with callgraphs enabled, it's free
& easy and it'll show you a lot about the program structure (and where to
"cut")
Attached is an example Doxyfile with callgraphs enabled. It takes a while
to run, and produces big callgraph images (callers and callees).
=> running splint, uno, and others. They also show some little pieces of
dirt here and there.
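
For those who don't want to open the attachment, the settings that matter
for callgraphs look roughly like the fragment below. This is a sketch, not
a copy of the attached file: the option names are standard Doxygen ones,
but the INPUT path is an assumption about where the sources live, and
graphs are only drawn if Graphviz "dot" is installed.

```
# Document every function, including static and undocumented ones.
EXTRACT_ALL            = YES
EXTRACT_STATIC         = YES

# Assumed source location; adjust to your checkout.
INPUT                  = src
RECURSIVE              = YES

# Graph generation requires Graphviz (dot).
HAVE_DOT               = YES
CALL_GRAPH             = YES
CALLER_GRAPH           = YES
```

CALL_GRAPH draws what each function calls, CALLER_GRAPH draws who calls it.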

CCCC can also help us keep track of the size of the code and its
complexity, in the absence of something like RSM, and tell us where to
start working first.
We don't necessarily need these tools, but they can help.

Some of you guys, the Mac users, have access to Shark (part of Xcode) for
instance (http://developer.apple.com/tools/sharkoptimize.html).
I don't. But I'd be interested to see Shark results for Pd published: its
memory usage, and a profile of the application, to get an idea of where we
should focus our efforts.
Sure Kprof and others exist too, but it's not the same :)
There's also http://www.drugphish.ch/~jonny/cca.html

Anyway, the point is we have loads of tools to check and split the program
and I used them on Pd.
Now, my initial question was: "OK, I found that lots can be improved: how
do I propose improvements to the community that imply refactoring and
splitting?"
And I see why everyone so far answered "You're getting into social issues,
man, good luck!"
And that's why I'm saying: "Fine, let's break things down so we can all
work on Pd's development then."
Now, I can't and don't want to do that by myself, behind closed doors, on
a separate repository, or as yet another useless fork: I need the support
of the "community", or else I won't be able to make any changes... like
many others who have been discouraged.

> [Pd] deserves a clear roadmap,
>
> Several people have their own roadmaps. Does it need one big shared one?
> Just how shared would that one be?

Architecture goals should be shared. Then, on each software component,
everyone can do what they want, as long as they respect, define and
communicate the interfaces.
Again: this already works between Pd and externals.
m_pd.h is an API, Johannes has put together documentation for it, and
development of externals takes place.

The goals of each person are achievable as long as they don't interfere
with others': and that's where separating the codebase and functions is
essential.
The "core" should be as minimalist as possible, so it allows anyone to build
their own "Pd" on top of it. (with any GUI, with any components, float or
int, externals or not, network or not, full pack, whatever).

The final distribution for users should still include everything possible as
Hans is doing (congrats on the build farm btw).
But the code should allow us to carry on projects like PDa, ePD (embedded
PD), make PD core into a plugin, or a standalone app with GriPD-reloaded,
make it available on webservers, or anything people have been talking about
for a while now.

> I think this work on architecture will be mandatory as we add new
> > features (like video~)
>
> Do you think that adding a sixth video subsystem to pd is going to solve
> anything?

I absolutely don't. What I think is that adding it could be an occasion to
make Pd more modular than it is today; otherwise it's going to be hell.
Because we'll have to ask ourselves: how do we integrate these new objects?
We need to treat audio, MIDI and video like any other external!
We need to reduce Pd's core to the minimum possible (= move CoreLibs, Audio,
MIDI and GUI out of it).
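
To illustrate what "treating audio, MIDI, video like any other external"
could look like, here is a rough sketch of a core that only knows how to
register and look up modules. None of these names (pd_io_module,
pd_register_module, pd_find_module) exist in Pd; they are assumptions made
up for the example, and a real design would need much more (arguments,
unloading, threading).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical interface: none of these names exist in Pd. */
typedef struct pd_io_module {
    const char *name;       /* e.g. "alsa", "jack", "midi", "video" */
    int (*open)(void);      /* returns 0 on success */
    void (*close)(void);
} pd_io_module;

#define PD_MAX_MODULES 16

static const pd_io_module *pd_modules[PD_MAX_MODULES];
static size_t pd_module_count = 0;

/* The core only stores modules; it knows nothing about what they do. */
static int pd_register_module(const pd_io_module *module)
{
    if (module == NULL || module->name == NULL || module->open == NULL)
        return -1;
    if (pd_module_count >= PD_MAX_MODULES)
        return -1;
    pd_modules[pd_module_count++] = module;
    return 0;
}

/* Look a module up by name; returns NULL when it is not registered. */
static const pd_io_module *pd_find_module(const char *name)
{
    size_t i;

    if (name == NULL)
        return NULL;
    for (i = 0; i < pd_module_count; i++)
        if (strcmp(pd_modules[i]->name, name) == 0)
            return pd_modules[i];
    return NULL;
}

/* A dummy backend standing in for a real audio, MIDI or video module. */
static int dummy_is_open = 0;
static int dummy_open(void) { dummy_is_open = 1; return 0; }
static void dummy_close(void) { dummy_is_open = 0; }
static const pd_io_module dummy_module = { "dummy", dummy_open, dummy_close };

/* Usage: register the backend, find it by name, open and close it. */
static void pd_module_example(void)
{
    const pd_io_module *found;

    assert(pd_register_module(&dummy_module) == 0);
    found = pd_find_module("dummy");
    assert(found == &dummy_module);
    assert(found->open() == 0 && dummy_is_open == 1);
    found->close();
    assert(dummy_is_open == 0);
}
```

The point of the sketch is only that the core never mentions ALSA, JACK or
any GUI by name: each of them would live in its own module and register
itself.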

> The solution isn't to make forks on forks IMO : the key is to reduce the
> > amount of code to be merged. Making Pd more modular allows to work on
> > what you want,
>
> You don't understand what I mean: there's a catch-22 (a deadlock) -
> modularizing may reduce the amount of merging in the long term, but in the
> short term, it's increasing it, and that's what will prevent
> modularization from happening in the current branches, despite its
> benefits.

Mmh ... I guess we should start by putting everything on the table,
splitting vanilla, and pausing work on devel and other branches, at first.
That way, DD's (and others') developments / merges could be easier right
after.
Since you understand French: "C'est un mal pour un bien" (short-term pain
for a long-term gain).

>> Again, my goal is not to alter pd in its behavior (yet),
> >> my goal is.
>
> Actually, I also have the goal of refactoring, of course, but I can't
> guarantee invariance of behaviour in any way...

That's true. It's a risk. But it needs to be taken, and it'll help us
understand how we can improve Pd.

> - Self explanatory naming (how many single letters variables and / or
> > funny functions names do we have ?)
>
> most names don't have to be long, especially when local and also
> especially when often used.

Let me take a quick example to illustrate my point :

(from s_inter.c, removed #ifdefs for this example)

void sys_set_priority(int higher)
{
    struct sched_param par;
    int p1 ,p2, p3;
    p1 = sched_get_priority_min(SCHED_FIFO);
    p2 = sched_get_priority_max(SCHED_FIFO);

    p3 = (higher ? p1 + 7 : p1 + 5);
    par.sched_priority = p3;
    if (sched_setscheduler(0,SCHED_FIFO,&par) != -1)
       fprintf(stderr, "priority %d scheduling enabled.\n", p3);

    if (mlockall(MCL_FUTURE) != -1)
        fprintf(stderr, "memory locking enabled.\n");
}

and

(quick and dirty version)

int sys_set_priority(int higher)
{
    struct sched_param pd_sched_settings;
    int priority_min;
    int priority_max;
    int priority;
    int error_desc;

    priority_min = sched_get_priority_min(SCHED_FIFO);
    priority_max = sched_get_priority_max(SCHED_FIFO);

    priority = (higher ? priority_min + WATCHDOG_PRIORITY_BUMP :
                         priority_min + PD_PRIORITY_BUMP);
    pd_sched_settings.sched_priority = priority;

    if (sched_setscheduler(0, SCHED_FIFO, &pd_sched_settings) != -1)
    {
        fprintf(stderr, "priority %d scheduling enabled.\n", priority);
        return (0);
    }
    else
    {
        /* save errno first: fprintf may overwrite it */
        error_desc = errno;
        fprintf(stderr, "couldn't change process priority to %d.\n",
            priority);
        fprintf(stderr, "%s (%d)\n", strerror(error_desc), error_desc);
        return (WHATEVER_ERROR_TO_BE_DEFINED);
    }
}

(memory locking is ANOTHER job, so it gets its own function!)

int sys_set_memory_lock(void)
{
    int error_desc;

    if (mlockall(MCL_FUTURE) != -1)
    {
        fprintf(stderr, "memory locking enabled.\n");
        return (0);
    }
    else
    {
        /* save errno first: fprintf may overwrite it */
        error_desc = errno;
        fprintf(stderr, "couldn't enable memory locking.\n");
        fprintf(stderr, "%s (%d)\n", strerror(error_desc), error_desc);
        return (WHATEVER_ERROR_TO_BE_DEFINED);
    }
}

I know it's far from the "best" code ever and it can be improved, but I
just wanted to illustrate my point.
For instance,
- Better error handling can be done here,
- Localization can be done here,
- Generic error function can be written,
- Bounded formatting (snprintf and friends) can be used wherever messages
  are built in buffers, to avoid overflows.
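
As an example of the "generic error function" idea, here is a minimal
sketch. pd_format_error and pd_report_error are hypothetical names, not
existing Pd functions, and the message layout is just one possible choice.
The formatting goes through snprintf, which never writes past the buffer
size it is given.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Pd: formats "<what>: <reason> (<code>)"
 * into a caller-supplied buffer.
 * Returns 0 if the whole message fit, -1 on bad arguments or truncation.
 */
static int pd_format_error(char *buffer, size_t size,
                           const char *what, int error_code)
{
    int written;

    if (buffer == NULL || size == 0 || what == NULL)
        return -1;

    written = snprintf(buffer, size, "%s: %s (%d)",
                       what, strerror(error_code), error_code);
    return (written < 0 || (size_t)written >= size) ? -1 : 0;
}

/*
 * Single place where errors get printed. Localization, or printing to the
 * Pd console instead of stderr, could later be plugged in here.
 */
static int pd_report_error(const char *what, int error_code)
{
    char message[256];

    if (pd_format_error(message, sizeof message, what, error_code) != 0)
        return -1;
    fprintf(stderr, "%s\n", message);
    return 0;
}

/* Usage: the text of the reason depends on the platform's strerror(). */
static void pd_error_example(void)
{
    char message[128];

    assert(pd_format_error(message, sizeof message, "mlockall", EPERM) == 0);
    assert(strncmp(message, "mlockall: ", 10) == 0);
    assert(pd_report_error("mlockall", EPERM) == 0);
}
```

With something like this, the two else branches above would shrink to one
call each, with errno passed in as the error code.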

An explicit name doesn't take more memory at runtime, and saves the
developer brain power at coding time ;)
And splitting functions won't make Pd noticeably slower, since these
functions are called ONCE or TWICE throughout the whole program's runtime ;)
But it'll improve readability, flexibility, error handling, debugging and
testability.

> - Getting rid of stuff.h which is a nonsense to me and having .h in
> modules
> > when required.
>
> I agree about splitting s_stuff.h or otherwise cleanly indicating what its
> sections are; however I don't know what you mean by having .h "in"
> modules. Is a "module" some kind of directory?

Yes, like
/audio
|-/pd_audio.c, pd_audio.h
|--/alsa
|--/jack
|--/whatever

s_stuff.h already has sections. But it is confusing and unsafe that all
files include "s_stuff.h" and can access anything declared in it! It goes
against separation of variables, functions, structures and ... modularity!
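
Here is a sketch of what a per-module header could expose instead. It is
written as a single file so it can be compiled as-is, with comments
marking what would go in a hypothetical audio/pd_audio.h and what would
stay private in audio/pd_audio.c. The pd_audio type and its functions are
assumptions for illustration only, not existing Pd code.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- what a hypothetical audio/pd_audio.h would declare ---- */
typedef struct pd_audio pd_audio;   /* opaque: fields are not visible */

pd_audio *pd_audio_new(int sample_rate);
int pd_audio_sample_rate(const pd_audio *audio);
void pd_audio_free(pd_audio *audio);

/* ---- what would stay private inside audio/pd_audio.c ---- */
struct pd_audio {
    int sample_rate;
};

pd_audio *pd_audio_new(int sample_rate)
{
    pd_audio *audio;

    if (sample_rate <= 0)
        return NULL;
    audio = malloc(sizeof *audio);
    if (audio == NULL)
        return NULL;
    audio->sample_rate = sample_rate;
    return audio;
}

/* Returns the sample rate, or -1 when given a NULL handle. */
int pd_audio_sample_rate(const pd_audio *audio)
{
    return (audio != NULL) ? audio->sample_rate : -1;
}

void pd_audio_free(pd_audio *audio)
{
    free(audio);    /* free(NULL) is a no-op */
}

/* Client code: it only uses the three declared functions. */
static void pd_audio_example(void)
{
    pd_audio *audio = pd_audio_new(44100);

    assert(audio != NULL);
    assert(pd_audio_sample_rate(audio) == 44100);
    pd_audio_free(audio);
}
```

Once the struct definition really lives in the .c file, other files can't
reach into the fields anymore, which is exactly what a catch-all header
can't guarantee.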

> Maybe Pd's internals architecture could be a nice topic for a little IRC
> > meeting ?
>
> I stopped caring about trying to organise PureData developers meetings
> some time ago. I think we've had seven of them. It didn't catch on.
> I decided to call DesireData meetings instead, but the 2nd meeting is
> loooong overdue.

Communication is key. IRC isn't the best tool but it's a start.
Meetings are nice too :) The one we had in Paris during the NIME06 was
interesting.
I like Launchpad, but we could also make better use of the puredata.info
wiki, or set up Trac or anything like this.

> Who's in ?
>
> I could be in. Is this going to happen on #dataflow ?

Wherever you guys want. I think we need Miller on board though.
We need to know who's in, it's the community's job :)

I'd like to have Miller's point of view on this whole conversation too.

++
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Doxyfile
Type: application/octet-stream
Size: 10147 bytes
Desc: not available
URL: <http://lists.puredata.info/pipermail/pd-dev/attachments/20060912/4580f6d1/attachment.obj>