[PD-dev] Refactoring Pure Data

Tue Sep 12 17:55:41 CEST 2006

Hi all,

Sorry not to jump in before now... I'm in the middle of getting ready
for the school year here and don't have much time for longer-range
planning at the moment.

My top priority for Pd is to finish getting the 'language' defined.  I
don't see this as an open-ended pursuit; another year or so of fooling
with 'data structures' seems to be the main remaining thing.  Until this
is fairly well under control I don't see much point in rewriting existing
code; in particular, I want to do some heavy run-time profiling to find
out what really needs improvement.  But this is pointless until I know
what typical patterns of usage will look like, particularly as regards
'data' traversal, which I think has some severe inefficiencies now.

I think Vincent is right on about the need for better error handling.
This is also, partly, a design issue, since patches themselves sometimes
need a mechanism for detecting errors.  I've spent some time thinking
about this and eventually I want to make a formal structure for flagging and
inquiring about errors... but not just yet; the data traversal objects
need to get finalized first.

cheers
Miller

On Tue, Sep 12, 2006 at 04:05:50PM +0200, Vincent Lordier wrote:
> Hi again ;)
> 
> >>So, supposing you want to work with Miller
> >> Why not ?
> >
> >Because you want to refactor Pd and because, I suppose that you've read
> >pd-list and/or pd-dev for some time.
> 
> 
> 
> I did. And I've seen how this project works a little.
> The thing is, it doesn't work as a truly open community, but this can
> change.
> 
> 
> >To me, I shouldn't have to "work with Miller" or "work with Mathieu" or
> >> anything like this,
> >
> >So you are going to work alone? Or you're going to reunite the family?
> 
> 
> 
> At first, a little splitting here and there could be done by one single
> person and I can do this.
> Now, as I said, the real issue is "where do I cut", and "how", and that's
> architecture and design, and that has to be done with the whole "family".
> And as you pointed out, changing anything today means lots of work in
> merging.
> 
> But if we don't want to do merging, we need to break down vanilla into
> pieces, and put these pieces together with others, like DD's GUI.
> 
> To me, making DesireData (or any other project) a viable solution starts
> with splitting vanilla into "what you want to work on" and "the rest".
> Then, natural selection will come and we'll pick which solution is best.
> 
> 
> >simply because we are wasting way too much energy doing these diffs /
> >> merge stuff. Having different branches is fine and even desirable (!),
> >
> >If there's not going to be merging then what do you want the branches to
> >do? split further apart?
> >
> >> as long as it leads to more generic modules that can be tested and
> >> refined in more diverse cases.
> >
> >do you do unit-testing?
> 
> 
> 
> Here's my point of view : to be able to do unit testing, we need functions
> that are actually "testable", and that means they are :
> - small
> - not complex (small Cyclomatic number)
> - doing one single thing
> - handling errors
> - easily isolated from the rest of the program
> 
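A minimal sketch of what a function meeting these criteria could look like. It pulls the priority calculation from the s_inter.c excerpt quoted further down into a pure function with no system calls, so it can be tested in isolation. The name choose_priority and the clamping to the maximum are assumptions made for illustration, not existing Pd code; the offsets 7 and 5 are the ones that appear in that excerpt.

```c
#include <assert.h>

/* Offsets taken from the s_inter.c excerpt; the constant names are hypothetical. */
enum { WATCHDOG_PRIORITY_BUMP = 7, PD_PRIORITY_BUMP = 5 };

/* Hypothetical pure helper: picks a priority from the scheduler's range.
   It makes no system calls, so a unit test can call it directly.
   Clamping to priority_max is an added assumption, not Pd's behaviour. */
int choose_priority(int priority_min, int priority_max, int higher)
{
    int priority;

    assert(priority_min <= priority_max);   /* precondition */

    priority = priority_min +
        (higher ? WATCHDOG_PRIORITY_BUMP : PD_PRIORITY_BUMP);
    if (priority > priority_max)
        priority = priority_max;
    return priority;
}
```

A test then needs only plain assertions: with a range of 1 to 99, choose_priority(1, 99, 0) gives 6 and choose_priority(1, 99, 1) gives 8, while a narrow range of 1 to 4 clamps the result to 4.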
> 
> I believe more in "good design leads to fewer bugs", and I'm not a big
> fan of unit testing myself. But unit tests are an honorable goal.
> 
> This is how we could avoid quite a few stupid bugs :
> => just look at callgraphs using Doxygen with callgraphs enabled, it's free
> & easy and it'll show you a lot about the program structure (and where to
> "cut")
> Attached is an example Doxyfile with callgraphs enabled. Takes a while to
> generate, and produces big callgraphs (from / to) images.
> => running splint, uno, and others. It also shows some little pieces of dirt
> here and there.
> 
> CCCC can also help us keep track of the size of the code and its
> complexity, for lack of something like RSM, and to know where to start
> working first.
> We don't necessarily need these tools, but they can help.
> 
> Some of you guys, Mac users, have access to Shark (part of XCode) for
> instance (http://developer.apple.com/tools/sharkoptimize.html).
> I don't. But I'd be interested to see the results published on PD and see
> its memory usage, and to profile the application to have an idea of where we
> should focus efforts.
> Sure Kprof and others exist too, but it's not the same :)
> There's also http://www.drugphish.ch/~jonny/cca.html
> 
> Anyway, the point is we have loads of tools to check and split the program
> and I used them on Pd.
> Now, my initial question was : "ok, I found that lots can be improved : how
> do I propose improvements to the community that imply refactoring and
> splitting ?"
> And I see why everyone so far answered "You're getting into social issue
> man, good luck !"
> And that's why I'm saying : "Fine, let's break things down so we can all
> work on Pd's development then"
> Now, I can't and don't want to do that by myself, behind closed doors or on
> a separate repository or do another useless fork : I need the support of
> the "community", or else I won't be able to make any changes... like many
> others who have been discouraged.
> 
> 
> >[Pd] deserves a clear roadmap,
> >
> >Several people have their own roadmaps. Does it need one big shared one?
> >Just how shared would that one be?
> 
> 
> Architecture goals should be shared. Then, on each SW component, everyone
> can do what they want, as long as they respect / define / communicate on
> interfaces.
> Again : it works with pd / externals.
> m_pd.h is an API, Johannes has put together a documentation, and
> developments of externals take place.
> 
> The goals of each person are achievable as long as they don't interfere with
> others' : and that's where separating the codebase and functions is
> essential.
> The "core" should be as minimalist as possible, so it allows anyone to build
> their own "Pd" on top of it. (with any GUI, with any components, float or
> int, externals or not, network or not, full pack, whatever).
> 
> The final distribution for users should still include everything possible as
> Hans is doing (congrats on the build farm btw).
> But the code should allow us to carry on projects like PDa, ePD (embedded
> PD), make PD core into a plugin, or a standalone app with GriPD-reloaded,
> make it available on webservers, or anything people have been talking about
> for a while now.
> 
> >I think this work on architecture will be mandatory as we add new
> >> features (like video~)
> >
> >Do you think that adding a sixth video subsystem to pd is going to solve
> >anything?
> 
> 
> 
> I absolutely don't. What I think is adding it might present the advantage of
> making pd more modular than it is today, otherwise it's going to be hell.
> Cuz we might ask ourselves : how do we integrate these new objects ? We need
> to treat audio, MIDI, video like any other external !
> We need to reduce Pd's core to the minimum possible (=put CoreLibs, Audio,
> MIDI, GUI aside of it).
> 
> >The solution isn't to make forks on forks IMO : the key is to reduce the
> >> amount of code to be merged. Making Pd more modular allows to work on
> >> what you want,
> >
> >You don't understand what I mean: there's a catch-22 (a deadlock) -
> >modularizing may reduce the amount of merging in the long term, but in the
> >short term, it's increasing it, and that's what will prevent
> >modularization from happening in the current branches, despite its
> >benefits.
> 
> 
> 
> Mmh ... I guess we should start by putting everything on the table and
> splitting vanilla and stop working on devel and other branches, at first.
> That way, DD's (and others) developments / merges could be easier right
> after.
> Since you understand French: "C'est un mal pour un bien" (short-term pain
> for long-term gain).
> 
> 
> >>Again, my goal is not to alter pd in its behavior (yet),
> >>> my goal is.
> >
> >Actually, I also have the goal of refactoring, of course, but I can't
> >guarantee invariance of behaviour in any way...
> 
> 
> That's true. It's a risk. But it needs to be taken, and it'll help us
> understand how we can improve Pd.
> 
> 
> >- Self explanatory naming (how many single letters variables and / or
> >> funny functions names do we have ?)
> >
> >most names don't have to be long, especially when local and also
> >especially when often used.
> 
> 
> 
> Let me take a quick example to illustrate my point :
> 
> (from s_inter.c, removed #ifdefs for this example)
> 
> void sys_set_priority(int higher)
> {
>    struct sched_param par;
>    int p1 ,p2, p3;
>    p1 = sched_get_priority_min(SCHED_FIFO);
>    p2 = sched_get_priority_max(SCHED_FIFO);
> 
>    p3 = (higher ? p1 + 7 : p1 + 5);
>    par.sched_priority = p3;
>    if (sched_setscheduler(0,SCHED_FIFO,&par) != -1)
>       fprintf(stderr, "priority %d scheduling enabled.\n", p3);
> 
>    if (mlockall(MCL_FUTURE) != -1)
>        fprintf(stderr, "memory locking enabled.\n");
> }
> 
> and
> 
> (quick and dirty version)
> 
> int sys_set_priority(int higher)
> {
>    struct sched_param pd_sched_settings;
>    int priority_min;
>    int priority_max;
>    int priority;
>    int error_desc;
> 
>    priority_min = sched_get_priority_min(SCHED_FIFO);
>    priority_max = sched_get_priority_max(SCHED_FIFO);
> 
>    priority = (higher ? priority_min + WATCHDOG_PRIORITY_BUMP :
>                         priority_min + PD_PRIORITY_BUMP);
>    pd_sched_settings.sched_priority = priority;
> 
>    if (sched_setscheduler(0, SCHED_FIFO, &pd_sched_settings) != -1)
>    {
>       fprintf(stderr, "priority %d scheduling enabled.\n", priority);
>       return (0);
>    }
>    else
>    {
>        error_desc=errno;
>        fprintf(stderr, "couldn't change process priority to %d.\n",
>                priority);
>        fprintf(stderr, "%s (%d)\n", strerror(error_desc), error_desc);
>        return (WHATEVER_ERROR_TO_BE_DEFINED);
>    }
> }
> 
> (memory locking is ANOTHER function, so we separate it !)
> 
> int sys_set_memory_lock(void)
> {
>    int error_desc;
> 
>    if (mlockall(MCL_FUTURE) != -1)
>    {
>        fprintf(stderr, "memory locking enabled.\n");
>       return(0);
>    }
>    else
>    {
>        error_desc=errno;
>        fprintf(stderr, "couldn't enable memory locking.\n");
>        fprintf(stderr, "%s (%d)\n", strerror(error_desc), error_desc);
>        return (WHATEVER_ERROR_TO_BE_DEFINED);
>    }
> }
> 
> 
> I know it's not the "best" code ever and it can be improved, but I just
> wanted to illustrate my point.
> For instance,
> - Better error handling can be done here,
> - Localization can be done here,
> - Generic error function can be written,
> - Safe fprintf can be used to avoid overflow.
> 
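As a rough sketch of the "generic error function" idea from the list above: both refactored functions repeat the same errno / strerror / fprintf sequence, which could be folded into one helper. The names pd_format_sys_error and pd_report_sys_error are hypothetical, not part of Pd, and snprintf is used here as one way of bounding the formatted output; localization is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of Pd): writes "<what>: <strerror> (<errno>)"
   into buf. snprintf never writes more than bufsize bytes, including the
   terminating NUL, and returns the length the full message would have had. */
int pd_format_sys_error(char *buf, size_t bufsize,
                        const char *what, int error_desc)
{
    assert(buf != NULL && bufsize > 0 && what != NULL);
    return snprintf(buf, bufsize, "%s: %s (%d)",
                    what, strerror(error_desc), error_desc);
}

/* Hypothetical helper: prints the message on stderr and hands the errno
   value back, so a caller can write `return pd_report_sys_error(...)`. */
int pd_report_sys_error(const char *what, int error_desc)
{
    char msg[256];

    pd_format_sys_error(msg, sizeof msg, what, error_desc);
    fprintf(stderr, "%s\n", msg);
    return error_desc;
}
```

With such a helper, the else branch of sys_set_memory_lock() could shrink to a single line such as `return pd_report_sys_error("couldn't enable memory locking", errno);`, assuming the errno value itself is an acceptable error code to return.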
> An explicit name doesn't take more memory at runtime, and saves the dev
> brain power at coding time ;)
> And splitting functions won't make Pd noticeably slower, since these
> functions are called ONCE or TWICE throughout the whole program's runtime ;)
> But it'll allow comprehension, flexibility, error handling, debug and
> testability.
> 
> 
> >- Getting rid of stuff.h which is a nonsense to me and having .h in
> >> modules when required.
> >
> >I agree about splitting s_stuff.h or otherwise cleanly indicating what its
> >sections are; however I don't know what you mean by having .h "in"
> >modules. Is a "module" some kind of directory?
> 
> 
> 
> Yes, like
> /audio
> |-/pd_audio.c, pd_audio.h
> |--/alsa
> |--/jack
> |--/whatever
> 
> stuff.h already has sections. But it is confusing and unsafe that all
> files include "stuff.h" and can access anything declared in it ! It goes
> against the separation of variables, functions, structures and ... modularity !
> 
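One possible shape for such a module, sketched in a single file for brevity. Everything here is hypothetical (pd_audio_backend, pd_audio_register, pd_audio_find and the dummy backend are illustrative names, not existing Pd API): the first part is what a narrow audio/pd_audio.h might expose, the second is state that stays private to pd_audio.c, and the last part stands in for a backend directory such as alsa/ or jack/.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* --- what a hypothetical audio/pd_audio.h would expose ------------- */
typedef struct pd_audio_backend {
    const char *name;
    int  (*open)(int sample_rate);   /* returns 0 on success, -1 on error */
    void (*close)(void);
} pd_audio_backend;

int pd_audio_register(const pd_audio_backend *backend);
const pd_audio_backend *pd_audio_find(const char *name);

/* --- what audio/pd_audio.c would keep private (file-static) -------- */
#define PD_AUDIO_MAX_BACKENDS 8
static const pd_audio_backend *backends[PD_AUDIO_MAX_BACKENDS];
static size_t backend_count = 0;

int pd_audio_register(const pd_audio_backend *backend)
{
    assert(backend != NULL && backend->name != NULL);
    if (backend_count >= PD_AUDIO_MAX_BACKENDS)
        return -1;                   /* table full */
    backends[backend_count++] = backend;
    return 0;
}

const pd_audio_backend *pd_audio_find(const char *name)
{
    size_t i;

    for (i = 0; i < backend_count; i++)
        if (strcmp(backends[i]->name, name) == 0)
            return backends[i];
    return NULL;                     /* no backend with that name */
}

/* --- stand-in for a backend directory (alsa/, jack/, ...) ---------- */
static int  dummy_open(int sample_rate) { return sample_rate > 0 ? 0 : -1; }
static void dummy_close(void) { }
static const pd_audio_backend dummy_backend =
    { "dummy", dummy_open, dummy_close };

int pd_audio_register_dummy(void)
{
    return pd_audio_register(&dummy_backend);
}
```

The point of the layout is that other files would see only the two declared functions and the struct, while the backend table is unreachable from outside the module, unlike anything declared in a header that every file includes.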
> 
> >Maybe Pd's internals architecture could be a nice topic for a little IRC
> >> meeting ?
> >
> >I stopped caring about trying to organise PureData developers meetings
> >some time ago. I think we've had seven of them. It didn't catch on.
> >I decided to call DesireData meetings instead, but the 2nd meeting is
> >loooong overdue.
> 
> 
> 
> Communication is key. IRC isn't the best tool but it's a start.
> Meetings are nice too :) The one we had in Paris during the NIME06 was
> interesting.
> I like launchpad, but we could also make better use of the puredata.info
> wiki, or set up trac or anything like this.
> 
> 
> >Who's in ?
> >
> >I could be in. Is this going to happen on #dataflow ?
> 
> 
> 
> Wherever you guys want. I think we need Miller on board though.
> We need to know who's in, it's a community's job :)
> 
> I'd like to have Miller's point of view on this whole conversation too.
> 
> ++

> _______________________________________________
> PD-dev mailing list
> PD-dev at iem.at
> http://lists.puredata.info/listinfo/pd-dev