[PD-dev] SVN?

Jeff Rose rosejn at gmail.com
Wed Oct 24 14:39:19 CEST 2007


There are a couple of key factors in this decision that seem to be
getting confused with each other.  I'm new to PD, but I've been using
Git for over a year now with my own repositories.  In various consulting
projects over the last few years I've developed with CVS, SVN,
Mercurial, Darcs, Arch and Bazaar-ng.  There are several facets to the
decision.

1) Distributed Development

The first, and I think most important, is the development model.
Distributed development supports a different kind of collaboration, one
that I think is proving quite advantageous for open source software.
It lets groups of people organize themselves, get a piece of code
working, and then share it with the world.  Permissions and management
of a central repository are no longer an issue, because no central
repository exists.  Instead Hans-Christoph, for example, would manage
an official pd-extended branch that had the fully tested features,
while everyone else could still share the latest and greatest stuff
without a centralized bottleneck.
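
As a rough illustration of that pull-based model, here is a toy Python
sketch.  The Repo class, the owner names and the change names are all
hypothetical, and real Git merges commit histories rather than copying
named changes, so take this as a picture of the workflow only.

```python
class Repo:
    """Toy repository: just an owner and a set of named changes."""

    def __init__(self, owner):
        self.owner = owner
        self.commits = set()

    def commit(self, change):
        # Committing is purely local; nobody else is affected.
        self.commits.add(change)

    def pull(self, other, accept=lambda change: True):
        # The puller decides what to take; 'other' is never modified.
        for change in other.commits:
            if accept(change):
                self.commits.add(change)


# Hypothetical usage: a contributor publishes two changes, the
# maintainer of the "official" repo pulls only the tested one.
official = Repo("maintainer")
contributor = Repo("contributor")
contributor.commit("tested-fix")
contributor.commit("experiment")
official.pull(contributor, accept=lambda change: change == "tested-fix")
```

Anyone else can still pull "experiment" straight from the contributor,
which is the point: nothing has to pass through the official repository
first.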

2) Personal development advantages

In Git you have the whole repository locally.  This takes more storage
space, which is incredibly cheap, but saves LOTS of time.  SVN can never
merge as well as Git because you don't have past versions of the
repository locally to merge against.  Additionally, with Git you get
local versioning, local branching, and tons of distributed back-ups of
the repository.  I'm always working on the train, for example, and with
Git I can maintain a much better log of my development because I commit
all the time without needing internet access.  Branches in Git, even for
huge projects, are basically free, in both space and time.  This changes
the way you develop.  You can make a branch for any little idea you want
to mess with or any tricky feature you are working on.  It's then quite
easy to merge these branches, share them with friends, update them from
others' repositories etc.  This is a level of power and control that you
cannot get with SVN.
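
To illustrate why having past versions helps merging: with the common
ancestor at hand, a tool can tell which side actually changed a file.
The sketch below is a toy three-way merge at whole-file granularity.
The function name and the dict-of-files representation are assumptions
made for illustration; Git's real merge works at a much finer level.

```python
def three_way_merge(base, ours, theirs):
    """Merge two sets of files against their common ancestor.

    Each argument maps a path to its content; a missing path means the
    file does not exist in that version.  Returns (merged, conflicts).
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o          # both sides agree
        elif o == b:
            result = t          # only their side changed it
        elif t == b:
            result = o          # only our side changed it
        else:
            conflicts.append(path)  # both changed it differently
            continue
        if result is not None:  # None means the file was deleted
            merged[path] = result
    return merged, conflicts
```

Without the base version, the two middle cases cannot be told apart
from a real conflict, since all the tool sees is that the two sides
differ.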

3) User Interface

Git was originally designed to function as a back-end suite that would
support easy-to-use front-end utilities.  That has changed over time,
and it now includes a nice set of commands and tools that make it work
just like you would expect.  For sure Git doesn't have the GUI plugins
that SVN or CVS have, and it will take some learning, but I think the
benefits in the long term will far, far outweigh the initial investment.

There is a Tk interface, gitk, which is incredibly useful for visually
inspecting the history of a repository, merges, branches etc.  There is
also gitweb, which lets you view the repository online.  Git also lets
you publish a repository just by copying a directory to an
HTTP-accessible location.  This means anyone can share their ideas,
features, abstractions etc. without setting up a server, getting
permission for a centralized location or anything.  Also relevant to the
user experience is speed.  Git blows away all the other systems in terms
of speed.  It does everything faster, and you really notice this because
it changes what you do.  In some of the other systems, like Bazaar-ng,
committing, pulling and pushing took so long that I wouldn't do them
that often, but with Git it's all so cheap I do it every time I get a
new unit test to pass.

4) Technical

Git writes the repository into a highly compressed format, and then it
does not mess with the files.  Append-only is the standard mode of
operation, except when you occasionally compress the whole thing, at
which point it can verify the repository's integrity.  It is also much
cleaner to work with in terms of permissions and access, simply because
of the whole usage model.  I've had many experiences with Subversion
where the repository had to be recovered because of permissions issues
and/or corruption.  This is very bad, especially since it is a
centralized server that everyone is counting on.  It takes a root user
to go in and run "svnadmin recover", which in my opinion is a command
that shouldn't need to exist in something as important as a source
repository.
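
Here is a minimal sketch of the append-only, compressed, verifiable
storage idea, under a simplified model: each object is named by the
SHA-1 of its content and stored zlib-compressed.  The ObjectStore class
is hypothetical and this is not Git's actual on-disk format.

```python
import hashlib
import zlib


class ObjectStore:
    """Toy content-addressed store; existing objects are never rewritten."""

    def __init__(self):
        self._objects = {}  # hex digest -> compressed bytes

    def put(self, data: bytes) -> str:
        key = hashlib.sha1(data).hexdigest()
        # Append only: if the key already exists, leave it untouched.
        self._objects.setdefault(key, zlib.compress(data))
        return key

    def get(self, key: str) -> bytes:
        data = zlib.decompress(self._objects[key])
        if hashlib.sha1(data).hexdigest() != key:
            raise ValueError("corrupt object: " + key)
        return data

    def verify(self):
        """Return the keys whose stored content no longer matches its name."""
        bad = []
        for key in sorted(self._objects):
            try:
                self.get(key)
            except (ValueError, zlib.error):
                bad.append(key)
        return bad
```

Because an object's name is derived from its content, corruption shows
up as a mismatch when the store is checked, rather than going unnoticed
until someone needs the data.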

Keith Packard, who runs the X.org project, wrote a very good post
detailing his research into, and opinions on, source control.  It's a
bit old now, but worth the read:
   http://keithp.com/blog/Repository_Formats_Matter/index.html

Sorry for the long post, but this is something I have dealt with a lot
recently.  In my mind distributed versioning is the only proper way to
run a modern open source project.  PD is a great piece of software with
a seemingly cool and diverse group of developers, which seems perfect
for a decentralized model.  Either way, I think the decision should
first be made between a centralized and a distributed versioning system;
after that it comes down to figuring out compatibility, usability etc.
It's great news that Git now runs on Windows though, because I think it
is far better than the others.  To quote one of Hans-Christoph's recent
signature lines:

Mistrust authority - promote decentralization.  - the hacker ethic

:-)

Ciao,
Jeff

David Plans Casal wrote:
> Speaking from experience, having suffered several migrations to prcs  
> from cvs, to svn from cvs, and to darcs from svn, in commercial  
> projects, I have to say the only one that worked well, made sense,  
> and was least painful was cvs-->svn, and I'd vote for that.
> 
> David
> 
> On 23 Oct 2007, at 19:41, Stephen Sinclair wrote:
> 
>>> are used ... beside that, i am not sure, how access to specific  
>>> parts of
>>> the repository can be restricted ...
>> You don't have to restrict anything.
>> If someone messes with your work, you simply don't merge what they  
>> did.
>> They're free to do whatever they want in _their_ repository, but
>> you're not forced to accept it into _your_ "official" repo for your
>> project.
>>
>> Of course, other people would still be free to check out what the
>> other guy did, even if it's not officially part of the main tree.  The
>> trick here is to think in terms of "pulling" rather than "pushing".
>> Since no one is pushing, no one can step on anyone else's toes.
>>
>> Anyways, I think under a system like git, the Pd kernel would have its
>> own repo, and other subprojects (abstraction collections, externals,
>> libraries) would have their own repos.  A project like "Pd extended"
>> could then simply be a "super-repo" collecting specific repos as
>> submodules.  Each submodule would be tagged to a specific
>> version/branch.
>>
>> I totally understand the hesitation, git does seem somewhat
>> complicated at first.  Once I finally understood the whole distributed
>> way of thinking though, I couldn't help but dive in, and now I like
>> the idea so much I think I can't go back...  and actually, I
>> discovered that the interface is really not that hard.  Mostly you
>> just use "clone", "pull", and "commit".
>>
>> Another nice thing about it:  you don't even have to "officially"
>> switch all at once.  Someone can just start a git repo based off the
>> CVS, and you can go from there... whoever does that is responsible for
>> keeping his repo up-to-date with the CVS and vice-versa.  He can check
>> his changes back into the CVS whenever he wants but still work with
>> git on his own computer.  It allows for a gradual weaning away from
>> the central repository instead of requiring everyone to switch at
>> once.
>>
>> I won't try any more to push the idea, but I think it's worth  
>> considering.
>>
>>
>> Steve
>>
>> _______________________________________________
>> PD-dev mailing list
>> PD-dev at iem.at
>> http://lists.puredata.info/listinfo/pd-dev