[PD-dev] urlloader, a web-based abstraction loader

Stephen Sinclair radarsat1 at gmail.com
Thu Jul 10 02:40:44 CEST 2008


Hello,

I have just submitted patch #, which implements a new loader for Pd.
This allows one to specify in a patch a "base URL" in which missing
abstractions are expected to be found.  The loader downloads an index
from each listed URL, and if an index lists an abstraction that
hasn't yet been loaded, the loader downloads the patch file to a
cache and opens it from there.  This is an idea I've been toying with
for a while, and
I've given it some consideration on several levels.  I'm not even
convinced it's a great idea, but I think it's something some people
might be interested in.  Lately I haven't had time to work on it, but
I've got it working well enough that perhaps I should publish it
before it starts to just stagnate in my home folder.
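
In shell terms, what the loader does for each listed URL is roughly
the following (the cache path shown is illustrative only):

curl http://localhost/pd/pd.index
curl -o ~/cache/unknown.pd http://localhost/pd/unknown.pd

that is, fetch the index, then fetch any missing abstraction into a
local cache so Pd can open it from there.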

I thought I should just send a follow-up email explaining how to use
it and expanding on some of the points listed in the patch comments:

First, to test the patch you'll need to do a few things.

1) Get it compiled and loaded.

The code is in externals/loaders/urlloader.  Put it in your startup
path and make sure it gets loaded.  (pd should post some messages when
it is loaded.)  You'll need the libcurl-dev package installed.
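
For reference, the whole step might look something like this on
Ubuntu (the package name and default make target are assumptions;
adjust for your setup):

sudo apt-get install libcurl4-openssl-dev
cd externals/loaders/urlloader
make
pd -lib urlloader

The -lib flag just asks Pd to load the external at startup, after
which it should post its messages to the Pd window.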

2) Make the files accessible through http.

There is a directory under urlloader called remote.  The files in it
must be served by a web server.  For testing, I have a local Apache
running on my laptop, and the files are hosted at
http://localhost/pd.  That /pd location is just a symlink from
/var/www/pd to the remote folder.
Note that the test patch (test.pd) points to this location, so if you
decide to use a different location, you'll have to modify it.
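
For the record, the hosting setup amounts to a single symlink (the
checkout path here is hypothetical), and you can sanity-check it with
curl:

sudo ln -s ~/pd-svn/externals/loaders/urlloader/remote /var/www/pd
curl http://localhost/pd/pd.index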

3) Sign the patches in the remote folder.

The urlloader requires GPG (more on this below), so you'll need a
working gpg installation to test this code.  If you already have a
GPG key, use it; otherwise make one.  (Not hard.)
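If you need to create one, a single interactive command does it:

gpg --gen-key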
The loader expects detached, ASCII-armored signatures ending in
.asc, so issue these two commands:

gpg -b --armor -s unknown.pd   # writes unknown.pd.asc
gpg -b --armor -s another.pd   # writes another.pd.asc

4) Open test.pd

It should open the test patch, and then the [another] and [unknown]
objects should be instantiated.  Open them and you'll see that the
actual location of the files is a cache directory under your home
folder.


Some points:

- Requires GPG to verify downloaded patches before loading them.

Since this patch essentially allows code downloaded from a website to
run locally on your machine, it can obviously be a screaming security
risk.  Therefore I did not want to publish this code before
implementing this feature.  However, it's got a few ugly user
interface implications.  Of course not everyone knows how to use GPG
(I learned how just for this actually).  But also, what if you don't
have the key needed to verify a signature?  Currently the loader
just posts a notice to the Pd window, and you have to go fetch the
key yourself.  The person publishing the abstractions must provide
their public key for this purpose, and you then have to import it
into your keyring manually with 'gpg --import'.
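
Concretely, the key exchange looks like this on the command line
(the key ID and file names are hypothetical); the last command is
essentially the check the loader performs on each downloaded file:

gpg --export --armor DEADBEEF > publisher-key.asc   # publisher side
gpg --import publisher-key.asc                      # user side
gpg --verify unknown.pd.asc unknown.pd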

- Also verifies the files' MD5 sums

I think this is probably redundant with the GPG signing mentioned
above.  Originally I was just going to use the MD5 sum of a file to
verify its integrity.  However, I realized that trust issues go much
deeper than that, since a loaded abstraction can easily specify
another URL and then download patches from the new location.  You
wouldn't want your computer automatically following and downloading
anything it sees, so I saw it as a necessity to support some kind of
trust infrastructure.  But I left the MD5 code in the patch for now
anyway.  The pd.index file that sits in the remote folder is just the
output of the 'md5sum' command.
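
For illustration, regenerating the index from inside the remote
folder is just:

md5sum *.pd > pd.index

which produces one line per file of the form (hash made up):

3858f62230ac3c915f300c664312c63f  unknown.pd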

On the TODO list:

- Port to non-Linux operating systems.

I developed this on Ubuntu.  It should be portable to other
systems, provided libcurl and gpg work there.

- Determine proper caching rules.

Currently patches are cached for 100 seconds, then downloaded again
after they expire.  This is just for testing purposes.  In the real
world, a more intelligent caching scheme would be necessary.  (That might be a
good use of the MD5 sum in the index file, now that I think about it.)
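
Since pd.index is already in md5sum's native format, one cheap
scheme would be to run md5sum in check mode against the cached copies
and re-download only the files that fail; a sketch of the idea, not
what the code currently does:

md5sum -c pd.index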

- Aliases for URLs which could be prepended to object names.
  (e.g., urlA:myobject)

If more than one URL lists the same object, the first one will be
downloaded, which might as well be nondeterministic from the user's
point of view.  It would be nice to let the user specify which
URL an object belongs to.  Using some kind of alias prepended to the
object name might be a way of doing it, with the alias specified in
the [baseurl] object.  In fact, these aliases should be scoped to
the canvas the [baseurl] objects reside in, whereas right now
everything
is pretty much global.  (It would probably require modification of the
g_list data structure, if I understand right.)

- Deal sanely with versioning.  (support svn URLs?)

This brings up a problem with versions.  What if a patch depends on
an abstraction at some server, and the owner of that server later
makes an incompatible change that breaks the patch?  It would be nice
to support some kind of versioning, so you could say not only 'this
patch depends on xyz', but 'this patch depends on version 2.5 of
xyz'.  I was thinking that one way to
support this might be to include support for subversion URLs, or at
least websvn, so that specific versions of files would be downloaded,
but still easily updated to a newer version.
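
For example, plain svn can already fetch a file pinned to a specific
revision (the URL and revision number here are hypothetical):

svn cat -r 2500 http://example.org/svn/abstractions/xyz.pd > xyz.pd

so the loader could conceivably do the equivalent through
websvn-style URLs.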

- Determine whether this is even a good idea.

I'm not completely convinced that downloading patches and running them
directly is the best way to deal with abstractions hosted in the
'cloud'.  It might be better to have a coherent and trusted repository
system like APT.


Please try it and give some feedback.  If you'd like me to commit this
work to the svn, please feel free to grant me access, though I'll
likely continue developing it on my own either way.


Steve



