[PD-dev] deken: major revamp of the packaging format for externals

IOhannes m zmölnig zmoelnig at iem.at
Mon Feb 26 17:06:12 CET 2018


hi all,

TL;DR: i suggest a new deken fileformat to be ready for double-precision
Pd (once it comes) and other future vagaries. it requires you to update
the search plugin and (if you use them) the cmdline tools.


(this is a (longish) discussion/draft/poll. it's implemented, but not
deployed yet.)

everybody loves deken.
thanks joe for the ignition and chr13m for the original implementation.

now, 3 years later, i would like to fix a few inconsistencies and make
deken fit for the future.

# backstory

smaller issues and real problems we have been seeing with the current
implementation are:
- archive formats depend on platform
- the package version cannot be parsed reliably
- the architecture strings are not well-defined, leading to much
confusion; as a consequence, some parts of the architecture string are
simply ignored by the current deken-plugin.

unrelated to deken, there has been some work done in pd-vanilla land to
get native Win64 binaries to work (for deken this means: a new
architecture).
and there have been rumours, that we might get double-precision builds
for Pd - adding a whole new can of worms with incompatible externals.

now deken is not ready for double-precision externals. as things are, it
will happily suggest downloading a single-precision external for a
double-precision Pd and vice versa, making Pd fail to load the
suggested-as-compatible externals (in the best case) or simply crash Pd
(in the worst case).

the latter triggered a review of the deken packaging format, with the
main task to make deken ready for the future.


the result i've come up with so far is a new filename scheme for
external packages, which allows us to get rid of the design problems of
the old one while keeping the pool of existing deken packages (which
have well known properties) - and still be able to use it from a user's POV.

# suggestion for a new format

## internal structure
the internal structure of a deken package stays the same as it already
is: a simple archive of a directory tree containing whatever things the
packager thinks need to be packaged.

## file format
however, all deken packages should uniformly use the zip-format (instead
of requiring linux&osx systems to support both zip and tgz).

## filename format
the only real changes affect the format of the *filename* - as the
filename contains all the meta-information of a deken-package.

so my suggestion is, that the filename must have the form:

`<libraryname>[v<version>](<arch1>)(<arch2>).dek`

Example:

 helloworld[v1.0.0](Linux-i386-32)(Windows-i386-32)(Darwin-amd64-32).dek


### libraryname
at the beginning of the filename comes the libraryname.
it may contain any characters with the exception of square brackets
(`[]`) and parentheses (`()`).

### version
e.g. "[v3.14]"

the version of a library comes after the libraryname and must be given
in square brackets (`[]`). the actual version number must be prefixed
with `v`.
(using a prefix allows us to easily extend the filename format to
contain more meta-info if we ever have the need).

the version string may contain any characters with the exception of
square brackets (`[]`) and parentheses (`()`).

NOTE: this is different from the current format (which used `-v<version>-`)

### architecture
e.g. "(Darwin-ppc-32)(Darwin-i386-32)(Darwin-amd64-32)"

to understand whether an external is compatible with the currently
running system, Pd needs to know about the architectures contained in
the deken package.
the full architecture specification consists of a number (zero or more)
of arch specifiers, each surrounded by parentheses (`()`).

one of the arch specifiers may be "Source", indicating that the package
contains the source code for the binaries.
each (remaining) arch specifier consists of 3 elements, delimited by
dash: the Operating System (OS), the machine (CPU) and the floatsize of
the external (e.g. "Windows-i386-32").
each arch specifier may contain any character with the exception of
square brackets (`[]`) and parentheses (`()`), and the dash (`-`) is
reserved as delimiter (so it must not be used in the components)
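
the rules above can be sketched as a small parser. this is only an
illustration in python, not code from the deken repository:
`parse_dek_filename` is a made-up name, and requiring a non-empty
libraryname and version is an assumption of the sketch.

```python
import re

# hypothetical helper, NOT taken from the deken sources.
# libraryname, version and arch specifiers may contain anything
# except square brackets and parentheses.
_DEK_RE = re.compile(
    r"^(?P<library>[^\[\]()]+)"        # libraryname
    r"\[v(?P<version>[^\[\]()]+)\]"    # version, prefixed with 'v'
    r"(?P<archs>(?:\([^\[\]()]+\))*)"  # zero or more (arch) specifiers
    r"\.dek$"                          # uniform extension
)


def parse_dek_filename(filename):
    """Return a dict with library, version and archs, or None if invalid."""
    m = _DEK_RE.match(filename)
    if m is None:
        return None
    archs = []
    for spec in re.findall(r"\(([^()]+)\)", m.group("archs")):
        if spec == "Source":
            # source packages carry no OS/CPU/floatsize triple
            archs.append(("Source", None, None))
            continue
        parts = spec.split("-")  # dash is reserved as the delimiter
        if len(parts) != 3:
            return None
        archs.append(tuple(parts))  # (OS, CPU, floatsize)
    return {
        "library": m.group("library"),
        "version": m.group("version"),
        "archs": archs,
    }
```

calling it on `helloworld[v1.0.0](Linux-i386-32)(Source).dek` yields the
library "helloworld", the version "1.0.0" and two arch entries; a legacy
name such as `helloworld-v1.0.0-(Linux-i386-32)-externals.tgz` is
rejected.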

typical values for the OS are "Linux", "Darwin" (for OSX/macOS) and
"Windows".

typical values for CPU are
- "i386" (for 32bit Intel or AMD processors; this is also used if you
are running a 32bit OS on a processor capable of running 64bit code)
- "amd64" (for 64bit processors made by AMD or Intel)
- "arm" (for generic ARM)
- "armv7" (for a more specific ARM processor)
- "ppc" (PowerPC)
the CPU field describes the *calling convention* rather than the actual
hardware of the CPU (e.g. Win32 (and WoW64 for that matter) uses a 32bit
calling convention, even if the actual CPU is an "x86-64 CPU", so deken
uses the arch-string "i386").

typical values for the floatsize are "32" (single-precision floating point).
once there are double-precision builds of Pd available, this can also
have the value of "64".


NOTE: this is similar to the original format, but now that last value
has an explicit meaning.
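
to illustrate why the explicit floatsize matters, here is a rough
python sketch of a compatibility check. it is not how the deken-plugin
decides: `is_compatible` is a made-up name, the sketch only accepts an
exact match of all three fields, and it returns False for packages
without any binary arch specifier (what such packages should mean is
not covered here).

```python
def is_compatible(package_archs, host):
    """Check whether any arch specifier matches the host exactly.

    package_archs: list of specifier strings, e.g. ["Linux-amd64-32"]
    host: specifier of the running Pd, e.g. "Linux-amd64-64"
    (hypothetical helper; exact matching is an assumption of this sketch)
    """
    host_os, host_cpu, host_float = host.split("-")
    for spec in package_archs:
        if spec == "Source":
            continue  # sources are not loadable binaries
        parts = spec.split("-")
        if len(parts) != 3:
            continue  # malformed specifier, ignore it
        pkg_os, pkg_cpu, pkg_float = parts
        # the floatsize must match as well, so a single-precision
        # external is never offered to a double-precision Pd
        if (pkg_os, pkg_cpu, pkg_float) == (host_os, host_cpu, host_float):
            return True
    return False
```

with this check a package built for "Linux-amd64-32" is offered to a
"Linux-amd64-32" host but not to a "Linux-amd64-64" one.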

### extension
all deken packages must use the uniform filename extension ".dek".
(no more ~-externals.zip~ or ~-externals.tgz~ for you).

objectlists (which keep their format) must use the ".dek.txt" extension.

NOTE: this is different from the original format


# backward compatibility
now, if the deken-server returned the new .dek packages for all queries,
the current deken-plugin would fail to download and extract these
packages (as it doesn't know how to handle ".dek" files, just like many
major OSs).

the proposal for this dilemma is:
- the deken-plugin tells the server whether it is a "new" plugin (by
means of setting the HTTP User-Agent field)
- the deken-server will only return .dek packages if the plugin is new
enough
- the deken-server will automatically suggest to install the "deken"
package if the plugin is too old.


# implementation
implementation is complete (afaict), but not yet deployed

so far, i've implemented support for this in the [dek] branch of the
deken repository.
the (new) "deken" cmdline tool will (by default) create packages in the
new format. (it can be tricked into creating legacy packages using the
"--dekformat 0" flag).
the (new) "deken-plugin" (aka "Find externals...") will happily work
with both old and new formats.

since the deken-plugin sends its query to a webserver
(deken.puredata.info), this webserver (aka: the glue) must be updated as
well (as it pre-parses all the available deken packages to be able to
quickly answer any queries). code can be found at
https://git.iem.at/zmoelnig/deken-server



# discussion

for now i see three points that might require discussion.

1) katja suggested using the arch-specific Pd-extensions (e.g. "d_ppc")
instead of the more verbose form (e.g. "Darwin-ppc").
personally i'm not totally opposed, but i kind of like the verbose
naming, as it allows for more future possibilities.
e.g. "b_amd64" currently means "FreeBSD-amd64" but what about other BSD
flavours, are they compatible? (not that i believe that most OSs apart
from the current three will matter very much to the community in the
near future; but then i'd rather be too inclusive than not)

2) the proposed user-agent string might be considered a privacy issue.

it currently is something like:
"Deken/0.2.6 (Linux-amd64-32) Pd/0.48.1 Tcl/8.6.8"
thus showing the Deken version, the Pd version, the Tcl/Tk version and
the architecture.
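
a rough python sketch of how a server could read such a header and
decide whether to return .dek packages. this is not the deken-server
code: the function names are made up, and the minimum version is left
as a parameter because no threshold has been named here.

```python
import re


def parse_user_agent(ua):
    """Split a user-agent string into product versions and the arch comment.

    hypothetical helper, NOT taken from the deken-server sources.
    """
    # "Deken/0.2.6", "Pd/0.48.1", ... become {"Deken": "0.2.6", ...}
    products = dict(re.findall(r"(\w+)/([\w.]+)", ua))
    # the architecture is sent as a parenthesised comment
    m = re.search(r"\(([^()]*)\)", ua)
    return products, (m.group(1) if m else None)


def supports_dek(ua, min_version):
    """True if the plugin announces a Deken version >= min_version.

    min_version is a tuple like (0, 2, 6); the real threshold is an
    assumption left to the caller.
    """
    products, _arch = parse_user_agent(ua)
    version = products.get("Deken")
    if version is None:
        return False  # no Deken token: treat as a legacy plugin
    try:
        parsed = tuple(int(part) for part in version.split("."))
    except ValueError:
        return False  # unparsable version: be conservative
    return parsed >= min_version
```

a plugin that does not send a "Deken/..." token at all is treated as
legacy, so it only ever sees packages in the old format.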

note that the current plugin uses Tcl/Tk's default user-agent header,
which already reveals the Tcl-version (and probably the OS, because
different Tcl implementations seem to set the string differently).
also, an architecture identifier is practically sent whenever your
ordinary webbrowser sends a request to ...google or whatever.

so i don't think that the header reveals problematic information, but
then i thought it might be better to raise the awareness of others (you)
first, before there are any complaints afterwards.

3) having the server add an automatic search for "deken" if you are
using an outdated version.

the idea is that people who are using a plugin which cannot handle
the new .dek files are nagged into upgrading deken.

the only way the server can communicate to the plugin is via the search
results. so why not add a search-result for a newer (compatible) version
of deken whenever the user might need it (because they are missing
search results which require a newer plugin).

however, such "slight nags" can be pretty^Wvery annoying.
i'd like to have feedback on this before i enable it.


that's probably it.
fgmdasr
IOhannes



[dek] https://github.com/pure-data/deken/tree/dek
