[PD-dev] double precision Pd: .patch files, tests and benchmarks

Tue Oct 4 11:38:12 CEST 2011

ah yes, this was indeed my fault.
since i don't feel comfortable with editing m_pd.h to get a different
build, i used CFLAGS="-DPD_FLOAT_PRECISION=64". this overrode all the
default optimization flags (by default "-O6", which i find a bit
overdone; and "-g" is not set at all...)
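
a minimal sketch of why the two variables behave differently, assuming the
build follows the common autoconf-style convention (default CFLAGS are used
only when the user sets none, while CPPFLAGS is passed in addition). the
function `compiler_flags` is hypothetical and only models that convention; it
is not Pd's actual build code.

```python
import shlex

# "original" default flags as reported further down in this mail
DEFAULT_CFLAGS = "-O6 -funroll-loops -fomit-frame-pointer"


def compiler_flags(env):
    """Hypothetical model of an autoconf-style build (an assumption)."""
    # a user-supplied CFLAGS replaces the default optimization flags entirely
    cflags = env.get("CFLAGS", DEFAULT_CFLAGS)
    # CPPFLAGS is handed to the compiler in addition to CFLAGS
    cppflags = env.get("CPPFLAGS", "")
    return shlex.split(cppflags) + shlex.split(cflags)


define = "-DPD_FLOAT_PRECISION=64"
via_cflags = compiler_flags({"CFLAGS": define})      # optimization flags are lost
via_cppflags = compiler_flags({"CPPFLAGS": define})  # optimization flags are kept

print(via_cflags)
print(via_cppflags)
```

under this model, the CFLAGS variant ends up with the define only, while the
CPPFLAGS variant keeps the default optimization flags next to it.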

the proper way is to use CPPFLAGS="-DPD_FLOAT_PRECISION=64", which
results in:

osc-delay-perftest with 400 instances:
debian           : 31%
original         : 29%
single           : 22%
single(O0)       : 64%
single(O2)       : 25%
single(O2+loop)  : 22%
single(pentium3) : 24%
single(pentium4) : 22%
single(prescott) : 22%
single(core2)    : 22%
single(core2+sse): 22%
double           : 25%
double(O0)       : 86%
double(O2)       : 27%
double(O2+loop)  : 26%
double(pentium3) : 25%
double(pentium4) : 24%
double(prescott) : 24%
double(core2)    : 24%
double(core2+sse): 25%

osc-delay-perftest with 1200 instances:
debian           : 94%
original         : 81%
single           : 65%
single(O0)       : ++%
single(O2)       : 72%
single(O2+loop)  : 66%
single(pentium3) : 70%
single(pentium4) : 66%
single(prescott) : 65%
single(core2)    : 59%
single(core2+sse): 64%
double           : 77%
double(O0)       : ++%
double(O2)       : 82%
double(O2+loop)  : 77%
double(pentium3) : 79%
double(pentium4) : 75%
double(prescott) : 75%
double(core2)    : 71%
double(core2+sse): 75%

which is more in line with katja's measurements.
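
to put the figures above in relation, the extra load of double over single
can be computed per flag set from the 1200-instance table. this is only a
small arithmetic sketch; the dictionary name `load_1200` is made up for
illustration, and the (O0) rows are left out because they overloaded ("++%").

```python
# load meter readings (percent) for 1200 instances, copied from the table above
# as (single, double) pairs
load_1200 = {
    "default":   (65, 77),
    "O2":        (72, 82),
    "O2+loop":   (66, 77),
    "pentium3":  (70, 79),
    "pentium4":  (66, 75),
    "prescott":  (65, 75),
    "core2":     (59, 71),
    "core2+sse": (64, 75),
}

# relative extra load of double precision compared to single precision, in percent
overhead = {
    name: (double / single - 1.0) * 100.0
    for name, (single, double) in load_1200.items()
}

for name, percent in overhead.items():
    print(f"{name:10s}: double costs {percent:4.1f}% more than single")
```

on these numbers double precision costs roughly 13% to 20% more load than
single precision, depending on the flag set.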

this is (again) on an i5 650 @ 3.2GHz running in 32bit mode.
optimization flags (as far as they can be reconstructed :-)):
debian: "-g -O2" (this is what is dictated by debian policy)
original: "-O6 -funroll-loops -fomit-frame-pointer"  (seems to be the
default)
single/double: ->original
(O0): -O0
(O2): -g -O2
(O2+loop): -g -O2 -funroll-loops -fomit-frame-pointer
(prescott): ->original + "-march=prescott"
(core2): ->original + "-march=core2"
(core2+sse): ->original + "-march=core2 -mfpmath=sse -msse2"

so it seems like the biggest performance boost (on the tested
platform) comes from compiling with "-g -O2 -funroll-loops
-fomit-frame-pointer" (which is cool because i think this can even make
it into debian, the way it is)

> inline function (like it was already suggested by IOhannes a while
> ago), but at -O0 nothing will be inlined. A benchmark howto would be
> useful indeed.

well, i usually just cram lots of the same object into a subpatch (until
i get approximately 80% in the slowest environment, in order to not max
out the CPU and get unknown side-effects), and measure it with the
built-in load-meter (for loads <100% it behaves quite the same as top).
nothing very dramatic.
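
the "cram lots of the same object into a patch" step can be scripted. the
following is a rough sketch that generates the text of a Pd patch with many
copies of one object; the helper `make_test_patch`, the default object
"osc~ 440" and the layout values are assumptions for illustration and are not
the contents of osc-delay-perftest.

```python
def make_test_patch(obj="osc~ 440", count=400, columns=20):
    """Return the text of a Pd patch holding `count` copies of `obj`."""
    # top-level canvas header: x, y, width, height, font size
    lines = ["#N canvas 0 0 450 300 10;"]
    for i in range(count):
        # lay the objects out on a grid so the patch stays readable
        x = 10 + (i % columns) * 80
        y = 10 + (i // columns) * 25
        lines.append(f"#X obj {x} {y} {obj};")
    return "\n".join(lines) + "\n"


patch_400 = make_test_patch(count=400)
patch_1200 = make_test_patch(count=1200)
print(patch_400.splitlines()[1])
```

writing the returned string to a .pd file, opening it with DSP switched on and
reading the load meter then follows the procedure described above; the
instance count is raised until the slowest build sits at about 80%.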

fgmasdr
IOhannes