[PD-dev] [GEM] Further CVS changes

chris clepper cclepper at artic.edu
Thu Jan 30 20:36:28 CET 2003

>Your results are really impressive.
>I did some experiments too with the pix code on linux, and you are right
>that we can gain a lot by optimizing the code.
>I changed pix_add and others from float to integer. I have also done
>experiments with MMX (which, I have to admit, did not give the
>results I had hoped for, but maybe just because I did not really
>know what I was doing).

MMX is only coded in asm, right?  That makes it that much harder for 
anyone who's not already an assembly expert.  Altivec at least uses 
C-based functions and data types, but it's still tricky to learn.  PDP 
is using some MMX on yuv pixels from the looks of it; maybe the results 
are better when processing 16-bit pixels vs. 32-bit ones?  There's also 
that huge range of SIMD options on x86 to pick from: MMX, 3DNow, SSE, 
SSE2, etc.
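
For comparison with the Altivec style, x86 compilers that ship 
<emmintrin.h> expose SSE2 through C-level intrinsics.  Below is a 
minimal sketch of a saturating byte add written that way.  The function 
name addPixelsSIMD is made up for illustration and is not part of Gem or 
PDP, and the code assumes a build with SSE2 enabled; otherwise only the 
scalar loop is compiled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics
#endif

// Hypothetical helper (not Gem code): dst[i] = min(255, dst[i] + src[i]).
void addPixelsSIMD(std::uint8_t* dst, const std::uint8_t* src, std::size_t count)
{
    assert(dst != nullptr && src != nullptr);
    std::size_t i = 0;
#if defined(__SSE2__)
    // Process 16 bytes per iteration with an unsigned saturating add.
    for (; i + 16 <= count; i += 16) {
        __m128i a = _mm_loadu_si128(reinterpret_cast<const __m128i*>(dst + i));
        __m128i b = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src + i));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + i), _mm_adds_epu8(a, b));
    }
#endif
    // Scalar loop handles the leftover bytes (or everything without SSE2).
    for (; i < count; ++i) {
        unsigned sum = static_cast<unsigned>(dst[i]) + src[i];
        dst[i] = static_cast<std::uint8_t>(sum > 255u ? 255u : sum);
    }
}
```

The unaligned load/store variants are used so the sketch makes no 
assumption about how the pixel buffers are allocated.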

>At least we should get rid of float pixel processing on all platforms.

That seems like a good idea.  Type conversion in a big for loop isn't 
the most efficient thing to do.  We can also check for division and 
branching in the loops, and try to eliminate those where possible.  Do 
you know how much things like loop unrolling help on x86 platforms?  
There are some scalar tricks like that which can be tried as well.  gcc 
on x86 seems to be far better at optimizing code than on PPC, so maybe 
Linux is already getting a big boost by using a good compiler.
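
As a rough illustration of the integer-only idea, here is a sketch of a 
per-byte add that avoids float conversion, division, and a branch in 
the inner loop.  addPixelsInt is a hypothetical name and this is not the 
actual pix_add code; it only assumes 8-bit pixel components.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not Gem's pix_add): saturating add of two 8-bit
// pixel buffers using integer math only.
void addPixelsInt(std::uint8_t* dst, const std::uint8_t* src, std::size_t count)
{
    assert(dst != nullptr && src != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        // The sum of two bytes is at most 510, so bit 8 flags an overflow.
        unsigned sum = static_cast<unsigned>(dst[i]) + src[i];
        // (sum >> 8) is 0 or 1; 0u - 1u is all ones, which forces the low
        // byte to 255 on overflow and leaves sum untouched otherwise.
        unsigned mask = 0u - (sum >> 8);
        dst[i] = static_cast<std::uint8_t>(sum | mask);
    }
}
```

Whether the branch-free clamp actually beats a plain comparison depends 
on the compiler and CPU, so it would need to be measured.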

>#ifdefs are ugly
>If it is in some way possible, we should come up with macros or
>templates for common optimizable functions. (I think there was a message
>from Daniel about that).
>If that is not possible, put the architecture dependent code in its
>own source file, and write a non-altivec version of the same code.
>Later someone may add the MMX version, but can concentrate on
>that instead of having to go through ifdefs.

#ifdefs _are_ ugly!  Sometimes they're the only way, but there's 
probably a way to add optimized code without tons of #ifdefs.  
Maybe one way is to add processRGBMMX, processYUVAltivec, etc. to the 
range of processing routines.  It is possible to do a runtime 
architecture check for Altivec; is it possible for MMX?  A check in 
GemPixObj could be made for these hardware features.
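
A sketch of what such a runtime check and dispatch might look like 
follows.  Everything here is an assumption for illustration: cpuHasMMX, 
selectProcessRGB and processRGBGeneric are made-up names, the MMX 
routine is only a placeholder, and the check relies on the 
__builtin_cpu_supports builtin that gcc and clang provide on x86.  This 
is not how GemPixObj is actually written.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical signature for a pixel-processing routine.
typedef void (*ProcessFn)(std::uint8_t* pixels, std::size_t count);

// Plain C++ fallback: inverts every byte.
void processRGBGeneric(std::uint8_t* pixels, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i) {
        pixels[i] = static_cast<std::uint8_t>(255 - pixels[i]);
    }
}

// Placeholder for an MMX version; a real one would live in its own
// source file. Here it just forwards to the generic code.
void processRGBMMX(std::uint8_t* pixels, std::size_t count)
{
    processRGBGeneric(pixels, count);
}

// Runtime feature check. Reports false on compilers or CPUs where the
// builtin is not available, so the generic path is always a safe default.
bool cpuHasMMX()
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("mmx") != 0;
#else
    return false;
#endif
}

// Pick the routine once, e.g. when the object is constructed.
ProcessFn selectProcessRGB()
{
    return cpuHasMMX() ? processRGBMMX : processRGBGeneric;
}
```

Selecting the function pointer once keeps the check out of the 
per-frame path, and the only preprocessor conditional left is inside 
the feature test itself.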


