[GEM-dev] gem cvs question: altivec (SIMD) conventions

cgc at humboldtblvd.com cgc at humboldtblvd.com
Wed Oct 1 20:51:27 CEST 2003


Quoting IOhannes m zmoelnig <zmoelnig at iem.at>:

> now my question: wouldn't it be better to make more use of inheritance?
> and call the altivec-processing directly from GemPixObj, if altivec is
> supported by the machine?
> i write this with MMX and SSE2 in mind:
> i guess, all modern apple-computers have altivec;
> all modern pcs have mmx, but only some of them have sse2 (i guess
> pentium-4 only, but amd will support it in the future).
> i would rather decide at runtime whether any parallel processing is
> available, and then call the appropriate function. the fallback of all
> those altivec/mmx-functions would of course be the generic-processing.

The big problem with run-time checking is that a SIMD instruction must never
actually execute on a machine that can't handle it, or the app will die with an
illegal instruction.  Branch prediction and speculative execution don't cause
that on their own, since a mispredicted path is thrown away before it can fault.
The compiler can, though: once a file is built with AltiVec or SSE2 enabled, it
is free to use vector instructions outside the branch you guarded.  Only if the
SIMD code can be absolutely assured never to run on an unsupported machine will
run-time checking work (and you have to really, really enjoy debugging random
crashes too).
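
For reference, a minimal sketch of the run-time selection being discussed: the
check runs once at startup and picks a function pointer, so the SIMD routine is
only ever reached through that pointer.  All names here (SubtractFn,
subtractGeneric, subtractSSE2, cpuHasSSE2, chooseSubtract) are made up for
illustration and are not GEM's code.  It uses SSE2 rather than AltiVec only so
it builds on a PC, and it assumes GCC or clang for __builtin_cpu_supports.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical signature for a dual-image kernel: dst = dst - src, saturating.
using SubtractFn = void (*)(uint8_t* dst, const uint8_t* src, std::size_t n);

// Generic fallback that runs on any CPU.
void subtractGeneric(uint8_t* dst, const uint8_t* src, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = dst[i] > src[i] ? static_cast<uint8_t>(dst[i] - src[i]) : 0;
}

#if defined(__SSE2__)
// SIMD version.  In a real build this would live in its own source file, and
// only that file would be compiled with SIMD flags, so the compiler cannot
// place vector instructions in code that runs before the check.
void subtractSSE2(uint8_t* dst, const uint8_t* src, std::size_t n) {
    std::size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        __m128i a = _mm_loadu_si128(reinterpret_cast<const __m128i*>(dst + i));
        __m128i b = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src + i));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + i),
                         _mm_subs_epu8(a, b));
    }
    subtractGeneric(dst + i, src + i, n - i);  // leftover bytes
}
#endif

// Ask the CPU; reports false on compilers or targets this sketch doesn't know.
bool cpuHasSSE2() {
#if defined(__SSE2__) && (defined(__GNUC__) || defined(__clang__))
    return __builtin_cpu_supports("sse2") != 0;
#else
    return false;
#endif
}

// Decide once; callers keep the returned pointer and never test again.
SubtractFn chooseSubtract() {
#if defined(__SSE2__)
    if (cpuHasSSE2()) return subtractSSE2;
#endif
    return subtractGeneric;
}
```

An object would call chooseSubtract() once (say, in its constructor), store the
result, and call through it for every frame.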

The only way that I've seen people do this is runtime module loading like in
Photoshop.
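
A rough sketch of the module-loading idea, assuming a POSIX system with
dlopen/dlsym (I don't know how Photoshop actually does it).  The SIMD code sits
in a separate shared library that is only loaded when wanted, so the host
binary contains no vector instructions at all.  ProcessFn, processGeneric,
loadProcess, and the module path and symbol name are all hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__unix__) || defined(__APPLE__)
#include <dlfcn.h>
#endif

// Hypothetical kernel signature shared by the host and the optional module.
using ProcessFn = void (*)(uint8_t* pixels, std::size_t n);

// Generic fallback that is always linked into the host: invert each byte.
void processGeneric(uint8_t* pixels, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        pixels[i] = static_cast<uint8_t>(255 - pixels[i]);
}

// Try to load `symbol` from the shared library at `modulePath`.  The module
// must export it with C linkage (extern "C").  On any failure the generic
// routine is returned, so the caller always gets something it can call.
// The caller is expected to attempt the load only after confirming that the
// CPU supports the instructions the module was built with.
ProcessFn loadProcess(const char* modulePath, const char* symbol) {
#if defined(__unix__) || defined(__APPLE__)
    if (void* handle = dlopen(modulePath, RTLD_NOW | RTLD_LOCAL)) {
        if (void* sym = dlsym(handle, symbol))
            return reinterpret_cast<ProcessFn>(sym);
        dlclose(handle);  // module found, but it lacks the symbol
    }
#else
    (void)modulePath;
    (void)symbol;
#endif
    return processGeneric;
}
```

On older Linux systems this needs -ldl at link time.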

> i don't know whether G5-ALTIVEC-compiles can be executed on G4's - would
> it be possible if the PPC970-code would be capsuled in an if-clause ?

Certain instructions (vec_dst) have to be avoided on the G5/970.  They are
software prefetch ops that override the 970's built-in hardware prefetch, and
the 970 serializes instruction execution until the op completes.  That's some
bad news.  I've made an ifdef for this, so the choice is made at compile time
and no run-time check is involved.
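
Roughly what such an ifdef can look like.  This is a sketch and not the code in
CVS: TARGET_PPC970, PREFETCH_STREAM, DST_CTRL and sumBytes are hypothetical
names, and the control-word layout is as I recall it from the AltiVec
documentation, so check it before relying on it.  On a build without AltiVec
the macro compiles to nothing, which is also what the G5 build gets.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__ALTIVEC__)
#include <altivec.h>
#endif

// Data-stream control word: block size in 16-byte vectors, block count,
// stride in bytes (assumed layout, see the note above).
#define DST_CTRL(size, count, stride) \
    (((size) << 24) | ((count) << 16) | (stride))

// Hypothetical build flag: define TARGET_PPC970 when compiling for the G5.
#if defined(__ALTIVEC__) && !defined(TARGET_PPC970)
  // G4 build: start a software prefetch stream on channel 0.
  #define PREFETCH_STREAM(ptr, ctrl) vec_dst((ptr), (ctrl), 0)
  #define PREFETCH_ENABLED 1
#else
  // G5 or non-AltiVec build: leave prefetching to the hardware.
  #define PREFETCH_STREAM(ptr, ctrl) ((void)(ptr), (void)(ctrl))
  #define PREFETCH_ENABLED 0
#endif

// Toy routine standing in for a pixel loop: sums n bytes.
uint32_t sumBytes(const uint8_t* data, std::size_t n) {
    const int ctrl = DST_CTRL(2, 16, 32);  // 16 blocks of 32 bytes, contiguous
    PREFETCH_STREAM(data, ctrl);           // no code emitted on the G5 build
    uint32_t sum = 0;
    for (std::size_t i = 0; i < n; ++i) sum += data[i];
    return sum;
}
```

Because the decision is made by the preprocessor, the G5 binary never contains
a vec_dst at all.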

> that in PixDualObj's the naming is rather arbitrary: 
> pix_subtract::processYUVAltivec() vs. 
> pix_chroma_key::processYUV_YUVAltivec()

pix_chroma_key is a dual pix object and follows the YUV_YUV convention of the
other dualPix objects.  There could be an RGB_YUV or YUV_RGB version at some
point (who knows?).
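
To illustrate the convention: in a dual pix object the method name carries the
format of both inputs, left first, then right.  The class below is a
stripped-down, hypothetical stand-in (DualPixSketch, Image, Format and the
string return values are invented for the example), not GEM's real base class.

```cpp
#include <string>

enum class Format { RGBA, YUV };

// Hypothetical stand-in for an image; only the format matters here.
struct Image {
    Format format;
};

class DualPixSketch {
public:
    virtual ~DualPixSketch() = default;

    // Picks a handler from the formats of the two inputs and returns the
    // handler's name so the choice is visible.
    std::string processDualImage(const Image& left, const Image& right) {
        if (left.format == Format::YUV && right.format == Format::YUV)
            return processYUV_YUV(left, right);
        if (left.format == Format::RGBA && right.format == Format::YUV)
            return processRGBA_YUV(left, right);
        if (left.format == Format::YUV && right.format == Format::RGBA)
            return processYUV_RGBA(left, right);
        return processRGBA_RGBA(left, right);
    }

protected:
    // <left format>_<right format>; an AltiVec variant would append a suffix,
    // as in processYUV_YUVAltivec().
    virtual std::string processYUV_YUV(const Image&, const Image&)   { return "YUV_YUV"; }
    virtual std::string processRGBA_YUV(const Image&, const Image&)  { return "RGBA_YUV"; }
    virtual std::string processYUV_RGBA(const Image&, const Image&)  { return "YUV_RGBA"; }
    virtual std::string processRGBA_RGBA(const Image&, const Image&) { return "RGBA_RGBA"; }
};
```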

cgc



