[PD] [OT] SSE/MMX tips?

Mathieu Bouchard matju at artengine.ca
Wed Sep 7 16:12:39 CEST 2011


On Wed, 7 Sep 2011, Bill Gribble wrote:

> So far iteration on plain floats seems to be the best I can come up 
> with, but HADDPS is tantalizingly close to what I want to do.  Any 
> hints?

I once thought that, using the associativity and commutativity of addition, 
you could speed things up like this:

(f0+f1+f2+f3)+(f4+f5+f6+f7)+...

can be rearranged (up to floating-point rounding) as:

(f0+f4+...)+(f1+f5+...)+(f2+f6+...)+(f3+f7+...)

But I don't remember whether I tried it or not.

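That's a speedup without even using MMX/SSE: the four partial sums don't 
depend on each other, so the CPU can overlap their additions instead of 
waiting on one long chain of dependent adds. Theoretically, it can double 
the speed of a summation like this, and you can apply the same trick to a 
sum of products to get a certain amount of speedup too.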
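
A rough sketch of the rearrangement in plain C, in case it helps. The 
names sum4 and dot4 are made up for illustration, and I haven't benchmarked 
this, so treat the speedup as something to measure rather than a given. 
Since float addition is not exactly associative, the result can differ in 
the last bits from a straight left-to-right loop, which is also why a 
compiler normally won't do this reordering for you unless you allow it 
(e.g. -ffast-math with GCC).

```c
#include <stddef.h>

/* Sum n floats with four independent accumulators:
   s0 = f0+f4+..., s1 = f1+f5+..., s2 = f2+f6+..., s3 = f3+f7+... */
float sum4(const float *f, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;

    /* Main loop: the four adds do not depend on each other. */
    for (; i + 4 <= n; i += 4) {
        s0 += f[i];
        s1 += f[i + 1];
        s2 += f[i + 2];
        s3 += f[i + 3];
    }

    /* Combine the partial sums, then handle the 0..3 leftover elements. */
    float s = (s0 + s1) + (s2 + s3);
    for (; i < n; i++)
        s += f[i];
    return s;
}

/* Same idea applied to a sum of products (dot product). */
float dot4(const float *a, const float *b, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += a[i]     * b[i];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }

    float s = (s0 + s1) + (s2 + s3);
    for (; i < n; i++)
        s += a[i] * b[i];
    return s;
}
```

The four accumulators also happen to match the four lanes of an SSE 
register, so the same layout should carry over if you later switch the 
main loop to packed adds and only do one horizontal add at the end.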

  _______________________________________________________________________
| Mathieu Bouchard ---- tél: +1.514.383.3801 ---- Villeray, Montréal, QC

