[PD] [OT] SSE/MMX tips?
Mathieu Bouchard
matju at artengine.ca
Thu Sep 8 02:59:54 CEST 2011
On Wed, 7 Sep 2011, Mathieu Bouchard wrote:
> On Wed, 7 Sep 2011, Bill Gribble wrote:
>
>> So far iteration on plain floats seems to be the best I can come up with,
>> but HADDPS is tantalizingly close to what I want to do. Any hints?
>
> Once I thought that with some commutativity you could speed things up like
> this :
>
> (f0+f1+f2+f3)+(f4+f5+f6+f7)+...
>
> can be rearranged as :
>
> (f0+f4+...)+(f1+f5+...)+(f2+f6+...)+(f3+f7+...)
But what I said does not apply to your case, because you want a scan,
whereas I didn't read carefully and assumed a fold.
I don't know how to optimise a scan.
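
To make the fold/scan distinction concrete, here is a minimal sketch in
plain C (no intrinsics). The function names fold_sum4 and scan_sum are
made up for illustration and are not from the original code. The fold
keeps four independent partial sums, one per would-be SSE lane, which is
the rearrangement quoted above. The scan has to emit every running total,
so each step depends on the previous one. Note that float addition is not
strictly associative, so the rearranged fold may round slightly
differently from a plain left-to-right sum.

```c
#include <stddef.h>

/* Fold: only the final total is needed, so the terms can be regrouped as
   (f0+f4+...)+(f1+f5+...)+(f2+f6+...)+(f3+f7+...).
   acc[k] stands in for lane k of a 4-float SSE register. */
float fold_sum4(const float *f, size_t n)
{
    float acc[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t i = 0;

    /* Four independent accumulations per iteration. */
    for (; i + 4 <= n; i += 4) {
        acc[0] += f[i];
        acc[1] += f[i + 1];
        acc[2] += f[i + 2];
        acc[3] += f[i + 3];
    }

    /* Combine the four lanes once, at the end. */
    float total = (acc[0] + acc[1]) + (acc[2] + acc[3]);

    /* Leftover elements when n is not a multiple of 4. */
    for (; i < n; i++)
        total += f[i];

    return total;
}

/* Scan: every intermediate total is an output, so out[i] depends on
   out[i-1] and the regrouping above does not apply directly. */
void scan_sum(const float *f, float *out, size_t n)
{
    float running = 0.0f;
    for (size_t i = 0; i < n; i++) {
        running += f[i];
        out[i] = running;
    }
}
```

With f = {1, 2, ..., 10}, fold_sum4 returns 55 and scan_sum fills out
with 1, 3, 6, 10, ..., 55.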
_______________________________________________________________________
| Mathieu Bouchard ---- tél: +1.514.383.3801 ---- Villeray, Montréal, QC