[PD-dev] gcc 4.1 and auto-vectorization
Thomas Grill
gr at grrrr.org
Wed Nov 22 01:32:30 CET 2006
As a short follow-up, here is the skeleton of a DSP function that can
get auto-vectorized (under gcc 4.0.1/PPC at least):
#include <stdlib.h>

#define VECELEMS 4
#define ALIGNMENT (sizeof(float)*(VECELEMS))
#define ALIGNED(ptr) (((size_t)(ptr)&((ALIGNMENT)-1)) == 0)

typedef float *__restrict__ __attribute__((aligned(ALIGNMENT))) aligned_float_ptr;

void addfun(int n,float *dst,const float *src1,const float *src2)
{
    int i,a;

    if(ALIGNED(dst) && ALIGNED(src1) && ALIGNED(src2)) {
        /* all buffers are vector-aligned: tell the compiler via the casts */
        aligned_float_ptr d = (aligned_float_ptr)dst;
        aligned_float_ptr s1 = (aligned_float_ptr)src1;
        aligned_float_ptr s2 = (aligned_float_ptr)src2;
        int nv = n/VECELEMS;

        /* this loop will be auto-vectorized */
        for(i = 0; i < nv; ++i,d += VECELEMS,s1 += VECELEMS,s2 += VECELEMS)
            for(a = 0; a < VECELEMS; ++a)
                d[a] = s1[a]+s2[a];

        /* scalar tail: d, s1 and s2 already point past the vectorized part */
        n -= nv*VECELEMS;
        for(i = 0; i < n; ++i)
            d[i] = s1[i]+s2[i];
    }
    else {
        /* unaligned fallback: plain scalar loop */
        for(i = 0; i < n; ++i)
            dst[i] = src1[i]+src2[i];
    }
}
Of course, in C++ this can be made much more flexible using templates.
Looking at the assembly output is not recommended - it's a mess. It's
much better to code similar functionality using the vector primitives
that gcc and MSVC provide.
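
To illustrate what "vector primitives" can look like, here is a rough
sketch using gcc's generic vector extension (the vector_size attribute).
It is only one possible form and is gcc/clang-specific; MSVC would need
its own intrinsics instead. The names v4sf and addfun_vec are made up
for this example, and the function assumes that the caller has already
made sure all three pointers are 16-byte aligned.

```c
#include <assert.h>
#include <stdint.h>

/* gcc/clang vector extension: four packed floats, 16 bytes wide.
   may_alias lets us view ordinary float buffers through this type. */
typedef float v4sf __attribute__((vector_size(16), may_alias));

/* Hypothetical helper (not from Pd): dst[i] = src1[i] + src2[i].
   Assumption: all three pointers are 16-byte aligned. */
void addfun_vec(int n, float *dst, const float *src1, const float *src2)
{
    v4sf *d = (v4sf *)dst;
    const v4sf *s1 = (const v4sf *)src1;
    const v4sf *s2 = (const v4sf *)src2;
    int nv = n / 4;
    int i;

    assert(((uintptr_t)dst  & 15) == 0);
    assert(((uintptr_t)src1 & 15) == 0);
    assert(((uintptr_t)src2 & 15) == 0);

    /* one packed add per iteration, written as plain arithmetic */
    for (i = 0; i < nv; ++i)
        d[i] = s1[i] + s2[i];

    /* scalar tail for the last n % 4 elements */
    for (i = nv * 4; i < n; ++i)
        dst[i] = src1[i] + src2[i];
}
```

Here the vector operation is spelled out explicitly, so nothing depends
on the auto-vectorizer recognizing the loop.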
best greetings,
Thomas
On 20.11.2006 at 00:16, Thomas Grill wrote:
>
> On 19.11.2006 at 22:57, Mathieu Bouchard wrote:
>
>> On Sun, 19 Nov 2006, Thomas Grill wrote:
>>> On 18.11.2006 at 22:16, Mathieu Bouchard wrote:
>>>> perhaps it would be a good start to reimplement newbytes(n)
>>>> using memalign(16,n) instead of malloc(n).
>>> A few years ago I introduced aligned memory allocation in the
>>> pd-devel branch.
>>
>> I see how you did it. Is it because posix_memalign() isn't as
>> portable as we'd like it to be? (I wrote "memalign" by mistake,
>> which is the name of a deprecated function that does a similar job)
>>
>> It seems like a lot of memory is allocated unaligned. Is that
>> normal? If the memory allocations you've aligned cover the most
>> speed-critical memory, then why did Tim say that about memory
>> alignment?
>
> The point is that I only introduced and used the aligned memory
> functions for the SIMD codelets, which are used for DSP and array
> processing. I'm sure that there are aligned memory allocation
> functions for either platform (maybe not necessarily
> posix_memalign...), but I wanted to stay as close as possible to
> the original PD memory functions.
> I don't think it makes much sense to use aligned memory for
> anything other than DSP and tables. If one wanted to use it with
> auto-vectorization the header code would be much the same as the
> one in the DSP perform functions, with some casting to aligned
> pointers, so that the compiler knows about it. Aliasing is another
> thing, though.
>
> greetings,
> Thomas
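
On the aligned allocation discussed in the quoted text: a minimal sketch
of how a 16-byte aligned float buffer could be obtained with
posix_memalign() on a POSIX system. This is not the pd-devel code; the
helper name alloc_aligned_floats and the DSP_ALIGNMENT constant are
invented for this example, and the 16-byte figure assumes vectors of
four floats.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* makes posix_memalign() visible */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DSP_ALIGNMENT 16   /* assumed: one vector of four floats */

/* Hypothetical helper (not part of Pd): returns room for n floats,
   aligned to DSP_ALIGNMENT bytes, or NULL on failure.
   Release the buffer with free(). */
float *alloc_aligned_floats(size_t n)
{
    void *p = NULL;

    /* posix_memalign() returns 0 on success, an error number otherwise */
    if (posix_memalign(&p, DSP_ALIGNMENT, n * sizeof(float)) != 0)
        return NULL;

    /* sanity check: the low bits of the address must be zero */
    assert(((uintptr_t)p & (DSP_ALIGNMENT - 1)) == 0);
    return (float *)p;
}
```

Buffers obtained this way pass an alignment test like the ALIGNED()
macro in the skeleton above, so the vectorizable branch would be taken.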