[GEM-dev] vertices...questions without answers

Tue Nov 22 18:04:37 CET 2005

hi

i am currently trying to re-do the vertex stuff in a manner that 
everybody can live with.

i think we agree on the following (syncing brains):

everything is a float! (even color; see matju's thoughts about 
out-of-gamut colors. i also think that sticking to 256 values would be 
a very hard restriction for processing; we already see problems with 
this in pix-processing)

vertices, color: 4 floats (x,y,z,w; r,g,b,a)
normals: 3 floats (x,y,z)
texcoords: 2 floats (u,v)

data is tightly packed.
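
to make "tightly packed" concrete, here is a minimal sketch, assuming 
that element i of an array with "width" floats per element starts at 
float index i*width, with no padding in between. elementAt() is a 
hypothetical helper for illustration only, not part of Gem:

```cpp
#include <cassert>
#include <cstddef>

// hypothetical helper (not part of Gem): returns a pointer to element
// `index` of a tightly packed array holding `width` floats per element
// (2 for texcoords, 3 for normals, 4 for vertices and colors)
inline float* elementAt(float* array, int width, int index) {
  assert(width >= 2 && width <= 4);
  assert(index >= 0);
  return array + static_cast<std::ptrdiff_t>(index) * width;
}
```

with this layout, a color array of n elements is just 4*n consecutive 
floats, and the second color starts at float index 4.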

however, some things are a bit unclear to me, and i'd like to have your 
advice on this topic.

i have 2 base classes for vertex unops and binops.

unary operation (excerpt from Base/GemVertexObj.h)
<GemVertexObj.h>
   //////////
   // which vertices to manipulate
   // this is a generic routine for all arrays (whether they have 2, 3 or 4 elements)
   // note: you should really implement the optimized routines below
   virtual void processVertex(GLfloat*array, int count, int stride);

   //////////
   // optimized routines
   // note: the SIMD-optimized routines fall-back to the non-SIMD routines
   // note: these routines fall back to the generic processVertex

   // texCoords are only (u,v)
   virtual void processVertex2       (GLfloat*array, int count);
   virtual void processVertex2Altivec(GLfloat*array, int count);
   virtual void processVertex2SSE    (GLfloat*array, int count);

   // normals are (x,y,z)
   virtual void processVertex3       (GLfloat*array, int count);
   virtual void processVertex3Altivec(GLfloat*array, int count);
   virtual void processVertex3SSE    (GLfloat*array, int count);

   // vertices are (x,y,z,w); colors are (r,g,b,a)
   virtual void processVertex4       (GLfloat*array, int count);
   virtual void processVertex4Altivec(GLfloat*array, int count);
   virtual void processVertex4SSE    (GLfloat*array, int count);
</GemVertexObj.h>
so there are specialized functions for processing an array, without a 
notion of what the data actually represents; there is, however, a 
notion of the "array-width".

there are SIMD-optimized functions which fall back to the non-SIMD ones.
there is an optional super-generic function (processVertex) which 
normally just prints an error message if it ever gets called.
this is just a fallback for the more specialized functions (and most 
likely will never really get implemented)

the render() function calls the appropriate processVertex*() function.
if the user asks to process only a sub-array of the data, this is 
handled by render() (so the processVertex*() functions don't know 
anything about offsets etc.)
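
the fallback chain and the dispatch described above could look roughly 
like the following. this is only a sketch under assumptions, not the 
actual Gem code: it assumes that "stride" means the number of floats per 
element, the render() signature with offset/length and the haveSSE flag 
are made up, VertexScale is a hypothetical example object, and GLfloat is 
replaced by a typedef so that the snippet compiles without the GL headers.

```cpp
#include <cassert>
#include <cstdio>

typedef float GLfloat;  // stand-in so the sketch compiles without GL headers

class VertexUnop {
public:
  virtual ~VertexUnop() {}

  // super-generic routine, last link of the chain: only reports an error
  virtual void processVertex(GLfloat* array, int count, int stride) {
    (void)array; (void)count;
    std::fprintf(stderr, "processVertex: not implemented (width %d)\n", stride);
  }
  // non-SIMD routine: falls back to the generic one unless overridden
  virtual void processVertex4(GLfloat* array, int count) {
    processVertex(array, count, 4);
  }
  // SIMD routine: falls back to the non-SIMD one unless overridden
  virtual void processVertex4SSE(GLfloat* array, int count) {
    processVertex4(array, count);
  }

  // hypothetical render(): resolves the sub-array here, so that the
  // processVertex*() functions never see an offset
  void render(GLfloat* array, int count, int offset, int length, bool haveSSE) {
    assert(offset >= 0 && length >= 0 && offset + length <= count);
    GLfloat* start = array + offset * 4;
    if (haveSSE) processVertex4SSE(start, length);
    else         processVertex4(start, length);
  }
};

// hypothetical object that only implements the plain 4-wide routine;
// the SSE entry point reaches it through the fallback
class VertexScale : public VertexUnop {
public:
  explicit VertexScale(GLfloat f) : factor(f) {}
  virtual void processVertex4(GLfloat* array, int count) {
    for (int i = 0; i < count * 4; ++i) array[i] *= factor;
  }
private:
  GLfloat factor;
};
```

in this sketch an object that has no SSE version still works when 
render() picks the SSE entry point, because the call ends up in the 
overridden processVertex4().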

that was the easy part.
however, i have a problem with the binop:

<GemVertexDualObj.h>
//////////
// Do the rendering
virtual void    processVertices(float*larray, int lsize, int lstride,
                                float*rarray, int rsize, int rstride);

virtual void    processVertices2(float*larray, int lsize, float*rarray, int rsize);
virtual void    processVertices2SSE(float*larray, int lsize, float*rarray, int rsize);
virtual void    processVertices2Altivec(float*larray, int lsize, float*rarray, int rsize);
virtual void    processVertices3(float*larray, int lsize, float*rarray, int rsize);
virtual void    processVertices4(float*larray, int lsize, float*rarray, int rsize);
</GemVertexDualObj.h>

so the basic concept is the same (i just stripped some SIMD-function 
declarations to make it more readable)

i thought it would be very nice to be able to "merge" 2 arrays of 
different lengths, so there are "lsize" and "rsize".

but when i converted the first code snippets, i started to wonder 
whether this is a good idea.
most likely SIMD-optimization (especially streaming extensions) will 
have to be disabled when such a feature is turned on.

so would it be better to just have a dedicated [vertex_resize] object 
that does the interpolation to bigger/smaller arrays (however well this 
might work), and require the 2 arrays to be of the same size?
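
what such a [vertex_resize] could do internally, as a minimal sketch: it 
assumes plain linear interpolation between neighbouring elements, which 
is only one of several possible resampling strategies. resizeVertices() 
is a hypothetical name, nothing like it exists in Gem yet.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// hypothetical resampler: stretches or shrinks a tightly packed array of
// `count` elements (each `width` floats) to `newCount` elements, using
// linear interpolation between neighbouring input elements
std::vector<float> resizeVertices(const float* in, int count, int width,
                                  int newCount) {
  assert(count > 0 && width > 0 && newCount > 0);
  std::vector<float> out(static_cast<std::size_t>(newCount) * width);
  for (int i = 0; i < newCount; ++i) {
    // position of output element i on the input index axis
    const float pos = (newCount > 1)
        ? float(i) * float(count - 1) / float(newCount - 1)
        : 0.f;
    int lo = static_cast<int>(pos);
    if (lo > count - 1) lo = count - 1;
    const int hi = (lo + 1 < count) ? lo + 1 : lo;
    const float frac = pos - float(lo);
    for (int c = 0; c < width; ++c)
      out[i * width + c] = in[lo * width + c] * (1.f - frac)
                         + in[hi * width + c] * frac;
  }
  return out;
}
```

with both inputs brought to the same size beforehand, the binop routines 
would only need a single "size" argument and a straight loop over both 
arrays.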

quite minor question, but i'd like to get some feedback.

additionally, what do you think of it in general?

as it stands, the coder has to write the functions for each type of 
array.
does anybody know of a way to facilitate this task? it is basically a 
copy-and-paste of the same code snippet with very minor modifications.
it would be nice if the function had to be coded only once and the 
compiler/preprocessor expanded it to all eventualities.
i would hate to use weird preprocessor magic.
and the resulting code must be as fast as if it was written explicitly.
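
one possible approach (a sketch, not a tested recommendation) is a C++ 
function template with the array width as a compile-time parameter: the 
loop body is written once and the compiler instantiates it for 2, 3 and 
4. since the width is a constant in each instance, the inner loop can 
usually be unrolled, but whether the result really matches hand-written 
code would have to be checked in the generated assembly. processArray, 
Offset and VertexOffset are hypothetical names for illustration.

```cpp
#include <cassert>

// written once: N is the array width, known at compile time
template <int N, typename Op>
inline void processArray(float* array, int count, Op op) {
  assert(count >= 0);
  for (int i = 0; i < count; ++i)
    for (int c = 0; c < N; ++c)
      array[i * N + c] = op(array[i * N + c]);
}

// hypothetical per-component operation
struct Offset {
  float amount;
  float operator()(float x) const { return x + amount; }
};

// thin wrappers with the same signatures as processVertex2/3/4
class VertexOffset {
public:
  explicit VertexOffset(float a) { op.amount = a; }
  void processVertex2(float* array, int count) { processArray<2>(array, count, op); }
  void processVertex3(float* array, int count) { processArray<3>(array, count, op); }
  void processVertex4(float* array, int count) { processArray<4>(array, count, op); }
private:
  Offset op;
};
```

the SIMD variants would still have to be written by hand (or fall back 
to these), since this sketch only covers the scalar code.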

does anybody have any resources on this?

fma.dsr.
IOhannes