[GEM-dev] vertices...questions without answers
IOhannes m zmoelnig
zmoelnig at iem.at
Tue Nov 22 18:04:37 CET 2005
i am currently trying to re-do the vertex stuff in a manner that
everybody can live with.
i think we agree on the following (synching brains):
everything is a float! (even color; see matju's thoughts about
out-of-gamut colors; and i think it is a very hard restriction for
processing if we stick to 256 values (we already see problems with
vertices, color: 4 floats (x,y,z,w; r,g,b,a)
normals: 3 floats (x,y,z)
texcoords: 2 floats (u,v)
data is tightly packed.
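to make the layout concrete, here is a minimal sketch of what "tightly packed" float arrays with these widths could look like. the names VertexArrays and element() are made up for illustration and are not part of Gem:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// hypothetical container (not Gem code): one tightly packed float array
// per attribute, using the element widths listed above
struct VertexArrays {
  std::vector<float> position; // 4 floats per vertex: x,y,z,w
  std::vector<float> color;    // 4 floats per vertex: r,g,b,a
  std::vector<float> normal;   // 3 floats per vertex: x,y,z
  std::vector<float> texcoord; // 2 floats per vertex: u,v

  explicit VertexArrays(std::size_t count)
    : position(count * 4), color(count * 4),
      normal(count * 3), texcoord(count * 2) {}
};

// tightly packed means: element i of a width-N array starts at index i*N,
// with no padding in between
inline float* element(std::vector<float>& array, std::size_t width, std::size_t i) {
  assert((i + 1) * width <= array.size());
  return array.data() + i * width;
}
```

with this layout the "stride" of an array is simply its width (2, 3 or 4 floats).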
however, some things are a bit unclear to me, and i'd like to have your
advice on this topic.
i have 2 base classes for vertex unops and binops.
unary operation (excerpt from Base/GemVertexObj.h)
// which vertices to manipulate
// this is a generic routine for all arrays (whether they have 2, 3
// or 4 elements)
// note: you should really implement the optimized routines below
virtual void processVertex(GLfloat*array, int count, int stride);
// optimized routines
// note: the SIMD-optimized routines fall-back to the non-SIMD routines
// note: these routines fall back to the generic processVertex
// texCoords are only (u,v)
virtual void processVertex2 (GLfloat*array, int count);
virtual void processVertex2Altivec(GLfloat*array, int count);
virtual void processVertex2SSE (GLfloat*array, int count);
// normals are (x,y,z)
virtual void processVertex3 (GLfloat*array, int count);
virtual void processVertex3Altivec(GLfloat*array, int count);
virtual void processVertex3SSE (GLfloat*array, int count);
// vertices are (x,y,z,w); colors are (r,g,b,a)
virtual void processVertex4 (GLfloat*array, int count);
virtual void processVertex4Altivec(GLfloat*array, int count);
virtual void processVertex4SSE (GLfloat*array, int count);
so there are specialized functions for processing an array (without a
notion of what the data actually represents; however, there is a notion
of the "array-width").
there are SIMD-optimized functions which fall back to the generic ones.
there is an optional super-generic function which normally just prints
an error message if it ever gets called.
this is just a fallback for the more specialized functions (and most likely
will never really get implemented).
the render() function calls the appropriate processVertex*() function.
if the user specified to process just a sub-array of the data, this is
handled by render() (so the processVertex*() functions don't know
anything about offsets,...)
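the fallback chain and the offset handling described above could be sketched roughly like this. it is a stripped-down illustration for the width-4 case only: the class names, the render() signature and the haveSSE flag are assumptions, not the actual Gem code.

```cpp
#include <cassert>
#include <cstdio>

typedef float GLfloat; // stand-in so the sketch compiles without GL headers

// hypothetical reduction of the unop base class to the width-4 case
class VertexUnop {
public:
  virtual ~VertexUnop() {}

  // super-generic routine: normally just prints an error message
  virtual void processVertex(GLfloat*, int, int stride) {
    std::fprintf(stderr, "processVertex: nothing implemented for stride %d\n", stride);
  }
  // the non-SIMD routine falls back to the generic one
  virtual void processVertex4(GLfloat* array, int count) {
    processVertex(array, count, 4);
  }
  // the SIMD routine falls back to the non-SIMD one
  virtual void processVertex4SSE(GLfloat* array, int count) {
    processVertex4(array, count);
  }

  // render() resolves the sub-array (offset and length are counted in
  // elements), so the processVertex*() functions never see an offset
  void render(GLfloat* array, int count, int offset, int length, bool haveSSE) {
    assert(offset >= 0 && length >= 0 && offset + length <= count);
    GLfloat* start = array + offset * 4;
    if (haveSSE) processVertex4SSE(start, length);
    else         processVertex4(start, length);
  }
};

// example object that only implements the plain width-4 routine;
// the inherited SSE entry point ends up here via the fallback
class VertexScale : public VertexUnop {
public:
  explicit VertexScale(GLfloat f) : factor(f) {}
  virtual void processVertex4(GLfloat* array, int count) {
    for (int i = 0; i < count * 4; ++i) array[i] *= factor;
  }
private:
  GLfloat factor;
};
```

an object author would then only override the routines they care about, and everything else ends up in the next more generic function.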
that was the easy part.
however, i have a problem with the binop:
// Do the rendering
virtual void processVertices(float*larray, int lsize, int lstride,
float*rarray, int rsize, int rstride);
virtual void processVertices2(float*larray, int lsize, float*rarray,
int rsize);
virtual void processVertices2SSE(float*larray, int lsize,
float*rarray, int rsize);
virtual void processVertices2Altivec(float*larray, int lsize,
float*rarray, int rsize);
virtual void processVertices3(float*larray, int lsize, float*rarray,
int rsize);
virtual void processVertices4(float*larray, int lsize, float*rarray,
int rsize);
so the basic concept is the same (i just stripped some SIMD-function
declarations to make it more readable)
i thought it would be very nice to be able to "merge" 2 arrays of
different lengths, so there are "lsize" and "rsize".
but when i converted the first code snippets, i started to wonder
whether this is a good idea.
most likely SIMD-optimization (especially streaming extensions) will be
disabled with such a feature turned on.
so would it be better to just have a dedicated [vertex_resize] object
that does the extrapolation to bigger/smaller arrays (however well this
might work), and enforce the 2 arrays to be of the same size??
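a rough sketch of that split, assuming linear interpolation between neighbouring elements as the resizing strategy (this is only one possible choice, and resizeArray() and addVertices4() are made-up names):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// hypothetical helper for a [vertex_resize]-like object: resample a tightly
// packed array of `width` floats per element to `newCount` elements,
// interpolating linearly between neighbouring input elements
std::vector<float> resizeArray(const std::vector<float>& in, std::size_t width,
                               std::size_t newCount) {
  assert(width > 0 && in.size() % width == 0);
  const std::size_t oldCount = in.size() / width;
  std::vector<float> out(newCount * width, 0.f);
  if (oldCount == 0) return out;
  for (std::size_t i = 0; i < newCount; ++i) {
    // map output index i onto the input range [0, oldCount-1]
    const float pos = (newCount > 1)
      ? float(i) * float(oldCount - 1) / float(newCount - 1) : 0.f;
    const std::size_t lo = std::size_t(pos);
    const std::size_t hi = (lo + 1 < oldCount) ? lo + 1 : lo;
    const float t = pos - float(lo);
    for (std::size_t k = 0; k < width; ++k)
      out[i * width + k] = (1.f - t) * in[lo * width + k] + t * in[hi * width + k];
  }
  return out;
}

// with equal sizes guaranteed by the caller, the binop itself is a single
// branch-free loop over both arrays
void addVertices4(float* larray, const float* rarray, int size) {
  for (int i = 0; i < size * 4; ++i) larray[i] += rarray[i];
}
```

the binop then has no per-element size checks left, which should make it a much easier target for SIMD versions.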
quite minor question, but i'd like to get some feedback.
additionally, what do you think of it in general?
as for now, the coder would have to write the functions for each type of
does anybody know of a way to facilitate this task? it is basically a
copy-and-paste of the same code snippet with very minor modifications.
it would be nice if the function had to be coded only once and
the compiler/preprocessor would expand it to all eventualities.
i would hate to use weird preprocessor magic.
and the resulting code must be as fast as if it was written explicitly.
does anybody have any resources on this?
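one direction that might be worth exploring is a C++ function template with the array width as a compile-time parameter. the sketch below is only an illustration: Scale, processArray() and the scaleVertex*() wrappers are hypothetical names, and whether the generated code is really as fast as hand-written loops would have to be measured.

```cpp
#include <cassert>

// hypothetical: the per-value operation is written exactly once,
// as a small functor
struct Scale {
  float factor;
  void operator()(float& value) const { value *= factor; }
};

// the element width is a compile-time constant, so the compiler is free to
// unroll the inner loop and inline op() separately for each instantiation
template <int Width, typename Op>
void processArray(float* array, int count, const Op& op) {
  for (int i = 0; i < count; ++i)
    for (int k = 0; k < Width; ++k)
      op(array[i * Width + k]);
}

// the width-specific entry points shrink to one-liners
inline void scaleVertex2(float* a, int n, float f) { Scale s = {f}; processArray<2>(a, n, s); }
inline void scaleVertex3(float* a, int n, float f) { Scale s = {f}; processArray<3>(a, n, s); }
inline void scaleVertex4(float* a, int n, float f) { Scale s = {f}; processArray<4>(a, n, s); }
```

this avoids preprocessor macros, but it does not produce SSE or Altivec code by itself; those variants would still need their own implementations.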