[GEM-dev] vertex proposals

chris clepper cgc at humboldtblvd.com
Thu Aug 26 18:45:13 CEST 2004


Quoting IOhannes m zmoelnig <zmoelnig at iem.at>:

> 
> naively i have assumed that all arrays are of the same dimension, which 
> is not true; texCoordArray is [Nx2] and normalArray is [Nx3], the rest 
> is [Nx4];
> this is certainly more memory-efficient.
> 
> however, i would propose to unify all arrays to [Nx4];

This is not going to work well because OpenGL wants the arrays in very specific
formats.  A loop to rearrange the arrays before uploading would be a performance
killer.  In fact, profiling the vertex_ stuff shows that a lot of time is spent
uploading the data to the card through the driver.  There are supposedly ways
to DMA this and/or eliminate driver copies, but I haven't gotten them to work. 
The best advice I can give is to find the best (fastest) array format for GL
and stick to it religiously.  The vertex_ stuff has the potential for some very
heavy shit, and all of the fat has to be trimmed to make the paths as fast and
unencumbered as possible.
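To make the cost concrete, here is a sketch (names and layout hypothetical) of
the kind of per-vertex copy loop that unifying everything to [Nx4] would force
before each upload:

#include <stddef.h>

/* Hypothetical repack: pad an [Nx3] normal array out to [Nx4] so
 * every array shares one stride.  This touches every float once
 * per frame before the data even reaches the driver, which is
 * exactly the extra pass warned against above. */
void repack_normals_3to4(const float *src, float *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i * 4 + 0] = src[i * 3 + 0];
        dst[i * 4 + 1] = src[i * 3 + 1];
        dst[i * 4 + 2] = src[i * 3 + 2];
        dst[i * 4 + 3] = 0.0f;   /* pad component */
    }
}

Keeping the arrays in the sizes GL accepts directly (glNormalPointer takes 3
components, glTexCoordPointer takes 2) avoids this pass entirely.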

Specifics:

> why ?
> 1) i can apply a single processVertex-function to all arrays without 
> having to know anything about the type of data (this reminds me strongly 
> of the "generalized 3d shape synthesizer")

It's a good idea, but the sacrifice of performance for flexibility might be far
too great to leave a usable system in the end.

> 2) there shouldn't be a problem with memory nowadays.

GEM is very memory efficient, but take a look at what I had to do with
vertex_model - there is a cached copy of the model.  This could potentially put
some strain on memory in certain cases like my Powerbook that has 1GB max RAM,
which is even on the high side as far as laptops go.
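Rough arithmetic on the overhead, assuming 32-bit floats and the four arrays in
question (vertex, color, normal, texcoord): the mixed layout costs 13 floats
(52 bytes) per vertex, the unified [Nx4] layout 16 floats (64 bytes), about 23%
more, and doubled again wherever something like vertex_model keeps a cached
copy.  As a sketch:

/* Per-vertex footprint, assuming 32-bit floats and four arrays:
 * vertex [Nx4], color [Nx4], normal [Nx3], texcoord [Nx2]. */
enum { F = sizeof(float) };                          /* 4 bytes */

unsigned mixed_bytes(void)   { return (4 + 4 + 3 + 2) * F; }  /* 52 */
unsigned unified_bytes(void) { return (4 + 4 + 4 + 4) * F; }  /* 64 */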

> 3) operations on the normalArray is probably faster than with [Nx3] 
> (given that SIMD needs aligned memory)

Honestly, I don't foresee doing a whole lot of normal processing.  But take
heart, because even though the 96-bit-wide data isn't ideal for SIMD, it can be
dealt with efficiently.  Here's how:


while (count < max) {

    //interleave the scalar float and vector ops

    //vector float (fused multiply-add)
    vertex = vec_madd(vertex, scale, offset);

    //vector int (saturated add)
    color = vec_adds(color, color_offset);

    //scalar float
    normal = normal * scale;

    //vector float
    texcoord = vec_madd(texcoord, tex_scale, tex_offset);

    count++;
}

Here the super-scalar architecture would still issue these operations
immediately, as there would be nothing in the pipeline (ideal case) before them
and no dependencies either.  The only possible stall would come from the
texcoord depending on a vertex op, or from the vertex op not being pipelined.
Subsequent ops would have to be pipelined as always, but at the very least all
ops would be issued immediately.  Something like a PPC 970 could really crank
on this with its dual FPUs (although the float and int vector units share the
same resources, which may or may not be problematic). 
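For reference, a portable scalar stand-in for the same interleaved loop (no
AltiVec, per-element rather than per-vector, and all scale/offset values made
up) showing the structure that lets independent ops issue side by side:

#include <stddef.h>

/* Scalar sketch of the interleaved loop: each iteration touches
 * vertex, color, normal and texcoord, four mutually independent
 * streams that a superscalar core can issue in parallel.
 * The constants here are hypothetical. */
void process_vertices(float *vertex, unsigned char *color,
                      float *normal, float *texcoord, size_t n)
{
    const float scale = 2.0f, offset = 1.0f;
    for (size_t i = 0; i < n; i++) {
        vertex[i]   = vertex[i] * scale + offset;        /* madd  */
        color[i]    = (unsigned char)(color[i] + 16 > 255
                          ? 255 : color[i] + 16);        /* adds  */
        normal[i]   = normal[i] * scale;                 /* float */
        texcoord[i] = texcoord[i] * scale + offset;      /* madd  */
    }
}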

I think the arrays should remain specific to their data types for the most
efficient handling of them.  The real-time nature of GEM pretty much demands
this.

 
> _sources_: OBJ-loader, grid, quad, supershape, random, sphere,...
> _sinks_: draw, OBJ-exporter
> _manips_(with only a lefthand gemlist): add(=offset), scale, set, 
> matrix-multiplication, rotation
> _manips2_(with 2 gemlists): add, mul, set, blend
> _misc_: info, merge (e.g. take array1 of gemlist1 as color and array4 of 
> gemlist2 as vertex)

That's pretty close to my original list, although I favor having more
non-standard sources like the supershape rather than the usual Geo primitives. 
The OBJ exporter can reuse the code we already use for importing the models.

Also, it is possible to use vertex arrays in display lists, so that might make
for a nice option.

cgc




