Creating
efficient triangle strips
Pierre Terdiman
Last revision:
04.01.2000
Efficient rendering of triangle-based meshes often requires you pack
them into triangle strips. Strips provide interesting advantages : the
mesh is stored in a more compact way, hence wasting less ram, and you save
bandwidth when you send it to your favorite rendering API. It is usually said
that you need two to three times less bus traffic with strips than with rough
triangles. In a nutshell, triangle strips are the most efficient primitives on
today’s hardware.
However strips give rise to a thorny problem : the way one should
create them. In this article I’ll review the standard algorithms used to create
strips, and discuss some improvements I made in my own stripifier.
For the happy beginners out there, let’s briefly recall what a triangle
strip is. Say your mesh contains a list of connected triangles. A triangle is
made of three vertex references, and in case of connected triangles, two of
them may be shared from one triangle to another. The list of indices resulting
from this sharing forms a triangle strip. For example those triangles :
012
123
234
345
are equivalent to a single strip :
012345
If your own 3D engine only works with trilists (i.e. standard
lists of triangles), you may wonder whether moving to tristrips is worth
the pain. List of triangles are a lot easier to deal with, especially when you
have dynamic topologies (e.g. for terrains) and may well be fast enough for
your needs... But come on ! Things never are fast enough ! You
should keep trilists for all dynamic meshes whose topology changes over time,
because recomputing strips on-the-fly is a very tedious task which sometimes
involves the use of skip lists [1], and other non-trivial data structures.
However on-the-fly stripification sometimes happens with smart drivers,
provided your trilists have been preprocessed and transformed into a suitable
format. The Dreamcast version of DirectX for example, contains an index list
optimizer which transforms your vertex references so that the driver can
further efficiently build strips during the DrawPrimitive call.
Unfortunately this doesn’t exist on PC, neither in DirectX, nor in OpenGL. So
you’re still supposed to create your own triangle strips, and that’s the only
way to reach the limits of the newest video cards such as the GeForce256.
The creation of triangle strips from an arbitrary mesh is an NP-complete
problem [4]. NP stands for « Nondeterministic Polynomial time », and
a problem is said to be NP-complete if it is both NP and NP-hard. Basically,
all we have to know is that a NP problem doesn’t have any theoretical solution.
Hence, one way or another we’re stuck with heuristic methods to solve it, and
we’re free to tackle that problem the way we want. Roughly speaking, there are
two main algorithms used to create triangle strips : the SGI algorithm
and the STRIPE algorithm. Both have been implemented and widely used
through years. The most popular source code available for the first one was
written by Brad Grantham [3], the other is part of a classic package called
STRIPE [5], written by Evans & al. At Meltdown X’99, Mike Chow from 3dfx
presented a variant of the SGI algorithm [2] which works just as well.
Unfortunately, all available codes have various problems. Brad
Grantham’s code outputs strips which aren’t single-sided, STRIPE especially
works with quads, and so on. My implementation works with triangles in input,
outputs single-sided strips, and uses the SGI algorithm or Chow’s variant
according to what the user wants. On top of that, I also implemented the
look-ahead improvement that Mike Chow presented in [2].
In order to stripify a mesh you need a data structure to describe the
connections between all faces. The adjacency structures I use are
presented in the next paragraph. Some may have used a classic winged edge
structure, but I find mine easier to handle. Anyway, you may rewrite all of
this the way you want, that’s just a matter of feeling.
Here’s the standard algorithm :
1)
choose a
starting face for a strip
2)
choose a
direction (i.e. an edge) in which you’ll walk along the strip
3)
actually extend
the strip in the chosen direction until you reach a triangle with no forwards
connections
4)
go to 1) until
all faces have been visited
A basic improvement (at least implemented by Grantham) is :
3b) reverse the strip and extend it in the opposite
direction
I’ll now go into the details of each step.
To create triangle strips by tracking connected triangles you need to be
able to jump from one triangle to any of its three possible adjacent neighbors.
Hence the need for adjacency structures which will provide this information.
Here’s what I use :
struct
AdjTriangle{
udword VRef[3]; //
Vertex references
udword ATri[3]; //
Triangle references
};
The triangle has the usual references to three vertices, and also three
new references to possible adjacent triangles. My convention say that :
ATri[0] is the triangle adjacent to edge 0-1
ATri[1] is the triangle adjacent to edge 0-2
ATri[2] is the triangle adjacent to edge 1-2
Of course edge a-b is the one which links VRef[a] and VRef[b]. I could
have used three explicit names (say ATri01, ATri02 and ATri12) but using an
array allows for an easiest access. I also could have used pointers but indices
are better : I can pack extra information in their most significant bits.
Current version uses the two most significant bits to encode an edge number
between 0 and 2. What edge number ? For example when I jump from triangle
A to triangle B through the edge 0-1 (i.e. A->ATri[0]), the counterpart edge
in B is the one whose ID is encoded by the two most significant bits of
A->ATri[0]. That way I automatically know where I come from when I jump to a
new triangle. This is not difficult to implement, and this is quite handy.
Another way would have been to keep track of the two vertex references of the
incoming edge, and scan the three vertex references of the new triangle to find
which of its edges is the same as the one we just crossed. Yuck !
My way is a lot faster, I just have a udword to shift. Is it worth the extra
code ? Is the speed gain negligible ? You may think that’s a lot of
pain just to create triangle strips, don’t you ? Correct. But the way
they’re built, you can use the exact same structures to implement a lot of
other things. I successfully used them to track the silhouette of a mesh (an
operation involved in the shadow volumes computation, but also in occlusion
culling using shadow frusta, etc), do local searches in collision detection,
and subdividing surfaces with the modified Butterfly algorithm. Among other
things. Trust me, those ones are worth the effort.
Fine. Now, how do we create the structures ?
We first allocate a structure for each face, initialized will the
correct vertex references and null links. Since we’re not using pointers, a
null link is encoded as 0xffffffff. We also create three edge structures for
each triangle :
struct
AdjEdge{
udword Ref0; //
Vertex reference
udword Ref1; //
Vertex reference
udword FaceNb; // Owner face
};
At first we don’t try to detect shared edges, we just create three times
as many edges as there are faces. Vertex references are pre-sorted in those
structures, so that Ref0 < Ref1. That way we ensure an edge a-b will be
encoded as an edge b-a. All we have to do next is to sort all the AdjEdge
structures, in order to group faces sharing the same edge. If we use an enhanced
Radix Sort as described in [6], this part is both easy and fast : actually
just one line of code. Then, we just have to read the structures in sorted
order to get connected faces back. Once we have those faces and the common
edge, we just have to fill the adjacency structures with the relevant
information. Some faces may have boundary edges whose links will remain
encoded as 0xffffffff. As we will see just below, those faces will be the first
ones chosen by the SGI algorithm when we’ll start creating strips.
There are two standard ways of choosing the first triangle of a
strip:
-
the first
method is the one from the SGI algorithm : pick the less connected faces
first. Such faces, due to their lack of connections, are easily left isolated
after some strips have been created. The idea is to take those faces first, in
order to reduce the number of isolated triangles in the final mesh. This is
just a heuristic method, but it works well in practice.
-
The second
method is just to take the first free face, i.e. the first face which doesn’t
belong to a strip yet. This is what Chow does and according to him it’s just as
good as the SGI way.
Both methods can be coded in a very similar way by using a precomputed insertion
order, i.e. the order in which we’ll pick the faces to start creating a new
strip. The insertion order for the second method is just the decimal one :
we pick triangle 1 first, then triangle 2, and so on. For the SGI algorithm, we
just have to sort the default list according to the number of connections of
each triangle. The number of connections depends on the number of boundary
edges, as defined in the previous paragraph, and we already have this
information thanks to the adjacency structures. We just have to call the sort
routine once more.
My own code takes one method or the other according to what the user
wants.
Once we have our starting face, we must choose a direction in which
we’ll extend the strip. Most implementations just take the first edge, and let
it go. Chow proposed in [2] to look ahead. I don’t know if what I did
was the same as what he suggested. In my implementation indeed I chose the
simple, brute-force solution : there are three edges, or three possible
directions, then I just compute the three possible strips and select the longest
one.
Since the creation of a strip takes linear time, this is not a matter of
running time, but just a matter of coding : this is relatively painful to
implement, because you must keep track of every operations you do, just to be
able to discard them in the end. Well. No subtle things there, just some more
buffers to fill and some extra code to debug.
Of course I also reverse the three possible strips and extend all of
them in the opposite directions. Hence, starting from a given face, I just
can’t miss the best possible strip sharing this face. The only thing which
could produce better strips is the order in which you select your starting
faces, but this is left for further investigation.
Walking the strip is done with the following code :
udword Length = 2; // Initial length is 2, we have 2 indices
in input
strip[0] = oldest; // First index of the strip
strip[1] = middle; // Second index of the strip
bool DoTheStrip =
true;
while(DoTheStrip)
{
//
Get the third index of a face given two of them
udword
Newest = mAdj->mFaces[face].OppositeVertex(oldest, middle);
strip[Length++]
= Newest; // Extend the strip,...
*faces++
= face; // ...keep track of the face,...
tags[face]
= true; // ...and mark it as "done".
// Get the edge ID…
ubyte
CurEdge = mAdj->mFaces[face].FindEdge(middle, Newest);
//
...and use it to catch the link to adjacent face.
udword
Link = mAdj->mFaces[face].ATri[CurEdge];
if(IS_BOUNDARY(Link)) DoTheStrip = false;
//
If the face is no more connected, we're done...
else
{
// ...else the link gives us the new face index.
face = MAKE_ADJ_TRI(Link);
// Is the new face already done?
if (tags[face]) DoTheStrip=false;
}
oldest
= middle; // Shift the indices and wrap
middle
= Newest;
}
return Length;
Click here for Picture1 :
the three triangle strips (Red, Magenta, Brown) have all been generated
starting from the first face of the sphere. Only the longest one is kept in the
final strip, others are discarded.
There is a little problem with standard available code for
tristrips : backface culling. For example the code from Brad Grantham
produces strips whose orientation isn’t guaranteed to be the same as the one of
the original mesh. In other words, if you want your strip to be correctly
displayed, you must use double-sided faces. Quite annoying. This problem comes
from the 3b) optimisation : when reversing the strip, the original
starting face can be flipped, depending on the strip length. Care must be taken
to ensure the final strip (extended in both directions) has the same
orientation as the original model.
That’s not as easy as it sounds. To reverse the culling of a strip, it
isn’t sufficient to write it in reverse order. It actually depends on its length.
Let’s have some examples.
Say we start from face (0,1,2). From this face, we create the first part
of a strip, for example :
0
1 2 3 4
Those 5 indices encode 3 faces :
012
123
234
Please note a strip of N indices always encodes N-2 faces. The first face encoded, (0,1,2) is in an arbitrary CW order, and we want that order to be conserved. Now, to extend the strip in the opposite direction, we must write it in reverse order, so that the first face becomes the last one :
4
3 2 1 0
Then we can extend that strip even more, and compute a final strip which would be, for example :
4
3 2 1 0 5 6 7
But this strip now encodes the following faces :
432
(+)
321
(-)
210
(+)
105
(-)
056
(+)
567
(-)
The original face (0,1,2) is now the third one in the strip. Don’t
forget that culling order is inverted from one face to the next in a
triangle strip. The sign near the faces gives you the actual culling. It is the
same as :
432
(+)
312
(+)
210
(+)
150
(+)
056
(+)
576
(+)
Hence, our original face is now displayed as (2,1,0), which is the same
as (0,2,1) (you can safely « scroll » the numbers without changing
the culling). As you see, that face has been inverted, and if the original one
was CW, that one is to be displayed as CCW.
If you just write the whole strip in reverse order once gain, you may
not fix the problem :
7
6 5 0 1 2 3 4
encodes
765
(+)
650
(-)
501
(+)
012
(-)
123
(+)
234
(-)
That is, we get our original face back (0,1,2) but its position in the
strip implies it still is displayed as CCW (see the minus sign on the side).
Would our final strip have had another extra index, the final culling for our
original face would have been correct.
In bad cases such as our example, the only solution is to replicate the
first index, introducing a void vertex but actually fixing the culling
problem.
In short, here’s the recipe for single-sided strips:
-
if the length
of the first part of the strip is odd, the strip must be reversed
-
to reverse the
strip, write it in reverse order. If the position of the original face in this
new reversed strip is odd, you’re done. Else replicate the first index.
Picture 2 :
Click for a one-sided striped teapot.
Longer strips can be created by using swaps [7], but multiple
strips can also be connected in a single one, even if they don’t share
vertices. For example, say you want to connect the strip :
0
1 2 3 4
and the strip :
5
6 7 8 9
They don’t have any common vertex, but you still can create the following strip :
0
1 2 3 4 4 5 5 6 7 8 9
Which encodes the following triangles :
012
(+)
123
(-)
234
(+)
344
(-)
445
(+)
455
(-)
556
(+)
567
(-)
678
(+)
789
(-)
This method requires inserting two void vertices in the strip,
and it produces four void faces, but since those faces have two common
indices, they rapidly get discarded by your rendering API as zero-area faces,
and don’t get very far into the rendering pipeline – ie they’re almost free.
Whatever happens, sending two more vertices is still better than calling
DrawPrimitive two times. Multiple rendering calls indeed is the main
disadvantage of strips, compared to trilists. If you manage to render all your
strips with a single call, you’re getting best of both worlds. Hence, even if
linking two strips in such a way introduces extra vertices, it usually is a
win. As far as I know, this method works well in practice.
However, as the signs near the faces suggest, linking two strips may
flip the culling ! That’s why packing all your strips in a single one is
somewhat delicate. The easiest way is to create your strips in the usual way,
as described until now, and to link them together only in the end. Sometimes
you will need to flip a strip, as in the example above. This happens when the
total length of all accumulated strips is odd. To flip the incoming strip you
can just replicate the first vertex. However that strip may already
begin with a replicated vertex, because of the way we created them. In such
cases, we just can discard the first replicated vertex, which flips the strip
as well, but saves some space.
Picture 3 :
a mesh packed in a single, single-sided strip
Source code
All
the features discussed in this article have been implemented in the companion source code. You’re free to
include it in your own commercial or non-commercial products, as long as you
send me a mail for
notification. Feedback is welcome as well.
References
[1] Jihad El-Sana, Elvir Azanli, Amitabh Varshney, Skip Strips :
maintaining Triangle Strips for View-dependent Rendering
[2] Mike Chow, Using Strips for Higher Game Performance, Meltdown
X99 presentation
[3] Brad Grantham’s web site :
http://tofu.alt.net/~grantham/meshifier/show.html
[4] Francine Evans and Steven S. Skiena and Amitabh Varshney. Optimizing
Triangle Strips for Fast Rendering , IEEE Visualization '96, pp.
319-326 (October 1996). IEEE. Edited by Roni Yagel and Gregory
M. Nielson. ISBN 0-89791-864-9.
[5] STRIPE homepage :
http://www.cs.sunysb.edu/~stripe/
[6]
Pierre Terdiman, Radix Sort Revisited,
http://codercorner.com/RadixSortRevisited.htm
[7] Tomas
Möller, Eric Haines, Real-Time Rendering (p.232) ISBN 1-56881-101-2