While I’m at it…
Friday, February 22nd, 2008

What’s the deal with alpha-sorted polygons these days? I hear more and more that engine X or Y doesn’t support them because “it’s too slow”. Errr, hello? We even sorted polygons on 8 MHz Atari STs, so I assume you can manage to do the same on a 3 GHz multi-core machine? Sure, the number of polygons has increased, but so has the CPU speed. So what’s the problem?
- the first problem is that they don’t use the right algorithm. If the number of polygons and the CPU speed evolve in parallel, the cost of the sort only stays constant if the sort itself is O(n) - a radix sort, for instance (first sketch after this list). Since they insist on using qsort, which is O(n log n), well, they’re doomed before the race even starts.
- the second problem is that they don’t generate their sort keys efficiently. No, you don’t need that square root. No, you don’t need the distance to the camera position. You need a simple dot product: the vertex’s projection onto the camera’s forward vector (also in the first sketch below).
- the third problem is that they sort everything each frame. It’s useless. Sort the index buffer once, and only sort it again when the view direction has changed significantly (second sketch below). You don’t need an exact sort each frame since the whole process is not accurate to begin with (each triangle is sorted according to a single witness point). So you actually need to sort stuff only once in a while - or all the time but spread over N frames, if you prefer.
- the fourth problem is that they sort at runtime. You don’t have to. As we just said the sorting is approximate, so you can precompute 8 sorts/index buffers for the 8 possible octants in which the view vector can be, and simply pick the correct one at runtime (third sketch below). This has been used countless times in countless demos on Atari and PC. Seriously old skool. It certainly increases the memory used, but here’s my deal for you: drop the STL, and you’ll win back all the memory you need to include those precomputed sorted buffers. The number of transparent objects shouldn’t be huge anyway, so this is usually not a problem. Especially since those sorted buffers can be easily compressed like crazy, and decompressed to the actual D3D index buffer only when the view vector changes octant.
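For the curious, here’s a minimal sketch of the first two points in hypothetical C++ (Vec3, the witness array and the function names are made up, not from any particular engine): one dot product per triangle for the key, then a plain O(n) byte-wise radix sort instead of qsort. The float-to-unsigned remapping is the usual IEEE-754 trick.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical minimal types, for illustration only.
struct Vec3 { float x, y, z; };

static inline float Dot(const Vec3& a, const Vec3& b)
{
    return a.x*b.x + a.y*b.y + a.z*b.z;
}

// Classic trick: remap an IEEE-754 float to an unsigned int whose unsigned
// ordering matches the float ordering, so a plain radix sort works on it.
static inline uint32_t FloatFlip(float f)
{
    uint32_t u;
    std::memcpy(&u, &f, sizeof(u));
    const uint32_t mask = uint32_t(-int32_t(u >> 31)) | 0x80000000u;
    return u ^ mask;
}

// Sort triangles back-to-front along the view direction, in O(n), with a
// byte-wise LSD radix sort. 'witness[i]' is the witness point of triangle i
// (its centroid, say), 'forward' is the camera's forward vector, and 'order'
// receives the sorted triangle indices.
void SortTrianglesBackToFront(const std::vector<Vec3>& witness,
                              const Vec3& forward,
                              std::vector<uint32_t>& order)
{
    const uint32_t n = uint32_t(witness.size());
    std::vector<uint32_t> keys(n);
    order.resize(n);
    for (uint32_t i = 0; i < n; ++i)
    {
        // No square root, no distance to the camera position: one dot product
        // gives the depth along the view direction. Negate it so that keys in
        // ascending order mean triangles in back-to-front order.
        keys[i] = FloatFlip(-Dot(witness[i], forward));
        order[i] = i;
    }

    // Four 8-bit passes over the 32-bit keys.
    std::vector<uint32_t> tmp(n);
    for (uint32_t pass = 0; pass < 4; ++pass)
    {
        const uint32_t shift = pass * 8;
        uint32_t count[256] = {};
        for (uint32_t i = 0; i < n; ++i)
            count[(keys[order[i]] >> shift) & 0xFF]++;

        uint32_t offset[256];
        uint32_t sum = 0;
        for (uint32_t i = 0; i < 256; ++i) { offset[i] = sum; sum += count[i]; }

        for (uint32_t i = 0; i < n; ++i)
            tmp[offset[(keys[order[i]] >> shift) & 0xFF]++] = order[i];
        order.swap(tmp);
    }
}
```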
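The “don’t sort every frame” part, building on the sketch above (the threshold is a made-up example, and the forward vector is assumed normalized):

```cpp
// Re-sort only when the view direction has drifted far enough since the last
// sort. Assumes Vec3, Dot and SortTrianglesBackToFront from the sketch above.
struct LazySorter
{
    Vec3 mLastForward = { 0.0f, 0.0f, 0.0f };
    bool mDirty       = true;

    void Update(const Vec3& forward,               // assumed normalized
                const std::vector<Vec3>& witness,
                std::vector<uint32_t>& order)
    {
        // Arbitrary threshold (~cos 25 degrees): tune to taste, the sort is
        // approximate anyway.
        const float threshold = 0.9f;
        if (mDirty || Dot(forward, mLastForward) < threshold)
        {
            SortTrianglesBackToFront(witness, forward, order);
            mLastForward = forward;
            mDirty       = false;
        }
    }
};
```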
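And the fully precomputed flavor, again building on the same sketch: one sorted order per octant built offline, three sign tests at runtime to pick it. The canonical per-octant directions don’t need normalizing, since scaling the keys doesn’t change their order.

```cpp
// Offline / load-time: build one sorted triangle order per octant of the view
// direction, using the octant's canonical direction as 'forward'.
void PrecomputeOctantOrders(const std::vector<Vec3>& witness,
                            std::vector<uint32_t> orders[8])
{
    for (uint32_t oct = 0; oct < 8; ++oct)
    {
        const Vec3 forward =
        {
            (oct & 1) ? 1.0f : -1.0f,
            (oct & 2) ? 1.0f : -1.0f,
            (oct & 4) ? 1.0f : -1.0f,
        };
        SortTrianglesBackToFront(witness, forward, orders[oct]);
    }
}

// Runtime: three sign tests pick the right precomputed order. You only touch
// (decompress / upload) the actual index buffer when this value changes.
inline uint32_t OctantFromView(const Vec3& forward)
{
    return (forward.x > 0.0f ? 1u : 0u)
         | (forward.y > 0.0f ? 2u : 0u)
         | (forward.z > 0.0f ? 4u : 0u);
}
```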
So, what’s your excuse already?