Archive for the 'Physics' Category

The evolution of PhysX (12/12) - Final test and conclusion

Saturday, May 11th, 2013

This last test (“SeaOfStaticBoxes3”) didn’t fit in any category, so I kept it for the end. It does not actually do anything, it only creates a massive number of static boxes (255*255). There are otherwise no dynamic objects in the scene, nothing to simulate at all.

There are 2 things to notice here. The first one is memory usage. We went from 94Mb to 54Mb to 39Mb, from one PhysX version to another. That’s the right direction here!
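To put those figures in perspective, a back-of-the-envelope division gives a memory cost per static box. This is only a rough sketch under two assumptions that the measurements do not guarantee: the reported sizes are multiples of 2**20 bytes, and the whole footprint is attributed to the boxes.

```python
# Rough per-box memory cost for the 255*255 static boxes.
# Assumptions (not from the measurements): sizes are multiples of
# 2**20 bytes, and the whole footprint is attributed to the boxes.
NUM_BOXES = 255 * 255

# Totals quoted in the post, oldest PhysX version first.
reported_sizes = [94, 54, 39]

def bytes_per_box(total, count=NUM_BOXES):
    return total * 2**20 / count

per_box = [bytes_per_box(size) for size in reported_sizes]
for size, cost in zip(reported_sizes, per_box):
    print(f"{size} -> about {cost:.0f} bytes per box")
```

Under those assumptions the cost drops from roughly 1.5 KB per box to a bit more than 0.6 KB per box.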

The other thing is the time it takes to simulate an empty scene. As you can see, it doesn’t take much time with PhysX – as it should. Bullet however, for some reason, takes a massive amount of time here. On my home PC, Bullet takes about 34ms (!?) to simulate this empty scene. I double-checked everything, looking for something I did wrong, but it turns out the problem has already been reported on the Bullet forums. I think this should be fixed and work out-of-the-box.

In any case, I am not here to fix Bullet issues. The point is simply, again, that contrary to what people may still believe, PhysX is actually very optimized and a perfectly fine CPU physics engine. In fact, if some competitors did not prevent me from publishing the results, I would happily show you that it often beats everybody else. I invite curious readers to create their own benchmarks and see for themselves.

—-

This report is currently incomplete: it did not talk about CCD, or multithreaded simulations, or overlap tests. It also barely scratched the surface of what these physics engines have to offer: I did not talk about character controllers, vehicles, the performance of binary deserialization, or any of those more advanced topics.

As you can easily imagine, I’d need to write a book to cover all this, not just some blog posts.

However I think these posts may reach their limited goal, which is simply to show with very basic tests that:

  • PhysX is getting faster all the time
  • PhysX is very well optimized, thank you very much

The evolution of PhysX (11/12) - More sweep tests

Saturday, May 11th, 2013

Now that we’re done with the simple stuff, let’s jump to the next level: meshes.

The first mesh scene (“SceneBoxSweepVsStaticMeshes_Archipelago”) performs 1K vertical box sweeps against the mesh level we previously used.

There are no terribly big surprises here. All engines report the same number of hits. Each PhysX version is faster than the one before. They all perform adequately.

PhysX 3.3 is only about 2X faster than PhysX 2.8.4, so things were generally ok there.

Bullet suffers a bit, being a bit more than 6X slower than PhysX 3.3 on average.

—-

For the next tests we switch back to the Konoko Payne level, which is much more complex and thus, maybe, closer to what a modern game would have to deal with. We start with sphere-sweeps against it (“SceneSphereSweepVsStaticMeshes_KP”).

Results are quite similar to what we got for the Archipelago scene: PhysX gets faster with each new version, 3.3 is a bit more than 2X faster compared to 2.8.4.

Bullet suffers again, being now a bit more than 8X slower than PhysX 3.3 on average.

—-

The same test using box-sweeps (“SceneBoxSweepVsStaticMeshes_KP”) reveals the same kind of results again.

This time PhysX 3.3 is about 3.3X faster than 2.8.4.

Remarkably, PhysX 3.3 is more than an order of magnitude faster than Bullet here (about 11.6X)!

—-

Finally, the same test using capsule-sweeps (“SceneCapsuleSweepVsStaticMeshes_KP”) confirms this trend.

PhysX 3.3 is a bit more than 2X faster than 2.8.4, once again.

As for Bullet, it is now about 15X slower than PhysX 3.3. I feel a bit sorry for Bullet but I think this proves my point: people who claim PhysX is not optimized for CPU should pay attention here.

—-

Now the KP scene, as you may remember from the raycasts section, has a lot of meshes in the scene but each mesh is kind of small. We will now test the opposite, the Bunny scene, which was a scene with just one highly-tessellated mesh.

And since we are here to stress test the engines, we are going to run several long radial sweeps against it.

And using large shapes to boot.

This is not supposed to be friendly. This is actually designed to break your engines.

So, what do we get?

—-

Let’s start with the spheres (“SceneSphereSweepVsStaticMeshes_TessBunny”).

Note that we only use 64 sweeps here, but it takes about as much time as (or more than) performing 16K radial raycasts against the same mesh… Yup, such is the evil nature of this stress test.

It is so evil that for the first time, there really isn’t much of a difference between the PhysX versions. We didn’t progress much here.

Nonetheless, PhysX is an order of magnitude faster than Bullet here. Again, contrary to what naysayers may say, the CPU version of PhysX is actually very fast.

In fact, stay tuned.

—-

Switch to capsule sweeps (“SceneCapsuleSweepVsStaticMeshes_TessBunny”). Again, only 64 sweeps here.

There is virtually no difference between PhysX 3.2 and 3.3, but they are both measurably faster than 2.8.4. Not by much, but it’s better than no gain at all, as in the previous test.

Now the interesting figure, of course, is the performance ratio compared to Bullet. Yep, PhysX is about 25X faster here. I can’t explain it, and it’s not my job. I’m only providing a reality check for a few people here.

—-

Ok, this is starting to be embarrassing for Bullet so let’s be fair and show that we also blew it in the past. Kind of.

Switch to box sweeps (“SceneBoxSweepVsStaticMeshes_TessBunny_Test1”). And use fairly large boxes because it’s more fun. Again, we only need 64 innocent sweep tests to produce this massacre.

Look at that! Woah.

“Each PhysX version is faster than the one before”, hell yeah! PhysX 3.3 is about 4X faster than PhysX 3.2, and about 31X faster (!) than PhysX 2.8.4. Again, if you are still using 2.8.4, do yourself a favor and upgrade.

As for Bullet… what can I say? PhysX 3.3 is 123X faster than Bullet in this test. There. That’s 2 orders of magnitude. This is dedicated to all people still thinking that PhysX is “not optimized for the CPU”, or “crippled”, or any other nonsense.

The final test (“SceneLongBoxSweepVsSeaOfStatics”) has a pretty explicit name. It’s just one box sweep. The sweep is very large, going from one side of the world to the other, in diagonal. The world is massive, made of thousands of statics. So that’s pretty much the worst case you can get.

The results are quite revealing. And yes, the numbers are correct.

For this single sweep test, PhysX 3.3 is:

  • 60X faster than PhysX 3.2
  • 270X faster than PhysX 2.8.4
  • 317X faster than Bullet

Spectacular, isn’t it?

The evolution of PhysX (10/12) - Sweep tests

Saturday, May 11th, 2013

Another important feature is sweep tests, a.k.a. “linear casts” or “shape casts”. This can be used, for example, to implement character controllers. What I previously wrote for raycasts is even more valid for sweeps: a lot of things depend on whether you go for a generic GJK-based sweep, or for dedicated shape-vs-shape sweep functions. And since sweep functions are usually much more difficult to write than raycast functions, this is often the part of physics engines where things sometimes go spectacularly wrong.
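As an illustration of what a dedicated shape-vs-shape sweep function can look like, here is a minimal sketch of the simplest case, sphere-vs-sphere. It relies on the standard observation that sweeping one sphere against another is the same as casting a ray from the moving sphere's center against the static sphere inflated by the moving radius. The function name and parameters are made up for this example and do not come from any of the engines tested here.

```python
import math

def sweep_sphere_vs_sphere(start, direction, max_dist,
                           moving_radius, center, radius):
    """Distance travelled before first contact, or None if no hit.

    `direction` must be unit length. Hypothetical helper, not an engine API.
    """
    inflated = radius + moving_radius
    # Vector from the static sphere's center to the sweep start.
    m = [s - c for s, c in zip(start, center)]
    b = sum(mi * di for mi, di in zip(m, direction))
    c = sum(mi * mi for mi in m) - inflated * inflated
    if c <= 0.0:
        return 0.0          # already overlapping at the start (one convention)
    if b >= 0.0:
        return None         # moving away from the static sphere
    disc = b * b - c
    if disc < 0.0:
        return None         # the swept path misses the inflated sphere
    t = -b - math.sqrt(disc)
    return t if t <= max_dist else None

hit = sweep_sphere_vs_sphere((0, 0, 0), (1, 0, 0), 100.0, 2.0, (10, 0, 0), 1.0)
too_short = sweep_sphere_vs_sphere((0, 0, 0), (1, 0, 0), 5.0, 2.0, (10, 0, 0), 1.0)
miss = sweep_sphere_vs_sphere((0, 0, 0), (1, 0, 0), 100.0, 2.0, (10, 5, 0), 1.0)
```

Other pairs (box-vs-convex, capsule-vs-mesh, etc.) have no such shortcut, which is part of why they are harder to write and easier to get wrong.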

There are even more potential things to test here than for raycasts. Even when you go for customized raycast functions, you get away with N of them (e.g. for N supported shapes you need to write N raycast functions). For sweeps however, you can decide to sweep any shape against any other, so you suddenly need to write N*(N-1)/2 functions, and each of them is a lot harder than just a raycast. This is why most engines just go for a generic GJK-based sweep function, which can be used as-is for all cases. The downside of this is that the generic code, as often, may not be as fast as a customized function.
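The growth in the number of functions can be checked by simply enumerating the pairs. The sketch below uses an illustrative list of four shape types. Note that N*(N-1)/2 counts pairs of different shapes only; if sweeps between two shapes of the same type (sphere-vs-sphere, box-vs-box) also get their own function, the count becomes N*(N+1)/2.

```python
from itertools import combinations, combinations_with_replacement

# Illustrative shape list; a real engine may support more (meshes, heightfields...).
shapes = ["sphere", "box", "capsule", "convex"]
n = len(shapes)

# One dedicated raycast function per shape.
raycast_functions = [f"ray-vs-{s}" for s in shapes]

# One dedicated sweep function per pair of different shapes.
distinct_pairs = list(combinations(shapes, 2))

# Same, but also counting same-shape pairs such as sphere-vs-sphere.
all_pairs = list(combinations_with_replacement(shapes, 2))

print(len(raycast_functions), len(distinct_pairs), len(all_pairs))
```

Either way the number of sweep functions grows quadratically with the number of shapes, while the number of raycast functions grows linearly.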

But enough theory: let’s see what we get. I will only focus on a small set of selected tests, rather than tediously listing the results for all N*(N-1)/2 cases.

—-

Our first scene (“SceneBoxSweepVsStaticSpheres”) uses the same array of spheres as our initial raycast test. But instead of doing a raycast against each of them, we do a box sweep. We are again using the scene-level API functions to do so.

The boxes are slowly rotating over time, to make sure that we hit all codepaths from the sweep functions, and just to make sure that things don’t suddenly break for some angle, for some reason.

The results are a bit different than usual. Each PhysX version is faster than the one before… kind of. This time there isn’t much of a difference between 2.8.4 and 3.2. It looks like 3.2 is a wee bit faster, but frankly that might be just noise at this point.

Bullet looks fine in this test, very similar to PhysX 2.8.4 and 3.2, although slightly slower. The profile curve also isn’t as smooth as for PhysX but overall it’s quite comparable.

The clear winner in this test though, is PhysX 3.3. It is about 3X faster than both Bullet and the previous PhysX versions. Nice.

—-

Let’s stay with box-sweeps, but let’s try against static boxes now (“SceneBoxSweepVsStaticBoxes”).

Things are a bit different this time: Bullet is faster than 2.8.4 in that test. On the other hand it is slower than PhysX 3.x.

Each PhysX version is faster than the one before – checked. In fact, on average, each PhysX version is 2X faster than the one before. That’s quite something!

—-

Ok we did box-sweeps vs spheres and boxes, let’s see what happens against capsules in the next test (“SceneBoxSweepVsStaticCapsules”).

PhysX 3.x is significantly faster than PhysX 2.8.4 – with again each PhysX version being faster than the one before. In terms of relative performance there’s a remarkable 6.7X speedup on average between 2.8.4 and 3.3. Nice.

Bullet has about the same performance as 2.8.4, with again a much more chaotic curve. But PhysX 3.3 is about 8X faster than Bullet on average here.

—-

Now, you guessed it, after spheres, boxes and capsules, and for the sake of completeness, let’s have a look at box-sweeps against convex objects (“SceneBoxSweepVsStaticConvexes”).

Wo-ah. Now that is a surprise! See, that’s what I was talking about: sometimes things can go spectacularly wrong with sweeps, and you really need a strong test suite to make sure you catch all those problems.

Clearly the 2.8.4 code had a problem here. I can’t remember what it was, but clearly, if you are still using 2.8.4 and you do a lot of box-sweeps against convexes, you seriously need to upgrade. On average, PhysX 3.3 is 34X faster than 2.8.4 in this case. Now that is some serious speedup.

Other than that, each PhysX version is faster than the one before (good), and contrary to what we saw just in the previous test, Bullet remains quite competitive here, even against PhysX 3.3.

So, that’s sweep tests. You never know until you actually profile.

—-

Now, we could repeat this exercise for sphere-sweeps and capsule-sweeps against all the other shapes as we just did for boxes, but the results are very similar. The only surprise comes again from the capsule-vs-convex case (“SceneCapsuleSweepVsStaticConvexes”).

In short, this is the same as for box-vs-convex: 2.8.4 was pretty bad here. On average, PhysX 3.3 is 17X faster than 2.8.4 in this case.

Bullet’s profile curve is weird. I can’t explain why it suddenly becomes faster towards the end. It is probably related to the fact that the capsules are rotating, and the algorithm may switch from one case to another, but I don’t know Bullet’s internals well enough to tell.

The evolution of PhysX (9/12) - Raycasts vs meshes

Saturday, May 11th, 2013

Now, simple shapes are one thing but as for rigid body simulation, what really matters are triangle meshes. So let’s start the serious tests.

The first scene involving meshes (“SceneRaycastVsStaticMeshes_Archipelago16384”) does 16K vertical raycasts against the Archipelago mesh level.

No big surprise here. All engines find the same number of hits. On the right side of the screenshot you can see all the rays that missed the mesh level. It’s all correct.

Each PhysX version is faster than the one before. Bullet is the slowest but remains competitive (“only” about 2X slower than 3.3 here).

—-

The next scene (“SceneRaycastVsStaticMeshes_TessBunny16384”) does again 16K raycasts, but this time they’re radial rays instead of vertical, and we cast them against a single highly tessellated mesh (yes, I guarantee you there is a mesh underneath that mass of raycasts :))

Using a single highly tessellated mesh means that this test stresses the “midphase” structure from that mesh, rather than the scene-level structure. Using radial rays means that engines not using node sorting in the tree traversal will be disadvantaged.
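To make the node sorting remark concrete, here is a toy sketch of a closest-hit raycast through a small bounding-volume tree, with and without visiting the nearer child first. It is not the midphase of any engine tested here: the `Node` class, the function names and the four-box scene are made up, and each leaf box stands in for the triangles a real mesh leaf would contain.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    bmin: tuple
    bmax: tuple
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def ray_vs_aabb(origin, direction, bmin, bmax, t_max):
    """Slab test: entry distance along the ray, or None if missed."""
    t_near, t_far = 0.0, t_max
    for o, d, lo, hi in zip(origin, direction, bmin, bmax):
        if abs(d) < 1e-12:
            if o < lo or o > hi:
                return None
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return None
    return t_near

def entry_distance(node, origin, direction, t_max):
    t = ray_vs_aabb(origin, direction, node.bmin, node.bmax, t_max)
    return float("inf") if t is None else t

def closest_hit(root, origin, direction, t_max, sort_children):
    best, visited, stack = None, 0, [root]
    while stack:
        node = stack.pop()
        visited += 1
        # Shrink the ray to the best hit so far: farther nodes get culled.
        limit = t_max if best is None else best
        t = ray_vs_aabb(origin, direction, node.bmin, node.bmax, limit)
        if t is None:
            continue
        if node.left is None:
            best = t  # leaf box stands in for the mesh triangles
            continue
        children = [node.left, node.right]
        if sort_children:
            # Far child pushed first, so the near child is popped first.
            children.sort(key=lambda c: entry_distance(c, origin, direction, limit),
                          reverse=True)
        stack.extend(children)
    return best, visited

def leaf(x0, x1):
    return Node((x0, -1, -1), (x1, 1, 1))

near_half = Node((1, -1, -1), (4, 1, 1), leaf(1, 2), leaf(3, 4))
far_half = Node((5, -1, -1), (8, 1, 1), leaf(5, 6), leaf(7, 8))
root = Node((1, -1, -1), (8, 1, 1), near_half, far_half)

sorted_result = closest_hit(root, (0, 0, 0), (1, 0, 0), 100.0, True)
unsorted_result = closest_hit(root, (0, 0, 0), (1, 0, 0), 100.0, False)
```

On this toy tree both traversals find the same closest hit at distance 1.0, but the sorted one tests 5 nodes while the unsorted one tests 7, because finding the near hit early lets the shrunken ray cull the far half of the tree. The sort itself costs extra box tests, which this count ignores.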

As before, each PhysX version is faster than the previous one. Bullet is slowest, and about 4X slower than 3.3.

—-

The next scene (“SceneRaycastVsStaticMeshes_KP16384”) is the opposite of what we just tested: it now contains a lot of small meshes, none of them having a lot of triangles. So the test stresses the scene-level “broadphase” structure, more than the individual “midphase” mesh structures.

The scene is just the mesh level from Konoko Payne: it contains 4104 meshes, for a total of 592267 triangles. We perform 16K radial raycasts again, from a random position deep inside the level.

The results are in line with what we saw so far. Each PhysX version is faster than the one before. Bullet is slowest, about 4.7X slower than 3.3 on average.

—-

I repeated the same test from another initial ray position in the level, to see if we got the same results (“SceneRaycastVsStaticMeshes_KP16384_2”). This time some of the rays hit a mesh, some don’t.

Each library (pretty much) reports the same number of hits, that’s good.

Each PhysX version is faster than the one before, that’s good.

Bullet is slowest, about 3.47X slower than 3.3 on average. It’s slightly better than before but not exactly fast. Still, I wanted to see if the better performance was due to the rays not hitting anything, so I created a last test in the same scene, where none of the rays hit anything (“SceneRaycastVsStaticMeshes_KP16384_NoHit”).

But it did not change the results much. Bullet is still slowest, and about 4X slower than 3.3. Well, these are pretty consistent results at least.

—-

The final test for this section (“Terrain_RT”) is a raytracing test against a large terrain (the same terrain we previously used for the ragdoll test).

Compared to the KP scene the mesh level is different, the raycasts are neither radial nor vertical anymore, but the results are quite similar to what we saw so far.

All engines find the same number of hits, that’s good. Each PhysX version is faster than before, that’s good. PhysX 3.3 is again a bit more than 2X faster than 2.8.4, and a bit more than 4X faster than Bullet.

Well, contrary to what people may say on the Internet, we do optimize our code for the CPU.

The evolution of PhysX (8/12) - Raycasts vs simple shapes

Saturday, May 11th, 2013

Measuring performance of rigid body simulation is important. But the dark secret of game physics, if you ask me, is that it is not actually the most important part. The most important and far more useful features for a game are simple collision queries like raycasts and sweeps. So let’s see what we get here.

There’s really an army of tests that one could create here, to stress and benchmark various parts of the system. For example for raycasts, like for contact generation, you usually have a choice between using a generic piece of code to implement all your raycasts (e.g. GJK), or you bite the bullet and implement a dedicated raycast function for each shape (e.g. ray-vs-box, ray-vs-sphere, etc). If you do so, one of these functions may be particularly slow compared to the others, e.g. ray-vs-convex-mesh, and thus you’d need a dedicated test scene to discover this.

PEEL does contain a test scene for each case. But I am not going to report all of them here, since at the end of the day they’re mostly saying the same thing. The following, thus, is only a selected subset of the tests implemented in PEEL.

—-

The first scene (“SceneRaycastVsStaticSpheres”) is as simple as it gets. There’s a bunch of static spheres and we do a raycast against each of them, via the scene-level API. This is important: most physics engines also allow you to raycast directly against the shape itself (bypassing the scene level). But we are interested in the generic raycast performance here, and a large part of that is related to traversing the scene-level structures. Hence, we use the scene/world-level API in all the following tests. (This also means that various local acceleration structures like phantoms in Havok or volume-caches in PhysX are ignored here. We could run some other tests some other day for those things).

In the collision-queries tests I excluded the second PhysX 3.3 build using PCM, since PCM has no impact at all on any of these queries.

Anyway, without further ado, here’s what we get:

The profile graphs in these tests are often very flat, since the rays are not moving, the scene is static, and thus we do exactly the same work each frame.

As usual the main thing we want to check is that each PhysX version is faster than the previous version. This is clearly the case here, good.

Bullet appears to be significantly slower (up to 3X slower than PhysX 3.3), even on this simple test.

—-

The second scene (“SceneRaycastVsDynamicSpheres”) is exactly the same, but we made all the spheres dynamic objects (and thus they are now falling under gravity). This is just to check whether this makes any difference in terms of performance. Turns out it does not, numbers are pretty much the same as before. This is why we will not bother with dynamic objects anymore in subsequent tests.

—-

Replacing the spheres with boxes or capsules gives similar results (Bullet < 2.8.4 < 3.2 < 3.3). However things change when replacing the spheres with convexes, as seen in our next scene (“SceneRaycastVsStaticConvexes”).

As I was just saying before, a dedicated ray-vs-convex raycast function might be slower than the other ones, and this is what we see here.

Each PhysX version is faster than the one before, that’s good.

Contrary to what happened for the other shapes, Bullet is faster than 2.8.4 in this case. It remains significantly slower than PhysX 3.x though.

—-

The next scene (“PotPourri_StaticRaycasts2”) creates a whole bunch of random shapes (spheres/boxes/capsules) at random positions in a cubic volume of space, and creates 20000 random raycasts in that space. The resulting screenshot admittedly looks like a giant exploding mess:

An interesting thing to note is that each library returns a different number of hits. This has to do with a number of things. First there is how each of them treats raycasts whose origin lies inside a shape. Some libraries like Havok do not count this as a hit, for example. Others may not give unified results in this case, i.e. maybe a ray starting inside a box gives a hit, but a ray starting inside a sphere does not. I did not investigate the details of it, but we clarified and unified all of these rules in PhysX 3.3.

Another thing that may contribute to the different number of hits is just FPU accuracy and what happens in limit cases. If the ray just ends on the surface of the shape, or if it just touches an AABB from the scene’s AABB-tree, this can be seen as a hit, or not. Using SIMD or not can also make a difference in these limit cases. Etc. In any case the results are approximately the same, and one could verify that they are indeed all “correct”.
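The effect of these conventions on the hit count is easy to reproduce with a toy ray-vs-sphere test. The sketch below is not taken from any of the libraries: the two flags (`inside_is_hit`, `inclusive_end`) are hypothetical knobs standing in for the kind of rule each library picks, and the four rays are hand-made to land on the limit cases.

```python
import math

def ray_hits_sphere(origin, direction, max_dist, center, radius,
                    inside_is_hit=True, inclusive_end=True):
    """True if the ray hits the sphere. `direction` must be unit length."""
    m = [o - c for o, c in zip(origin, center)]
    b = sum(x * d for x, d in zip(m, direction))
    c = sum(x * x for x in m) - radius * radius
    if c <= 0.0:
        return inside_is_hit            # origin inside the sphere
    if b >= 0.0:
        return False                    # pointing away
    disc = b * b - c
    if disc < 0.0:
        return False
    t = -b - math.sqrt(disc)
    # Limit case: a ray that ends exactly on the surface.
    return t <= max_dist if inclusive_end else t < max_dist

# (origin, unit direction, max distance) against a unit sphere at the origin.
rays = [
    ((-5, 0, 0), (1, 0, 0), 10.0),   # clean hit
    ((0, 0, 0), (1, 0, 0), 10.0),    # starts inside the sphere
    ((-5, 0, 0), (1, 0, 0), 4.0),    # ends exactly on the surface
    ((-5, 3, 0), (1, 0, 0), 10.0),   # clean miss
]

def count_hits(**rules):
    return sum(ray_hits_sphere(o, d, t, (0, 0, 0), 1.0, **rules)
               for o, d, t in rays)

counts = {
    "lenient": count_hits(inside_is_hit=True, inclusive_end=True),
    "no_inside": count_hits(inside_is_hit=False, inclusive_end=True),
    "strict": count_hits(inside_is_hit=False, inclusive_end=False),
}
```

The same four rays give 3, 2 or 1 hits depending on the rules, and every one of those answers is defensible.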

Anyway, each PhysX version is faster than the one before – checked. PhysX 3.3 is about 2X faster than 2.8.4.

Bullet is again significantly slower, about 5X slower than 3.3.

The evolution of PhysX (7/12) - Joints

Saturday, May 11th, 2013

Our test scenes so far only used simple rigid bodies. We’ll now look at a few scenes involving rigid bodies connected by joints.

The first scene (“BridgeUsingHinges”) contains a bunch of bridges, made of boxes connected by hinge/revolute joints. This is again a pretty standard test scene, just to check that the basics are indeed working.

And indeed, they are.

The results are not surprising and not especially revealing. This stuff just works in all engines, which is what we expected from mature libraries. Performance and memory usage are pretty much the same, even though PhysX 3.3 seems to have an edge compared to the competitors.

—-

The next two scenes (“HingeJointChain” and “SphericalJointChain”) are similar, and they just create chains of objects connected by either hinge or spherical joints (one end of the chain being attached to a static object).

No big surprise here. Again, this stuff just works, and both performance and memory usage are very similar in all engines.

Compared to the first scene though, Bullet appears to be measurably slower than PhysX here. On the other hand PhysX 3.3 seems to be once again measurably faster than the other libraries, even in these simple scenes.

But let’s see what happens with more complex scenarios.

—-

The next scene (“SphericalJointNet”) creates a large grid of sphere objects connected by spherical joints, creating a “net” falling down on a larger static sphere.

The resulting curves are interesting. 2.8.4 and 3.2 don’t like this test much, and there’s a big spike when the net touches the large static sphere – which appears in neither Bullet nor 3.3.

I believe this spike is related to broadphase. By nature the SAP algorithm doesn’t really like artificial scenarios with perfectly aligned grids of objects, and when the first collisions occur it probably creates a lot of useless swaps within the internal sorted arrays. In any case this is the scenario I used in this test, bad luck. This is a fair game and I’m not going to edit the scene just to hide this.
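For readers unfamiliar with why swaps matter: an incremental sweep-and-prune typically keeps the box endpoints sorted along each axis from one frame to the next and repairs the order with an insertion sort, so its cost follows the number of swaps. The sketch below only illustrates that cost model with made-up data; it does not reproduce the exact aligned-grid situation from this scene, and it is not the code of any engine tested here.

```python
def insertion_sort_swaps(values):
    """Sort `values` in place with insertion sort; return the swap count."""
    swaps = 0
    for i in range(1, len(values)):
        j = i
        while j > 0 and values[j - 1] > values[j]:
            values[j - 1], values[j] = values[j], values[j - 1]
            swaps += 1
            j -= 1
    return swaps

# Coherent frame: the order from the previous frame is almost still valid,
# only two neighbouring endpoints have crossed.
coherent = list(range(100))
coherent[10], coherent[11] = coherent[11], coherent[10]
coherent_swaps = insertion_sort_swaps(coherent)

# Incoherent frame: half of the endpoints cross the other half at once.
incoherent = list(range(50, 100)) + list(range(50))
incoherent_swaps = insertion_sort_swaps(incoherent)

print(coherent_swaps, incoherent_swaps)
```

With 100 endpoints the coherent frame costs a single swap, while the frame where 50 endpoints cross the other 50 costs 2500 swaps. Many objects changing their relative order in the same frame is the kind of event that can produce a spike.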

I believe Bullet doesn’t spike here because I used its “dbVt” option by default (instead of SAP). And PhysX 3.3 doesn’t spike simply because we improved the SAP there quite a bit – in fact, IIRC we used PEEL and this very scene to identify the problem and work out a fix.

Anyway, if we ignore this initial broadphase problem and focus on the second part of the curves (which is when the net has wrapped itself around the large sphere), we can see what the joint-performance looks like. And we find our “expected” results again: each version of PhysX is a bit faster than the previous one, while Bullet remains noticeably slower.

Kudos to Bullet in that one nonetheless, for avoiding falling into the broadphase trap set up in that test.

—-

The final scene (“Ragdolls_256_OnTerrain”) spawns 256 ragdolls on a massive landscape. Each ragdoll has 19 bones and 18 joints. They are spawned at different altitudes to make sure we don’t hit a pathological case in the broadphase. The landscape is made of 256 chunks, for a total of 492032 triangles.

The PhysX 2.8.4 and PhysX 3.2 curves intersect in that one, with 2.8.4 being a bit faster in the first part, and then slower in the second part. The average time is pretty similar for both.

Then there is PhysX 3.3, which is clearly faster (with or without PCM), and Bullet which is clearly slower.

The evolution of PhysX (6/12) - Rigid bodies (meshes)

Saturday, May 11th, 2013

We only used simple shapes so far, but what really makes or breaks an engine is triangle meshes. Supporting meshes is a lot more difficult than supporting other shapes – there’s a full new “midphase” pass that did not exist before, contact generation becomes much harder, meshes use up quite a bit of extra memory, etc.

So let’s see!

—-

The first scene (“BigConvexesOnPlanarMesh”) just drops a lot of big convexes on a tessellated planar mesh. Vanilla stuff just to see where we stand.

The curves are quite interesting. The initial flat part is when the convexes are falling towards the mesh, without actually touching anything. Then follows a large spike, when they hit the mesh. Then things settle down and engines recover from the hit.

Let’s start with memory usage to get that out of the way. There are basically two categories: engines using about 4Mb (PhysX 2.8.4, PhysX 3.3) and engines using twice (!) that amount (Bullet, PhysX 3.2). I’m not sure how to explain it, or why it’s either 4 or 8. In any case, minus point for PhysX 3.2 here.

Each PhysX version is faster than the one before: good. However there was clearly something fishy with 2.8.4, which is slower than Bullet in this scene. That did not happen much so far. Gut feeling is that the SAT code from 2.8.4 didn’t handle that large-convex-vs-tessellated-mesh situation very well.

So, as far as PhysX is concerned, 3.2 has an issue with memory usage, 2.8.4 has an issue with performance, I guess the conclusion is clear: switch to 3.3. It’s just better. On average, PhysX 3.3 / PCM is 7X faster than PhysX 2.8.4 and 4.7X faster than Bullet here. That’s quite a lot.

—-

For the next scene (“DynamicConvexesOnArchipelago”) we drop a bunch of simple convexes on a mesh level. Contrary to what we got in the first scene, the mesh level is made of multiple meshes (not just one), of varying shapes, triangle densities, etc. There are 108 meshes and 46476 triangles in total in that scene.

Overall, each PhysX version is faster than the one before: good. PhysX 3.3 is about 2X faster than 2.8.4, but not that much faster than 3.2.

Bullet starts nicely (much better than 2.8.4) but for some reason it collapses when things start to collide (much slower than 2.8.4).

PCM in PhysX doesn’t seem to help here, but it increases memory usage – which is otherwise quite similar for all engines.

—-

The next scene (“DynamicSpheresOnArchipelago”) is a variation on that same theme: we use a simpler shape, but we drop 4X more of them. The mesh level remains the same.

Each PhysX version is faster than the one before: good.

But man, what is going on with 2.8.4?! For some reason it is very slow in the beginning, and only gains speed when the spheres are touching the mesh. I suspect the 2.8.4 broad phase (SAP) did not like this grid of 4K falling objects at all. The PhysX 3.x broadphases in these tests also use a SAP implementation, but it has been completely rewritten since 2.8.4. Not sure if this is why these engines don’t suffer from the same problem.

If we ignore this initial 2.8.4 anomaly, the curves are quite similar. Bullet works better than in the previous scene, but remains slower than PhysX.

As in the previous scene, PCM in PhysX doesn’t help much, but increases memory usage – which is otherwise quite similar for all engines, again.

—-

In the next scene (“DynamicConvexesOnArchipelagoTess”) we mix things up a bit and crank up the numbers: this time we use a huge number of convexes on the same mesh level, but tessellated. That is, the mesh level still has 108 meshes, but it now contains 743616 triangles (the flat shading doesn’t reflect this, don’t let it fool you).

The resulting curves are kind of similar to what we saw before.

Each PhysX version is faster than the one before: good. There’s about a 4X speedup between 2.8.4 and 3.3 - if you are still using 2.x, you should really consider upgrading!

It is interesting that the flat part of the 2.8.4 curve is a bit above 200000 time units in both this test and the previous one. This happens when the same number of objects is falling towards the mesh, and there are no collisions at all at this point. It really does sound like the broadphase is guilty. After impact, objects start moving in random directions, which breaks the objects’ alignment, and the SAP starts to breathe again. The curve quickly goes back up because of the expensive convex-vs-tessellated-mesh contact generation anyway.

Bullet starts strong, but collapses again after impact. Good broadphase, not so good contact generation.

On the other hand PhysX 3.2 and especially 3.3 perform quite well both in the broadphase and contact gen departments.

The evolution of PhysX (5/12) - Rigid bodies (compounds)

Saturday, May 11th, 2013

We have been using simple shapes so far but many game objects are actually compounds, made of multiple sub-shapes. Let’s investigate stacks & piles for these.

—-

The first scene (“StackOfSmallCompounds”) is as follows:

For the first time, we see a small regression between 2.8.4 and 3.2, but it only happens for the first part of the curve. After that, 3.2 takes over and becomes faster than 2.8.4, as it should be. Overall each PhysX version is faster than the one before, but not by much.

Bullet is again up to 2X slower than 3.3. Not much to say about that one.

—-

The second scene (“PileOfSmallCompounds”) is much more surprising.

3.2 is now clearly slower than 2.8.4. That’s the first clear regression seen in these tests so far. It’s not a lot slower, but the graph is very clear, and it is wrong. Luckily things seem to have been fixed between 3.2 and 3.3, which became faster again. If we except this 3.2 anomaly, all PhysX versions are pretty similar here.

Now Bullet on the other hand is a mystery. It really doesn’t like this scene, and overall it is about 3X slower than PhysX. No idea why. This one is a bit surprising.

The evolution of PhysX (4/12) - Rigid bodies (piles)

Saturday, May 11th, 2013

Stacks are fun, but games mainly need piles. For example piles of falling debris when you shoot things up. What you usually look for is an engine that can simulate piles without jittering, or wobbling objects. Fortunately all the engines involved here do a good job in that respect, and we can just look at the performance side of things.

—-

The first scene (“PotPourri_Box_WithConvexes”) is a simple box container filled with boxes, spheres, capsules and convexes.

Each PhysX version is faster than the one before. Checked. PhysX 3.3 is about 2X faster than PhysX 2.8.4.

PCM does not seem to help much in this case. And in fact Bullet is significantly slower than all PhysX versions here.

Also, for some reason Bullet is now the one consuming about 2X more memory than the others – while it was quite good memory-wise before. Curious.

—-

The second scene (“ConvexGalore”) is a pile of falling convexes, of various shapes and complexities.

The profile curves were kind of flat so far but this is now changing. The initial part of the curves, while the objects are falling down without touching each other, looks remarkably similar in all engines. But then things diverge and 2 trends emerge: for some engines (Bullet, 2.8.4) the curve keeps growing (i.e. performance becomes worse and worse), while for other engines (PhysX 3.x) the curve goes down and performance increases.

Overall each PhysX version is faster than the previous one. Checked.

PCM in 3.3 is ultimately slower than regular contact generation in this case.

Again, Bullet seems to suffer both in terms of performance and memory usage.

—-

The third scene (“HugePileOfLargeConvexes”) is a stress test containing more than 5000 large convexes falling on each other.

As usual, each PhysX version is faster than the previous one. Checked.

The graph shows 3 clear things:

  • 2.8.4 is significantly slower than the others
  • 3.3 / PCM is significantly faster than the others
  • The remaining engines are otherwise very close to each other, performance-wise.

Since we are using large convexes, it is expected that PCM works better than SAT here. With so many objects falling vertically, other subsystems like the broadphase might have a stronger impact on performance than in the other scenes.

In any case, no big surprise in this one.

The evolution of PhysX (3/12) - Rigid bodies (convex stacks)

Saturday, May 11th, 2013

We continue with stacks, this time stacks of convex objects. This is where things start to become interesting. There are multiple things to be aware of here:

  • Regular contact generation between boxes is a lot easier (and a lot cheaper) than contact generation between convexes. So PCM-based engines should take the lead here in theory.
  • But performance depends a lot on the complexity of the convex objects as well.
  • It also depends whether the engine uses a “contact cache” or not.

Overall it’s rather hard to predict. So let’s see what we get.

We have two scenes here, for “small” and “large” convexes. Small convexes have few vertices, large convexes have a lot of vertices. The number of vertices has a high impact on performance, especially for engines using SAT-based contact generation.
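To see why the vertex count weighs so much on SAT, here is a minimal 2D sketch of a separating-axis overlap test between two convex polygons. It is only an illustration, not the contact generation of any engine tested here: real 3D SAT also has to test face normals and edge-edge cross products, and has to produce contacts rather than a yes/no answer. The function names are made up.

```python
def edge_normals(poly):
    """One candidate separating axis per edge (perpendicular to the edge)."""
    normals = []
    for i in range(len(poly)):
        (x0, y0), (x1, y1) = poly[i], poly[(i + 1) % len(poly)]
        normals.append((y0 - y1, x1 - x0))
    return normals

def project(poly, axis):
    """Interval covered by the polygon along the axis: touches every vertex."""
    dots = [x * axis[0] + y * axis[1] for x, y in poly]
    return min(dots), max(dots)

def sat_overlap(poly_a, poly_b):
    """Return (overlapping, number of axes tested)."""
    axes_tested = 0
    for axis in edge_normals(poly_a) + edge_normals(poly_b):
        axes_tested += 1
        min_a, max_a = project(poly_a, axis)
        min_b, max_b = project(poly_b, axis)
        if max_a < min_b or max_b < min_a:
            return False, axes_tested   # found a separating axis: early out
    return True, axes_tested

square_a = [(0, 0), (2, 0), (2, 2), (0, 2)]
square_b = [(1, 1), (3, 1), (3, 3), (1, 3)]   # overlaps square_a
square_c = [(5, 0), (7, 0), (7, 2), (5, 2)]   # far to the right of square_a

touching = sat_overlap(square_a, square_b)
apart = sat_overlap(square_a, square_c)
```

Two overlapping polygons with n and m edges have to test all n+m axes, and each test projects all n+m vertices, so the work grows quickly with the vertex count. Objects resting in a stack are always in the overlapping (or nearly overlapping) case, so they rarely benefit from the early out.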

—-

The first scene (“ConvexStack”) uses the small convexes. Each PhysX version is again a bit faster than the previous one, so that’s good, that’s what we want.

There doesn’t seem to be any speed difference between PCM and regular contact generation. In fact PCM even has a slightly worse worst case – maybe the overhead of managing the persistent manifolds. I suppose it shows that SAT-based contact generation is a viable algorithm for small convexes, which is something we knew intuitively.

Now, 3.3 is about 2X faster than 3.2 on average, even when they both use SAT. The perf difference here probably comes from two things: 3.3 uses extra optimizations like “internal objects” (this probably explains the better worst case) and it also uses a “contact cache” to avoid regenerating all contacts all the time (this probably explains the better average case).

As for 2.8.4 and Bullet, they’re of similar speed overall, and both are significantly slower than 3.3. We see that 2.8.4 has the worst worst case of all, which is probably due to the old SAT-based contact generation used there – it lacked a lot of the optimizations we introduced later.

As with box stacks, the initial frame is a lot more expensive than subsequent frames. Contrary to box stacks though, I think a lot of it is due to the initial contact generation. Engines sometimes use temporal coherence to cache various data from one frame to the next, and that’s why the first contact generation pass is more expensive. It should be pretty clear on a scene like this, where all objects are generating their initial contacts in the same frame, and then they don’t move much anymore. This is not a very natural scenario (most games use “piles” of debris rather than “stacks” of debris), but those artificial scenes are handy to check your worst case.
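The first-frame effect can be mimicked with a toy cache. The sketch below is a generic illustration of temporal coherence, not how PhysX or Bullet actually implement their caches: the `ContactCache` class, its tolerance, and the fake `generate` function are all made up, and it only looks at relative position while a real cache would also have to account for rotation.

```python
class ContactCache:
    """Reuse a pair's contacts while the pair has barely moved (hypothetical)."""

    def __init__(self, generate, tolerance=1e-3):
        self.generate = generate      # the expensive contact generation
        self.tolerance = tolerance
        self.entries = {}             # pair id -> (relative position, contacts)
        self.misses = 0

    def contacts(self, pair_id, rel_pos):
        cached = self.entries.get(pair_id)
        if cached is not None:
            old_pos, old_contacts = cached
            moved = max(abs(a - b) for a, b in zip(rel_pos, old_pos))
            if moved <= self.tolerance:
                return old_contacts   # cache hit: nothing regenerated
        self.misses += 1
        result = self.generate(pair_id, rel_pos)
        self.entries[pair_id] = (rel_pos, result)
        return result

def fake_generate(pair_id, rel_pos):
    # Stand-in for real contact generation.
    return [("contact", pair_id, rel_pos)]

cache = ContactCache(fake_generate)
resting = {("a", "b"): (0.0, 1.0, 0.0),
           ("b", "c"): (0.0, 1.0, 0.0),
           ("c", "d"): (0.0, 1.0, 0.0)}

misses_per_frame = []
for frame in range(3):
    before = cache.misses
    for pair_id, rel_pos in resting.items():
        cache.contacts(pair_id, rel_pos)
    misses_per_frame.append(cache.misses - before)

# One pair gets disturbed: only that pair is regenerated.
cache.contacts(("a", "b"), (0.5, 1.0, 0.0))
```

In this toy stack every pair misses the cache on the first frame and none do afterwards, which is the same shape as the expensive first frame described above.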

—-

The second scene (“ConvexStack3”) uses large convexes. Now look at that! Very interesting stuff here.

Each PhysX version is again faster than the previous one, that’s good. But the differences are massive now, with PhysX 3.3 an order of magnitude faster than PhysX 2.8.4. And also almost 3X faster than 3.2. Well it’s nice to see that our efforts paid off.

In terms of PCM vs SAT, it seems pretty clear that PCM gives better performance for large convexes. We see this with PhysX 3.3, but also very clearly with Bullet. It is the first time so far that Bullet manages to be faster than PhysX. It is not really a big surprise: SAT is not a great algorithm for large convexes, we knew that.

On the other hand what we didn’t know is how much slower 2.8.4 could be, compared to the others. I think it is time to upgrade to PhysX 3.3 if you are using large convexes.

Another thing to mention is the memory usage for 3.2. We saw this trend before, and it seems to be worse in this scene: 3.2 is using more memory than it should. The issue has been fixed in 3.3 though.
