Box pruning revisited - part 10 - integer SIMD

Part 10 – integer SIMD

Interlude:

Similar to what we did in version 7, and in sake of completeness, I tried using integer comparisons in the SIMD version as well.

The changes are straightforward: encode floats as integers like we did in version 7, replace SIMD floating-point intrinsics with SIMD integer intrinsics.

There are no special traps here, and not much to report, because at the end of the day the results are slower:

Home PC:

Complete test (brute force): found 11811 intersections in 781780 K-cycles.
17454 K-cycles.
16681 K-cycles.
16669 K-cycles.
17145 K-cycles.
16703 K-cycles.
16683 K-cycles.
16668 K-cycles.
16667 K-cycles.
16862 K-cycles.
16707 K-cycles.
16670 K-cycles.
16689 K-cycles.
16668 K-cycles.
16949 K-cycles.
16710 K-cycles.
16667 K-cycles.
Complete test (box pruning): found 11715 intersections in 16667 K-cycles.

Office PC:

Complete test (brute force): found 11811 intersections in 810906 K-cycles.
16607 K-cycles.
15927 K-cycles.
15648 K-cycles.
15971 K-cycles.
15648 K-cycles.
15960 K-cycles.
15742 K-cycles.
15990 K-cycles.
15837 K-cycles.
15741 K-cycles.
15970 K-cycles.
15651 K-cycles.
16247 K-cycles.
15649 K-cycles.
15834 K-cycles.
15738 K-cycles.
Complete test (box pruning): found 11715 intersections in 15648 K-cycles.

The gains are summarized here:

Home PC

Timings (K-Cycles)

Delta (K-Cycles)

Speedup

Overall X factor

(Version1)

(101662)

Version2 - base

98822

0

0%

1.0

Version3

93138

~5600

~5%

~1.06

Version4

81834

~11000

~12%

~1.20

Version5

78140

~3600

~4%

~1.26

Version6a

60579

~17000

~22%

~1.63

Version6b

41605

~18000

~31%

~2.37

(Version7)

(40906)

-

-

-

(Version8)

(31383)

(~10000)

(~24%)

(~3.14)

Version9a

34486

~7100

~17%

~2.86

Version9b - unsafe

32477

~2000

~5%

~3.04

Version9b - safe

32565

~1900

~5%

~3.03

Version9c - unsafe

16223

~16000

~50%

~6.09

Version9c - safe

14802

~17000

~54%

~6.67

Version10

(16667)

-

-

-

Office PC

Timings (K-Cycles)

Delta (K-Cycles)

Speedup

Overall X factor

(Version1)

(96203)

Version2 - base

92885

0

0%

1.0

Version3

88352

~4500

~5%

~1.05

Version4

77156

~11000

~12%

~1.20

Version5

73778

~3300

~4%

~1.25

Version6a

58451

~15000

~20%

~1.58

Version6b

45634

~12000

~21%

~2.03

(Version7)

(43987)

-

-

-

(Version8)

(29083)

(~16000)

(~36%)

(~3.19)

Version9a

31864

~13000

~30%

~2.91

Version9b - unsafe

15097

~16000

~52%

~6.15

Version9b - safe

15116

~16000

~52%

~6.14

Version9c - unsafe

12707

~2300

~15%

~7.30

Version9c - safe

12562

~2500

~16%

~7.39

Version10

(15648)

-

-

-

What we learnt:

Don’t bother with integer comparisons anymore.

We are still using integer SIMD in PhysX for this, so it appears that the PhysX version is sub-optimal. Expect some performance gains in PhysX 3.5.

GitHub code for part 10

Comments are closed.

shopfr.org cialis