Daniele Panozzo: Geometry processing in the era of parallel computing
The improvements in modern processors come mainly from their parallel
capabilities, in the form of vectorized instructions and additional
processing cores. Unfortunately, their single-threaded performance is
improving very slowly, and this trend is unlikely to change in the
future.
The majority of geometry processing algorithms are, by design, unable to
use this new hardware efficiently: they were designed to minimize the
number of operations, since floating-point computation has historically
been the major computational bottleneck. This is no longer true: the
bottleneck is now irregular memory access, while computation is cheap
since it can be executed in parallel.
This is a major paradigm change that will require our community to revise
many of our algorithmic foundations, from the data structures we use to
encode a mesh to the avoidance of sparse linear algebra operations, which
cannot fully benefit from a parallel architecture.
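As a minimal illustration of this point (not taken from the talk), the NumPy sketch below contrasts contiguous, vectorizable memory access with the indirect "gather" pattern typical of sparse matrix and mesh-adjacency traversals: both loops perform identical arithmetic, so any performance gap comes purely from memory traffic. The variable names and the permutation-based access pattern are illustrative assumptions, not the talk's own benchmark.

```python
import numpy as np

# One million values, stored contiguously in memory.
n = 1_000_000
a = np.arange(n, dtype=np.float64)

# Contiguous access: the CPU can stream this through the cache
# and apply SIMD (vectorized) instructions.
dense_sum = a.sum()

# Indirect (gathered) access, as in a sparse matrix-vector product
# or a mesh adjacency walk: the same additions, but the addresses
# follow a permutation, which defeats prefetching and vectorization.
perm = np.random.default_rng(0).permutation(n)
gathered_sum = a[perm].sum()

# Both computations produce the same result (up to rounding);
# only the memory access pattern differs.
```

Timing the two sums (e.g. with `timeit`) typically shows the gathered version running several times slower, even though the arithmetic is identical, which is the sense in which computation is "cheap" and memory access is the bottleneck.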
In this talk, I will present a brief overview of modern processing
primitives from a geometry processing point of view, describing the
constraints that should be respected to fully utilize vectorized
instructions, multi-core parallelization and their relation to cache
coherence. As a practical example, I will demonstrate how to transform
well-known remeshing approaches based on global parametrization, which
rely on sparse linear algebra, into a simple and fully parallelizable
algorithm that can convert a mesh with 372 million triangles into a
coarse quad mesh in less than 10 minutes on a workstation with 16 cores.
Surprisingly, designing the algorithm under these new hardware constraints
led us to a simple method that can be implemented in less than
100 lines of self-contained Python code, compared to the hundreds of
thousands of lines required for competing methods that rely on complex
optimization.