This tutorial focuses on how techniques from computer science allow high-performance
programming to be elevated from an art that is practiced by the
high priests of high performance to a science that exposes a systematic methodology
that is accessible to the masses and naturally supports the multicore
revolution of architecture design that has arrived.

The arrival of massively parallel architectures in the early 1990s was an opportunity
to start retiring legacy codes and embracing abstraction to clean up
our habits. Unfortunately, by and large, the reaction from the scientific computing
community was to roll up the sleeves and insist on evolution rather than
revolution. It is easy to find examples of programs, evolved from legacy code,
in support of scientific computing that are broadly viewed by computational
scientists as glorious examples of beauty, while many of us in computer science
hold these same examples up as representative of what used to be the state of
the art but is best retired now. A great opportunity to lead in the area of
parallel programming was lost, even as great success in the area of the practical
application of parallel computing was attained.

With the advent of multicore and the realization that parallelism has to be
tackled for and by the masses, it is no longer acceptable to evolve legacy codes.
Capturing parallelism at a high level of abstraction is critical to the success
of multicore architectures as multicore evolves into many-multicore. And thus
we need to understand examples for which abstraction has been successfully
employed to manage the complexity of parallel programming. In this tutorial we
familiarize the audience with such an example and use it to illustrate techniques
that are applicable beyond dense linear algebra.