cfaed Seminar Series

Dr. Michel Steuwer, University of Edinburgh, UK

Structured Parallel Programming - From High-Level Functional Expressions to High-Performance OpenCL Code

25.08.2016 (Thursday), 10:00 - 11:30
Andreas-Pfitzmann-Bau 1004 (Großes Ratszimmer), Nöthnitzer Str. 46, 01187 Dresden

Computers have become increasingly complex with the emergence of heterogeneous
hardware combining multicore CPUs and GPUs. These parallel systems offer tremendous
computational power at the cost of increased programming effort, which creates a tension
between performance and code portability. Typically, code is either tuned in a low-level
imperative language using hardware-specific optimizations to achieve maximum performance,
or written in a high-level, possibly functional, language to achieve portability at the expense of
performance.

In this talk, I will present two connected approaches that aim to combine high-level
programming, code portability, and high performance. Both approaches are built around the
idea of representing parallel programs using regular, structured parallel patterns, also known as
algorithmic skeletons.
1) The SkelCL high-level programming model, implemented as a C++ library, simplifies
GPU programming in three ways: parallel container data types simplify data management,
high-level algorithmic skeletons express computations, and declarative data distributions
simplify the programming of multi-GPU systems.
2) The Lift code generation approach starts from a high-level functional expression and applies
a simple set of provably correct rewrite rules to transform it into a low-level functional
representation, close to the OpenCL programming model, from which eventually OpenCL code
is generated. Our rewrite rules define a space of possible implementations which we
automatically explore to generate hardware-specific OpenCL implementations.
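
To make the idea of semantics-preserving rewrite rules concrete, the following is a minimal sketch in Python. It is not Lift's implementation or its rule set; the names `Input`, `Map`, `fuse_maps`, `evaluate` and `depth` are hypothetical and chosen only for illustration. It shows one classic rule of this kind, map fusion, which rewrites two consecutive maps into a single map over the composed function and so changes the shape of the program without changing its result.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass(frozen=True)
class Input:
    """Leaf node (hypothetical): stands for the input array."""


@dataclass(frozen=True)
class Map:
    """Pattern node (hypothetical): apply `f` to every element of `source`."""
    f: Callable[[int], int]
    source: "Expr"


Expr = Union[Input, Map]


def fuse_maps(expr: Expr) -> Expr:
    """Rewrite map(f, map(g, e)) into map(f . g, e), working bottom-up."""
    if isinstance(expr, Map):
        inner = fuse_maps(expr.source)
        if isinstance(inner, Map):
            f, g = expr.f, inner.f
            # One traversal applying g then f replaces two traversals.
            return Map(lambda x: f(g(x)), inner.source)
        return Map(expr.f, inner)
    return expr


def evaluate(expr: Expr, data: List[int]) -> List[int]:
    """Reference interpreter used to check that a rewrite keeps the result."""
    if isinstance(expr, Input):
        return list(data)
    return [expr.f(x) for x in evaluate(expr.source, data)]


def depth(expr: Expr) -> int:
    """Number of nested Map nodes, i.e. passes over the data."""
    return 0 if isinstance(expr, Input) else 1 + depth(expr.source)


# Two passes over the data: double every element, then add one.
program = Map(lambda x: x + 1, Map(lambda x: x * 2, Input()))
# After rewriting: a single pass computing x * 2 + 1.
fused = fuse_maps(program)

print(depth(program), depth(fused))
print(evaluate(program, [1, 2, 3]) == evaluate(fused, [1, 2, 3]))
```

Each applicable rule yields an alternative but equivalent program, so applying different rules in different orders spans a space of candidate implementations that can then be searched for the fastest variant on a given device.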


Parallel programs written with algorithmic skeletons are significantly shorter and easier to
understand. Our experiments show that we can automatically derive hardware-specific
implementations from these simple, high-level functional expressions, and that they offer
performance on a par with highly tuned code for multicore CPUs and GPUs written by experts.

 

Bio:
Michel Steuwer is a Postdoctoral Research Associate in the compiler group at the University of Edinburgh. He received his Ph.D. in 2015 from the University of Muenster in Germany. His research interests span all areas of parallel programming from languages and programming models to their implementation in compilers and libraries as well as their execution at runtime. His research has particularly focused on structured parallel programming models, heterogeneous and GPU computing, and novel compilation techniques.
