CPU-free Programming: When the GPU Takes the Lead

New tools can remove inefficiencies in GPU computing, but can we turn that into real performance gains?

Traditional GPU programming follows a pattern in which the CPU acts as the host: it launches short-lived kernels on the GPU and controls the overall flow of computation and communication. Recent additions to the CUDA toolbox change this. Persistent kernels can stay resident on the GPU and control program execution themselves, and libraries such as NVSHMEM provide GPU-initiated communication. Using these tools, we can eliminate much of the latency and inefficiency caused by frequent CPU-GPU interaction.
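To make the contrast concrete, here is a minimal single-GPU sketch, not taken from any existing application. It runs the same toy smoothing stencil two ways: with one short-lived kernel launch per time step, driven by the CPU, and with one persistent kernel that keeps the time loop on the GPU and uses a cooperative-groups grid barrier in place of the kernel boundary. The names `smooth_point`, `step_kernel`, `persistent_kernel`, `run_host_driven` and `run_persistent` are hypothetical. The sketch assumes a non-empty input and a device that supports cooperative launches, omits error checking for brevity, and does not cover multi-GPU communication with NVSHMEM.

```cuda
#include <cooperative_groups.h>
#include <cuda_runtime.h>
#include <utility>
#include <vector>

namespace cg = cooperative_groups;

// One smoothing step for point i: interior points become the mean of their
// two neighbours, boundary points are copied unchanged.
__device__ void smooth_point(const float* in, float* out, int n, int i) {
    out[i] = (i == 0 || i == n - 1) ? in[i] : 0.5f * (in[i - 1] + in[i + 1]);
}

// Traditional style: a short-lived kernel that performs a single step.
__global__ void step_kernel(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) smooth_point(in, out, n, i);
}

// Persistent style: one kernel runs every step; the time loop lives on the GPU.
__global__ void persistent_kernel(float* a, float* b, int n, int steps) {
    cg::grid_group grid = cg::this_grid();
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    int stride = gridDim.x * blockDim.x;
    for (int s = 0; s < steps; ++s) {
        // Grid-stride loop: a cooperative grid must be fully resident, so it
        // may have fewer threads than there are points.
        for (int i = tid; i < n; i += stride) smooth_point(a, b, n, i);
        grid.sync();                  // device-wide barrier between steps
        float* t = a; a = b; b = t;   // each thread swaps its own pointer copies
    }
}

// CPU-driven version: the host launches one kernel per step.
std::vector<float> run_host_driven(std::vector<float> h, int steps) {
    int n = static_cast<int>(h.size());
    size_t bytes = n * sizeof(float);
    float *a, *b;
    cudaMalloc(&a, bytes);
    cudaMalloc(&b, bytes);
    cudaMemcpy(a, h.data(), bytes, cudaMemcpyHostToDevice);
    int block = 256, grid = (n + block - 1) / block;
    for (int s = 0; s < steps; ++s) {
        step_kernel<<<grid, block>>>(a, b, n);
        std::swap(a, b);              // after the swap, a holds the newest data
    }
    cudaMemcpy(h.data(), a, bytes, cudaMemcpyDeviceToHost);
    cudaFree(a);
    cudaFree(b);
    return h;
}

// GPU-driven version: a single cooperative launch covers all steps.
std::vector<float> run_persistent(std::vector<float> h, int steps) {
    int n = static_cast<int>(h.size());
    size_t bytes = n * sizeof(float);
    float *a, *b;
    cudaMalloc(&a, bytes);
    cudaMalloc(&b, bytes);
    cudaMemcpy(a, h.data(), bytes, cudaMemcpyHostToDevice);
    // Size the grid so that all blocks can be resident at the same time.
    int block = 256, per_sm = 0;
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    cudaOccupancyMaxActiveBlocksPerMultiprocessor(&per_sm, persistent_kernel, block, 0);
    int grid = per_sm * prop.multiProcessorCount;
    void* args[] = {&a, &b, &n, &steps};
    cudaLaunchCooperativeKernel((void*)persistent_kernel, dim3(grid), dim3(block), args, 0, 0);
    cudaDeviceSynchronize();
    // The kernel swapped only its own pointer copies, so pick the buffer by parity.
    float* result = (steps % 2 == 0) ? a : b;
    cudaMemcpy(h.data(), result, bytes, cudaMemcpyDeviceToHost);
    cudaFree(a);
    cudaFree(b);
    return h;
}
```

Calling both functions on the same input and comparing the returned vectors is a quick sanity check that the two styles compute the same thing. Whether the persistent version is actually faster depends on the hardware and the application, which is the kind of question the benchmarking in this project is meant to answer.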


The goal of this thesis is to implement a CPU-free version of an existing major GPU application and show the benefits of the new technique through rigorous benchmarking.


  • Experience with C/C++
  • Familiarity with GPU programming is very helpful
  • Familiarity with parallel programming in MPI is helpful

Associated contacts

Johannes Langguth

Senior Research Scientist

Xing Cai

Professor, Head of department, Chief Research Scientist

James D Trotter

Postdoctoral Fellow