How to use GPUs

From Computer Laboratory System Administration

If you think your work would benefit from use of GPUs, there are several options available to you:

First, consider whether you need to use GPUs, or whether CPUs might be a better option. Please read the following guidelines for choosing between CPU and GPU (h/t Aaron Zhao).
GPUs contain a large number of processing units and are efficient at running massively parallel programs. If a task is too simple or too sequential, it is usually hard to achieve time savings on a GPU. By contrast, tasks such as image processing are often embarrassingly parallel and are suitable workloads for GPUs.
Skewed or asynchronous computations are bad for GPUs. Since GPU threads execute in synchronised groups, the computation time is bounded by the longest-running thread. In addition, if some threads depend on the results of others, the parallel cores are forced to wait for each other and compute serially until the dependencies are resolved.
If your code has many cache misses (its data does not fit into the GPU's shared memory), it is likely to run less efficiently on GPUs. GPUs have slow global memory access compared to CPUs. However, if your data fits entirely in shared memory, or the shared memory miss rate is low, a GPU is likely to run your code more efficiently. This effect depends heavily on how the memory access pattern is interleaved with your compute.
There is also PCIe latency to consider. GPUs are nowadays normally connected to CPUs through PCIe links. If your code needs to sync data between CPU and GPU frequently, running it purely on CPUs might be faster, since that avoids the PCIe transfer overhead.
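The guidelines above can be turned into a rough back-of-envelope check. The sketch below is only illustrative: it assumes a simple Amdahl's-law model in which only the parallel part of a job is accelerated and CPU-GPU transfer time is added on top. The function names and all the numbers are hypothetical, not measurements from any real system.

```python
def estimated_gpu_time(cpu_time_s, parallel_fraction, gpu_speedup, transfer_time_s):
    """Rough GPU wall-clock estimate for a job that takes cpu_time_s on a CPU.

    Assumed model (Amdahl's law plus transfer cost): the serial part runs at
    CPU speed, the parallel part is gpu_speedup times faster, and the time
    spent moving data over PCIe is added on top.
    """
    serial_s = cpu_time_s * (1.0 - parallel_fraction)
    parallel_s = cpu_time_s * parallel_fraction / gpu_speedup
    return serial_s + parallel_s + transfer_time_s


def gpu_is_worthwhile(cpu_time_s, parallel_fraction, gpu_speedup, transfer_time_s):
    """True if the estimated GPU time beats the CPU-only time."""
    gpu_time_s = estimated_gpu_time(
        cpu_time_s, parallel_fraction, gpu_speedup, transfer_time_s
    )
    return gpu_time_s < cpu_time_s


# Hypothetical job 1: almost entirely parallel, little data movement.
image_job_s = estimated_gpu_time(100.0, 0.99, 50.0, 2.0)

# Hypothetical job 2: half sequential, frequent CPU-GPU synchronisation.
chatty_job_s = estimated_gpu_time(100.0, 0.50, 50.0, 60.0)

print(f"image-like job:  {image_job_s:.2f} s on GPU vs 100 s on CPU")
print(f"sync-heavy job:  {chatty_job_s:.2f} s on GPU vs 100 s on CPU")
```

Under these made-up numbers the first job benefits greatly from a GPU, while the second ends up slower than staying on the CPU. Real behaviour also depends on memory access patterns, which this model ignores.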
Sign up for an HPC (High Performance Computing) account here.
All members of the University may apply for a free service level 3 (SL3) account.
Additionally the Department has made some funds available for use of the higher-priority service level 2 (SL2). More information ...
Or, check with your project supervisor whether there are any grant-specific funds available for paid use of the HPC.
Allow a week for your application to be approved (it is often faster). Once signed up, check out the documentation and note that use of the HPC is charged per hour per core, so use it with care! Try to use SL3 when starting out or for low-priority tasks. Also be sure that you need to use GPU: it may be that the CPU services offered by the HPC will be sufficient for your needs. From November 2021 onwards, use of standard CPU (Peta-Icelake) costs 1p per hour and use of GPU on Wilkes3 costs 55p per hour (price list). There are also differences in environmental cost to consider...
Note that queueing times can be lengthy, especially in term-time and for GPU. Therefore allow for slack time in your project activity, and try to run your HPC jobs well ahead of any deadlines you may have.
A few starter tips (for detail see the current documentation): you specify in your log-in host address whether you want to use CPU or GPU. You will likely need to load software-specific modules into your environment to get started. There are example batch submission scripts for you to modify in /usr/local/Cluster-Docs/SLURM, which differ depending on whether you are using CPU (Ice Lake) or GPU (Ampere). If you are on a paying tier, you can check the balance on the account you are using with the “mybalance” command (this also lists the HPC projects you are associated with).
The HPC is also called CSD3 (Cambridge Service for Data-Driven Discovery), but "HPC" is the more common name. It is part of the wider Research Computing Services at Cambridge.
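To get a feel for the charges quoted above, here is a minimal cost estimate. It assumes that the 1p rate applies per CPU core per hour and the 55p rate per GPU per hour. Check the price list before relying on that reading. The function name and the job sizes are hypothetical.

```python
# Rates quoted on this page (from November 2021); the "per core" and
# "per GPU" units are an assumption, so check the current price list.
CPU_PENCE_PER_CORE_HOUR = 1
GPU_PENCE_PER_GPU_HOUR = 55


def job_cost_pounds(hours, units, pence_per_unit_hour):
    """Estimated charge in pounds for a job using `units` cores or GPUs."""
    return hours * units * pence_per_unit_hour / 100


# Hypothetical jobs: 10 hours on 32 CPU cores vs 10 hours on 4 GPUs.
cpu_cost = job_cost_pounds(10, 32, CPU_PENCE_PER_CORE_HOUR)
gpu_cost = job_cost_pounds(10, 4, GPU_PENCE_PER_GPU_HOUR)

print(f"CPU job: £{cpu_cost:.2f}")
print(f"GPU job: £{gpu_cost:.2f}")
```

A comparison like this is only meaningful alongside the expected run time on each kind of hardware: a GPU job that finishes much sooner can still be the cheaper option.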
Ask sys-admin if you can have access to a virtual machine running on the Department's GPUs.
More info at the sys-admin GPU page, including how to apply for a VM. Convenience and local support are major advantages of these GPUs, and our sys-admins are very responsive and helpful in setting you up with one. Note that in general they may be "used as a development platform for code which will then move on to HPC when ready to scale up, or where HPC's software/hardware environment is not suitable" (Malcolm Scott, sys-admin).
Check with your supervisor / others in your group.
Has your group bought its own GPUs? You might be granted access to them if so.
PIs might also be interested in HPC services available outside Cambridge, for instance by application to the EPSRC or the Oxford-based JADE facility.

This wiki page was put together by Andrew Caines (apc38) thanks to help from Daniel Bates, Chris Hadley, Markus Kuhn, Malcolm Scott, Graham Titmus, Aaron Zhao and Noa Zilberman. Any errors are my own: feel free to send me feedback or add your own hints and tips on GPU use. Last update: 22 Dec 2022.

If you have questions, there's also the cl-gpu-users mailing list for GPU users in the department. You can find the archives and manage your subscription here.