This half-day course teaches how to use the GPU-accelerated part of the Karolina supercomputer efficiently for deep learning and machine learning.
Technical details of the GPU partition of the Karolina supercomputer
The accelerated part consists of 72 servers, each equipped with 8 GPU accelerators. Together they provide a performance of 11 PFlop/s for standard HPC simulations and up to 180 PFlop/s for artificial intelligence computations.
72 compute nodes, each with 2x AMD Zen 3 EPYC™ 7763 processors (64 cores, 2.45 GHz) and 8x NVIDIA A100 GPU accelerators (40 GB HBM2 each).
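As a quick sanity check of the figures above, the aggregate GPU count, total GPU memory, and implied per-GPU throughput can be worked out directly. This is only a back-of-the-envelope sketch: the variable names are illustrative, and the per-GPU values are simple averages of the quoted totals, not measured numbers.

```python
# Figures quoted in the partition description above.
NODES = 72
GPUS_PER_NODE = 8
HBM2_PER_GPU_GB = 40
HPC_PFLOPS = 11    # standard HPC simulations, whole partition
AI_PFLOPS = 180    # artificial intelligence computations, whole partition

total_gpus = NODES * GPUS_PER_NODE
total_hbm2_gb = total_gpus * HBM2_PER_GPU_GB

# Average share of the quoted totals per GPU (1 PFlop/s = 1000 TFlop/s).
hpc_tflops_per_gpu = HPC_PFLOPS * 1000 / total_gpus
ai_tflops_per_gpu = AI_PFLOPS * 1000 / total_gpus

print(f"GPUs in the partition: {total_gpus}")
print(f"Total GPU memory:      {total_hbm2_gb} GB")
print(f"HPC per GPU:           {hpc_tflops_per_gpu:.1f} TFlop/s")
print(f"AI per GPU:            {ai_tflops_per_gpu:.1f} TFlop/s")
```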
Tutors
Mgr. Branislav Jansík, Ph.D.
Georg Zitzlsberger
Ing. Stanislav Böhm, Ph.D.
Prerequisites
Experience with GPU-accelerated systems.
Language
English
Schedule
Access to Karolina's GPU-accelerated part
Branislav Jansík (60 minutes)
- Short introduction of the Karolina supercomputer
- How to access the Karolina GPU nodes
- First login
- Computing environment and available software libraries and tools
- HPC resource allocation with PBS
- Scratch and project storage
- Special tools (node availability overview, ...)
Efficient multi-GPU and multi-node execution of Deep and Machine Learning frameworks
Georg Zitzlsberger (60 minutes)
- Introduction to Data Parallel Deep Learning with Horovod
- Multi-node/-GPU aware Data Processing Pipelines
- Demonstration of Multi-node/-GPU Examples using TensorFlow
- Multi-node/-GPU Machine Learning with scikit-learn
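For readers new to the topic, the idea behind data-parallel training can be sketched in a few lines of plain Python. This is not Horovod's API and not course material: the workers are simulated inside a single process, and the function names (`local_gradient`, `allreduce_mean`, `train`) are hypothetical. Each worker computes a gradient on its own shard of the data, the gradients are averaged (the role an allreduce plays across GPUs or nodes), and every worker applies the same update.

```python
def local_gradient(w, shard):
    """Gradient of the mean squared error of the model y = w * x on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def allreduce_mean(values):
    """Stand-in for an allreduce: every worker ends up with the average."""
    return sum(values) / len(values)


def train(data, num_workers=4, lr=0.01, steps=200):
    # Split the data so that each simulated worker owns one equal-sized shard.
    shards = [data[rank::num_workers] for rank in range(num_workers)]
    w = 0.0  # all workers start from the same weight
    for _ in range(steps):
        # Each worker computes a gradient on its own shard only.
        grads = [local_gradient(w, shard) for shard in shards]
        # The averaged gradient gives every worker the identical update.
        w -= lr * allreduce_mean(grads)
    return w


# Toy data following y = 3 * x, so training should recover w close to 3.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
w = train(data)
print(f"learned weight: {w:.4f}")
```

Because the shards here have equal size, the averaged gradient equals the full-batch gradient, so the result matches single-worker training on the same data.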
Introduction to HyperQueue
Stanislav Böhm (45 minutes)
- Efficient execution of a large number of small tasks transparently over HPC schedulers (SLURM/PBS) using HyperQueue
- Guided examples
Acknowledgements
This event is partially supported by the EuroCC project. This project has received funding from the European High-Performance Computing Joint Undertaking (JU) under grant agreement No 951732. The JU receives support from the European Union’s Horizon 2020 research and innovation programme and Germany, Bulgaria, Austria, Croatia, Cyprus, the Czech Republic, Denmark, Estonia, Finland, Greece, Hungary, Ireland, Italy, Lithuania, Latvia, Poland, Portugal, Romania, Slovenia, Spain, Sweden, the United Kingdom, France, the Netherlands, Belgium, Luxembourg, Slovakia, Norway, Switzerland, Turkey, Republic of North Macedonia, Iceland, Montenegro.
This event is partially supported by the LIGATE project. This project has received funding from the European High-Performance Computing Joint Undertaking (JU) under grant agreement No 956137. The JU receives support from the European Union’s Horizon 2020 research and innovation programme and Italy, Sweden, Austria, the Czech Republic, Switzerland.
This course is supported by the Ministry of Education, Youth and Sports of the Czech Republic through the e-INFRA CZ (ID:90140).