This course is organized by the LUMI User Support Team (LUST) and the EuroCC National Competence Centers (NCCs) in Finland and the Czech Republic.
Annotation
Join our two-day workshop, “Getting Started with AI on LUMI,” designed to familiarize you with the capabilities of the LUMI supercomputer for artificial intelligence applications. This workshop is ideal for those looking to transition from smaller-scale computing environments like laptops, workstations, or cloud VMs to the robust, GPU-intensive LUMI platform.
Participants are invited to bring their own AI training scripts to the workshop, where they will receive personalized support to adapt and run them on LUMI’s advanced GPU system. Whether you aim to leverage a single GPU or scale up to multiple GPUs, our workshop will provide valuable insights and practical skills to enhance your AI projects with LUMI’s powerful computing infrastructure.
Learning outcomes
By attending the workshop, you will gain an understanding of the LUMI-G architecture as it relates to AI training, including an introduction to SLURM, ROCm, the Lustre/LUMI-O file systems, and the Slingshot 11 interconnect. Specifically, you will:
- Learn to utilize existing AI containers on LUMI and build your own using the container build tool, cotainr
- Learn how to run on a single GPU and how to monitor how efficiently it is used
- Learn to distribute AI workloads across multiple GPUs within a single LUMI-G node
- Gain insight into advanced topics for optimizing AI training processes on the LUMI supercomputer
Agenda
The workshop consists of a mix of short lectures and hands-on exercises that cover the following key topics:
- LUMI-G architecture overview and its applications in AI
- Introduction to the LUMI web-interface for development and monitoring
- Using the AI framework PyTorch on LUMI
- Building and deploying custom AI containers on LUMI
- Strategies for scaling AI workloads across multiple GPUs
- Support for adapting and running your own AI training script on LUMI (on-site participants only)
Each day will run from 9:00 to 17:00 CET, with breaks scheduled throughout.
Requirements
Participants are expected to have basic experience with:
- Working on a Linux command line
- Using Python and one or more of the Python AI frameworks PyTorch, TensorFlow, or JAX
- Training an AI model on at least a single GPU, e.g. using a laptop, workstation, or cloud service
- Managing Python environments, e.g. using the Conda and/or pip package managers
Participants are expected to bring a laptop and its charger to the workshop.
Language
English
Registration
Acknowledgements
This project has received funding from the European High-Performance Computing Joint Undertaking (JU) under grant agreement No 101101903. The JU receives support from the Digital Europe Programme and Germany, Bulgaria, Austria, Croatia, Cyprus, Czech Republic, Denmark, Estonia, Finland, Greece, Hungary, Ireland, Italy, Lithuania, Latvia, Poland, Portugal, Romania, Slovenia, Spain, Sweden, France, Netherlands, Belgium, Luxembourg, Slovakia, Norway, Türkiye, Republic of North Macedonia, Iceland, Montenegro, Serbia. This project has received funding from the Ministry of Education, Youth, and Sports of the Czech Republic.
This course was supported by the Ministry of Education, Youth and Sports of the Czech Republic through the e-INFRA CZ (ID:90254).