[ONLINE] Python and scikit-learn for HPC (PTC course)

Europe/Prague

Description

Annotation

Data Science, Machine Learning, and Big Data are predominantly driven by Python. This one-day course introduces the most common Python libraries in this field: Pandas for managing large data sets, Dask for running distributed computations on HPC clusters, and scikit-learn for classical Machine Learning.

Target audience: Data Scientists

Level

basic-intermediate

Language

English

About the tutor(s)

Stanislav Böhm completed his PhD in 2014. He is currently a senior researcher at IT4Innovations National Supercomputing Center (Czech Republic), working in the areas of distributed systems, verification, and machine learning.

Georg Zitzlsberger is a research specialist for Machine and Deep Learning at IT4Innovations. He has been certified by NVIDIA as a University Ambassador of the NVIDIA Deep Learning Institute (DLI) programme for over two years. This certification allows him to offer NVIDIA DLI courses to academic users of IT4Innovations' HPC services. In addition, in collaboration with Bayncore, he is a trainer for Intel HPC and AI workshops and conferences held across Europe. He has contributed to these events, which address audiences from industry and academia, for four years.

Acknowledgements

This event is a PRACE Training Centre (PTC) course, co-funded by the Partnership for Advanced Computing in Europe (PRACE). The main web page of the course is located on the PRACE Events Portal.

This event is partially supported by the Ministry of Education, Youth and Sports through the Large Infrastructures for Research, Experimental Development and Innovations project “e-Infrastruktura CZ – LM2018140”, by the PRACE-6IP project (European Union’s Horizon 2020 research and innovation programme, grant agreement No. 823767), and by the ExaQUte project (European Union’s Horizon 2020 research and innovation programme, grant agreement No. 800898).

    • 08:45 - 09:00
      Presentation
    • 09:00 - 10:30
      Introduction to Pandas
    • 10:30 - 10:45
      Coffee Break (15 min)
    • 10:45 - 12:00
      Python in Distributed Environments
    • 12:00 - 13:00
      Lunch Break (1 h)
    • 13:00 - 14:45
      Introduction to scikit-learn
    • 14:45 - 15:15
      Optimizations for CPUs, GPUs and numerical stability
    • 15:15 - 15:30
      Coffee Break (15 min)
    • 15:30 - 16:45
      Hands-on exercises
    • 16:45 - 17:00
      Q&A