Summary and benefits for the attendees
Numerical simulations conducted on current high-performance computing (HPC) systems face an ever-growing need for scalability. Larger HPC platforms provide opportunities to push the limits on the size and properties of what can be accurately simulated. On massively parallel systems, serial approaches to handling I/O in a parallel application will dominate the overall run time. Heterogeneity of platforms can impose a high maintenance burden when different data representations are needed. Portable, self-describing data formats such as HDF5 are already widely used within certain communities.
The course will present the main concepts relevant to I/O on high-end parallel systems. MPI-IO, the underlying standard for parallel I/O, will be introduced, and then two high-level I/O libraries (HDF5 and SIONlib) will be presented.
Writing a large number of files in the same directory can overload the metadata services of some parallel file systems and lead to poor performance. The purpose of MPI-IO is to provide a high-performance, portable, parallel I/O interface for parallel MPI programs.
The HDF5 library provides a set of high-level functions for describing and storing simple and complex data structures. It also allows for parallel I/O through MPI-IO.
SIONlib is a library for writing and reading binary data to/from several thousand processors into one or a small number of physical files. The SIONlib file layout and API allow the application to take advantage of the scaling behaviour and asynchronous access of a logical task-local pattern while keeping the number of files independent of, and significantly smaller than, the number of processes.
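The idea of a logical task-local pattern inside a single physical file can be illustrated with a small sketch. It uses only the Python standard library and makes no attempt to reproduce SIONlib's actual file format or API: the header layout, the fixed `CHUNK_SIZE`, and the function names `create_container`, `write_task` and `read_task` are all hypothetical, and a sequential loop stands in for the parallel tasks.

```python
import os
import struct
import tempfile

CHUNK_SIZE = 64                  # bytes reserved per task (illustrative value)
HEADER = struct.Struct("<II")    # hypothetical header: task count, chunk size
LENGTH = struct.Struct("<I")     # length prefix stored at the start of a chunk


def create_container(path, ntasks, chunk_size=CHUNK_SIZE):
    """Create one physical file holding a header and one empty chunk per task."""
    with open(path, "wb") as f:
        f.write(HEADER.pack(ntasks, chunk_size))
        f.write(b"\0" * (ntasks * chunk_size))


def _chunk_offset(f, rank):
    """Read the header and return (offset, chunk_size) of the chunk of `rank`."""
    f.seek(0)
    ntasks, chunk_size = HEADER.unpack(f.read(HEADER.size))
    if not 0 <= rank < ntasks:
        raise ValueError("rank out of range")
    return HEADER.size + rank * chunk_size, chunk_size


def write_task(path, rank, payload):
    """Write a task's payload into its own chunk; chunks never overlap."""
    with open(path, "r+b") as f:
        offset, chunk_size = _chunk_offset(f, rank)
        if len(payload) > chunk_size - LENGTH.size:
            raise ValueError("payload does not fit into the chunk")
        f.seek(offset)
        f.write(LENGTH.pack(len(payload)) + payload)


def read_task(path, rank):
    """Read back exactly the bytes that task `rank` wrote."""
    with open(path, "rb") as f:
        offset, _ = _chunk_offset(f, rank)
        f.seek(offset)
        (length,) = LENGTH.unpack(f.read(LENGTH.size))
        return f.read(length)


def demo(ntasks=4):
    """Let each simulated task write its data, then list what ended up on disk."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "container.bin")
        create_container(path, ntasks)
        for rank in range(ntasks):  # stands in for ntasks parallel processes
            write_task(path, rank, f"data from task {rank}".encode())
        results = [read_task(path, rank) for rank in range(ntasks)]
        return results, os.listdir(directory)


if __name__ == "__main__":
    print(demo())
```

Each task sees only its own region, as it would with one file per task, yet the directory holds a single file regardless of how many tasks take part. In a real parallel program the same disjoint-offset idea would be expressed through MPI-IO or a library built on a similar layout rather than through plain file seeks.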
About the tutor(s)
After a PhD in fluid mechanics, Nicole Audiffren worked for several years in computational atmospheric sciences at the Observatoire de Physique du Globe de Clermont-Ferrand (France). In 2002, she joined the Support Team at CINES (Centre Informatique de l'Enseignement Supérieur, Montpellier), which hosts one of the major French Tier-1 systems: the supercomputer Occigen (3.5 Pflops). Her main work is advising scientists who deal with large I/O, especially on Lustre systems. She is also regularly involved in the bidding processes for the Tier-1 machine. In addition, she is a regular member of national hiring panels and a member of a European Tier-1 resource allocation board.
Sebastian Lührs studied Technomathematics at the University of Applied Sciences Aachen. Since 2014 he has been working in the cross-sectional team Application Optimization in the Application Support division of the Jülich Supercomputing Centre of Forschungszentrum Jülich GmbH. Besides general application optimization, especially in the context of I/O, for the Tier-0 and Tier-1 systems in Jülich, his main work areas are tool development and user training. In addition, he is involved in several industrial, national and EU-funded projects such as PRACE and EoCoE.
This course is organized as a joint event of the IT4Innovations National Supercomputing Center, Czech PTC (PRACE Training Centre); Maison de la Simulation and CINES, PATC (PRACE Advanced Training Centre) in France; and Jülich Supercomputing Centre, PATC in Germany. All of these centres provide high-level HPC training for the Partnership for Advanced Computing in Europe (PRACE).