Speaker
Dr Antoni Portero (IT4Innovations, National Supercomputing Center)
Description
High-performance modeling and simulation play a driving role in decision making and forecasting. For time-critical emergency support applications such as severe weather forecasting or flood modeling, late results can be unusable. Forecast models must be executed, and their data analyzed, while the predictions can still be applied. These on-demand, large-scale computations cannot wait indefinitely in a job queue for supercomputer resources to become available. Nor can the community keep multimillion-dollar infrastructure idle until an urgent computation requires it. Even if it happens rarely, best-effort scheduling may not be sufficient when critical applications compete in the job queue with other users. Dedicated support is needed to provide computing resources quickly, automatically, and reliably.
An increasing number of high-performance applications demand some form of time predictability, particularly in scenarios where correctness depends on both performance and timing requirements and a failure to meet either is critical. Consequently, a more predictable HPC system is required, especially for an emerging class of adaptive real-time HPC applications. Here we present our runtime approach, which produces results in a predictable time with a minimal allocation of hardware resources. The paper describes the advantages in execution-time reliability and the trade-offs in power/energy consumption and system temperature compared to the current GNU/Linux governors.
Primary author
Dr Antoni Portero (IT4Innovations, National Supercomputing Center)
Co-authors
Mr Jiri Sevcik (IT4Innovations, National Supercomputing Center)
Mr Martin Golasowski (IT4Innovations, National Supercomputing Center)
Mr Radim Vavrik (IT4Innovations, National Supercomputing Center)
Prof. Vít Vondrak (IT4Innovations, VSB-Technical University of Ostrava)