4 June 2018
VŠB - Technical University Ostrava, IT4Innovations building
Europe/Prague timezone

Annotation

Real-world applications often involve end-to-end data processing pipelines composed of a large number (millions) of interconnected computational tasks of varying granularity. We introduce HyperLoom, a platform for defining and executing such pipelines in distributed environments using a Python API.

Scientific pipelines, such as those in machine learning, are composed of multiple data processing tasks. HyperLoom users can easily define dependencies between computational tasks and create a pipeline, which can then be executed on HPC systems. The high-performance core of HyperLoom dynamically orchestrates the tasks over the available resources while respecting task requirements. The entire system was designed to have minimal overhead and to deal efficiently with varying task computation times. HyperLoom can execute pipelines that contain basic built-in tasks, user-defined Python tasks, tasks wrapping third-party applications, or combinations of those.
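The idea of defining tasks with dependencies and executing them as a pipeline can be illustrated with a minimal sketch in plain Python. This is not the HyperLoom API: the `run_pipeline` helper and the task names are hypothetical, and the sketch runs tasks sequentially on one machine, whereas HyperLoom schedules them over distributed resources.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(tasks):
    """Run tasks in dependency order (illustrative helper, not HyperLoom).

    `tasks` maps a task name to a pair (function, list of input task names).
    Returns a dict mapping each task name to its result.
    """
    # Build the dependency graph: task name -> names of the tasks it depends on.
    graph = {name: deps for name, (_, deps) in tasks.items()}
    results = {}
    # static_order() yields every task only after all of its dependencies.
    for name in TopologicalSorter(graph).static_order():
        func, deps = tasks[name]
        results[name] = func(*(results[d] for d in deps))
    return results


# Hypothetical three-task pipeline: two constant inputs merged by a third task.
pipeline = {
    "hello": (lambda: "Hello ", []),
    "world": (lambda: "world!", []),
    "merge": (lambda a, b: a + b, ["hello", "world"]),
}

results = run_pipeline(pipeline)
print(results["merge"])  # prints "Hello world!"
```

A real scheduler would additionally run independent tasks (here `hello` and `world`) in parallel and place them on workers according to their resource requirements.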

This course will introduce HyperLoom and show how it can be used in HPC environments, based on our experience with deploying HyperLoom at the IT4Innovations National Supercomputing Center.

Level

beginner - intermediate

Language

English

Purpose of the course (benefits for the attendees)

Attendees will learn the key concepts of HyperLoom, its architecture, and usage explained through practical examples.

About the tutor(s)

Stanislav Böhm is a computer science researcher at the Advanced Data Analysis and Simulations Lab at IT4Innovations and at the Institute of Formal and Applied Linguistics at Charles University. He is interested in distributed systems and verification problems. He received his Ph.D. in 2014. He is the main author and team leader of the following HPC-related software projects: HyperLoom (framework for distributed pipelines), Aislinn (verification tool for MPI programs), Kaira (high-level development environment for MPI programs), and Haydi (combinatorial framework).

Vojtěch Cima is a research assistant and Ph.D. student at the Advanced Data Analysis and Simulations Lab at IT4Innovations, where he actively participates in national and European research projects focusing on workload distribution and machine learning.

Venue

VŠB - Technical University Ostrava, IT4Innovations building, room 207
Studentská 6231/1B, 708 33 Ostrava–Poruba, Czech Republic

Practicalities

Prerequisites

  • laptop with an SSH client

  • basic knowledge of Python

  • basic knowledge of Linux

Registration

Registration is obligatory; the registration form is here. Registration closes at the (extended) deadline stated above or once the course capacity is exhausted.

Capacity and Fees

30 participants. The event is provided free of charge.

Practicalities

  • See the links below for how to get to the campus of VŠB - Technical University Ostrava and to the IT4Innovations building.

  • Documentation for IT4Innovations' computer systems is available at https://docs.it4i.cz/.