Description
Numerical simulations are the closest thing cosmology has to a laboratory experiment. However, running a simulation is only the first step. Interpreting and analysing the outputs is an essential part of the discovery process, but the large size and high dimensionality of cosmological simulations severely limit the interpretability of their predictions. We will present a new assumption-free approach, and accompanying tools, to maximize scientific discovery from cosmological simulations. The tools can be applied to today’s largest simulations and will be essential for meeting the extreme data access, exploration, and analysis challenges posed by the exascale computing era. Our software runs on both local machines and HPC resources. It automatically learns compact representations of complex objects, such as simulated or observed galaxies, in a low-dimensional space that naturally describes their features. The data are then seamlessly projected onto this representation space for interactive inspection, visual interpretation, sample selection, and local analysis. We will demonstrate the workflow using ~60k simulated galaxies from IllustrisTNG, rendering an interactive visualization of a morphological similarity space on the surface of a hierarchical sphere designed to handle arbitrarily large simulations containing millions of galaxies. Lastly, we will discuss the potential use of the tool for the robust comparison of simulations with multimodal data from large galaxy surveys, including model selection and simulation-based inference.
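As a rough illustration of the workflow described above (learn a compact representation, project objects onto a sphere, index them hierarchically), the following minimal sketch may help. It is not the presented software: PCA stands in for the learned representation, a simple longitude/latitude grid stands in for the hierarchical tiling of the sphere, the input is synthetic random data rather than IllustrisTNG galaxies, and all function names are hypothetical.

```python
import numpy as np

def fit_pca(features, n_components=3):
    """Learn a linear low-dimensional basis (assumed stand-in for a learned model)."""
    mean = features.mean(axis=0)
    # Rows of vt are the principal directions of the centred data.
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]

def project_to_sphere(features, mean, components):
    """Project objects into the 3D latent space and normalise onto the unit sphere."""
    latent = (features - mean) @ components.T
    norms = np.linalg.norm(latent, axis=1, keepdims=True)
    return latent / np.clip(norms, 1e-12, None)

def cell_index(points, level):
    """Assign each point on the sphere to a cell of a longitude/latitude grid.

    Each level doubles the resolution in both directions, so every cell at
    `level` splits into four cells at `level + 1` (assumed tiling scheme).
    """
    n_lon, n_lat = 2 ** (level + 1), 2 ** level
    lon = np.arctan2(points[:, 1], points[:, 0])        # in [-pi, pi]
    lat = np.arcsin(np.clip(points[:, 2], -1.0, 1.0))   # in [-pi/2, pi/2]
    i = np.minimum(((lon + np.pi) / (2 * np.pi) * n_lon).astype(int), n_lon - 1)
    j = np.minimum(((lat + np.pi / 2) / np.pi * n_lat).astype(int), n_lat - 1)
    return j * n_lon + i

rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 64))  # synthetic placeholder, not simulation data
mean, components = fit_pca(features)
points = project_to_sphere(features, mean, components)
cells = cell_index(points, level=3)
```

Because the cells nest, a viewer could load coarse levels first and refine only the region being inspected, which is one plausible way such a scheme scales to millions of objects.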