Kathryn Mohror is a Distinguished Member of the Technical Staff and Director of the Center for Applied Scientific Computing at Lawrence Livermore National Laboratory. Kathryn’s current research focuses primarily on I/O performance and portability for HPC. Her work includes UnifyFS, a scalable file system for in-system storage on HPC systems (2024 R&D100 Award Winner), and the Scalable Checkpoint/Restart Library (SCR) framework, a multilevel checkpointing library that has been shown to significantly reduce checkpointing overhead (2019 R&D100 Award Winner). Additionally, she leads the IOPP project, funded by ASCR ECRP, which is working towards a fundamental understanding of the needs of current and emerging HPC I/O workloads and developing support based on that understanding.
Presentation Title:
A Storage System Fit for Exascale I/O
Presentation Abstract:
HPC applications increasingly face significant challenges in I/O performance because traditional storage systems provide one-size-fits-all solutions that struggle to keep pace with the growing and variable data demands of scientific simulations and workflows. In this talk, I will give an overview of the Rabbit Storage System, a novel storage architecture designed to address the I/O needs of applications on LLNL’s El Capitan exascale supercomputer. I’ll describe how the Rabbit Storage System can be dynamically configured for each job to tailor the I/O system for ease of use and to provide better performance than the traditional file system, and I’ll present early results demonstrating the system’s capabilities and performance.