Preliminary information - subject to change.
Please subscribe to the mailing list ftp-ssc@gwdg.de to be notified when the contents are updated.

Participating teams will have to provide the best solutions and IO performance for the following 6 tasks (3 benchmarks and 3 common HPC use cases), plus 1 secret task:

Task | Description
--- | ---
IO500 | The IO500 benchmark is a comprehensive performance evaluation tool designed to assess the efficiency and scalability of HPC storage systems. It consists of several tests, including IOR for sequential read/write performance and mdtest for metadata operations, ensuring a thorough analysis of storage subsystems. The benchmark helps organizations and researchers identify bottlenecks, compare different storage solutions, and guide the development of optimized storage architectures. Regularly updated results and rankings foster a competitive environment, encouraging continuous innovation in HPC storage technologies.
MD-Workbench | MD-Workbench is a specialized benchmark designed to evaluate the performance of metadata operations in HPC file systems. By simulating a range of metadata-intensive workloads, such as file creation, deletion, and attribute modification, MD-Workbench provides detailed insights into the efficiency and scalability of file system metadata handling.
Elbencho | Elbencho is a benchmark designed to evaluate the performance of storage systems under various workloads. It assesses the read and write capabilities of file systems, including network file systems, by generating synthetic workloads that simulate real-world usage patterns. Elbencho supports both single-threaded and multi-threaded operations, allowing for comprehensive performance analysis across different configurations and scales.
Quantum ESPRESSO | Quantum ESPRESSO is an open-source application for electronic-structure calculations and materials modeling at the nanoscale. Utilizing density functional theory (DFT), plane waves, and pseudopotentials, it enables researchers to perform a wide range of simulations, including structural optimization, molecular dynamics, and electronic properties analysis. Quantum ESPRESSO is widely used in both academic and industrial research for studying the physical properties of materials.
NVIDIA DALI | NVIDIA DALI (Data Loading Library) is a library designed to accelerate data preprocessing and augmentation for deep learning applications. By leveraging GPU acceleration, DALI streamlines the data pipeline, significantly reducing the time required to load, transform, and prepare data for neural network training. It supports a wide range of operations, including image and video decoding, resizing, cropping, and normalization. DALI can be integrated with popular deep learning frameworks like TensorFlow and PyTorch, allowing for easy incorporation into existing workflows.
Python environment using conda | Conda is an open-source package management and environment management system for Python, widely used by data scientists, developers, and researchers. It simplifies the process of installing, updating, and managing software packages and their dependencies, ensuring compatibility across various platforms. With Conda, users can create and switch between isolated environments, each with its own set of installed packages, which helps prevent conflicts between different projects. This flexibility benefits complex workflows that require specific library versions, but it also poses challenges for storage performance.
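To illustrate why a Python environment can be a storage workload in its own right, the following is a minimal sketch of how the startup time of an interpreter could be measured. It is not the official measurement procedure of the competition. The helper name `time_startup` is hypothetical, and `sys.executable` merely stands in for the interpreter of a prepared environment. The sketch does not drop any caches, so only a run on a freshly mounted or cache-cleared system would approximate an uncached start.

```python
import subprocess
import sys
import time


def time_startup(command, repeats=3):
    """Run `command` several times and return the wall-clock seconds of each run.

    Hypothetical helper for illustration only. No caches are dropped here,
    so later runs usually benefit from data already held in memory.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command fails
        subprocess.run(
            command,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        timings.append(time.perf_counter() - start)
    return timings


# Assumption: the current interpreter stands in for the environment under test.
timings = time_startup([sys.executable, "-c", "import json"])
for run, seconds in enumerate(timings, start=1):
    print(f"run {run}: {seconds:.3f} s")
```

Comparing the first run with the later ones gives a rough impression of how much of the startup cost comes from reading files from storage rather than from memory.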

Scoring for tasks

The scoring for the SSC is outlined in the table below. For tasks that are scored on benchmark results, better results earn more points. The maximum number of points for such a task equals the number of teams in the competition. The acronym DLIO refers to the "Deep Learning IO Benchmark" and is used as a reference here.

For 5 teams, the point spread is as follows:

Application | Task | Points
--- | --- | ---
IO500 | Submission to IO500 Webpage (Research section). Full submission; 2 points if the description is partly missing; 1 point for the reproducibility questionnaire | 3
 | 10-client setup. Scoring based on results | 5
 | Description of the configurations measured and performance improvements made. At least 5 different node/process combinations with reasoning on one summary page | 5
MD-Workbench | Lowest maximum latency for the fixed configuration. Scoring based on results | 5
Elbencho | Run for an arbitrary number of client nodes, 100 KByte files (same file size as DLIO). Running the benchmark | 2
 | Run with 10 client nodes, 100 KByte files (same file size as DLIO) vs. 100 KByte accesses in shared files. Scoring based on results | 5
 | Run with 1 client node, single 100 MByte block, ensuring the cache is exceeded. Scoring based on results | 5
NVIDIA DALI Pipeline | Large file access pipeline with prepared Python code. Scoring based on results | 5
Conda environment | Startup time of the prepared environment, uncached. Scoring based on results | 5
Performance analysis | Written report. Comparison between benchmarks and theoretical maximum; 1-2 pages of description, analysis, reasoning, and conclusion | 10
Secret tasks | ? | ?
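The page does not spell out how a ranking is turned into points. As an illustration only, the sketch below assumes that the best team receives as many points as there are teams and each following rank receives one point fewer, which would give 5, 4, 3, 2, 1 for 5 teams. The function `rank_points`, the team names, and all result values are hypothetical. The sketch also adds up the maximum points listed in the table, excluding the secret tasks, whose points are not stated.

```python
def rank_points(results, higher_is_better=True):
    """Assign points by rank: the best team gets len(results) points,
    the next one point fewer, and so on (assumed scheme, ties not handled)."""
    ordered = sorted(results, key=results.get, reverse=higher_is_better)
    n = len(ordered)
    return {team: n - position for position, team in enumerate(ordered)}


# Hypothetical bandwidth results (higher is better)
bandwidth = {"team_a": 12.5, "team_b": 9.1, "team_c": 15.0, "team_d": 7.3, "team_e": 11.0}
bandwidth_points = rank_points(bandwidth)

# Hypothetical maximum latencies (lower is better, as in the MD-Workbench task)
latency = {"team_a": 0.8, "team_b": 0.3, "team_c": 1.2, "team_d": 0.5, "team_e": 0.9}
latency_points = rank_points(latency, higher_is_better=False)

# Maximum points per row as listed in the table, without the secret tasks
listed_max_points = [3, 5, 5, 5, 2, 5, 5, 5, 5, 10]
total_listed = sum(listed_max_points)

print(bandwidth_points)
print(latency_points)
print("Listed maximum without secret tasks:", total_listed)
```

Under this assumed scheme a team's points depend only on its rank, not on the margin by which it leads or trails the other teams.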