Results

Submissions: 69 (70% of accepted papers), including 2 major-revision rejections

Evaluation Results:

  • 66 Artifact Available (+2 rejected major revision)
  • 52 Artifact Functional (+2 rejected major revision)
  • 38 Results Reproduced (+2 rejected major revision)

SC21 Best Reproducibility Advancement Award Finalists:

  • Weihao Cui, Han Zhao, Quan Chen, Ningxin Zheng, Jingwen Leng, Jieru Zhao, Zhuo Song, Tao Ma, Yong Yang, Chao Li, Minyi Guo: “Enable Simultaneous DNN Services Based on Deterministic Operator Overlap and Precise Latency Prediction”
  • Ankit Srivastava, Sriram Chockalingam, Maneesha Aluru, Srinivas Aluru: “Parallel Construction of Module Networks”
  • William S. Moses, Valentin Churavy, Ludger Paehler, Jan Hückelheim, Sri Hari Krishna Narayanan, Michel Schanen, Johannes Doerfert: “Reverse-Mode Automatic Differentiation and Optimization of GPU Kernels via Enzyme”
  • Sunil Kumar, Akshat Gupta, Vivek Kumar, Sridutt Bhalachandra: “Cuttlefish: Library for Achieving Energy Efficiency in Multicore Parallel Programs”
  • Konstantinos Parasyris, Giorgis Georgakoudis, Harshitha Menon, James Diffenderfer, Ignacio Laguna, Daniel Osei-Kuffuor, Markus Schordan: “HPAC: Evaluating Approximate Computing Techniques on HPC OpenMP Applications”

Badges awarded per paper (columns: paper title, Available, Functional, Reproduced, artifact link):
Temporal Vectorization for Stencils Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Artifact
LCCG: A Locality-Centric Hardware Accelerator for High Throughput of Concurrent Graph Processing Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Parallel Construction of Module Networks Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Online Evolutionary Batch Size Orchestration for Scheduling Deep Learning Workloads in GPU Clusters Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Krill: A Compiler and Runtime System for Concurrent Graph Processing Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Exploiting User Activeness for Data Retention in HPC Systems Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Artifact
LogECMem: Coupling Erasure-Coded In-Memory Key-Value Stores with Parity Logging Artifacts Available (v1.1) Artifact
KAISA: An Adaptive Second-order Optimizer Framework for Deep Neural Networks Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Artifact
Understanding, Predicting and Scheduling Serverless Workloads under Partial Interference Artifacts Available (v1.1) Artifact
Cuttlefish: Library for Achieving Energy Efficiency in Multicore Parallel Programs Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Accelerating XOR-based Erasure Coding using Program Optimization Techniques Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Accelerating Applications using Edge Tensor Processing Units Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
On the Parallel I/O Optimality of Linear Algebra Kernels: Near-Optimal Matrix Factorizations Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
SV-Sim: Scalable PGAS-based State Vector Simulation of Quantum Circuits Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Artifact
DeltaFS: A Scalable No-Ground-Truth Filesystem For Massively-Parallel Computing Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Artifact
Reducing Redundancy in Data Organization and Arithmetic Calculation for Stencil Computations Artifacts Available (v1.1) Artifact
CAKE: Matrix Multiplication Using Constant-Bandwidth Blocks Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Hardware-supported Remote Persistence for Distributed Persistent Memory Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
STM-Multifrontal QR: Streaming Task Mapping Multifrontal QR Factorization Empowered by GCN Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Clairvoyant Prefetching for Distributed Machine Learning I/O Artifacts Available (v1.1) Artifact
Discovering and Balancing Fundamental Cycles in Large Signed Graphs Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
High Performance Uncertainty Quantification with Parallelized Multilevel Markov Chain Monte Carlo Artifacts Available (v1.1) Artifact
3D Acoustic-Elastic Coupling with Gravity: The Dynamics of the 2018 Palu, Sulawesi Earthquake and Tsunami Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Enable Simultaneous DNN Services Based on Deterministic Operator Overlap and Precise Latency Prediction Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Index Launches: Scalable, Flexible Representation of Parallel Task Groups Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
PAGANI: A Parallel Adaptive GPU Algorithm for Numerical Integration Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Minimizing privilege for building HPC containers Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Efficient Tensor Core-based GPU Kernels for Structured Sparsity under Reduced Precision Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Preparing an Incompressible-Flow Fluid Dynamics Code for Exascale-Class Wind Energy Simulations Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Efficient Large-Scale Language Model Training on GPU Clusters Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Artifact
Pinpointing Crash-Consistency Bugs in the HPC I/O Stack: A Cross-Layer Approach Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Artifact
AgEBO-Tabular: Joint Neural Architecture and Hyperparameter Search with Autotuned Data-Parallel Training for Tabular Data Artifacts Available (v1.1) Artifact
Bootstrapping In-situ Workflow Auto-Tuning via Combining Performance Models of Component Applications Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Artifact
Systematically Inferring I/O Performance Variability by Examining Repetitive Job Behavior Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
MAPA: Multi-Accelerator Pattern Allocation Policy for Multi-Tenant GPU Servers Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Artifact
Productivity, Portability, Performance: Data-Centric Python Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Flare: Flexible In-Network Allreduce Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Non-Recurring Engineering (NRE) Best Practices: A Case Study with the NERSC/NVIDIA OpenMP Contract Artifacts Available (v1.1) Artifact
SEEC: Stochastic Escape Express Channel Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
APNN-TC: Accelerating Arbitrary Precision Neural Networks on Ampere GPU Tensor Cores Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Paths to OpenMP in the Kernel Artifacts Available (v1.1) Artifact
The Hidden cost of the Edge: A Performance Comparison of Edge and Cloud Latencies Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Scalable adaptive PDE solvers in arbitrary domains Artifacts Available (v1.1) Artifact
ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Artifact
Accelerating large scale de novo metagenome assembly using GPUs Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Online Optimization of File Transfers in High-Speed Networks Artifacts Available (v1.1) Artifact
In-Depth Analyses of Unified Virtual Memory System for GPU Accelerated Computing Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
HatRPC: Hint-Accelerated Thrift RPC over RDMA Artifacts Available (v1.1) Artifact
Characterization and Prediction of Deep Learning Workloads in Large-Scale GPU Datacenters Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Reverse-Mode Automatic Differentiation and Optimization of GPU Kernels via Enzyme Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
LibShalom: Optimizing Small and Irregular-shaped Matrix Multiplications on ARMv8 Multi-Cores Artifacts Available (v1.1) Artifact
Simurgh: A Fully Decentralized and Secure NVMM User Space File System Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
HPAC: Evaluating Approximate Computing Techniques on HPC OpenMP Applications Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Ribbon: Cost-Effective and QoS-Aware Deep Learning Model Inference using a Diverse Pool of Cloud Computing Instances Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Lunule: An Agile and Judicious Metadata Load Balancer for CephFS Artifacts Available (v1.1) Artifact
A Next-Generation Discontinuous Galerkin Fluid Dynamics Solver with Application to High-Resolution Lung Airflow Simulations Artifacts Available (v1.1) Artifact
LMFF: Efficient and Scalable Layered Materials Force Field on Heterogeneous Many-Core Processors Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Artifact
Whale: Efficient One-to-Many Data Partitioning in RDMA-assisted Distributed Stream Processing Systems Artifacts Available (v1.1) Artifact
ndzip-gpu: Efficient Lossless Compression of Scientific Floating-Point Data on GPUs Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
cuTS: Scaling Subgraph Isomorphism on Distributed Multi-GPU Systems Using Trie Based Data Structure Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Representation of Women in High-Performance Computing Conferences Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
PEPPA-X: finding program test inputs to bound silent data corruption vulnerability in HPC applications Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Tensor processing primitives: a programming abstraction for efficiency and portability in deep learning workloads Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
DistGNN: scalable distributed training for large-scale graph neural networks Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Artifact
Dr. Top-k: delegate-centric Top-k on GPUs Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Artifact
Hybrid, scalable, trace-driven performance modeling of GPGPUs Artifacts Available (v1.1) Artifacts Evaluated - Functional (v1.1) Results Reproduced (v1.1) Artifact
Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines Artifacts Evaluated - Functional (v1.1)