LUsim: A Framework for Simulation-Based Performance Modeling and Prediction of Parallel Sparse LU Factorization
eScholarship
Open Access Publications from the University of California

Abstract

Sparse parallel factorization is among the most complicated and irregular algorithms to analyze and optimize. Performance depends on system characteristics, such as the floating-point rate, the memory hierarchy, and the interconnect performance, as well as on input matrix characteristics, such as the number and location of nonzeros. We present LUsim, a simulation framework for modeling the performance of sparse LU factorization. Our framework uses micro-benchmarks to calibrate the parameters of machine characteristics and additional tools to facilitate real-time performance modeling. We are using LUsim to analyze an existing parallel sparse LU factorization code and to explore a latency-tolerant variant. We developed and validated a model of the factorization in SuperLU_DIST, then modeled and implemented a new variant of SuperLU_DIST that replaces a blocking collective communication phase with a non-blocking asynchronous point-to-point one. Our strategy realized a mean improvement of 11 percent over a suite of test matrices.
