
Towards Reproducible, Accurately Rounded and Efficient BLAS

Chemseddine Chohra 1 
1 DALI - Digits, Architectures et Logiciels Informatiques
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, UPVD - Université de Perpignan Via Domitia
Abstract : Numerical reproducibility failures arise in parallel computation because floating-point addition is not associative. Massively parallel systems change the order of floating-point operations dynamically, so numerical results may differ from one run to another. We propose to ensure reproducibility by extending the IEEE-754 correct rounding property, as far as possible, to larger computing sequences. We introduce RARE-BLAS, a reproducible and accurate BLAS library that benefits from recent accurate and efficient summation algorithms. Solutions are designed for level 1 (asum, dot and nrm2) and level 2 (gemv and trsv) routines. Implementations relying on parallel programming APIs (OpenMP, MPI) and SIMD extensions are proposed. Their efficiency is compared with that of an optimized library (Intel MKL) and of other existing reproducible algorithms.
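The link between summation order and reproducibility described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis and does not use RARE-BLAS: `naive_sum` is a hypothetical helper written for this illustration, and the standard-library function `math.fsum` serves only as a stand-in for a correctly rounded summation (on typical IEEE-754 builds it returns the rounded value of the exact sum, which therefore cannot depend on the order of the inputs).

```python
import itertools
import math


def naive_sum(values):
    """Left-to-right floating-point accumulation; the result depends on order."""
    total = 0.0
    for v in values:
        total += v
    return total


# Illustrative data (an assumption, not from the thesis): the exact sum is 2.0,
# but 1.0 is absorbed whenever it is added to a partial sum near 1e100.
values = [1.0, 1e100, 1.0, -1e100]

# Try every ordering, mimicking a parallel reduction whose order varies per run.
naive_results = {naive_sum(p) for p in itertools.permutations(values)}
fsum_results = {math.fsum(p) for p in itertools.permutations(values)}

print(sorted(naive_results))  # several different answers
print(sorted(fsum_results))   # a single answer, the rounded exact sum
```

With this data the naive accumulation yields three distinct results (0.0, 1.0 and 2.0) depending on the ordering, while the correctly rounded sum is 2.0 for every ordering. Rounding the exact result is what makes the output independent of how a parallel runtime schedules the additions.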

Cited literature [83 references]
Contributor : Chemseddine Chohra
Submitted on : Tuesday, February 19, 2019 - 8:50:31 PM
Last modification on : Friday, August 5, 2022 - 2:56:33 PM
Long-term archiving on : Monday, May 20, 2019 - 5:53:49 PM


These Finale.pdf
Files produced by the author(s)


  • HAL Id : tel-02025855, version 1



Chemseddine Chohra. Towards Reproducible, Accurately Rounded and Efficient BLAS. Computer Arithmetic. Université de Perpignan Via Domitia (UPVD), 2017. English. ⟨NNT : ⟩. ⟨tel-02025855⟩


