hrvatski jezikClear Cookie - decide language by browser settings

Scalable QR Factorisation of Ill-Conditioned Tall-and-Skinny Matrices on Distributed GPU Systems

Mijić, Nenad; Kaushik, Abhiram; Živković, Dario; Davidović, Davor (2025) Scalable QR Factorisation of Ill-Conditioned Tall-and-Skinny Matrices on Distributed GPU Systems. Mathematics, 13 (22). ISSN 2227-7390

[img] PDF - Published Version - article
Available under License Creative Commons Attribution.

Download (919kB)

Abstract

The QR factorisation is a cornerstone of numerical linear algebra, essential for solving overdetermined linear systems, eigenvalue problems, and various scientific computing tasks. However, computing it for ill-conditioned tall-and-skinny (TS) matrices on large-scale distributed-memory systems, particularly those with multiple GPUs, presents significant challenges in balancing numerical stability, high performance, and efficient communication. Traditional Householder-based QR methods provide numerical stability but perform poorly on TS matrices due to their reliance on memory-bound kernels. This paper introduces a novel algorithm for computing the QR factorisation of ill-conditioned TS matrices based on CholeskyQR methods. Although CholeskyQR is fast, it typically fails due to severe loss of orthogonality for ill-conditioned inputs. To solve this, our new algorithm, mCQRGSI+, combines the speed of CholeskyQR with stabilising techniques from the Gram–Schmidt process. It is specifically optimised for distributed multi-GPU systems, using adaptive strategies to balance computation and communication. Our analysis shows the method achieves accuracy comparable to Householder QR, even for extremely ill-conditioned matrices (condition numbers up to 1016). Scaling experiments demonstrate speedups of up to 12× over ScaLAPACK and 16× over SLATE’s CholeskyQR2. This work delivers a method that is both robust and highly parallel, advancing the state-of-the-art for this challenging class of problems.

Item Type: Article
Uncontrolled Keywords: QR factorisation; CholeskyQR; Gram–Schmidt orthogonalisation; tall-and-skinny matrices; ill-conditioned matrices; graphic processing units; distributed systems; parallel algorithms
Subjects: NATURAL SCIENCES > Mathematics > Numerical Mathematics
TECHNICAL SCIENCES > Computing > Process Computing
Divisions: Center for Informatics and Computing
Projects:
Project titleProject leaderProject codeProject type
Skalabilni algoritmi visokih performansi za buduće heterogene distribuirane računalne sustave-HybridScaleDavor DavidovićUIP-2020-02-4559HRZZ
Depositing User: Lorena Palameta
Date Deposited: 26 Nov 2025 14:58
URI: http://fulir.irb.hr/id/eprint/10247
DOI: 10.3390/math13223608

Actions (login required)

View Item View Item

Downloads

Downloads per month over past year

Contrast
Increase Font
Decrease Font
Dyslexic Font
Accessibility