Mijić, Nenad; Kaushik, Abhiram; Živković, Dario; Davidović, Davor (2025) Scalable QR Factorisation of Ill-Conditioned Tall-and-Skinny Matrices on Distributed GPU Systems. Mathematics, 13 (22). ISSN 2227-7390
|
PDF
- Published Version
- article
Available under License Creative Commons Attribution. Download (919kB) |
Abstract
The QR factorisation is a cornerstone of numerical linear algebra, essential for solving overdetermined linear systems, eigenvalue problems, and various scientific computing tasks. However, computing it for ill-conditioned tall-and-skinny (TS) matrices on large-scale distributed-memory systems, particularly those with multiple GPUs, presents significant challenges in balancing numerical stability, high performance, and efficient communication. Traditional Householder-based QR methods provide numerical stability but perform poorly on TS matrices due to their reliance on memory-bound kernels. This paper introduces a novel algorithm for computing the QR factorisation of ill-conditioned TS matrices based on CholeskyQR methods. Although CholeskyQR is fast, it typically fails due to severe loss of orthogonality for ill-conditioned inputs. To solve this, our new algorithm, mCQRGSI+, combines the speed of CholeskyQR with stabilising techniques from the Gram–Schmidt process. It is specifically optimised for distributed multi-GPU systems, using adaptive strategies to balance computation and communication. Our analysis shows the method achieves accuracy comparable to Householder QR, even for extremely ill-conditioned matrices (condition numbers up to 1016). Scaling experiments demonstrate speedups of up to 12× over ScaLAPACK and 16× over SLATE’s CholeskyQR2. This work delivers a method that is both robust and highly parallel, advancing the state-of-the-art for this challenging class of problems.
| Item Type: | Article | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Uncontrolled Keywords: | QR factorisation; CholeskyQR; Gram–Schmidt orthogonalisation; tall-and-skinny matrices; ill-conditioned matrices; graphic processing units; distributed systems; parallel algorithms | ||||||||
| Subjects: | NATURAL SCIENCES > Mathematics > Numerical Mathematics TECHNICAL SCIENCES > Computing > Process Computing |
||||||||
| Divisions: | Center for Informatics and Computing | ||||||||
| Projects: |
|
||||||||
| Depositing User: | Lorena Palameta | ||||||||
| Date Deposited: | 26 Nov 2025 14:58 | ||||||||
| URI: | http://fulir.irb.hr/id/eprint/10247 | ||||||||
| DOI: | 10.3390/math13223608 |
Actions (login required)
![]() |
View Item |




Altmetric
Altmetric



