aoclsparse_?trsv() - 5.2 English - 68552

AOCL API Guide (68552)

Document ID
68552
Release Date
2025-12-29
Version
5.2 English
aoclsparse_status aoclsparse_strsv(aoclsparse_operation trans, const float alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const float *b, float *x)#
aoclsparse_status aoclsparse_dtrsv(aoclsparse_operation trans, const double alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const double *b, double *x)#
aoclsparse_status aoclsparse_ctrsv(aoclsparse_operation trans, const aoclsparse_float_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_float_complex *b, aoclsparse_float_complex *x)#
aoclsparse_status aoclsparse_ztrsv(aoclsparse_operation trans, const aoclsparse_double_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_double_complex *b, aoclsparse_double_complex *x)#

Sparse triangular solver for real/complex single and double data precisions.

The function aoclsparse_strsv() and variants solve sparse lower (or upper) triangular linear system of equations. The system is defined by the sparse \(m \times m\) matrix \(A\), the dense solution \(m\)-vector \(x\), and the right-hand side dense \(m\)-vector \(b\). Vector \(b\) is multiplied by \(\alpha\). The solution \(x\) is estimated by solving

\[ op(L) \cdot x = \alpha \cdot b, \quad \text{ or } \quad op(U) \cdot x = \alpha \cdot b, \]
where \(L = \text{tril}(A)\) is the lower triangle of matrix \(A\), similarly, \(U = \text{triu}(A)\) is the upper triangle of matrix \(A\). The operator \(op()\) is regarded as the matrix linear operation,
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]

Notes

1. This routine supports sparse matrices in CSR, CSC, and TCSR formats.

2. If the matrix descriptor descr specifies that the matrix \(A\) is to be regarded as having a unitary diagonal, then the main diagonal entries of matrix \(A\) are not accessed and are all considered to be unitary.

3. The input matrix need not be (upper or lower) triangular matrix, in the descr, the fill_mode entity specifies which triangle to consider, namely, if fill_mode = aoclsparse_fill_mode_lower, then

\[ op(L) \cdot x = \alpha \cdot b, \]

otherwise, if fill_mode = aoclsparse_fill_mode_upper, then

\[ op(U) \cdot x = \alpha \cdot b, \]

is solved.

4. To increase performance and if the matrix \(A\) is to be used more than once to solve for different right-hand sides b, then it is encouraged to provide hints using aoclsparse_set_sv_hint() and aoclsparse_optimize(), otherwise, the optimization for the matrix will be done by the solver on entry.

5. There is a kid (Kernel ID) variation of TRSV, namely with a suffix of _kid, aoclsparse_strsv_kid() (and variations) where it is possible to specify the TRSV kernel to use (if possible).

  1/* ************************************************************************
  2 * Copyright (c) 2022-2023 Advanced Micro Devices, Inc.
  3 *
  4 * Permission is hereby granted, free of charge, to any person obtaining a copy
  5 * of this software and associated documentation files (the "Software"), to deal
  6 * in the Software without restriction, including without limitation the rights
  7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  8 * copies of the Software, and to permit persons to whom the Software is
  9 * furnished to do so, subject to the following conditions:
 10 *
 11 * The above copyright notice and this permission notice shall be included in
 12 * all copies or substantial portions of the Software.
 13 *
 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 20 * THE SOFTWARE.
 21 *
 22 * ************************************************************************ */
 23
 24#include "aoclsparse.h"
 25
 26#include <iomanip>
 27#include <iostream>
 28
 29int main()
 30{
 31    std::cout << "-------------------------------" << std::endl
 32              << " Triangle solve sample program" << std::endl
 33              << "-------------------------------" << std::endl
 34              << std::endl;
 35
 36    // Create a tri-diagonal matrix in CSR format
 37    // | 1  2  0  0 |
 38    // | 3  1  2  0 |
 39    // | 0  3  1  2 |
 40    // | 0  0  3  1 |
 41    aoclsparse_int    n = 4, m = 4, nnz = 10;
 42    aoclsparse_int    icrow[5] = {0, 2, 5, 8, 10};
 43    aoclsparse_int    icol[18] = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};
 44    double            aval[18] = {1, 2, 3, 1, 2, 3, 1, 2, 3, 1};
 45    aoclsparse_status status;
 46
 47    // create aoclsparse matrix and its descriptor
 48    aoclsparse_matrix     A;
 49    aoclsparse_index_base base = aoclsparse_index_base_zero;
 50    aoclsparse_mat_descr  descr_a;
 51    aoclsparse_operation  trans = aoclsparse_operation_none;
 52    aoclsparse_create_dcsr(&A, base, m, n, nnz, icrow, icol, aval);
 53    aoclsparse_create_mat_descr(&descr_a);
 54
 55    // A = L + D + U where:
 56    // - L and U are the strict lower and upper triangle of A
 57    // - D is the diagonal
 58    // Use the matrix descriptor to solve (L+D)x = b
 59    // b = [1, 4, 4, 4]^t
 60    double b[4] = {1, 4, 4, 4};
 61    aoclsparse_set_mat_type(descr_a, aoclsparse_matrix_type_triangular);
 62    aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_lower);
 63    aoclsparse_set_mat_index_base(descr_a, base);
 64    status = aoclsparse_set_sv_hint(A, trans, descr_a, 1);
 65    if(status != aoclsparse_status_success)
 66    {
 67        std::cerr << "Error setting SV hint, status = " << status << std::endl;
 68        return 1;
 69    }
 70    status = aoclsparse_optimize(A);
 71    if(status != aoclsparse_status_success)
 72    {
 73        std::cerr << "Error returned from aoclsparse_optimize, status = " << status << "."
 74                  << std::endl;
 75        return 2;
 76    }
 77    // Call the triangle solver
 78    double alpha = 1.0;
 79    double x[n];
 80    status = aoclsparse_dtrsv(trans, alpha, A, descr_a, b, x);
 81    if(status != aoclsparse_status_success)
 82    {
 83        std::cerr << "Error returned from aoclsparse_dtrsv, status = " << status << "."
 84                  << std::endl;
 85        return 3;
 86    }
 87
 88    // Print the result
 89    std::cout << "Solving (L+D)x = b: " << std::endl << "  x = ";
 90    std::cout << std::fixed;
 91    std::cout.precision(1);
 92    for(aoclsparse_int i = 0; i < n; i++)
 93        std::cout << x[i] << "  ";
 94    std::cout << std::endl;
 95
 96    // The same method can be used to solve (U+D)x = b
 97    aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_upper);
 98    status = aoclsparse_dtrsv(trans, alpha, A, descr_a, b, x);
 99    if(status != aoclsparse_status_success)
100    {
101        std::cerr << "Error returned from aoclsparse_dtrsv, status = " << status << "."
102                  << std::endl;
103        return 3;
104    }
105
106    // Print the result
107    std::cout << std::endl;
108    std::cout << "Solving (U+D)x = b: " << std::endl << "  x = ";
109    for(aoclsparse_int i = 0; i < n; i++)
110        std::cout << x[i] << "  ";
111    std::cout << std::endl;
112
113    // destroy the aoclsparse memory
114    aoclsparse_destroy_mat_descr(descr_a);
115    aoclsparse_destroy(&A);
116
117    return 0;
118}

  1/* ************************************************************************
  2 * Copyright (c) 2023-2025 Advanced Micro Devices, Inc.
  3 *
  4 * Permission is hereby granted, free of charge, to any person obtaining a copy
  5 * of this software and associated documentation files (the "Software"), to deal
  6 * in the Software without restriction, including without limitation the rights
  7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  8 * copies of the Software, and to permit persons to whom the Software is
  9 * furnished to do so, subject to the following conditions:
 10 *
 11 * The above copyright notice and this permission notice shall be included in
 12 * all copies or substantial portions of the Software.
 13 *
 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 20 * THE SOFTWARE.
 21 *
 22 * ************************************************************************ */
 23
 24#include "aoclsparse.h"
 25
 26#include <assert.h>
 27#include <complex>
 28#include <iomanip>
 29#include <iostream>
 30#include <limits>
 31#include <vector>
 32
 33int main()
 34{
 35    std::cout << "---------------------------------" << std::endl
 36              << " TCSR matrix, TRSV sample program " << std::endl
 37              << "---------------------------------" << std::endl
 38              << std::endl;
 39
 40    /* The complex matrix A is the tri-diagonal matrix in TCSR format
 41     * | 1+3i  2+5i     0     0 |
 42     * | 3     1+2i  2-2i     0 |
 43     * |    0     3     1  2+3i |
 44     * |    0     0  3+1i  1+1i |
 45     */
 46    const aoclsparse_int                   n = 4, m = 4, nnz = 10;
 47    aoclsparse_int                         row_ptr_L[5] = {0, 1, 3, 5, 7};
 48    aoclsparse_int                         row_ptr_U[5] = {0, 2, 4, 6, 7};
 49    aoclsparse_int                         col_idx_L[7] = {0, 0, 1, 1, 2, 2, 3};
 50    aoclsparse_int                         col_idx_U[7] = {0, 1, 1, 2, 2, 3, 3};
 51    std::vector<aoclsparse_double_complex> val_L
 52        = {{1., 3.}, {3., 0.}, {1., 2.}, {3., 0.}, {1., 0.}, {3., 1.}, {1., 1.}};
 53    std::vector<aoclsparse_double_complex> val_U
 54        = {{1., 3.}, {2., 5.}, {1., 2.}, {2., -2.}, {1., 0.}, {2., 3.}, {1., 1.}};
 55    aoclsparse_status                      status;
 56    std::vector<aoclsparse_double_complex> ref;
 57
 58    // create aoclsparse matrix and its descriptor
 59    aoclsparse_matrix     A;
 60    aoclsparse_index_base base = aoclsparse_index_base_zero;
 61    aoclsparse_mat_descr  descr_a;
 62    aoclsparse_operation  trans = aoclsparse_operation_none;
 63
 64    // Create TCSR matrix
 65    status = aoclsparse_create_ztcsr(&A,
 66                                     base,
 67                                     m,
 68                                     n,
 69                                     nnz,
 70                                     row_ptr_L,
 71                                     row_ptr_U,
 72                                     col_idx_L,
 73                                     col_idx_U,
 74                                     val_L.data(),
 75                                     val_U.data());
 76    if(status != aoclsparse_status_success)
 77    {
 78        std::cerr << "Error creating the matrix, status = " << status << std::endl;
 79        return 1;
 80    }
 81    aoclsparse_create_mat_descr(&descr_a);
 82
 83    /* Solve the lower triangular system Lx = b,
 84     * here alpha=1 and b = [1+i, 4+2i, 4+i, 4].
 85     */
 86    std::vector<aoclsparse_double_complex> b = {{1., 1.}, {4., 2.}, {4., 1}, {4., 0}};
 87    aoclsparse_set_mat_type(descr_a, aoclsparse_matrix_type_triangular);
 88    aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_lower);
 89    status = aoclsparse_set_sv_hint(A, trans, descr_a, 1);
 90    if(status != aoclsparse_status_success)
 91    {
 92        std::cerr << "Error setting SV hint, status = " << status << std::endl;
 93        return 1;
 94    }
 95    status = aoclsparse_optimize(A);
 96    if(status != aoclsparse_status_success)
 97    {
 98        std::cerr << "Error returned from aoclsparse_optimize, status = " << status << "."
 99                  << std::endl;
100        return 2;
101    }
102
103    // Call the triangle solver
104    aoclsparse_double_complex alpha = {1.0, 0.};
105    aoclsparse_double_complex x[n];
106    status = aoclsparse_ztrsv(trans, alpha, A, descr_a, b.data(), x);
107    if(status != aoclsparse_status_success)
108    {
109        std::cerr << "Error returned from aoclsparse_ztrsv, status = " << status << "."
110                  << std::endl;
111        return 3;
112    }
113
114    // Print and check the result
115    double tol = 20 * std::numeric_limits<double>::epsilon();
116    ref.assign({{0.4, -0.2}, {1.6, -0.6}, {-0.8, 2.8}, {0.8, -8.4}});
117    std::cout << "Solving Lx = alpha b: " << std::endl << "  x = ";
118    std::cout << std::fixed;
119    std::cout.precision(1);
120    bool oki, ok = true;
121    for(aoclsparse_int i = 0; i < n; i++)
122    {
123        oki = std::abs(x[i].real - ref[i].real) <= tol && std::abs(x[i].imag - ref[i].imag) <= tol;
124        std::cout << "(" << x[i].real << ", " << x[i].imag << "i) " << (oki ? " " : "!  ");
125        ok &= oki;
126    }
127    std::cout << std::endl;
128
129    /* Solve the lower triangular system U^Hx = b,
130     * here alpha=1-i and b is unchanged.
131     */
132    alpha = {1, -1};
133    // Indicate to use only the conjugate transpose of the upper part of A
134    trans = aoclsparse_operation_conjugate_transpose;
135    aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_upper);
136    status = aoclsparse_ztrsv(trans, alpha, A, descr_a, b.data(), x);
137    if(status != aoclsparse_status_success)
138    {
139        std::cerr << "Error returned from aoclsparse_ztrsv, status = " << status << "."
140                  << std::endl;
141        return 3;
142    }
143
144    // Print and check the result
145    ref.assign({{0.2, 0.6}, {1.4, 0.6}, {3.4, -7.0}, {-1.0, 19.2}});
146    std::cout << std::endl;
147    std::cout << "Solving U^Hx = alpha*b: " << std::endl << "  x = ";
148    for(aoclsparse_int i = 0; i < n; i++)
149    {
150        oki = std::abs(x[i].real - ref[i].real) <= tol && std::abs(x[i].imag - ref[i].imag) <= tol;
151        std::cout << "(" << x[i].real << ", " << x[i].imag << "i) " << (oki ? " " : "!  ");
152        ok &= oki;
153    }
154    std::cout << std::endl;
155
156    // Destroy the aoclsparse memory
157    aoclsparse_destroy_mat_descr(descr_a);
158    aoclsparse_destroy(&A);
159
160    return ok ? 0 : 4;
161}
Parameters:
Return values:
  • aoclsparse_status_success – the operation completed successfully and \(x\) contains the solution to the linear system of equations.

  • aoclsparse_status_invalid_size – matrix \(A\) or \(op(A)\) is invalid.

  • aoclsparse_status_invalid_pointer – One or more of A, descr, x, b are invalid pointers.

  • aoclsparse_status_internal_error – an internal error occurred.

  • aoclsparse_status_not_implemented – the requested operation is not yet implemented.

  • other – possible failure values from a call to aoclsparse_optimize.

aoclsparse_status aoclsparse_strsv_strided(aoclsparse_operation trans, const float alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const float *b, const aoclsparse_int incb, float *x, const aoclsparse_int incx)#
aoclsparse_status aoclsparse_dtrsv_strided(aoclsparse_operation trans, const double alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const double *b, const aoclsparse_int incb, double *x, const aoclsparse_int incx)#
aoclsparse_status aoclsparse_ctrsv_strided(aoclsparse_operation trans, const aoclsparse_float_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_float_complex *b, const aoclsparse_int incb, aoclsparse_float_complex *x, const aoclsparse_int incx)#
aoclsparse_status aoclsparse_ztrsv_strided(aoclsparse_operation trans, const aoclsparse_double_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_double_complex *b, const aoclsparse_int incb, aoclsparse_double_complex *x, const aoclsparse_int incx)#

This is a variation of TRSV, namely with a suffix of _strided, allows to set the stride for the dense vectors b and x.

For full details refer to aoclsparse_?trsv().

Parameters:
aoclsparse_status aoclsparse_strsv_kid(aoclsparse_operation trans, const float alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const float *b, float *x, aoclsparse_int kid)#
aoclsparse_status aoclsparse_dtrsv_kid(aoclsparse_operation trans, const double alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const double *b, double *x, aoclsparse_int kid)#
aoclsparse_status aoclsparse_ctrsv_kid(aoclsparse_operation trans, const aoclsparse_float_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_float_complex *b, aoclsparse_float_complex *x, aoclsparse_int kid)#
aoclsparse_status aoclsparse_ztrsv_kid(aoclsparse_operation trans, const aoclsparse_double_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_double_complex *b, aoclsparse_double_complex *x, aoclsparse_int kid)#

Sparse triangular solver for real/complex single and double data precisions (kernel flag variation).

For full details refer to aoclsparse_?trsv().

This variation of TRSV, namely with a suffix of _kid, allows to choose which TRSV kernel to use (if possible). Currently the possible choices are:

kid=0

Reference implementation (No explicit AVX instructions).

kid=1

Alias to kid=2 (Kernel Template AVX 256-bit implementation)

kid=2

Kernel Template version using AVX2 extensions.

kid=3

Kernel Template version using AVX512F+ CPU extensions.

Any other Kernel ID value will default to kid = 0.

Parameters: