-
aoclsparse_status aoclsparse_strsv(aoclsparse_operation trans, const float alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const float *b, float *x)#
-
aoclsparse_status aoclsparse_dtrsv(aoclsparse_operation trans, const double alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const double *b, double *x)#
-
aoclsparse_status aoclsparse_ctrsv(aoclsparse_operation trans, const aoclsparse_float_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_float_complex *b, aoclsparse_float_complex *x)#
-
aoclsparse_status aoclsparse_ztrsv(aoclsparse_operation trans, const aoclsparse_double_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_double_complex *b, aoclsparse_double_complex *x)#
Sparse triangular solver for real/complex single and double data precisions.
The function aoclsparse_strsv() and variants solve sparse lower (or upper) triangular linear system of equations. The system is defined by the sparse \(m \times m\) matrix \(A\), the dense solution \(m\)-vector \(x\), and the right-hand side dense \(m\)-vector \(b\). Vector \(b\) is multiplied by \(\alpha\). The solution \(x\) is estimated by solving
\[ op(L) \cdot x = \alpha \cdot b, \quad \text{ or } \quad op(U) \cdot x = \alpha \cdot b, \]where \(L = \text{tril}(A)\) is the lower triangle of matrix \(A\), similarly, \(U = \text{triu}(A)\) is the upper triangle of matrix \(A\). The operator \(op()\) is regarded as the matrix linear operation,\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]Notes
1. This routine supports sparse matrices in CSR, CSC, and TCSR formats.
2. If the matrix descriptor
descrspecifies that the matrix \(A\) is to be regarded as having a unitary diagonal, then the main diagonal entries of matrix \(A\) are not accessed and are all considered to be unitary.3. The input matrix need not be (upper or lower) triangular matrix, in the
descr, thefill_modeentity specifies which triangle to consider, namely, iffill_mode= aoclsparse_fill_mode_lower, then\[ op(L) \cdot x = \alpha \cdot b, \]otherwise, if
fill_mode= aoclsparse_fill_mode_upper, then\[ op(U) \cdot x = \alpha \cdot b, \]is solved.
4. To increase performance and if the matrix \(A\) is to be used more than once to solve for different right-hand sides
b, then it is encouraged to provide hints using aoclsparse_set_sv_hint() and aoclsparse_optimize(), otherwise, the optimization for the matrix will be done by the solver on entry.5. There is a
kid(Kernel ID) variation of TRSV, namely with a suffix of_kid, aoclsparse_strsv_kid() (and variations) where it is possible to specify the TRSV kernel to use (if possible).1/* ************************************************************************ 2 * Copyright (c) 2022-2023 Advanced Micro Devices, Inc. 3 * 4 * Permission is hereby granted, free of charge, to any person obtaining a copy 5 * of this software and associated documentation files (the "Software"), to deal 6 * in the Software without restriction, including without limitation the rights 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 * copies of the Software, and to permit persons to whom the Software is 9 * furnished to do so, subject to the following conditions: 10 * 11 * The above copyright notice and this permission notice shall be included in 12 * all copies or substantial portions of the Software. 13 * 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 * THE SOFTWARE. 21 * 22 * ************************************************************************ */ 23 24#include "aoclsparse.h" 25 26#include <iomanip> 27#include <iostream> 28 29int main() 30{ 31 std::cout << "-------------------------------" << std::endl 32 << " Triangle solve sample program" << std::endl 33 << "-------------------------------" << std::endl 34 << std::endl; 35 36 // Create a tri-diagonal matrix in CSR format 37 // | 1 2 0 0 | 38 // | 3 1 2 0 | 39 // | 0 3 1 2 | 40 // | 0 0 3 1 | 41 aoclsparse_int n = 4, m = 4, nnz = 10; 42 aoclsparse_int icrow[5] = {0, 2, 5, 8, 10}; 43 aoclsparse_int icol[18] = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3}; 44 double aval[18] = {1, 2, 3, 1, 2, 3, 1, 2, 3, 1}; 45 aoclsparse_status status; 46 47 // create aoclsparse matrix and its descriptor 48 aoclsparse_matrix A; 49 aoclsparse_index_base base = aoclsparse_index_base_zero; 50 aoclsparse_mat_descr descr_a; 51 aoclsparse_operation trans = aoclsparse_operation_none; 52 aoclsparse_create_dcsr(&A, base, m, n, nnz, icrow, icol, aval); 53 aoclsparse_create_mat_descr(&descr_a); 54 55 // A = L + D + U where: 56 // - L and U are the strict lower and upper triangle of A 57 // - D is the diagonal 58 // Use the matrix descriptor to solve (L+D)x = b 59 // b = [1, 4, 4, 4]^t 60 double b[4] = {1, 4, 4, 4}; 61 aoclsparse_set_mat_type(descr_a, aoclsparse_matrix_type_triangular); 62 aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_lower); 63 aoclsparse_set_mat_index_base(descr_a, base); 64 status = aoclsparse_set_sv_hint(A, trans, descr_a, 1); 65 if(status != aoclsparse_status_success) 66 { 67 std::cerr << "Error setting SV hint, status = " << status << std::endl; 68 return 1; 69 } 70 status = aoclsparse_optimize(A); 71 if(status != aoclsparse_status_success) 72 { 73 std::cerr << "Error returned from aoclsparse_optimize, status = " << status << "." 74 << std::endl; 75 return 2; 76 } 77 // Call the triangle solver 78 double alpha = 1.0; 79 double x[n]; 80 status = aoclsparse_dtrsv(trans, alpha, A, descr_a, b, x); 81 if(status != aoclsparse_status_success) 82 { 83 std::cerr << "Error returned from aoclsparse_dtrsv, status = " << status << "." 84 << std::endl; 85 return 3; 86 } 87 88 // Print the result 89 std::cout << "Solving (L+D)x = b: " << std::endl << " x = "; 90 std::cout << std::fixed; 91 std::cout.precision(1); 92 for(aoclsparse_int i = 0; i < n; i++) 93 std::cout << x[i] << " "; 94 std::cout << std::endl; 95 96 // The same method can be used to solve (U+D)x = b 97 aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_upper); 98 status = aoclsparse_dtrsv(trans, alpha, A, descr_a, b, x); 99 if(status != aoclsparse_status_success) 100 { 101 std::cerr << "Error returned from aoclsparse_dtrsv, status = " << status << "." 102 << std::endl; 103 return 3; 104 } 105 106 // Print the result 107 std::cout << std::endl; 108 std::cout << "Solving (U+D)x = b: " << std::endl << " x = "; 109 for(aoclsparse_int i = 0; i < n; i++) 110 std::cout << x[i] << " "; 111 std::cout << std::endl; 112 113 // destroy the aoclsparse memory 114 aoclsparse_destroy_mat_descr(descr_a); 115 aoclsparse_destroy(&A); 116 117 return 0; 118}
1/* ************************************************************************ 2 * Copyright (c) 2023-2025 Advanced Micro Devices, Inc. 3 * 4 * Permission is hereby granted, free of charge, to any person obtaining a copy 5 * of this software and associated documentation files (the "Software"), to deal 6 * in the Software without restriction, including without limitation the rights 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 * copies of the Software, and to permit persons to whom the Software is 9 * furnished to do so, subject to the following conditions: 10 * 11 * The above copyright notice and this permission notice shall be included in 12 * all copies or substantial portions of the Software. 13 * 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 * THE SOFTWARE. 21 * 22 * ************************************************************************ */ 23 24#include "aoclsparse.h" 25 26#include <assert.h> 27#include <complex> 28#include <iomanip> 29#include <iostream> 30#include <limits> 31#include <vector> 32 33int main() 34{ 35 std::cout << "---------------------------------" << std::endl 36 << " TCSR matrix, TRSV sample program " << std::endl 37 << "---------------------------------" << std::endl 38 << std::endl; 39 40 /* The complex matrix A is the tri-diagonal matrix in TCSR format 41 * | 1+3i 2+5i 0 0 | 42 * | 3 1+2i 2-2i 0 | 43 * | 0 3 1 2+3i | 44 * | 0 0 3+1i 1+1i | 45 */ 46 const aoclsparse_int n = 4, m = 4, nnz = 10; 47 aoclsparse_int row_ptr_L[5] = {0, 1, 3, 5, 7}; 48 aoclsparse_int row_ptr_U[5] = {0, 2, 4, 6, 7}; 49 aoclsparse_int col_idx_L[7] = {0, 0, 1, 1, 2, 2, 3}; 50 aoclsparse_int col_idx_U[7] = {0, 1, 1, 2, 2, 3, 3}; 51 std::vector<aoclsparse_double_complex> val_L 52 = {{1., 3.}, {3., 0.}, {1., 2.}, {3., 0.}, {1., 0.}, {3., 1.}, {1., 1.}}; 53 std::vector<aoclsparse_double_complex> val_U 54 = {{1., 3.}, {2., 5.}, {1., 2.}, {2., -2.}, {1., 0.}, {2., 3.}, {1., 1.}}; 55 aoclsparse_status status; 56 std::vector<aoclsparse_double_complex> ref; 57 58 // create aoclsparse matrix and its descriptor 59 aoclsparse_matrix A; 60 aoclsparse_index_base base = aoclsparse_index_base_zero; 61 aoclsparse_mat_descr descr_a; 62 aoclsparse_operation trans = aoclsparse_operation_none; 63 64 // Create TCSR matrix 65 status = aoclsparse_create_ztcsr(&A, 66 base, 67 m, 68 n, 69 nnz, 70 row_ptr_L, 71 row_ptr_U, 72 col_idx_L, 73 col_idx_U, 74 val_L.data(), 75 val_U.data()); 76 if(status != aoclsparse_status_success) 77 { 78 std::cerr << "Error creating the matrix, status = " << status << std::endl; 79 return 1; 80 } 81 aoclsparse_create_mat_descr(&descr_a); 82 83 /* Solve the lower triangular system Lx = b, 84 * here alpha=1 and b = [1+i, 4+2i, 4+i, 4]. 85 */ 86 std::vector<aoclsparse_double_complex> b = {{1., 1.}, {4., 2.}, {4., 1}, {4., 0}}; 87 aoclsparse_set_mat_type(descr_a, aoclsparse_matrix_type_triangular); 88 aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_lower); 89 status = aoclsparse_set_sv_hint(A, trans, descr_a, 1); 90 if(status != aoclsparse_status_success) 91 { 92 std::cerr << "Error setting SV hint, status = " << status << std::endl; 93 return 1; 94 } 95 status = aoclsparse_optimize(A); 96 if(status != aoclsparse_status_success) 97 { 98 std::cerr << "Error returned from aoclsparse_optimize, status = " << status << "." 99 << std::endl; 100 return 2; 101 } 102 103 // Call the triangle solver 104 aoclsparse_double_complex alpha = {1.0, 0.}; 105 aoclsparse_double_complex x[n]; 106 status = aoclsparse_ztrsv(trans, alpha, A, descr_a, b.data(), x); 107 if(status != aoclsparse_status_success) 108 { 109 std::cerr << "Error returned from aoclsparse_ztrsv, status = " << status << "." 110 << std::endl; 111 return 3; 112 } 113 114 // Print and check the result 115 double tol = 20 * std::numeric_limits<double>::epsilon(); 116 ref.assign({{0.4, -0.2}, {1.6, -0.6}, {-0.8, 2.8}, {0.8, -8.4}}); 117 std::cout << "Solving Lx = alpha b: " << std::endl << " x = "; 118 std::cout << std::fixed; 119 std::cout.precision(1); 120 bool oki, ok = true; 121 for(aoclsparse_int i = 0; i < n; i++) 122 { 123 oki = std::abs(x[i].real - ref[i].real) <= tol && std::abs(x[i].imag - ref[i].imag) <= tol; 124 std::cout << "(" << x[i].real << ", " << x[i].imag << "i) " << (oki ? " " : "! "); 125 ok &= oki; 126 } 127 std::cout << std::endl; 128 129 /* Solve the lower triangular system U^Hx = b, 130 * here alpha=1-i and b is unchanged. 131 */ 132 alpha = {1, -1}; 133 // Indicate to use only the conjugate transpose of the upper part of A 134 trans = aoclsparse_operation_conjugate_transpose; 135 aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_upper); 136 status = aoclsparse_ztrsv(trans, alpha, A, descr_a, b.data(), x); 137 if(status != aoclsparse_status_success) 138 { 139 std::cerr << "Error returned from aoclsparse_ztrsv, status = " << status << "." 140 << std::endl; 141 return 3; 142 } 143 144 // Print and check the result 145 ref.assign({{0.2, 0.6}, {1.4, 0.6}, {3.4, -7.0}, {-1.0, 19.2}}); 146 std::cout << std::endl; 147 std::cout << "Solving U^Hx = alpha*b: " << std::endl << " x = "; 148 for(aoclsparse_int i = 0; i < n; i++) 149 { 150 oki = std::abs(x[i].real - ref[i].real) <= tol && std::abs(x[i].imag - ref[i].imag) <= tol; 151 std::cout << "(" << x[i].real << ", " << x[i].imag << "i) " << (oki ? " " : "! "); 152 ok &= oki; 153 } 154 std::cout << std::endl; 155 156 // Destroy the aoclsparse memory 157 aoclsparse_destroy_mat_descr(descr_a); 158 aoclsparse_destroy(&A); 159 160 return ok ? 0 : 4; 161}
- Parameters:
trans – [in] matrix operation type, either aoclsparse_operation_none, aoclsparse_operation_transpose, or aoclsparse_operation_conjugate_transpose.
alpha – [in] scalar \(\alpha\), used to multiply right-hand side vector \(b\).
A – [inout] matrix containing data used to represent the \(m \times m\) triangular linear system to solve.
descr – [in] matrix descriptor. Supported matrix types are aoclsparse_matrix_type_symmetric and aoclsparse_matrix_type_triangular.
b – [in] array of
melements, storing the right-hand side.x – [out] array of
melements, storing the solution if solver returns aoclsparse_status_success.
- Return values:
aoclsparse_status_success – the operation completed successfully and \(x\) contains the solution to the linear system of equations.
aoclsparse_status_invalid_size – matrix \(A\) or \(op(A)\) is invalid.
aoclsparse_status_invalid_pointer – One or more of
A,descr,x,bare invalid pointers.aoclsparse_status_internal_error – an internal error occurred.
aoclsparse_status_not_implemented – the requested operation is not yet implemented.
other – possible failure values from a call to aoclsparse_optimize.
-
aoclsparse_status aoclsparse_strsv_strided(aoclsparse_operation trans, const float alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const float *b, const aoclsparse_int incb, float *x, const aoclsparse_int incx)#
-
aoclsparse_status aoclsparse_dtrsv_strided(aoclsparse_operation trans, const double alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const double *b, const aoclsparse_int incb, double *x, const aoclsparse_int incx)#
-
aoclsparse_status aoclsparse_ctrsv_strided(aoclsparse_operation trans, const aoclsparse_float_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_float_complex *b, const aoclsparse_int incb, aoclsparse_float_complex *x, const aoclsparse_int incx)#
-
aoclsparse_status aoclsparse_ztrsv_strided(aoclsparse_operation trans, const aoclsparse_double_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_double_complex *b, const aoclsparse_int incb, aoclsparse_double_complex *x, const aoclsparse_int incx)#
This is a variation of TRSV, namely with a suffix of
_strided, allows to set the stride for the dense vectorsbandx.For full details refer to
aoclsparse_?trsv().- Parameters:
trans – [in] matrix operation type, either aoclsparse_operation_none, aoclsparse_operation_transpose, or aoclsparse_operation_conjugate_transpose.
alpha – [in] scalar \(\alpha\), used to multiply right-hand side vector \(b\).
A – [inout] matrix containing data used to represent the \(m \times m\) triangular linear system to solve.
descr – [in] matrix descriptor. Supported matrix types are aoclsparse_matrix_type_symmetric and aoclsparse_matrix_type_triangular.
b – [in] array of
melements, storing the right-hand side.incb – [in] a positive integer holding the stride value for \(b\) vector.
x – [out] array of
melements, storing the solution if solver returns aoclsparse_status_success.incx – [in] a positive integer holding the stride value for \(x\) vector.
-
aoclsparse_status aoclsparse_strsv_kid(aoclsparse_operation trans, const float alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const float *b, float *x, aoclsparse_int kid)#
-
aoclsparse_status aoclsparse_dtrsv_kid(aoclsparse_operation trans, const double alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const double *b, double *x, aoclsparse_int kid)#
-
aoclsparse_status aoclsparse_ctrsv_kid(aoclsparse_operation trans, const aoclsparse_float_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_float_complex *b, aoclsparse_float_complex *x, aoclsparse_int kid)#
-
aoclsparse_status aoclsparse_ztrsv_kid(aoclsparse_operation trans, const aoclsparse_double_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_double_complex *b, aoclsparse_double_complex *x, aoclsparse_int kid)#
Sparse triangular solver for real/complex single and double data precisions (kernel flag variation).
For full details refer to
aoclsparse_?trsv().This variation of TRSV, namely with a suffix of
_kid, allows to choose which TRSV kernel to use (if possible). Currently the possible choices are:kid=0Reference implementation (No explicit AVX instructions).
kid=1Alias to
kid=2(Kernel Template AVX 256-bit implementation)kid=2Kernel Template version using
AVX2extensions.kid=3Kernel Template version using
AVX512F+CPU extensions.
Any other Kernel ID value will default to
kid= 0.- Parameters:
trans – [in] matrix operation type, either aoclsparse_operation_none, aoclsparse_operation_transpose, or aoclsparse_operation_conjugate_transpose.
alpha – [in] scalar \(\alpha\), used to multiply right-hand side vector \(b\).
A – [inout] matrix containing data used to represent the \(m \times m\) triangular linear system to solve.
descr – [in] matrix descriptor. Supported matrix types are aoclsparse_matrix_type_symmetric and aoclsparse_matrix_type_triangular.
b – [in] array of
melements, storing the right-hand side.x – [out] array of
melements, storing the solution if solver returns aoclsparse_status_success.kid – [in] Kernel ID, hints a request on which TRSV kernel to use.