Sparse BLAS level 1, 2, and 3 functions#
aoclsparse_functions.h provides AMD CPU hardware optimized level 1, 2, and 3
Sparse Linear Algebra Subprograms (Sparse BLAS).
Level 1#
The sparse level 1 routines describe operations between a vector in sparse format and a vector in dense format.
This section describes all provided level 1 sparse linear algebra functions.
aoclsparse_?axpyi()#
-
aoclsparse_status aoclsparse_saxpyi(const aoclsparse_int nnz, const float a, const float *x, const aoclsparse_int *indx, float *y)#
-
aoclsparse_status aoclsparse_daxpyi(const aoclsparse_int nnz, const double a, const double *x, const aoclsparse_int *indx, double *y)#
-
aoclsparse_status aoclsparse_caxpyi(const aoclsparse_int nnz, const void *a, const void *x, const aoclsparse_int *indx, void *y)#
-
aoclsparse_status aoclsparse_zaxpyi(const aoclsparse_int nnz, const void *a, const void *x, const aoclsparse_int *indx, void *y)#
A variant of sparse vector-vector addition between compressed sparse vector and dense vector.
aoclsparse_?axpyiadds a scalar multiple of compressed sparse vector to a dense vector.Let \(y\in R^m\) (or \(C^m\)) be a dense vector, \(x\) be a compressed sparse vector and \(I_x\) be the nonzero indices set for
xof length at leastnnzdescribed byindx, then\[ y_{I_{x_{i}}} = a\,x_i + y_{I_{x_{i}}}, \quad i\in\{1,\ldots,{\bf\mathsf{nnz}}\}. \]1/* ************************************************************************ 2 * Copyright (c) 2023-2024 Advanced Micro Devices, Inc. 3 * 4 * Permission is hereby granted, free of charge, to any person obtaining a copy 5 * of this software and associated documentation files (the "Software"), to deal 6 * in the Software without restriction, including without limitation the rights 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 * copies of the Software, and to permit persons to whom the Software is 9 * furnished to do so, subject to the following conditions: 10 * 11 * The above copyright notice and this permission notice shall be included in 12 * all copies or substantial portions of the Software. 13 * 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 * THE SOFTWARE. 21 * 22 * ************************************************************************ */ 23 24#include "aoclsparse.h" 25 26#include <cfloat> 27#include <cmath> 28#include <complex> 29#include <iomanip> 30#include <iostream> 31#include <vector> 32 33/* Sample program to illustrate the usage of axpyi API which performs vector-vector 34 * addition operation of the form y = ax + y, where 35 * a: scalar value 36 * x: compressed sparse vector 37 * y: dense vector 38 * Below example is for complex double precision. Using other precisions are also similar. 39 */ 40int main(void) 41{ 42 std::cout << "--------------------------------" << std::endl 43 << "----- axpyi sample program -----" << std::endl 44 << "--------------------------------" << std::endl 45 << std::endl; 46 47 // Number of non-zeros of the sparse vector 48 const aoclsparse_int nnz = 3; 49 // Scalar value 50 const std::complex<double> a = {10, 20}; 51 52 // Sparse index vector (does not need to be ordered) 53 std::vector<aoclsparse_int> indx = {0, 6, 3}; 54 // Sparse value vector in compressed form 55 std::vector<std::complex<double>> x{{2, 3}, {4, 1}, {5, 1}}; 56 57 // Output vector 58 std::vector<std::complex<double>> y{{0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {65, -999}}; 59 60 // Expected output vector 61 std::vector<std::complex<double>> y_exp{ 62 {-40, 70}, {0, 0}, {0, 0}, {30, 110}, {0, 0}, {0, 0}, {85, -909}}; 63 64 aoclsparse_int ny = y.size(); 65 66 aoclsparse_status status; 67 68 std::cout << "Invoking aoclsparse_zaxpyi...\n"; 69 //Invoke complex axpyi 70 status = aoclsparse_zaxpyi(nnz, &a, x.data(), indx.data(), y.data()); 71 if(status != aoclsparse_status_success) 72 { 73 std::cerr << "Error returned from aoclsparse_zaxpyi, status = " << status << "." 74 << std::endl; 75 return 3; 76 } 77 78 // Check and print the result 79 std::cout << "vector-vector addition of y: " << std::endl 80 << std::setw(10) << "y" 81 << " " << std::setw(20) << "expected y" << std::endl; 82 std::cout << std::fixed; 83 std::cout.precision(4); 84 bool oki, ok = true; 85 for(aoclsparse_int i = 0; i < ny; i++) 86 { 87 oki = y[i] == y_exp[i]; 88 ok &= oki; 89 std::cout << std::setw(15) << y[i] << std::setw(5) << "" << std::setw(15) << y_exp[i] 90 << std::setw(5) << (oki ? "" : " ! Error") << std::endl; 91 } 92 return (ok ? 0 : 6); 93}
Note
The contents of the vectors are not checked for NaNs.
- Parameters
nnz – [in] The number of elements in \(x\) and
indx.a – [in] Scalar value.
x – [in] Sparse vector stored in compressed form of at least
nnzelements.indx – [in] Nonzero indices set, \(I_x\), of
xdescribed by this array of length at leastnnz. The elements in this vector are only checked for non-negativity. The caller should make sure that all indices are less than the size ofy. Array is assumed to be in zero base.y – [inout] Array of at least \(\max(I_{x_i}, i \in \{ 1,\ldots,{\bf\mathsf{nnz}}\})\) elements.
- Return values
aoclsparse_status_success – The operation completed successfully.
aoclsparse_status_invalid_pointer – At least one of the pointers
x,indx,yis invalid.aoclsparse_status_invalid_size – Indicates that provided
nnzis less than zero.aoclsparse_status_invalid_index_value – At least one of the indices in indx is negative.
aoclsparse_?dotci()#
-
aoclsparse_status aoclsparse_cdotci(const aoclsparse_int nnz, const void *x, const aoclsparse_int *indx, const void *y, void *dot)#
-
aoclsparse_status aoclsparse_zdotci(const aoclsparse_int nnz, const void *x, const aoclsparse_int *indx, const void *y, void *dot)#
Sparse conjugate dot product for single and double data precision complex types.
aoclsparse_cdotci() (complex float) and aoclsparse_zdotci() (complex double) compute the dot product of the conjugate of a complex vector stored in a compressed format and a complex dense vector. Let \(x\) and \(y\) be respectively a sparse and dense vectors in \(C^m\) with
indx( \(I_x\)) the nonzero indices array ofxof length at leastnnzthat is used to index into the entries of dense vector \(y\), then these functions return\[ {\bf\mathsf{dot}} = \sum_{i=0}^{{\bf\mathsf{nnz}}-1} \overline{\,x_i\,} \cdot y_{I_{x_i}}. \]1/* ************************************************************************ 2 * Copyright (c) 2023 Advanced Micro Devices, Inc. 3 * 4 * Permission is hereby granted, free of charge, to any person obtaining a copy 5 * of this software and associated documentation files (the "Software"), to deal 6 * in the Software without restriction, including without limitation the rights 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 * copies of the Software, and to permit persons to whom the Software is 9 * furnished to do so, subject to the following conditions: 10 * 11 * The above copyright notice and this permission notice shall be included in 12 * all copies or substantial portions of the Software. 13 * 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 * THE SOFTWARE. 21 * 22 * ************************************************************************ */ 23 24#include <complex> 25#include <iostream> 26#include <vector> 27 28/* One can use std::complex<double> instead of aoclsparse_double_complex 29#define STD_CMPLX (or pass it as a compilation flag -DSTD_CMPLX) 30#ifdef STD_CMPLX 31#define aoclsparse_double_complex std::complex<double> 32#endif 33*/ 34 35#include "aoclsparse.h" 36 37int main(void) 38{ 39 aoclsparse_status status; 40 const aoclsparse_int nnz = 3; 41 std::vector<aoclsparse_int> indx = {0, 1, 2}; 42 std::vector<aoclsparse_double_complex> x{{1, 1}, {2, 2}, {3, 3}}; 43 std::vector<aoclsparse_double_complex> y{ 44 {1, 1}, {1, 2}, {2, 3}, {1, 0}, {4, 1}, {1.2, -1}, {7, -1}, {0, 2}, {-2, 3}}; 45 aoclsparse_double_complex dot_exp{-5, 23}; 46 aoclsparse_double_complex dotc_exp{23, 5}; 47 aoclsparse_double_complex dotp; 48 bool ok; 49 50 std::cout << "Invoking aoclsparse_zdotui...\n"; 51 52 //Invoke complex double dot product 53 status = aoclsparse_zdotui(nnz, x.data(), indx.data(), y.data(), &dotp); 54 if(status != aoclsparse_status_success) 55 { 56 std::cerr << "Error returned from aoclsparse_cdotui, status = " << status << "." 57 << std::endl; 58 return 1; 59 } 60 61#ifdef STD_CMPLX 62 std::cout << "Dot product: " << dotp << ", Expected dot product: " << dot_exp << std::endl; 63 ok = (dotp == dot_exp); 64#else 65 std::cout << "Dot product: (" << dotp.real << ", " << dotp.imag << "), Expected dot product: (" 66 << dot_exp.real << ", " << dot_exp.imag << ")" << std::endl; 67 ok = ((dotp.real == dot_exp.real) && (dotp.imag == dot_exp.imag)); 68#endif 69 70 std::cout << "Invoking aoclsparse_zdotci...\n"; 71 72 //Invoke complex double dot product 73 status = aoclsparse_zdotci(nnz, x.data(), indx.data(), y.data(), &dotp); 74 if(status != aoclsparse_status_success) 75 { 76 std::cerr << "Error returned from aoclsparse_cdotci, status = " << status << "." 77 << std::endl; 78 return 2; 79 } 80#ifdef STD_CMPLX 81 std::cout << "Conjugated dot product: " << dotp 82 << ", Expected conjugated dot product: " << dotc_exp << std::endl; 83 ok &= (dotp == dotc_exp); 84 return ok ? 0 : 3; 85 86#else 87 std::cout << "Dot product: (" << dotp.real << ", " << dotp.imag << "), Expected dot product: (" 88 << dotc_exp.real << ", " << dotc_exp.imag << ")" << std::endl; 89 ok &= ((dotp.real == dotc_exp.real) && (dotp.imag == dotc_exp.imag)); 90 return ok ? 0 : 4; 91#endif 92}
Note
The contents of the vectors are not checked for NaNs.
- Parameters
nnz – [in] The number of elements (length) of vectors
xandindx.x – [in] Array of at least
nnzcomplex elements.indx – [in] Nonzero indices set, \(I_x\), of
xdescribed by this array of length at leastnnz. Each entry must contain a valid index intoyand be unique. The entries ofindxare not checked for validity.y – [in] Array of at least \(\max(I_{x_i}, i \in \{ 1,\ldots,{\bf\mathsf{nnz}}\})\) complex elements.
dot – [out] The dot product of the conjugate of \(x\) and \(y\) when
nnz\(> 0\). Ifnnz\(\le 0\),dotis set to 0.
- Return values
aoclsparse_status_success – The operation completed successfully.
aoclsparse_status_invalid_pointer – At least one of the pointers
x,indx,y,dotis invalid.aoclsparse_status_invalid_size – Indicates that the provided
nnzis not positive.
aoclsparse_?dotui()#
-
aoclsparse_status aoclsparse_cdotui(const aoclsparse_int nnz, const void *x, const aoclsparse_int *indx, const void *y, void *dot)#
-
aoclsparse_status aoclsparse_zdotui(const aoclsparse_int nnz, const void *x, const aoclsparse_int *indx, const void *y, void *dot)#
Sparse dot product for single and double data precision complex types.
aoclsparse_cdotui() (complex float) and aoclsparse_zdotui() (complex double) compute the dot product of a complex vector stored in a compressed format and a complex dense vector. Let \(x\) and \(y\) be respectively a sparse and dense vectors in \(C^m\) with
indx( \(I_x\)) the nonzero indices array ofxof length at leastnnzthat is used to index into the entries of dense vector \(y\), then these functions return\[ {\bf\mathsf{dot}} = \sum_{i=0}^{{\bf\mathsf{nnz}}-1} x_{i} \cdot y_{I_{x_i}}. \]1/* ************************************************************************ 2 * Copyright (c) 2023 Advanced Micro Devices, Inc. 3 * 4 * Permission is hereby granted, free of charge, to any person obtaining a copy 5 * of this software and associated documentation files (the "Software"), to deal 6 * in the Software without restriction, including without limitation the rights 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 * copies of the Software, and to permit persons to whom the Software is 9 * furnished to do so, subject to the following conditions: 10 * 11 * The above copyright notice and this permission notice shall be included in 12 * all copies or substantial portions of the Software. 13 * 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 * THE SOFTWARE. 21 * 22 * ************************************************************************ */ 23 24#include <complex> 25#include <iostream> 26#include <vector> 27 28/* One can use std::complex<double> instead of aoclsparse_double_complex 29#define STD_CMPLX (or pass it as a compilation flag -DSTD_CMPLX) 30#ifdef STD_CMPLX 31#define aoclsparse_double_complex std::complex<double> 32#endif 33*/ 34 35#include "aoclsparse.h" 36 37int main(void) 38{ 39 aoclsparse_status status; 40 const aoclsparse_int nnz = 3; 41 std::vector<aoclsparse_int> indx = {0, 1, 2}; 42 std::vector<aoclsparse_double_complex> x{{1, 1}, {2, 2}, {3, 3}}; 43 std::vector<aoclsparse_double_complex> y{ 44 {1, 1}, {1, 2}, {2, 3}, {1, 0}, {4, 1}, {1.2, -1}, {7, -1}, {0, 2}, {-2, 3}}; 45 aoclsparse_double_complex dot_exp{-5, 23}; 46 aoclsparse_double_complex dotc_exp{23, 5}; 47 aoclsparse_double_complex dotp; 48 bool ok; 49 50 std::cout << "Invoking aoclsparse_zdotui...\n"; 51 52 //Invoke complex double dot product 53 status = aoclsparse_zdotui(nnz, x.data(), indx.data(), y.data(), &dotp); 54 if(status != aoclsparse_status_success) 55 { 56 std::cerr << "Error returned from aoclsparse_cdotui, status = " << status << "." 57 << std::endl; 58 return 1; 59 } 60 61#ifdef STD_CMPLX 62 std::cout << "Dot product: " << dotp << ", Expected dot product: " << dot_exp << std::endl; 63 ok = (dotp == dot_exp); 64#else 65 std::cout << "Dot product: (" << dotp.real << ", " << dotp.imag << "), Expected dot product: (" 66 << dot_exp.real << ", " << dot_exp.imag << ")" << std::endl; 67 ok = ((dotp.real == dot_exp.real) && (dotp.imag == dot_exp.imag)); 68#endif 69 70 std::cout << "Invoking aoclsparse_zdotci...\n"; 71 72 //Invoke complex double dot product 73 status = aoclsparse_zdotci(nnz, x.data(), indx.data(), y.data(), &dotp); 74 if(status != aoclsparse_status_success) 75 { 76 std::cerr << "Error returned from aoclsparse_cdotci, status = " << status << "." 77 << std::endl; 78 return 2; 79 } 80#ifdef STD_CMPLX 81 std::cout << "Conjugated dot product: " << dotp 82 << ", Expected conjugated dot product: " << dotc_exp << std::endl; 83 ok &= (dotp == dotc_exp); 84 return ok ? 0 : 3; 85 86#else 87 std::cout << "Dot product: (" << dotp.real << ", " << dotp.imag << "), Expected dot product: (" 88 << dotc_exp.real << ", " << dotc_exp.imag << ")" << std::endl; 89 ok &= ((dotp.real == dotc_exp.real) && (dotp.imag == dotc_exp.imag)); 90 return ok ? 0 : 4; 91#endif 92}
Note
The contents of the vectors are not checked for NaNs.
- Parameters
nnz – [in] The number of elements (length) of vectors \(x\) and \(indx\).
x – [in] Array of at least
nnzcomplex elements.indx – [in] Nonzero indices set, \(I_x\), of
xdescribed by this array of length at leastnnz. Each entry must contain a valid index intoyand be unique. The entries ofindxare not checked for validity.y – [in] Array of at least \(\max(I_{x_i}, i \in \{ 1,\ldots,{\bf\mathsf{nnz}}\})\) complex elements.
dot – [out] The dot product of
xandywhennnz\(> 0\). Ifnnz\(\le 0\),dotis set to 0.
- Return values
aoclsparse_status_success – The operation completed successfully.
aoclsparse_status_invalid_pointer – At least one of the pointers
x,indx,y,dotis invalid.aoclsparse_status_invalid_size – Indicates that the provided
nnzis not positive.
aoclsparse_?doti()#
-
float aoclsparse_sdoti(const aoclsparse_int nnz, const float *x, const aoclsparse_int *indx, const float *y)#
-
double aoclsparse_ddoti(const aoclsparse_int nnz, const double *x, const aoclsparse_int *indx, const double *y)#
Sparse dot product for single and double data precision real types.
aoclsparse_sdoti() and aoclsparse_ddoti() compute the dot product of a real vector stored in a compressed format and a real dense vector. Let \(x\) and \(y\) be respectively a sparse and dense vectors in \(R^m\) with
indx( \(I_x\)) an indices array of length at leastnnzthat is used to index into the entries of dense vector \(y\), then these functions return\[ {\bf\mathsf{dot}} = \sum_{i=0}^{{\bf\mathsf{nnz}}-1} x_{i} \cdot y_{I_{x_i}}. \]Note
The contents of the vectors are not checked for NaNs.
- Parameters
nnz – [in] The number of elements to access in vectors
xandindx.x – [in] Array of at least
nnzelements.indx – [in] Nonzero indices set, \(I_x\), of
xdescribed by this array of length at leastnnz. Each entry must contain a valid index intoyand be unique. The entries ofindxare not checked for validity.y – [in] Array of at least \(\max(I_{x_i}, i \in \{ 1,\ldots,{\bf\mathsf{nnz}}\})\) elements.
- Return values
dot – Value of the dot product if
nnzis positive, otherwise returns 0.
aoclsparse_?sctr()#
-
aoclsparse_status aoclsparse_ssctr(const aoclsparse_int nnz, const float *x, const aoclsparse_int *indx, float *y)#
-
aoclsparse_status aoclsparse_dsctr(const aoclsparse_int nnz, const double *x, const aoclsparse_int *indx, double *y)#
-
aoclsparse_status aoclsparse_csctr(const aoclsparse_int nnz, const void *x, const aoclsparse_int *indx, void *y)#
-
aoclsparse_status aoclsparse_zsctr(const aoclsparse_int nnz, const void *x, const aoclsparse_int *indx, void *y)#
Sparse scatter for single and double precision real and complex types.
aoclsparse_?sctrscatter the elements of a compressed sparse vector into a dense vector.Let \(y\in R^m\) (or \(C^m\)) be a dense vector, and \(x\) be a compressed sparse vector with \(I_x\) be its nonzero indices set of length at least
nnzand described by the arrayindx, then\[ y_{I_{x_{i}}} = x_i, \quad i\in\{1,\ldots,{\bf\mathsf{nnz}}\}. \]1/* ************************************************************************ 2 * Copyright (c) 2023 Advanced Micro Devices, Inc. 3 * 4 * Permission is hereby granted, free of charge, to any person obtaining a copy 5 * of this software and associated documentation files (the "Software"), to deal 6 * in the Software without restriction, including without limitation the rights 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 * copies of the Software, and to permit persons to whom the Software is 9 * furnished to do so, subject to the following conditions: 10 * 11 * The above copyright notice and this permission notice shall be included in 12 * all copies or substantial portions of the Software. 13 * 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 * THE SOFTWARE. 21 * 22 * ************************************************************************ */ 23 24#include "aoclsparse.h" 25 26#include <cfloat> 27#include <cmath> 28#include <complex> 29#include <iomanip> 30#include <iostream> 31#include <vector> 32 33// Sample program to illustrate the usage of scatter of a compressed sparse vector to the full storage form (complex double precision) 34// Using other precisions are also similar to below example 35int main(void) 36{ 37 std::cout << "--------------------------------" << std::endl 38 << "---- Scatter sample program ----" << std::endl 39 << "--------------------------------" << std::endl 40 << std::endl; 41 42 // Number of non-zeros of the sparse vector 43 const aoclsparse_int nnz = 3; 44 45 // Sparse index vector (does not need to be ordered) 46 std::vector<aoclsparse_int> indx = {0, 3, 6}; 47 // Sparse value vector in compressed form 48 std::vector<std::complex<double>> x{{1.01, -1.13}, {2.4, -2.0}, {-0.3, 1.3}}; 49 50 // Output vector 51 std::vector<std::complex<double>> y{{0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}}; 52 53 // Expected output vector 54 std::vector<std::complex<double>> y_exp{ 55 {1.01, -1.13}, {0, 0}, {0, 0}, {2.4, -2.0}, {0, 0}, {0, 0}, {-0.3, 1.3}}; 56 57 aoclsparse_int ny = y.size(); 58 59 aoclsparse_status status; 60 61 std::cout << "Invoking aoclsparse_zsctr...\n"; 62 //Invoke complex scatter 63 status = aoclsparse_zsctr(nnz, x.data(), indx.data(), y.data()); 64 if(status != aoclsparse_status_success) 65 { 66 std::cerr << "Error returned from aoclsparse_zsctr, status = " << status << "." 67 << std::endl; 68 return 3; 69 } 70 71 // Check and print the result 72 std::cout << "Scatter to vector y: " << std::endl 73 << std::setw(10) << "y" 74 << " " << std::setw(17) << "expected y" << std::endl; 75 std::cout << std::fixed; 76 std::cout.precision(4); 77 bool oki, ok = true; 78 for(aoclsparse_int i = 0; i < ny; i++) 79 { 80 oki = y[i] == y_exp[i]; 81 ok &= oki; 82 std::cout << std::setw(11) << y[i] << std::setw(3) << "" << std::setw(11) << y_exp[i] 83 << std::setw(2) << (oki ? "" : " ! Error") << std::endl; 84 } 85 return (ok ? 0 : 6); 86}
Note
The contents of the vectors are not checked for NaNs.
- Parameters
nnz – [in] The number of elements to use from \(x\) and \({\bf\mathsf{indx}}\).
x – [in] Dense array of at least size \({\bf\mathsf{nnz}}\). The first \({\bf\mathsf{nnz}}\) elements are to be scattered.
indx – [in] Nonzero index set for \(x\) of size at least \({\bf\mathsf{nnz}}\). The first \({\bf\mathsf{nnz}}\) indices are used for the scattering. The elements in this vector are only checked for non-negativity. The user should make sure that index is less than the size of
y.y – [out] Array of at least \(\max(I_{x_i}, i \in \{ 1,\ldots,{\bf\mathsf{nnz}}\})\) elements.
- Return values
aoclsparse_status_success – The operation completed successfully.
aoclsparse_status_invalid_pointer – At least one of the pointers
x,indx,yis invalid.aoclsparse_status_invalid_size – Indicates that provided
nnzis less than zero.aoclsparse_status_invalid_index_value – At least one of the indices in indx is negative.
sparse_?sctrs()#
-
aoclsparse_status aoclsparse_ssctrs(const aoclsparse_int nnz, const float *x, aoclsparse_int stride, float *y)#
-
aoclsparse_status aoclsparse_dsctrs(const aoclsparse_int nnz, const double *x, aoclsparse_int stride, double *y)#
-
aoclsparse_status aoclsparse_csctrs(const aoclsparse_int nnz, const void *x, aoclsparse_int stride, void *y)#
-
aoclsparse_status aoclsparse_zsctrs(const aoclsparse_int nnz, const void *x, aoclsparse_int stride, void *y)#
Sparse scatter with stride for real/complex single and double data precisions.
aoclsparse_?sctrsscatters the elements of a compressed sparse vector into a dense vector using a stride.Let \(y\) be a dense vector of length \(n>0\), \(x\) be a compressed sparse vector with
nnz> 0 nonzeros, andstridebe a striding distance, then\[ y_{{\bf\mathsf{stride}} \times i} = x_i,\quad i\in\{1,\ldots,{\bf\mathsf{nnz}}\}.\]Note
Contents of the vector
xare accessed but not checked.- Parameters
nnz – [in] Number of nonzero elements to access in \(x\).
x – [in] Array of at least
nnzelements. The firstnnzelements are to be scattered intoy.stride – [in] (Positive) striding distance used to store elements in vector
y.y – [out] Array of size at least
stride\(\times\)nnz.
- Return values
aoclsparse_status_success – The operation completed successfully.
aoclsparse_status_invalid_pointer – At least one of the pointers
x,yis invalid.aoclsparse_status_invalid_size – Indicates that one or more of the values provided in
nnzorstrideis not positive.
aoclsparse_?roti()#
-
aoclsparse_status aoclsparse_sroti(const aoclsparse_int nnz, float *x, const aoclsparse_int *indx, float *y, const float c, const float s)#
-
aoclsparse_status aoclsparse_droti(const aoclsparse_int nnz, double *x, const aoclsparse_int *indx, double *y, const double c, const double s)#
Apply Givens rotation to single or double precision real vectors.
aoclsparse_sroti() and aoclsparse_droti() apply the Givens rotation on elements of two real vectors.
Let \(y\in R^m\) be a vector in full storage form, \(x\) be a vector in a compressed form and \(I_x\) its nonzero indices set of length at least
nnzdescribed by the arrayindx, then\[ x_i = {\bf\mathsf{c}} * x_i + {\bf\mathsf{s}} * y_{I_{x_{i}}}, \]\[ y_{I_{x_{i}}} = {\bf\mathsf{c}} * y_{I_{x_{i}}} - {\bf\mathsf{s}} * x_i, \]for \(i\in 1, \ldots, {\bf\mathsf{nnz}}\). The elements
c,sare scalars.1/* ************************************************************************ 2 * Copyright (c) 2023 Advanced Micro Devices, Inc. 3 * 4 * Permission is hereby granted, free of charge, to any person obtaining a copy 5 * of this software and associated documentation files (the "Software"), to deal 6 * in the Software without restriction, including without limitation the rights 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 * copies of the Software, and to permit persons to whom the Software is 9 * furnished to do so, subject to the following conditions: 10 * 11 * The above copyright notice and this permission notice shall be included in 12 * all copies or substantial portions of the Software. 13 * 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 * THE SOFTWARE. 21 * 22 * ************************************************************************ */ 23 24#include "aoclsparse.h" 25 26#include <cfloat> 27#include <cmath> 28#include <iomanip> 29#include <iostream> 30#include <limits> 31#include <vector> 32 33// Sample program to illustrate the usage of Givens rotation on a compressed sparse vector and a full storage vector(double precision) 34int main(void) 35{ 36 std::cout << "-----------------------------" << std::endl 37 << "---- roti sample program ----" << std::endl 38 << "-----------------------------" << std::endl 39 << std::endl; 40 41 // Input data 42 43 // Number of non-zeros of the sparse vector 44 const aoclsparse_int nnz = 5; 45 46 // Sparse index vector (does not need to be ordered) 47 std::vector<aoclsparse_int> indx{0, 3, 6, 7, 9}; 48 49 // Sparse value vector in compressed form 50 std::vector<double> x{-0.75, 4, -9.5, 46, 1.25}; 51 52 // Output vector 53 std::vector<double> y{-0.75, 0, 0, 4, 0, 0, -9.5, 46, 0, 1.25}; 54 55 // Expected output vectors 56 std::vector<double> y_exp{0, 0, 0, 0, 0, 0, 0, 0, 0, 0}; 57 std::vector<double> x_exp{-3, 16, -38, 184, 5}; 58 59 const double c = 2; 60 const double s = 2; 61 62 aoclsparse_int ny = y.size(); 63 aoclsparse_int nx = x.size(); 64 65 aoclsparse_status status; 66 std::cout << "Invoking aoclsparse_droti...\n"; 67 //Invoke aoclsparse_droti to apply Givens rotation 68 status = aoclsparse_droti(nnz, x.data(), indx.data(), y.data(), c, s); 69 70 if(status != aoclsparse_status_success) 71 { 72 std::cerr << "Error returned from aoclsparse_droti, status = " << status << "." 73 << std::endl; 74 return 3; 75 } 76 77 // Check and print the result 78 std::cout << "The vector y after Givens rotation: " << std::endl 79 << std::setw(11) << "y" 80 << " " << std::setw(11) << "expected" << std::endl; 81 std::cout << std::fixed; 82 83 std::cout.precision(2); 84 85 bool oki, ok_y = true, ok_x = true; 86 87 //Initializing precision tolerance range for double 88 const double tol = std::sqrt(std::numeric_limits<double>::epsilon()); 89 90 for(aoclsparse_int i = 0; i < ny; i++) 91 { 92 oki = std::abs(y[i] - y_exp[i]) <= tol; 93 ok_y &= oki; 94 std::cout << std::setw(11) << y[i] << std::setw(3) << "" << std::setw(11) << y_exp[i] 95 << std::setw(2) << (oki ? "" : " !") << std::endl; 96 } 97 std::cout << std::endl; 98 std::cout << "The vector x after Givens rotation: " << std::endl 99 << std::setw(11) << "x" 100 << " " << std::setw(11) << "expected" << std::endl; 101 std::cout << std::fixed; 102 for(aoclsparse_int i = 0; i < nx; i++) 103 { 104 oki = std::abs(x[i] - x_exp[i]) <= tol; 105 ok_x &= oki; 106 std::cout << std::setw(11) << x[i] << std::setw(3) << "" << std::setw(11) << x_exp[i] 107 << std::setw(2) << (oki ? "" : " !") << std::endl; 108 } 109 return ((ok_y && ok_x) ? 0 : 6); 110}
Note
The contents of the vectors are not checked for NaNs.
- Parameters
nnz – [in] The number of elements to use from \(x\) and \({\bf\mathsf{indx}}\).
x – [inout] Array \(x\) of at least \({\bf\mathsf{nnz}}\) elements in compressed form. The elements of the array are updated after applying the Givens rotation.
indx – [in] Nonzero index set of \(x\), \(I_x\), with at least \({\bf\mathsf{nnz}}\) elements. The first \({\bf\mathsf{nnz}}\) elements are used to apply the Givens rotation. The elements in this vector are only checked for non-negativity. The caller should make sure that each entry is less than the size of
yand are all distinct.y – [inout] Dense array of at least \(\max(I_{x_i}, \text{ for } i \in \{ 1,\ldots, {\bf\mathsf{nnz}}\})\) elements in full storage form. The elements of the array are updated after applying the Givens rotation.
c – [in] A scalar.
s – [in] A scalar.
- Return values
aoclsparse_status_success – The operation completed successfully.
aoclsparse_status_invalid_pointer – At least one of the pointers
x,indx,yis invalid.aoclsparse_status_invalid_size – Indicates that provided
nnzis less than zero.aoclsparse_status_invalid_index_value – At least one of the indices in indx is negative. With this error, the values of vectors x and y are undefined.
aoclsparse_?gthr()#
-
aoclsparse_status aoclsparse_sgthr(aoclsparse_int nnz, const float *y, float *x, const aoclsparse_int *indx)#
-
aoclsparse_status aoclsparse_dgthr(aoclsparse_int nnz, const double *y, double *x, const aoclsparse_int *indx)#
-
aoclsparse_status aoclsparse_cgthr(aoclsparse_int nnz, const void *y, void *x, const aoclsparse_int *indx)#
-
aoclsparse_status aoclsparse_zgthr(aoclsparse_int nnz, const void *y, void *x, const aoclsparse_int *indx)#
Gather elements from a dense vector and store them into a sparse vector.
The
aoclsparse_?gthris a group of functions that gather the elements indexed inindxfrom the dense vectoryinto the sparse vectorx.Let \(y\in R^m\) (or \(C^m\)) be a dense vector, \(x\) be a sparse vector from the same space and \(I_x\) be a set of indices of size \(0<\)
nnz\(\le m\) described byindx, then\[ x_i = y_{I_{x_i}}, i\in\{1,\ldots,{\bf\mathsf{nnz}}\}. \]For double precision complex vectors use aoclsparse_zgthr() and for single precision complex vectors use aoclsparse_cgthr().
1/* ************************************************************************ 2 * Copyright (c) 2023 Advanced Micro Devices, Inc. 3 * 4 * Permission is hereby granted, free of charge, to any person obtaining a copy 5 * of this software and associated documentation files (the "Software"), to deal 6 * in the Software without restriction, including without limitation the rights 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 * copies of the Software, and to permit persons to whom the Software is 9 * furnished to do so, subject to the following conditions: 10 * 11 * The above copyright notice and this permission notice shall be included in 12 * all copies or substantial portions of the Software. 13 * 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 * THE SOFTWARE. 21 * 22 * ************************************************************************ */ 23 24#include "aoclsparse.h" 25 26#include <cfloat> 27#include <cmath> 28#include <complex> 29#include <iomanip> 30#include <iostream> 31#include <vector> 32 33int main() 34{ 35 std::cout << "-------------------------------" << std::endl 36 << "---- Gather sample program ----" << std::endl 37 << "-------------------------------" << std::endl 38 << std::endl; 39 40 // Input data 41 42 // Number of non-zeros of the sparse vector 43 const aoclsparse_int nnz = 3; 44 45 // Sparse index vector (does not need to be ordered) 46 std::vector<aoclsparse_int> indx = {0, 5, 3}; 47 48 // Sparse value vector 49 std::vector<std::complex<double>> x(3); 50 51 // Dense vector 52 std::vector<std::complex<double>> y{ 53 {1, 1}, {1, 2}, {2, 3}, {1, 0}, {4, 1}, {1.2, -1}, {7, -1}, {0, 2}, {-2, 3}}; 54 55 // Expected sparse value vector 56 std::vector<std::complex<double>> x_exp{{1, 1}, {1.2, -1}, {1, 0}}; 57 58 // Expected dense vector 59 std::vector<std::complex<double>> y_exp{ 60 {0, 0}, {1, 2}, {2, 3}, {0, 0}, {4, 1}, {0, 0}, {7, -1}, {0, 2}, {-2, 3}}; 61 62 aoclsparse_int ny = y.size(); 63 64 aoclsparse_status status; 65 66 // Call the gather function 67 status = aoclsparse_zgthr(nnz, y.data(), x.data(), indx.data()); 68 if(status != aoclsparse_status_success) 69 { 70 std::cerr << "Error returned from aoclsparse_zgthr, status = " << status << "." 71 << std::endl; 72 return 1; 73 } 74 75 // Check and print the result 76 std::cout << "Gather from vector y: " << std::endl 77 << std::setw(11) << "x" 78 << " " << std::setw(11) << "expected" << std::endl; 79 std::cout << std::fixed; 80 std::cout.precision(1); 81 bool oki, ok = true; 82 for(aoclsparse_int i = 0; i < nnz; i++) 83 { 84 oki = x[i] == x_exp[i]; 85 ok &= oki; 86 std::cout << std::setw(11) << x[i] << std::setw(3) << "" << std::setw(11) << x_exp[i] 87 << std::setw(2) << (oki ? "" : " !") << std::endl; 88 } 89 90 // Call the gather with zero function 91 status = aoclsparse_zgthrz(nnz, y.data(), x.data(), indx.data()); 92 if(status != aoclsparse_status_success) 93 { 94 std::cerr << "Error returned from aoclsparse_zgthr, status = " << status << "." 95 << std::endl; 96 return 3; 97 } 98 99 // Check and print the result 100 std::cout << std::endl 101 << "Gather from vector y: " << std::endl 102 << std::setw(11) << "y" << std::setw(3) << "" << std::setw(11) << "expected" 103 << std::setw(3) << "" << std::setw(11) << "x" << std::setw(3) << "" << std::setw(11) 104 << "expected" << std::endl; 105 std::cout << std::fixed; 106 std::cout.precision(1); 107 bool oky; 108 for(aoclsparse_int i = 0; i < ny; i++) 109 { 110 oky = y[i] == y_exp[i]; 111 ok &= oky; 112 std::cout << std::setw(11) << y[i] << std::setw(3) << "" << std::setw(11) << y_exp[i] 113 << std::setw(2) << (oky ? "" : " !"); 114 if(i < nnz) 115 { 116 oki = x[i] == x_exp[i]; 117 ok &= oki; 118 std::cout << " " << std::setw(11) << x[i] << std::setw(3) << "" << std::setw(11) 119 << x_exp[i] << std::setw(2) << (oki ? "" : " !"); 120 } 121 std::cout << std::endl; 122 } 123 124 return (ok ? 0 : 6); 125}
Note
These functions assume that the indices stored in
indxare less than \(m\) without duplicate elements, and thatxandindxare pointers to vectors of size at leastnnz.- Parameters
nnz – [in] number of non-zero entries of \(x\). If
nnzis zero, then none of the entries of vectorsx,y, andindxare touched.y – [in] pointer to dense vector \(y\) of size at least \(m\).
x – [out] pointer to sparse vector \(x\) with at least
nnznon-zero elements.indx – [in] index vector of size
nnz, containing the indices of the non-zero values of \(x\). Indices should range from 0 to \(m-1\), need not be ordered. The elements in this vector are only checked for non-negativity. The user should make sure that no index is out-of-bound and that it does not contains any duplicates.
- Return values
aoclsparse_status_success – the operation completed successfully
aoclsparse_status_invalid_size –
nnzparameter value is negativeaoclsparse_status_invalid_pointer – at least one of the pointers
y,xorindxis invalidaoclsparse_status_invalid_index_value – at least one of the indices in
indxis negative
aoclsparse_?gthrz()#
-
aoclsparse_status aoclsparse_sgthrz(aoclsparse_int nnz, float *y, float *x, const aoclsparse_int *indx)#
-
aoclsparse_status aoclsparse_dgthrz(aoclsparse_int nnz, double *y, double *x, const aoclsparse_int *indx)#
-
aoclsparse_status aoclsparse_cgthrz(aoclsparse_int nnz, void *y, void *x, const aoclsparse_int *indx)#
-
aoclsparse_status aoclsparse_zgthrz(aoclsparse_int nnz, void *y, void *x, const aoclsparse_int *indx)#
Gather and zero out elements from a dense vector and store them into a sparse vector.
The
aoclsparse_?gthrzis a group of functions that gather the elementsindexed in
indxfrom the dense vectoryinto the sparse vectorx. The gathered elements in \(y\) are replaced by zero.Let \(y\in R^m\) (or \(C^m\)) be a dense vector, \(x\) be a sparse vector from the same space and \(I_x\) be a set of indices of size \(0<\)
nnz\(\le m\) described byindx, then\[ x_i = y_{I_{x_i}}, i\in\{1,\ldots,{\bf\mathsf{nnz}}\}, \text{ and after the assignment, } y_{I_{x_i}}=0, i\in\{1,\ldots,{\bf\mathsf{nnz}}\}. \]For double precision complex vectors use aoclsparse_zgthrz() and for single precision complex vectors use aoclsparse_cgthrz().
Note
These functions assume that the indices stored in
indxare less than \(m\) without duplicate elements, and thatxandindxare pointers to vectors of size at leastnnz.- Parameters
nnz – [in] number of non-zero entries of \(x\). If
nnzis zero, then none of the entries of vectorsx,y, andindxare touched.y – [in] pointer to dense vector \(y\) of size at least \(m\).
x – [out] pointer to sparse vector \(x\) with at least
nnznon-zero elements.indx – [in] index vector of size
nnz, containing the indices of the non-zero values of \(x\). Indices should range from 0 to \(m-1\), need not be ordered. The elements in this vector are only checked for non-negativity. The user should make sure that no index is out-of-bound and that it does not contains any duplicates.
- Return values
aoclsparse_status_success – the operation completed successfully
aoclsparse_status_invalid_size –
nnzparameter value is negativeaoclsparse_status_invalid_pointer – at least one of the pointers
y,xorindxis invalidaoclsparse_status_invalid_index_value – at least one of the indices in
indxis negative
aoclsparse_?gthrs()#
-
aoclsparse_status aoclsparse_sgthrs(aoclsparse_int nnz, const float *y, float *x, aoclsparse_int stride)#
-
aoclsparse_status aoclsparse_dgthrs(aoclsparse_int nnz, const double *y, double *x, aoclsparse_int stride)#
-
aoclsparse_status aoclsparse_cgthrs(aoclsparse_int nnz, const void *y, void *x, aoclsparse_int stride)#
-
aoclsparse_status aoclsparse_zgthrs(aoclsparse_int nnz, const void *y, void *x, aoclsparse_int stride)#
Gather elements from a dense vector using a stride and store them into a sparse vector.
The
aoclsparse_?gthrsis a group of functions that gather the elements from the dense vectoryusing a fixed stride distance and copies them into the sparse vectorx.Let \(y\in R^m\) (or \(C^m\)) be a dense vector, \(x\) be a sparse vector from the same space and
stridebe a (positive) striding distance, then \( x_i = y_{\text{stride} \times i}, \quad i\in\{1,\ldots,{\bf\mathsf{nnz}}\}. \)- Parameters
nnz – [in] Number of non-zero entries of \(x\). If
nnzis zero, then none of the entries of vectorsxandyare accessed. Note thatnnzmust be such thatstride\(\times\)nnzmust be less or equal to \(m\).y – [in] Pointer to dense vector \(y\) of size at least \(m\).
x – [out] Pointer to sparse vector \(x\) with at least
nnznon-zero elements.stride – [in] Striding distance used to access elements in the dense vector
y. It must be such thatstride\(\times\)nnzis less or equal to \(m\).
- Return values
aoclsparse_status_success – the operation completed successfully.
aoclsparse_status_invalid_size – at least one of the parameters
nnzorstridehas a negative value.aoclsparse_status_invalid_pointer – at least one of the pointers
y, orxis invalid.
Level 2#
This module holds all sparse level 2 routines.
The sparse level 2 routines describe operations between a matrix in sparse format and a vector in dense or sparse format.
aoclsparse_?mv()#
-
aoclsparse_status aoclsparse_smv(aoclsparse_operation op, const float *alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const float *x, const float *beta, float *y)#
-
aoclsparse_status aoclsparse_dmv(aoclsparse_operation op, const double *alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const double *x, const double *beta, double *y)#
-
aoclsparse_status aoclsparse_cmv(aoclsparse_operation op, const aoclsparse_float_complex *alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_float_complex *x, const aoclsparse_float_complex *beta, aoclsparse_float_complex *y)#
-
aoclsparse_status aoclsparse_zmv(aoclsparse_operation op, const aoclsparse_double_complex *alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_double_complex *x, const aoclsparse_double_complex *beta, aoclsparse_double_complex *y)#
Compute sparse matrix-vector multiplication for real/complex single and double data precisions.
The
aoclsparse_?mvperform sparse matrix-vector products of the form\[ y = \alpha \, op(A) \, x + \beta \, y, \]where, \(x\) and \(y\) are dense vectors, \(\alpha\) and \(\beta\) are scalars, and \(A\) is a sparse matrix structure. The matrix operation \(op()\) is defined as:\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{op}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{op}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{op}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]1/* ************************************************************************ 2 * Copyright (c) 2020-2023 Advanced Micro Devices, Inc. 3 * 4 * Permission is hereby granted, free of charge, to any person obtaining a copy 5 * of this software and associated documentation files (the "Software"), to deal 6 * in the Software without restriction, including without limitation the rights 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 * copies of the Software, and to permit persons to whom the Software is 9 * furnished to do so, subject to the following conditions: 10 * 11 * The above copyright notice and this permission notice shall be included in 12 * all copies or substantial portions of the Software. 13 * 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 * THE SOFTWARE. 21 * 22 * ************************************************************************ */ 23 24#include "aoclsparse.h" 25 26#include <iostream> 27 28#define M 5 29#define N 5 30#define NNZ 8 31 32int main(void) 33{ 34 aoclsparse_operation trans = aoclsparse_operation_none; 35 36 double alpha = 1.0; 37 double beta = 0.0; 38 39 // Print aoclsparse version 40 std::cout << aoclsparse_get_version() << std::endl; 41 42 // Create matrix descriptor 43 aoclsparse_mat_descr descr; 44 // aoclsparse_create_mat_descr set aoclsparse_matrix_type to aoclsparse_matrix_type_general 45 // and aoclsparse_index_base to aoclsparse_index_base_zero. 46 aoclsparse_create_mat_descr(&descr); 47 48 aoclsparse_index_base base = aoclsparse_index_base_zero; 49 50 // Initialise matrix 51 // 1 0 0 2 0 52 // 0 3 0 0 0 53 // 0 0 4 0 0 54 // 0 5 0 6 7 55 // 0 0 0 0 8 56 aoclsparse_int csr_row_ptr[M + 1] = {0, 2, 3, 4, 7, 8}; 57 aoclsparse_int csr_col_ind[NNZ] = {0, 3, 1, 2, 1, 3, 4, 4}; 58 double csr_val[NNZ] = {1, 2, 3, 4, 5, 6, 7, 8}; 59 aoclsparse_matrix A; 60 aoclsparse_create_dcsr(&A, base, M, N, NNZ, csr_row_ptr, csr_col_ind, csr_val); 61 62 // Initialise vectors 63 double x[N] = {1.0, 2.0, 3.0, 4.0, 5.0}; 64 double y[M]; 65 66 //to identify hint id(which routine is to be executed, destroyed later) 67 aoclsparse_set_mv_hint(A, trans, descr, 1); 68 69 // Optimize the matrix, "A" 70 aoclsparse_optimize(A); 71 72 std::cout << "Invoking aoclsparse_dmv.."; 73 //Invoke SPMV API (double precision) 74 aoclsparse_dmv(trans, &alpha, A, descr, x, &beta, y); 75 std::cout << "Done." << std::endl; 76 std::cout << "Output Vector:" << std::endl; 77 for(aoclsparse_int i = 0; i < M; i++) 78 std::cout << y[i] << std::endl; 79 80 aoclsparse_destroy_mat_descr(descr); 81 aoclsparse_destroy(&A); 82 return 0; 83}
1/* ************************************************************************ 2 * Copyright (c) 2023 Advanced Micro Devices, Inc. 3 * 4 * Permission is hereby granted, free of charge, to any person obtaining a copy 5 * of this software and associated documentation files (the "Software"), to deal 6 * in the Software without restriction, including without limitation the rights 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 * copies of the Software, and to permit persons to whom the Software is 9 * furnished to do so, subject to the following conditions: 10 * 11 * The above copyright notice and this permission notice shall be included in 12 * all copies or substantial portions of the Software. 13 * 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 * THE SOFTWARE. 21 * 22 * ************************************************************************ */ 23 24#include "aoclsparse.h" 25 26#include <math.h> 27#include <stdio.h> 28 29#define M 5 30#define N 5 31#define NNZ 8 32 33int main(void) 34{ 35 aoclsparse_status status; 36 aoclsparse_matrix A = NULL; 37 aoclsparse_mat_descr descr = NULL; 38 aoclsparse_operation trans = aoclsparse_operation_none; 39 aoclsparse_index_base base = aoclsparse_index_base_zero; 40 41 double alpha = 1.0; 42 double beta = 0.0; 43 44 // Input matrix 45 // 1 0 0 2 0 46 // 0 3 0 0 0 47 // 0 0 4 0 0 48 // 0 5 0 6 7 49 // 0 0 0 0 8 50 aoclsparse_int csr_row_ptr[M + 1] = {0, 2, 3, 4, 7, 8}; 51 aoclsparse_int csr_col_ind[NNZ] = {0, 3, 1, 2, 1, 3, 4, 4}; 52 double csr_val[NNZ] = {1, 2, 3, 4, 5, 6, 7, 8}; 53 54 // Input vectors 55 double x[N] = {1.0, 2.0, 3.0, 4.0, 5.0}; 56 double y[M]; 57 58 // Expected reference result & tolerance 59 double y_exp[M] = {9.0, 6.0, 12.0, 69.0, 40.0}; 60 double tol = 1e-12; 61 aoclsparse_int oki, ok = 1; 62 63 // Print aoclsparse version 64 printf("%s\n", aoclsparse_get_version()); 65 66 // Create matrix descriptor 67 // aoclsparse_create_mat_descr sets aoclsparse_matrix_type to aoclsparse_matrix_type_general 68 // and aoclsparse_index_base to aoclsparse_index_base_zero. 69 aoclsparse_create_mat_descr(&descr); 70 71 // Initialise sparse matrix 72 status = aoclsparse_create_dcsr(&A, base, M, N, NNZ, csr_row_ptr, csr_col_ind, csr_val); 73 if(status != aoclsparse_status_success) 74 { 75 printf("Error while creating a sparse matrix, status = %i\n", status); 76 return 1; 77 } 78 79 // Hint the system what operation to expect // to identify hint id(which routine is to be executed, destroyed later) 80 status = aoclsparse_set_mv_hint(A, trans, descr, 1); 81 if(status != aoclsparse_status_success) 82 { 83 printf("Error while hinting operation, status = %i\n", status); 84 return 1; 85 } 86 87 // Optimize the matrix 88 status = aoclsparse_optimize(A); 89 if(status != aoclsparse_status_success) 90 { 91 printf("Error while optimizing the matrix, status = %i\n", status); 92 return 1; 93 } 94 95 // Invoke SPMV API (double precision) 96 printf("Invoking aoclsparse_dmv...\n"); 97 status = aoclsparse_dmv(trans, &alpha, A, descr, x, &beta, y); 98 if(status != aoclsparse_status_success) 99 { 100 printf("Error while computing SPMV, status = %i\n", status); 101 return 1; 102 } 103 104 // Print and check the results 105 printf("Output vector:\n"); 106 for(aoclsparse_int i = 0; i < M; i++) 107 { 108 oki = fabs(y[i] - y_exp[i]) <= tol; 109 printf(" %lf %c\n", y[i], oki ? ' ' : '!'); 110 ok = ok && oki; 111 } 112 113 aoclsparse_destroy_mat_descr(descr); 114 aoclsparse_destroy(&A); 115 return ok ? 0 : 2; 116}
- Parameters
op – [in] Matrix operation,
opcan be one of aoclsparse_operation_none, aoclsparse_operation_conjugate_transpose, or aoclsparse_operation_conjugate_transpose.alpha – [in] Scalar \(\alpha\).
A – [in] The sparse matrix created using e.g. aoclsparse_create_scsr() or other variant. Matrix is considered of size \(m\) by \(n\).
descr – [in] Descriptor of the matrix. These functions support the following aoclsparse_matrix_type types: aoclsparse_matrix_type_general, aoclsparse_matrix_type_triangular, aoclsparse_matrix_type_symmetric, and aoclsparse_matrix_type_hermitian. Both base-zero and base-one are supported, however, the index base needs to match with the one defined in matrix
A.x – [in] An array of
nelements if \(op(A) = A\); or ofmelements if \(op(A) = A^T\) or \(op(A) = A^H\).beta – [in] Scalar \(\beta\).
y – [inout] An array of
melements if \(op(A) = A\); or ofnelements if \(op(A) = A^T\) or \(op(A) = A^H\).
- Return values
aoclsparse_status_success – The operation completed successfully.
aoclsparse_status_invalid_size – The value of
m,nornnzis invalid.aoclsparse_status_invalid_pointer –
descr,alpha, internal structures related to the sparse matrixA,x,betaoryhas an invalid pointer.aoclsparse_status_not_implemented – The requested functionality is not implemented.
aoclsparse_?trsv()#
-
aoclsparse_status aoclsparse_strsv(aoclsparse_operation trans, const float alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const float *b, float *x)#
-
aoclsparse_status aoclsparse_dtrsv(aoclsparse_operation trans, const double alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const double *b, double *x)#
-
aoclsparse_status aoclsparse_ctrsv(aoclsparse_operation trans, const aoclsparse_float_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_float_complex *b, aoclsparse_float_complex *x)#
-
aoclsparse_status aoclsparse_ztrsv(aoclsparse_operation trans, const aoclsparse_double_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_double_complex *b, aoclsparse_double_complex *x)#
Sparse triangular solver for real/complex single and double data precisions.
The function aoclsparse_strsv() and variants solve sparse lower (or upper) triangular linear system of equations. The system is defined by the sparse \(m \times m\) matrix \(A\), the dense solution \(m\)-vector \(x\), and the right-hand side dense \(m\)-vector \(b\). Vector \(b\) is multiplied by \(\alpha\). The solution \(x\) is estimated by solving
\[ op(L) \cdot x = \alpha \cdot b, \quad \text{ or } \quad op(U) \cdot x = \alpha \cdot b, \]where \(L = \text{tril}(A)\) is the lower triangle of matrix \(A\), similarly, \(U = \text{triu}(A)\) is the upper triangle of matrix \(A\). The operator \(op()\) is regarded as the matrix linear operation,\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]Notes
1. This routine supports only sparse matrices in CSR format.
2. If the matrix descriptor
descrspecifies that the matrix \(A\) is to be regarded has having a unitary diagonal, then the main diagonal entries of matrix \(A\) are not accessed and are all considered to be unitary.3. The input matrix need not be (upper or lower) triangular matrix, in the
descr, thefill_modeentity specifies which triangle to consider, namely, iffill_mode= aoclsparse_fill_mode_lower, then\[ op(L) \cdot x = \alpha \cdot b, \]otherwise, if
fill_mode= aoclsparse_fill_mode_upper, then\[ op(U) \cdot x = \alpha \cdot b, \]is solved.
4. To increase performance and if the matrix \(A\) is to be used more than once to solve for different right-hand sides
b, then it is encouraged to provide hints using aoclsparse_set_sv_hint() and aoclsparse_optimize(), otherwise, the optimization for the matrix will be done by the solver on entry.5. There is a
kid(Kernel ID) variation of TRSV, namely with a suffix of_kid, aoclsparse_strsv_kid() (and variations) where it is possible to specify the TRSV kernel to use (if possible).1/* ************************************************************************ 2 * Copyright (c) 2022-2023 Advanced Micro Devices, Inc. 3 * 4 * Permission is hereby granted, free of charge, to any person obtaining a copy 5 * of this software and associated documentation files (the "Software"), to deal 6 * in the Software without restriction, including without limitation the rights 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 * copies of the Software, and to permit persons to whom the Software is 9 * furnished to do so, subject to the following conditions: 10 * 11 * The above copyright notice and this permission notice shall be included in 12 * all copies or substantial portions of the Software. 13 * 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 * THE SOFTWARE. 21 * 22 * ************************************************************************ */ 23 24#include "aoclsparse.h" 25 26#include <iomanip> 27#include <iostream> 28 29int main() 30{ 31 std::cout << "-------------------------------" << std::endl 32 << " Triangle solve sample program" << std::endl 33 << "-------------------------------" << std::endl 34 << std::endl; 35 36 // Create a tri-diagonal matrix in CSR format 37 // | 1 2 0 0 | 38 // | 3 1 2 0 | 39 // | 0 3 1 2 | 40 // | 0 0 3 1 | 41 aoclsparse_int n = 4, m = 4, nnz = 10; 42 aoclsparse_int icrow[5] = {0, 2, 5, 8, 10}; 43 aoclsparse_int icol[18] = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3}; 44 double aval[18] = {1, 2, 3, 1, 2, 3, 1, 2, 3, 1}; 45 aoclsparse_status status; 46 47 // create aoclsparse matrix and its descriptor 48 aoclsparse_matrix A; 49 aoclsparse_index_base base = aoclsparse_index_base_zero; 50 aoclsparse_mat_descr descr_a; 51 aoclsparse_operation trans = aoclsparse_operation_none; 52 aoclsparse_create_dcsr(&A, base, m, n, nnz, icrow, icol, aval); 53 aoclsparse_create_mat_descr(&descr_a); 54 55 // A = L + D + U where: 56 // - L and U are the strict lower and upper triangle of A 57 // - D is the diagonal 58 // Use the matrix descriptor to solve (L+D)x = b 59 // b = [1, 4, 4, 4]^t 60 double b[4] = {1, 4, 4, 4}; 61 aoclsparse_set_mat_type(descr_a, aoclsparse_matrix_type_triangular); 62 aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_lower); 63 aoclsparse_set_mat_index_base(descr_a, base); 64 status = aoclsparse_set_sv_hint(A, trans, descr_a, 1); 65 if(status != aoclsparse_status_success) 66 { 67 std::cerr << "Error setting SV hint, status = " << status << std::endl; 68 return 1; 69 } 70 status = aoclsparse_optimize(A); 71 if(status != aoclsparse_status_success) 72 { 73 std::cerr << "Error returned from aoclsparse_optimize, status = " << status << "." 74 << std::endl; 75 return 2; 76 } 77 // Call the triangle solver 78 double alpha = 1.0; 79 double x[n]; 80 status = aoclsparse_dtrsv(trans, alpha, A, descr_a, b, x); 81 if(status != aoclsparse_status_success) 82 { 83 std::cerr << "Error returned from aoclsparse_dtrsv, status = " << status << "." 84 << std::endl; 85 return 3; 86 } 87 88 // Print the result 89 std::cout << "Solving (L+D)x = b: " << std::endl << " x = "; 90 std::cout << std::fixed; 91 std::cout.precision(1); 92 for(aoclsparse_int i = 0; i < n; i++) 93 std::cout << x[i] << " "; 94 std::cout << std::endl; 95 96 // The same method can be used to solve (U+D)x = b 97 aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_upper); 98 status = aoclsparse_dtrsv(trans, alpha, A, descr_a, b, x); 99 if(status != aoclsparse_status_success) 100 { 101 std::cerr << "Error returned from aoclsparse_dtrsv, status = " << status << "." 102 << std::endl; 103 return 3; 104 } 105 106 // Print the result 107 std::cout << std::endl; 108 std::cout << "Solving (U+D)x = b: " << std::endl << " x = "; 109 for(aoclsparse_int i = 0; i < n; i++) 110 std::cout << x[i] << " "; 111 std::cout << std::endl; 112 113 // destroy the aoclsparse memory 114 aoclsparse_destroy_mat_descr(descr_a); 115 aoclsparse_destroy(&A); 116 117 return 0; 118}
1/* ************************************************************************ 2 * Copyright (c) 2023-2024 Advanced Micro Devices, Inc. 3 * 4 * Permission is hereby granted, free of charge, to any person obtaining a copy 5 * of this software and associated documentation files (the "Software"), to deal 6 * in the Software without restriction, including without limitation the rights 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 * copies of the Software, and to permit persons to whom the Software is 9 * furnished to do so, subject to the following conditions: 10 * 11 * The above copyright notice and this permission notice shall be included in 12 * all copies or substantial portions of the Software. 13 * 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 * THE SOFTWARE. 21 * 22 * ************************************************************************ */ 23 24#include "aoclsparse.h" 25 26#include <assert.h> 27#include <complex> 28#include <iomanip> 29#include <iostream> 30#include <limits> 31#include <vector> 32 33int main() 34{ 35 std::cout << "-------------------------------" << std::endl 36 << " Triangle solve sample program" << std::endl 37 << "-------------------------------" << std::endl 38 << std::endl; 39 40 /* Solve two complex linear systems of equations 41 * Lx = alpha b, and U^Tx = alpha b, 42 * where L is the lower triangular part of A and 43 * U is the upper triangular part of A. U^H is 44 * the conjugate transpose of U (a lower triangular matrix). 45 * The complex matrix A is the tri-diagonal matrix in CSR format 46 * 47 * | 1+3i 2+5i 0 0 | 48 * | 3 1+2i 2-2i 0 | 49 * | 0 3 1 2+3i | 50 * | 0 0 3+1i 1+1i | 51 */ 52 const aoclsparse_int n = 4, m = 4, nnz = 10; 53 aoclsparse_int icrow[5] = {0, 2, 5, 8, 10}; 54 aoclsparse_int icol[18] = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3}; 55 std::vector<aoclsparse_double_complex> aval = {{1., 3.}, 56 {2., 5.}, 57 {3., 0.}, 58 {1., 2.}, 59 {2., -2.}, 60 {3., 0.}, 61 {1., 0.}, 62 {2., 3.}, 63 {3., 1.}, 64 {1., 1.}}; 65 aoclsparse_status status; 66 std::vector<aoclsparse_double_complex> ref; 67 68 // create aoclsparse matrix and its descriptor 69 aoclsparse_matrix A; 70 aoclsparse_index_base base = aoclsparse_index_base_zero; 71 aoclsparse_mat_descr descr_a; 72 aoclsparse_operation trans = aoclsparse_operation_none; 73 status = aoclsparse_create_zcsr(&A, base, m, n, nnz, icrow, icol, aval.data()); 74 if(status != aoclsparse_status_success) 75 { 76 std::cerr << "Error creating the matrix, status = " << status << std::endl; 77 return 1; 78 } 79 aoclsparse_create_mat_descr(&descr_a); 80 81 /* Solve the lower triangular system Lx = b, 82 * here alpha=1 and b = [1+i, 4+2i, 4+i, 4]. 83 */ 84 std::vector<aoclsparse_double_complex> b = {{1., 1.}, {4., 2.}, {4., 1}, {4., 0}}; 85 aoclsparse_set_mat_type(descr_a, aoclsparse_matrix_type_triangular); 86 aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_lower); 87 status = aoclsparse_set_sv_hint(A, trans, descr_a, 1); 88 if(status != aoclsparse_status_success) 89 { 90 std::cerr << "Error setting SV hint, status = " << status << std::endl; 91 return 1; 92 } 93 status = aoclsparse_optimize(A); 94 if(status != aoclsparse_status_success) 95 { 96 std::cerr << "Error returned from aoclsparse_optimize, status = " << status << "." 97 << std::endl; 98 return 2; 99 } 100 101 // Call the triangle solver 102 aoclsparse_double_complex alpha = {1.0, 0.}; 103 aoclsparse_double_complex x[n]; 104 status = aoclsparse_ztrsv(trans, alpha, A, descr_a, b.data(), x); 105 if(status != aoclsparse_status_success) 106 { 107 std::cerr << "Error returned from aoclsparse_ztrsv, status = " << status << "." 108 << std::endl; 109 return 3; 110 } 111 112 // Print and check the result 113 double tol = 20 * std::numeric_limits<double>::epsilon(); 114 ref.assign({{0.4, -0.2}, {1.6, -0.6}, {-0.8, 2.8}, {0.8, -8.4}}); 115 std::cout << "Solving Lx = alpha b: " << std::endl << " x = "; 116 std::cout << std::fixed; 117 std::cout.precision(1); 118 bool oki, ok = true; 119 for(aoclsparse_int i = 0; i < n; i++) 120 { 121 oki = std::abs(x[i].real - ref[i].real) <= tol && std::abs(x[i].imag - ref[i].imag) <= tol; 122 std::cout << "(" << x[i].real << ", " << x[i].imag << "i) " << (oki ? " " : "! "); 123 ok &= oki; 124 } 125 std::cout << std::endl; 126 127 /* Solve the lower triangular system U^Hx = b, 128 * here alpha=1-i and b is unchanged. 129 */ 130 alpha = {1, -1}; 131 // Indicate to use only the conjugate transpose of the upper part of A 132 trans = aoclsparse_operation_conjugate_transpose; 133 aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_upper); 134 status = aoclsparse_ztrsv(trans, alpha, A, descr_a, b.data(), x); 135 if(status != aoclsparse_status_success) 136 { 137 std::cerr << "Error returned from aoclsparse_ztrsv, status = " << status << "." 138 << std::endl; 139 return 3; 140 } 141 142 // Print and check the result 143 ref.assign({{0.2, 0.6}, {1.4, 0.6}, {3.4, -7.0}, {-1.0, 19.2}}); 144 std::cout << std::endl; 145 std::cout << "Solving U^Hx = alpha*b: " << std::endl << " x = "; 146 for(aoclsparse_int i = 0; i < n; i++) 147 { 148 oki = std::abs(x[i].real - ref[i].real) <= tol && std::abs(x[i].imag - ref[i].imag) <= tol; 149 std::cout << "(" << x[i].real << ", " << x[i].imag << "i) " << (oki ? " " : "! "); 150 ok &= oki; 151 } 152 std::cout << std::endl; 153 154 // Destroy the aoclsparse memory 155 aoclsparse_destroy_mat_descr(descr_a); 156 aoclsparse_destroy(&A); 157 158 return ok ? 0 : 4; 159}
- Parameters
trans – [in] matrix operation type, either aoclsparse_operation_none, aoclsparse_operation_transpose, or aoclsparse_operation_conjugate_transpose.
alpha – [in] scalar \(\alpha\), used to multiply right-hand side vector \(b\).
A – [inout] matrix containing data used to represent the \(m \times m\) triangular linear system to solve.
descr – [in] matrix descriptor. Supported matrix types are aoclsparse_matrix_type_symmetric and aoclsparse_matrix_type_triangular.
b – [in] array of
melements, storing the right-hand side.x – [out] array of
melements, storing the solution if solver returns aoclsparse_status_success.
- Return values
aoclsparse_status_success – the operation completed successfully and \(x\) contains the solution to the linear system of equations.
aoclsparse_status_invalid_size – matrix \(A\) or \(op(A)\) is invalid.
aoclsparse_status_invalid_pointer – One or more of
A,descr,x,bare invalid pointers.aoclsparse_status_internal_error – an internal error occurred.
aoclsparse_status_not_implemented – the requested operation is not yet implemented.
other – possible failure values from a call to aoclsparse_optimize.
-
aoclsparse_status aoclsparse_strsv_strided(aoclsparse_operation trans, const float alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const float *b, const aoclsparse_int incb, float *x, const aoclsparse_int incx)#
-
aoclsparse_status aoclsparse_dtrsv_strided(aoclsparse_operation trans, const double alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const double *b, const aoclsparse_int incb, double *x, const aoclsparse_int incx)#
-
aoclsparse_status aoclsparse_ctrsv_strided(aoclsparse_operation trans, const aoclsparse_float_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_float_complex *b, const aoclsparse_int incb, aoclsparse_float_complex *x, const aoclsparse_int incx)#
-
aoclsparse_status aoclsparse_ztrsv_strided(aoclsparse_operation trans, const aoclsparse_double_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_double_complex *b, const aoclsparse_int incb, aoclsparse_double_complex *x, const aoclsparse_int incx)#
This is a variation of TRSV, namely with a suffix of
_strided, allows to set the stride for the dense vectorsbandx.For full details refer to
aoclsparse_?trsv().- Parameters
trans – [in] matrix operation type, either aoclsparse_operation_none, aoclsparse_operation_transpose, or aoclsparse_operation_conjugate_transpose.
alpha – [in] scalar \(\alpha\), used to multiply right-hand side vector \(b\).
A – [inout] matrix containing data used to represent the \(m \times m\) triangular linear system to solve.
descr – [in] matrix descriptor. Supported matrix types are aoclsparse_matrix_type_symmetric and aoclsparse_matrix_type_triangular.
b – [in] array of
melements, storing the right-hand side.incb – [in] a positive integer holding the stride value for \(b\) vector.
x – [out] array of
melements, storing the solution if solver returns aoclsparse_status_success.incx – [in] a positive integer holding the stride value for \(x\) vector.
-
aoclsparse_status aoclsparse_strsv_kid(aoclsparse_operation trans, const float alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const float *b, float *x, aoclsparse_int kid)#
-
aoclsparse_status aoclsparse_dtrsv_kid(aoclsparse_operation trans, const double alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const double *b, double *x, aoclsparse_int kid)#
-
aoclsparse_status aoclsparse_ctrsv_kid(aoclsparse_operation trans, const aoclsparse_float_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_float_complex *b, aoclsparse_float_complex *x, aoclsparse_int kid)#
-
aoclsparse_status aoclsparse_ztrsv_kid(aoclsparse_operation trans, const aoclsparse_double_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_double_complex *b, aoclsparse_double_complex *x, aoclsparse_int kid)#
Sparse triangular solver for real/complex single and double data precisions (kernel flag variation).
For full details refer to
aoclsparse_?trsv().This variation of TRSV, namely with a suffix of
_kid, allows to choose which TRSV kernel to use (if possible). Currently the possible choices are:kid=0Reference implementation (No explicit AVX instructions).
kid=1Alias to
kid=2(Kernel Template AVX 256-bit implementation)kid=2Kernel Template version using
AVX2extensions.kid=3Kernel Template version using
AVX512F+CPU extensions.
Any other Kernel ID value will default to
kid= 0.- Parameters
trans – [in] matrix operation type, either aoclsparse_operation_none, aoclsparse_operation_transpose, or aoclsparse_operation_conjugate_transpose.
alpha – [in] scalar \(\alpha\), used to multiply right-hand side vector \(b\).
A – [inout] matrix containing data used to represent the \(m \times m\) triangular linear system to solve.
descr – [in] matrix descriptor. Supported matrix types are aoclsparse_matrix_type_symmetric and aoclsparse_matrix_type_triangular.
b – [in] array of
melements, storing the right-hand side.x – [out] array of
melements, storing the solution if solver returns aoclsparse_status_success.kid – [in] Kernel ID, hints a request on which TRSV kernel to use.
aoclsparse_?dotmv()#
-
aoclsparse_status aoclsparse_sdotmv(const aoclsparse_operation op, const float alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const float *x, const float beta, float *y, float *d)#
-
aoclsparse_status aoclsparse_ddotmv(const aoclsparse_operation op, const double alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const double *x, const double beta, double *y, double *d)#
-
aoclsparse_status aoclsparse_cdotmv(const aoclsparse_operation op, const aoclsparse_float_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_float_complex *x, const aoclsparse_float_complex beta, aoclsparse_float_complex *y, aoclsparse_float_complex *d)#
-
aoclsparse_status aoclsparse_zdotmv(const aoclsparse_operation op, const aoclsparse_double_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_double_complex *x, const aoclsparse_double_complex beta, aoclsparse_double_complex *y, aoclsparse_double_complex *d)#
Performs sparse matrix-vector multiplication followed by vector-vector multiplication.
aoclsparse_?dotmvmultiplies the scalar \(\alpha\) with a sparse \(m \times n\) matrix, defined in a sparse storage format, and the dense vector \(x\) and adds the result to the dense vector \(y\) that is multiplied by the scalar \(\beta\), such that\[\begin{split} y := \alpha \, op(A) \, x + \beta \, y, \quad \text{ with } \quad op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{op}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{op}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{op}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]followed by dot product of dense vectors \(x\) and \(y\) such that
\[\begin{split} {\bf\mathsf{d}} = \left\{ \begin{array}{ll} \sum_{i=0}^{\min(m,n)-1} x_{i} \; y_{i}, & \text{real case} \\ \sum_{i=0}^{\min(m,n)-1} \overline{x_i} \; y_{i}, & \text{complex case} \end{array} \right. \end{split}\]1/* ************************************************************************ 2 * Copyright (c) 2023-2024 Advanced Micro Devices, Inc. 3 * 4 * Permission is hereby granted, free of charge, to any person obtaining a copy 5 * of this software and associated documentation files (the "Software"), to deal 6 * in the Software without restriction, including without limitation the rights 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 * copies of the Software, and to permit persons to whom the Software is 9 * furnished to do so, subject to the following conditions: 10 * 11 * The above copyright notice and this permission notice shall be included in 12 * all copies or substantial portions of the Software. 13 * 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 * THE SOFTWARE. 21 * 22 * ************************************************************************ */ 23 24#include "aoclsparse.h" 25 26#include <iostream> 27#include <limits> 28 29#define M 6 30#define N 5 31#define NNZ 8 32 33int main(void) 34{ 35 aoclsparse_operation op = aoclsparse_operation_none; 36 37 double alpha = 1.0; 38 double beta = 0.0; 39 40 // Print aoclsparse version 41 std::cout << aoclsparse_get_version() << std::endl; 42 43 // Create matrix descriptor 44 aoclsparse_mat_descr descr; 45 // aoclsparse_create_mat_descr set aoclsparse_matrix_type to aoclsparse_matrix_type_general 46 // and aoclsparse_index_base to aoclsparse_index_base_zero. 47 aoclsparse_create_mat_descr(&descr); 48 49 aoclsparse_index_base base = aoclsparse_index_base_zero; 50 51 // Initialise matrix 52 // 1 0 0 2 0 53 // 0 3 0 0 0 54 // 0 0 4 0 0 55 // 0 5 0 6 7 56 // 0 0 0 0 8 57 // 0 0 0 0 0 58 aoclsparse_int csr_row_ptr[M + 1] = {0, 2, 3, 4, 7, 8, 8}; 59 aoclsparse_int csr_col_ind[NNZ] = {0, 3, 1, 2, 1, 3, 4, 4}; 60 double csr_val[NNZ] = {1, 2, 3, 4, 5, 6, 7, 8}; 61 aoclsparse_matrix A; 62 aoclsparse_create_dcsr(&A, base, M, N, NNZ, csr_row_ptr, csr_col_ind, csr_val); 63 64 // Initialise vectors 65 double x[N] = {1.0, 1.0, 1.0, 1.0, 1.0}; 66 double y[M] = {1.0, 1.0, 1.0, 1.0, 1.0, 1.0}; 67 double d = 0; 68 double y_exp[M] = {3.0, 3.0, 4.0, 18.0, 8.0}; 69 double d_exp = 36; 70 71 //to identify hint id(which routine is to be executed, destroyed later) 72 aoclsparse_set_dotmv_hint(A, op, descr, 1); 73 74 // Optimize the matrix, "A" 75 aoclsparse_optimize(A); 76 77 std::cout << "Invoking aoclsparse_ddotmv.."; 78 //Invoke DOTMV API (double precision) 79 aoclsparse_ddotmv(op, alpha, A, descr, x, beta, y, &d); 80 std::cout << "Done." << std::endl; 81 82 // Print and check the result 83 std::cout << std::endl << "y = "; 84 bool oki, ok = true; 85 double tol = 20 * std::numeric_limits<double>::epsilon(); 86 for(aoclsparse_int i = 0; i < M; i++) 87 { 88 oki = std::abs(y[i] - y_exp[i]) <= tol; 89 std::cout << y[i] << (oki ? " " : "! "); 90 ok &= oki; 91 } 92 93 ok &= (std::abs(d - d_exp) <= tol); 94 std::cout << std::endl << "Output dot product: " << d << (ok ? " " : "! ") << std::endl; 95 96 aoclsparse_destroy_mat_descr(descr); 97 aoclsparse_destroy(&A); 98 return ok ? 0 : 4; 99}
- Parameters
op – [in] matrix operation type.
alpha – [in] scalar \(\alpha\).
A – [in] the sparse \(m \times n\) matrix structure that is created using aoclsparse_create_scsr() or other variation.
descr – [in] descriptor of the sparse CSR matrix. Both base-zero and base-one are supported, however, the index base needs to match the one used when aoclsparse_matrix was created.
x – [in] array of atleast
nelements if \(op(A) = A\) or at leastmelements if \(op(A) = A^T\) or \(A^H\).beta – [in] scalar \(\beta\).
y – [inout] array of atleast
melements if \(op(A) = A\) or at leastnelements if \(op(A) = A^T\) or \(A^H\).d – [out] dot product of y and x.
- Return values
aoclsparse_status_success – the operation completed successfully.
aoclsparse_status_invalid_size –
m,nornnzis invalid.aoclsparse_status_invalid_value – (base index is neither aoclsparse_index_base_zero nor aoclsparse_index_base_one, or matrix base index and descr base index values do not match.
aoclsparse_status_invalid_pointer –
descr,internalstructures related to the sparse matrixA,x,yordare invalid pointer.aoclsparse_status_wrong_type – matrix data type is not supported.
aoclsparse_status_not_implemented – aoclsparse_matrix_type is aoclsparse_matrix_type_hermitian or, aoclsparse_matrix_format_type is not aoclsparse_csr_mat
aoclsparse_?ellmv()#
-
aoclsparse_status aoclsparse_sellmv(aoclsparse_operation trans, const float *alpha, aoclsparse_int m, aoclsparse_int n, aoclsparse_int nnz, const float *ell_val, const aoclsparse_int *ell_col_ind, aoclsparse_int ell_width, const aoclsparse_mat_descr descr, const float *x, const float *beta, float *y)#
-
aoclsparse_status aoclsparse_dellmv(aoclsparse_operation trans, const double *alpha, aoclsparse_int m, aoclsparse_int n, aoclsparse_int nnz, const double *ell_val, const aoclsparse_int *ell_col_ind, aoclsparse_int ell_width, const aoclsparse_mat_descr descr, const double *x, const double *beta, double *y)#
Real single and double precision sparse matrix vector product using ELL storage format.
aoclsparse_?ellmvmultiplies the scalar \(\alpha\) with a sparse \(m \times n\) matrix, defined in ELL storage format, and the dense vector \(x\) and adds the result to the dense vector \(y\) that is multiplied by the scalar \(\beta\), such that\[ y = \alpha \, op(A) \, x + \beta \, y, \]with\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]Note
Currently, only
trans= aoclsparse_operation_none is supported.- Parameters
trans – [in] matrix operation type.
alpha – [in] scalar \(\alpha\).
m – [in] number of rows of the sparse ELL matrix.
n – [in] number of columns of the sparse ELL matrix.
nnz – [in] number of non-zero entries of the sparse ELL matrix.
descr – [in] descriptor of the sparse ELL matrix. Both, base-zero and base-one input arrays of ELL matrix are supported
ell_val – [in] array that contains the elements of the sparse ELL matrix. Padded elements should be zero.
ell_col_ind – [in] array that contains the column indices of the sparse ELL matrix. Padded column indices should be -1.
ell_width – [in] number of non-zero elements per row of the sparse ELL matrix.
x – [in] array of
nelements ( \(op(A) = A\)) ormelements ( \(op(A) = A^T\) or \(op(A) = A^H\)).beta – [in] scalar \(\beta\).
y – [inout] array of
melements ( \(op(A) = A\)) ornelements ( \(op(A) = A^T\) or \(op(A) = A^H\)).
- Return values
aoclsparse_status_success – the operation completed successfully.
aoclsparse_status_invalid_size –
m,norell_widthis invalid.aoclsparse_status_invalid_pointer –
descr,alpha,ell_val,ell_col_ind,x,betaorypointer is invalid.aoclsparse_status_not_implemented –
transis not aoclsparse_operation_none, or aoclsparse_matrix_type is not aoclsparse_matrix_type_general.
aoclsparse_?diamv()#
-
aoclsparse_status aoclsparse_sdiamv(aoclsparse_operation trans, const float *alpha, aoclsparse_int m, aoclsparse_int n, aoclsparse_int nnz, const float *dia_val, const aoclsparse_int *dia_offset, aoclsparse_int dia_num_diag, const aoclsparse_mat_descr descr, const float *x, const float *beta, float *y)#
-
aoclsparse_status aoclsparse_ddiamv(aoclsparse_operation trans, const double *alpha, aoclsparse_int m, aoclsparse_int n, aoclsparse_int nnz, const double *dia_val, const aoclsparse_int *dia_offset, aoclsparse_int dia_num_diag, const aoclsparse_mat_descr descr, const double *x, const double *beta, double *y)#
Real single and double precision sparse matrix vector product using DIA storage format.
aoclsparse_?diamvmultiplies the scalar \(\alpha\) with a sparse \(m \times n\) matrix, defined in DIA storage format, and the dense vector \(x\) and adds the result to the dense vector \(y\) that is multiplied by the scalar \(\beta\), such that\[ y = \alpha \, op(A) \, x + \beta \, y, \]with\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]Note
Currently, only
trans= aoclsparse_operation_none is supported.- Parameters
trans – [in] matrix operation type.
alpha – [in] scalar \(\alpha\).
m – [in] number of rows of the matrix.
n – [in] number of columns of the matrix.
nnz – [in] number of non-zero entries of the matrix.
descr – [in] descriptor of the sparse DIA matrix.
dia_val – [in] array that contains the elements of the matrix. Padded elements should be zero.
dia_offset – [in] array that contains the offsets of each diagonal of the matrix.
dia_num_diag – [in] number of diagonals in the matrix.
x – [in] array of
nelements ( \(op(A) = A\)) ormelements ( \(op(A) = A^T\) or \(op(A) = A^H\)).beta – [in] scalar \(\beta\).
y – [inout] array of
melements ( \(op(A) = A\)) ornelements ( \(op(A) = A^T\) or \(op(A) = A^H\)).
- Return values
aoclsparse_status_success – the operation completed successfully.
aoclsparse_status_invalid_size –
m,norell_widthis invalid.aoclsparse_status_invalid_pointer –
descr,alpha,ell_val,ell_col_ind,x,betaorypointer is invalid.aoclsparse_status_not_implemented –
transis not aoclsparse_operation_none, or aoclsparse_matrix_type is not aoclsparse_matrix_type_general.
aoclsparse_?bsrmv()#
-
aoclsparse_status aoclsparse_sbsrmv(aoclsparse_operation trans, const float *alpha, aoclsparse_int mb, aoclsparse_int nb, aoclsparse_int bsr_dim, const float *bsr_val, const aoclsparse_int *bsr_col_ind, const aoclsparse_int *bsr_row_ptr, const aoclsparse_mat_descr descr, const float *x, const float *beta, float *y)#
-
aoclsparse_status aoclsparse_dbsrmv(aoclsparse_operation trans, const double *alpha, aoclsparse_int mb, aoclsparse_int nb, aoclsparse_int bsr_dim, const double *bsr_val, const aoclsparse_int *bsr_col_ind, const aoclsparse_int *bsr_row_ptr, const aoclsparse_mat_descr descr, const double *x, const double *beta, double *y)#
Real single and double precision matrix vector product using BSR storage format.
aoclsparse_?bsrmvmultiplies the scalar \(\alpha\) with a sparsembtimesbsr_dimbynbtimesbsr_dimmatrix, defined in BSR storage format, and the dense vector \(x\) and adds the result to the dense vector \(y\) that is multiplied by the scalar \(\beta\), such that\[ y = \alpha \, op(A) \, x + \beta \, y, \]with\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]Note
Only
trans= aoclsparse_operation_none is supported.- Parameters
trans – [in] matrix operation type.
mb – [in] number of block rows of the sparse BSR matrix.
nb – [in] number of block columns of the sparse BSR matrix.
alpha – [in] scalar \(\alpha\).
descr – [in] descriptor of the sparse BSR matrix. Both, base-zero and base-one input arrays of BSR matrix are supported.
bsr_val – [in] array of
nnzbblocks of the sparse BSR matrix.bsr_row_ptr – [in] array of
mb+1elements that point to the start of every block row of the sparse BSR matrix.bsr_col_ind – [in] array of
nnzcontaining the block column indices of the sparse BSR matrix.bsr_dim – [in] block dimension of the sparse BSR matrix.
x – [in] array of
nbtimesbsr_dimelements ( \(op(A) = A\)) ormbtimesbsr_dimelements ( \(op(A) = A^T\) or \(op(A) = A^H\)).beta – [in] scalar \(\beta\).
y – [inout] array of
mbtimesbsr_dimelements ( \(op(A) = A\)) ornbtimesbsr_dimelements ( \(op(A) = A^T\) or \(op(A) = A^H\)).
- Return values
aoclsparse_status_success – the operation completed successfully.
aoclsparse_status_invalid_handle – the library context was not initialized.
aoclsparse_status_invalid_size –
mb,nb,nnzborbsr_dimis invalid.aoclsparse_status_invalid_pointer –
descr,alpha,bsr_val,bsr_row_ind,bsr_col_ind,x,betaorypointer is invalid.aoclsparse_status_arch_mismatch – the device is not supported.
aoclsparse_status_not_implemented –
transis not aoclsparse_operation_none, or aoclsparse_matrix_type is not aoclsparse_matrix_type_general.
aoclsparse_?csrmv()#
-
aoclsparse_status aoclsparse_scsrmv(aoclsparse_operation trans, const float *alpha, aoclsparse_int m, aoclsparse_int n, aoclsparse_int nnz, const float *csr_val, const aoclsparse_int *csr_col_ind, const aoclsparse_int *csr_row_ptr, const aoclsparse_mat_descr descr, const float *x, const float *beta, float *y)#
-
aoclsparse_status aoclsparse_dcsrmv(aoclsparse_operation trans, const double *alpha, aoclsparse_int m, aoclsparse_int n, aoclsparse_int nnz, const double *csr_val, const aoclsparse_int *csr_col_ind, const aoclsparse_int *csr_row_ptr, const aoclsparse_mat_descr descr, const double *x, const double *beta, double *y)#
Real single and double precision sparse matrix-vector multiplication using CSR storage format.
aoclsparse_?csrmvmultiplies the scalar \(\alpha\) with a sparse \(m \times n\) matrix, defined in CSR storage format, and the dense vector \(x\) and adds the result to the dense vector \(y\) that is multiplied by the scalar \(\beta\), such that\[ y = \alpha \, op(A) \, x + \beta \, y, \]with\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]- Parameters
trans – [in] matrix operation type.
alpha – [in] scalar \(\alpha\).
m – [in] number of rows of the sparse CSR matrix.
n – [in] number of columns of the sparse CSR matrix.
nnz – [in] number of non-zero entries of the sparse CSR matrix.
csr_val – [in] array of
nnzelements of the sparse CSR matrix.csr_col_ind – [in] array of
nnzelements containing the column indices of the sparse CSR matrix.csr_row_ptr – [in] array of
m+1 elements that point to the start of every row of the sparse CSR matrix.descr – [in] descriptor of the sparse CSR matrix. Currently, only aoclsparse_matrix_type_general and aoclsparse_matrix_type_symmetric is supported.
x – [in] array of
nelements ( \(op(A) = A\)) ormelements ( \(op(A) = A^T\) or \(op(A) = A^H\)).beta – [in] scalar \(\beta\).
y – [inout] array of
melements ( \(op(A) = A\)) ornelements ( \(op(A) = A^T\) or \(op(A) = A^H\)).
- Return values
aoclsparse_status_success – the operation completed successfully.
aoclsparse_status_invalid_size –
m,nornnzis invalid.aoclsparse_status_invalid_pointer –
descr,alpha,csr_val,csr_row_ptr,csr_col_ind,x,betaorypointer is invalid.aoclsparse_status_not_implemented –
transis not aoclsparse_operation_none andtransis not aoclsparse_operation_transpose. aoclsparse_matrix_type is not aoclsparse_matrix_type_general, or aoclsparse_matrix_type is not aoclsparse_matrix_type_symmetric.
aoclsparse_?csrsv()#
-
aoclsparse_status aoclsparse_scsrsv(aoclsparse_operation trans, const float *alpha, aoclsparse_int m, const float *csr_val, const aoclsparse_int *csr_col_ind, const aoclsparse_int *csr_row_ptr, const aoclsparse_mat_descr descr, const float *x, float *y)#
-
aoclsparse_status aoclsparse_dcsrsv(aoclsparse_operation trans, const double *alpha, aoclsparse_int m, const double *csr_val, const aoclsparse_int *csr_col_ind, const aoclsparse_int *csr_row_ptr, const aoclsparse_mat_descr descr, const double *x, double *y)#
Sparse triangular solve using CSR storage format for single and double data precisions.
- Deprecated:
This API is superseded by aoclsparse_strsv() and aoclsparse_dtrsv().
aoclsparse_?csrsvsolves a sparse triangular linear system of a sparse \(m \times m\) matrix, defined in CSR storage format, a dense solution vector \(y\) and the right-hand side \(x\) that is multiplied by \(\alpha\), such that\[ op(A) \, y = \alpha \, x, \]with\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]Note
Only
trans= aoclsparse_operation_none is supported.Note
The input matrix has to be sparse upper or lower triangular matrix with unit or non-unit main diagonal. Matrix has to be sorted. No diagonal element can be omitted from a sparse storage if the solver is called with the non-unit indicator.
- Parameters
trans – [in] matrix operation type.
alpha – [in] scalar \(\alpha\).
m – [in] number of rows of the sparse CSR matrix.
csr_val – [in] array of
nnzelements of the sparse CSR matrix.csr_row_ptr – [in] array of
m+1elements that point to the start of every row of the sparse CSR matrix.csr_col_ind – [in] array of
nnzelements containing the column indices of the sparse CSR matrix.descr – [in] descriptor of the sparse CSR matrix.
x – [in] array of
melements, holding the right-hand side.y – [out] array of
melements, holding the solution.
- Return values
aoclsparse_status_success – the operation completed successfully.
aoclsparse_status_invalid_size –
mis invalid.aoclsparse_status_invalid_pointer –
descr,alpha,csr_val,csr_row_ptr,csr_col_ind,xorypointer is invalid.aoclsparse_status_internal_error – an internal error occurred.
aoclsparse_status_not_implemented –
trans= aoclsparse_operation_conjugate_transpose ortrans= aoclsparse_operation_transpose or aoclsparse_matrix_type is not aoclsparse_matrix_type_general.
Level 3#
This module holds all sparse level 3 routines.
The sparse level 3 routines describe operations between matrices.
aoclsparse_?trsm()#
-
aoclsparse_status aoclsparse_strsm(const aoclsparse_operation trans, const float alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const float *B, aoclsparse_int n, aoclsparse_int ldb, float *X, aoclsparse_int ldx)#
-
aoclsparse_status aoclsparse_dtrsm(const aoclsparse_operation trans, const double alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const double *B, aoclsparse_int n, aoclsparse_int ldb, double *X, aoclsparse_int ldx)#
-
aoclsparse_status aoclsparse_ctrsm(aoclsparse_operation trans, const aoclsparse_float_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const aoclsparse_float_complex *B, aoclsparse_int n, aoclsparse_int ldb, aoclsparse_float_complex *X, aoclsparse_int ldx)#
-
aoclsparse_status aoclsparse_ztrsm(aoclsparse_operation trans, const aoclsparse_double_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const aoclsparse_double_complex *B, aoclsparse_int n, aoclsparse_int ldb, aoclsparse_double_complex *X, aoclsparse_int ldx)#
Solve sparse triangular linear system of equations with multiple right hand sides for real/complex single and double data precisions.
aoclsparse_?trsmsolves a sparse triangular linear system of equations with multiple right hand sides, of the form\[ op(A)\; X = \alpha B, \]where \(A\) is a sparse matrix of size \(m\), \(op()\) is a linear operator, \(X\) and \(B\) are rectangular dense matrices of appropiate size, while \(\alpha\) is a scalar. The sparse matrix \(A\) can be interpreted either as a lower triangular or upper triangular. This is indicated byfill_modefrom the matrix descriptordescrwhere either upper or lower triangular portion of the matrix is only referenced. The matrix can also be of class symmetric in which case only the selected triangular part is used. Matrix \(A\) must be of full rank, that is, the matrix must be invertible. The linear operator \(op()\) can define the transposition or Hermitian transposition operations. By default, no transposition is performed. The right-hand-side matrix \(B\) and the solution matrix \(X\) are dense and must be of the correct size, that is \(m\) by \(n\), seeldbandldxinput parameters for further details.Explicitly, this kernel solves
\[ op(A)\; X = \alpha \; B\text{, with solution } X = \alpha \; (op(A)^{-1}) \; B, \]where\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]If a linear operator is applied then, the possible problems solved are\[ A^T \; X = \alpha \; B\text{, with solution } X = \alpha \; A^{-T} \; B\text{, and } \; A^H \; X = \alpha \; B\text{, with solution } X = \alpha \; A^{-H} \; B. \]Notes
If the matrix descriptor
descrspecifies that the matrix \(A\) is to be regarded as having a unitary diagonal, then the main diagonal entries of matrix \(A\) are not accessed and are considered to all be ones.If the matrix \(A\) is described as upper triangular, then only the upper triangular portion of the matrix is referenced. Conversely, if the matrix \(A\) is described lower triangular, then only the lower triangular portion of the matrix is used.
This set of APIs allocates work array of size \(m\) for each case where the matrices \(B\) or \(X\) are stored in row-major format (ref aoclsparse_order_row).
A subset of kernels are parallel (on parallel builds) and can be expected potential acceleration in the solve. These kernels are available when both dense matrices \(X\) and \(B\) are stored in column-major format (ref aoclsparse_order_column) and thread count is greater than 1 on a parallel build.
There is
aoclsparse_trsm_kid(Kernel ID) variation of TRSM, namely with a suffix of_kid, this solver allows to choose which underlying TRSV kernels to use (if possible). Currently, all the existing kernels avilable inaoclsparse_trsv_kidare supported.This routine supports only sparse matrices in CSR format.
1/* ************************************************************************ 2 * Copyright (c) 2023-2024 Advanced Micro Devices, Inc. 3 * 4 * Permission is hereby granted, free of charge, to any person obtaining a copy 5 * of this software and associated documentation files (the "Software"), to deal 6 * in the Software without restriction, including without limitation the rights 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 * copies of the Software, and to permit persons to whom the Software is 9 * furnished to do so, subject to the following conditions: 10 * 11 * The above copyright notice and this permission notice shall be included in 12 * all copies or substantial portions of the Software. 13 * 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 * THE SOFTWARE. 21 * 22 * ************************************************************************ */ 23 24#include "aoclsparse.h" 25 26#include <iomanip> 27#include <iostream> 28#include <limits> 29#include <vector> 30 31int main(void) 32{ 33 std::cout << "------------------------------------------------" << std::endl 34 << " Triangle Solve with Multiple Right Hand Sides" << std::endl 35 << " sample program for real data type" << std::endl 36 << "------------------------------------------------" << std::endl 37 << std::endl; 38 39 /* 40 * This example illustrates how to solve two triangular 41 * linear system of equations. The real matrix used is 42 * | 1 2 0 0 | 43 * A = | 3 1 2 0 | 44 * | 0 3 1 2 | 45 * | 0 0 3 1 | 46 * 47 * The first linear system is L X = alpha * B with L = tril(A), X and B 48 * are two dense matrices of size 4x2 stored in column-major format. 49 * 50 * The second linear system is U X = alpha * B with U = triu(A), X and B 51 * are two dense matrices of size 4x2 stored in row-major format. 52 * 53 */ 54 55 // Create a tri-diagonal matrix in CSR format 56 const aoclsparse_int n = 4, m = 4, nnz = 10; 57 aoclsparse_int icrow[5] = {0, 2, 5, 8, 10}; 58 aoclsparse_int icol[10] = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3}; 59 double aval[10] = {1, 2, 3, 1, 2, 3, 1, 2, 3, 1}; 60 aoclsparse_status status; 61 const aoclsparse_int k = 2; 62 aoclsparse_int ldb, ldx; 63 bool oki, ok; 64 double tol = 20.0 * std::numeric_limits<double>::epsilon(); 65 66 // Create aoclsparse matrix and its descriptor 67 aoclsparse_matrix A; 68 aoclsparse_index_base base = aoclsparse_index_base_zero; 69 aoclsparse_order order; 70 aoclsparse_mat_descr descr_a; 71 aoclsparse_operation trans; 72 status = aoclsparse_create_dcsr(&A, base, m, n, nnz, icrow, icol, aval); 73 if(status != aoclsparse_status_success) 74 { 75 std::cerr << "Error returned from aoclsparse_create_dcsr, status = " << status << "." 76 << std::endl; 77 return 3; 78 } 79 aoclsparse_create_mat_descr(&descr_a); 80 81 /* Case 1: Solving the lower triangular system L^T X = alpha*B. 82 * |1, -1| 83 * B = |2, -2| stored in column-major layout 84 * |3, -6| k = 2 and ldb = 4 85 * |4, -8| 86 * 87 * Linear system L X = B, with solution matrix 88 * | 86,-167| 89 * X = |-29, 56| stored in column-major layout 90 * | 9, -18| k = 2 and ldx = 4 91 * | -4, 8| 92 */ 93 94 double alpha = -1.0; 95 double X[m * k]; 96 97 double B[m * k] = {1, 2, 3, 4, -1, -2, -6, -8}; 98 std::vector<double> XRef; 99 XRef.assign({86, -29, 9, -4, -167, 56, -18, 8}); 100 101 // Indicate to only access the lower triangular part of matrix A 102 aoclsparse_set_mat_type(descr_a, aoclsparse_matrix_type_triangular); 103 aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_lower); 104 trans = aoclsparse_operation_transpose; 105 106 order = aoclsparse_order_column; 107 ldb = m; 108 ldx = m; 109 status = aoclsparse_set_sm_hint(A, trans, descr_a, order, 1); 110 if(status != aoclsparse_status_success) 111 { 112 std::cerr << "Error returned from aoclsparse_set_sm_hint, status = " << status << "." 113 << std::endl; 114 return 3; 115 } 116 status = aoclsparse_optimize(A); 117 if(status != aoclsparse_status_success) 118 { 119 std::cerr << "Error returned from aoclsparse_optimize, status = " << status << "." 120 << std::endl; 121 return 3; 122 } 123 // Solve 124 status = aoclsparse_dtrsm(trans, alpha, A, descr_a, order, &B[0], k, ldb, &X[0], ldx); 125 if(status != aoclsparse_status_success) 126 { 127 std::cerr << "Error returned from aoclsparse_dtrsm, status = " << status << "." 128 << std::endl; 129 return 3; 130 } 131 132 // Print and check the result 133 std::cout << "Solving L^T X = alpha B, where L=tril(A), and" << std::endl; 134 std::cout << " X and B are dense rectangular martices (column-major layout)" << std::endl; 135 std::cout << " Solution matrix X = " << std::endl; 136 std::cout << std::fixed; 137 std::cout.precision(1); 138 ok = true; 139 size_t idx; 140 for(int row = 0; row < m; ++row) 141 { 142 for(int col = 0; col < k; ++col) 143 { 144 idx = row + col * ldx; 145 oki = std::abs(X[idx] - XRef[idx]) <= tol; 146 std::cout << X[idx] << (oki ? " " : "! "); 147 ok &= oki; 148 } 149 std::cout << std::endl; 150 } 151 std::cout << std::endl; 152 153 /* Case 2: Solving the lower triangular system U X = alpha*B. 154 * In this case we "reinterpret" B as a col-major layout, having the 155 * form 156 * | 1, 2| 157 * B = | 3, 4| stored in row-major layout 158 * |-1, -2| k = 2 and ldb = 2 159 * |-6, -8| 160 * 161 * Linear system L X = B, with solution matrix 162 * | 39, 50| 163 * X = |-19, -24| stored in row-major layout 164 * | 11, 14| k = 2 and ldx = 2 165 * | -6, -8| 166 */ 167 168 // Store same XRef in row-major format 169 XRef.assign({39, 50, -19, -24, 11, 14, -6, -8}); 170 alpha = 1.0; 171 ldb = k; 172 ldx = k; 173 // Indicate to use only the conjugate transpose of the upper part of A 174 aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_upper); 175 trans = aoclsparse_operation_none; 176 order = aoclsparse_order_row; 177 status = aoclsparse_set_sm_hint(A, trans, descr_a, order, 1); 178 if(status != aoclsparse_status_success) 179 { 180 std::cerr << "Error returned from aoclsparse_set_sm_hint, status = " << status << "." 181 << std::endl; 182 return 3; 183 } 184 status = aoclsparse_optimize(A); 185 if(status != aoclsparse_status_success) 186 { 187 std::cerr << "Error returned from aoclsparse_optimize, status = " << status << "." 188 << std::endl; 189 return 3; 190 } 191 // Solve 192 status = aoclsparse_dtrsm(trans, alpha, A, descr_a, order, &B[0], k, ldb, &X[0], ldx); 193 if(status != aoclsparse_status_success) 194 { 195 std::cerr << "Error returned from aoclsparse_dtrsm, status = " << status << "." 196 << std::endl; 197 return 3; 198 } 199 // Print and check the result 200 std::cout << "Solving U X = alpha B, where U=triu(A), and" << std::endl; 201 std::cout << " X and B are dense rectangular martices (row-major layout)" << std::endl; 202 std::cout << " Solution matrix X = " << std::endl; 203 for(int row = 0; row < m; ++row) 204 { 205 for(int col = 0; col < k; col++) 206 { 207 idx = row * ldx + col; 208 oki = std::abs(X[idx] - XRef[idx]) <= tol; 209 std::cout << X[idx] << (oki ? " " : "! "); 210 ok &= oki; 211 } 212 std::cout << std::endl; 213 } 214 std::cout << std::endl; 215 216 // Destroy the aoclsparse memory 217 aoclsparse_destroy_mat_descr(descr_a); 218 aoclsparse_destroy(&A); 219 220 return ok ? 0 : 5; 221}
1/* ************************************************************************ 2 * Copyright (c) 2023-2024 Advanced Micro Devices, Inc. 3 * 4 * Permission is hereby granted, free of charge, to any person obtaining a copy 5 * of this software and associated documentation files (the "Software"), to deal 6 * in the Software without restriction, including without limitation the rights 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 * copies of the Software, and to permit persons to whom the Software is 9 * furnished to do so, subject to the following conditions: 10 * 11 * The above copyright notice and this permission notice shall be included in 12 * all copies or substantial portions of the Software. 13 * 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 * THE SOFTWARE. 21 * 22 * ************************************************************************ */ 23 24#include "aoclsparse.h" 25 26#include <complex> 27#include <iomanip> 28#include <iostream> 29#include <limits> 30#include <vector> 31 32int main(void) 33{ 34 std::cout << "------------------------------------------------" << std::endl 35 << " Triangle Solve with Multiple Right Hand Sides" << std::endl 36 << " sample program for complex data type" << std::endl 37 << "------------------------------------------------" << std::endl 38 << std::endl; 39 40 /* 41 * This example illustrates how to solve two triangular 42 * linear system of equations. The complex matrix used is 43 * | 1+3i 2+5i 0 0 | 44 * A = | 3 1+2i 2-2i 0 | 45 * | 0 3 1 2+3i | 46 * | 0 0 3+1i 1+1i | 47 * 48 * The first linear system is L X = alpha * B with L = tril(A), X and B 49 * are two dense matrices of size 4x2 stored in column-major format. 50 * 51 * The second linear system is U^H X = alpha * B with U = triu(A), X and B 52 * are two dense matrices of size 4x2 stored in row-major format. 53 * 54 */ 55 56 // Create a tri-diagonal matrix in CSR format 57 const aoclsparse_int n = 4, m = 4, nnz = 10; 58 aoclsparse_int icrow[5] = {0, 2, 5, 8, 10}; 59 aoclsparse_int icol[10] = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3}; 60 std::vector<aoclsparse_double_complex> aval; 61 // clang-format off 62 aval.assign({{1, 3}, {2, 5}, {3, 0}, {1, 2}, {2, -2}, {3, 0}, 63 {1, 0}, {2, 3}, {3, 1}, {1, 1}}); 64 // clang-format on 65 aoclsparse_status status; 66 aoclsparse_int ldb, ldx; 67 bool oki, ok; 68 double tol = 20 * std::numeric_limits<double>::epsilon(); 69 70 aoclsparse_matrix A; 71 const aoclsparse_int k = 2; // number of columns of X and B 72 aoclsparse_index_base base = aoclsparse_index_base_zero; 73 aoclsparse_order order = aoclsparse_order_column; 74 aoclsparse_mat_descr descr_a; 75 aoclsparse_operation trans = aoclsparse_operation_none; 76 status = aoclsparse_create_zcsr(&A, base, m, n, nnz, icrow, icol, aval.data()); 77 if(status != aoclsparse_status_success) 78 { 79 std::cerr << "Error returned from aoclsparse_create_zcsr, status = " << status << "." 80 << std::endl; 81 return 3; 82 } 83 aoclsparse_create_mat_descr(&descr_a); 84 85 /* Case 1: Solving the lower triangular system L X = B. 86 * |-2+4i, -3+i| 87 * B = |3+13i, 4+11i| stored in column-major layout 88 * |16+7i, 16+2i| k = 2 and ldb = 4 89 * |15+11i,10+16i| 90 * 91 * Linear system L X = B, with solution matrix 92 * |1+i, i| 93 * X = |4+2i, 4| stored in column-major layout 94 * |4+i, 4+2i| k = 2 and ldx = 4 95 * |4, 3+3i| 96 */ 97 98 aoclsparse_double_complex alpha = {1.0, 0.}; 99 aoclsparse_double_complex X[m * k]; 100 101 std::vector<aoclsparse_double_complex> B; 102 B.assign({{-2, 4}, {3, 13}, {16, 7}, {15, 11}, {-3, 1}, {4, 11}, {16, 2}, {10, 16}}); 103 std::vector<aoclsparse_double_complex> XRef; 104 XRef.assign({{1, 1}, {4, 2}, {4, 1}, {4, 0}, {0, 1}, {4, 0}, {4, 2}, {3, 3}}); 105 106 // indicate to only access the lower triangular part of matrix A 107 aoclsparse_set_mat_type(descr_a, aoclsparse_matrix_type_triangular); 108 aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_lower); 109 110 // Prepare and call solver 111 order = aoclsparse_order_column; 112 ldb = m; 113 ldx = m; 114 status = aoclsparse_set_sm_hint(A, trans, descr_a, order, 1); 115 if(status != aoclsparse_status_success) 116 { 117 std::cerr << "Error returned from aoclsparse_set_sm_hint, status = " << status << "." 118 << std::endl; 119 return 3; 120 } 121 status = aoclsparse_optimize(A); 122 if(status != aoclsparse_status_success) 123 { 124 std::cerr << "Error returned from aoclsparse_optimize, status = " << status << "." 125 << std::endl; 126 return 3; 127 } 128 129 // Solve 130 status = aoclsparse_ztrsm(trans, alpha, A, descr_a, order, B.data(), k, ldb, &X[0], ldx); 131 if(status != aoclsparse_status_success) 132 { 133 std::cerr << "Error from aoclsparse_ztrsm, status = " << status << std::endl; 134 return 3; 135 } 136 137 // Print and check the result 138 std::cout << "Solving L X = alpha B, where L=tril(A), and" << std::endl; 139 std::cout << " X and B are dense rectangular martices (column-major layout)" << std::endl; 140 std::cout << " Solution matrix X = " << std::endl; 141 std::cout << std::fixed; 142 std::cout.precision(1); 143 ok = true; 144 size_t idx; 145 for(int row = 0; row < m; ++row) 146 { 147 for(int col = 0; col < k; ++col) 148 { 149 idx = row + col * ldx; 150 oki = std::abs(X[idx].real - XRef[idx].real) <= tol 151 && std::abs(X[idx].imag - XRef[idx].imag) <= tol; 152 std::cout << "(" << X[idx].real << ", " << X[idx].imag << "i) " << (oki ? " " : "! "); 153 ok &= oki; 154 } 155 std::cout << std::endl; 156 } 157 std::cout << std::endl; 158 159 /* Case 2: Solving the lower triangular system U^H X = alpha*B. 160 * |2+4i, -1+3i| 161 * B = |9+15i, 6+9i| stored in row-major layout 162 * |-13+8i,-10+12i| k = 2 and ldx = 2 163 * |14+15i, 8+20i| 164 * 165 * Linear system L X = B, with solution matrix 166 * |1+i, i| 167 * X = |4+2i, 4| stored in row-major layout 168 * |4+i, 4+2i| k = 2 and ldx = 2 169 * |4, 3+3i| 170 */ 171 172 B.assign({{2, 4}, {-1, 3}, {9, 15}, {6, 9}, {-13, 8}, {-10, 12}, {14, 15}, {8, 20}}); 173 // Store same XRef in row-major format 174 XRef.assign({{1, 1}, {0, 1}, {4, 2}, {4, 0}, {4, 1}, {4, 2}, {4, 0}, {3, 3}}); 175 alpha = {0, -1}; 176 ldb = k; 177 ldx = k; 178 // Indicate to use only the conjugate transpose of the upper part of A 179 trans = aoclsparse_operation_conjugate_transpose; 180 order = aoclsparse_order_row; 181 aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_upper); 182 status = aoclsparse_set_sm_hint(A, trans, descr_a, order, 1); 183 if(status != aoclsparse_status_success) 184 { 185 std::cerr << "Error returned from aoclsparse_set_sm_hint, status = " << status << "." 186 << std::endl; 187 return 3; 188 } 189 status = aoclsparse_optimize(A); 190 if(status != aoclsparse_status_success) 191 { 192 std::cerr << "Error returned from aoclsparse_optimize, status = " << status << "." 193 << std::endl; 194 return 3; 195 } 196 // Solve 197 status = aoclsparse_ztrsm(trans, alpha, A, descr_a, order, B.data(), k, ldb, &X[0], ldx); 198 if(status != aoclsparse_status_success) 199 { 200 std::cerr << "Error from aoclsparse_ztrsm, status = " << status << std::endl; 201 return 3; 202 } 203 // Print and check the result 204 std::cout << "Solving U^H X = alpha B, where U=triu(A), and" << std::endl; 205 std::cout << " X and B are dense rectangular martices (row-major layout)" << std::endl; 206 std::cout << " Solution matrix X = " << std::endl; 207 for(int row = 0; row < m; ++row) 208 { 209 for(int col = 0; col < k; col++) 210 { 211 idx = row * ldx + col; 212 oki = std::abs(X[idx].real - XRef[idx].real) <= tol 213 && std::abs(X[idx].imag - XRef[idx].imag) <= tol; 214 std::cout << "(" << X[idx].real << ", " << X[idx].imag << "i) " << (oki ? " " : "! "); 215 ok &= oki; 216 } 217 std::cout << std::endl; 218 } 219 std::cout << std::endl; 220 // Destroy the aoclsparse memory 221 aoclsparse_destroy_mat_descr(descr_a); 222 aoclsparse_destroy(&A); 223 224 return ok ? 0 : 5; 225}
- Parameters
trans – [in] matrix operation to perform on \(A\). Possible values are aoclsparse_operation_none, aoclsparse_operation_transpose, and aoclsparse_operation_conjugate_transpose.
alpha – [in] scalar \(\alpha\).
A – [in] sparse matrix \(A\) of size \(m\).
descr – [in] descriptor of the sparse matrix \(A\).
order – [in] storage order of dense matrices \(B\) and \(X\). Possible options are aoclsparse_order_row and aoclsparse_order_column.
B – [in] dense matrix, potentially rectangular, of size \(m \times n\).
n – [in] \(n,\) number of columns of the dense matrix \(B\).
ldb – [in] leading dimension of \(B\). Eventhough the matrix \(B\) is considered of size \(m \times n\), its memory layout may correspond to a larger matrix (
ldbby \(N>n\)) in which only the submatrix \(B\) is of interest. In this case, this parameter provides means to access the correct elements of \(B\) within the larger layout.matrix layout
row count
column count
\(m\)
ldbwithldb\(\ge n\)ldbwithldb\(\ge m\)\(n\)
X – [out] solution matrix \(X,\) dense and potentially rectangular matrix of size \(m \times n\).
ldx – [in] leading dimension of \(X\). Eventhough the matrix \(X\) is considered of size \(m \times n\), its memory layout may correspond to a larger matrix (
ldxby \(N>n\)) in which only the submatrix \(X\) is of interest. In this case, this parameter provides means to access the correct elements of \(X\) within the larger layout.matrix layout
row count
column count
\(m\)
ldxwithldx\(\ge n\)ldxwithldx\(\ge m\)\(n\)
- Return values
aoclsparse_status_success – indicates that the operation completed successfully.
aoclsparse_status_invalid_size – informs that either
m,n,nnz,ldborldxis invalid.aoclsparse_status_invalid_pointer – informs that either
descr,alpha,A,B, orXpointer is invalid.aoclsparse_status_not_implemented – this error occurs when the provided matrix aoclsparse_matrix_type is aoclsparse_matrix_type_general or aoclsparse_matrix_type_hermitian or when matrix
Ais not in CSR format.
-
aoclsparse_status aoclsparse_strsm_kid(const aoclsparse_operation trans, const float alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const float *B, aoclsparse_int n, aoclsparse_int ldb, float *X, aoclsparse_int ldx, const aoclsparse_int kid)#
-
aoclsparse_status aoclsparse_dtrsm_kid(const aoclsparse_operation trans, const double alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const double *B, aoclsparse_int n, aoclsparse_int ldb, double *X, aoclsparse_int ldx, const aoclsparse_int kid)#
-
aoclsparse_status aoclsparse_ctrsm_kid(aoclsparse_operation trans, const aoclsparse_float_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const aoclsparse_float_complex *B, aoclsparse_int n, aoclsparse_int ldb, aoclsparse_float_complex *X, aoclsparse_int ldx, const aoclsparse_int kid)#
-
aoclsparse_status aoclsparse_ztrsm_kid(aoclsparse_operation trans, const aoclsparse_double_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const aoclsparse_double_complex *B, aoclsparse_int n, aoclsparse_int ldb, aoclsparse_double_complex *X, aoclsparse_int ldx, const aoclsparse_int kid)#
Solve sparse triangular linear system of equations with multiple right hand sides for real/complex single and double data precisions (kernel flag variation).
For full details refer to
aoclsparse_?trsm().This variation of TRSM, namely with a suffix of
_kid, allows to choose which underlying TRSV kernels to use (if possible). Currently, all the existing kernels supported byaoclsparse_?trsv_kid()are available here as well.- Parameters
trans – [in] matrix operation to perform on \(A\). Possible values are aoclsparse_operation_none, aoclsparse_operation_transpose, and aoclsparse_operation_conjugate_transpose.
alpha – [in] scalar \(\alpha\).
A – [in] sparse matrix \(A\) of size \(m\).
descr – [in] descriptor of the sparse matrix \(A\).
order – [in] storage order of dense matrices \(B\) and \(X\). Possible options are aoclsparse_order_row and aoclsparse_order_column.
B – [in] dense matrix, potentially rectangular, of size \(m \times n\).
n – [in] \(n,\) number of columns of the dense matrix \(B\).
ldb – [in] leading dimension of \(B\). Eventhough the matrix \(B\) is considered of size \(m \times n\), its memory layout may correspond to a larger matrix (
ldbby \(N>n\)) in which only the submatrix \(B\) is of interest. In this case, this parameter provides means to access the correct elements of \(B\) within the larger layout.matrix layout
row count
column count
\(m\)
ldbwithldb\(\ge n\)ldbwithldb\(\ge m\)\(n\)
X – [out] solution matrix \(X,\) dense and potentially rectangular matrix of size \(m \times n\).
ldx – [in] leading dimension of \(X\). Eventhough the matrix \(X\) is considered of size \(m \times n\), its memory layout may correspond to a larger matrix (
ldxby \(N>n\)) in which only the submatrix \(X\) is of interest. In this case, this parameter provides means to access the correct elements of \(X\) within the larger layout.matrix layout
row count
column count
\(m\)
ldxwithldx\(\ge n\)ldxwithldx\(\ge m\)\(n\)
kid – [in] kernel ID, hints which kernel to use.
aoclsparse_sp2m()#
-
aoclsparse_status aoclsparse_sp2m(aoclsparse_operation opA, const aoclsparse_mat_descr descrA, const aoclsparse_matrix A, aoclsparse_operation opB, const aoclsparse_mat_descr descrB, const aoclsparse_matrix B, const aoclsparse_request request, aoclsparse_matrix *C)#
Sparse matrix Sparse matrix multiplication for real and complex datatypes.
aoclsparse_?sp2mmultiplies two sparse matrices in CSR storage format. The result is stored in a newly allocated sparse matrix in CSR format, such that\[ C = op(A) \, op(B), \]with\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{opA}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{opA}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{opA}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]and\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if } {\bf\mathsf{opB}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ B^T, & \text{if } {\bf\mathsf{opB}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ B^H, & \text{if } {\bf\mathsf{opB}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]where \(A\) is a \(m \times k\) matrix , \(B\) is a \(k \times n\) matrix, resulting in \(m \times n\) matrix \(C\), foropAandopB= aoclsparse_operation_none. \(A\) is a \(k \times m\) matrix whenopA= aoclsparse_operation_transpose or aoclsparse_operation_conjugate_transpose and \(B\) is a \(n \times k\) matrix whenopB= aoclsparse_operation_transpose or aoclsparse_operation_conjugate_transposeaoclsparse_sp2m can be run in single-stage or two-stage. The single-stage algorithm allocates and computes the entire output matrix in a single stage aoclsparse_stage_full_computation. Whereas, in two-stage algorithm, the first stage aoclsparse_stage_nnz_count allocates memory for the output matrix and computes the number of entries of the matrix. The second stage aoclsparse_stage_finalize computes column indices of non-zero elements and values of the output matrix. The second stage has to be invoked only after the first stage. But, it can be also be invoked multiple times consecutively when the sparsity structure of input matrices remains unchanged, with only the values getting updated.
1/* ************************************************************************ 2 * Copyright (c) 2023 Advanced Micro Devices, Inc. All rights reserved. 3 * 4 * Permission is hereby granted, free of charge, to any person obtaining a copy 5 * of this software and associated documentation files (the "Software"), to deal 6 * in the Software without restriction, including without limitation the rights 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 * copies of the Software, and to permit persons to whom the Software is 9 * furnished to do so, subject to the following conditions: 10 * 11 * The above copyright notice and this permission notice shall be included in 12 * all copies or substantial portions of the Software. 13 * 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 * THE SOFTWARE. 21 * 22 * ************************************************************************ */ 23 24#include "aoclsparse.h" 25 26#include <complex> 27#include <iomanip> 28#include <iostream> 29 30/* Computes multiplication of 2 sparse matrices A and B in CSR format. */ 31 32// Comment this out for single stage computation 33//#define TWO_STAGE_COMPUTATION 34 35int main(void) 36{ 37 std::cout << "-------------------------------" << std::endl 38 << "----- SP2M sample program -----" << std::endl 39 << "-------------------------------" << std::endl 40 << std::endl; 41 42 aoclsparse_status status; 43 aoclsparse_int nnz_C; 44 aoclsparse_request request; 45 aoclsparse_index_base base = aoclsparse_index_base_zero; 46 aoclsparse_operation opA = aoclsparse_operation_transpose; 47 aoclsparse_operation opB = aoclsparse_operation_none; 48 49 // Print aoclsparse version 50 std::cout << aoclsparse_get_version() << std::endl; 51 52 // Initialise matrix descriptor and csr matrix structure of inputs A and B 53 aoclsparse_mat_descr descrA; 54 aoclsparse_mat_descr descrB; 55 aoclsparse_matrix csrA; 56 aoclsparse_matrix csrB; 57 58 // Create matrix descriptor of input matrices 59 // aoclsparse_create_mat_descr set aoclsparse_matrix_type to aoclsparse_matrix_type_general 60 // and aoclsparse_index_base to aoclsparse_index_base_zero. 61 aoclsparse_create_mat_descr(&descrA); 62 aoclsparse_create_mat_descr(&descrB); 63 64 // Matrix sizes 65 aoclsparse_int m = 5, n = 5, k = 5; 66 aoclsparse_int nnz_A = 10, nnz_B = 10; 67 // Matrix A 68 aoclsparse_int row_ptr_A[] = {0, 1, 2, 5, 9, 10}; 69 aoclsparse_int col_ind_A[] = {0, 0, 1, 2, 4, 0, 1, 2, 3, 4}; 70 aoclsparse_double_complex val_A[] = {{-0.86238, 0.454626}, 71 {-2.62138, -0.442597}, 72 {-0.875679, 0.137933}, 73 {-0.661939, -1.09106}, 74 {0.0501717, -2.37527}, 75 {-1.48812, -0.420546}, 76 {-0.588085, -0.708977}, 77 {0.310933, -0.96569}, 78 {-0.88964, -2.37881}, 79 {-1.23201, 0.213152}}; 80 aoclsparse_create_zcsr( 81 &csrA, base, k, m, nnz_A, row_ptr_A, col_ind_A, (aoclsparse_double_complex *)val_A); 82 83 // Matrix B 84 aoclsparse_int row_ptr_B[] = {0, 4, 4, 7, 8, 10}; 85 aoclsparse_int col_ind_B[] = {0, 1, 2, 4, 0, 1, 2, 2, 2, 3}; 86 aoclsparse_double_complex val_B[] = {{-1.59204, -0.259325}, 87 {0.467532, -0.980612}, 88 {0.078412, -0.513591}, 89 {-1.52364, 0.403911}, 90 {0.211966, -1.33485}, 91 {-1.37901, -1.44562}, 92 {1.42472, -2.08662}, 93 {-2.26549, -1.0073}, 94 {-1.75098, 0.207783}, 95 {-1.8152, 0.482205}}; 96 97 aoclsparse_create_zcsr( 98 &csrB, base, k, n, nnz_B, row_ptr_B, col_ind_B, (aoclsparse_double_complex *)val_B); 99 100 aoclsparse_matrix csrC = NULL; 101 aoclsparse_int *csr_row_ptr_C = NULL; 102 aoclsparse_int *csr_col_ind_C = NULL; 103 aoclsparse_double_complex *csr_val_C = NULL; 104 aoclsparse_int C_M, C_N; 105 106 // expected output matrices 107 aoclsparse_int C_M_exp = 5, C_N_exp = 5, nnz_C_exp = 15; 108 aoclsparse_int csr_row_ptr_C_exp[] = {0, 4, 7, 10, 11, 15}; 109 aoclsparse_int csr_col_ind_C_exp[] = {0, 1, 2, 4, 0, 1, 2, 0, 1, 2, 2, 0, 1, 2, 3}; 110 aoclsparse_double_complex csr_val_C_exp[] = {{1.49084, -0.500145}, 111 {0.0426217, 1.05821}, 112 {3.11358, 2.93029}, 113 {1.13033, -1.04101}, 114 {-0.0014949, 1.19813}, 115 {1.40697, 1.07569}, 116 {-0.34163, 4.22229}, 117 {-1.59671, 0.652319}, 118 {-0.664445, 2.4615}, 119 {-4.89686, 1.70132}, 120 {-0.380707, 6.28532}, 121 {-3.15998, -0.570448}, 122 {-3.50293, 3.20298}, 123 {-2.77187, -4.11799}, 124 {2.13356, -0.980994}}; 125 126#ifdef TWO_STAGE_COMPUTATION 127 std::cout << "Invoking aoclsparse_sp2m with aoclsparse_stage_nnz_count..\n"; 128 // aoclsparse_stage_nnz_count : Only rowIndex array of the CSR matrix 129 // is computed internally. 130 request = aoclsparse_stage_nnz_count; 131 status = aoclsparse_sp2m(opA, descrA, csrA, opB, descrB, csrB, request, &csrC); 132 if(status != aoclsparse_status_success) 133 return 1; 134 135 std::cout << "Invoking aoclsparse_sp2m with aoclsparse_stage_finalize..\n"; 136 // aoclsparse_stage_finalize : Finalize computation of remaining 137 // output arrays ( column indices and values of output matrix entries) . 138 // Has to be called only after aoclsparse_sp2m call with 139 // aoclsparse_stage_nnz_count parameter. 140 request = aoclsparse_stage_finalize; 141 status = aoclsparse_sp2m(opA, descrA, csrA, opB, descrB, csrB, request, &csrC); 142 if(status != aoclsparse_status_success) 143 return 2; 144 145#else // SINGLE STAGE 146 std::cout << "Invoking aoclsparse_sp2m with aoclsparse_stage_full_computation..\n"; 147 // aoclsparse_stage_full_computation : Whole computation is performed in 148 // single step. 149 request = aoclsparse_stage_full_computation; 150 status = aoclsparse_sp2m(opA, descrA, csrA, opB, descrB, csrB, request, &csrC); 151 if(status != aoclsparse_status_success) 152 return 3; 153 154#endif 155 156 aoclsparse_export_zcsr( 157 csrC, &base, &C_M, &C_N, &nnz_C, &csr_row_ptr_C, &csr_col_ind_C, &csr_val_C); 158 // Check and print the result 159 std::cout << std::fixed; 160 std::cout.precision(1); 161 bool oka, okb, okc, oki, okj, okk, ok = true; 162 std::cout << std::endl 163 << "Output Matrix C: " << std::endl 164 << std::setw(11) << "C_M" << std::setw(3) << "" << std::setw(11) << "expected" 165 << std::setw(3) << "" << std::setw(11) << "C_N" << std::setw(3) << "" << std::setw(11) 166 << "expected" << std::setw(3) << "" << std::setw(11) << "nnz_C" << std::setw(3) << "" 167 << std::setw(11) << "expected" << std::endl; 168 oka = C_M == C_M_exp; 169 ok &= oka; 170 std::cout << std::setw(11) << C_M << std::setw(3) << "" << std::setw(11) << C_M_exp 171 << std::setw(2) << (oka ? "" : " !"); 172 okb = C_N == C_N_exp; 173 ok &= okb; 174 std::cout << std::setw(11) << C_N << std::setw(3) << "" << std::setw(11) << C_N_exp 175 << std::setw(2) << (okb ? "" : " !"); 176 okc = nnz_C == nnz_C_exp; 177 ok &= okc; 178 std::cout << std::setw(11) << nnz_C << std::setw(3) << "" << std::setw(11) << nnz_C_exp 179 << std::setw(2) << (okc ? "" : " !"); 180 std::cout << std::endl; 181 std::cout << std::endl; 182 std::cout << std::setw(11) << "csr_val_C" << std::setw(3) << "" << std::setw(11) << "expected" 183 << std::setw(3) << "" << std::setw(11) << "csr_col_ind_C" << std::setw(3) << "" 184 << std::setw(11) << "expected" << std::setw(3) << "" << std::setw(11) 185 << "csr_row_ptr_C" << std::setw(3) << "" << std::setw(11) << "expected" << std::endl; 186 //Initializing precision tolerance range for double 187 const double tol = 1e-03; 188 for(aoclsparse_int i = 0; i < nnz_C; i++) 189 { 190 oki = ((std::abs(csr_val_C[i].real - csr_val_C_exp[i].real) <= tol) 191 && (std::abs(csr_val_C[i].imag - csr_val_C_exp[i].imag) <= tol)); 192 ok &= oki; 193 std::cout << std::setw(11) << "(" << csr_val_C[i].real << ", " << csr_val_C[i].imag << "i) " 194 << std::setw(3) << "" << std::setw(11) << "(" << csr_val_C_exp[i].real << ", " 195 << csr_val_C_exp[i].imag << "i) " << std::setw(2) << (oki ? "" : " !"); 196 okj = csr_col_ind_C[i] == csr_col_ind_C_exp[i]; 197 ok &= okj; 198 std::cout << std::setw(11) << csr_col_ind_C[i] << std::setw(3) << "" << std::setw(11) 199 << csr_col_ind_C_exp[i] << std::setw(2) << (okj ? "" : " !"); 200 if(i < C_M) 201 { 202 okk = csr_row_ptr_C[i] == csr_row_ptr_C_exp[i]; 203 ok &= okk; 204 std::cout << " " << std::setw(11) << csr_row_ptr_C[i] << std::setw(3) << "" 205 << std::setw(11) << csr_row_ptr_C_exp[i] << std::setw(2) << (okk ? "" : " !"); 206 } 207 std::cout << std::endl; 208 } 209 210 aoclsparse_destroy_mat_descr(descrA); 211 aoclsparse_destroy_mat_descr(descrB); 212 aoclsparse_destroy(&csrA); 213 aoclsparse_destroy(&csrB); 214 aoclsparse_destroy(&csrC); 215 return (ok ? 0 : 6); 216}
- Parameters
opA – [in] matrix \(A\) operation type.
descrA – [in] descriptor of the sparse CSR matrix \(A\). Currently, only aoclsparse_matrix_type_general is supported.
A – [in] sparse CSR matrix \(A\) .
opB – [in] matrix \(B\) operation type.
descrB – [in] descriptor of the sparse CSR matrix \(B\). Currently, only aoclsparse_matrix_type_general is supported.
B – [in] sparse CSR matrix \(B\) .
request – [in] Specifies full computation or two-stage algorithm aoclsparse_stage_nnz_count , Only rowIndex array of the CSR matrix is computed internally. The output sparse CSR matrix can be extracted to measure the memory required for full operation. aoclsparse_stage_finalize . Finalize computation of remaining output arrays ( column indices and values of output matrix entries) . Has to be called only after aoclsparse_sp2m call with aoclsparse_stage_nnz_count parameter. aoclsparse_stage_full_computation . Perform the entire computation in a single step.
*C – [out] Pointer to sparse CSR matrix \(C\) . Matrix \(C\) arrays will always have zero-based indexing, irrespective of matrix \(A\) or matrix \(B\) being one-based or zero-based indexing. The column indices of the output matrix in CSR format can appear unsorted.
- Return values
aoclsparse_status_success – the operation completed successfully.
aoclsparse_status_invalid_pointer –
descrA,descrB,A,B,Cis invalid.aoclsparse_status_invalid_size – input size parameters contain an invalid value.
aoclsparse_status_invalid_value – input parameters contain an invalid value.
aoclsparse_status_wrong_type – A and B matrix datatypes dont match.
aoclsparse_status_memory_error – Memory allocation failure.
aoclsparse_status_not_implemented – aoclsparse_matrix_type is not aoclsparse_matrix_type_general or input matrices
AorBis not in CSR format
aoclsparse_spmm()#
-
aoclsparse_status aoclsparse_spmm(aoclsparse_operation opA, const aoclsparse_matrix A, const aoclsparse_matrix B, aoclsparse_matrix *C)#
Sparse matrix Sparse matrix multiplication for real and complex datatypes.
aoclsparse_?spmmmultiplies two sparse matrices in CSR storage format. The result is stored in a newly allocated sparse matrix in CSR format, such that\[\begin{split}C = op(A) \cdot B, \text{ with } op(A) = \left\{ \begin{array}{ll} A, & \text{ if } {\bf\mathsf{opA}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{ if } {\bf\mathsf{opA}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{ if } {\bf\mathsf{opA}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right.\end{split}\]where \(A\) is a \(m \times k\) matrix , \(B\) is a \(k \times n\) matrix, resulting in \(m \times n\) matrix \(C\), foropA= aoclsparse_operation_none. \(A\) is a \(k \times m\) matrix when opA = aoclsparse_operation_transpose or aoclsparse_operation_conjugate_transpose- Parameters
opA – [in] matrix \(A\) operation type.
A – [in] sparse CSR matrix \(A\).
B – [in] sparse CSR matrix \(B\).
*C – [out] Pointer to sparse CSR matrix \(C\) . Matrix \(C\) arrays will always have zero-based indexing, irrespective of matrix \(A\) or matrix \(B\) being one-based or zero-based indexing. The column indices of the output matrix in CSR format can appear unsorted.
- Return values
aoclsparse_status_success – the operation completed successfully.
aoclsparse_status_invalid_pointer –
A,B,Cis invalid.aoclsparse_status_invalid_size – input size parameters contain an invalid value.
aoclsparse_status_invalid_value – input parameters contain an invalid value.
aoclsparse_status_wrong_type –
AandBmatrix data types do not match.aoclsparse_status_memory_error – Memory allocation failure.
aoclsparse_status_not_implemented – Input matrices
AorBis not in CSR format
aoclsparse_?csrmm()#
-
aoclsparse_status aoclsparse_scsrmm(aoclsparse_operation op, const float alpha, const aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const float *B, aoclsparse_int n, aoclsparse_int ldb, const float beta, float *C, aoclsparse_int ldc)#
-
aoclsparse_status aoclsparse_dcsrmm(aoclsparse_operation op, const double alpha, const aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const double *B, aoclsparse_int n, aoclsparse_int ldb, const double beta, double *C, aoclsparse_int ldc)#
-
aoclsparse_status aoclsparse_ccsrmm(aoclsparse_operation op, const aoclsparse_float_complex alpha, const aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const aoclsparse_float_complex *B, aoclsparse_int n, aoclsparse_int ldb, const aoclsparse_float_complex beta, aoclsparse_float_complex *C, aoclsparse_int ldc)#
-
aoclsparse_status aoclsparse_zcsrmm(aoclsparse_operation op, const aoclsparse_double_complex alpha, const aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const aoclsparse_double_complex *B, aoclsparse_int n, aoclsparse_int ldb, const aoclsparse_double_complex beta, aoclsparse_double_complex *C, aoclsparse_int ldc)#
Sparse matrix dense matrix multiplication using CSR storage format.
aoclsparse_?csrmmmultiplies a scalar \(\alpha\) with a sparse \(m \times k\) matrix \(A\), defined in CSR storage format, and a dense \(k \times n\) matrix \(B\) and adds the result to the dense \(m \times n\) matrix \(C\) that is multiplied by a scalar \(\beta\), such that\[\begin{split} C = \alpha \, op(A) \, B + \beta \, C, \quad \text{ with } \quad op(A) = \left\{ \begin{array}{ll} A, & \text{ if } {\bf\mathsf{trans}\_\mathsf{A}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{ if } {\bf\mathsf{trans}\_\mathsf{A}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{ if } {\bf\mathsf{trans}\_\mathsf{A}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]1/* ************************************************************************ 2 * Copyright (c) 2023 Advanced Micro Devices, Inc. 3 * 4 * Permission is hereby granted, free of charge, to any person obtaining a copy 5 * of this software and associated documentation files (the "Software"), to deal 6 * in the Software without restriction, including without limitation the rights 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 * copies of the Software, and to permit persons to whom the Software is 9 * furnished to do so, subject to the following conditions: 10 * 11 * The above copyright notice and this permission notice shall be included in 12 * all copies or substantial portions of the Software. 13 * 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 * THE SOFTWARE. 21 * 22 * ************************************************************************ */ 23 24#include "aoclsparse.h" 25 26#include <complex> 27#include <cstring> 28#include <iomanip> 29#include <iostream> 30#include <limits> 31#include <vector> 32 33// Example to illustrate the usage of aoclsparse_dcsrmm API 34int main(void) 35{ 36 37 aoclsparse_index_base base = aoclsparse_index_base_zero; 38 aoclsparse_matrix A; 39 aoclsparse_status status; 40 aoclsparse_order order = aoclsparse_order_row; 41 aoclsparse_mat_descr descr; 42 43 // By default aoclsparse_create_mat_descr sets aoclsparse_matrix_type to aoclsparse_matrix_type_general 44 status = aoclsparse_create_mat_descr(&descr); 45 if(status != aoclsparse_status_success) 46 return status; 47 48 // Set aoclsparse_index_base to aoclsparse_index_base_zero. 49 status = aoclsparse_set_mat_index_base(descr, base); 50 if(status != aoclsparse_status_success) 51 return status; 52 53 // Matrix sizes 54 aoclsparse_int m = 3, n = 3, k = 3; 55 aoclsparse_int nnz = 4; 56 // Matrix A 57 // [ 0. 42. 0.2] 58 // [ 4.6 0. 0. ] 59 // [ 0. 0. -8. ] 60 aoclsparse_int row_ptr[] = {0, 2, 3, 4}; 61 aoclsparse_int col_ind[] = {1, 2, 0, 2}; 62 double csr_val[] = {42., 0.2, 4.6, -8}; 63 64 //Dense Matrix B 65 double B[] = {-1.0, -2.7, 3.0, 4.5, 5.8, -6.0, 1.0, -2.0, 3.0}; 66 double alpha = 1, beta = 0; 67 status = aoclsparse_create_dcsr(&A, base, m, k, nnz, row_ptr, col_ind, csr_val); 68 if(status != aoclsparse_status_success) 69 { 70 return status; 71 } 72 73 // expected output matrices 74 double C_exp[] = {189.2, 243.2, -251.4, -4.6, -12.42, 13.8, -8, 16, -24}; 75 76 // output matrices; 77 double C[9] = {0}; 78 aoclsparse_operation op = aoclsparse_operation_none; 79 80 status = aoclsparse_dcsrmm(op, alpha, A, descr, order, B, n, k, beta, C, n); 81 if(status != aoclsparse_status_success) 82 { 83 return status; 84 } 85 86 std::cout << std::fixed; 87 std::cout.precision(2); 88 bool okij, ok_C = true; 89 //Initializing precision tolerance range for double 90 const double tol = std::sqrt((std::numeric_limits<double>::epsilon())); 91 92 aoclsparse_int i, j; 93 std::cout << "The output C matrix\n"; 94 for(i = 0; i < m; i++) 95 { 96 for(j = 0; j < n; j++) 97 { 98 okij = std::abs(C[i * n + j] - C_exp[i * n + j]) <= tol; 99 ok_C &= okij; 100 std::cout << std::setw(10) << C[i * n + j] << std::setw(3) << (okij ? "" : " !"); 101 } 102 std::cout << "\n"; 103 } 104 aoclsparse_destroy_mat_descr(descr); 105 aoclsparse_destroy(&A); 106 107 return ((ok_C ? 0 : 4)); 108}
- Parameters
op – [in] Matrix \(A\) operation type.
alpha – [in] Scalar \(\alpha\).
A – [in] Sparse CSR matrix \(A\) structure.
descr – [in] descriptor of the sparse CSR matrix \(A\). Currently, supports aoclsparse_matrix_type_general, aoclsparse_matrix_type_symmetric, and aoclsparse_matrix_type_hermitian matrices. Both, base-zero and base-one input arrays of CSR matrix are supported.
order – [in] aoclsparse_order_row / aoclsparse_order_column for dense matrix
B – [in] Array of dimension \(ldb \times n\) or \(ldb \times k\) .
n – [in] Number of columns of the dense matrix \(B\) and \(C\).
ldb – [in] Leading dimension of \(B\), must be at least \(\max{(1, k)}\) for \(op(A) = A\), or \(\max{(1, m)}\) when \(op(A) = A^T\) or \(op(A) = A^H\).
beta – [in] Scalar \(\beta\).
C – [inout] Array of dimension \(ldc \times n\).
ldc – [in] Leading dimension of \(C\), must be at least \(\max{(1, m)}\) for \(op(A) = A\), or \(\max{(1, k)}\) when \(op(A) = A^T\) or \(op(A) = A^H\).
- Return values
aoclsparse_status_success – The operation completed successfully.
aoclsparse_status_invalid_size – The value of
m,n,k,nnz,ldborldcis invalid.aoclsparse_status_invalid_pointer – The pointer
descr,A,B, orCis invalid.aoclsparse_status_invalid_value – The values of
descr->baseandA->basedo not coincide.aoclsparse_status_not_implemented – aoclsparse_matrix_type is not one of these: aoclsparse_matrix_type_general, aoclsparse_matrix_type_symmetric, aoclsparse_matrix_type_hermitian or input matrix
Ais not in CSR format
aoclsparse_?csr2m()#
-
aoclsparse_status aoclsparse_dcsr2m(aoclsparse_operation trans_A, const aoclsparse_mat_descr descrA, const aoclsparse_matrix csrA, aoclsparse_operation trans_B, const aoclsparse_mat_descr descrB, const aoclsparse_matrix csrB, const aoclsparse_request request, aoclsparse_matrix *csrC)#
-
aoclsparse_status aoclsparse_scsr2m(aoclsparse_operation trans_A, const aoclsparse_mat_descr descrA, const aoclsparse_matrix csrA, aoclsparse_operation trans_B, const aoclsparse_mat_descr descrB, const aoclsparse_matrix csrB, const aoclsparse_request request, aoclsparse_matrix *csrC)#
Sparse matrix Sparse matrix multiplication using CSR storage format for single and double precision datatypes.
aoclsparse_?csr2mmultiplies a sparse \(m \times k\) matrix \(A\), defined in CSR storage format, and the sparse \(k \times n\) matrix \(B\), defined in CSR storage format and stores the result to the sparse \(m \times n\) matrix \(C\), such that\[ C = op(A) \cdot op(B), \]with\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{trans}\_\mathsf{A}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{trans}\_\mathsf{A}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{trans}\_\mathsf{A}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]and\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if } {\bf\mathsf{trans}\_\mathsf{B}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ B^T, & \text{if } {\bf\mathsf{trans}\_\mathsf{B}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ B^H, & \text{if } {\bf\mathsf{trans}\_\mathsf{B}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]1/* ************************************************************************ 2 * Copyright (c) 2021-2023 Advanced Micro Devices, Inc. All rights reserved. 3 * 4 * Permission is hereby granted, free of charge, to any person obtaining a copy 5 * of this software and associated documentation files (the "Software"), to deal 6 * in the Software without restriction, including without limitation the rights 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 * copies of the Software, and to permit persons to whom the Software is 9 * furnished to do so, subject to the following conditions: 10 * 11 * The above copyright notice and this permission notice shall be included in 12 * all copies or substantial portions of the Software. 13 * 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 * THE SOFTWARE. 21 * 22 * ************************************************************************ */ 23 24#include "aoclsparse.h" 25 26#include <cstdlib> 27#include <cstring> 28#include <iostream> 29 30/* Computes multiplication of 2 sparse matrices A and B in CSR format. */ 31 32// Comment this out for single stage computation 33#define TWO_STAGE_COMPUTATION 34 35#define PRINT_OUTPUT 36 37int main(void) 38{ 39 aoclsparse_status status; 40 aoclsparse_int nnz_C; 41 aoclsparse_request request; 42 aoclsparse_index_base base = aoclsparse_index_base_zero; 43 aoclsparse_operation transA = aoclsparse_operation_none; 44 aoclsparse_operation transB = aoclsparse_operation_none; 45 46 // Print aoclsparse version 47 std::cout << aoclsparse_get_version() << std::endl; 48 49 // Initialise matrix descriptor and csr matrix structure of inputs A and B 50 aoclsparse_mat_descr descrA; 51 aoclsparse_mat_descr descrB; 52 aoclsparse_matrix csrA; 53 aoclsparse_matrix csrB; 54 55 // Create matrix descriptor of input matrices 56 // aoclsparse_create_mat_descr set aoclsparse_matrix_type to aoclsparse_matrix_type_general 57 // and aoclsparse_index_base to aoclsparse_index_base_zero. 58 aoclsparse_create_mat_descr(&descrA); 59 aoclsparse_create_mat_descr(&descrB); 60 61 // Matrix sizes 62 aoclsparse_int m = 3, n = 3, k = 3; 63 aoclsparse_int nnz_A = 6, nnz_B = 4; 64 // Matrix A 65 // 1 0 2 66 // 0 0 3 67 // 4 5 6 68 aoclsparse_int row_ptr_A[] = {0, 2, 3, 6}; 69 aoclsparse_int col_ind_A[] = {0, 2, 2, 0, 1, 2}; 70 float val_A[] = {1, 2, 3, 4, 5, 6}; 71 aoclsparse_create_scsr(&csrA, base, m, k, nnz_A, row_ptr_A, col_ind_A, val_A); 72 73 // Matrix B 74 // 1 2 0 75 // 0 0 3 76 // 0 4 0 77 aoclsparse_int row_ptr_B[] = {0, 2, 3, 4}; 78 aoclsparse_int col_ind_B[] = {0, 1, 2, 1}; 79 float val_B[] = {1, 2, 3, 4}; 80 aoclsparse_create_scsr(&csrB, base, k, n, nnz_B, row_ptr_B, col_ind_B, val_B); 81 82 aoclsparse_matrix csrC = NULL; 83 aoclsparse_int *csr_row_ptr_C = NULL; 84 aoclsparse_int *csr_col_ind_C = NULL; 85 float *csr_val_C = NULL; 86 aoclsparse_int C_M, C_N; 87 88#ifdef TWO_STAGE_COMPUTATION 89 std::cout << "Invoking aoclsparse_scsr2m with aoclsparse_stage_nnz_count.."; 90 // aoclsparse_stage_nnz_count : Only rowIndex array of the CSR matrix 91 // is computed internally. The output sparse CSR matrix can be 92 // extracted to measure the memory required for full operation. 93 request = aoclsparse_stage_nnz_count; 94 status = aoclsparse_scsr2m(transA, descrA, csrA, transB, descrB, csrB, request, &csrC); 95 if(status == aoclsparse_status_success) 96 std::cout << "DONE\n"; 97 else 98 { 99 std::cout << "ERROR in aoclsparse_scsr2m\n"; 100 exit(EXIT_FAILURE); 101 } 102 103 aoclsparse_export_scsr( 104 csrC, &base, &C_M, &C_N, &nnz_C, &csr_row_ptr_C, &csr_col_ind_C, &csr_val_C); 105 106#if 0 107 std::cout << "C_M" 108 << "\t" 109 << "C_N" 110 << "\t" 111 << "nnz_C" << std::endl; 112 std::cout << C_M << "\t" << C_N << "\t" << nnz_C << std::endl; 113 114 std::cout << "csr_row_ptr_C: " << std::endl; 115 for(aoclsparse_int i = 0; i < C_M + 1; i++) 116 std::cout << csr_row_ptr_C[i] << "\t"; 117 std::cout << std::endl; 118#endif 119 120 std::cout << "Invoking aoclsparse_scsr2m with aoclsparse_stage_finalize.."; 121 // aoclsparse_stage_finalize : Finalize computation of remaining 122 // output arrays ( column indices and values of output matrix entries) . 123 // Has to be called only after aoclsparse_scsr2m call with 124 // aoclsparse_stage_nnz_count parameter. 125 request = aoclsparse_stage_finalize; 126 status = aoclsparse_scsr2m(transA, descrA, csrA, transB, descrB, csrB, request, &csrC); 127 if(status == aoclsparse_status_success) 128 std::cout << "DONE\n"; 129 else 130 { 131 std::cout << "ERROR in aoclsparse_scsr2m\n"; 132 exit(EXIT_FAILURE); 133 } 134 aoclsparse_export_scsr( 135 csrC, &base, &C_M, &C_N, &nnz_C, &csr_row_ptr_C, &csr_col_ind_C, &csr_val_C); 136 137#ifdef PRINT_OUTPUT 138 std::cout << "csr_col_ind_C: " << std::endl; 139 for(aoclsparse_int i = 0; i < nnz_C; i++) 140 std::cout << csr_col_ind_C[i] << "\t"; 141 std::cout << std::endl; 142 143 std::cout << "csr_val_C: " << std::endl; 144 for(aoclsparse_int i = 0; i < nnz_C; i++) 145 std::cout << csr_val_C[i] << "\t"; 146 std::cout << std::endl; 147#endif 148 149#else // SINGLE STAGE 150 std::cout << "Invoking aoclsparse_scsr2m with aoclsparse_stage_full_computation.."; 151 // aoclsparse_stage_full_computation : Whole computation is performed in 152 // single step. 153 request = aoclsparse_stage_full_computation; 154 status = aoclsparse_scsr2m(transA, descrA, csrA, transB, descrB, csrB, request, &csrC); 155 if(status == aoclsparse_status_success) 156 std::cout << "DONE\n"; 157 else 158 { 159 std::cout << "ERROR in aoclsparse_scsr2m\n"; 160 exit(EXIT_FAILURE); 161 } 162 163 aoclsparse_export_scsr( 164 csrC, &base, &C_M, &C_N, &nnz_C, &csr_row_ptr_C, &csr_col_ind_C, &csr_val_C); 165 166#ifdef PRINT_OUTPUT 167 std::cout << "C_M" 168 << "\t" 169 << "C_N" 170 << "\t" 171 << "nnz_C" << std::endl; 172 std::cout << C_M << "\t" << C_N << "\t" << nnz_C << std::endl; 173 174 std::cout << "csr_row_ptr_C: " << std::endl; 175 for(aoclsparse_int i = 0; i < C_M + 1; i++) 176 std::cout << csr_row_ptr_C[i] << "\t"; 177 std::cout << std::endl; 178 179 std::cout << "csr_col_ind_C: " << std::endl; 180 for(aoclsparse_int i = 0; i < nnz_C; i++) 181 std::cout << csr_col_ind_C[i] << "\t"; 182 std::cout << std::endl; 183 184 std::cout << "csr_val_C: " << std::endl; 185 for(aoclsparse_int i = 0; i < nnz_C; i++) 186 std::cout << csr_val_C[i] << "\t"; 187 std::cout << std::endl; 188#endif 189 190#endif 191 aoclsparse_destroy_mat_descr(descrA); 192 aoclsparse_destroy_mat_descr(descrB); 193 aoclsparse_destroy(&csrA); 194 aoclsparse_destroy(&csrB); 195 aoclsparse_destroy(&csrC); 196 return 0; 197}
- Parameters
trans_A – [in] matrix \(A\) operation type.
descrA – [in] descriptor of the sparse CSR matrix \(A\). Currently, only aoclsparse_matrix_type_general is supported.
csrA – [in] sparse CSR matrix \(A\) structure.
trans_B – [in] matrix \(B\) operation type.
descrB – [in] descriptor of the sparse CSR matrix \(B\). Currently, only aoclsparse_matrix_type_general is supported.
csrB – [in] sparse CSR matrix \(B\) structure.
request – [in] Specifies full computation or two-stage algorithm aoclsparse_stage_nnz_count , Only rowIndex array of the CSR matrix is computed internally. The output sparse CSR matrix can be extracted to measure the memory required for full operation. aoclsparse_stage_finalize . Finalize computation of remaining output arrays ( column indices and values of output matrix entries) . Has to be called only after aoclsparse_dcsr2m() call with
aoclsparse_stage_nnz_countparameter. aoclsparse_stage_full_computation. Perform the entire computation in a single step.*csrC – [out] Pointer to sparse CSR matrix \(C\) structure.
- Return values
aoclsparse_status_success – the operation completed successfully.
aoclsparse_status_invalid_size – input parameters contain an invalid value.
aoclsparse_status_invalid_pointer –
descrA,csr,descrB,csrB,csrCis invalid.aoclsparse_status_not_implemented – aoclsparse_matrix_type is not aoclsparse_matrix_type_general or input matrices
AorBis not in CSR format
aoclsparse_?add()#
-
aoclsparse_status aoclsparse_sadd(const aoclsparse_operation op, const aoclsparse_matrix A, const float alpha, const aoclsparse_matrix B, aoclsparse_matrix *C)#
-
aoclsparse_status aoclsparse_dadd(const aoclsparse_operation op, const aoclsparse_matrix A, const double alpha, const aoclsparse_matrix B, aoclsparse_matrix *C)#
-
aoclsparse_status aoclsparse_cadd(const aoclsparse_operation op, const aoclsparse_matrix A, const aoclsparse_float_complex alpha, const aoclsparse_matrix B, aoclsparse_matrix *C)#
-
aoclsparse_status aoclsparse_zadd(const aoclsparse_operation op, const aoclsparse_matrix A, const aoclsparse_double_complex alpha, const aoclsparse_matrix B, aoclsparse_matrix *C)#
Addition of two sparse matrices.
aoclsparse_?addadds two sparse matrices and returns a sparse matrix. Matrices can be either real or complex types but cannot be intermixed. It performs\[\begin{split} C = \alpha \, op(A) + B \qquad\text{ with } \qquad op(A) = \left\{ \begin{array}{ll} A, & \text{ if } {\bf\mathsf{op}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{ if } {\bf\mathsf{op}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{ if } {\bf\mathsf{op}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]where \(A\) is a \(m \times n\) matrix and \(B\) is a \(m \times n\) matrix, ifop= aoclsparse_operation_none. Otherwise \(A\) is \(n \times m\) and the result matrix \(C\) has the same dimension as \(B\).Note
Only matrices in CSR format are supported in this release.
- Parameters
op – [in] matrix \(A\) operation type.
alpha – [in] scalar with same precision as \(A\) and \(B\) matrix
A – [in] source sparse matrix \(A\)
B – [in] source sparse matrix \(B\)
*C – [out] pointer to the sparse output matrix \(C\)
- Return values
aoclsparse_status_success – The operation completed successfully.
aoclsparse_status_invalid_pointer –
AorBorCare invalidaoclsparse_status_invalid_size – The dimensions of
AandBare not compatible.aoclsparse_status_internal_error – Internal Error Occured
aoclsparse_status_memory_error – Memory allocation failure.
aoclsparse_status_not_implemented – Matrices are not in CSR format.
aoclsparse_?spmmd()#
-
aoclsparse_status aoclsparse_sspmmd(const aoclsparse_operation op, const aoclsparse_matrix A, const aoclsparse_matrix B, const aoclsparse_order layout, float *C, const aoclsparse_int ldc)#
-
aoclsparse_status aoclsparse_dspmmd(const aoclsparse_operation op, const aoclsparse_matrix A, const aoclsparse_matrix B, const aoclsparse_order layout, double *C, const aoclsparse_int ldc)#
-
aoclsparse_status aoclsparse_cspmmd(const aoclsparse_operation op, const aoclsparse_matrix A, const aoclsparse_matrix B, const aoclsparse_order layout, aoclsparse_float_complex *C, const aoclsparse_int ldc)#
-
aoclsparse_status aoclsparse_zspmmd(const aoclsparse_operation op, const aoclsparse_matrix A, const aoclsparse_matrix B, const aoclsparse_order layout, aoclsparse_double_complex *C, const aoclsparse_int ldc)#
Matrix multiplication of two sparse matrices stored in the CSR storage format. The output matrix is stored in a dense format.
aoclsparse_?spmmdmultiplies a sparse matrix \(A\) and a sparse matrix \(B\), both stored in the CSR storage format, and saves the result in a dense matrix \(C\), such that\[ C := op(A) \cdot B, \]with\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if op} = \text{aoclsparse\_operation\_none} \\ A^T, & \text{if op} = \text{aoclsparse\_operation\_transpose} \\ A^H, & \text{if op} = \text{aoclsparse\_operation\_conjugate\_transpose} \end{array} \right. \end{split}\]- Parameters
op – [in] Operation to perform on matrix \(A\).
A – [in] Matrix structure containing sparse matrix \(A\) of size \(m \times k\).
B – [in] Matrix structure containing sparse matrix \(B\) of size \(k \times n\) if
opis aoclsparse_operation_none otherwise of size \(m \times n\).layout – [in] Ordering of the dense output matrix: valid values are aoclsparse_order_row and aoclsparse_order_column.
C – [inout] Dense output matrix \(C\) of size \(m \times n\) if
opis aoclsparse_operation_none, otherwise of size \(k \times n\) containing the matrix-matrix product of \(A\) and \(B\).ldc – [in] Leading dimension of \(C\), e.g., for C stored in
aoclsparse_order_row,ldcmust be at least \(\max{(1, m)}\) when \(op(A) = A\), or \(\max{(1, k)}\) if \(op(A) = A^T\) or \(op(A) = A^H\).
- Return values
aoclsparse_status_success – The operation completed successfully.
aoclsparse_status_invalid_size –
m,n,k,nnzorldcis not valid.aoclsparse_status_invalid_pointer –
A,BorCpointer is not valid.aoclsparse_status_wrong_type – aoclsparse_matrix_data_type does not match the precision type.
aoclsparse_status_not_implemented – aoclsparse_matrix_format_type is not aoclsparse_csr_mat.
aoclsparse_?sp2md()#
-
aoclsparse_status aoclsparse_ssp2md(const aoclsparse_operation opA, const aoclsparse_mat_descr descrA, const aoclsparse_matrix A, const aoclsparse_operation opB, const aoclsparse_mat_descr descrB, const aoclsparse_matrix B, const float alpha, const float beta, float *C, const aoclsparse_order layout, const aoclsparse_int ldc)#
-
aoclsparse_status aoclsparse_dsp2md(const aoclsparse_operation opA, const aoclsparse_mat_descr descrA, const aoclsparse_matrix A, const aoclsparse_operation opB, const aoclsparse_mat_descr descrB, const aoclsparse_matrix B, const double alpha, const double beta, double *C, const aoclsparse_order layout, const aoclsparse_int ldc)#
-
aoclsparse_status aoclsparse_csp2md(const aoclsparse_operation opA, const aoclsparse_mat_descr descrA, const aoclsparse_matrix A, const aoclsparse_operation opB, const aoclsparse_mat_descr descrB, const aoclsparse_matrix B, aoclsparse_float_complex alpha, aoclsparse_float_complex beta, aoclsparse_float_complex *C, const aoclsparse_order layout, const aoclsparse_int ldc)#
-
aoclsparse_status aoclsparse_zsp2md(const aoclsparse_operation opA, const aoclsparse_mat_descr descrA, const aoclsparse_matrix A, const aoclsparse_operation opB, const aoclsparse_mat_descr descrB, const aoclsparse_matrix B, aoclsparse_double_complex alpha, aoclsparse_double_complex beta, aoclsparse_double_complex *C, const aoclsparse_order layout, const aoclsparse_int ldc)#
A variant of matrix multiplication of two sparse matrices stored in the CSR storage format. The output matrix is stored in a dense format. Supports operations on both sparse matrices.
aoclsparse_?sp2mdmultiplies a sparse matrix \(A\) and a sparse matrix \(B\), both stored in the CSR storage format, and saves the result in a dense matrix \(C\), such that\[ C := \alpha \cdot op(A) \cdot op(B) + \beta \cdot C, \]with\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if opA} = \text{aoclsparse\_operation\_none} \\ A^T, & \text{if opA} = \text{aoclsparse\_operation\_transpose} \\ A^H, & \text{if opA} = \text{aoclsparse\_operation\_conjugate\_transpose} \end{array} \right. \end{split}\]and\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if opB} = \text{aoclsparse\_operation\_none} \\ B^T, & \text{if opB} = \text{aoclsparse\_operation\_transpose} \\ B^H, & \text{if opB} = \text{aoclsparse\_operation\_conjugate\_transpose} \end{array} \right. \end{split}\]- Parameters
opA – [in] Operation to perform on matrix \(A\).
descrA – [in] Descriptor of A. Only aoclsparse_matrix_type_general is supported at present. As a consequence, all other parameters within the descriptor are ignored.
A – [in] Matrix structure containing sparse matrix \(A\) of size \(m \times k\).
opB – [in] Operation to perform on matrix \(B\).
descrB – [in] Descriptor of B. Only aoclsparse_matrix_type_general is supported at present. As a consequence, all other parameters within the descriptor are ignored.
B – [in] Matrix structure containing sparse matrix \(B\) of size \(k \times n\) if
opis aoclsparse_operation_none otherwise of size \(m \times n\).alpha – [in] Value of \( \alpha\).
beta – [in] Value of \( \beta\).
C – [inout] Dense output matrix \(C\).
layout – [in] Ordering of the dense output matrix: valid values are aoclsparse_order_row and aoclsparse_order_column.
ldc – [in] Leading dimension of \(C\), e.g., for C stored in
aoclsparse_order_row,ldcmust be at least \(\max{(1, m)}\) ( \(op(A) = A\)) or \(\max{(1, k)}\) ( \(op(A) = A^T\) or \(op(A) = A^H\)).
- Return values
aoclsparse_status_success – The operation completed successfully.
aoclsparse_status_invalid_size –
m,n,k,nnzorldcis not valid.aoclsparse_status_invalid_pointer –
A,BorCpointer is not valid.aoclsparse_status_wrong_type – aoclsparse_matrix_data_type does not match the precision type.
aoclsparse_status_not_implemented – aoclsparse_matrix_format_type is not aoclsparse_csr_mat.
aoclsparse_status_internal_error – An internal error occurred.
aoclsparse_syrk()#
-
aoclsparse_status aoclsparse_syrk(const aoclsparse_operation opA, const aoclsparse_matrix A, aoclsparse_matrix *C)#
Multiplication of a sparse matrix and its transpose (or conjugate transpose) stored as a sparse matrix.
aoclsparse_syrkmultiplies a sparse matrix with its transpose (or conjugate transpose) in CSR storage format. The result is stored in a newly allocated sparse matrix in CSR format, such that\[ C := A \cdot op(A) \]if \(opA\) is aoclsparse_operation_none.Otherwise,
\[ C := op(A) \cdot A, \]where
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A^T, & \text{transpose of } {\bf\mathsf{A}} \text{ for real matrices}\\ A^H, & \text{conjugate transpose of } {\bf\mathsf{A}} \text{ for complex matrices}\\ \end{array} \right. \end{split}\]where \(A\) is a \(m \times n\) matrix, opA is one of aoclsparse_operation_none, aoclsparse_operation_transpose (for real matrices) or aoclsparse_operation_conjugate_transpose (for complex matrices). The output matrix \(C\) is a sparse symmetric (or Hermitian) matrix stored as an upper triangular matrix in CSR format.
1/* ************************************************************************ 2 * Copyright (c) 2024 Advanced Micro Devices, Inc. 3 * 4 * Permission is hereby granted, free of charge, to any person obtaining a copy 5 * of this software and associated documentation files (the "Software"), to deal 6 * in the Software without restriction, including without limitation the rights 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 * copies of the Software, and to permit persons to whom the Software is 9 * furnished to do so, subject to the following conditions: 10 * 11 * The above copyright notice and this permission notice shall be included in 12 * all copies or substantial portions of the Software. 13 * 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 * THE SOFTWARE. 21 * 22 * ************************************************************************ */ 23 24#include "aoclsparse.h" 25 26#include <complex> 27#include <cstring> 28#include <iomanip> 29#include <iostream> 30#include <limits> 31#include <vector> 32 33// An example to illustrate the usage of aoclsparse_syrk 34int main(void) 35{ 36 std::cout << "-------------------------------" << std::endl 37 << "----- syrk sample program -----" << std::endl 38 << "-------------------------------" << std::endl 39 << std::endl; 40 41 aoclsparse_matrix A; 42 aoclsparse_matrix C; 43 aoclsparse_status status; 44 aoclsparse_index_base base = aoclsparse_index_base_zero; 45 aoclsparse_int *csr_row_ptr_C = NULL; 46 aoclsparse_int *csr_col_ind_C = NULL; 47 double *csr_val_C = NULL; 48 aoclsparse_int C_m, C_n, C_nnz; 49 50 // Matrix sizes 51 aoclsparse_int m = 4, k = 3; 52 aoclsparse_int nnz = 7; 53 // Matrix A 54 // [ 0. -1.2 2.3] 55 // [ 4.6 0. 0. ] 56 // [ 0. 3.0 -8.1 ] 57 // [ 0.3 0. -5.1 ] 58 aoclsparse_int row_ptr[] = {0, 2, 3, 5, 7}; 59 aoclsparse_int col_ind[] = {1, 2, 0, 1, 2, 0, 2}; 60 double csr_val[] = {-1.2, 2.3, 4.6, 3.0, -8.1, 0.3, -5.1}; 61 aoclsparse_operation op = aoclsparse_operation_none; 62 63 status = aoclsparse_create_dcsr(&A, base, m, k, nnz, row_ptr, col_ind, csr_val); 64 if(status != aoclsparse_status_success) 65 { 66 return status; 67 } 68 69 // expected output matrices 70 aoclsparse_int C_m_exp = 4, C_n_exp = 4, C_nnz_exp = 8; 71 aoclsparse_int csr_row_ptr_C_exp[] = {0, 3, 5, 7, 8}; 72 aoclsparse_int csr_col_ind_C_exp[] = {0, 2, 3, 1, 3, 2, 3, 3}; 73 double csr_val_C_exp[] 74 = {6.73000, -22.23000, -11.73000, 21.16000, 1.38000, 74.61000, 41.31000, 26.10000}; 75 76 // output matrices; 77 78 status = aoclsparse_syrk(op, A, &C); 79 if(status != aoclsparse_status_success) 80 { 81 return status; 82 } 83 aoclsparse_export_dcsr( 84 C, &base, &C_m, &C_n, &C_nnz, &csr_row_ptr_C, &csr_col_ind_C, &csr_val_C); 85 86 bool oka, okb, okc, oki, okj, okk, ok = true; 87 std::cout << std::endl 88 << "Output Matrix C: " << std::endl 89 << std::setw(11) << "C_m" << std::setw(3) << "" << std::setw(11) << "expected" 90 << std::setw(3) << "" << std::setw(11) << "C_n" << std::setw(3) << "" << std::setw(11) 91 << "expected" << std::setw(3) << "" << std::setw(11) << "C_nnz" << std::setw(3) << "" 92 << std::setw(11) << "expected" << std::endl; 93 oka = C_m == C_m_exp; 94 ok &= oka; 95 std::cout << std::setw(11) << C_m << std::setw(3) << "" << std::setw(11) << C_m_exp 96 << std::setw(2) << (oka ? "" : " !"); 97 okb = C_n == C_n_exp; 98 ok &= okb; 99 std::cout << std::setw(11) << C_n << std::setw(3) << "" << std::setw(11) << C_n_exp 100 << std::setw(2) << (okb ? "" : " !"); 101 okc = C_nnz == C_nnz_exp; 102 ok &= okc; 103 std::cout << std::setw(11) << C_nnz << std::setw(3) << "" << std::setw(11) << C_nnz_exp 104 << std::setw(2) << (okc ? "" : " !"); 105 std::cout << std::endl; 106 std::cout << std::endl; 107 std::cout << std::setw(11) << "csr_val_C" << std::setw(3) << "" << std::setw(11) << "expected" 108 << std::setw(3) << "" << std::setw(11) << "csr_col_ind_C" << std::setw(3) << "" 109 << std::setw(11) << "expected" << std::setw(3) << "" << std::setw(11) 110 << "csr_row_ptr_C" << std::setw(3) << "" << std::setw(11) << "expected" << std::endl; 111 //Initializing precision tolerance range for double 112 const double tol = 1e-03; 113 for(aoclsparse_int i = 0; i < C_nnz; i++) 114 { 115 oki = ((std::abs(csr_val_C[i] - csr_val_C_exp[i]) <= tol)); 116 ok &= oki; 117 std::cout << std::setw(11) << csr_val_C[i] << "" << std::setw(11) << csr_val_C_exp[i] 118 << (oki ? "" : " !"); 119 okj = csr_col_ind_C[i] == csr_col_ind_C_exp[i]; 120 ok &= okj; 121 std::cout << std::setw(11) << csr_col_ind_C[i] << std::setw(3) << "" << std::setw(11) 122 << csr_col_ind_C_exp[i] << std::setw(2) << (okj ? "" : " !"); 123 if(i <= C_m) 124 { 125 okk = csr_row_ptr_C[i] == csr_row_ptr_C_exp[i]; 126 ok &= okk; 127 std::cout << " " << std::setw(11) << csr_row_ptr_C[i] << std::setw(3) << "" 128 << std::setw(11) << csr_row_ptr_C_exp[i] << std::setw(2) << (okk ? "" : " !"); 129 } 130 std::cout << std::endl; 131 } 132 aoclsparse_destroy(&A); 133 aoclsparse_destroy(&C); 134 135 return (ok ? 0 : 6); 136}
Note
aoclsparse_syrkassumes that the input CSR matrix has sorted column indices in each row. If not, call aoclsparse_order_mat() before callingaoclsparse_syrk.Note
aoclsparse_syrkcurrently does not support aoclsparse_operation_transpose for complexA.- Parameters
opA – [in] Matrix \(A\) operation type.
A – [in] Sorted sparse CSR matrix \(A\).
*C – [out] Pointer to the new sparse CSR symmetric/Hermitian matrix \(C\). Only upper triangle of the result matrix is computed. The column indices of the output matrix in CSR format might be unsorted. The matrix should be freed by aoclsparse_destroy() when no longer needed.
- Return values
aoclsparse_status_success – The operation completed successfully.
aoclsparse_status_invalid_pointer –
A,Cis invalid.aoclsparse_status_wrong_type – A and its operation type do not match.
aoclsparse_status_not_implemented – The input matrix is not in the CSR format or
opAis aoclsparse_operation_transpose andAhas complex values.aoclsparse_status_invalid_value – The value of opA is invalid.
aoclsparse_status_unsorted_input – Input matrices are not sorted.
aoclsparse_status_memory_error – Memory allocation failure.
aoclsparse_?syrkd()#
-
aoclsparse_status aoclsparse_ssyrkd(const aoclsparse_operation opA, const aoclsparse_matrix A, const float alpha, const float beta, float *C, const aoclsparse_order orderC, const aoclsparse_int ldc)#
-
aoclsparse_status aoclsparse_dsyrkd(const aoclsparse_operation opA, const aoclsparse_matrix A, const double alpha, const double beta, double *C, const aoclsparse_order orderC, const aoclsparse_int ldc)#
-
aoclsparse_status aoclsparse_csyrkd(const aoclsparse_operation opA, const aoclsparse_matrix A, const aoclsparse_float_complex alpha, const aoclsparse_float_complex beta, aoclsparse_float_complex *C, const aoclsparse_order orderC, const aoclsparse_int ldc)#
-
aoclsparse_status aoclsparse_zsyrkd(const aoclsparse_operation opA, const aoclsparse_matrix A, const aoclsparse_double_complex alpha, const aoclsparse_double_complex beta, aoclsparse_double_complex *C, const aoclsparse_order orderC, const aoclsparse_int ldc)#
Multiplication of a sparse matrix and its transpose (or conjugate transpose) for all data types.
aoclsparse_syrkdmultiplies a sparse matrix with its transpose (or conjugate transpose) in CSR storage format. The result is stored in a dense format, such that\[ C := \alpha \cdot A \cdot op(A) + \beta \cdot C \]if \(opA\) is aoclsparse_operation_none.Otherwise,
\[ C := \alpha \cdot op(A) \cdot A + \beta \cdot C \]\[\begin{split} op(A) = \left\{ \begin{array}{ll} A^T, & \text{transpose of } {\bf\mathsf{A}} \text{ for real matrices}\\ A^H, & \text{conjugate transpose of } {\bf\mathsf{A}} \text{ for complex matrices}\\ \end{array} \right. \end{split}\]where \(A\) is a \(m \times n\) sparse matrix,
opAis one of aoclsparse_operation_none, aoclsparse_operation_transpose (for real matrices) or aoclsparse_operation_conjugate_transpose (for complex matrices). The output matrix \(C\) is a dense symmetric (or Hermitian) matrix stored as an upper triangular matrix.1/* ************************************************************************ 2 * Copyright (c) 2024 Advanced Micro Devices, Inc. 3 * 4 * Permission is hereby granted, free of charge, to any person obtaining a copy 5 * of this software and associated documentation files (the "Software"), to deal 6 * in the Software without restriction, including without limitation the rights 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 * copies of the Software, and to permit persons to whom the Software is 9 * furnished to do so, subject to the following conditions: 10 * 11 * The above copyright notice and this permission notice shall be included in 12 * all copies or substantial portions of the Software. 13 * 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 * THE SOFTWARE. 21 * 22 * ************************************************************************ */ 23 24#include "aoclsparse.h" 25 26#include <complex> 27#include <cstring> 28#include <iomanip> 29#include <iostream> 30#include <limits> 31#include <vector> 32 33// An example to illustrate the usage of aoclsparse_syrk 34int main(void) 35{ 36 std::cout << "-------------------------------" << std::endl 37 << "----- syrkd sample program -----" << std::endl 38 << "-------------------------------" << std::endl 39 << std::endl; 40 41 aoclsparse_matrix A; 42 aoclsparse_status status; 43 aoclsparse_index_base base = aoclsparse_index_base_zero; 44 aoclsparse_mat_descr descr; 45 double alpha = 2.1; 46 double beta = 1.3; 47 aoclsparse_int ldc = 10; 48 aoclsparse_operation op = aoclsparse_operation_none; 49 aoclsparse_order layout = aoclsparse_order_row; 50 aoclsparse_int m_C; 51 double *C = nullptr; 52 double *C_exp = nullptr; 53 54 // By default aoclsparse_create_mat_descr sets aoclsparse_matrix_type to aoclsparse_matrix_type_general 55 status = aoclsparse_create_mat_descr(&descr); 56 if(status != aoclsparse_status_success) 57 return status; 58 59 // Matrix sizes 60 aoclsparse_int m = 4, k = 3; 61 aoclsparse_int nnz = 7; 62 // Matrix A 63 // [ 0. -1.2 2.3] 64 // [ 4.6 0. 0. ] 65 // [ 0. 3.0 -8.1 ] 66 // [ 0.3 0. -5.1 ] 67 aoclsparse_int row_ptr[] = {0, 2, 3, 5, 7}; 68 aoclsparse_int col_ind[] = {1, 2, 0, 1, 2, 0, 2}; 69 double csr_val[] = {-1.2, 2.3, 4.6, 3.0, -8.1, 0.3, -5.1}; 70 71 status = aoclsparse_create_dcsr(&A, base, m, k, nnz, row_ptr, col_ind, csr_val); 72 if(status != aoclsparse_status_success) 73 { 74 return status; 75 } 76 77 // output matrix dimension 78 m_C = 4; 79 80 C = (double *)malloc(sizeof(double) * ldc * m_C); 81 C_exp = (double *)malloc(sizeof(double) * ldc * m_C); 82 83 // set the large C matrix to -1, only a window of size 4x4 (upper triangular part) 84 // is updated with the syrkd call 85 for(aoclsparse_int i = 0; i < ldc * m_C; i++) 86 { 87 C[i] = -1; 88 C_exp[i] = -1; 89 } 90 91 // Expected value of C which is C_exp 92 // [ 12.8330 -1.3000 -47.9830 -25.9330, -1, -1, ....] 93 // [ -1 43.1360 -1.3000 1.5980, -1, -1, ....] 94 // [ -1 -1 155.3810 85.4510, -1, -1, ....] 95 // [ -1 -1 -1 53.5100, -1, -1, ....] 96 // [ -1 -1 -1 -1, -1, -1, ....] 97 // ... 98 C_exp[0] = 12.8330; 99 C_exp[1] = -1.3000; 100 C_exp[2] = -47.9830; 101 C_exp[3] = -25.9330; 102 C_exp[11] = 43.1360; 103 C_exp[12] = -1.3000; 104 C_exp[13] = 1.5980; 105 C_exp[22] = 155.3810; 106 C_exp[23] = 85.4510; 107 C_exp[33] = 53.5100; 108 109 status = aoclsparse_dsyrkd(op, A, alpha, beta, C, layout, ldc); 110 if(status != aoclsparse_status_success) 111 { 112 return status; 113 } 114 115 bool oki, ok = true; 116 //Initializing precision tolerance range for double 117 const double tol = 1e-03; 118 std::cout << "Expected output\n" 119 << "-------------------------------\n"; 120 for(aoclsparse_int i = 0; i < m_C; i++) 121 { 122 for(aoclsparse_int j = 0; j < ldc; j++) 123 { 124 std::cout << std::setw(10) << C_exp[i * ldc + j] << std::setw(3) << ""; 125 } 126 std::cout << std::endl; 127 } 128 129 std::cout << "Actual output\n" 130 << "-------------------------------\n"; 131 for(aoclsparse_int i = 0; i < m_C; i++) 132 { 133 for(aoclsparse_int j = 0; j < ldc; j++) 134 { 135 oki = ((std::abs(C[i * ldc + j] - C_exp[i * ldc + j]) <= tol)); 136 ok &= oki; 137 std::cout << std::setw(10) << C[i * ldc + j] << std::setw(3) << (oki ? "" : " !"); 138 } 139 std::cout << std::endl; 140 } 141 aoclsparse_destroy_mat_descr(descr); 142 aoclsparse_destroy(&A); 143 free(C); 144 free(C_exp); 145 146 return (ok ? 0 : 6); 147}
Note
aoclsparse_syrkdassumes that the input CSR matrix has sorted column indices in each row. If not, call aoclsparse_order_mat() before callingaoclsparse_syrkd.Note
For complex type, only the real parts of \(\alpha\) and \(\beta\) are taken into account to preserve Hermitian \(C\).
Note
aoclsparse_syrkdcurrently does not support aoclsparse_operation_transpose for complexA.- Parameters
opA – [in] Matrix \(A\) operation type.
A – [in] Sorted sparse CSR matrix \(A\).
alpha – [in] Scalar \(\alpha\).
beta – [in] Scalar \(\beta\).
C – [inout] Output dense matrix. Only upper triangular part of the matrix is processed during the computation, the strictly lower triangle is not modified.
orderC – [in] Storage format of the output dense matrix, \(C\). It can be aoclsparse_order_row or aoclsparse_order_column.
ldc – [in] Leading dimension of \(C\).
- Return values
aoclsparse_status_success – The operation completed successfully.
aoclsparse_status_invalid_pointer –
A,Cis invalid.aoclsparse_status_wrong_type –
Aand its operation type do not match.aoclsparse_status_not_implemented – The input matrix is not in the CSR format or
opAis aoclsparse_operation_transpose andAhas complex values.aoclsparse_status_invalid_value – The value of
opA,orderCorldcis invalid.aoclsparse_status_unsorted_input – Input matrix is not sorted.
aoclsparse_status_memory_error – Memory allocation failure.
aoclsparse_?sypr()#
-
aoclsparse_status aoclsparse_sypr(aoclsparse_operation opA, const aoclsparse_matrix A, const aoclsparse_matrix B, const aoclsparse_mat_descr descrB, aoclsparse_matrix *C, const aoclsparse_request request)#
Symmetric product of three sparse matrices for real and complex datatypes stored as a sparse matrix.
aoclsparse_syprmultiplies three sparse matrices in CSR storage format. The result is returned in a newly allocated symmetric or Hermitian sparse matrix stored as an upper triangle in CSR format.If \(opA\) is aoclsparse_operation_none,
\[ C = A \cdot B \cdot A^T, \]or\[ C = A \cdot B \cdot A^H, \]for real or complex input matrices, respectively, where \(A\) is a \(m \times n\) general matrix , \(B\) is a \(n \times n\) symmetric (for real data types) or Hermitian (for complex data types) matrix, resulting in a symmetric or Hermitian \(m \times m\) matrix \(C\).Otherwise,
\[ C = op(A) \cdot B \cdot A, \]with\[\begin{split} op(A) = \left\{ \begin{array}{ll} A^T, & \text{if } {\bf\mathsf{opA}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{opA}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]where \(A\) is a \(m \times n\) matrix and \(B\) is a \(m \times m\) symmetric (or Hermitian) matrix, resulting in a \(n \times n\) symmetric (or Hermitian) matrix \(C\).Depending on
request,aoclsparse_syprmight compute the result in a single stage (aoclsparse_stage_full_computation) or in two stages. Then the first stage (aoclsparse_stage_nnz_count) allocates memory for the new output matrix \(C\) and computes its number of non-zeros and their structure which is followed by the second stage (aoclsparse_stage_finalize) to compute the column indices and values of all elements. The second stage can be invoked multiple times (either after aoclsparse_stage_full_computation or aoclsparse_stage_nnz_count) to recompute the numerical values of \(C\) on assumption that the sparsity structure of the input matrices remained unchanged and only the values of the non-zero elements were modified (e.g., by a call to aoclsparse_supdate_values() and variants).1/* ************************************************************************ 2 * Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved. 3 * 4 * Permission is hereby granted, free of charge, to any person obtaining a copy 5 * of this software and associated documentation files (the "Software"), to deal 6 * in the Software without restriction, including without limitation the rights 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 * copies of the Software, and to permit persons to whom the Software is 9 * furnished to do so, subject to the following conditions: 10 * 11 * The above copyright notice and this permission notice shall be included in 12 * all copies or substantial portions of the Software. 13 * 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 * THE SOFTWARE. 21 * 22 * ************************************************************************ */ 23 24#include "aoclsparse.h" 25 26#include <complex> 27#include <iomanip> 28#include <iostream> 29 30/* Computes triple symmetric product of complex sparse matrices 31 * C:=opA(A)*B*A, where opA = conjugate transpose 32*/ 33 34// Comment this out for single stage computation 35//#define TWO_STAGE_COMPUTATION 36 37int main(void) 38{ 39 std::cout << "-------------------------------" << std::endl 40 << "----- SYPR sample program -----" << std::endl 41 << "-------------------------------" << std::endl 42 << std::endl; 43 44 aoclsparse_status status; 45 aoclsparse_int nnz_C; 46 aoclsparse_request request; 47 aoclsparse_index_base base = aoclsparse_index_base_zero; 48 aoclsparse_operation opA = aoclsparse_operation_conjugate_transpose; 49 50 // Print aoclsparse version 51 std::cout << aoclsparse_get_version() << std::endl; 52 53 // Initialise matrix descriptor and csr matrix structure of inputs A and B 54 aoclsparse_mat_descr descrB; 55 aoclsparse_matrix csrA; 56 aoclsparse_matrix csrB; 57 58 // Create matrix descriptor of input matrices 59 status = aoclsparse_create_mat_descr(&descrB); 60 if(status != aoclsparse_status_success) 61 { 62 std::cerr << "Error returned from aoclsparse_create_mat_descr, status = " << status << "." 63 << std::endl; 64 return 1; 65 } 66 67 aoclsparse_set_mat_type(descrB, aoclsparse_matrix_type_hermitian); 68 aoclsparse_set_mat_fill_mode(descrB, aoclsparse_fill_mode_lower); 69 aoclsparse_set_mat_index_base(descrB, aoclsparse_index_base_zero); 70 71 // Matrix sizes 72 aoclsparse_int m_a = 5, n_a = 5, m_b = 5, n_b = 5; 73 aoclsparse_int nnz_A = 10, nnz_B = 9; 74 // Matrix A 75 aoclsparse_int row_ptr_A[] = {0, 1, 2, 5, 9, 10}; 76 aoclsparse_int col_ind_A[] = {0, 0, 1, 2, 4, 0, 1, 2, 3, 4}; 77 aoclsparse_double_complex val_A[] = {{-0.86, 0.45}, 78 {-2.62, -0.44}, 79 {-0.87, 0.13}, 80 {-0.66, -1.09}, 81 {0.05, -2.37}, 82 {-1.48, -0.42}, 83 {-0.58, -0.70}, 84 {0.31, -0.96}, 85 {-0.88, -2.37}, 86 {-1.23, 0.21}}; 87 status = aoclsparse_create_zcsr(&csrA, base, m_a, n_a, nnz_A, row_ptr_A, col_ind_A, val_A); 88 if(status != aoclsparse_status_success) 89 { 90 std::cerr << "Error returned from aoclsparse_create_zcsr, status = " << status << "." 91 << std::endl; 92 return 2; 93 } 94 95 // Hermitian Matrix B 96 aoclsparse_int row_ptr_B[] = {0, 1, 2, 4, 6, 9}; 97 aoclsparse_int col_ind_B[] = {0, 1, 0, 2, 1, 3, 1, 2, 4}; 98 aoclsparse_double_complex val_B[] = {{-1.59, 0}, 99 {0.46, 0}, 100 {0.07, -0.51}, 101 {-1.52, 0}, 102 {0.21, -1.33}, 103 {-1.37, 0}, 104 {1.42, -2.08}, 105 {-2.26, -1.00}, 106 {-1.81, 0}}; 107 108 status = aoclsparse_create_zcsr(&csrB, base, m_b, n_b, nnz_B, row_ptr_B, col_ind_B, val_B); 109 if(status != aoclsparse_status_success) 110 { 111 std::cerr << "Error returned from aoclsparse_create_zcsr, status = " << status << "." 112 << std::endl; 113 return 3; 114 } 115 116 aoclsparse_matrix csrC = NULL; 117 aoclsparse_int *csr_row_ptr_C = NULL; 118 aoclsparse_int *csr_col_ind_C = NULL; 119 aoclsparse_double_complex *csr_val_C = NULL; 120 aoclsparse_int C_M, C_N; 121 122 // expected output matrix 123 aoclsparse_int C_M_exp = 5, C_N_exp = 5, nnz_C_exp = 14; 124 aoclsparse_int csr_row_ptr_C_exp[] = {0, 5, 9, 12, 13, 14}; 125 aoclsparse_int csr_col_ind_C_exp[] = {0, 1, 2, 3, 4, 1, 2, 3, 4, 2, 4, 3, 3, 4}; 126 aoclsparse_double_complex csr_val_C_exp[] = {{-0.982439, 0.000000}, 127 {-3.380974, 2.107664}, 128 {-4.156461, -1.960739}, 129 {-10.188348, 1.376974}, 130 {5.609324, 4.536282}, 131 {-2.308344, 0.000000}, 132 {-1.331714, -2.631938}, 133 {-2.972078, -1.039282}, 134 {-1.922892, -1.975280}, 135 {-3.862273, 0.000000}, 136 {-3.714510, 1.465694}, 137 {-2.743288, 2.163915}, 138 {-8.756081, -0.000000}, 139 {-3.022874, 0.000000}}; 140 141#ifdef TWO_STAGE_COMPUTATION 142 std::cout << "Invoking aoclsparse_sypr with aoclsparse_stage_nnz_count...\n"; 143 // aoclsparse_stage_nnz_count: Only rowIndex array of the CSR matrix 144 // is computed internally. 145 request = aoclsparse_stage_nnz_count; 146 status = aoclsparse_sypr(opA, csrA, csrB, descrB, &csrC, request); 147 if(status != aoclsparse_status_success) 148 { 149 std::cerr << "Error returned from aoclsparse_sypr, status = " << status << "." << std::endl; 150 return 4; 151 } 152 153 std::cout << "Invoking aoclsparse_sypr with aoclsparse_stage_finalize...\n"; 154 // aoclsparse_stage_finalize: Finalize computation of remaining 155 // output arrays (column indices and values of output matrix entries). 156 // Has to be called only after aoclsparse_sypr call with 157 // aoclsparse_stage_nnz_count parameter. 158 request = aoclsparse_stage_finalize; 159 status = aoclsparse_sypr(opA, csrA, csrB, descrB, &csrC, request); 160 if(status != aoclsparse_status_success) 161 { 162 std::cerr << "Error returned from aoclsparse_sypr, status = " << status << "." << std::endl; 163 return 5; 164 } 165 166#else // SINGLE STAGE 167 std::cout << "Invoking aoclsparse_sypr with aoclsparse_stage_full_computation...\n"; 168 // aoclsparse_stage_full_computation: Whole computation is performed in 169 // a single step. 170 request = aoclsparse_stage_full_computation; 171 status = aoclsparse_sypr(opA, csrA, csrB, descrB, &csrC, request); 172 if(status != aoclsparse_status_success) 173 { 174 std::cerr << "Error returned from aoclsparse_sypr, status = " << status << "." << std::endl; 175 return 6; 176 } 177 178#endif 179 180 aoclsparse_export_zcsr( 181 csrC, &base, &C_M, &C_N, &nnz_C, &csr_row_ptr_C, &csr_col_ind_C, &csr_val_C); 182 // Check and print the result 183 std::cout << std::fixed; 184 std::cout.precision(1); 185 bool oka, okb, okc, oki, okj, okk, ok = true; 186 std::cout << std::endl 187 << "Output Matrix C: " << std::endl 188 << std::setw(11) << "C_M" << std::setw(3) << "" << std::setw(11) << "expected" 189 << std::setw(2) << "" << std::setw(11) << "C_N" << std::setw(3) << "" << std::setw(11) 190 << "expected" << std::setw(2) << "" << std::setw(11) << "nnz_C" << std::setw(3) << "" 191 << std::setw(11) << "expected" << std::endl; 192 oka = C_M == C_M_exp; 193 ok &= oka; 194 std::cout << std::setw(11) << C_M << std::setw(3) << "" << std::setw(11) << C_M_exp 195 << std::setw(2) << (oka ? "" : " !"); 196 okb = C_N == C_N_exp; 197 ok &= okb; 198 std::cout << std::setw(11) << C_N << std::setw(3) << "" << std::setw(11) << C_N_exp 199 << std::setw(2) << (okb ? "" : " !"); 200 okc = nnz_C == nnz_C_exp; 201 ok &= okc; 202 std::cout << std::setw(11) << nnz_C << std::setw(3) << "" << std::setw(11) << nnz_C_exp 203 << std::setw(2) << (okc ? "" : " !"); 204 std::cout << std::endl; 205 std::cout << std::endl; 206 std::cout << std::setw(14) << "csr_val_C" << std::setw(16) << "expected" << std::setw(14) 207 << "csr_col_ind_C" << std::setw(14) << "expected" << std::setw(14) << "csr_row_ptr_C" 208 << std::setw(14) << "expected" << std::endl; 209 // Initializing precision tolerance range for double 210 const double tol = 1e-06; 211 for(aoclsparse_int i = 0; i < nnz_C; i++) 212 { 213 oki = ((std::abs(csr_val_C[i].real - csr_val_C_exp[i].real) <= tol) 214 && (std::abs(csr_val_C[i].imag - csr_val_C_exp[i].imag) <= tol)); 215 ok &= oki; 216 std::cout << "(" << std::setw(5) << csr_val_C[i].real << "," << std::setw(5) 217 << csr_val_C[i].imag << "i) " 218 << " (" << std::setw(5) << csr_val_C_exp[i].real << "," << std::setw(5) 219 << csr_val_C_exp[i].imag << "i) " << std::setw(2) << (oki ? "" : " !"); 220 okj = csr_col_ind_C[i] == csr_col_ind_C_exp[i]; 221 ok &= okj; 222 std::cout << std::setw(11) << csr_col_ind_C[i] << std::setw(3) << "" << std::setw(11) 223 << csr_col_ind_C_exp[i] << std::setw(2) << (okj ? "" : " !"); 224 if(i <= C_M) 225 { 226 okk = csr_row_ptr_C[i] == csr_row_ptr_C_exp[i]; 227 ok &= okk; 228 std::cout << " " << std::setw(11) << csr_row_ptr_C[i] << std::setw(3) << "" 229 << std::setw(11) << csr_row_ptr_C_exp[i] << std::setw(2) << (okk ? "" : " !"); 230 } 231 std::cout << std::endl; 232 } 233 234 aoclsparse_destroy_mat_descr(descrB); 235 aoclsparse_destroy(&csrA); 236 aoclsparse_destroy(&csrB); 237 aoclsparse_destroy(&csrC); 238 return (ok ? 0 : 7); 239}
Note
aoclsparse_syprsupports only matrices in CSR format which have sorted column indices in each row. If the matrices are unsorted, you might want to call aoclsparse_order_mat().Note
Currently,
opA= aoclsparse_operation_transpose is supported only for real data types.- Parameters
opA – [in] matrix \(A\) operation type.
A – [in] sorted sparse CSR matrix \(A\).
B – [in] sorted sparse CSR matrix \(B\) to be interpreted as symmetric (or Hermitian).
descrB – [in] descriptor of the sparse CSR matrix \(B\). aoclsparse_matrix_type must be aoclsparse_matrix_type_symmetric for real matrices or aoclsparse_matrix_type_hermitian for complex matrices. aoclsparse_fill_mode might be either aoclsparse_fill_mode_upper or aoclsparse_fill_mode_lower to process the upper or lower triangular matrix part, respectively.
request – [in] Specifies if the computation takes place in one stage (aoclsparse_stage_full_computation) or in two stages (aoclsparse_stage_nnz_count followed by aoclsparse_stage_finalize).
*C – [inout] Pointer to the new sparse CSR symmetric/Hermitian matrix \(C\) . Only upper triangle of the result matrix is computed. Matrix \(C\) will always have zero-based indexing, irrespective of the zero/one-based indexing of the input matrices \(A\) and \(B\). The column indices of the output matrix in CSR format might be unsorted. If
requestis aoclsparse_stage_finalize, matrix \(C\) must not be modified by the user since the last call toaoclsparse_sypr, in the other cases is \(C\) treated as an output only. The matrix should be freed by aoclsparse_destroy() when no longer needed.
- Return values
aoclsparse_status_success – the operation completed successfully.
aoclsparse_status_invalid_pointer –
descrB,A,BorCis invalid.aoclsparse_status_invalid_size – Matrix dimensions do not match
AorBis not square.aoclsparse_status_invalid_value – Input parameters are invalid, for example,
descrBdoes not matchBindexing orBis not symmetric/Hermitian,Chas been modified between stages oropAorrequestis not recognized.aoclsparse_status_wrong_type –
AandBmatrix data types do not match.aoclsparse_status_not_implemented – Input matrix
AorBis not in CSR format.aoclsparse_status_unsorted_input – Input matrices are not sorted.
aoclsparse_status_memory_error – Memory allocation failure.
aoclsparse_?syprd()#
-
aoclsparse_status aoclsparse_ssyprd(const aoclsparse_operation op, const aoclsparse_matrix A, const float *B, const aoclsparse_order orderB, const aoclsparse_int ldb, const float alpha, const float beta, float *C, const aoclsparse_order orderC, const aoclsparse_int ldc)#
-
aoclsparse_status aoclsparse_dsyprd(const aoclsparse_operation op, const aoclsparse_matrix A, const double *B, const aoclsparse_order orderB, const aoclsparse_int ldb, const double alpha, const double beta, double *C, const aoclsparse_order orderC, const aoclsparse_int ldc)#
-
aoclsparse_status aoclsparse_csyprd(const aoclsparse_operation op, const aoclsparse_matrix A, const aoclsparse_float_complex *B, const aoclsparse_order orderB, const aoclsparse_int ldb, const aoclsparse_float_complex alpha, const aoclsparse_float_complex beta, aoclsparse_float_complex *C, const aoclsparse_order orderC, const aoclsparse_int ldc)#
-
aoclsparse_status aoclsparse_zsyprd(const aoclsparse_operation op, const aoclsparse_matrix A, const aoclsparse_double_complex *B, const aoclsparse_order orderB, const aoclsparse_int ldb, const aoclsparse_double_complex alpha, const aoclsparse_double_complex beta, aoclsparse_double_complex *C, const aoclsparse_order orderC, const aoclsparse_int ldc)#
Performs symmetric triple product of a sparse matrix and a dense matrix and stores the output as a dense matrix.
aoclsparse_?syprdperforms product of a scalar \(\alpha\), with the symmetric triple product of a sparse \(m \times k\) matrix \(A\), defined in CSR format, with a \(k \times k\) symmetric dense (or Hermitian) matrix \(B\), and a \(k \times m\) \(op(A)\). Adds the resulting matrix to \(m \times m\) symmetric dense (or Hermitian) matrix \(C\) that is multiplied by a scalar \(\beta\), such that\[ C := \alpha \cdot A \cdot B \cdot op(A) + \beta \cdot C \]if \(op\) is aoclsparse_operation_none.Otherwise,
\[ C := \alpha \cdot op(A) \cdot B \cdot A + \beta \cdot C \]\[\begin{split} op(A) = \left\{ \begin{array}{ll} A^T, & \text{if } {\bf\mathsf{op}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \text{ (real matrices)}\\ A^H, & \text{if } {\bf\mathsf{op}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \text{ (complex matrices)} \end{array} \right. \end{split}\]Notes
1. This routine assumes the dense matrices (B and C) are stored in full although the computations happen on the upper triangular portion of the matrices.
2. aoclsparse_operation_transpose is only supported for real matrices.
3. aoclsparse_operation_conjugate_transpose is only supported for complex matrices.
4. Complex dense matrices are assumed to be Hermitian matrices.
- Parameters
op – [in] Matrix \(A\) operation type.
A – [in] Sparse CSR matrix \(A\) structure.
B – [in] Array of dimension \(ldb \times ldb\). Only the upper triangular matrix is used for computation.
orderB – [in] aoclsparse_order_row or aoclsparse_order_column for dense matrix B.
ldb – [in] Leading dimension of \(B\), must be at least \(\max{(1, k)}\) ( \(op(A) = A\)) or \(\max{(1, m)}\) ( \(op(A) = A^T\) or \(op(A) = A^H\)).
alpha – [in] Scalar \(\alpha\).
beta – [in] Scalar \(\beta\).
C – [inout] Array of dimension \(ldc \times ldc\). Only upper triangular part of the matrix is processed.
orderC – [in] aoclsparse_order_row or aoclsparse_order_column for dense matrix C.
ldc – [in] Leading dimension of \(C\), must be at least \(\max{(1, m)}\) ( \(op(A) = A\)) or \(\max{(1, k)}\) ( \(op(A) = A^T\) or \(op(A) = A^H\)).
- Return values
aoclsparse_status_success – The operation completed successfully.
aoclsparse_invalid_operation – The operation is invalid if the matrix B and C has a different layout ordering.
aoclsparse_status_wrong_type – The data type of the matrices are not matching or invalid.
aoclsparse_status_invalid_size – The value of
m,k,nnz,ldborldcis invalid.aoclsparse_status_invalid_pointer – The pointer
A,B, orCis invalid.aoclsparse_status_not_implemented – The values of
orderBandorderCare different.
Miscellaneous#
aoclsparse_ilu_?smoother()#
-
aoclsparse_status aoclsparse_silu_smoother(aoclsparse_operation op, aoclsparse_matrix A, const aoclsparse_mat_descr descr, float **precond_csr_val, const float *approx_inv_diag, float *x, const float *b)#
-
aoclsparse_status aoclsparse_dilu_smoother(aoclsparse_operation op, aoclsparse_matrix A, const aoclsparse_mat_descr descr, double **precond_csr_val, const double *approx_inv_diag, double *x, const double *b)#
Incomplete LU factorization with zero fill-in, ILU(0).
Performs incomplete LU factorization with zero fill-in on symmetric sparse matrix
Aof size \(n \times n\). It also performs a solve forxin\[ L U x = b, \qquad \text{where} \qquad LU\approx A.\]MatrixAshould be numerically of full rank. Currently single and double precision datatypes are supported.1/* ************************************************************************ 2 * Copyright (c) 2022-2023 Advanced Micro Devices, Inc. 3 * 4 * Permission is hereby granted, free of charge, to any person obtaining a copy 5 * of this software and associated documentation files (the "Software"), to deal 6 * in the Software without restriction, including without limitation the rights 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 * copies of the Software, and to permit persons to whom the Software is 9 * furnished to do so, subject to the following conditions: 10 * 11 * The above copyright notice and this permission notice shall be included in 12 * all copies or substantial portions of the Software. 13 * 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 * THE SOFTWARE. 21 * 22 * ************************************************************************ */ 23 24#include "aoclsparse.h" 25 26#include <cfloat> 27#include <iomanip> 28#include <iostream> 29#include <math.h> 30#include <vector> 31 32#define PREMATURE_STOP_TOLERANCE 1e-4 33const char gmres_sol[] = "gmres_direct_d"; 34 35double calculate_l2Norm_solvers(const double *xSol, const double *x, aoclsparse_int n) 36{ 37 double norm = 0.0f; 38 //residual = xSol - x 39 //xSol obtained by initial spmv 40 //x obtained by ILU-GMRES iterations 41 for(aoclsparse_int i = 0; i < n; i++) 42 { 43 double a = xSol[i] - x[i]; 44 norm += a * a; 45 } 46 norm = sqrt(norm); 47 return norm; 48} 49 50aoclsparse_int monit(aoclsparse_int n, 51 const double *x, 52 const double *r __attribute__((unused)), 53 double *rinfo, 54 void *udata __attribute__((unused))) 55{ 56 int it = (int)rinfo[30]; 57 double tol = PREMATURE_STOP_TOLERANCE; 58 59 std::ios oldState(nullptr); 60 oldState.copyfmt(std::cout); 61 62 std::ios_base::fmtflags fmt = std::cout.flags(); 63 fmt |= std::ios_base::scientific | std::ios_base::right | std::ios_base::showpos; 64 65 if(!(it % 10)) 66 { 67 std::cout << std::setw(5) << std::right << " iter" 68 << " " << std::setw(16) << std::right << "optim"; 69 for(int i = 0; i < n; i++) 70 std::cout << std::setw(8) << std::right << "x[" << i << "]"; 71 std::cout << std::endl; 72 } 73 std::cout << std::setw(5) << std::right << (int)rinfo[30] << " " << std::setw(16) << std::right 74 << std::scientific << std::setprecision(8) << rinfo[0]; 75 std::cout << std::setprecision(2) << std::showpos; 76 for(int i = 0; i < n; i++) 77 std::cout << " " << x[i]; 78 std::cout << std::endl; 79 std::cout << std::resetiosflags(fmt); 80 81 //reset std::cout state 82 std::cout.copyfmt(oldState); 83 84 if(rinfo[0] < tol) // check for premature stop 85 { 86 return 1; // request to interrupt 87 } 88 return 0; 89} 90 91int main() 92{ 93 std::vector<aoclsparse_int> csr_row_ptr; 94 std::vector<aoclsparse_int> csr_col_ind; 95 std::vector<double> csr_val; 96 97 int m, n, nnz; 98 aoclsparse_status exit_status = aoclsparse_status_success; 99 100 std::string filename = "cage4.mtx"; 101 //symmetry = 0; //real unsymmetric, symmetry = 0%, spd = no 102 //https://www.cise.ufl.edu/research/sparse/MM/vanHeukelum/cage4.tar.gz 103 n = 9; 104 m = 9; 105 nnz = 49; 106 csr_row_ptr.assign({0, 5, 10, 15, 20, 26, 32, 38, 44, 49}); 107 csr_col_ind.assign({0, 1, 3, 4, 7, 0, 1, 2, 4, 5, 1, 2, 3, 5, 6, 0, 2, 3, 6, 7, 0, 1, 4, 5, 6, 108 8, 1, 2, 4, 5, 7, 8, 2, 3, 4, 6, 7, 8, 0, 3, 5, 6, 7, 8, 4, 5, 6, 7, 8}); 109 csr_val.assign({0.75, 0.14, 0.11, 0.14, 0.11, 0.08, 0.69, 0.11, 0.08, 0.11, 0.09, 0.67, 0.08, 110 0.09, 0.08, 0.09, 0.14, 0.73, 0.14, 0.09, 0.04, 0.04, 0.54, 0.14, 0.11, 0.25, 111 0.05, 0.05, 0.08, 0.45, 0.08, 0.15, 0.04, 0.04, 0.09, 0.47, 0.09, 0.18, 0.05, 112 0.05, 0.14, 0.11, 0.55, 0.25, 0.08, 0.08, 0.09, 0.08, 0.17}); 113 114 // Create aocl sparse matrix 115 aoclsparse_matrix A; 116 aoclsparse_index_base base = aoclsparse_index_base_zero; 117 aoclsparse_mat_descr descr_a; 118 aoclsparse_operation trans = aoclsparse_operation_none; 119 aoclsparse_create_dcsr(&A, 120 base, 121 (aoclsparse_int)n, 122 (aoclsparse_int)n, 123 (aoclsparse_int)nnz, 124 csr_row_ptr.data(), 125 csr_col_ind.data(), 126 csr_val.data()); 127 aoclsparse_create_mat_descr(&descr_a); 128 aoclsparse_set_mat_type(descr_a, aoclsparse_matrix_type_symmetric); 129 aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_lower); 130 aoclsparse_set_mat_index_base(descr_a, base); 131 aoclsparse_set_sv_hint(A, trans, descr_a, 100); 132 aoclsparse_optimize(A); 133 134 // Initialize initial point x0 and right hand side b 135 double *expected_sol = NULL; 136 double *x = NULL; 137 double *b = NULL; 138 double norm = 0.0; 139 double rinfo[100]; 140 double alpha = 1.0, beta = 0.; 141 int rs_iters = 7; 142 char rs_iters_string[16]; 143 144 expected_sol = new double[n]; 145 x = new double[n]; 146 b = new double[n]; 147 148 double init_x = 1.0, ref_x = 0.5; 149 for(int i = 0; i < n; i++) 150 { 151 expected_sol[i] = ref_x; 152 x[i] = init_x; 153 } 154 155 aoclsparse_dmv(trans, &alpha, A, descr_a, expected_sol, &beta, b); 156 157 double tol = PREMATURE_STOP_TOLERANCE; 158 aoclsparse_itsol_handle handle = NULL; 159 // create GMRES handle 160 aoclsparse_itsol_d_init(&handle); 161 162 exit_status = aoclsparse_itsol_option_set(handle, "iterative method", "GMRES"); 163 if(exit_status != aoclsparse_status_success) 164 printf("Warning an iterative method option could not be set, exit status = %d\n", 165 exit_status); 166 167 exit_status = aoclsparse_itsol_option_set(handle, "gmres preconditioner", "ILU0"); 168 if(exit_status != aoclsparse_status_success) 169 printf("Warning gmres preconditioner option could not be set, exit status = %d\n", 170 exit_status); 171 172 exit_status = aoclsparse_itsol_option_set(handle, "gmres iteration limit", "50"); 173 if(exit_status != aoclsparse_status_success) 174 printf("Warning gmres iteration limit option could not be set, exit status = %d\n", 175 exit_status); 176 177 sprintf(rs_iters_string, "%d", (int)rs_iters); 178 exit_status = aoclsparse_itsol_option_set(handle, "gmres restart iterations", rs_iters_string); 179 if(exit_status != aoclsparse_status_success) 180 printf("Warning gmres restart iterations option could not be set, exit status = %d\n", 181 exit_status); 182 183 aoclsparse_itsol_handle_prn_options(handle); 184 185 // Call GMRES solver 186 aoclsparse_status status; 187 188 status = aoclsparse_itsol_d_solve(handle, n, A, descr_a, b, x, rinfo, NULL, monit, &n); 189 if(status == aoclsparse_status_success || status == aoclsparse_status_user_stop 190 || aoclsparse_status_maxit == status) 191 { 192 norm = calculate_l2Norm_solvers(expected_sol, x, n); 193 printf("input = %s\n", filename.c_str()); 194 printf("solver = %s\n", gmres_sol); 195 printf("no of rows = %d\n", (int)m); 196 printf("no of cols = %d\n", (int)n); 197 printf("no of nnz = %d\n", (int)nnz); 198 printf("monitoring tolerance = %e\n", tol); 199 printf("restart iterations = %d\n", (int)rs_iters); 200 printf("residual achieved = %e\n", rinfo[0]); 201 printf("total iterations = %d\n", (int)rinfo[30]); 202 printf("l2 Norm = %e\n", norm); 203 } 204 else 205 { 206 std::cout << "Something unexpected happened!, Status = " << status << std::endl; 207 } 208 209 delete[] expected_sol; 210 delete[] x; 211 delete[] b; 212 aoclsparse_itsol_destroy(&handle); 213 aoclsparse_destroy_mat_descr(descr_a); 214 aoclsparse_destroy(&A); 215 printf("\n"); 216 fflush(stdout); 217 218 return 0; 219}
- Parameters
op – [in] matrix
Aoperation type. Transpose not supported in this release.A – [in] sparse symmetric matrix handle. Currently ILU functionality is supported only for CSR matrix format.
descr – [in] descriptor of the sparse matrix handle
A. Currently, only aoclsparse_matrix_type_symmetric is supported.precond_csr_val – [out] pointer that contains L and U factors after ILU factorization operation.
Ais not overwritten with the factors.approx_inv_diag – [in] Reserved for future use.
x – [out] array of
nelements containing the solution to solving aproximatly \(Ax=b\).b – [in] Right-hand-side of the linear system of equations \(Ax = b\).
- Return values
aoclsparse_status_success – the operation completed successfully.
aoclsparse_status_invalid_size – input parameters contain an invalid value.
aoclsparse_status_invalid_pointer –
descr,Ais invalid.aoclsparse_status_not_implemented – aoclsparse_matrix_type is not aoclsparse_matrix_type_symmetric or input matrix
Ais not in CSR format