Sparse BLAS level 1, 2, and 3 functions - Sparse BLAS level 1, 2, and 3 functions - 5.0 English - 63865

AOCL Sparse Library (63865)

Document ID
63865
Release Date
2024-10-09
Version
5.0 English

Sparse BLAS level 1, 2, and 3 functions#

aoclsparse_functions.h provides AMD CPU hardware optimized level 1, 2, and 3 Sparse Linear Algebra Subprograms (Sparse BLAS).

Level 1#

The sparse level 1 routines describe operations between a vector in sparse format and a vector in dense format.

This section describes all provided level 1 sparse linear algebra functions.

aoclsparse_?axpyi()#

aoclsparse_status aoclsparse_saxpyi(const aoclsparse_int nnz, const float a, const float *x, const aoclsparse_int *indx, float *y)#
aoclsparse_status aoclsparse_daxpyi(const aoclsparse_int nnz, const double a, const double *x, const aoclsparse_int *indx, double *y)#
aoclsparse_status aoclsparse_caxpyi(const aoclsparse_int nnz, const void *a, const void *x, const aoclsparse_int *indx, void *y)#
aoclsparse_status aoclsparse_zaxpyi(const aoclsparse_int nnz, const void *a, const void *x, const aoclsparse_int *indx, void *y)#

A variant of sparse vector-vector addition between compressed sparse vector and dense vector.

aoclsparse_?axpyi adds a scalar multiple of compressed sparse vector to a dense vector.

Let \(y\in R^m\) (or \(C^m\)) be a dense vector, \(x\) be a compressed sparse vector and \(I_x\) be the nonzero indices set for x of length at least nnz described by indx, then

\[ y_{I_{x_{i}}} = a\,x_i + y_{I_{x_{i}}}, \quad i\in\{1,\ldots,{\bf\mathsf{nnz}}\}. \]

 1/* ************************************************************************
 2 * Copyright (c) 2023-2024 Advanced Micro Devices, Inc.
 3 *
 4 * Permission is hereby granted, free of charge, to any person obtaining a copy
 5 * of this software and associated documentation files (the "Software"), to deal
 6 * in the Software without restriction, including without limitation the rights
 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 8 * copies of the Software, and to permit persons to whom the Software is
 9 * furnished to do so, subject to the following conditions:
10 *
11 * The above copyright notice and this permission notice shall be included in
12 * all copies or substantial portions of the Software.
13 *
14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
20 * THE SOFTWARE.
21 *
22 * ************************************************************************ */
23
24#include "aoclsparse.h"
25
26#include <cfloat>
27#include <cmath>
28#include <complex>
29#include <iomanip>
30#include <iostream>
31#include <vector>
32
33/* Sample program to illustrate the usage of axpyi API which performs vector-vector
34 * addition operation of the form y = ax + y, where
35 * a: scalar value
36 * x: compressed sparse vector
37 * y: dense vector
38 * Below example is for complex double precision. Using other precisions are also similar.
39 */
40int main(void)
41{
42    std::cout << "--------------------------------" << std::endl
43              << "----- axpyi sample program -----" << std::endl
44              << "--------------------------------" << std::endl
45              << std::endl;
46
47    // Number of non-zeros of the sparse vector
48    const aoclsparse_int nnz = 3;
49    // Scalar value
50    const std::complex<double> a = {10, 20};
51
52    // Sparse index vector (does not need to be ordered)
53    std::vector<aoclsparse_int> indx = {0, 6, 3};
54    // Sparse value vector in compressed form
55    std::vector<std::complex<double>> x{{2, 3}, {4, 1}, {5, 1}};
56
57    // Output vector
58    std::vector<std::complex<double>> y{{0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {65, -999}};
59
60    // Expected output vector
61    std::vector<std::complex<double>> y_exp{
62        {-40, 70}, {0, 0}, {0, 0}, {30, 110}, {0, 0}, {0, 0}, {85, -909}};
63
64    aoclsparse_int ny = y.size();
65
66    aoclsparse_status status;
67
68    std::cout << "Invoking aoclsparse_zaxpyi...\n";
69    //Invoke complex axpyi
70    status = aoclsparse_zaxpyi(nnz, &a, x.data(), indx.data(), y.data());
71    if(status != aoclsparse_status_success)
72    {
73        std::cerr << "Error returned from aoclsparse_zaxpyi, status = " << status << "."
74                  << std::endl;
75        return 3;
76    }
77
78    // Check and print the result
79    std::cout << "vector-vector addition of y: " << std::endl
80              << std::setw(10) << "y"
81              << "   " << std::setw(20) << "expected y" << std::endl;
82    std::cout << std::fixed;
83    std::cout.precision(4);
84    bool oki, ok = true;
85    for(aoclsparse_int i = 0; i < ny; i++)
86    {
87        oki = y[i] == y_exp[i];
88        ok &= oki;
89        std::cout << std::setw(15) << y[i] << std::setw(5) << "" << std::setw(15) << y_exp[i]
90                  << std::setw(5) << (oki ? "" : " ! Error") << std::endl;
91    }
92    return (ok ? 0 : 6);
93}

Note

The contents of the vectors are not checked for NaNs.

Parameters
  • nnz[in] The number of elements in \(x\) and indx.

  • a[in] Scalar value.

  • x[in] Sparse vector stored in compressed form of at least nnz elements.

  • indx[in] Nonzero indices set, \(I_x\), of x described by this array of length at least nnz. The elements in this vector are only checked for non-negativity. The caller should make sure that all indices are less than the size of y. Array is assumed to be in zero base.

  • y[inout] Array of at least \(\max(I_{x_i}, i \in \{ 1,\ldots,{\bf\mathsf{nnz}}\})\) elements.

Return values
  • aoclsparse_status_success – The operation completed successfully.

  • aoclsparse_status_invalid_pointer – At least one of the pointers x, indx, y is invalid.

  • aoclsparse_status_invalid_size – Indicates that provided nnz is less than zero.

  • aoclsparse_status_invalid_index_value – At least one of the indices in indx is negative.

aoclsparse_?dotci()#

aoclsparse_status aoclsparse_cdotci(const aoclsparse_int nnz, const void *x, const aoclsparse_int *indx, const void *y, void *dot)#
aoclsparse_status aoclsparse_zdotci(const aoclsparse_int nnz, const void *x, const aoclsparse_int *indx, const void *y, void *dot)#

Sparse conjugate dot product for single and double data precision complex types.

aoclsparse_cdotci() (complex float) and aoclsparse_zdotci() (complex double) compute the dot product of the conjugate of a complex vector stored in a compressed format and a complex dense vector. Let \(x\) and \(y\) be respectively a sparse and dense vectors in \(C^m\) with indx ( \(I_x\)) the nonzero indices array of x of length at least nnz that is used to index into the entries of dense vector \(y\), then these functions return

\[ {\bf\mathsf{dot}} = \sum_{i=0}^{{\bf\mathsf{nnz}}-1} \overline{\,x_i\,} \cdot y_{I_{x_i}}. \]

 1/* ************************************************************************
 2 * Copyright (c) 2023 Advanced Micro Devices, Inc.
 3 *
 4 * Permission is hereby granted, free of charge, to any person obtaining a copy
 5 * of this software and associated documentation files (the "Software"), to deal
 6 * in the Software without restriction, including without limitation the rights
 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 8 * copies of the Software, and to permit persons to whom the Software is
 9 * furnished to do so, subject to the following conditions:
10 *
11 * The above copyright notice and this permission notice shall be included in
12 * all copies or substantial portions of the Software.
13 *
14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
20 * THE SOFTWARE.
21 *
22 * ************************************************************************ */
23
24#include <complex>
25#include <iostream>
26#include <vector>
27
28/* One can use std::complex<double> instead of aoclsparse_double_complex
29#define STD_CMPLX  (or pass it as a compilation flag -DSTD_CMPLX)
30#ifdef STD_CMPLX
31#define aoclsparse_double_complex std::complex<double>
32#endif
33*/
34
35#include "aoclsparse.h"
36
37int main(void)
38{
39    aoclsparse_status                      status;
40    const aoclsparse_int                   nnz  = 3;
41    std::vector<aoclsparse_int>            indx = {0, 1, 2};
42    std::vector<aoclsparse_double_complex> x{{1, 1}, {2, 2}, {3, 3}};
43    std::vector<aoclsparse_double_complex> y{
44        {1, 1}, {1, 2}, {2, 3}, {1, 0}, {4, 1}, {1.2, -1}, {7, -1}, {0, 2}, {-2, 3}};
45    aoclsparse_double_complex dot_exp{-5, 23};
46    aoclsparse_double_complex dotc_exp{23, 5};
47    aoclsparse_double_complex dotp;
48    bool                      ok;
49
50    std::cout << "Invoking aoclsparse_zdotui...\n";
51
52    //Invoke complex double dot product
53    status = aoclsparse_zdotui(nnz, x.data(), indx.data(), y.data(), &dotp);
54    if(status != aoclsparse_status_success)
55    {
56        std::cerr << "Error returned from aoclsparse_cdotui, status = " << status << "."
57                  << std::endl;
58        return 1;
59    }
60
61#ifdef STD_CMPLX
62    std::cout << "Dot product: " << dotp << ", Expected dot product: " << dot_exp << std::endl;
63    ok = (dotp == dot_exp);
64#else
65    std::cout << "Dot product: (" << dotp.real << ", " << dotp.imag << "), Expected dot product: ("
66              << dot_exp.real << ", " << dot_exp.imag << ")" << std::endl;
67    ok = ((dotp.real == dot_exp.real) && (dotp.imag == dot_exp.imag));
68#endif
69
70    std::cout << "Invoking aoclsparse_zdotci...\n";
71
72    //Invoke complex double dot product
73    status = aoclsparse_zdotci(nnz, x.data(), indx.data(), y.data(), &dotp);
74    if(status != aoclsparse_status_success)
75    {
76        std::cerr << "Error returned from aoclsparse_cdotci, status = " << status << "."
77                  << std::endl;
78        return 2;
79    }
80#ifdef STD_CMPLX
81    std::cout << "Conjugated dot product: " << dotp
82              << ", Expected conjugated dot product: " << dotc_exp << std::endl;
83    ok &= (dotp == dotc_exp);
84    return ok ? 0 : 3;
85
86#else
87    std::cout << "Dot product: (" << dotp.real << ", " << dotp.imag << "), Expected dot product: ("
88              << dotc_exp.real << ", " << dotc_exp.imag << ")" << std::endl;
89    ok &= ((dotp.real == dotc_exp.real) && (dotp.imag == dotc_exp.imag));
90    return ok ? 0 : 4;
91#endif
92}

Note

The contents of the vectors are not checked for NaNs.

Parameters
  • nnz[in] The number of elements (length) of vectors x and indx.

  • x[in] Array of at least nnz complex elements.

  • indx[in] Nonzero indices set, \(I_x\), of x described by this array of length at least nnz. Each entry must contain a valid index into y and be unique. The entries of indx are not checked for validity.

  • y[in] Array of at least \(\max(I_{x_i}, i \in \{ 1,\ldots,{\bf\mathsf{nnz}}\})\) complex elements.

  • dot[out] The dot product of the conjugate of \(x\) and \(y\) when nnz \(> 0\). If nnz \(\le 0\), dot is set to 0.

Return values
  • aoclsparse_status_success – The operation completed successfully.

  • aoclsparse_status_invalid_pointer – At least one of the pointers x, indx, y, dot is invalid.

  • aoclsparse_status_invalid_size – Indicates that the provided nnz is not positive.

aoclsparse_?dotui()#

aoclsparse_status aoclsparse_cdotui(const aoclsparse_int nnz, const void *x, const aoclsparse_int *indx, const void *y, void *dot)#
aoclsparse_status aoclsparse_zdotui(const aoclsparse_int nnz, const void *x, const aoclsparse_int *indx, const void *y, void *dot)#

Sparse dot product for single and double data precision complex types.

aoclsparse_cdotui() (complex float) and aoclsparse_zdotui() (complex double) compute the dot product of a complex vector stored in a compressed format and a complex dense vector. Let \(x\) and \(y\) be respectively a sparse and dense vectors in \(C^m\) with indx ( \(I_x\)) the nonzero indices array of x of length at least nnz that is used to index into the entries of dense vector \(y\), then these functions return

\[ {\bf\mathsf{dot}} = \sum_{i=0}^{{\bf\mathsf{nnz}}-1} x_{i} \cdot y_{I_{x_i}}. \]

 1/* ************************************************************************
 2 * Copyright (c) 2023 Advanced Micro Devices, Inc.
 3 *
 4 * Permission is hereby granted, free of charge, to any person obtaining a copy
 5 * of this software and associated documentation files (the "Software"), to deal
 6 * in the Software without restriction, including without limitation the rights
 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 8 * copies of the Software, and to permit persons to whom the Software is
 9 * furnished to do so, subject to the following conditions:
10 *
11 * The above copyright notice and this permission notice shall be included in
12 * all copies or substantial portions of the Software.
13 *
14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
20 * THE SOFTWARE.
21 *
22 * ************************************************************************ */
23
24#include <complex>
25#include <iostream>
26#include <vector>
27
28/* One can use std::complex<double> instead of aoclsparse_double_complex
29#define STD_CMPLX  (or pass it as a compilation flag -DSTD_CMPLX)
30#ifdef STD_CMPLX
31#define aoclsparse_double_complex std::complex<double>
32#endif
33*/
34
35#include "aoclsparse.h"
36
37int main(void)
38{
39    aoclsparse_status                      status;
40    const aoclsparse_int                   nnz  = 3;
41    std::vector<aoclsparse_int>            indx = {0, 1, 2};
42    std::vector<aoclsparse_double_complex> x{{1, 1}, {2, 2}, {3, 3}};
43    std::vector<aoclsparse_double_complex> y{
44        {1, 1}, {1, 2}, {2, 3}, {1, 0}, {4, 1}, {1.2, -1}, {7, -1}, {0, 2}, {-2, 3}};
45    aoclsparse_double_complex dot_exp{-5, 23};
46    aoclsparse_double_complex dotc_exp{23, 5};
47    aoclsparse_double_complex dotp;
48    bool                      ok;
49
50    std::cout << "Invoking aoclsparse_zdotui...\n";
51
52    //Invoke complex double dot product
53    status = aoclsparse_zdotui(nnz, x.data(), indx.data(), y.data(), &dotp);
54    if(status != aoclsparse_status_success)
55    {
56        std::cerr << "Error returned from aoclsparse_cdotui, status = " << status << "."
57                  << std::endl;
58        return 1;
59    }
60
61#ifdef STD_CMPLX
62    std::cout << "Dot product: " << dotp << ", Expected dot product: " << dot_exp << std::endl;
63    ok = (dotp == dot_exp);
64#else
65    std::cout << "Dot product: (" << dotp.real << ", " << dotp.imag << "), Expected dot product: ("
66              << dot_exp.real << ", " << dot_exp.imag << ")" << std::endl;
67    ok = ((dotp.real == dot_exp.real) && (dotp.imag == dot_exp.imag));
68#endif
69
70    std::cout << "Invoking aoclsparse_zdotci...\n";
71
72    //Invoke complex double dot product
73    status = aoclsparse_zdotci(nnz, x.data(), indx.data(), y.data(), &dotp);
74    if(status != aoclsparse_status_success)
75    {
76        std::cerr << "Error returned from aoclsparse_cdotci, status = " << status << "."
77                  << std::endl;
78        return 2;
79    }
80#ifdef STD_CMPLX
81    std::cout << "Conjugated dot product: " << dotp
82              << ", Expected conjugated dot product: " << dotc_exp << std::endl;
83    ok &= (dotp == dotc_exp);
84    return ok ? 0 : 3;
85
86#else
87    std::cout << "Dot product: (" << dotp.real << ", " << dotp.imag << "), Expected dot product: ("
88              << dotc_exp.real << ", " << dotc_exp.imag << ")" << std::endl;
89    ok &= ((dotp.real == dotc_exp.real) && (dotp.imag == dotc_exp.imag));
90    return ok ? 0 : 4;
91#endif
92}

Note

The contents of the vectors are not checked for NaNs.

Parameters
  • nnz[in] The number of elements (length) of vectors \(x\) and \(indx\).

  • x[in] Array of at least nnz complex elements.

  • indx[in] Nonzero indices set, \(I_x\), of x described by this array of length at least nnz. Each entry must contain a valid index into y and be unique. The entries of indx are not checked for validity.

  • y[in] Array of at least \(\max(I_{x_i}, i \in \{ 1,\ldots,{\bf\mathsf{nnz}}\})\) complex elements.

  • dot[out] The dot product of x and y when nnz \(> 0\). If nnz \(\le 0\), dot is set to 0.

Return values
  • aoclsparse_status_success – The operation completed successfully.

  • aoclsparse_status_invalid_pointer – At least one of the pointers x, indx, y, dot is invalid.

  • aoclsparse_status_invalid_size – Indicates that the provided nnz is not positive.

aoclsparse_?doti()#

float aoclsparse_sdoti(const aoclsparse_int nnz, const float *x, const aoclsparse_int *indx, const float *y)#
double aoclsparse_ddoti(const aoclsparse_int nnz, const double *x, const aoclsparse_int *indx, const double *y)#

Sparse dot product for single and double data precision real types.

aoclsparse_sdoti() and aoclsparse_ddoti() compute the dot product of a real vector stored in a compressed format and a real dense vector. Let \(x\) and \(y\) be respectively a sparse and dense vectors in \(R^m\) with indx ( \(I_x\)) an indices array of length at least nnz that is used to index into the entries of dense vector \(y\), then these functions return

\[ {\bf\mathsf{dot}} = \sum_{i=0}^{{\bf\mathsf{nnz}}-1} x_{i} \cdot y_{I_{x_i}}. \]

Note

The contents of the vectors are not checked for NaNs.

Parameters
  • nnz[in] The number of elements to access in vectors x and indx.

  • x[in] Array of at least nnz elements.

  • indx[in] Nonzero indices set, \(I_x\), of x described by this array of length at least nnz. Each entry must contain a valid index into y and be unique. The entries of indx are not checked for validity.

  • y[in] Array of at least \(\max(I_{x_i}, i \in \{ 1,\ldots,{\bf\mathsf{nnz}}\})\) elements.

Return values

dot – Value of the dot product if nnz is positive, otherwise returns 0.

aoclsparse_?sctr()#

aoclsparse_status aoclsparse_ssctr(const aoclsparse_int nnz, const float *x, const aoclsparse_int *indx, float *y)#
aoclsparse_status aoclsparse_dsctr(const aoclsparse_int nnz, const double *x, const aoclsparse_int *indx, double *y)#
aoclsparse_status aoclsparse_csctr(const aoclsparse_int nnz, const void *x, const aoclsparse_int *indx, void *y)#
aoclsparse_status aoclsparse_zsctr(const aoclsparse_int nnz, const void *x, const aoclsparse_int *indx, void *y)#

Sparse scatter for single and double precision real and complex types.

aoclsparse_?sctr scatter the elements of a compressed sparse vector into a dense vector.

Let \(y\in R^m\) (or \(C^m\)) be a dense vector, and \(x\) be a compressed sparse vector with \(I_x\) be its nonzero indices set of length at least nnz and described by the array indx, then

\[ y_{I_{x_{i}}} = x_i, \quad i\in\{1,\ldots,{\bf\mathsf{nnz}}\}. \]

 1/* ************************************************************************
 2 * Copyright (c) 2023 Advanced Micro Devices, Inc.
 3 *
 4 * Permission is hereby granted, free of charge, to any person obtaining a copy
 5 * of this software and associated documentation files (the "Software"), to deal
 6 * in the Software without restriction, including without limitation the rights
 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 8 * copies of the Software, and to permit persons to whom the Software is
 9 * furnished to do so, subject to the following conditions:
10 *
11 * The above copyright notice and this permission notice shall be included in
12 * all copies or substantial portions of the Software.
13 *
14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
20 * THE SOFTWARE.
21 *
22 * ************************************************************************ */
23
24#include "aoclsparse.h"
25
26#include <cfloat>
27#include <cmath>
28#include <complex>
29#include <iomanip>
30#include <iostream>
31#include <vector>
32
33// Sample program to illustrate the usage of scatter of a compressed sparse vector to the full storage form (complex double precision)
34// Using other precisions are also similar to below example
35int main(void)
36{
37    std::cout << "--------------------------------" << std::endl
38              << "---- Scatter sample program ----" << std::endl
39              << "--------------------------------" << std::endl
40              << std::endl;
41
42    // Number of non-zeros of the sparse vector
43    const aoclsparse_int nnz = 3;
44
45    // Sparse index vector (does not need to be ordered)
46    std::vector<aoclsparse_int> indx = {0, 3, 6};
47    // Sparse value vector in compressed form
48    std::vector<std::complex<double>> x{{1.01, -1.13}, {2.4, -2.0}, {-0.3, 1.3}};
49
50    // Output vector
51    std::vector<std::complex<double>> y{{0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}};
52
53    // Expected output vector
54    std::vector<std::complex<double>> y_exp{
55        {1.01, -1.13}, {0, 0}, {0, 0}, {2.4, -2.0}, {0, 0}, {0, 0}, {-0.3, 1.3}};
56
57    aoclsparse_int ny = y.size();
58
59    aoclsparse_status status;
60
61    std::cout << "Invoking aoclsparse_zsctr...\n";
62    //Invoke complex scatter
63    status = aoclsparse_zsctr(nnz, x.data(), indx.data(), y.data());
64    if(status != aoclsparse_status_success)
65    {
66        std::cerr << "Error returned from aoclsparse_zsctr, status = " << status << "."
67                  << std::endl;
68        return 3;
69    }
70
71    // Check and print the result
72    std::cout << "Scatter to vector y: " << std::endl
73              << std::setw(10) << "y"
74              << "   " << std::setw(17) << "expected y" << std::endl;
75    std::cout << std::fixed;
76    std::cout.precision(4);
77    bool oki, ok = true;
78    for(aoclsparse_int i = 0; i < ny; i++)
79    {
80        oki = y[i] == y_exp[i];
81        ok &= oki;
82        std::cout << std::setw(11) << y[i] << std::setw(3) << "" << std::setw(11) << y_exp[i]
83                  << std::setw(2) << (oki ? "" : " ! Error") << std::endl;
84    }
85    return (ok ? 0 : 6);
86}

Note

The contents of the vectors are not checked for NaNs.

Parameters
  • nnz[in] The number of elements to use from \(x\) and \({\bf\mathsf{indx}}\).

  • x[in] Dense array of at least size \({\bf\mathsf{nnz}}\). The first \({\bf\mathsf{nnz}}\) elements are to be scattered.

  • indx[in] Nonzero index set for \(x\) of size at least \({\bf\mathsf{nnz}}\). The first \({\bf\mathsf{nnz}}\) indices are used for the scattering. The elements in this vector are only checked for non-negativity. The user should make sure that index is less than the size of y.

  • y[out] Array of at least \(\max(I_{x_i}, i \in \{ 1,\ldots,{\bf\mathsf{nnz}}\})\) elements.

Return values
  • aoclsparse_status_success – The operation completed successfully.

  • aoclsparse_status_invalid_pointer – At least one of the pointers x, indx, y is invalid.

  • aoclsparse_status_invalid_size – Indicates that provided nnz is less than zero.

  • aoclsparse_status_invalid_index_value – At least one of the indices in indx is negative.

sparse_?sctrs()#

aoclsparse_status aoclsparse_ssctrs(const aoclsparse_int nnz, const float *x, aoclsparse_int stride, float *y)#
aoclsparse_status aoclsparse_dsctrs(const aoclsparse_int nnz, const double *x, aoclsparse_int stride, double *y)#
aoclsparse_status aoclsparse_csctrs(const aoclsparse_int nnz, const void *x, aoclsparse_int stride, void *y)#
aoclsparse_status aoclsparse_zsctrs(const aoclsparse_int nnz, const void *x, aoclsparse_int stride, void *y)#

Sparse scatter with stride for real/complex single and double data precisions.

aoclsparse_?sctrs scatters the elements of a compressed sparse vector into a dense vector using a stride.

Let \(y\) be a dense vector of length \(n>0\), \(x\) be a compressed sparse vector with nnz > 0 nonzeros, and stride be a striding distance, then

\[ y_{{\bf\mathsf{stride}} \times i} = x_i,\quad i\in\{1,\ldots,{\bf\mathsf{nnz}}\}.\]

Note

Contents of the vector x are accessed but not checked.

Parameters
  • nnz[in] Number of nonzero elements to access in \(x\).

  • x[in] Array of at least nnz elements. The first nnz elements are to be scattered into y.

  • stride[in] (Positive) striding distance used to store elements in vector y.

  • y[out] Array of size at least stride \(\times\) nnz.

Return values
  • aoclsparse_status_success – The operation completed successfully.

  • aoclsparse_status_invalid_pointer – At least one of the pointers x, y is invalid.

  • aoclsparse_status_invalid_size – Indicates that one or more of the values provided in nnz or stride is not positive.

aoclsparse_?roti()#

aoclsparse_status aoclsparse_sroti(const aoclsparse_int nnz, float *x, const aoclsparse_int *indx, float *y, const float c, const float s)#
aoclsparse_status aoclsparse_droti(const aoclsparse_int nnz, double *x, const aoclsparse_int *indx, double *y, const double c, const double s)#

Apply Givens rotation to single or double precision real vectors.

aoclsparse_sroti() and aoclsparse_droti() apply the Givens rotation on elements of two real vectors.

Let \(y\in R^m\) be a vector in full storage form, \(x\) be a vector in a compressed form and \(I_x\) its nonzero indices set of length at least nnz described by the array indx, then

\[ x_i = {\bf\mathsf{c}} * x_i + {\bf\mathsf{s}} * y_{I_{x_{i}}}, \]
\[ y_{I_{x_{i}}} = {\bf\mathsf{c}} * y_{I_{x_{i}}} - {\bf\mathsf{s}} * x_i, \]

for \(i\in 1, \ldots, {\bf\mathsf{nnz}}\). The elements c, s are scalars.

  1/* ************************************************************************
  2 * Copyright (c) 2023 Advanced Micro Devices, Inc.
  3 *
  4 * Permission is hereby granted, free of charge, to any person obtaining a copy
  5 * of this software and associated documentation files (the "Software"), to deal
  6 * in the Software without restriction, including without limitation the rights
  7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  8 * copies of the Software, and to permit persons to whom the Software is
  9 * furnished to do so, subject to the following conditions:
 10 *
 11 * The above copyright notice and this permission notice shall be included in
 12 * all copies or substantial portions of the Software.
 13 *
 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 20 * THE SOFTWARE.
 21 *
 22 * ************************************************************************ */
 23
 24#include "aoclsparse.h"
 25
 26#include <cfloat>
 27#include <cmath>
 28#include <iomanip>
 29#include <iostream>
 30#include <limits>
 31#include <vector>
 32
 33// Sample program to illustrate the usage of Givens rotation on a compressed sparse vector and a full storage vector(double precision)
 34int main(void)
 35{
 36    std::cout << "-----------------------------" << std::endl
 37              << "---- roti sample program ----" << std::endl
 38              << "-----------------------------" << std::endl
 39              << std::endl;
 40
 41    // Input data
 42
 43    // Number of non-zeros of the sparse vector
 44    const aoclsparse_int nnz = 5;
 45
 46    // Sparse index vector (does not need to be ordered)
 47    std::vector<aoclsparse_int> indx{0, 3, 6, 7, 9};
 48
 49    // Sparse value vector in compressed form
 50    std::vector<double> x{-0.75, 4, -9.5, 46, 1.25};
 51
 52    // Output vector
 53    std::vector<double> y{-0.75, 0, 0, 4, 0, 0, -9.5, 46, 0, 1.25};
 54
 55    // Expected output vectors
 56    std::vector<double> y_exp{0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
 57    std::vector<double> x_exp{-3, 16, -38, 184, 5};
 58
 59    const double c = 2;
 60    const double s = 2;
 61
 62    aoclsparse_int ny = y.size();
 63    aoclsparse_int nx = x.size();
 64
 65    aoclsparse_status status;
 66    std::cout << "Invoking aoclsparse_droti...\n";
 67    //Invoke aoclsparse_droti to apply Givens rotation
 68    status = aoclsparse_droti(nnz, x.data(), indx.data(), y.data(), c, s);
 69
 70    if(status != aoclsparse_status_success)
 71    {
 72        std::cerr << "Error returned from aoclsparse_droti, status = " << status << "."
 73                  << std::endl;
 74        return 3;
 75    }
 76
 77    // Check and print the result
 78    std::cout << "The vector y after Givens rotation: " << std::endl
 79              << std::setw(11) << "y"
 80              << "   " << std::setw(11) << "expected" << std::endl;
 81    std::cout << std::fixed;
 82
 83    std::cout.precision(2);
 84
 85    bool oki, ok_y = true, ok_x = true;
 86
 87    //Initializing precision tolerance range for double
 88    const double tol = std::sqrt(std::numeric_limits<double>::epsilon());
 89
 90    for(aoclsparse_int i = 0; i < ny; i++)
 91    {
 92        oki = std::abs(y[i] - y_exp[i]) <= tol;
 93        ok_y &= oki;
 94        std::cout << std::setw(11) << y[i] << std::setw(3) << "" << std::setw(11) << y_exp[i]
 95                  << std::setw(2) << (oki ? "" : " !") << std::endl;
 96    }
 97    std::cout << std::endl;
 98    std::cout << "The vector x after Givens rotation: " << std::endl
 99              << std::setw(11) << "x"
100              << "   " << std::setw(11) << "expected" << std::endl;
101    std::cout << std::fixed;
102    for(aoclsparse_int i = 0; i < nx; i++)
103    {
104        oki = std::abs(x[i] - x_exp[i]) <= tol;
105        ok_x &= oki;
106        std::cout << std::setw(11) << x[i] << std::setw(3) << "" << std::setw(11) << x_exp[i]
107                  << std::setw(2) << (oki ? "" : " !") << std::endl;
108    }
109    return ((ok_y && ok_x) ? 0 : 6);
110}

Note

The contents of the vectors are not checked for NaNs.

Parameters
  • nnz[in] The number of elements to use from \(x\) and \({\bf\mathsf{indx}}\).

  • x[inout] Array \(x\) of at least \({\bf\mathsf{nnz}}\) elements in compressed form. The elements of the array are updated after applying the Givens rotation.

  • indx[in] Nonzero index set of \(x\), \(I_x\), with at least \({\bf\mathsf{nnz}}\) elements. The first \({\bf\mathsf{nnz}}\) elements are used to apply the Givens rotation. The elements in this vector are only checked for non-negativity. The caller should make sure that each entry is less than the size of y and are all distinct.

  • y[inout] Dense array of at least \(\max(I_{x_i}, \text{ for } i \in \{ 1,\ldots, {\bf\mathsf{nnz}}\})\) elements in full storage form. The elements of the array are updated after applying the Givens rotation.

  • c[in] A scalar.

  • s[in] A scalar.

Return values
  • aoclsparse_status_success – The operation completed successfully.

  • aoclsparse_status_invalid_pointer – At least one of the pointers x, indx, y is invalid.

  • aoclsparse_status_invalid_size – Indicates that provided nnz is less than zero.

  • aoclsparse_status_invalid_index_value – At least one of the indices in indx is negative. With this error, the values of vectors x and y are undefined.

aoclsparse_?gthr()#

aoclsparse_status aoclsparse_sgthr(aoclsparse_int nnz, const float *y, float *x, const aoclsparse_int *indx)#
aoclsparse_status aoclsparse_dgthr(aoclsparse_int nnz, const double *y, double *x, const aoclsparse_int *indx)#
aoclsparse_status aoclsparse_cgthr(aoclsparse_int nnz, const void *y, void *x, const aoclsparse_int *indx)#
aoclsparse_status aoclsparse_zgthr(aoclsparse_int nnz, const void *y, void *x, const aoclsparse_int *indx)#

Gather elements from a dense vector and store them into a sparse vector.

The aoclsparse_?gthr is a group of functions that gather the elements indexed in indx from the dense vector y into the sparse vector x.

Let \(y\in R^m\) (or \(C^m\)) be a dense vector, \(x\) be a sparse vector from the same space and \(I_x\) be a set of indices of size \(0<\) nnz \(\le m\) described by indx, then

\[ x_i = y_{I_{x_i}}, i\in\{1,\ldots,{\bf\mathsf{nnz}}\}. \]

For double precision complex vectors use aoclsparse_zgthr() and for single precision complex vectors use aoclsparse_cgthr().

  1/* ************************************************************************
  2 * Copyright (c) 2023 Advanced Micro Devices, Inc.
  3 *
  4 * Permission is hereby granted, free of charge, to any person obtaining a copy
  5 * of this software and associated documentation files (the "Software"), to deal
  6 * in the Software without restriction, including without limitation the rights
  7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  8 * copies of the Software, and to permit persons to whom the Software is
  9 * furnished to do so, subject to the following conditions:
 10 *
 11 * The above copyright notice and this permission notice shall be included in
 12 * all copies or substantial portions of the Software.
 13 *
 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 20 * THE SOFTWARE.
 21 *
 22 * ************************************************************************ */
 23
 24#include "aoclsparse.h"
 25
 26#include <cfloat>
 27#include <cmath>
 28#include <complex>
 29#include <iomanip>
 30#include <iostream>
 31#include <vector>
 32
 33int main()
 34{
 35    std::cout << "-------------------------------" << std::endl
 36              << "---- Gather sample program ----" << std::endl
 37              << "-------------------------------" << std::endl
 38              << std::endl;
 39
 40    // Input data
 41
 42    // Number of non-zeros of the sparse vector
 43    const aoclsparse_int nnz = 3;
 44
 45    // Sparse index vector (does not need to be ordered)
 46    std::vector<aoclsparse_int> indx = {0, 5, 3};
 47
 48    // Sparse value vector
 49    std::vector<std::complex<double>> x(3);
 50
 51    // Dense vector
 52    std::vector<std::complex<double>> y{
 53        {1, 1}, {1, 2}, {2, 3}, {1, 0}, {4, 1}, {1.2, -1}, {7, -1}, {0, 2}, {-2, 3}};
 54
 55    // Expected sparse value vector
 56    std::vector<std::complex<double>> x_exp{{1, 1}, {1.2, -1}, {1, 0}};
 57
 58    // Expected dense vector
 59    std::vector<std::complex<double>> y_exp{
 60        {0, 0}, {1, 2}, {2, 3}, {0, 0}, {4, 1}, {0, 0}, {7, -1}, {0, 2}, {-2, 3}};
 61
 62    aoclsparse_int ny = y.size();
 63
 64    aoclsparse_status status;
 65
 66    // Call the gather function
 67    status = aoclsparse_zgthr(nnz, y.data(), x.data(), indx.data());
 68    if(status != aoclsparse_status_success)
 69    {
 70        std::cerr << "Error returned from aoclsparse_zgthr, status = " << status << "."
 71                  << std::endl;
 72        return 1;
 73    }
 74
 75    // Check and print the result
 76    std::cout << "Gather from vector y: " << std::endl
 77              << std::setw(11) << "x"
 78              << "   " << std::setw(11) << "expected" << std::endl;
 79    std::cout << std::fixed;
 80    std::cout.precision(1);
 81    bool oki, ok = true;
 82    for(aoclsparse_int i = 0; i < nnz; i++)
 83    {
 84        oki = x[i] == x_exp[i];
 85        ok &= oki;
 86        std::cout << std::setw(11) << x[i] << std::setw(3) << "" << std::setw(11) << x_exp[i]
 87                  << std::setw(2) << (oki ? "" : " !") << std::endl;
 88    }
 89
 90    // Call the gather with zero function
 91    status = aoclsparse_zgthrz(nnz, y.data(), x.data(), indx.data());
 92    if(status != aoclsparse_status_success)
 93    {
 94        std::cerr << "Error returned from aoclsparse_zgthr, status = " << status << "."
 95                  << std::endl;
 96        return 3;
 97    }
 98
 99    // Check and print the result
100    std::cout << std::endl
101              << "Gather from vector y: " << std::endl
102              << std::setw(11) << "y" << std::setw(3) << "" << std::setw(11) << "expected"
103              << std::setw(3) << "" << std::setw(11) << "x" << std::setw(3) << "" << std::setw(11)
104              << "expected" << std::endl;
105    std::cout << std::fixed;
106    std::cout.precision(1);
107    bool oky;
108    for(aoclsparse_int i = 0; i < ny; i++)
109    {
110        oky = y[i] == y_exp[i];
111        ok &= oky;
112        std::cout << std::setw(11) << y[i] << std::setw(3) << "" << std::setw(11) << y_exp[i]
113                  << std::setw(2) << (oky ? "" : " !");
114        if(i < nnz)
115        {
116            oki = x[i] == x_exp[i];
117            ok &= oki;
118            std::cout << " " << std::setw(11) << x[i] << std::setw(3) << "" << std::setw(11)
119                      << x_exp[i] << std::setw(2) << (oki ? "" : " !");
120        }
121        std::cout << std::endl;
122    }
123
124    return (ok ? 0 : 6);
125}

Note

These functions assume that the indices stored in indx are less than \(m\) without duplicate elements, and that x and indx are pointers to vectors of size at least nnz.

Parameters
  • nnz[in] number of non-zero entries of \(x\). If nnz is zero, then none of the entries of vectors x, y, and indx are touched.

  • y[in] pointer to dense vector \(y\) of size at least \(m\).

  • x[out] pointer to sparse vector \(x\) with at least nnz non-zero elements.

  • indx[in] index vector of size nnz, containing the indices of the non-zero values of \(x\). Indices should range from 0 to \(m-1\), need not be ordered. The elements in this vector are only checked for non-negativity. The user should make sure that no index is out-of-bound and that it does not contains any duplicates.

Return values
  • aoclsparse_status_success – the operation completed successfully

  • aoclsparse_status_invalid_sizennz parameter value is negative

  • aoclsparse_status_invalid_pointer – at least one of the pointers y, x or indx is invalid

  • aoclsparse_status_invalid_index_value – at least one of the indices in indx is negative

aoclsparse_?gthrz()#

aoclsparse_status aoclsparse_sgthrz(aoclsparse_int nnz, float *y, float *x, const aoclsparse_int *indx)#
aoclsparse_status aoclsparse_dgthrz(aoclsparse_int nnz, double *y, double *x, const aoclsparse_int *indx)#
aoclsparse_status aoclsparse_cgthrz(aoclsparse_int nnz, void *y, void *x, const aoclsparse_int *indx)#
aoclsparse_status aoclsparse_zgthrz(aoclsparse_int nnz, void *y, void *x, const aoclsparse_int *indx)#

Gather and zero out elements from a dense vector and store them into a sparse vector.

The aoclsparse_?gthrz is a group of functions that gather the elements

indexed in indx from the dense vector y into the sparse vector x. The gathered elements in \(y\) are replaced by zero.

Let \(y\in R^m\) (or \(C^m\)) be a dense vector, \(x\) be a sparse vector from the same space and \(I_x\) be a set of indices of size \(0<\) nnz \(\le m\) described by indx, then

\[ x_i = y_{I_{x_i}}, i\in\{1,\ldots,{\bf\mathsf{nnz}}\}, \text{ and after the assignment, } y_{I_{x_i}}=0, i\in\{1,\ldots,{\bf\mathsf{nnz}}\}. \]

For double precision complex vectors use aoclsparse_zgthrz() and for single precision complex vectors use aoclsparse_cgthrz().

Note

These functions assume that the indices stored in indx are less than \(m\) without duplicate elements, and that x and indx are pointers to vectors of size at least nnz.

Parameters
  • nnz[in] number of non-zero entries of \(x\). If nnz is zero, then none of the entries of vectors x, y, and indx are touched.

  • y[in] pointer to dense vector \(y\) of size at least \(m\).

  • x[out] pointer to sparse vector \(x\) with at least nnz non-zero elements.

  • indx[in] index vector of size nnz, containing the indices of the non-zero values of \(x\). Indices should range from 0 to \(m-1\), need not be ordered. The elements in this vector are only checked for non-negativity. The user should make sure that no index is out-of-bound and that it does not contains any duplicates.

Return values
  • aoclsparse_status_success – the operation completed successfully

  • aoclsparse_status_invalid_sizennz parameter value is negative

  • aoclsparse_status_invalid_pointer – at least one of the pointers y, x or indx is invalid

  • aoclsparse_status_invalid_index_value – at least one of the indices in indx is negative

aoclsparse_?gthrs()#

aoclsparse_status aoclsparse_sgthrs(aoclsparse_int nnz, const float *y, float *x, aoclsparse_int stride)#
aoclsparse_status aoclsparse_dgthrs(aoclsparse_int nnz, const double *y, double *x, aoclsparse_int stride)#
aoclsparse_status aoclsparse_cgthrs(aoclsparse_int nnz, const void *y, void *x, aoclsparse_int stride)#
aoclsparse_status aoclsparse_zgthrs(aoclsparse_int nnz, const void *y, void *x, aoclsparse_int stride)#

Gather elements from a dense vector using a stride and store them into a sparse vector.

The aoclsparse_?gthrs is a group of functions that gather the elements from the dense vector y using a fixed stride distance and copies them into the sparse vector x.

Let \(y\in R^m\) (or \(C^m\)) be a dense vector, \(x\) be a sparse vector from the same space and stride be a (positive) striding distance, then \( x_i = y_{\text{stride} \times i}, \quad i\in\{1,\ldots,{\bf\mathsf{nnz}}\}. \)

Parameters
  • nnz[in] Number of non-zero entries of \(x\). If nnz is zero, then none of the entries of vectors x and y are accessed. Note that nnz must be such that stride \(\times\) nnz must be less or equal to \(m\).

  • y[in] Pointer to dense vector \(y\) of size at least \(m\).

  • x[out] Pointer to sparse vector \(x\) with at least nnz non-zero elements.

  • stride[in] Striding distance used to access elements in the dense vector y. It must be such that stride \(\times\) nnz is less or equal to \(m\).

Return values
  • aoclsparse_status_success – the operation completed successfully.

  • aoclsparse_status_invalid_size – at least one of the parameters nnz or stride has a negative value.

  • aoclsparse_status_invalid_pointer – at least one of the pointers y, or x is invalid.

Level 2#

This module holds all sparse level 2 routines.

The sparse level 2 routines describe operations between a matrix in sparse format and a vector in dense or sparse format.

aoclsparse_?mv()#

aoclsparse_status aoclsparse_smv(aoclsparse_operation op, const float *alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const float *x, const float *beta, float *y)#
aoclsparse_status aoclsparse_dmv(aoclsparse_operation op, const double *alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const double *x, const double *beta, double *y)#
aoclsparse_status aoclsparse_cmv(aoclsparse_operation op, const aoclsparse_float_complex *alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_float_complex *x, const aoclsparse_float_complex *beta, aoclsparse_float_complex *y)#
aoclsparse_status aoclsparse_zmv(aoclsparse_operation op, const aoclsparse_double_complex *alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_double_complex *x, const aoclsparse_double_complex *beta, aoclsparse_double_complex *y)#

Compute sparse matrix-vector multiplication for real/complex single and double data precisions.

The aoclsparse_?mv perform sparse matrix-vector products of the form

\[ y = \alpha \, op(A) \, x + \beta \, y, \]
where, \(x\) and \(y\) are dense vectors, \(\alpha\) and \(\beta\) are scalars, and \(A\) is a sparse matrix structure. The matrix operation \(op()\) is defined as:
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{op}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{op}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{op}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]

 1/* ************************************************************************
 2 * Copyright (c) 2020-2023 Advanced Micro Devices, Inc.
 3 *
 4 * Permission is hereby granted, free of charge, to any person obtaining a copy
 5 * of this software and associated documentation files (the "Software"), to deal
 6 * in the Software without restriction, including without limitation the rights
 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 8 * copies of the Software, and to permit persons to whom the Software is
 9 * furnished to do so, subject to the following conditions:
10 *
11 * The above copyright notice and this permission notice shall be included in
12 * all copies or substantial portions of the Software.
13 *
14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
20 * THE SOFTWARE.
21 *
22 * ************************************************************************ */
23
24#include "aoclsparse.h"
25
26#include <iostream>
27
28#define M 5
29#define N 5
30#define NNZ 8
31
32int main(void)
33{
34    aoclsparse_operation trans = aoclsparse_operation_none;
35
36    double alpha = 1.0;
37    double beta  = 0.0;
38
39    // Print aoclsparse version
40    std::cout << aoclsparse_get_version() << std::endl;
41
42    // Create matrix descriptor
43    aoclsparse_mat_descr descr;
44    // aoclsparse_create_mat_descr set aoclsparse_matrix_type to aoclsparse_matrix_type_general
45    // and aoclsparse_index_base to aoclsparse_index_base_zero.
46    aoclsparse_create_mat_descr(&descr);
47
48    aoclsparse_index_base base = aoclsparse_index_base_zero;
49
50    // Initialise matrix
51    //  1  0  0  2  0
52    //  0  3  0  0  0
53    //  0  0  4  0  0
54    //  0  5  0  6  7
55    //  0  0  0  0  8
56    aoclsparse_int    csr_row_ptr[M + 1] = {0, 2, 3, 4, 7, 8};
57    aoclsparse_int    csr_col_ind[NNZ]   = {0, 3, 1, 2, 1, 3, 4, 4};
58    double            csr_val[NNZ]       = {1, 2, 3, 4, 5, 6, 7, 8};
59    aoclsparse_matrix A;
60    aoclsparse_create_dcsr(&A, base, M, N, NNZ, csr_row_ptr, csr_col_ind, csr_val);
61
62    // Initialise vectors
63    double x[N] = {1.0, 2.0, 3.0, 4.0, 5.0};
64    double y[M];
65
66    //to identify hint id(which routine is to be executed, destroyed later)
67    aoclsparse_set_mv_hint(A, trans, descr, 1);
68
69    // Optimize the matrix, "A"
70    aoclsparse_optimize(A);
71
72    std::cout << "Invoking aoclsparse_dmv..";
73    //Invoke SPMV API (double precision)
74    aoclsparse_dmv(trans, &alpha, A, descr, x, &beta, y);
75    std::cout << "Done." << std::endl;
76    std::cout << "Output Vector:" << std::endl;
77    for(aoclsparse_int i = 0; i < M; i++)
78        std::cout << y[i] << std::endl;
79
80    aoclsparse_destroy_mat_descr(descr);
81    aoclsparse_destroy(&A);
82    return 0;
83}

  1/* ************************************************************************
  2 * Copyright (c) 2023 Advanced Micro Devices, Inc.
  3 *
  4 * Permission is hereby granted, free of charge, to any person obtaining a copy
  5 * of this software and associated documentation files (the "Software"), to deal
  6 * in the Software without restriction, including without limitation the rights
  7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  8 * copies of the Software, and to permit persons to whom the Software is
  9 * furnished to do so, subject to the following conditions:
 10 *
 11 * The above copyright notice and this permission notice shall be included in
 12 * all copies or substantial portions of the Software.
 13 *
 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 20 * THE SOFTWARE.
 21 *
 22 * ************************************************************************ */
 23
 24#include "aoclsparse.h"
 25
 26#include <math.h>
 27#include <stdio.h>
 28
 29#define M 5
 30#define N 5
 31#define NNZ 8
 32
 33int main(void)
 34{
 35    aoclsparse_status     status;
 36    aoclsparse_matrix     A     = NULL;
 37    aoclsparse_mat_descr  descr = NULL;
 38    aoclsparse_operation  trans = aoclsparse_operation_none;
 39    aoclsparse_index_base base  = aoclsparse_index_base_zero;
 40
 41    double alpha = 1.0;
 42    double beta  = 0.0;
 43
 44    // Input matrix
 45    //  1  0  0  2  0
 46    //  0  3  0  0  0
 47    //  0  0  4  0  0
 48    //  0  5  0  6  7
 49    //  0  0  0  0  8
 50    aoclsparse_int csr_row_ptr[M + 1] = {0, 2, 3, 4, 7, 8};
 51    aoclsparse_int csr_col_ind[NNZ]   = {0, 3, 1, 2, 1, 3, 4, 4};
 52    double         csr_val[NNZ]       = {1, 2, 3, 4, 5, 6, 7, 8};
 53
 54    // Input vectors
 55    double x[N] = {1.0, 2.0, 3.0, 4.0, 5.0};
 56    double y[M];
 57
 58    // Expected reference result & tolerance
 59    double         y_exp[M] = {9.0, 6.0, 12.0, 69.0, 40.0};
 60    double         tol      = 1e-12;
 61    aoclsparse_int oki, ok = 1;
 62
 63    // Print aoclsparse version
 64    printf("%s\n", aoclsparse_get_version());
 65
 66    // Create matrix descriptor
 67    // aoclsparse_create_mat_descr sets aoclsparse_matrix_type to aoclsparse_matrix_type_general
 68    // and aoclsparse_index_base to aoclsparse_index_base_zero.
 69    aoclsparse_create_mat_descr(&descr);
 70
 71    // Initialise sparse matrix
 72    status = aoclsparse_create_dcsr(&A, base, M, N, NNZ, csr_row_ptr, csr_col_ind, csr_val);
 73    if(status != aoclsparse_status_success)
 74    {
 75        printf("Error while creating a sparse matrix, status = %i\n", status);
 76        return 1;
 77    }
 78
 79    // Hint the system what operation to expect // to identify hint id(which routine is to be executed, destroyed later)
 80    status = aoclsparse_set_mv_hint(A, trans, descr, 1);
 81    if(status != aoclsparse_status_success)
 82    {
 83        printf("Error while hinting operation, status = %i\n", status);
 84        return 1;
 85    }
 86
 87    // Optimize the matrix
 88    status = aoclsparse_optimize(A);
 89    if(status != aoclsparse_status_success)
 90    {
 91        printf("Error while optimizing the matrix, status = %i\n", status);
 92        return 1;
 93    }
 94
 95    // Invoke SPMV API (double precision)
 96    printf("Invoking aoclsparse_dmv...\n");
 97    status = aoclsparse_dmv(trans, &alpha, A, descr, x, &beta, y);
 98    if(status != aoclsparse_status_success)
 99    {
100        printf("Error while computing SPMV, status = %i\n", status);
101        return 1;
102    }
103
104    // Print and check the results
105    printf("Output vector:\n");
106    for(aoclsparse_int i = 0; i < M; i++)
107    {
108        oki = fabs(y[i] - y_exp[i]) <= tol;
109        printf("  %lf %c\n", y[i], oki ? ' ' : '!');
110        ok = ok && oki;
111    }
112
113    aoclsparse_destroy_mat_descr(descr);
114    aoclsparse_destroy(&A);
115    return ok ? 0 : 2;
116}
Parameters
Return values
  • aoclsparse_status_success – The operation completed successfully.

  • aoclsparse_status_invalid_size – The value of m, n or nnz is invalid.

  • aoclsparse_status_invalid_pointerdescr, alpha, internal structures related to the sparse matrix A, x, beta or y has an invalid pointer.

  • aoclsparse_status_not_implemented – The requested functionality is not implemented.

aoclsparse_?trsv()#

aoclsparse_status aoclsparse_strsv(aoclsparse_operation trans, const float alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const float *b, float *x)#
aoclsparse_status aoclsparse_dtrsv(aoclsparse_operation trans, const double alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const double *b, double *x)#
aoclsparse_status aoclsparse_ctrsv(aoclsparse_operation trans, const aoclsparse_float_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_float_complex *b, aoclsparse_float_complex *x)#
aoclsparse_status aoclsparse_ztrsv(aoclsparse_operation trans, const aoclsparse_double_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_double_complex *b, aoclsparse_double_complex *x)#

Sparse triangular solver for real/complex single and double data precisions.

The function aoclsparse_strsv() and variants solve sparse lower (or upper) triangular linear system of equations. The system is defined by the sparse \(m \times m\) matrix \(A\), the dense solution \(m\)-vector \(x\), and the right-hand side dense \(m\)-vector \(b\). Vector \(b\) is multiplied by \(\alpha\). The solution \(x\) is estimated by solving

\[ op(L) \cdot x = \alpha \cdot b, \quad \text{ or } \quad op(U) \cdot x = \alpha \cdot b, \]
where \(L = \text{tril}(A)\) is the lower triangle of matrix \(A\), similarly, \(U = \text{triu}(A)\) is the upper triangle of matrix \(A\). The operator \(op()\) is regarded as the matrix linear operation,
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]

Notes

1. This routine supports only sparse matrices in CSR format.

2. If the matrix descriptor descr specifies that the matrix \(A\) is to be regarded has having a unitary diagonal, then the main diagonal entries of matrix \(A\) are not accessed and are all considered to be unitary.

3. The input matrix need not be (upper or lower) triangular matrix, in the descr, the fill_mode entity specifies which triangle to consider, namely, if fill_mode = aoclsparse_fill_mode_lower, then

\[ op(L) \cdot x = \alpha \cdot b, \]

otherwise, if fill_mode = aoclsparse_fill_mode_upper, then

\[ op(U) \cdot x = \alpha \cdot b, \]

is solved.

4. To increase performance and if the matrix \(A\) is to be used more than once to solve for different right-hand sides b, then it is encouraged to provide hints using aoclsparse_set_sv_hint() and aoclsparse_optimize(), otherwise, the optimization for the matrix will be done by the solver on entry.

5. There is a kid (Kernel ID) variation of TRSV, namely with a suffix of _kid, aoclsparse_strsv_kid() (and variations) where it is possible to specify the TRSV kernel to use (if possible).

  1/* ************************************************************************
  2 * Copyright (c) 2022-2023 Advanced Micro Devices, Inc.
  3 *
  4 * Permission is hereby granted, free of charge, to any person obtaining a copy
  5 * of this software and associated documentation files (the "Software"), to deal
  6 * in the Software without restriction, including without limitation the rights
  7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  8 * copies of the Software, and to permit persons to whom the Software is
  9 * furnished to do so, subject to the following conditions:
 10 *
 11 * The above copyright notice and this permission notice shall be included in
 12 * all copies or substantial portions of the Software.
 13 *
 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 20 * THE SOFTWARE.
 21 *
 22 * ************************************************************************ */
 23
 24#include "aoclsparse.h"
 25
 26#include <iomanip>
 27#include <iostream>
 28
 29int main()
 30{
 31    std::cout << "-------------------------------" << std::endl
 32              << " Triangle solve sample program" << std::endl
 33              << "-------------------------------" << std::endl
 34              << std::endl;
 35
 36    // Create a tri-diagonal matrix in CSR format
 37    // | 1  2  0  0 |
 38    // | 3  1  2  0 |
 39    // | 0  3  1  2 |
 40    // | 0  0  3  1 |
 41    aoclsparse_int    n = 4, m = 4, nnz = 10;
 42    aoclsparse_int    icrow[5] = {0, 2, 5, 8, 10};
 43    aoclsparse_int    icol[18] = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};
 44    double            aval[18] = {1, 2, 3, 1, 2, 3, 1, 2, 3, 1};
 45    aoclsparse_status status;
 46
 47    // create aoclsparse matrix and its descriptor
 48    aoclsparse_matrix     A;
 49    aoclsparse_index_base base = aoclsparse_index_base_zero;
 50    aoclsparse_mat_descr  descr_a;
 51    aoclsparse_operation  trans = aoclsparse_operation_none;
 52    aoclsparse_create_dcsr(&A, base, m, n, nnz, icrow, icol, aval);
 53    aoclsparse_create_mat_descr(&descr_a);
 54
 55    // A = L + D + U where:
 56    // - L and U are the strict lower and upper triangle of A
 57    // - D is the diagonal
 58    // Use the matrix descriptor to solve (L+D)x = b
 59    // b = [1, 4, 4, 4]^t
 60    double b[4] = {1, 4, 4, 4};
 61    aoclsparse_set_mat_type(descr_a, aoclsparse_matrix_type_triangular);
 62    aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_lower);
 63    aoclsparse_set_mat_index_base(descr_a, base);
 64    status = aoclsparse_set_sv_hint(A, trans, descr_a, 1);
 65    if(status != aoclsparse_status_success)
 66    {
 67        std::cerr << "Error setting SV hint, status = " << status << std::endl;
 68        return 1;
 69    }
 70    status = aoclsparse_optimize(A);
 71    if(status != aoclsparse_status_success)
 72    {
 73        std::cerr << "Error returned from aoclsparse_optimize, status = " << status << "."
 74                  << std::endl;
 75        return 2;
 76    }
 77    // Call the triangle solver
 78    double alpha = 1.0;
 79    double x[n];
 80    status = aoclsparse_dtrsv(trans, alpha, A, descr_a, b, x);
 81    if(status != aoclsparse_status_success)
 82    {
 83        std::cerr << "Error returned from aoclsparse_dtrsv, status = " << status << "."
 84                  << std::endl;
 85        return 3;
 86    }
 87
 88    // Print the result
 89    std::cout << "Solving (L+D)x = b: " << std::endl << "  x = ";
 90    std::cout << std::fixed;
 91    std::cout.precision(1);
 92    for(aoclsparse_int i = 0; i < n; i++)
 93        std::cout << x[i] << "  ";
 94    std::cout << std::endl;
 95
 96    // The same method can be used to solve (U+D)x = b
 97    aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_upper);
 98    status = aoclsparse_dtrsv(trans, alpha, A, descr_a, b, x);
 99    if(status != aoclsparse_status_success)
100    {
101        std::cerr << "Error returned from aoclsparse_dtrsv, status = " << status << "."
102                  << std::endl;
103        return 3;
104    }
105
106    // Print the result
107    std::cout << std::endl;
108    std::cout << "Solving (U+D)x = b: " << std::endl << "  x = ";
109    for(aoclsparse_int i = 0; i < n; i++)
110        std::cout << x[i] << "  ";
111    std::cout << std::endl;
112
113    // destroy the aoclsparse memory
114    aoclsparse_destroy_mat_descr(descr_a);
115    aoclsparse_destroy(&A);
116
117    return 0;
118}

  1/* ************************************************************************
  2 * Copyright (c) 2023-2024 Advanced Micro Devices, Inc.
  3 *
  4 * Permission is hereby granted, free of charge, to any person obtaining a copy
  5 * of this software and associated documentation files (the "Software"), to deal
  6 * in the Software without restriction, including without limitation the rights
  7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  8 * copies of the Software, and to permit persons to whom the Software is
  9 * furnished to do so, subject to the following conditions:
 10 *
 11 * The above copyright notice and this permission notice shall be included in
 12 * all copies or substantial portions of the Software.
 13 *
 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 20 * THE SOFTWARE.
 21 *
 22 * ************************************************************************ */
 23
 24#include "aoclsparse.h"
 25
 26#include <assert.h>
 27#include <complex>
 28#include <iomanip>
 29#include <iostream>
 30#include <limits>
 31#include <vector>
 32
 33int main()
 34{
 35    std::cout << "-------------------------------" << std::endl
 36              << " Triangle solve sample program" << std::endl
 37              << "-------------------------------" << std::endl
 38              << std::endl;
 39
 40    /* Solve two complex linear systems of equations
 41     * Lx = alpha b, and U^Tx = alpha b,
 42     * where L is the lower triangular part of A and
 43     * U is the upper triangular part of A. U^H is
 44     * the conjugate transpose of U (a lower triangular matrix).
 45     * The complex matrix A is the tri-diagonal matrix in CSR format
 46     *
 47     * | 1+3i  2+5i     0     0 |
 48     * | 3     1+2i  2-2i     0 |
 49     * |    0     3     1  2+3i |
 50     * |    0     0  3+1i  1+1i |
 51     */
 52    const aoclsparse_int                   n = 4, m = 4, nnz = 10;
 53    aoclsparse_int                         icrow[5] = {0, 2, 5, 8, 10};
 54    aoclsparse_int                         icol[18] = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};
 55    std::vector<aoclsparse_double_complex> aval     = {{1., 3.},
 56                                                       {2., 5.},
 57                                                       {3., 0.},
 58                                                       {1., 2.},
 59                                                       {2., -2.},
 60                                                       {3., 0.},
 61                                                       {1., 0.},
 62                                                       {2., 3.},
 63                                                       {3., 1.},
 64                                                       {1., 1.}};
 65    aoclsparse_status                      status;
 66    std::vector<aoclsparse_double_complex> ref;
 67
 68    // create aoclsparse matrix and its descriptor
 69    aoclsparse_matrix     A;
 70    aoclsparse_index_base base = aoclsparse_index_base_zero;
 71    aoclsparse_mat_descr  descr_a;
 72    aoclsparse_operation  trans = aoclsparse_operation_none;
 73    status = aoclsparse_create_zcsr(&A, base, m, n, nnz, icrow, icol, aval.data());
 74    if(status != aoclsparse_status_success)
 75    {
 76        std::cerr << "Error creating the matrix, status = " << status << std::endl;
 77        return 1;
 78    }
 79    aoclsparse_create_mat_descr(&descr_a);
 80
 81    /* Solve the lower triangular system Lx = b,
 82     * here alpha=1 and b = [1+i, 4+2i, 4+i, 4].
 83     */
 84    std::vector<aoclsparse_double_complex> b = {{1., 1.}, {4., 2.}, {4., 1}, {4., 0}};
 85    aoclsparse_set_mat_type(descr_a, aoclsparse_matrix_type_triangular);
 86    aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_lower);
 87    status = aoclsparse_set_sv_hint(A, trans, descr_a, 1);
 88    if(status != aoclsparse_status_success)
 89    {
 90        std::cerr << "Error setting SV hint, status = " << status << std::endl;
 91        return 1;
 92    }
 93    status = aoclsparse_optimize(A);
 94    if(status != aoclsparse_status_success)
 95    {
 96        std::cerr << "Error returned from aoclsparse_optimize, status = " << status << "."
 97                  << std::endl;
 98        return 2;
 99    }
100
101    // Call the triangle solver
102    aoclsparse_double_complex alpha = {1.0, 0.};
103    aoclsparse_double_complex x[n];
104    status = aoclsparse_ztrsv(trans, alpha, A, descr_a, b.data(), x);
105    if(status != aoclsparse_status_success)
106    {
107        std::cerr << "Error returned from aoclsparse_ztrsv, status = " << status << "."
108                  << std::endl;
109        return 3;
110    }
111
112    // Print and check the result
113    double tol = 20 * std::numeric_limits<double>::epsilon();
114    ref.assign({{0.4, -0.2}, {1.6, -0.6}, {-0.8, 2.8}, {0.8, -8.4}});
115    std::cout << "Solving Lx = alpha b: " << std::endl << "  x = ";
116    std::cout << std::fixed;
117    std::cout.precision(1);
118    bool oki, ok = true;
119    for(aoclsparse_int i = 0; i < n; i++)
120    {
121        oki = std::abs(x[i].real - ref[i].real) <= tol && std::abs(x[i].imag - ref[i].imag) <= tol;
122        std::cout << "(" << x[i].real << ", " << x[i].imag << "i) " << (oki ? " " : "!  ");
123        ok &= oki;
124    }
125    std::cout << std::endl;
126
127    /* Solve the lower triangular system U^Hx = b,
128     * here alpha=1-i and b is unchanged.
129     */
130    alpha = {1, -1};
131    // Indicate to use only the conjugate transpose of the upper part of A
132    trans = aoclsparse_operation_conjugate_transpose;
133    aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_upper);
134    status = aoclsparse_ztrsv(trans, alpha, A, descr_a, b.data(), x);
135    if(status != aoclsparse_status_success)
136    {
137        std::cerr << "Error returned from aoclsparse_ztrsv, status = " << status << "."
138                  << std::endl;
139        return 3;
140    }
141
142    // Print and check the result
143    ref.assign({{0.2, 0.6}, {1.4, 0.6}, {3.4, -7.0}, {-1.0, 19.2}});
144    std::cout << std::endl;
145    std::cout << "Solving U^Hx = alpha*b: " << std::endl << "  x = ";
146    for(aoclsparse_int i = 0; i < n; i++)
147    {
148        oki = std::abs(x[i].real - ref[i].real) <= tol && std::abs(x[i].imag - ref[i].imag) <= tol;
149        std::cout << "(" << x[i].real << ", " << x[i].imag << "i) " << (oki ? " " : "!  ");
150        ok &= oki;
151    }
152    std::cout << std::endl;
153
154    // Destroy the aoclsparse memory
155    aoclsparse_destroy_mat_descr(descr_a);
156    aoclsparse_destroy(&A);
157
158    return ok ? 0 : 4;
159}
Parameters
Return values
  • aoclsparse_status_success – the operation completed successfully and \(x\) contains the solution to the linear system of equations.

  • aoclsparse_status_invalid_size – matrix \(A\) or \(op(A)\) is invalid.

  • aoclsparse_status_invalid_pointer – One or more of A, descr, x, b are invalid pointers.

  • aoclsparse_status_internal_error – an internal error occurred.

  • aoclsparse_status_not_implemented – the requested operation is not yet implemented.

  • other – possible failure values from a call to aoclsparse_optimize.

aoclsparse_status aoclsparse_strsv_strided(aoclsparse_operation trans, const float alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const float *b, const aoclsparse_int incb, float *x, const aoclsparse_int incx)#
aoclsparse_status aoclsparse_dtrsv_strided(aoclsparse_operation trans, const double alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const double *b, const aoclsparse_int incb, double *x, const aoclsparse_int incx)#
aoclsparse_status aoclsparse_ctrsv_strided(aoclsparse_operation trans, const aoclsparse_float_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_float_complex *b, const aoclsparse_int incb, aoclsparse_float_complex *x, const aoclsparse_int incx)#
aoclsparse_status aoclsparse_ztrsv_strided(aoclsparse_operation trans, const aoclsparse_double_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_double_complex *b, const aoclsparse_int incb, aoclsparse_double_complex *x, const aoclsparse_int incx)#

This is a variation of TRSV, namely with a suffix of _strided, allows to set the stride for the dense vectors b and x.

For full details refer to aoclsparse_?trsv().

Parameters
aoclsparse_status aoclsparse_strsv_kid(aoclsparse_operation trans, const float alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const float *b, float *x, aoclsparse_int kid)#
aoclsparse_status aoclsparse_dtrsv_kid(aoclsparse_operation trans, const double alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const double *b, double *x, aoclsparse_int kid)#
aoclsparse_status aoclsparse_ctrsv_kid(aoclsparse_operation trans, const aoclsparse_float_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_float_complex *b, aoclsparse_float_complex *x, aoclsparse_int kid)#
aoclsparse_status aoclsparse_ztrsv_kid(aoclsparse_operation trans, const aoclsparse_double_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_double_complex *b, aoclsparse_double_complex *x, aoclsparse_int kid)#

Sparse triangular solver for real/complex single and double data precisions (kernel flag variation).

For full details refer to aoclsparse_?trsv().

This variation of TRSV, namely with a suffix of _kid, allows to choose which TRSV kernel to use (if possible). Currently the possible choices are:

kid=0

Reference implementation (No explicit AVX instructions).

kid=1

Alias to kid=2 (Kernel Template AVX 256-bit implementation)

kid=2

Kernel Template version using AVX2 extensions.

kid=3

Kernel Template version using AVX512F+ CPU extensions.

Any other Kernel ID value will default to kid = 0.

Parameters

aoclsparse_?dotmv()#

aoclsparse_status aoclsparse_sdotmv(const aoclsparse_operation op, const float alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const float *x, const float beta, float *y, float *d)#
aoclsparse_status aoclsparse_ddotmv(const aoclsparse_operation op, const double alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const double *x, const double beta, double *y, double *d)#
aoclsparse_status aoclsparse_cdotmv(const aoclsparse_operation op, const aoclsparse_float_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_float_complex *x, const aoclsparse_float_complex beta, aoclsparse_float_complex *y, aoclsparse_float_complex *d)#
aoclsparse_status aoclsparse_zdotmv(const aoclsparse_operation op, const aoclsparse_double_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, const aoclsparse_double_complex *x, const aoclsparse_double_complex beta, aoclsparse_double_complex *y, aoclsparse_double_complex *d)#

Performs sparse matrix-vector multiplication followed by vector-vector multiplication.

aoclsparse_?dotmv multiplies the scalar \(\alpha\) with a sparse \(m \times n\) matrix, defined in a sparse storage format, and the dense vector \(x\) and adds the result to the dense vector \(y\) that is multiplied by the scalar \(\beta\), such that

\[\begin{split} y := \alpha \, op(A) \, x + \beta \, y, \quad \text{ with } \quad op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{op}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{op}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{op}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]

followed by dot product of dense vectors \(x\) and \(y\) such that

\[\begin{split} {\bf\mathsf{d}} = \left\{ \begin{array}{ll} \sum_{i=0}^{\min(m,n)-1} x_{i} \; y_{i}, & \text{real case} \\ \sum_{i=0}^{\min(m,n)-1} \overline{x_i} \; y_{i}, & \text{complex case} \end{array} \right. \end{split}\]

 1/* ************************************************************************
 2 * Copyright (c) 2023-2024 Advanced Micro Devices, Inc.
 3 *
 4 * Permission is hereby granted, free of charge, to any person obtaining a copy
 5 * of this software and associated documentation files (the "Software"), to deal
 6 * in the Software without restriction, including without limitation the rights
 7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 8 * copies of the Software, and to permit persons to whom the Software is
 9 * furnished to do so, subject to the following conditions:
10 *
11 * The above copyright notice and this permission notice shall be included in
12 * all copies or substantial portions of the Software.
13 *
14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
20 * THE SOFTWARE.
21 *
22 * ************************************************************************ */
23
24#include "aoclsparse.h"
25
26#include <iostream>
27#include <limits>
28
29#define M 6
30#define N 5
31#define NNZ 8
32
33int main(void)
34{
35    aoclsparse_operation op = aoclsparse_operation_none;
36
37    double alpha = 1.0;
38    double beta  = 0.0;
39
40    // Print aoclsparse version
41    std::cout << aoclsparse_get_version() << std::endl;
42
43    // Create matrix descriptor
44    aoclsparse_mat_descr descr;
45    // aoclsparse_create_mat_descr set aoclsparse_matrix_type to aoclsparse_matrix_type_general
46    // and aoclsparse_index_base to aoclsparse_index_base_zero.
47    aoclsparse_create_mat_descr(&descr);
48
49    aoclsparse_index_base base = aoclsparse_index_base_zero;
50
51    // Initialise matrix
52    //  1  0  0  2  0
53    //  0  3  0  0  0
54    //  0  0  4  0  0
55    //  0  5  0  6  7
56    //  0  0  0  0  8
57    //  0  0  0  0  0
58    aoclsparse_int    csr_row_ptr[M + 1] = {0, 2, 3, 4, 7, 8, 8};
59    aoclsparse_int    csr_col_ind[NNZ]   = {0, 3, 1, 2, 1, 3, 4, 4};
60    double            csr_val[NNZ]       = {1, 2, 3, 4, 5, 6, 7, 8};
61    aoclsparse_matrix A;
62    aoclsparse_create_dcsr(&A, base, M, N, NNZ, csr_row_ptr, csr_col_ind, csr_val);
63
64    // Initialise vectors
65    double x[N]     = {1.0, 1.0, 1.0, 1.0, 1.0};
66    double y[M]     = {1.0, 1.0, 1.0, 1.0, 1.0, 1.0};
67    double d        = 0;
68    double y_exp[M] = {3.0, 3.0, 4.0, 18.0, 8.0};
69    double d_exp    = 36;
70
71    //to identify hint id(which routine is to be executed, destroyed later)
72    aoclsparse_set_dotmv_hint(A, op, descr, 1);
73
74    // Optimize the matrix, "A"
75    aoclsparse_optimize(A);
76
77    std::cout << "Invoking aoclsparse_ddotmv..";
78    //Invoke DOTMV API (double precision)
79    aoclsparse_ddotmv(op, alpha, A, descr, x, beta, y, &d);
80    std::cout << "Done." << std::endl;
81
82    // Print and check the result
83    std::cout << std::endl << "y = ";
84    bool   oki, ok = true;
85    double tol = 20 * std::numeric_limits<double>::epsilon();
86    for(aoclsparse_int i = 0; i < M; i++)
87    {
88        oki = std::abs(y[i] - y_exp[i]) <= tol;
89        std::cout << y[i] << (oki ? " " : "!  ");
90        ok &= oki;
91    }
92
93    ok &= (std::abs(d - d_exp) <= tol);
94    std::cout << std::endl << "Output dot product: " << d << (ok ? " " : "!  ") << std::endl;
95
96    aoclsparse_destroy_mat_descr(descr);
97    aoclsparse_destroy(&A);
98    return ok ? 0 : 4;
99}
Parameters
  • op[in] matrix operation type.

  • alpha[in] scalar \(\alpha\).

  • A[in] the sparse \(m \times n\) matrix structure that is created using aoclsparse_create_scsr() or other variation.

  • descr[in] descriptor of the sparse CSR matrix. Both base-zero and base-one are supported, however, the index base needs to match the one used when aoclsparse_matrix was created.

  • x[in] array of atleast n elements if \(op(A) = A\) or at least m elements if \(op(A) = A^T\) or \(A^H\).

  • beta[in] scalar \(\beta\).

  • y[inout] array of atleast m elements if \(op(A) = A\) or at least n elements if \(op(A) = A^T\) or \(A^H\).

  • d[out] dot product of y and x.

Return values
  • aoclsparse_status_success – the operation completed successfully.

  • aoclsparse_status_invalid_sizem, n or nnz is invalid.

  • aoclsparse_status_invalid_value – (base index is neither aoclsparse_index_base_zero nor aoclsparse_index_base_one, or matrix base index and descr base index values do not match.

  • aoclsparse_status_invalid_pointerdescr, internal structures related to the sparse matrix A, x, y or d are invalid pointer.

  • aoclsparse_status_wrong_type – matrix data type is not supported.

  • aoclsparse_status_not_implementedaoclsparse_matrix_type is aoclsparse_matrix_type_hermitian or, aoclsparse_matrix_format_type is not aoclsparse_csr_mat

aoclsparse_?ellmv()#

aoclsparse_status aoclsparse_sellmv(aoclsparse_operation trans, const float *alpha, aoclsparse_int m, aoclsparse_int n, aoclsparse_int nnz, const float *ell_val, const aoclsparse_int *ell_col_ind, aoclsparse_int ell_width, const aoclsparse_mat_descr descr, const float *x, const float *beta, float *y)#
aoclsparse_status aoclsparse_dellmv(aoclsparse_operation trans, const double *alpha, aoclsparse_int m, aoclsparse_int n, aoclsparse_int nnz, const double *ell_val, const aoclsparse_int *ell_col_ind, aoclsparse_int ell_width, const aoclsparse_mat_descr descr, const double *x, const double *beta, double *y)#

Real single and double precision sparse matrix vector product using ELL storage format.

aoclsparse_?ellmv multiplies the scalar \(\alpha\) with a sparse \(m \times n\) matrix, defined in ELL storage format, and the dense vector \(x\) and adds the result to the dense vector \(y\) that is multiplied by the scalar \(\beta\), such that

\[ y = \alpha \, op(A) \, x + \beta \, y, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]

Note

Currently, only trans = aoclsparse_operation_none is supported.

Parameters
  • trans[in] matrix operation type.

  • alpha[in] scalar \(\alpha\).

  • m[in] number of rows of the sparse ELL matrix.

  • n[in] number of columns of the sparse ELL matrix.

  • nnz[in] number of non-zero entries of the sparse ELL matrix.

  • descr[in] descriptor of the sparse ELL matrix. Both, base-zero and base-one input arrays of ELL matrix are supported

  • ell_val[in] array that contains the elements of the sparse ELL matrix. Padded elements should be zero.

  • ell_col_ind[in] array that contains the column indices of the sparse ELL matrix. Padded column indices should be -1.

  • ell_width[in] number of non-zero elements per row of the sparse ELL matrix.

  • x[in] array of n elements ( \(op(A) = A\)) or m elements ( \(op(A) = A^T\) or \(op(A) = A^H\)).

  • beta[in] scalar \(\beta\).

  • y[inout] array of m elements ( \(op(A) = A\)) or n elements ( \(op(A) = A^T\) or \(op(A) = A^H\)).

Return values

aoclsparse_?diamv()#

aoclsparse_status aoclsparse_sdiamv(aoclsparse_operation trans, const float *alpha, aoclsparse_int m, aoclsparse_int n, aoclsparse_int nnz, const float *dia_val, const aoclsparse_int *dia_offset, aoclsparse_int dia_num_diag, const aoclsparse_mat_descr descr, const float *x, const float *beta, float *y)#
aoclsparse_status aoclsparse_ddiamv(aoclsparse_operation trans, const double *alpha, aoclsparse_int m, aoclsparse_int n, aoclsparse_int nnz, const double *dia_val, const aoclsparse_int *dia_offset, aoclsparse_int dia_num_diag, const aoclsparse_mat_descr descr, const double *x, const double *beta, double *y)#

Real single and double precision sparse matrix vector product using DIA storage format.

aoclsparse_?diamv multiplies the scalar \(\alpha\) with a sparse \(m \times n\) matrix, defined in DIA storage format, and the dense vector \(x\) and adds the result to the dense vector \(y\) that is multiplied by the scalar \(\beta\), such that

\[ y = \alpha \, op(A) \, x + \beta \, y, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]

Note

Currently, only trans = aoclsparse_operation_none is supported.

Parameters
  • trans[in] matrix operation type.

  • alpha[in] scalar \(\alpha\).

  • m[in] number of rows of the matrix.

  • n[in] number of columns of the matrix.

  • nnz[in] number of non-zero entries of the matrix.

  • descr[in] descriptor of the sparse DIA matrix.

  • dia_val[in] array that contains the elements of the matrix. Padded elements should be zero.

  • dia_offset[in] array that contains the offsets of each diagonal of the matrix.

  • dia_num_diag[in] number of diagonals in the matrix.

  • x[in] array of n elements ( \(op(A) = A\)) or m elements ( \(op(A) = A^T\) or \(op(A) = A^H\)).

  • beta[in] scalar \(\beta\).

  • y[inout] array of m elements ( \(op(A) = A\)) or n elements ( \(op(A) = A^T\) or \(op(A) = A^H\)).

Return values

aoclsparse_?bsrmv()#

aoclsparse_status aoclsparse_sbsrmv(aoclsparse_operation trans, const float *alpha, aoclsparse_int mb, aoclsparse_int nb, aoclsparse_int bsr_dim, const float *bsr_val, const aoclsparse_int *bsr_col_ind, const aoclsparse_int *bsr_row_ptr, const aoclsparse_mat_descr descr, const float *x, const float *beta, float *y)#
aoclsparse_status aoclsparse_dbsrmv(aoclsparse_operation trans, const double *alpha, aoclsparse_int mb, aoclsparse_int nb, aoclsparse_int bsr_dim, const double *bsr_val, const aoclsparse_int *bsr_col_ind, const aoclsparse_int *bsr_row_ptr, const aoclsparse_mat_descr descr, const double *x, const double *beta, double *y)#

Real single and double precision matrix vector product using BSR storage format.

aoclsparse_?bsrmv multiplies the scalar \(\alpha\) with a sparse mb times bsr_dim by nb times bsr_dim matrix, defined in BSR storage format, and the dense vector \(x\) and adds the result to the dense vector \(y\) that is multiplied by the scalar \(\beta\), such that

\[ y = \alpha \, op(A) \, x + \beta \, y, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]

Note

Only trans = aoclsparse_operation_none is supported.

Parameters
  • trans[in] matrix operation type.

  • mb[in] number of block rows of the sparse BSR matrix.

  • nb[in] number of block columns of the sparse BSR matrix.

  • alpha[in] scalar \(\alpha\).

  • descr[in] descriptor of the sparse BSR matrix. Both, base-zero and base-one input arrays of BSR matrix are supported.

  • bsr_val[in] array of nnzb blocks of the sparse BSR matrix.

  • bsr_row_ptr[in] array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.

  • bsr_col_ind[in] array of nnz containing the block column indices of the sparse BSR matrix.

  • bsr_dim[in] block dimension of the sparse BSR matrix.

  • x[in] array of nb times bsr_dim elements ( \(op(A) = A\)) or mb times bsr_dim elements ( \(op(A) = A^T\) or \(op(A) = A^H\)).

  • beta[in] scalar \(\beta\).

  • y[inout] array of mb times bsr_dim elements ( \(op(A) = A\)) or nb times bsr_dim elements ( \(op(A) = A^T\) or \(op(A) = A^H\)).

Return values
  • aoclsparse_status_success – the operation completed successfully.

  • aoclsparse_status_invalid_handle – the library context was not initialized.

  • aoclsparse_status_invalid_sizemb, nb, nnzb or bsr_dim is invalid.

  • aoclsparse_status_invalid_pointerdescr, alpha, bsr_val, bsr_row_ind, bsr_col_ind, x, beta or y pointer is invalid.

  • aoclsparse_status_arch_mismatch – the device is not supported.

  • aoclsparse_status_not_implementedtrans is not aoclsparse_operation_none, or aoclsparse_matrix_type is not aoclsparse_matrix_type_general.

aoclsparse_?csrmv()#

aoclsparse_status aoclsparse_scsrmv(aoclsparse_operation trans, const float *alpha, aoclsparse_int m, aoclsparse_int n, aoclsparse_int nnz, const float *csr_val, const aoclsparse_int *csr_col_ind, const aoclsparse_int *csr_row_ptr, const aoclsparse_mat_descr descr, const float *x, const float *beta, float *y)#
aoclsparse_status aoclsparse_dcsrmv(aoclsparse_operation trans, const double *alpha, aoclsparse_int m, aoclsparse_int n, aoclsparse_int nnz, const double *csr_val, const aoclsparse_int *csr_col_ind, const aoclsparse_int *csr_row_ptr, const aoclsparse_mat_descr descr, const double *x, const double *beta, double *y)#

Real single and double precision sparse matrix-vector multiplication using CSR storage format.

aoclsparse_?csrmv multiplies the scalar \(\alpha\) with a sparse \(m \times n\) matrix, defined in CSR storage format, and the dense vector \(x\) and adds the result to the dense vector \(y\) that is multiplied by the scalar \(\beta\), such that

\[ y = \alpha \, op(A) \, x + \beta \, y, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]
Parameters
  • trans[in] matrix operation type.

  • alpha[in] scalar \(\alpha\).

  • m[in] number of rows of the sparse CSR matrix.

  • n[in] number of columns of the sparse CSR matrix.

  • nnz[in] number of non-zero entries of the sparse CSR matrix.

  • csr_val[in] array of nnz elements of the sparse CSR matrix.

  • csr_col_ind[in] array of nnz elements containing the column indices of the sparse CSR matrix.

  • csr_row_ptr[in] array of m +1 elements that point to the start of every row of the sparse CSR matrix.

  • descr[in] descriptor of the sparse CSR matrix. Currently, only aoclsparse_matrix_type_general and aoclsparse_matrix_type_symmetric is supported.

  • x[in] array of n elements ( \(op(A) = A\)) or m elements ( \(op(A) = A^T\) or \(op(A) = A^H\)).

  • beta[in] scalar \(\beta\).

  • y[inout] array of m elements ( \(op(A) = A\)) or n elements ( \(op(A) = A^T\) or \(op(A) = A^H\)).

Return values

aoclsparse_?csrsv()#

aoclsparse_status aoclsparse_scsrsv(aoclsparse_operation trans, const float *alpha, aoclsparse_int m, const float *csr_val, const aoclsparse_int *csr_col_ind, const aoclsparse_int *csr_row_ptr, const aoclsparse_mat_descr descr, const float *x, float *y)#
aoclsparse_status aoclsparse_dcsrsv(aoclsparse_operation trans, const double *alpha, aoclsparse_int m, const double *csr_val, const aoclsparse_int *csr_col_ind, const aoclsparse_int *csr_row_ptr, const aoclsparse_mat_descr descr, const double *x, double *y)#

Sparse triangular solve using CSR storage format for single and double data precisions.

Deprecated:

This API is superseded by aoclsparse_strsv() and aoclsparse_dtrsv().

aoclsparse_?csrsv solves a sparse triangular linear system of a sparse \(m \times m\) matrix, defined in CSR storage format, a dense solution vector \(y\) and the right-hand side \(x\) that is multiplied by \(\alpha\), such that

\[ op(A) \, y = \alpha \, x, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]

Note

Only trans = aoclsparse_operation_none is supported.

Note

The input matrix has to be sparse upper or lower triangular matrix with unit or non-unit main diagonal. Matrix has to be sorted. No diagonal element can be omitted from a sparse storage if the solver is called with the non-unit indicator.

Parameters
  • trans[in] matrix operation type.

  • alpha[in] scalar \(\alpha\).

  • m[in] number of rows of the sparse CSR matrix.

  • csr_val[in] array of nnz elements of the sparse CSR matrix.

  • csr_row_ptr[in] array of m+1 elements that point to the start of every row of the sparse CSR matrix.

  • csr_col_ind[in] array of nnz elements containing the column indices of the sparse CSR matrix.

  • descr[in] descriptor of the sparse CSR matrix.

  • x[in] array of m elements, holding the right-hand side.

  • y[out] array of m elements, holding the solution.

Return values

Level 3#

This module holds all sparse level 3 routines.

The sparse level 3 routines describe operations between matrices.

aoclsparse_?trsm()#

aoclsparse_status aoclsparse_strsm(const aoclsparse_operation trans, const float alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const float *B, aoclsparse_int n, aoclsparse_int ldb, float *X, aoclsparse_int ldx)#
aoclsparse_status aoclsparse_dtrsm(const aoclsparse_operation trans, const double alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const double *B, aoclsparse_int n, aoclsparse_int ldb, double *X, aoclsparse_int ldx)#
aoclsparse_status aoclsparse_ctrsm(aoclsparse_operation trans, const aoclsparse_float_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const aoclsparse_float_complex *B, aoclsparse_int n, aoclsparse_int ldb, aoclsparse_float_complex *X, aoclsparse_int ldx)#
aoclsparse_status aoclsparse_ztrsm(aoclsparse_operation trans, const aoclsparse_double_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const aoclsparse_double_complex *B, aoclsparse_int n, aoclsparse_int ldb, aoclsparse_double_complex *X, aoclsparse_int ldx)#

Solve sparse triangular linear system of equations with multiple right hand sides for real/complex single and double data precisions.

aoclsparse_?trsm solves a sparse triangular linear system of equations with multiple right hand sides, of the form

\[ op(A)\; X = \alpha B, \]
where \(A\) is a sparse matrix of size \(m\), \(op()\) is a linear operator, \(X\) and \(B\) are rectangular dense matrices of appropiate size, while \(\alpha\) is a scalar. The sparse matrix \(A\) can be interpreted either as a lower triangular or upper triangular. This is indicated by fill_mode from the matrix descriptor descr where either upper or lower triangular portion of the matrix is only referenced. The matrix can also be of class symmetric in which case only the selected triangular part is used. Matrix \(A\) must be of full rank, that is, the matrix must be invertible. The linear operator \(op()\) can define the transposition or Hermitian transposition operations. By default, no transposition is performed. The right-hand-side matrix \(B\) and the solution matrix \(X\) are dense and must be of the correct size, that is \(m\) by \(n\), see ldb and ldx input parameters for further details.

Explicitly, this kernel solves

\[ op(A)\; X = \alpha \; B\text{, with solution } X = \alpha \; (op(A)^{-1}) \; B, \]
where
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{trans}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]
If a linear operator is applied then, the possible problems solved are
\[ A^T \; X = \alpha \; B\text{, with solution } X = \alpha \; A^{-T} \; B\text{, and } \; A^H \; X = \alpha \; B\text{, with solution } X = \alpha \; A^{-H} \; B. \]

Notes

  1. If the matrix descriptor descr specifies that the matrix \(A\) is to be regarded as having a unitary diagonal, then the main diagonal entries of matrix \(A\) are not accessed and are considered to all be ones.

  2. If the matrix \(A\) is described as upper triangular, then only the upper triangular portion of the matrix is referenced. Conversely, if the matrix \(A\) is described lower triangular, then only the lower triangular portion of the matrix is used.

  3. This set of APIs allocates work array of size \(m\) for each case where the matrices \(B\) or \(X\) are stored in row-major format (ref aoclsparse_order_row).

  4. A subset of kernels are parallel (on parallel builds) and can be expected potential acceleration in the solve. These kernels are available when both dense matrices \(X\) and \(B\) are stored in column-major format (ref aoclsparse_order_column) and thread count is greater than 1 on a parallel build.

  5. There is aoclsparse_trsm_kid (Kernel ID) variation of TRSM, namely with a suffix of _kid, this solver allows to choose which underlying TRSV kernels to use (if possible). Currently, all the existing kernels avilable in aoclsparse_trsv_kid are supported.

  6. This routine supports only sparse matrices in CSR format.

  1/* ************************************************************************
  2 * Copyright (c) 2023-2024 Advanced Micro Devices, Inc.
  3 *
  4 * Permission is hereby granted, free of charge, to any person obtaining a copy
  5 * of this software and associated documentation files (the "Software"), to deal
  6 * in the Software without restriction, including without limitation the rights
  7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  8 * copies of the Software, and to permit persons to whom the Software is
  9 * furnished to do so, subject to the following conditions:
 10 *
 11 * The above copyright notice and this permission notice shall be included in
 12 * all copies or substantial portions of the Software.
 13 *
 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 20 * THE SOFTWARE.
 21 *
 22 * ************************************************************************ */
 23
 24#include "aoclsparse.h"
 25
 26#include <iomanip>
 27#include <iostream>
 28#include <limits>
 29#include <vector>
 30
 31int main(void)
 32{
 33    std::cout << "------------------------------------------------" << std::endl
 34              << " Triangle Solve with Multiple Right Hand Sides" << std::endl
 35              << " sample program for real data type" << std::endl
 36              << "------------------------------------------------" << std::endl
 37              << std::endl;
 38
 39    /*
 40     * This example illustrates how to solve two triangular
 41     * linear system of equations. The real matrix used is
 42     *     | 1  2  0  0 |
 43     * A = | 3  1  2  0 |
 44     *     | 0  3  1  2 |
 45     *     | 0  0  3  1 |
 46     *
 47     * The first linear system is L X = alpha * B with L = tril(A), X and B
 48     * are two dense matrices of size 4x2 stored in column-major format.
 49     *
 50     * The second linear system is U X = alpha * B with U = triu(A), X and B
 51     * are two dense matrices of size 4x2 stored in row-major format.
 52     *
 53     */
 54
 55    // Create a tri-diagonal matrix in CSR format
 56    const aoclsparse_int n = 4, m = 4, nnz = 10;
 57    aoclsparse_int       icrow[5] = {0, 2, 5, 8, 10};
 58    aoclsparse_int       icol[10] = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};
 59    double               aval[10] = {1, 2, 3, 1, 2, 3, 1, 2, 3, 1};
 60    aoclsparse_status    status;
 61    const aoclsparse_int k = 2;
 62    aoclsparse_int       ldb, ldx;
 63    bool                 oki, ok;
 64    double               tol = 20.0 * std::numeric_limits<double>::epsilon();
 65
 66    // Create aoclsparse matrix and its descriptor
 67    aoclsparse_matrix     A;
 68    aoclsparse_index_base base = aoclsparse_index_base_zero;
 69    aoclsparse_order      order;
 70    aoclsparse_mat_descr  descr_a;
 71    aoclsparse_operation  trans;
 72    status = aoclsparse_create_dcsr(&A, base, m, n, nnz, icrow, icol, aval);
 73    if(status != aoclsparse_status_success)
 74    {
 75        std::cerr << "Error returned from aoclsparse_create_dcsr, status = " << status << "."
 76                  << std::endl;
 77        return 3;
 78    }
 79    aoclsparse_create_mat_descr(&descr_a);
 80
 81    /* Case 1: Solving the lower triangular system L^T X = alpha*B.
 82     *     |1, -1|
 83     * B = |2, -2| stored in column-major layout
 84     *     |3, -6| k = 2 and ldb = 4
 85     *     |4, -8|
 86     *
 87     * Linear system L X = B, with solution matrix
 88     *     | 86,-167|
 89     * X = |-29,  56| stored in column-major layout
 90     *     |  9, -18| k = 2 and ldx = 4
 91     *     | -4,   8|
 92     */
 93
 94    double alpha = -1.0;
 95    double X[m * k];
 96
 97    double              B[m * k] = {1, 2, 3, 4, -1, -2, -6, -8};
 98    std::vector<double> XRef;
 99    XRef.assign({86, -29, 9, -4, -167, 56, -18, 8});
100
101    // Indicate to only access the lower triangular part of matrix A
102    aoclsparse_set_mat_type(descr_a, aoclsparse_matrix_type_triangular);
103    aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_lower);
104    trans = aoclsparse_operation_transpose;
105
106    order  = aoclsparse_order_column;
107    ldb    = m;
108    ldx    = m;
109    status = aoclsparse_set_sm_hint(A, trans, descr_a, order, 1);
110    if(status != aoclsparse_status_success)
111    {
112        std::cerr << "Error returned from aoclsparse_set_sm_hint, status = " << status << "."
113                  << std::endl;
114        return 3;
115    }
116    status = aoclsparse_optimize(A);
117    if(status != aoclsparse_status_success)
118    {
119        std::cerr << "Error returned from aoclsparse_optimize, status = " << status << "."
120                  << std::endl;
121        return 3;
122    }
123    // Solve
124    status = aoclsparse_dtrsm(trans, alpha, A, descr_a, order, &B[0], k, ldb, &X[0], ldx);
125    if(status != aoclsparse_status_success)
126    {
127        std::cerr << "Error returned from aoclsparse_dtrsm, status = " << status << "."
128                  << std::endl;
129        return 3;
130    }
131
132    // Print and check the result
133    std::cout << "Solving L^T X = alpha B, where  L=tril(A), and" << std::endl;
134    std::cout << "  X and B are dense rectangular martices (column-major layout)" << std::endl;
135    std::cout << "  Solution matrix X = " << std::endl;
136    std::cout << std::fixed;
137    std::cout.precision(1);
138    ok = true;
139    size_t idx;
140    for(int row = 0; row < m; ++row)
141    {
142        for(int col = 0; col < k; ++col)
143        {
144            idx = row + col * ldx;
145            oki = std::abs(X[idx] - XRef[idx]) <= tol;
146            std::cout << X[idx] << (oki ? "  " : "! ");
147            ok &= oki;
148        }
149        std::cout << std::endl;
150    }
151    std::cout << std::endl;
152
153    /* Case 2: Solving the lower triangular system U X = alpha*B.
154     * In this case we "reinterpret" B as a col-major layout, having the
155     * form
156     *     | 1,  2|
157     * B = | 3,  4| stored in row-major layout
158     *     |-1, -2| k = 2 and ldb = 2
159     *     |-6, -8|
160     *
161     * Linear system L X = B, with solution matrix
162     *     | 39,  50|
163     * X = |-19, -24| stored in row-major layout
164     *     | 11,  14| k = 2 and ldx = 2
165     *     | -6,  -8|
166     */
167
168    // Store same XRef in row-major format
169    XRef.assign({39, 50, -19, -24, 11, 14, -6, -8});
170    alpha = 1.0;
171    ldb   = k;
172    ldx   = k;
173    // Indicate to use only the conjugate transpose of the upper part of A
174    aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_upper);
175    trans  = aoclsparse_operation_none;
176    order  = aoclsparse_order_row;
177    status = aoclsparse_set_sm_hint(A, trans, descr_a, order, 1);
178    if(status != aoclsparse_status_success)
179    {
180        std::cerr << "Error returned from aoclsparse_set_sm_hint, status = " << status << "."
181                  << std::endl;
182        return 3;
183    }
184    status = aoclsparse_optimize(A);
185    if(status != aoclsparse_status_success)
186    {
187        std::cerr << "Error returned from aoclsparse_optimize, status = " << status << "."
188                  << std::endl;
189        return 3;
190    }
191    // Solve
192    status = aoclsparse_dtrsm(trans, alpha, A, descr_a, order, &B[0], k, ldb, &X[0], ldx);
193    if(status != aoclsparse_status_success)
194    {
195        std::cerr << "Error returned from aoclsparse_dtrsm, status = " << status << "."
196                  << std::endl;
197        return 3;
198    }
199    // Print and check the result
200    std::cout << "Solving U X = alpha B, where  U=triu(A), and" << std::endl;
201    std::cout << "  X and B are dense rectangular martices (row-major layout)" << std::endl;
202    std::cout << "  Solution matrix X = " << std::endl;
203    for(int row = 0; row < m; ++row)
204    {
205        for(int col = 0; col < k; col++)
206        {
207            idx = row * ldx + col;
208            oki = std::abs(X[idx] - XRef[idx]) <= tol;
209            std::cout << X[idx] << (oki ? " " : "!  ");
210            ok &= oki;
211        }
212        std::cout << std::endl;
213    }
214    std::cout << std::endl;
215
216    // Destroy the aoclsparse memory
217    aoclsparse_destroy_mat_descr(descr_a);
218    aoclsparse_destroy(&A);
219
220    return ok ? 0 : 5;
221}

  1/* ************************************************************************
  2 * Copyright (c) 2023-2024 Advanced Micro Devices, Inc.
  3 *
  4 * Permission is hereby granted, free of charge, to any person obtaining a copy
  5 * of this software and associated documentation files (the "Software"), to deal
  6 * in the Software without restriction, including without limitation the rights
  7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  8 * copies of the Software, and to permit persons to whom the Software is
  9 * furnished to do so, subject to the following conditions:
 10 *
 11 * The above copyright notice and this permission notice shall be included in
 12 * all copies or substantial portions of the Software.
 13 *
 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 20 * THE SOFTWARE.
 21 *
 22 * ************************************************************************ */
 23
 24#include "aoclsparse.h"
 25
 26#include <complex>
 27#include <iomanip>
 28#include <iostream>
 29#include <limits>
 30#include <vector>
 31
 32int main(void)
 33{
 34    std::cout << "------------------------------------------------" << std::endl
 35              << " Triangle Solve with Multiple Right Hand Sides" << std::endl
 36              << " sample program for complex data type" << std::endl
 37              << "------------------------------------------------" << std::endl
 38              << std::endl;
 39
 40    /*
 41     * This example illustrates how to solve two triangular
 42     * linear system of equations. The complex matrix used is
 43     *     | 1+3i  2+5i     0     0 |
 44     * A = | 3     1+2i  2-2i     0 |
 45     *     |    0     3     1  2+3i |
 46     *     |    0     0  3+1i  1+1i |
 47     *
 48     * The first linear system is L X = alpha * B with L = tril(A), X and B
 49     * are two dense matrices of size 4x2 stored in column-major format.
 50     *
 51     * The second linear system is U^H X = alpha * B with U = triu(A), X and B
 52     * are two dense matrices of size 4x2 stored in row-major format.
 53     *
 54     */
 55
 56    // Create a tri-diagonal matrix in CSR format
 57    const aoclsparse_int                   n = 4, m = 4, nnz = 10;
 58    aoclsparse_int                         icrow[5] = {0, 2, 5, 8, 10};
 59    aoclsparse_int                         icol[10] = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};
 60    std::vector<aoclsparse_double_complex> aval;
 61    // clang-format off
 62    aval.assign({{1, 3}, {2, 5}, {3, 0}, {1, 2}, {2, -2}, {3, 0},
 63                 {1, 0}, {2, 3}, {3, 1}, {1, 1}});
 64    // clang-format on
 65    aoclsparse_status status;
 66    aoclsparse_int    ldb, ldx;
 67    bool              oki, ok;
 68    double            tol = 20 * std::numeric_limits<double>::epsilon();
 69
 70    aoclsparse_matrix     A;
 71    const aoclsparse_int  k     = 2; // number of columns of X and B
 72    aoclsparse_index_base base  = aoclsparse_index_base_zero;
 73    aoclsparse_order      order = aoclsparse_order_column;
 74    aoclsparse_mat_descr  descr_a;
 75    aoclsparse_operation  trans = aoclsparse_operation_none;
 76    status = aoclsparse_create_zcsr(&A, base, m, n, nnz, icrow, icol, aval.data());
 77    if(status != aoclsparse_status_success)
 78    {
 79        std::cerr << "Error returned from aoclsparse_create_zcsr, status = " << status << "."
 80                  << std::endl;
 81        return 3;
 82    }
 83    aoclsparse_create_mat_descr(&descr_a);
 84
 85    /* Case 1: Solving the lower triangular system L X = B.
 86     *     |-2+4i,   -3+i|
 87     * B = |3+13i,  4+11i| stored in column-major layout
 88     *     |16+7i,  16+2i| k = 2 and ldb = 4
 89     *     |15+11i,10+16i|
 90     *
 91     * Linear system L X = B, with solution matrix
 92     *     |1+i,     i|
 93     * X = |4+2i,    4| stored in column-major layout
 94     *     |4+i,  4+2i| k = 2 and ldx = 4
 95     *     |4,    3+3i|
 96     */
 97
 98    aoclsparse_double_complex alpha = {1.0, 0.};
 99    aoclsparse_double_complex X[m * k];
100
101    std::vector<aoclsparse_double_complex> B;
102    B.assign({{-2, 4}, {3, 13}, {16, 7}, {15, 11}, {-3, 1}, {4, 11}, {16, 2}, {10, 16}});
103    std::vector<aoclsparse_double_complex> XRef;
104    XRef.assign({{1, 1}, {4, 2}, {4, 1}, {4, 0}, {0, 1}, {4, 0}, {4, 2}, {3, 3}});
105
106    // indicate to only access the lower triangular part of matrix A
107    aoclsparse_set_mat_type(descr_a, aoclsparse_matrix_type_triangular);
108    aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_lower);
109
110    // Prepare and call solver
111    order  = aoclsparse_order_column;
112    ldb    = m;
113    ldx    = m;
114    status = aoclsparse_set_sm_hint(A, trans, descr_a, order, 1);
115    if(status != aoclsparse_status_success)
116    {
117        std::cerr << "Error returned from aoclsparse_set_sm_hint, status = " << status << "."
118                  << std::endl;
119        return 3;
120    }
121    status = aoclsparse_optimize(A);
122    if(status != aoclsparse_status_success)
123    {
124        std::cerr << "Error returned from aoclsparse_optimize, status = " << status << "."
125                  << std::endl;
126        return 3;
127    }
128
129    // Solve
130    status = aoclsparse_ztrsm(trans, alpha, A, descr_a, order, B.data(), k, ldb, &X[0], ldx);
131    if(status != aoclsparse_status_success)
132    {
133        std::cerr << "Error from aoclsparse_ztrsm, status = " << status << std::endl;
134        return 3;
135    }
136
137    // Print and check the result
138    std::cout << "Solving L X = alpha B, where  L=tril(A), and" << std::endl;
139    std::cout << "  X and B are dense rectangular martices (column-major layout)" << std::endl;
140    std::cout << "  Solution matrix X = " << std::endl;
141    std::cout << std::fixed;
142    std::cout.precision(1);
143    ok = true;
144    size_t idx;
145    for(int row = 0; row < m; ++row)
146    {
147        for(int col = 0; col < k; ++col)
148        {
149            idx = row + col * ldx;
150            oki = std::abs(X[idx].real - XRef[idx].real) <= tol
151                  && std::abs(X[idx].imag - XRef[idx].imag) <= tol;
152            std::cout << "(" << X[idx].real << ", " << X[idx].imag << "i) " << (oki ? "  " : "! ");
153            ok &= oki;
154        }
155        std::cout << std::endl;
156    }
157    std::cout << std::endl;
158
159    /* Case 2: Solving the lower triangular system U^H X = alpha*B.
160     *     |2+4i,    -1+3i|
161     * B = |9+15i,    6+9i| stored in row-major layout
162     *     |-13+8i,-10+12i| k = 2 and ldx = 2
163     *     |14+15i,  8+20i|
164     *
165     * Linear system L X = B, with solution matrix
166     *     |1+i,     i|
167     * X = |4+2i,    4| stored in row-major layout
168     *     |4+i,  4+2i| k = 2 and ldx = 2
169     *     |4,    3+3i|
170     */
171
172    B.assign({{2, 4}, {-1, 3}, {9, 15}, {6, 9}, {-13, 8}, {-10, 12}, {14, 15}, {8, 20}});
173    // Store same XRef in row-major format
174    XRef.assign({{1, 1}, {0, 1}, {4, 2}, {4, 0}, {4, 1}, {4, 2}, {4, 0}, {3, 3}});
175    alpha = {0, -1};
176    ldb   = k;
177    ldx   = k;
178    // Indicate to use only the conjugate transpose of the upper part of A
179    trans = aoclsparse_operation_conjugate_transpose;
180    order = aoclsparse_order_row;
181    aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_upper);
182    status = aoclsparse_set_sm_hint(A, trans, descr_a, order, 1);
183    if(status != aoclsparse_status_success)
184    {
185        std::cerr << "Error returned from aoclsparse_set_sm_hint, status = " << status << "."
186                  << std::endl;
187        return 3;
188    }
189    status = aoclsparse_optimize(A);
190    if(status != aoclsparse_status_success)
191    {
192        std::cerr << "Error returned from aoclsparse_optimize, status = " << status << "."
193                  << std::endl;
194        return 3;
195    }
196    // Solve
197    status = aoclsparse_ztrsm(trans, alpha, A, descr_a, order, B.data(), k, ldb, &X[0], ldx);
198    if(status != aoclsparse_status_success)
199    {
200        std::cerr << "Error from aoclsparse_ztrsm, status = " << status << std::endl;
201        return 3;
202    }
203    // Print and check the result
204    std::cout << "Solving U^H X = alpha B, where  U=triu(A), and" << std::endl;
205    std::cout << "  X and B are dense rectangular martices (row-major layout)" << std::endl;
206    std::cout << "  Solution matrix X = " << std::endl;
207    for(int row = 0; row < m; ++row)
208    {
209        for(int col = 0; col < k; col++)
210        {
211            idx = row * ldx + col;
212            oki = std::abs(X[idx].real - XRef[idx].real) <= tol
213                  && std::abs(X[idx].imag - XRef[idx].imag) <= tol;
214            std::cout << "(" << X[idx].real << ", " << X[idx].imag << "i) " << (oki ? "  " : "! ");
215            ok &= oki;
216        }
217        std::cout << std::endl;
218    }
219    std::cout << std::endl;
220    // Destroy the aoclsparse memory
221    aoclsparse_destroy_mat_descr(descr_a);
222    aoclsparse_destroy(&A);
223
224    return ok ? 0 : 5;
225}
Parameters
  • trans[in] matrix operation to perform on \(A\). Possible values are aoclsparse_operation_none, aoclsparse_operation_transpose, and aoclsparse_operation_conjugate_transpose.

  • alpha[in] scalar \(\alpha\).

  • A[in] sparse matrix \(A\) of size \(m\).

  • descr[in] descriptor of the sparse matrix \(A\).

  • order[in] storage order of dense matrices \(B\) and \(X\). Possible options are aoclsparse_order_row and aoclsparse_order_column.

  • B[in] dense matrix, potentially rectangular, of size \(m \times n\).

  • n[in] \(n,\) number of columns of the dense matrix \(B\).

  • ldb[in] leading dimension of \(B\). Eventhough the matrix \(B\) is considered of size \(m \times n\), its memory layout may correspond to a larger matrix (ldb by \(N>n\)) in which only the submatrix \(B\) is of interest. In this case, this parameter provides means to access the correct elements of \(B\) within the larger layout.

    matrix layout

    row count

    column count

    aoclsparse_order_row

    \(m\)

    ldb with ldb \(\ge n\)

    aoclsparse_order_column

    ldb with ldb \(\ge m\)

    \(n\)

  • X[out] solution matrix \(X,\) dense and potentially rectangular matrix of size \(m \times n\).

  • ldx[in] leading dimension of \(X\). Eventhough the matrix \(X\) is considered of size \(m \times n\), its memory layout may correspond to a larger matrix (ldx by \(N>n\)) in which only the submatrix \(X\) is of interest. In this case, this parameter provides means to access the correct elements of \(X\) within the larger layout.

    matrix layout

    row count

    column count

    aoclsparse_order_row

    \(m\)

    ldx with ldx \(\ge n\)

    aoclsparse_order_column

    ldx with ldx \(\ge m\)

    \(n\)

Return values
  • aoclsparse_status_success – indicates that the operation completed successfully.

  • aoclsparse_status_invalid_size – informs that either m, n, nnz, ldb or ldx is invalid.

  • aoclsparse_status_invalid_pointer – informs that either descr, alpha, A, B, or X pointer is invalid.

  • aoclsparse_status_not_implemented – this error occurs when the provided matrix aoclsparse_matrix_type is aoclsparse_matrix_type_general or aoclsparse_matrix_type_hermitian or when matrix A is not in CSR format.

aoclsparse_status aoclsparse_strsm_kid(const aoclsparse_operation trans, const float alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const float *B, aoclsparse_int n, aoclsparse_int ldb, float *X, aoclsparse_int ldx, const aoclsparse_int kid)#
aoclsparse_status aoclsparse_dtrsm_kid(const aoclsparse_operation trans, const double alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const double *B, aoclsparse_int n, aoclsparse_int ldb, double *X, aoclsparse_int ldx, const aoclsparse_int kid)#
aoclsparse_status aoclsparse_ctrsm_kid(aoclsparse_operation trans, const aoclsparse_float_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const aoclsparse_float_complex *B, aoclsparse_int n, aoclsparse_int ldb, aoclsparse_float_complex *X, aoclsparse_int ldx, const aoclsparse_int kid)#
aoclsparse_status aoclsparse_ztrsm_kid(aoclsparse_operation trans, const aoclsparse_double_complex alpha, aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const aoclsparse_double_complex *B, aoclsparse_int n, aoclsparse_int ldb, aoclsparse_double_complex *X, aoclsparse_int ldx, const aoclsparse_int kid)#

Solve sparse triangular linear system of equations with multiple right hand sides for real/complex single and double data precisions (kernel flag variation).

For full details refer to aoclsparse_?trsm().

This variation of TRSM, namely with a suffix of _kid, allows to choose which underlying TRSV kernels to use (if possible). Currently, all the existing kernels supported by aoclsparse_?trsv_kid() are available here as well.

Parameters
  • trans[in] matrix operation to perform on \(A\). Possible values are aoclsparse_operation_none, aoclsparse_operation_transpose, and aoclsparse_operation_conjugate_transpose.

  • alpha[in] scalar \(\alpha\).

  • A[in] sparse matrix \(A\) of size \(m\).

  • descr[in] descriptor of the sparse matrix \(A\).

  • order[in] storage order of dense matrices \(B\) and \(X\). Possible options are aoclsparse_order_row and aoclsparse_order_column.

  • B[in] dense matrix, potentially rectangular, of size \(m \times n\).

  • n[in] \(n,\) number of columns of the dense matrix \(B\).

  • ldb[in] leading dimension of \(B\). Eventhough the matrix \(B\) is considered of size \(m \times n\), its memory layout may correspond to a larger matrix (ldb by \(N>n\)) in which only the submatrix \(B\) is of interest. In this case, this parameter provides means to access the correct elements of \(B\) within the larger layout.

    matrix layout

    row count

    column count

    aoclsparse_order_row

    \(m\)

    ldb with ldb \(\ge n\)

    aoclsparse_order_column

    ldb with ldb \(\ge m\)

    \(n\)

  • X[out] solution matrix \(X,\) dense and potentially rectangular matrix of size \(m \times n\).

  • ldx[in] leading dimension of \(X\). Eventhough the matrix \(X\) is considered of size \(m \times n\), its memory layout may correspond to a larger matrix (ldx by \(N>n\)) in which only the submatrix \(X\) is of interest. In this case, this parameter provides means to access the correct elements of \(X\) within the larger layout.

    matrix layout

    row count

    column count

    aoclsparse_order_row

    \(m\)

    ldx with ldx \(\ge n\)

    aoclsparse_order_column

    ldx with ldx \(\ge m\)

    \(n\)

  • kid[in] kernel ID, hints which kernel to use.

aoclsparse_sp2m()#

aoclsparse_status aoclsparse_sp2m(aoclsparse_operation opA, const aoclsparse_mat_descr descrA, const aoclsparse_matrix A, aoclsparse_operation opB, const aoclsparse_mat_descr descrB, const aoclsparse_matrix B, const aoclsparse_request request, aoclsparse_matrix *C)#

Sparse matrix Sparse matrix multiplication for real and complex datatypes.

aoclsparse_?sp2m multiplies two sparse matrices in CSR storage format. The result is stored in a newly allocated sparse matrix in CSR format, such that

\[ C = op(A) \, op(B), \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{opA}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{opA}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{opA}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]
and
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if } {\bf\mathsf{opB}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ B^T, & \text{if } {\bf\mathsf{opB}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ B^H, & \text{if } {\bf\mathsf{opB}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]
where \(A\) is a \(m \times k\) matrix , \(B\) is a \(k \times n\) matrix, resulting in \(m \times n\) matrix \(C\), for opA and opB = aoclsparse_operation_none. \(A\) is a \(k \times m\) matrix when opA = aoclsparse_operation_transpose or aoclsparse_operation_conjugate_transpose and \(B\) is a \(n \times k\) matrix when opB = aoclsparse_operation_transpose or aoclsparse_operation_conjugate_transpose

aoclsparse_sp2m can be run in single-stage or two-stage. The single-stage algorithm allocates and computes the entire output matrix in a single stage aoclsparse_stage_full_computation. Whereas, in two-stage algorithm, the first stage aoclsparse_stage_nnz_count allocates memory for the output matrix and computes the number of entries of the matrix. The second stage aoclsparse_stage_finalize computes column indices of non-zero elements and values of the output matrix. The second stage has to be invoked only after the first stage. But, it can be also be invoked multiple times consecutively when the sparsity structure of input matrices remains unchanged, with only the values getting updated.

  1/* ************************************************************************
  2 * Copyright (c) 2023 Advanced Micro Devices, Inc. All rights reserved.
  3 *
  4 * Permission is hereby granted, free of charge, to any person obtaining a copy
  5 * of this software and associated documentation files (the "Software"), to deal
  6 * in the Software without restriction, including without limitation the rights
  7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  8 * copies of the Software, and to permit persons to whom the Software is
  9 * furnished to do so, subject to the following conditions:
 10 *
 11 * The above copyright notice and this permission notice shall be included in
 12 * all copies or substantial portions of the Software.
 13 *
 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 20 * THE SOFTWARE.
 21 *
 22 * ************************************************************************ */
 23
 24#include "aoclsparse.h"
 25
 26#include <complex>
 27#include <iomanip>
 28#include <iostream>
 29
 30/* Computes multiplication of 2 sparse matrices A and B in CSR format. */
 31
 32// Comment this out for single stage computation
 33//#define TWO_STAGE_COMPUTATION
 34
 35int main(void)
 36{
 37    std::cout << "-------------------------------" << std::endl
 38              << "----- SP2M sample program -----" << std::endl
 39              << "-------------------------------" << std::endl
 40              << std::endl;
 41
 42    aoclsparse_status     status;
 43    aoclsparse_int        nnz_C;
 44    aoclsparse_request    request;
 45    aoclsparse_index_base base = aoclsparse_index_base_zero;
 46    aoclsparse_operation  opA  = aoclsparse_operation_transpose;
 47    aoclsparse_operation  opB  = aoclsparse_operation_none;
 48
 49    // Print aoclsparse version
 50    std::cout << aoclsparse_get_version() << std::endl;
 51
 52    // Initialise matrix descriptor and csr matrix structure of inputs A and B
 53    aoclsparse_mat_descr descrA;
 54    aoclsparse_mat_descr descrB;
 55    aoclsparse_matrix    csrA;
 56    aoclsparse_matrix    csrB;
 57
 58    // Create matrix descriptor of input matrices
 59    // aoclsparse_create_mat_descr set aoclsparse_matrix_type to aoclsparse_matrix_type_general
 60    // and aoclsparse_index_base to aoclsparse_index_base_zero.
 61    aoclsparse_create_mat_descr(&descrA);
 62    aoclsparse_create_mat_descr(&descrB);
 63
 64    // Matrix sizes
 65    aoclsparse_int m = 5, n = 5, k = 5;
 66    aoclsparse_int nnz_A = 10, nnz_B = 10;
 67    // Matrix A
 68    aoclsparse_int            row_ptr_A[] = {0, 1, 2, 5, 9, 10};
 69    aoclsparse_int            col_ind_A[] = {0, 0, 1, 2, 4, 0, 1, 2, 3, 4};
 70    aoclsparse_double_complex val_A[]     = {{-0.86238, 0.454626},
 71                                             {-2.62138, -0.442597},
 72                                             {-0.875679, 0.137933},
 73                                             {-0.661939, -1.09106},
 74                                             {0.0501717, -2.37527},
 75                                             {-1.48812, -0.420546},
 76                                             {-0.588085, -0.708977},
 77                                             {0.310933, -0.96569},
 78                                             {-0.88964, -2.37881},
 79                                             {-1.23201, 0.213152}};
 80    aoclsparse_create_zcsr(
 81        &csrA, base, k, m, nnz_A, row_ptr_A, col_ind_A, (aoclsparse_double_complex *)val_A);
 82
 83    // Matrix B
 84    aoclsparse_int            row_ptr_B[] = {0, 4, 4, 7, 8, 10};
 85    aoclsparse_int            col_ind_B[] = {0, 1, 2, 4, 0, 1, 2, 2, 2, 3};
 86    aoclsparse_double_complex val_B[]     = {{-1.59204, -0.259325},
 87                                             {0.467532, -0.980612},
 88                                             {0.078412, -0.513591},
 89                                             {-1.52364, 0.403911},
 90                                             {0.211966, -1.33485},
 91                                             {-1.37901, -1.44562},
 92                                             {1.42472, -2.08662},
 93                                             {-2.26549, -1.0073},
 94                                             {-1.75098, 0.207783},
 95                                             {-1.8152, 0.482205}};
 96
 97    aoclsparse_create_zcsr(
 98        &csrB, base, k, n, nnz_B, row_ptr_B, col_ind_B, (aoclsparse_double_complex *)val_B);
 99
100    aoclsparse_matrix          csrC          = NULL;
101    aoclsparse_int            *csr_row_ptr_C = NULL;
102    aoclsparse_int            *csr_col_ind_C = NULL;
103    aoclsparse_double_complex *csr_val_C     = NULL;
104    aoclsparse_int             C_M, C_N;
105
106    // expected output matrices
107    aoclsparse_int            C_M_exp = 5, C_N_exp = 5, nnz_C_exp = 15;
108    aoclsparse_int            csr_row_ptr_C_exp[] = {0, 4, 7, 10, 11, 15};
109    aoclsparse_int            csr_col_ind_C_exp[] = {0, 1, 2, 4, 0, 1, 2, 0, 1, 2, 2, 0, 1, 2, 3};
110    aoclsparse_double_complex csr_val_C_exp[]     = {{1.49084, -0.500145},
111                                                     {0.0426217, 1.05821},
112                                                     {3.11358, 2.93029},
113                                                     {1.13033, -1.04101},
114                                                     {-0.0014949, 1.19813},
115                                                     {1.40697, 1.07569},
116                                                     {-0.34163, 4.22229},
117                                                     {-1.59671, 0.652319},
118                                                     {-0.664445, 2.4615},
119                                                     {-4.89686, 1.70132},
120                                                     {-0.380707, 6.28532},
121                                                     {-3.15998, -0.570448},
122                                                     {-3.50293, 3.20298},
123                                                     {-2.77187, -4.11799},
124                                                     {2.13356, -0.980994}};
125
126#ifdef TWO_STAGE_COMPUTATION
127    std::cout << "Invoking aoclsparse_sp2m with aoclsparse_stage_nnz_count..\n";
128    // aoclsparse_stage_nnz_count : Only rowIndex array of the CSR matrix
129    // is computed internally.
130    request = aoclsparse_stage_nnz_count;
131    status  = aoclsparse_sp2m(opA, descrA, csrA, opB, descrB, csrB, request, &csrC);
132    if(status != aoclsparse_status_success)
133        return 1;
134
135    std::cout << "Invoking aoclsparse_sp2m with aoclsparse_stage_finalize..\n";
136    // aoclsparse_stage_finalize : Finalize computation of remaining
137    // output arrays ( column indices and values of output matrix entries) .
138    // Has to be called only after aoclsparse_sp2m call with
139    // aoclsparse_stage_nnz_count parameter.
140    request = aoclsparse_stage_finalize;
141    status  = aoclsparse_sp2m(opA, descrA, csrA, opB, descrB, csrB, request, &csrC);
142    if(status != aoclsparse_status_success)
143        return 2;
144
145#else // SINGLE STAGE
146    std::cout << "Invoking aoclsparse_sp2m with aoclsparse_stage_full_computation..\n";
147    // aoclsparse_stage_full_computation :  Whole computation is performed in
148    // single step.
149    request = aoclsparse_stage_full_computation;
150    status  = aoclsparse_sp2m(opA, descrA, csrA, opB, descrB, csrB, request, &csrC);
151    if(status != aoclsparse_status_success)
152        return 3;
153
154#endif
155
156    aoclsparse_export_zcsr(
157        csrC, &base, &C_M, &C_N, &nnz_C, &csr_row_ptr_C, &csr_col_ind_C, &csr_val_C);
158    // Check and print the result
159    std::cout << std::fixed;
160    std::cout.precision(1);
161    bool oka, okb, okc, oki, okj, okk, ok = true;
162    std::cout << std::endl
163              << "Output Matrix C: " << std::endl
164              << std::setw(11) << "C_M" << std::setw(3) << "" << std::setw(11) << "expected"
165              << std::setw(3) << "" << std::setw(11) << "C_N" << std::setw(3) << "" << std::setw(11)
166              << "expected" << std::setw(3) << "" << std::setw(11) << "nnz_C" << std::setw(3) << ""
167              << std::setw(11) << "expected" << std::endl;
168    oka = C_M == C_M_exp;
169    ok &= oka;
170    std::cout << std::setw(11) << C_M << std::setw(3) << "" << std::setw(11) << C_M_exp
171              << std::setw(2) << (oka ? "" : " !");
172    okb = C_N == C_N_exp;
173    ok &= okb;
174    std::cout << std::setw(11) << C_N << std::setw(3) << "" << std::setw(11) << C_N_exp
175              << std::setw(2) << (okb ? "" : " !");
176    okc = nnz_C == nnz_C_exp;
177    ok &= okc;
178    std::cout << std::setw(11) << nnz_C << std::setw(3) << "" << std::setw(11) << nnz_C_exp
179              << std::setw(2) << (okc ? "" : " !");
180    std::cout << std::endl;
181    std::cout << std::endl;
182    std::cout << std::setw(11) << "csr_val_C" << std::setw(3) << "" << std::setw(11) << "expected"
183              << std::setw(3) << "" << std::setw(11) << "csr_col_ind_C" << std::setw(3) << ""
184              << std::setw(11) << "expected" << std::setw(3) << "" << std::setw(11)
185              << "csr_row_ptr_C" << std::setw(3) << "" << std::setw(11) << "expected" << std::endl;
186    //Initializing precision tolerance range for double
187    const double tol = 1e-03;
188    for(aoclsparse_int i = 0; i < nnz_C; i++)
189    {
190        oki = ((std::abs(csr_val_C[i].real - csr_val_C_exp[i].real) <= tol)
191               && (std::abs(csr_val_C[i].imag - csr_val_C_exp[i].imag) <= tol));
192        ok &= oki;
193        std::cout << std::setw(11) << "(" << csr_val_C[i].real << ", " << csr_val_C[i].imag << "i) "
194                  << std::setw(3) << "" << std::setw(11) << "(" << csr_val_C_exp[i].real << ", "
195                  << csr_val_C_exp[i].imag << "i) " << std::setw(2) << (oki ? "" : " !");
196        okj = csr_col_ind_C[i] == csr_col_ind_C_exp[i];
197        ok &= okj;
198        std::cout << std::setw(11) << csr_col_ind_C[i] << std::setw(3) << "" << std::setw(11)
199                  << csr_col_ind_C_exp[i] << std::setw(2) << (okj ? "" : " !");
200        if(i < C_M)
201        {
202            okk = csr_row_ptr_C[i] == csr_row_ptr_C_exp[i];
203            ok &= okk;
204            std::cout << " " << std::setw(11) << csr_row_ptr_C[i] << std::setw(3) << ""
205                      << std::setw(11) << csr_row_ptr_C_exp[i] << std::setw(2) << (okk ? "" : " !");
206        }
207        std::cout << std::endl;
208    }
209
210    aoclsparse_destroy_mat_descr(descrA);
211    aoclsparse_destroy_mat_descr(descrB);
212    aoclsparse_destroy(&csrA);
213    aoclsparse_destroy(&csrB);
214    aoclsparse_destroy(&csrC);
215    return (ok ? 0 : 6);
216}
Parameters
  • opA[in] matrix \(A\) operation type.

  • descrA[in] descriptor of the sparse CSR matrix \(A\). Currently, only aoclsparse_matrix_type_general is supported.

  • A[in] sparse CSR matrix \(A\) .

  • opB[in] matrix \(B\) operation type.

  • descrB[in] descriptor of the sparse CSR matrix \(B\). Currently, only aoclsparse_matrix_type_general is supported.

  • B[in] sparse CSR matrix \(B\) .

  • request[in] Specifies full computation or two-stage algorithm aoclsparse_stage_nnz_count , Only rowIndex array of the CSR matrix is computed internally. The output sparse CSR matrix can be extracted to measure the memory required for full operation. aoclsparse_stage_finalize . Finalize computation of remaining output arrays ( column indices and values of output matrix entries) . Has to be called only after aoclsparse_sp2m call with aoclsparse_stage_nnz_count parameter. aoclsparse_stage_full_computation . Perform the entire computation in a single step.

  • *C[out] Pointer to sparse CSR matrix \(C\) . Matrix \(C\) arrays will always have zero-based indexing, irrespective of matrix \(A\) or matrix \(B\) being one-based or zero-based indexing. The column indices of the output matrix in CSR format can appear unsorted.

Return values
  • aoclsparse_status_success – the operation completed successfully.

  • aoclsparse_status_invalid_pointerdescrA, descrB, A, B, C is invalid.

  • aoclsparse_status_invalid_size – input size parameters contain an invalid value.

  • aoclsparse_status_invalid_value – input parameters contain an invalid value.

  • aoclsparse_status_wrong_type – A and B matrix datatypes dont match.

  • aoclsparse_status_memory_error – Memory allocation failure.

  • aoclsparse_status_not_implementedaoclsparse_matrix_type is not aoclsparse_matrix_type_general or input matrices A or B is not in CSR format

aoclsparse_spmm()#

aoclsparse_status aoclsparse_spmm(aoclsparse_operation opA, const aoclsparse_matrix A, const aoclsparse_matrix B, aoclsparse_matrix *C)#

Sparse matrix Sparse matrix multiplication for real and complex datatypes.

aoclsparse_?spmm multiplies two sparse matrices in CSR storage format. The result is stored in a newly allocated sparse matrix in CSR format, such that

\[\begin{split}C = op(A) \cdot B, \text{ with } op(A) = \left\{ \begin{array}{ll} A, & \text{ if } {\bf\mathsf{opA}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{ if } {\bf\mathsf{opA}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{ if } {\bf\mathsf{opA}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right.\end{split}\]
where \(A\) is a \(m \times k\) matrix , \(B\) is a \(k \times n\) matrix, resulting in \(m \times n\) matrix \(C\), for opA = aoclsparse_operation_none. \(A\) is a \(k \times m\) matrix when opA = aoclsparse_operation_transpose or aoclsparse_operation_conjugate_transpose
Parameters
  • opA[in] matrix \(A\) operation type.

  • A[in] sparse CSR matrix \(A\).

  • B[in] sparse CSR matrix \(B\).

  • *C[out] Pointer to sparse CSR matrix \(C\) . Matrix \(C\) arrays will always have zero-based indexing, irrespective of matrix \(A\) or matrix \(B\) being one-based or zero-based indexing. The column indices of the output matrix in CSR format can appear unsorted.

Return values
  • aoclsparse_status_success – the operation completed successfully.

  • aoclsparse_status_invalid_pointerA, B, C is invalid.

  • aoclsparse_status_invalid_size – input size parameters contain an invalid value.

  • aoclsparse_status_invalid_value – input parameters contain an invalid value.

  • aoclsparse_status_wrong_typeA and B matrix data types do not match.

  • aoclsparse_status_memory_error – Memory allocation failure.

  • aoclsparse_status_not_implemented – Input matrices A or B is not in CSR format

aoclsparse_?csrmm()#

aoclsparse_status aoclsparse_scsrmm(aoclsparse_operation op, const float alpha, const aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const float *B, aoclsparse_int n, aoclsparse_int ldb, const float beta, float *C, aoclsparse_int ldc)#
aoclsparse_status aoclsparse_dcsrmm(aoclsparse_operation op, const double alpha, const aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const double *B, aoclsparse_int n, aoclsparse_int ldb, const double beta, double *C, aoclsparse_int ldc)#
aoclsparse_status aoclsparse_ccsrmm(aoclsparse_operation op, const aoclsparse_float_complex alpha, const aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const aoclsparse_float_complex *B, aoclsparse_int n, aoclsparse_int ldb, const aoclsparse_float_complex beta, aoclsparse_float_complex *C, aoclsparse_int ldc)#
aoclsparse_status aoclsparse_zcsrmm(aoclsparse_operation op, const aoclsparse_double_complex alpha, const aoclsparse_matrix A, const aoclsparse_mat_descr descr, aoclsparse_order order, const aoclsparse_double_complex *B, aoclsparse_int n, aoclsparse_int ldb, const aoclsparse_double_complex beta, aoclsparse_double_complex *C, aoclsparse_int ldc)#

Sparse matrix dense matrix multiplication using CSR storage format.

aoclsparse_?csrmm multiplies a scalar \(\alpha\) with a sparse \(m \times k\) matrix \(A\), defined in CSR storage format, and a dense \(k \times n\) matrix \(B\) and adds the result to the dense \(m \times n\) matrix \(C\) that is multiplied by a scalar \(\beta\), such that

\[\begin{split} C = \alpha \, op(A) \, B + \beta \, C, \quad \text{ with } \quad op(A) = \left\{ \begin{array}{ll} A, & \text{ if } {\bf\mathsf{trans}\_\mathsf{A}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{ if } {\bf\mathsf{trans}\_\mathsf{A}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{ if } {\bf\mathsf{trans}\_\mathsf{A}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]

  1/* ************************************************************************
  2 * Copyright (c) 2023 Advanced Micro Devices, Inc.
  3 *
  4 * Permission is hereby granted, free of charge, to any person obtaining a copy
  5 * of this software and associated documentation files (the "Software"), to deal
  6 * in the Software without restriction, including without limitation the rights
  7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  8 * copies of the Software, and to permit persons to whom the Software is
  9 * furnished to do so, subject to the following conditions:
 10 *
 11 * The above copyright notice and this permission notice shall be included in
 12 * all copies or substantial portions of the Software.
 13 *
 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 20 * THE SOFTWARE.
 21 *
 22 * ************************************************************************ */
 23
 24#include "aoclsparse.h"
 25
 26#include <complex>
 27#include <cstring>
 28#include <iomanip>
 29#include <iostream>
 30#include <limits>
 31#include <vector>
 32
 33// Example to illustrate the usage of aoclsparse_dcsrmm API
 34int main(void)
 35{
 36
 37    aoclsparse_index_base base = aoclsparse_index_base_zero;
 38    aoclsparse_matrix     A;
 39    aoclsparse_status     status;
 40    aoclsparse_order      order = aoclsparse_order_row;
 41    aoclsparse_mat_descr  descr;
 42
 43    // By default aoclsparse_create_mat_descr sets aoclsparse_matrix_type to aoclsparse_matrix_type_general
 44    status = aoclsparse_create_mat_descr(&descr);
 45    if(status != aoclsparse_status_success)
 46        return status;
 47
 48    // Set aoclsparse_index_base to aoclsparse_index_base_zero.
 49    status = aoclsparse_set_mat_index_base(descr, base);
 50    if(status != aoclsparse_status_success)
 51        return status;
 52
 53    // Matrix sizes
 54    aoclsparse_int m = 3, n = 3, k = 3;
 55    aoclsparse_int nnz = 4;
 56    // Matrix A
 57    // [ 0.  42.   0.2]
 58    // [ 4.6  0.   0. ]
 59    // [ 0.   0.  -8. ]
 60    aoclsparse_int row_ptr[] = {0, 2, 3, 4};
 61    aoclsparse_int col_ind[] = {1, 2, 0, 2};
 62    double         csr_val[] = {42., 0.2, 4.6, -8};
 63
 64    //Dense Matrix B
 65    double B[]   = {-1.0, -2.7, 3.0, 4.5, 5.8, -6.0, 1.0, -2.0, 3.0};
 66    double alpha = 1, beta = 0;
 67    status = aoclsparse_create_dcsr(&A, base, m, k, nnz, row_ptr, col_ind, csr_val);
 68    if(status != aoclsparse_status_success)
 69    {
 70        return status;
 71    }
 72
 73    // expected output matrices
 74    double C_exp[] = {189.2, 243.2, -251.4, -4.6, -12.42, 13.8, -8, 16, -24};
 75
 76    // output matrices;
 77    double               C[9] = {0};
 78    aoclsparse_operation op   = aoclsparse_operation_none;
 79
 80    status = aoclsparse_dcsrmm(op, alpha, A, descr, order, B, n, k, beta, C, n);
 81    if(status != aoclsparse_status_success)
 82    {
 83        return status;
 84    }
 85
 86    std::cout << std::fixed;
 87    std::cout.precision(2);
 88    bool okij, ok_C = true;
 89    //Initializing precision tolerance range for double
 90    const double tol = std::sqrt((std::numeric_limits<double>::epsilon()));
 91
 92    aoclsparse_int i, j;
 93    std::cout << "The output C matrix\n";
 94    for(i = 0; i < m; i++)
 95    {
 96        for(j = 0; j < n; j++)
 97        {
 98            okij = std::abs(C[i * n + j] - C_exp[i * n + j]) <= tol;
 99            ok_C &= okij;
100            std::cout << std::setw(10) << C[i * n + j] << std::setw(3) << (okij ? "" : " !");
101        }
102        std::cout << "\n";
103    }
104    aoclsparse_destroy_mat_descr(descr);
105    aoclsparse_destroy(&A);
106
107    return ((ok_C ? 0 : 4));
108}
Parameters
  • op[in] Matrix \(A\) operation type.

  • alpha[in] Scalar \(\alpha\).

  • A[in] Sparse CSR matrix \(A\) structure.

  • descr[in] descriptor of the sparse CSR matrix \(A\). Currently, supports aoclsparse_matrix_type_general, aoclsparse_matrix_type_symmetric, and aoclsparse_matrix_type_hermitian matrices. Both, base-zero and base-one input arrays of CSR matrix are supported.

  • order[in] aoclsparse_order_row / aoclsparse_order_column for dense matrix

  • B[in] Array of dimension \(ldb \times n\) or \(ldb \times k\) .

  • n[in] Number of columns of the dense matrix \(B\) and \(C\).

  • ldb[in] Leading dimension of \(B\), must be at least \(\max{(1, k)}\) for \(op(A) = A\), or \(\max{(1, m)}\) when \(op(A) = A^T\) or \(op(A) = A^H\).

  • beta[in] Scalar \(\beta\).

  • C[inout] Array of dimension \(ldc \times n\).

  • ldc[in] Leading dimension of \(C\), must be at least \(\max{(1, m)}\) for \(op(A) = A\), or \(\max{(1, k)}\) when \(op(A) = A^T\) or \(op(A) = A^H\).

Return values

aoclsparse_?csr2m()#

aoclsparse_status aoclsparse_dcsr2m(aoclsparse_operation trans_A, const aoclsparse_mat_descr descrA, const aoclsparse_matrix csrA, aoclsparse_operation trans_B, const aoclsparse_mat_descr descrB, const aoclsparse_matrix csrB, const aoclsparse_request request, aoclsparse_matrix *csrC)#
aoclsparse_status aoclsparse_scsr2m(aoclsparse_operation trans_A, const aoclsparse_mat_descr descrA, const aoclsparse_matrix csrA, aoclsparse_operation trans_B, const aoclsparse_mat_descr descrB, const aoclsparse_matrix csrB, const aoclsparse_request request, aoclsparse_matrix *csrC)#

Sparse matrix Sparse matrix multiplication using CSR storage format for single and double precision datatypes.

aoclsparse_?csr2m multiplies a sparse \(m \times k\) matrix \(A\), defined in CSR storage format, and the sparse \(k \times n\) matrix \(B\), defined in CSR storage format and stores the result to the sparse \(m \times n\) matrix \(C\), such that

\[ C = op(A) \cdot op(B), \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{trans}\_\mathsf{A}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{trans}\_\mathsf{A}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{trans}\_\mathsf{A}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]
and
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if } {\bf\mathsf{trans}\_\mathsf{B}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ B^T, & \text{if } {\bf\mathsf{trans}\_\mathsf{B}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ B^H, & \text{if } {\bf\mathsf{trans}\_\mathsf{B}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]

  1/* ************************************************************************
  2 * Copyright (c) 2021-2023 Advanced Micro Devices, Inc. All rights reserved.
  3 *
  4 * Permission is hereby granted, free of charge, to any person obtaining a copy
  5 * of this software and associated documentation files (the "Software"), to deal
  6 * in the Software without restriction, including without limitation the rights
  7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  8 * copies of the Software, and to permit persons to whom the Software is
  9 * furnished to do so, subject to the following conditions:
 10 *
 11 * The above copyright notice and this permission notice shall be included in
 12 * all copies or substantial portions of the Software.
 13 *
 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 20 * THE SOFTWARE.
 21 *
 22 * ************************************************************************ */
 23
 24#include "aoclsparse.h"
 25
 26#include <cstdlib>
 27#include <cstring>
 28#include <iostream>
 29
 30/* Computes multiplication of 2 sparse matrices A and B in CSR format. */
 31
 32// Comment this out for single stage computation
 33#define TWO_STAGE_COMPUTATION
 34
 35#define PRINT_OUTPUT
 36
 37int main(void)
 38{
 39    aoclsparse_status     status;
 40    aoclsparse_int        nnz_C;
 41    aoclsparse_request    request;
 42    aoclsparse_index_base base   = aoclsparse_index_base_zero;
 43    aoclsparse_operation  transA = aoclsparse_operation_none;
 44    aoclsparse_operation  transB = aoclsparse_operation_none;
 45
 46    // Print aoclsparse version
 47    std::cout << aoclsparse_get_version() << std::endl;
 48
 49    // Initialise matrix descriptor and csr matrix structure of inputs A and B
 50    aoclsparse_mat_descr descrA;
 51    aoclsparse_mat_descr descrB;
 52    aoclsparse_matrix    csrA;
 53    aoclsparse_matrix    csrB;
 54
 55    // Create matrix descriptor of input matrices
 56    // aoclsparse_create_mat_descr set aoclsparse_matrix_type to aoclsparse_matrix_type_general
 57    // and aoclsparse_index_base to aoclsparse_index_base_zero.
 58    aoclsparse_create_mat_descr(&descrA);
 59    aoclsparse_create_mat_descr(&descrB);
 60
 61    // Matrix sizes
 62    aoclsparse_int m = 3, n = 3, k = 3;
 63    aoclsparse_int nnz_A = 6, nnz_B = 4;
 64    // Matrix A
 65    // 	1  0  2
 66    // 	0  0  3
 67    // 	4  5  6
 68    aoclsparse_int row_ptr_A[] = {0, 2, 3, 6};
 69    aoclsparse_int col_ind_A[] = {0, 2, 2, 0, 1, 2};
 70    float          val_A[]     = {1, 2, 3, 4, 5, 6};
 71    aoclsparse_create_scsr(&csrA, base, m, k, nnz_A, row_ptr_A, col_ind_A, val_A);
 72
 73    // Matrix B
 74    // 	1  2  0
 75    // 	0  0  3
 76    // 	0  4  0
 77    aoclsparse_int row_ptr_B[] = {0, 2, 3, 4};
 78    aoclsparse_int col_ind_B[] = {0, 1, 2, 1};
 79    float          val_B[]     = {1, 2, 3, 4};
 80    aoclsparse_create_scsr(&csrB, base, k, n, nnz_B, row_ptr_B, col_ind_B, val_B);
 81
 82    aoclsparse_matrix csrC          = NULL;
 83    aoclsparse_int   *csr_row_ptr_C = NULL;
 84    aoclsparse_int   *csr_col_ind_C = NULL;
 85    float            *csr_val_C     = NULL;
 86    aoclsparse_int    C_M, C_N;
 87
 88#ifdef TWO_STAGE_COMPUTATION
 89    std::cout << "Invoking aoclsparse_scsr2m with aoclsparse_stage_nnz_count..";
 90    // aoclsparse_stage_nnz_count : Only rowIndex array of the CSR matrix
 91    // is computed internally. The output sparse CSR matrix can be
 92    // extracted to measure the memory required for full operation.
 93    request = aoclsparse_stage_nnz_count;
 94    status  = aoclsparse_scsr2m(transA, descrA, csrA, transB, descrB, csrB, request, &csrC);
 95    if(status == aoclsparse_status_success)
 96        std::cout << "DONE\n";
 97    else
 98    {
 99        std::cout << "ERROR in aoclsparse_scsr2m\n";
100        exit(EXIT_FAILURE);
101    }
102
103    aoclsparse_export_scsr(
104        csrC, &base, &C_M, &C_N, &nnz_C, &csr_row_ptr_C, &csr_col_ind_C, &csr_val_C);
105
106#if 0
107    std::cout << "C_M"
108              << "\t"
109              << "C_N"
110              << "\t"
111              << "nnz_C" << std::endl;
112    std::cout << C_M << "\t" << C_N << "\t" << nnz_C << std::endl;
113
114    std::cout << "csr_row_ptr_C: " << std::endl;
115    for(aoclsparse_int i = 0; i < C_M + 1; i++)
116        std::cout << csr_row_ptr_C[i] << "\t";
117    std::cout << std::endl;
118#endif
119
120    std::cout << "Invoking aoclsparse_scsr2m with aoclsparse_stage_finalize..";
121    // aoclsparse_stage_finalize : Finalize computation of remaining
122    // output arrays ( column indices and values of output matrix entries) .
123    // Has to be called only after aoclsparse_scsr2m call with
124    // aoclsparse_stage_nnz_count parameter.
125    request = aoclsparse_stage_finalize;
126    status  = aoclsparse_scsr2m(transA, descrA, csrA, transB, descrB, csrB, request, &csrC);
127    if(status == aoclsparse_status_success)
128        std::cout << "DONE\n";
129    else
130    {
131        std::cout << "ERROR in aoclsparse_scsr2m\n";
132        exit(EXIT_FAILURE);
133    }
134    aoclsparse_export_scsr(
135        csrC, &base, &C_M, &C_N, &nnz_C, &csr_row_ptr_C, &csr_col_ind_C, &csr_val_C);
136
137#ifdef PRINT_OUTPUT
138    std::cout << "csr_col_ind_C: " << std::endl;
139    for(aoclsparse_int i = 0; i < nnz_C; i++)
140        std::cout << csr_col_ind_C[i] << "\t";
141    std::cout << std::endl;
142
143    std::cout << "csr_val_C: " << std::endl;
144    for(aoclsparse_int i = 0; i < nnz_C; i++)
145        std::cout << csr_val_C[i] << "\t";
146    std::cout << std::endl;
147#endif
148
149#else // SINGLE STAGE
150    std::cout << "Invoking aoclsparse_scsr2m with aoclsparse_stage_full_computation..";
151    // aoclsparse_stage_full_computation :  Whole computation is performed in
152    // single step.
153    request = aoclsparse_stage_full_computation;
154    status  = aoclsparse_scsr2m(transA, descrA, csrA, transB, descrB, csrB, request, &csrC);
155    if(status == aoclsparse_status_success)
156        std::cout << "DONE\n";
157    else
158    {
159        std::cout << "ERROR in aoclsparse_scsr2m\n";
160        exit(EXIT_FAILURE);
161    }
162
163    aoclsparse_export_scsr(
164        csrC, &base, &C_M, &C_N, &nnz_C, &csr_row_ptr_C, &csr_col_ind_C, &csr_val_C);
165
166#ifdef PRINT_OUTPUT
167    std::cout << "C_M"
168              << "\t"
169              << "C_N"
170              << "\t"
171              << "nnz_C" << std::endl;
172    std::cout << C_M << "\t" << C_N << "\t" << nnz_C << std::endl;
173
174    std::cout << "csr_row_ptr_C: " << std::endl;
175    for(aoclsparse_int i = 0; i < C_M + 1; i++)
176        std::cout << csr_row_ptr_C[i] << "\t";
177    std::cout << std::endl;
178
179    std::cout << "csr_col_ind_C: " << std::endl;
180    for(aoclsparse_int i = 0; i < nnz_C; i++)
181        std::cout << csr_col_ind_C[i] << "\t";
182    std::cout << std::endl;
183
184    std::cout << "csr_val_C: " << std::endl;
185    for(aoclsparse_int i = 0; i < nnz_C; i++)
186        std::cout << csr_val_C[i] << "\t";
187    std::cout << std::endl;
188#endif
189
190#endif
191    aoclsparse_destroy_mat_descr(descrA);
192    aoclsparse_destroy_mat_descr(descrB);
193    aoclsparse_destroy(&csrA);
194    aoclsparse_destroy(&csrB);
195    aoclsparse_destroy(&csrC);
196    return 0;
197}
Parameters
  • trans_A[in] matrix \(A\) operation type.

  • descrA[in] descriptor of the sparse CSR matrix \(A\). Currently, only aoclsparse_matrix_type_general is supported.

  • csrA[in] sparse CSR matrix \(A\) structure.

  • trans_B[in] matrix \(B\) operation type.

  • descrB[in] descriptor of the sparse CSR matrix \(B\). Currently, only aoclsparse_matrix_type_general is supported.

  • csrB[in] sparse CSR matrix \(B\) structure.

  • request[in] Specifies full computation or two-stage algorithm aoclsparse_stage_nnz_count , Only rowIndex array of the CSR matrix is computed internally. The output sparse CSR matrix can be extracted to measure the memory required for full operation. aoclsparse_stage_finalize . Finalize computation of remaining output arrays ( column indices and values of output matrix entries) . Has to be called only after aoclsparse_dcsr2m() call with aoclsparse_stage_nnz_count parameter. aoclsparse_stage_full_computation. Perform the entire computation in a single step.

  • *csrC[out] Pointer to sparse CSR matrix \(C\) structure.

Return values
  • aoclsparse_status_success – the operation completed successfully.

  • aoclsparse_status_invalid_size – input parameters contain an invalid value.

  • aoclsparse_status_invalid_pointerdescrA, csr, descrB, csrB, csrC is invalid.

  • aoclsparse_status_not_implementedaoclsparse_matrix_type is not aoclsparse_matrix_type_general or input matrices A or B is not in CSR format

aoclsparse_?add()#

aoclsparse_status aoclsparse_sadd(const aoclsparse_operation op, const aoclsparse_matrix A, const float alpha, const aoclsparse_matrix B, aoclsparse_matrix *C)#
aoclsparse_status aoclsparse_dadd(const aoclsparse_operation op, const aoclsparse_matrix A, const double alpha, const aoclsparse_matrix B, aoclsparse_matrix *C)#
aoclsparse_status aoclsparse_cadd(const aoclsparse_operation op, const aoclsparse_matrix A, const aoclsparse_float_complex alpha, const aoclsparse_matrix B, aoclsparse_matrix *C)#
aoclsparse_status aoclsparse_zadd(const aoclsparse_operation op, const aoclsparse_matrix A, const aoclsparse_double_complex alpha, const aoclsparse_matrix B, aoclsparse_matrix *C)#

Addition of two sparse matrices.

aoclsparse_?add adds two sparse matrices and returns a sparse matrix. Matrices can be either real or complex types but cannot be intermixed. It performs

\[\begin{split} C = \alpha \, op(A) + B \qquad\text{ with } \qquad op(A) = \left\{ \begin{array}{ll} A, & \text{ if } {\bf\mathsf{op}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{ if } {\bf\mathsf{op}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{ if } {\bf\mathsf{op}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]
where \(A\) is a \(m \times n\) matrix and \(B\) is a \(m \times n\) matrix, if op = aoclsparse_operation_none. Otherwise \(A\) is \(n \times m\) and the result matrix \(C\) has the same dimension as \(B\).

Note

Only matrices in CSR format are supported in this release.

Parameters
  • op[in] matrix \(A\) operation type.

  • alpha[in] scalar with same precision as \(A\) and \(B\) matrix

  • A[in] source sparse matrix \(A\)

  • B[in] source sparse matrix \(B\)

  • *C[out] pointer to the sparse output matrix \(C\)

Return values
  • aoclsparse_status_success – The operation completed successfully.

  • aoclsparse_status_invalid_pointerA or B or C are invalid

  • aoclsparse_status_invalid_size – The dimensions of A and B are not compatible.

  • aoclsparse_status_internal_error – Internal Error Occured

  • aoclsparse_status_memory_error – Memory allocation failure.

  • aoclsparse_status_not_implemented – Matrices are not in CSR format.

aoclsparse_?spmmd()#

aoclsparse_status aoclsparse_sspmmd(const aoclsparse_operation op, const aoclsparse_matrix A, const aoclsparse_matrix B, const aoclsparse_order layout, float *C, const aoclsparse_int ldc)#
aoclsparse_status aoclsparse_dspmmd(const aoclsparse_operation op, const aoclsparse_matrix A, const aoclsparse_matrix B, const aoclsparse_order layout, double *C, const aoclsparse_int ldc)#
aoclsparse_status aoclsparse_cspmmd(const aoclsparse_operation op, const aoclsparse_matrix A, const aoclsparse_matrix B, const aoclsparse_order layout, aoclsparse_float_complex *C, const aoclsparse_int ldc)#
aoclsparse_status aoclsparse_zspmmd(const aoclsparse_operation op, const aoclsparse_matrix A, const aoclsparse_matrix B, const aoclsparse_order layout, aoclsparse_double_complex *C, const aoclsparse_int ldc)#

Matrix multiplication of two sparse matrices stored in the CSR storage format. The output matrix is stored in a dense format.

aoclsparse_?spmmd multiplies a sparse matrix \(A\) and a sparse matrix \(B\), both stored in the CSR storage format, and saves the result in a dense matrix \(C\), such that

\[ C := op(A) \cdot B, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if op} = \text{aoclsparse\_operation\_none} \\ A^T, & \text{if op} = \text{aoclsparse\_operation\_transpose} \\ A^H, & \text{if op} = \text{aoclsparse\_operation\_conjugate\_transpose} \end{array} \right. \end{split}\]
Parameters
  • op[in] Operation to perform on matrix \(A\).

  • A[in] Matrix structure containing sparse matrix \(A\) of size \(m \times k\).

  • B[in] Matrix structure containing sparse matrix \(B\) of size \(k \times n\) if op is aoclsparse_operation_none otherwise of size \(m \times n\).

  • layout[in] Ordering of the dense output matrix: valid values are aoclsparse_order_row and aoclsparse_order_column.

  • C[inout] Dense output matrix \(C\) of size \(m \times n\) if op is aoclsparse_operation_none, otherwise of size \(k \times n\) containing the matrix-matrix product of \(A\) and \(B\).

  • ldc[in] Leading dimension of \(C\), e.g., for C stored in aoclsparse_order_row, ldc must be at least \(\max{(1, m)}\) when \(op(A) = A\), or \(\max{(1, k)}\) if \(op(A) = A^T\) or \(op(A) = A^H\).

Return values
  • aoclsparse_status_success – The operation completed successfully.

  • aoclsparse_status_invalid_sizem, n, k, nnz or ldc is not valid.

  • aoclsparse_status_invalid_pointerA, B or C pointer is not valid.

  • aoclsparse_status_wrong_typeaoclsparse_matrix_data_type does not match the precision type.

  • aoclsparse_status_not_implemented – aoclsparse_matrix_format_type is not aoclsparse_csr_mat.

aoclsparse_?sp2md()#

aoclsparse_status aoclsparse_ssp2md(const aoclsparse_operation opA, const aoclsparse_mat_descr descrA, const aoclsparse_matrix A, const aoclsparse_operation opB, const aoclsparse_mat_descr descrB, const aoclsparse_matrix B, const float alpha, const float beta, float *C, const aoclsparse_order layout, const aoclsparse_int ldc)#
aoclsparse_status aoclsparse_dsp2md(const aoclsparse_operation opA, const aoclsparse_mat_descr descrA, const aoclsparse_matrix A, const aoclsparse_operation opB, const aoclsparse_mat_descr descrB, const aoclsparse_matrix B, const double alpha, const double beta, double *C, const aoclsparse_order layout, const aoclsparse_int ldc)#
aoclsparse_status aoclsparse_csp2md(const aoclsparse_operation opA, const aoclsparse_mat_descr descrA, const aoclsparse_matrix A, const aoclsparse_operation opB, const aoclsparse_mat_descr descrB, const aoclsparse_matrix B, aoclsparse_float_complex alpha, aoclsparse_float_complex beta, aoclsparse_float_complex *C, const aoclsparse_order layout, const aoclsparse_int ldc)#
aoclsparse_status aoclsparse_zsp2md(const aoclsparse_operation opA, const aoclsparse_mat_descr descrA, const aoclsparse_matrix A, const aoclsparse_operation opB, const aoclsparse_mat_descr descrB, const aoclsparse_matrix B, aoclsparse_double_complex alpha, aoclsparse_double_complex beta, aoclsparse_double_complex *C, const aoclsparse_order layout, const aoclsparse_int ldc)#

A variant of matrix multiplication of two sparse matrices stored in the CSR storage format. The output matrix is stored in a dense format. Supports operations on both sparse matrices.

aoclsparse_?sp2md multiplies a sparse matrix \(A\) and a sparse matrix \(B\), both stored in the CSR storage format, and saves the result in a dense matrix \(C\), such that

\[ C := \alpha \cdot op(A) \cdot op(B) + \beta \cdot C, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if opA} = \text{aoclsparse\_operation\_none} \\ A^T, & \text{if opA} = \text{aoclsparse\_operation\_transpose} \\ A^H, & \text{if opA} = \text{aoclsparse\_operation\_conjugate\_transpose} \end{array} \right. \end{split}\]
and
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if opB} = \text{aoclsparse\_operation\_none} \\ B^T, & \text{if opB} = \text{aoclsparse\_operation\_transpose} \\ B^H, & \text{if opB} = \text{aoclsparse\_operation\_conjugate\_transpose} \end{array} \right. \end{split}\]
Parameters
  • opA[in] Operation to perform on matrix \(A\).

  • descrA[in] Descriptor of A. Only aoclsparse_matrix_type_general is supported at present. As a consequence, all other parameters within the descriptor are ignored.

  • A[in] Matrix structure containing sparse matrix \(A\) of size \(m \times k\).

  • opB[in] Operation to perform on matrix \(B\).

  • descrB[in] Descriptor of B. Only aoclsparse_matrix_type_general is supported at present. As a consequence, all other parameters within the descriptor are ignored.

  • B[in] Matrix structure containing sparse matrix \(B\) of size \(k \times n\) if op is aoclsparse_operation_none otherwise of size \(m \times n\).

  • alpha[in] Value of \( \alpha\).

  • beta[in] Value of \( \beta\).

  • C[inout] Dense output matrix \(C\).

  • layout[in] Ordering of the dense output matrix: valid values are aoclsparse_order_row and aoclsparse_order_column.

  • ldc[in] Leading dimension of \(C\), e.g., for C stored in aoclsparse_order_row, ldc must be at least \(\max{(1, m)}\) ( \(op(A) = A\)) or \(\max{(1, k)}\) ( \(op(A) = A^T\) or \(op(A) = A^H\)).

Return values
  • aoclsparse_status_success – The operation completed successfully.

  • aoclsparse_status_invalid_sizem, n, k, nnz or ldc is not valid.

  • aoclsparse_status_invalid_pointerA, B or C pointer is not valid.

  • aoclsparse_status_wrong_typeaoclsparse_matrix_data_type does not match the precision type.

  • aoclsparse_status_not_implemented – aoclsparse_matrix_format_type is not aoclsparse_csr_mat.

  • aoclsparse_status_internal_error – An internal error occurred.

aoclsparse_syrk()#

aoclsparse_status aoclsparse_syrk(const aoclsparse_operation opA, const aoclsparse_matrix A, aoclsparse_matrix *C)#

Multiplication of a sparse matrix and its transpose (or conjugate transpose) stored as a sparse matrix.

aoclsparse_syrk multiplies a sparse matrix with its transpose (or conjugate transpose) in CSR storage format. The result is stored in a newly allocated sparse matrix in CSR format, such that

\[ C := A \cdot op(A) \]
if \(opA\) is aoclsparse_operation_none.

Otherwise,

\[ C := op(A) \cdot A, \]

where

\[\begin{split} op(A) = \left\{ \begin{array}{ll} A^T, & \text{transpose of } {\bf\mathsf{A}} \text{ for real matrices}\\ A^H, & \text{conjugate transpose of } {\bf\mathsf{A}} \text{ for complex matrices}\\ \end{array} \right. \end{split}\]

where \(A\) is a \(m \times n\) matrix, opA is one of aoclsparse_operation_none, aoclsparse_operation_transpose (for real matrices) or aoclsparse_operation_conjugate_transpose (for complex matrices). The output matrix \(C\) is a sparse symmetric (or Hermitian) matrix stored as an upper triangular matrix in CSR format.

  1/* ************************************************************************
  2 * Copyright (c) 2024 Advanced Micro Devices, Inc.
  3 *
  4 * Permission is hereby granted, free of charge, to any person obtaining a copy
  5 * of this software and associated documentation files (the "Software"), to deal
  6 * in the Software without restriction, including without limitation the rights
  7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  8 * copies of the Software, and to permit persons to whom the Software is
  9 * furnished to do so, subject to the following conditions:
 10 *
 11 * The above copyright notice and this permission notice shall be included in
 12 * all copies or substantial portions of the Software.
 13 *
 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 20 * THE SOFTWARE.
 21 *
 22 * ************************************************************************ */
 23
 24#include "aoclsparse.h"
 25
 26#include <complex>
 27#include <cstring>
 28#include <iomanip>
 29#include <iostream>
 30#include <limits>
 31#include <vector>
 32
 33// An example to illustrate the usage of aoclsparse_syrk
 34int main(void)
 35{
 36    std::cout << "-------------------------------" << std::endl
 37              << "----- syrk sample program -----" << std::endl
 38              << "-------------------------------" << std::endl
 39              << std::endl;
 40
 41    aoclsparse_matrix     A;
 42    aoclsparse_matrix     C;
 43    aoclsparse_status     status;
 44    aoclsparse_index_base base          = aoclsparse_index_base_zero;
 45    aoclsparse_int       *csr_row_ptr_C = NULL;
 46    aoclsparse_int       *csr_col_ind_C = NULL;
 47    double               *csr_val_C     = NULL;
 48    aoclsparse_int        C_m, C_n, C_nnz;
 49
 50    // Matrix sizes
 51    aoclsparse_int m = 4, k = 3;
 52    aoclsparse_int nnz = 7;
 53    // Matrix A
 54    // [ 0.  -1.2   2.3]
 55    // [ 4.6  0.    0. ]
 56    // [ 0.   3.0  -8.1 ]
 57    // [ 0.3   0.  -5.1 ]
 58    aoclsparse_int       row_ptr[] = {0, 2, 3, 5, 7};
 59    aoclsparse_int       col_ind[] = {1, 2, 0, 1, 2, 0, 2};
 60    double               csr_val[] = {-1.2, 2.3, 4.6, 3.0, -8.1, 0.3, -5.1};
 61    aoclsparse_operation op        = aoclsparse_operation_none;
 62
 63    status = aoclsparse_create_dcsr(&A, base, m, k, nnz, row_ptr, col_ind, csr_val);
 64    if(status != aoclsparse_status_success)
 65    {
 66        return status;
 67    }
 68
 69    // expected output matrices
 70    aoclsparse_int C_m_exp = 4, C_n_exp = 4, C_nnz_exp = 8;
 71    aoclsparse_int csr_row_ptr_C_exp[] = {0, 3, 5, 7, 8};
 72    aoclsparse_int csr_col_ind_C_exp[] = {0, 2, 3, 1, 3, 2, 3, 3};
 73    double         csr_val_C_exp[]
 74        = {6.73000, -22.23000, -11.73000, 21.16000, 1.38000, 74.61000, 41.31000, 26.10000};
 75
 76    // output matrices;
 77
 78    status = aoclsparse_syrk(op, A, &C);
 79    if(status != aoclsparse_status_success)
 80    {
 81        return status;
 82    }
 83    aoclsparse_export_dcsr(
 84        C, &base, &C_m, &C_n, &C_nnz, &csr_row_ptr_C, &csr_col_ind_C, &csr_val_C);
 85
 86    bool oka, okb, okc, oki, okj, okk, ok = true;
 87    std::cout << std::endl
 88              << "Output Matrix C: " << std::endl
 89              << std::setw(11) << "C_m" << std::setw(3) << "" << std::setw(11) << "expected"
 90              << std::setw(3) << "" << std::setw(11) << "C_n" << std::setw(3) << "" << std::setw(11)
 91              << "expected" << std::setw(3) << "" << std::setw(11) << "C_nnz" << std::setw(3) << ""
 92              << std::setw(11) << "expected" << std::endl;
 93    oka = C_m == C_m_exp;
 94    ok &= oka;
 95    std::cout << std::setw(11) << C_m << std::setw(3) << "" << std::setw(11) << C_m_exp
 96              << std::setw(2) << (oka ? "" : " !");
 97    okb = C_n == C_n_exp;
 98    ok &= okb;
 99    std::cout << std::setw(11) << C_n << std::setw(3) << "" << std::setw(11) << C_n_exp
100              << std::setw(2) << (okb ? "" : " !");
101    okc = C_nnz == C_nnz_exp;
102    ok &= okc;
103    std::cout << std::setw(11) << C_nnz << std::setw(3) << "" << std::setw(11) << C_nnz_exp
104              << std::setw(2) << (okc ? "" : " !");
105    std::cout << std::endl;
106    std::cout << std::endl;
107    std::cout << std::setw(11) << "csr_val_C" << std::setw(3) << "" << std::setw(11) << "expected"
108              << std::setw(3) << "" << std::setw(11) << "csr_col_ind_C" << std::setw(3) << ""
109              << std::setw(11) << "expected" << std::setw(3) << "" << std::setw(11)
110              << "csr_row_ptr_C" << std::setw(3) << "" << std::setw(11) << "expected" << std::endl;
111    //Initializing precision tolerance range for double
112    const double tol = 1e-03;
113    for(aoclsparse_int i = 0; i < C_nnz; i++)
114    {
115        oki = ((std::abs(csr_val_C[i] - csr_val_C_exp[i]) <= tol));
116        ok &= oki;
117        std::cout << std::setw(11) << csr_val_C[i] << "" << std::setw(11) << csr_val_C_exp[i]
118                  << (oki ? "" : " !");
119        okj = csr_col_ind_C[i] == csr_col_ind_C_exp[i];
120        ok &= okj;
121        std::cout << std::setw(11) << csr_col_ind_C[i] << std::setw(3) << "" << std::setw(11)
122                  << csr_col_ind_C_exp[i] << std::setw(2) << (okj ? "" : " !");
123        if(i <= C_m)
124        {
125            okk = csr_row_ptr_C[i] == csr_row_ptr_C_exp[i];
126            ok &= okk;
127            std::cout << " " << std::setw(11) << csr_row_ptr_C[i] << std::setw(3) << ""
128                      << std::setw(11) << csr_row_ptr_C_exp[i] << std::setw(2) << (okk ? "" : " !");
129        }
130        std::cout << std::endl;
131    }
132    aoclsparse_destroy(&A);
133    aoclsparse_destroy(&C);
134
135    return (ok ? 0 : 6);
136}

Note

aoclsparse_syrk assumes that the input CSR matrix has sorted column indices in each row. If not, call aoclsparse_order_mat() before calling aoclsparse_syrk.

Note

aoclsparse_syrk currently does not support aoclsparse_operation_transpose for complex A.

Parameters
  • opA[in] Matrix \(A\) operation type.

  • A[in] Sorted sparse CSR matrix \(A\).

  • *C[out] Pointer to the new sparse CSR symmetric/Hermitian matrix \(C\). Only upper triangle of the result matrix is computed. The column indices of the output matrix in CSR format might be unsorted. The matrix should be freed by aoclsparse_destroy() when no longer needed.

Return values
  • aoclsparse_status_success – The operation completed successfully.

  • aoclsparse_status_invalid_pointerA, C is invalid.

  • aoclsparse_status_wrong_type – A and its operation type do not match.

  • aoclsparse_status_not_implemented – The input matrix is not in the CSR format or opA is aoclsparse_operation_transpose and A has complex values.

  • aoclsparse_status_invalid_value – The value of opA is invalid.

  • aoclsparse_status_unsorted_input – Input matrices are not sorted.

  • aoclsparse_status_memory_error – Memory allocation failure.

aoclsparse_?syrkd()#

aoclsparse_status aoclsparse_ssyrkd(const aoclsparse_operation opA, const aoclsparse_matrix A, const float alpha, const float beta, float *C, const aoclsparse_order orderC, const aoclsparse_int ldc)#
aoclsparse_status aoclsparse_dsyrkd(const aoclsparse_operation opA, const aoclsparse_matrix A, const double alpha, const double beta, double *C, const aoclsparse_order orderC, const aoclsparse_int ldc)#
aoclsparse_status aoclsparse_csyrkd(const aoclsparse_operation opA, const aoclsparse_matrix A, const aoclsparse_float_complex alpha, const aoclsparse_float_complex beta, aoclsparse_float_complex *C, const aoclsparse_order orderC, const aoclsparse_int ldc)#
aoclsparse_status aoclsparse_zsyrkd(const aoclsparse_operation opA, const aoclsparse_matrix A, const aoclsparse_double_complex alpha, const aoclsparse_double_complex beta, aoclsparse_double_complex *C, const aoclsparse_order orderC, const aoclsparse_int ldc)#

Multiplication of a sparse matrix and its transpose (or conjugate transpose) for all data types.

aoclsparse_syrkd multiplies a sparse matrix with its transpose (or conjugate transpose) in CSR storage format. The result is stored in a dense format, such that

\[ C := \alpha \cdot A \cdot op(A) + \beta \cdot C \]
if \(opA\) is aoclsparse_operation_none.

Otherwise,

\[ C := \alpha \cdot op(A) \cdot A + \beta \cdot C \]

\[\begin{split} op(A) = \left\{ \begin{array}{ll} A^T, & \text{transpose of } {\bf\mathsf{A}} \text{ for real matrices}\\ A^H, & \text{conjugate transpose of } {\bf\mathsf{A}} \text{ for complex matrices}\\ \end{array} \right. \end{split}\]

where \(A\) is a \(m \times n\) sparse matrix, opA is one of aoclsparse_operation_none, aoclsparse_operation_transpose (for real matrices) or aoclsparse_operation_conjugate_transpose (for complex matrices). The output matrix \(C\) is a dense symmetric (or Hermitian) matrix stored as an upper triangular matrix.

  1/* ************************************************************************
  2 * Copyright (c) 2024 Advanced Micro Devices, Inc.
  3 *
  4 * Permission is hereby granted, free of charge, to any person obtaining a copy
  5 * of this software and associated documentation files (the "Software"), to deal
  6 * in the Software without restriction, including without limitation the rights
  7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  8 * copies of the Software, and to permit persons to whom the Software is
  9 * furnished to do so, subject to the following conditions:
 10 *
 11 * The above copyright notice and this permission notice shall be included in
 12 * all copies or substantial portions of the Software.
 13 *
 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 20 * THE SOFTWARE.
 21 *
 22 * ************************************************************************ */
 23
 24#include "aoclsparse.h"
 25
 26#include <complex>
 27#include <cstring>
 28#include <iomanip>
 29#include <iostream>
 30#include <limits>
 31#include <vector>
 32
 33// An example to illustrate the usage of aoclsparse_syrk
 34int main(void)
 35{
 36    std::cout << "-------------------------------" << std::endl
 37              << "----- syrkd sample program -----" << std::endl
 38              << "-------------------------------" << std::endl
 39              << std::endl;
 40
 41    aoclsparse_matrix     A;
 42    aoclsparse_status     status;
 43    aoclsparse_index_base base = aoclsparse_index_base_zero;
 44    aoclsparse_mat_descr  descr;
 45    double                alpha  = 2.1;
 46    double                beta   = 1.3;
 47    aoclsparse_int        ldc    = 10;
 48    aoclsparse_operation  op     = aoclsparse_operation_none;
 49    aoclsparse_order      layout = aoclsparse_order_row;
 50    aoclsparse_int        m_C;
 51    double               *C     = nullptr;
 52    double               *C_exp = nullptr;
 53
 54    // By default aoclsparse_create_mat_descr sets aoclsparse_matrix_type to aoclsparse_matrix_type_general
 55    status = aoclsparse_create_mat_descr(&descr);
 56    if(status != aoclsparse_status_success)
 57        return status;
 58
 59    // Matrix sizes
 60    aoclsparse_int m = 4, k = 3;
 61    aoclsparse_int nnz = 7;
 62    // Matrix A
 63    // [ 0.  -1.2   2.3]
 64    // [ 4.6  0.    0. ]
 65    // [ 0.   3.0  -8.1 ]
 66    // [ 0.3   0.  -5.1 ]
 67    aoclsparse_int row_ptr[] = {0, 2, 3, 5, 7};
 68    aoclsparse_int col_ind[] = {1, 2, 0, 1, 2, 0, 2};
 69    double         csr_val[] = {-1.2, 2.3, 4.6, 3.0, -8.1, 0.3, -5.1};
 70
 71    status = aoclsparse_create_dcsr(&A, base, m, k, nnz, row_ptr, col_ind, csr_val);
 72    if(status != aoclsparse_status_success)
 73    {
 74        return status;
 75    }
 76
 77    // output matrix dimension
 78    m_C = 4;
 79
 80    C     = (double *)malloc(sizeof(double) * ldc * m_C);
 81    C_exp = (double *)malloc(sizeof(double) * ldc * m_C);
 82
 83    // set the large C matrix to -1, only a window of size 4x4 (upper triangular part)
 84    // is updated with the syrkd call
 85    for(aoclsparse_int i = 0; i < ldc * m_C; i++)
 86    {
 87        C[i]     = -1;
 88        C_exp[i] = -1;
 89    }
 90
 91    // Expected value of C which is C_exp
 92    // [ 12.8330    -1.3000   -47.9830   -25.9330,  -1, -1, ....]
 93    // [ -1         43.1360    -1.3000     1.5980,  -1, -1, ....]
 94    // [ -1         -1         155.3810    85.4510, -1, -1, ....]
 95    // [ -1         -1         -1          53.5100, -1, -1, ....]
 96    // [ -1         -1         -1          -1,      -1, -1, ....]
 97    // ...
 98    C_exp[0]  = 12.8330;
 99    C_exp[1]  = -1.3000;
100    C_exp[2]  = -47.9830;
101    C_exp[3]  = -25.9330;
102    C_exp[11] = 43.1360;
103    C_exp[12] = -1.3000;
104    C_exp[13] = 1.5980;
105    C_exp[22] = 155.3810;
106    C_exp[23] = 85.4510;
107    C_exp[33] = 53.5100;
108
109    status = aoclsparse_dsyrkd(op, A, alpha, beta, C, layout, ldc);
110    if(status != aoclsparse_status_success)
111    {
112        return status;
113    }
114
115    bool oki, ok = true;
116    //Initializing precision tolerance range for double
117    const double tol = 1e-03;
118    std::cout << "Expected output\n"
119              << "-------------------------------\n";
120    for(aoclsparse_int i = 0; i < m_C; i++)
121    {
122        for(aoclsparse_int j = 0; j < ldc; j++)
123        {
124            std::cout << std::setw(10) << C_exp[i * ldc + j] << std::setw(3) << "";
125        }
126        std::cout << std::endl;
127    }
128
129    std::cout << "Actual output\n"
130              << "-------------------------------\n";
131    for(aoclsparse_int i = 0; i < m_C; i++)
132    {
133        for(aoclsparse_int j = 0; j < ldc; j++)
134        {
135            oki = ((std::abs(C[i * ldc + j] - C_exp[i * ldc + j]) <= tol));
136            ok &= oki;
137            std::cout << std::setw(10) << C[i * ldc + j] << std::setw(3) << (oki ? "" : " !");
138        }
139        std::cout << std::endl;
140    }
141    aoclsparse_destroy_mat_descr(descr);
142    aoclsparse_destroy(&A);
143    free(C);
144    free(C_exp);
145
146    return (ok ? 0 : 6);
147}

Note

aoclsparse_syrkd assumes that the input CSR matrix has sorted column indices in each row. If not, call aoclsparse_order_mat() before calling aoclsparse_syrkd.

Note

For complex type, only the real parts of \(\alpha\) and \(\beta\) are taken into account to preserve Hermitian \(C\).

Note

aoclsparse_syrkd currently does not support aoclsparse_operation_transpose for complex A.

Parameters
  • opA[in] Matrix \(A\) operation type.

  • A[in] Sorted sparse CSR matrix \(A\).

  • alpha[in] Scalar \(\alpha\).

  • beta[in] Scalar \(\beta\).

  • C[inout] Output dense matrix. Only upper triangular part of the matrix is processed during the computation, the strictly lower triangle is not modified.

  • orderC[in] Storage format of the output dense matrix, \(C\). It can be aoclsparse_order_row or aoclsparse_order_column.

  • ldc[in] Leading dimension of \(C\).

Return values
  • aoclsparse_status_success – The operation completed successfully.

  • aoclsparse_status_invalid_pointerA, C is invalid.

  • aoclsparse_status_wrong_typeA and its operation type do not match.

  • aoclsparse_status_not_implemented – The input matrix is not in the CSR format or opA is aoclsparse_operation_transpose and A has complex values.

  • aoclsparse_status_invalid_value – The value of opA, orderC or ldc is invalid.

  • aoclsparse_status_unsorted_input – Input matrix is not sorted.

  • aoclsparse_status_memory_error – Memory allocation failure.

aoclsparse_?sypr()#

aoclsparse_status aoclsparse_sypr(aoclsparse_operation opA, const aoclsparse_matrix A, const aoclsparse_matrix B, const aoclsparse_mat_descr descrB, aoclsparse_matrix *C, const aoclsparse_request request)#

Symmetric product of three sparse matrices for real and complex datatypes stored as a sparse matrix.

aoclsparse_sypr multiplies three sparse matrices in CSR storage format. The result is returned in a newly allocated symmetric or Hermitian sparse matrix stored as an upper triangle in CSR format.

If \(opA\) is aoclsparse_operation_none,

\[ C = A \cdot B \cdot A^T, \]
or
\[ C = A \cdot B \cdot A^H, \]
for real or complex input matrices, respectively, where \(A\) is a \(m \times n\) general matrix , \(B\) is a \(n \times n\) symmetric (for real data types) or Hermitian (for complex data types) matrix, resulting in a symmetric or Hermitian \(m \times m\) matrix \(C\).

Otherwise,

\[ C = op(A) \cdot B \cdot A, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A^T, & \text{if } {\bf\mathsf{opA}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{opA}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]
where \(A\) is a \(m \times n\) matrix and \(B\) is a \(m \times m\) symmetric (or Hermitian) matrix, resulting in a \(n \times n\) symmetric (or Hermitian) matrix \(C\).

Depending on request, aoclsparse_sypr might compute the result in a single stage (aoclsparse_stage_full_computation) or in two stages. Then the first stage (aoclsparse_stage_nnz_count) allocates memory for the new output matrix \(C\) and computes its number of non-zeros and their structure which is followed by the second stage (aoclsparse_stage_finalize) to compute the column indices and values of all elements. The second stage can be invoked multiple times (either after aoclsparse_stage_full_computation or aoclsparse_stage_nnz_count) to recompute the numerical values of \(C\) on assumption that the sparsity structure of the input matrices remained unchanged and only the values of the non-zero elements were modified (e.g., by a call to aoclsparse_supdate_values() and variants).

  1/* ************************************************************************
  2 * Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved.
  3 *
  4 * Permission is hereby granted, free of charge, to any person obtaining a copy
  5 * of this software and associated documentation files (the "Software"), to deal
  6 * in the Software without restriction, including without limitation the rights
  7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  8 * copies of the Software, and to permit persons to whom the Software is
  9 * furnished to do so, subject to the following conditions:
 10 *
 11 * The above copyright notice and this permission notice shall be included in
 12 * all copies or substantial portions of the Software.
 13 *
 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 20 * THE SOFTWARE.
 21 *
 22 * ************************************************************************ */
 23
 24#include "aoclsparse.h"
 25
 26#include <complex>
 27#include <iomanip>
 28#include <iostream>
 29
 30/* Computes triple symmetric product of complex sparse matrices
 31 * C:=opA(A)*B*A, where opA = conjugate transpose
 32*/
 33
 34// Comment this out for single stage computation
 35//#define TWO_STAGE_COMPUTATION
 36
 37int main(void)
 38{
 39    std::cout << "-------------------------------" << std::endl
 40              << "----- SYPR sample program -----" << std::endl
 41              << "-------------------------------" << std::endl
 42              << std::endl;
 43
 44    aoclsparse_status     status;
 45    aoclsparse_int        nnz_C;
 46    aoclsparse_request    request;
 47    aoclsparse_index_base base = aoclsparse_index_base_zero;
 48    aoclsparse_operation  opA  = aoclsparse_operation_conjugate_transpose;
 49
 50    // Print aoclsparse version
 51    std::cout << aoclsparse_get_version() << std::endl;
 52
 53    // Initialise matrix descriptor and csr matrix structure of inputs A and B
 54    aoclsparse_mat_descr descrB;
 55    aoclsparse_matrix    csrA;
 56    aoclsparse_matrix    csrB;
 57
 58    // Create matrix descriptor of input matrices
 59    status = aoclsparse_create_mat_descr(&descrB);
 60    if(status != aoclsparse_status_success)
 61    {
 62        std::cerr << "Error returned from aoclsparse_create_mat_descr, status = " << status << "."
 63                  << std::endl;
 64        return 1;
 65    }
 66
 67    aoclsparse_set_mat_type(descrB, aoclsparse_matrix_type_hermitian);
 68    aoclsparse_set_mat_fill_mode(descrB, aoclsparse_fill_mode_lower);
 69    aoclsparse_set_mat_index_base(descrB, aoclsparse_index_base_zero);
 70
 71    // Matrix sizes
 72    aoclsparse_int m_a = 5, n_a = 5, m_b = 5, n_b = 5;
 73    aoclsparse_int nnz_A = 10, nnz_B = 9;
 74    // Matrix A
 75    aoclsparse_int            row_ptr_A[] = {0, 1, 2, 5, 9, 10};
 76    aoclsparse_int            col_ind_A[] = {0, 0, 1, 2, 4, 0, 1, 2, 3, 4};
 77    aoclsparse_double_complex val_A[]     = {{-0.86, 0.45},
 78                                             {-2.62, -0.44},
 79                                             {-0.87, 0.13},
 80                                             {-0.66, -1.09},
 81                                             {0.05, -2.37},
 82                                             {-1.48, -0.42},
 83                                             {-0.58, -0.70},
 84                                             {0.31, -0.96},
 85                                             {-0.88, -2.37},
 86                                             {-1.23, 0.21}};
 87    status = aoclsparse_create_zcsr(&csrA, base, m_a, n_a, nnz_A, row_ptr_A, col_ind_A, val_A);
 88    if(status != aoclsparse_status_success)
 89    {
 90        std::cerr << "Error returned from aoclsparse_create_zcsr, status = " << status << "."
 91                  << std::endl;
 92        return 2;
 93    }
 94
 95    // Hermitian Matrix B
 96    aoclsparse_int            row_ptr_B[] = {0, 1, 2, 4, 6, 9};
 97    aoclsparse_int            col_ind_B[] = {0, 1, 0, 2, 1, 3, 1, 2, 4};
 98    aoclsparse_double_complex val_B[]     = {{-1.59, 0},
 99                                             {0.46, 0},
100                                             {0.07, -0.51},
101                                             {-1.52, 0},
102                                             {0.21, -1.33},
103                                             {-1.37, 0},
104                                             {1.42, -2.08},
105                                             {-2.26, -1.00},
106                                             {-1.81, 0}};
107
108    status = aoclsparse_create_zcsr(&csrB, base, m_b, n_b, nnz_B, row_ptr_B, col_ind_B, val_B);
109    if(status != aoclsparse_status_success)
110    {
111        std::cerr << "Error returned from aoclsparse_create_zcsr, status = " << status << "."
112                  << std::endl;
113        return 3;
114    }
115
116    aoclsparse_matrix          csrC          = NULL;
117    aoclsparse_int            *csr_row_ptr_C = NULL;
118    aoclsparse_int            *csr_col_ind_C = NULL;
119    aoclsparse_double_complex *csr_val_C     = NULL;
120    aoclsparse_int             C_M, C_N;
121
122    // expected output matrix
123    aoclsparse_int            C_M_exp = 5, C_N_exp = 5, nnz_C_exp = 14;
124    aoclsparse_int            csr_row_ptr_C_exp[] = {0, 5, 9, 12, 13, 14};
125    aoclsparse_int            csr_col_ind_C_exp[] = {0, 1, 2, 3, 4, 1, 2, 3, 4, 2, 4, 3, 3, 4};
126    aoclsparse_double_complex csr_val_C_exp[]     = {{-0.982439, 0.000000},
127                                                     {-3.380974, 2.107664},
128                                                     {-4.156461, -1.960739},
129                                                     {-10.188348, 1.376974},
130                                                     {5.609324, 4.536282},
131                                                     {-2.308344, 0.000000},
132                                                     {-1.331714, -2.631938},
133                                                     {-2.972078, -1.039282},
134                                                     {-1.922892, -1.975280},
135                                                     {-3.862273, 0.000000},
136                                                     {-3.714510, 1.465694},
137                                                     {-2.743288, 2.163915},
138                                                     {-8.756081, -0.000000},
139                                                     {-3.022874, 0.000000}};
140
141#ifdef TWO_STAGE_COMPUTATION
142    std::cout << "Invoking aoclsparse_sypr with aoclsparse_stage_nnz_count...\n";
143    // aoclsparse_stage_nnz_count: Only rowIndex array of the CSR matrix
144    // is computed internally.
145    request = aoclsparse_stage_nnz_count;
146    status  = aoclsparse_sypr(opA, csrA, csrB, descrB, &csrC, request);
147    if(status != aoclsparse_status_success)
148    {
149        std::cerr << "Error returned from aoclsparse_sypr, status = " << status << "." << std::endl;
150        return 4;
151    }
152
153    std::cout << "Invoking aoclsparse_sypr with aoclsparse_stage_finalize...\n";
154    // aoclsparse_stage_finalize: Finalize computation of remaining
155    // output arrays (column indices and values of output matrix entries).
156    // Has to be called only after aoclsparse_sypr call with
157    // aoclsparse_stage_nnz_count parameter.
158    request = aoclsparse_stage_finalize;
159    status  = aoclsparse_sypr(opA, csrA, csrB, descrB, &csrC, request);
160    if(status != aoclsparse_status_success)
161    {
162        std::cerr << "Error returned from aoclsparse_sypr, status = " << status << "." << std::endl;
163        return 5;
164    }
165
166#else // SINGLE STAGE
167    std::cout << "Invoking aoclsparse_sypr with aoclsparse_stage_full_computation...\n";
168    // aoclsparse_stage_full_computation: Whole computation is performed in
169    // a single step.
170    request = aoclsparse_stage_full_computation;
171    status  = aoclsparse_sypr(opA, csrA, csrB, descrB, &csrC, request);
172    if(status != aoclsparse_status_success)
173    {
174        std::cerr << "Error returned from aoclsparse_sypr, status = " << status << "." << std::endl;
175        return 6;
176    }
177
178#endif
179
180    aoclsparse_export_zcsr(
181        csrC, &base, &C_M, &C_N, &nnz_C, &csr_row_ptr_C, &csr_col_ind_C, &csr_val_C);
182    // Check and print the result
183    std::cout << std::fixed;
184    std::cout.precision(1);
185    bool oka, okb, okc, oki, okj, okk, ok = true;
186    std::cout << std::endl
187              << "Output Matrix C: " << std::endl
188              << std::setw(11) << "C_M" << std::setw(3) << "" << std::setw(11) << "expected"
189              << std::setw(2) << "" << std::setw(11) << "C_N" << std::setw(3) << "" << std::setw(11)
190              << "expected" << std::setw(2) << "" << std::setw(11) << "nnz_C" << std::setw(3) << ""
191              << std::setw(11) << "expected" << std::endl;
192    oka = C_M == C_M_exp;
193    ok &= oka;
194    std::cout << std::setw(11) << C_M << std::setw(3) << "" << std::setw(11) << C_M_exp
195              << std::setw(2) << (oka ? "" : " !");
196    okb = C_N == C_N_exp;
197    ok &= okb;
198    std::cout << std::setw(11) << C_N << std::setw(3) << "" << std::setw(11) << C_N_exp
199              << std::setw(2) << (okb ? "" : " !");
200    okc = nnz_C == nnz_C_exp;
201    ok &= okc;
202    std::cout << std::setw(11) << nnz_C << std::setw(3) << "" << std::setw(11) << nnz_C_exp
203              << std::setw(2) << (okc ? "" : " !");
204    std::cout << std::endl;
205    std::cout << std::endl;
206    std::cout << std::setw(14) << "csr_val_C" << std::setw(16) << "expected" << std::setw(14)
207              << "csr_col_ind_C" << std::setw(14) << "expected" << std::setw(14) << "csr_row_ptr_C"
208              << std::setw(14) << "expected" << std::endl;
209    // Initializing precision tolerance range for double
210    const double tol = 1e-06;
211    for(aoclsparse_int i = 0; i < nnz_C; i++)
212    {
213        oki = ((std::abs(csr_val_C[i].real - csr_val_C_exp[i].real) <= tol)
214               && (std::abs(csr_val_C[i].imag - csr_val_C_exp[i].imag) <= tol));
215        ok &= oki;
216        std::cout << "(" << std::setw(5) << csr_val_C[i].real << "," << std::setw(5)
217                  << csr_val_C[i].imag << "i) "
218                  << " (" << std::setw(5) << csr_val_C_exp[i].real << "," << std::setw(5)
219                  << csr_val_C_exp[i].imag << "i) " << std::setw(2) << (oki ? "" : " !");
220        okj = csr_col_ind_C[i] == csr_col_ind_C_exp[i];
221        ok &= okj;
222        std::cout << std::setw(11) << csr_col_ind_C[i] << std::setw(3) << "" << std::setw(11)
223                  << csr_col_ind_C_exp[i] << std::setw(2) << (okj ? "" : " !");
224        if(i <= C_M)
225        {
226            okk = csr_row_ptr_C[i] == csr_row_ptr_C_exp[i];
227            ok &= okk;
228            std::cout << " " << std::setw(11) << csr_row_ptr_C[i] << std::setw(3) << ""
229                      << std::setw(11) << csr_row_ptr_C_exp[i] << std::setw(2) << (okk ? "" : " !");
230        }
231        std::cout << std::endl;
232    }
233
234    aoclsparse_destroy_mat_descr(descrB);
235    aoclsparse_destroy(&csrA);
236    aoclsparse_destroy(&csrB);
237    aoclsparse_destroy(&csrC);
238    return (ok ? 0 : 7);
239}

Note

aoclsparse_sypr supports only matrices in CSR format which have sorted column indices in each row. If the matrices are unsorted, you might want to call aoclsparse_order_mat().

Note

Currently, opA = aoclsparse_operation_transpose is supported only for real data types.

Parameters
Return values
  • aoclsparse_status_success – the operation completed successfully.

  • aoclsparse_status_invalid_pointerdescrB, A, B or C is invalid.

  • aoclsparse_status_invalid_size – Matrix dimensions do not match A or B is not square.

  • aoclsparse_status_invalid_value – Input parameters are invalid, for example, descrB does not match B indexing or B is not symmetric/Hermitian, C has been modified between stages or opA or request is not recognized.

  • aoclsparse_status_wrong_typeA and B matrix data types do not match.

  • aoclsparse_status_not_implemented – Input matrix A or B is not in CSR format.

  • aoclsparse_status_unsorted_input – Input matrices are not sorted.

  • aoclsparse_status_memory_error – Memory allocation failure.

aoclsparse_?syprd()#

aoclsparse_status aoclsparse_ssyprd(const aoclsparse_operation op, const aoclsparse_matrix A, const float *B, const aoclsparse_order orderB, const aoclsparse_int ldb, const float alpha, const float beta, float *C, const aoclsparse_order orderC, const aoclsparse_int ldc)#
aoclsparse_status aoclsparse_dsyprd(const aoclsparse_operation op, const aoclsparse_matrix A, const double *B, const aoclsparse_order orderB, const aoclsparse_int ldb, const double alpha, const double beta, double *C, const aoclsparse_order orderC, const aoclsparse_int ldc)#
aoclsparse_status aoclsparse_csyprd(const aoclsparse_operation op, const aoclsparse_matrix A, const aoclsparse_float_complex *B, const aoclsparse_order orderB, const aoclsparse_int ldb, const aoclsparse_float_complex alpha, const aoclsparse_float_complex beta, aoclsparse_float_complex *C, const aoclsparse_order orderC, const aoclsparse_int ldc)#
aoclsparse_status aoclsparse_zsyprd(const aoclsparse_operation op, const aoclsparse_matrix A, const aoclsparse_double_complex *B, const aoclsparse_order orderB, const aoclsparse_int ldb, const aoclsparse_double_complex alpha, const aoclsparse_double_complex beta, aoclsparse_double_complex *C, const aoclsparse_order orderC, const aoclsparse_int ldc)#

Performs symmetric triple product of a sparse matrix and a dense matrix and stores the output as a dense matrix.

aoclsparse_?syprd performs product of a scalar \(\alpha\), with the symmetric triple product of a sparse \(m \times k\) matrix \(A\), defined in CSR format, with a \(k \times k\) symmetric dense (or Hermitian) matrix \(B\), and a \(k \times m\) \(op(A)\). Adds the resulting matrix to \(m \times m\) symmetric dense (or Hermitian) matrix \(C\) that is multiplied by a scalar \(\beta\), such that

\[ C := \alpha \cdot A \cdot B \cdot op(A) + \beta \cdot C \]
if \(op\) is aoclsparse_operation_none.

Otherwise,

\[ C := \alpha \cdot op(A) \cdot B \cdot A + \beta \cdot C \]

\[\begin{split} op(A) = \left\{ \begin{array}{ll} A^T, & \text{if } {\bf\mathsf{op}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \text{ (real matrices)}\\ A^H, & \text{if } {\bf\mathsf{op}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \text{ (complex matrices)} \end{array} \right. \end{split}\]

Notes

1. This routine assumes the dense matrices (B and C) are stored in full although the computations happen on the upper triangular portion of the matrices.

2. aoclsparse_operation_transpose is only supported for real matrices.

3. aoclsparse_operation_conjugate_transpose is only supported for complex matrices.

4. Complex dense matrices are assumed to be Hermitian matrices.

Parameters
  • op[in] Matrix \(A\) operation type.

  • A[in] Sparse CSR matrix \(A\) structure.

  • B[in] Array of dimension \(ldb \times ldb\). Only the upper triangular matrix is used for computation.

  • orderB[in] aoclsparse_order_row or aoclsparse_order_column for dense matrix B.

  • ldb[in] Leading dimension of \(B\), must be at least \(\max{(1, k)}\) ( \(op(A) = A\)) or \(\max{(1, m)}\) ( \(op(A) = A^T\) or \(op(A) = A^H\)).

  • alpha[in] Scalar \(\alpha\).

  • beta[in] Scalar \(\beta\).

  • C[inout] Array of dimension \(ldc \times ldc\). Only upper triangular part of the matrix is processed.

  • orderC[in] aoclsparse_order_row or aoclsparse_order_column for dense matrix C.

  • ldc[in] Leading dimension of \(C\), must be at least \(\max{(1, m)}\) ( \(op(A) = A\)) or \(\max{(1, k)}\) ( \(op(A) = A^T\) or \(op(A) = A^H\)).

Return values
  • aoclsparse_status_success – The operation completed successfully.

  • aoclsparse_invalid_operation – The operation is invalid if the matrix B and C has a different layout ordering.

  • aoclsparse_status_wrong_type – The data type of the matrices are not matching or invalid.

  • aoclsparse_status_invalid_size – The value of m, k, nnz, ldb or ldc is invalid.

  • aoclsparse_status_invalid_pointer – The pointer A, B, or C is invalid.

  • aoclsparse_status_not_implemented – The values of orderB and orderC are different.

Miscellaneous#

aoclsparse_ilu_?smoother()#

aoclsparse_status aoclsparse_silu_smoother(aoclsparse_operation op, aoclsparse_matrix A, const aoclsparse_mat_descr descr, float **precond_csr_val, const float *approx_inv_diag, float *x, const float *b)#
aoclsparse_status aoclsparse_dilu_smoother(aoclsparse_operation op, aoclsparse_matrix A, const aoclsparse_mat_descr descr, double **precond_csr_val, const double *approx_inv_diag, double *x, const double *b)#

Incomplete LU factorization with zero fill-in, ILU(0).

Performs incomplete LU factorization with zero fill-in on symmetric sparse matrix A of size \(n \times n\). It also performs a solve for x in

\[ L U x = b, \qquad \text{where} \qquad LU\approx A.\]
Matrix A should be numerically of full rank. Currently single and double precision datatypes are supported.

  1/* ************************************************************************
  2 * Copyright (c) 2022-2023 Advanced Micro Devices, Inc.
  3 *
  4 * Permission is hereby granted, free of charge, to any person obtaining a copy
  5 * of this software and associated documentation files (the "Software"), to deal
  6 * in the Software without restriction, including without limitation the rights
  7 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  8 * copies of the Software, and to permit persons to whom the Software is
  9 * furnished to do so, subject to the following conditions:
 10 *
 11 * The above copyright notice and this permission notice shall be included in
 12 * all copies or substantial portions of the Software.
 13 *
 14 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 15 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 16 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 17 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 18 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 19 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 20 * THE SOFTWARE.
 21 *
 22 * ************************************************************************ */
 23
 24#include "aoclsparse.h"
 25
 26#include <cfloat>
 27#include <iomanip>
 28#include <iostream>
 29#include <math.h>
 30#include <vector>
 31
 32#define PREMATURE_STOP_TOLERANCE 1e-4
 33const char gmres_sol[] = "gmres_direct_d";
 34
 35double calculate_l2Norm_solvers(const double *xSol, const double *x, aoclsparse_int n)
 36{
 37    double norm = 0.0f;
 38    //residual = xSol - x
 39    //xSol obtained by initial spmv
 40    //x obtained by ILU-GMRES iterations
 41    for(aoclsparse_int i = 0; i < n; i++)
 42    {
 43        double a = xSol[i] - x[i];
 44        norm += a * a;
 45    }
 46    norm = sqrt(norm);
 47    return norm;
 48}
 49
 50aoclsparse_int monit(aoclsparse_int n,
 51                     const double  *x,
 52                     const double  *r __attribute__((unused)),
 53                     double        *rinfo,
 54                     void          *udata __attribute__((unused)))
 55{
 56    int    it  = (int)rinfo[30];
 57    double tol = PREMATURE_STOP_TOLERANCE;
 58
 59    std::ios oldState(nullptr);
 60    oldState.copyfmt(std::cout);
 61
 62    std::ios_base::fmtflags fmt = std::cout.flags();
 63    fmt |= std::ios_base::scientific | std::ios_base::right | std::ios_base::showpos;
 64
 65    if(!(it % 10))
 66    {
 67        std::cout << std::setw(5) << std::right << " iter"
 68                  << " " << std::setw(16) << std::right << "optim";
 69        for(int i = 0; i < n; i++)
 70            std::cout << std::setw(8) << std::right << "x[" << i << "]";
 71        std::cout << std::endl;
 72    }
 73    std::cout << std::setw(5) << std::right << (int)rinfo[30] << " " << std::setw(16) << std::right
 74              << std::scientific << std::setprecision(8) << rinfo[0];
 75    std::cout << std::setprecision(2) << std::showpos;
 76    for(int i = 0; i < n; i++)
 77        std::cout << " " << x[i];
 78    std::cout << std::endl;
 79    std::cout << std::resetiosflags(fmt);
 80
 81    //reset std::cout state
 82    std::cout.copyfmt(oldState);
 83
 84    if(rinfo[0] < tol) // check for premature stop
 85    {
 86        return 1; // request to interrupt
 87    }
 88    return 0;
 89}
 90
 91int main()
 92{
 93    std::vector<aoclsparse_int> csr_row_ptr;
 94    std::vector<aoclsparse_int> csr_col_ind;
 95    std::vector<double>         csr_val;
 96
 97    int               m, n, nnz;
 98    aoclsparse_status exit_status = aoclsparse_status_success;
 99
100    std::string filename = "cage4.mtx";
101    //symmetry = 0; //real unsymmetric, symmetry = 0%, spd = no
102    //https://www.cise.ufl.edu/research/sparse/MM/vanHeukelum/cage4.tar.gz
103    n   = 9;
104    m   = 9;
105    nnz = 49;
106    csr_row_ptr.assign({0, 5, 10, 15, 20, 26, 32, 38, 44, 49});
107    csr_col_ind.assign({0, 1, 3, 4, 7, 0, 1, 2, 4, 5, 1, 2, 3, 5, 6, 0, 2, 3, 6, 7, 0, 1, 4, 5, 6,
108                        8, 1, 2, 4, 5, 7, 8, 2, 3, 4, 6, 7, 8, 0, 3, 5, 6, 7, 8, 4, 5, 6, 7, 8});
109    csr_val.assign({0.75, 0.14, 0.11, 0.14, 0.11, 0.08, 0.69, 0.11, 0.08, 0.11, 0.09, 0.67, 0.08,
110                    0.09, 0.08, 0.09, 0.14, 0.73, 0.14, 0.09, 0.04, 0.04, 0.54, 0.14, 0.11, 0.25,
111                    0.05, 0.05, 0.08, 0.45, 0.08, 0.15, 0.04, 0.04, 0.09, 0.47, 0.09, 0.18, 0.05,
112                    0.05, 0.14, 0.11, 0.55, 0.25, 0.08, 0.08, 0.09, 0.08, 0.17});
113
114    // Create aocl sparse matrix
115    aoclsparse_matrix     A;
116    aoclsparse_index_base base = aoclsparse_index_base_zero;
117    aoclsparse_mat_descr  descr_a;
118    aoclsparse_operation  trans = aoclsparse_operation_none;
119    aoclsparse_create_dcsr(&A,
120                           base,
121                           (aoclsparse_int)n,
122                           (aoclsparse_int)n,
123                           (aoclsparse_int)nnz,
124                           csr_row_ptr.data(),
125                           csr_col_ind.data(),
126                           csr_val.data());
127    aoclsparse_create_mat_descr(&descr_a);
128    aoclsparse_set_mat_type(descr_a, aoclsparse_matrix_type_symmetric);
129    aoclsparse_set_mat_fill_mode(descr_a, aoclsparse_fill_mode_lower);
130    aoclsparse_set_mat_index_base(descr_a, base);
131    aoclsparse_set_sv_hint(A, trans, descr_a, 100);
132    aoclsparse_optimize(A);
133
134    // Initialize initial point x0 and right hand side b
135    double *expected_sol = NULL;
136    double *x            = NULL;
137    double *b            = NULL;
138    double  norm         = 0.0;
139    double  rinfo[100];
140    double  alpha = 1.0, beta = 0.;
141    int     rs_iters = 7;
142    char    rs_iters_string[16];
143
144    expected_sol = new double[n];
145    x            = new double[n];
146    b            = new double[n];
147
148    double init_x = 1.0, ref_x = 0.5;
149    for(int i = 0; i < n; i++)
150    {
151        expected_sol[i] = ref_x;
152        x[i]            = init_x;
153    }
154
155    aoclsparse_dmv(trans, &alpha, A, descr_a, expected_sol, &beta, b);
156
157    double                  tol    = PREMATURE_STOP_TOLERANCE;
158    aoclsparse_itsol_handle handle = NULL;
159    // create GMRES handle
160    aoclsparse_itsol_d_init(&handle);
161
162    exit_status = aoclsparse_itsol_option_set(handle, "iterative method", "GMRES");
163    if(exit_status != aoclsparse_status_success)
164        printf("Warning an iterative method option could not be set, exit status = %d\n",
165               exit_status);
166
167    exit_status = aoclsparse_itsol_option_set(handle, "gmres preconditioner", "ILU0");
168    if(exit_status != aoclsparse_status_success)
169        printf("Warning gmres preconditioner option could not be set, exit status = %d\n",
170               exit_status);
171
172    exit_status = aoclsparse_itsol_option_set(handle, "gmres iteration limit", "50");
173    if(exit_status != aoclsparse_status_success)
174        printf("Warning gmres iteration limit option could not be set, exit status = %d\n",
175               exit_status);
176
177    sprintf(rs_iters_string, "%d", (int)rs_iters);
178    exit_status = aoclsparse_itsol_option_set(handle, "gmres restart iterations", rs_iters_string);
179    if(exit_status != aoclsparse_status_success)
180        printf("Warning gmres restart iterations option could not be set, exit status = %d\n",
181               exit_status);
182
183    aoclsparse_itsol_handle_prn_options(handle);
184
185    // Call GMRES solver
186    aoclsparse_status status;
187
188    status = aoclsparse_itsol_d_solve(handle, n, A, descr_a, b, x, rinfo, NULL, monit, &n);
189    if(status == aoclsparse_status_success || status == aoclsparse_status_user_stop
190       || aoclsparse_status_maxit == status)
191    {
192        norm = calculate_l2Norm_solvers(expected_sol, x, n);
193        printf("input = %s\n", filename.c_str());
194        printf("solver = %s\n", gmres_sol);
195        printf("no of rows = %d\n", (int)m);
196        printf("no of cols = %d\n", (int)n);
197        printf("no of nnz = %d\n", (int)nnz);
198        printf("monitoring tolerance = %e\n", tol);
199        printf("restart iterations = %d\n", (int)rs_iters);
200        printf("residual achieved = %e\n", rinfo[0]);
201        printf("total iterations = %d\n", (int)rinfo[30]);
202        printf("l2 Norm = %e\n", norm);
203    }
204    else
205    {
206        std::cout << "Something unexpected happened!, Status = " << status << std::endl;
207    }
208
209    delete[] expected_sol;
210    delete[] x;
211    delete[] b;
212    aoclsparse_itsol_destroy(&handle);
213    aoclsparse_destroy_mat_descr(descr_a);
214    aoclsparse_destroy(&A);
215    printf("\n");
216    fflush(stdout);
217
218    return 0;
219}
Parameters
  • op[in] matrix A operation type. Transpose not supported in this release.

  • A[in] sparse symmetric matrix handle. Currently ILU functionality is supported only for CSR matrix format.

  • descr[in] descriptor of the sparse matrix handle A. Currently, only aoclsparse_matrix_type_symmetric is supported.

  • precond_csr_val[out] pointer that contains L and U factors after ILU factorization operation. A is not overwritten with the factors.

  • approx_inv_diag[in] Reserved for future use.

  • x[out] array of n elements containing the solution to solving aproximatly \(Ax=b\).

  • b[in] Right-hand-side of the linear system of equations \(Ax = b\).

Return values
  • aoclsparse_status_success – the operation completed successfully.

  • aoclsparse_status_invalid_size – input parameters contain an invalid value.

  • aoclsparse_status_invalid_pointerdescr, A is invalid.

  • aoclsparse_status_not_implementedaoclsparse_matrix_type is not aoclsparse_matrix_type_symmetric or input matrix A is not in CSR format