-
aoclsparse_status aoclsparse_sp2m(aoclsparse_operation opA, const aoclsparse_mat_descr descrA, const aoclsparse_matrix A, aoclsparse_operation opB, const aoclsparse_mat_descr descrB, const aoclsparse_matrix B, const aoclsparse_request request, aoclsparse_matrix *C)
Sparse matrix-sparse matrix multiplication for real and complex data types.
aoclsparse_sp2m multiplies two sparse matrices in CSR storage format. The result is stored in a newly allocated sparse matrix in CSR format, such that
\[ C = op(A) \, op(B), \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if } {\bf\mathsf{opA}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ A^T, & \text{if } {\bf\mathsf{opA}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ A^H, & \text{if } {\bf\mathsf{opA}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]
and
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if } {\bf\mathsf{opB}} = \text{aoclsparse}\_\text{operation}\_\text{none} \\ B^T, & \text{if } {\bf\mathsf{opB}} = \text{aoclsparse}\_\text{operation}\_\text{transpose} \\ B^H, & \text{if } {\bf\mathsf{opB}} = \text{aoclsparse}\_\text{operation}\_\text{conjugate}\_\text{transpose} \end{array} \right. \end{split}\]
where \(A\) is an \(m \times k\) matrix and \(B\) is a \(k \times n\) matrix, resulting in an \(m \times n\) matrix \(C\), when opA and opB are both aoclsparse_operation_none. \(A\) is a \(k \times m\) matrix when opA is aoclsparse_operation_transpose or aoclsparse_operation_conjugate_transpose, and \(B\) is an \(n \times k\) matrix when opB is aoclsparse_operation_transpose or aoclsparse_operation_conjugate_transpose.
aoclsparse_sp2m can be run as a single-stage or a two-stage algorithm. The single-stage algorithm allocates and computes the entire output matrix in one call with aoclsparse_stage_full_computation. In the two-stage algorithm, the first stage, aoclsparse_stage_nnz_count, allocates memory for the output matrix and computes its number of entries. The second stage, aoclsparse_stage_finalize, computes the column indices of the non-zero elements and the values of the output matrix. The second stage can be invoked only after the first stage. However, it can be invoked multiple times consecutively when the sparsity structure of the input matrices remains unchanged and only the values are updated.
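To make the split between the two stages concrete, the following is a minimal, library-independent sketch of a two-stage CSR product. It is not the AOCL-Sparse implementation: the Csr struct and the functions sp2m_nnz_count and sp2m_finalize are hypothetical names introduced only for illustration, real values and op = none are assumed, and the inputs are assumed to be zero-based CSR.

```cpp
#include <cassert>
#include <vector>

// Hypothetical CSR container for this illustration (not an AOCL-Sparse type).
struct Csr
{
    int                 rows;
    int                 cols;
    std::vector<int>    row_ptr; // size rows + 1, zero-based
    std::vector<int>    col_ind; // size nnz
    std::vector<double> val;     // size nnz
};

// Stage 1 (plays the role of aoclsparse_stage_nnz_count in this sketch):
// fill row_ptr of C and allocate col_ind/val, without computing them.
Csr sp2m_nnz_count(const Csr &A, const Csr &B)
{
    assert(A.cols == B.rows); // inner dimensions must agree
    Csr C;
    C.rows = A.rows;
    C.cols = B.cols;
    C.row_ptr.assign(A.rows + 1, 0);
    std::vector<int> marker(B.cols, -1); // last row of C that touched column j
    for(int i = 0; i < A.rows; ++i)
    {
        int count = 0;
        for(int p = A.row_ptr[i]; p < A.row_ptr[i + 1]; ++p)
        {
            const int k = A.col_ind[p];
            for(int q = B.row_ptr[k]; q < B.row_ptr[k + 1]; ++q)
            {
                const int j = B.col_ind[q];
                if(marker[j] != i)
                {
                    marker[j] = i;
                    ++count;
                }
            }
        }
        C.row_ptr[i + 1] = C.row_ptr[i] + count;
    }
    C.col_ind.resize(C.row_ptr[A.rows]);
    C.val.resize(C.row_ptr[A.rows]);
    return C;
}

// Stage 2 (plays the role of aoclsparse_stage_finalize in this sketch):
// fill col_ind and val. Safe to repeat while the sparsity pattern is unchanged.
void sp2m_finalize(const Csr &A, const Csr &B, Csr &C)
{
    std::vector<int> pos(B.cols, -1); // slot in C holding column j of the current row
    for(int i = 0; i < A.rows; ++i)
    {
        const int start = C.row_ptr[i];
        int       next  = start;
        for(int p = A.row_ptr[i]; p < A.row_ptr[i + 1]; ++p)
        {
            const int    k = A.col_ind[p];
            const double a = A.val[p];
            for(int q = B.row_ptr[k]; q < B.row_ptr[k + 1]; ++q)
            {
                const int j = B.col_ind[q];
                if(pos[j] < start) // column j not yet seen in row i
                {
                    pos[j]          = next;
                    C.col_ind[next] = j;
                    C.val[next]     = a * B.val[q];
                    ++next;
                }
                else
                {
                    C.val[pos[j]] += a * B.val[q];
                }
            }
        }
    }
}
```

In this sketch, calling sp2m_nnz_count once and then sp2m_finalize after each update of the input values mirrors the reuse pattern described above. The column indices within a row come out in discovery order, which need not be sorted.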
/* ************************************************************************
 * Copyright (c) 2023 Advanced Micro Devices, Inc. All rights reserved.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 * THE SOFTWARE.
 *
 * ************************************************************************ */

#include "aoclsparse.h"

#include <complex>
#include <iomanip>
#include <iostream>

/* Computes multiplication of 2 sparse matrices A and B in CSR format.
 */

// Comment this out for single stage computation
//#define TWO_STAGE_COMPUTATION

int main(void)
{
    std::cout << "-------------------------------" << std::endl
              << "----- SP2M sample program -----" << std::endl
              << "-------------------------------" << std::endl
              << std::endl;

    aoclsparse_status     status;
    aoclsparse_int        nnz_C;
    aoclsparse_request    request;
    aoclsparse_index_base base = aoclsparse_index_base_zero;
    aoclsparse_operation  opA  = aoclsparse_operation_transpose;
    aoclsparse_operation  opB  = aoclsparse_operation_none;

    // Print aoclsparse version
    std::cout << aoclsparse_get_version() << std::endl;

    // Initialise matrix descriptor and csr matrix structure of inputs A and B
    aoclsparse_mat_descr descrA;
    aoclsparse_mat_descr descrB;
    aoclsparse_matrix    csrA;
    aoclsparse_matrix    csrB;

    // Create matrix descriptor of input matrices
    // aoclsparse_create_mat_descr set aoclsparse_matrix_type to aoclsparse_matrix_type_general
    // and aoclsparse_index_base to aoclsparse_index_base_zero.
    aoclsparse_create_mat_descr(&descrA);
    aoclsparse_create_mat_descr(&descrB);

    // Matrix sizes
    aoclsparse_int m = 5, n = 5, k = 5;
    aoclsparse_int nnz_A = 10, nnz_B = 10;
    // Matrix A
    aoclsparse_int            row_ptr_A[] = {0, 1, 2, 5, 9, 10};
    aoclsparse_int            col_ind_A[] = {0, 0, 1, 2, 4, 0, 1, 2, 3, 4};
    aoclsparse_double_complex val_A[]     = {{-0.86238, 0.454626},
                                             {-2.62138, -0.442597},
                                             {-0.875679, 0.137933},
                                             {-0.661939, -1.09106},
                                             {0.0501717, -2.37527},
                                             {-1.48812, -0.420546},
                                             {-0.588085, -0.708977},
                                             {0.310933, -0.96569},
                                             {-0.88964, -2.37881},
                                             {-1.23201, 0.213152}};
    aoclsparse_create_zcsr(
        &csrA, base, k, m, nnz_A, row_ptr_A, col_ind_A, (aoclsparse_double_complex *)val_A);

    // Matrix B
    aoclsparse_int            row_ptr_B[] = {0, 4, 4, 7, 8, 10};
    aoclsparse_int            col_ind_B[] = {0, 1, 2, 4, 0, 1, 2, 2, 2, 3};
    aoclsparse_double_complex val_B[]     = {{-1.59204, -0.259325},
                                             {0.467532, -0.980612},
                                             {0.078412, -0.513591},
                                             {-1.52364, 0.403911},
                                             {0.211966, -1.33485},
                                             {-1.37901, -1.44562},
                                             {1.42472, -2.08662},
                                             {-2.26549, -1.0073},
                                             {-1.75098, 0.207783},
                                             {-1.8152, 0.482205}};

    aoclsparse_create_zcsr(
        &csrB, base, k, n, nnz_B, row_ptr_B, col_ind_B, (aoclsparse_double_complex *)val_B);

    aoclsparse_matrix          csrC          = NULL;
    aoclsparse_int            *csr_row_ptr_C = NULL;
    aoclsparse_int            *csr_col_ind_C = NULL;
    aoclsparse_double_complex *csr_val_C     = NULL;
    aoclsparse_int             C_M, C_N;

    // expected output matrices
    aoclsparse_int C_M_exp = 5, C_N_exp = 5, nnz_C_exp = 15;
    aoclsparse_int csr_row_ptr_C_exp[] = {0, 4, 7, 10, 11, 15};
    aoclsparse_int csr_col_ind_C_exp[] = {0, 1, 2, 4, 0, 1, 2, 0, 1, 2, 2, 0, 1, 2, 3};
    aoclsparse_double_complex csr_val_C_exp[] = {{1.49084, -0.500145},
                                                 {0.0426217, 1.05821},
                                                 {3.11358, 2.93029},
                                                 {1.13033, -1.04101},
                                                 {-0.0014949, 1.19813},
                                                 {1.40697, 1.07569},
                                                 {-0.34163, 4.22229},
                                                 {-1.59671, 0.652319},
                                                 {-0.664445, 2.4615},
                                                 {-4.89686, 1.70132},
                                                 {-0.380707, 6.28532},
                                                 {-3.15998, -0.570448},
                                                 {-3.50293, 3.20298},
                                                 {-2.77187, -4.11799},
                                                 {2.13356, -0.980994}};

#ifdef TWO_STAGE_COMPUTATION
    std::cout << "Invoking aoclsparse_sp2m with aoclsparse_stage_nnz_count..\n";
    // aoclsparse_stage_nnz_count : Only rowIndex array of the CSR matrix
    // is computed internally.
    request = aoclsparse_stage_nnz_count;
    status  = aoclsparse_sp2m(opA, descrA, csrA, opB, descrB, csrB, request, &csrC);
    if(status != aoclsparse_status_success)
        return 1;

    std::cout << "Invoking aoclsparse_sp2m with aoclsparse_stage_finalize..\n";
    // aoclsparse_stage_finalize : Finalize computation of remaining
    // output arrays ( column indices and values of output matrix entries) .
    // Has to be called only after aoclsparse_sp2m call with
    // aoclsparse_stage_nnz_count parameter.
    request = aoclsparse_stage_finalize;
    status  = aoclsparse_sp2m(opA, descrA, csrA, opB, descrB, csrB, request, &csrC);
    if(status != aoclsparse_status_success)
        return 2;

#else // SINGLE STAGE
    std::cout << "Invoking aoclsparse_sp2m with aoclsparse_stage_full_computation..\n";
    // aoclsparse_stage_full_computation : Whole computation is performed in
    // single step.
    request = aoclsparse_stage_full_computation;
    status  = aoclsparse_sp2m(opA, descrA, csrA, opB, descrB, csrB, request, &csrC);
    if(status != aoclsparse_status_success)
        return 3;

#endif

    aoclsparse_export_zcsr(
        csrC, &base, &C_M, &C_N, &nnz_C, &csr_row_ptr_C, &csr_col_ind_C, &csr_val_C);
    // Check and print the result
    std::cout << std::fixed;
    std::cout.precision(1);
    bool oka, okb, okc, oki, okj, okk, ok = true;
    std::cout << std::endl
              << "Output Matrix C: " << std::endl
              << std::setw(11) << "C_M" << std::setw(3) << "" << std::setw(11) << "expected"
              << std::setw(3) << "" << std::setw(11) << "C_N" << std::setw(3) << "" << std::setw(11)
              << "expected" << std::setw(3) << "" << std::setw(11) << "nnz_C" << std::setw(3) << ""
              << std::setw(11) << "expected" << std::endl;
    oka = C_M == C_M_exp;
    ok &= oka;
    std::cout << std::setw(11) << C_M << std::setw(3) << "" << std::setw(11) << C_M_exp
              << std::setw(2) << (oka ? "" : " !");
    okb = C_N == C_N_exp;
    ok &= okb;
    std::cout << std::setw(11) << C_N << std::setw(3) << "" << std::setw(11) << C_N_exp
              << std::setw(2) << (okb ? "" : " !");
    okc = nnz_C == nnz_C_exp;
    ok &= okc;
    std::cout << std::setw(11) << nnz_C << std::setw(3) << "" << std::setw(11) << nnz_C_exp
              << std::setw(2) << (okc ? "" : " !");
    std::cout << std::endl;
    std::cout << std::endl;
    std::cout << std::setw(11) << "csr_val_C" << std::setw(3) << "" << std::setw(11) << "expected"
              << std::setw(3) << "" << std::setw(11) << "csr_col_ind_C" << std::setw(3) << ""
              << std::setw(11) << "expected" << std::setw(3) << "" << std::setw(11)
              << "csr_row_ptr_C" << std::setw(3) << "" << std::setw(11) << "expected" << std::endl;
    //Initializing precision tolerance range for double
    const double tol = 1e-03;
    for(aoclsparse_int i = 0; i < nnz_C; i++)
    {
        oki = ((std::abs(csr_val_C[i].real - csr_val_C_exp[i].real) <= tol)
               && (std::abs(csr_val_C[i].imag - csr_val_C_exp[i].imag) <= tol));
        ok &= oki;
        std::cout << std::setw(11) << "(" << csr_val_C[i].real << ", " << csr_val_C[i].imag << "i) "
                  << std::setw(3) << "" << std::setw(11) << "(" << csr_val_C_exp[i].real << ", "
                  << csr_val_C_exp[i].imag << "i) " << std::setw(2) << (oki ? "" : " !");
        okj = csr_col_ind_C[i] == csr_col_ind_C_exp[i];
        ok &= okj;
        std::cout << std::setw(11) << csr_col_ind_C[i] << std::setw(3) << "" << std::setw(11)
                  << csr_col_ind_C_exp[i] << std::setw(2) << (okj ? "" : " !");
        if(i < C_M)
        {
            okk = csr_row_ptr_C[i] == csr_row_ptr_C_exp[i];
            ok &= okk;
            std::cout << " " << std::setw(11) << csr_row_ptr_C[i] << std::setw(3) << ""
                      << std::setw(11) << csr_row_ptr_C_exp[i] << std::setw(2) << (okk ? "" : " !");
        }
        std::cout << std::endl;
    }

    aoclsparse_destroy_mat_descr(descrA);
    aoclsparse_destroy_mat_descr(descrB);
    aoclsparse_destroy(&csrA);
    aoclsparse_destroy(&csrB);
    aoclsparse_destroy(&csrC);
    return (ok ? 0 : 6);
}
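The sample stores A with k rows and m columns because opA is aoclsparse_operation_transpose, so op(A) is m by k. The dimension rules from the description can be sketched as a small stand-alone check. The names Op, Shape, op_shape, and product_shape are hypothetical and are not part of the AOCL-Sparse API; Op merely stands in for aoclsparse_operation.

```cpp
#include <cassert>

// Hypothetical stand-in for aoclsparse_operation, used only in this sketch.
enum class Op
{
    none,
    transpose,
    conjugate_transpose
};

// Stored dimensions of a matrix (rows x cols).
struct Shape
{
    int rows;
    int cols;
};

// Shape of op(X): unchanged for Op::none, swapped for both transpose variants.
Shape op_shape(Op op, Shape s)
{
    assert(s.rows >= 0 && s.cols >= 0);
    return (op == Op::none) ? s : Shape{s.cols, s.rows};
}

// Returns true and sets C to the shape of op(A) * op(B) when the inner
// dimensions agree; returns false and leaves C untouched otherwise.
bool product_shape(Op opA, Shape A, Op opB, Shape B, Shape &C)
{
    const Shape a = op_shape(opA, A);
    const Shape b = op_shape(opB, B);
    if(a.cols != b.rows)
        return false;
    C = Shape{a.rows, b.cols};
    return true;
}
```

For example, a matrix stored as 3 by 2 and used with Op::transpose multiplies a 3 by 4 matrix used with Op::none to give a 2 by 4 result, whereas the same pair with Op::none on both sides is rejected.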
- Parameters:
opA – [in] matrix \(A\) operation type.
descrA – [in] descriptor of the sparse CSR matrix \(A\). Currently, only aoclsparse_matrix_type_general is supported.
A – [in] sparse CSR matrix \(A\).
opB – [in] matrix \(B\) operation type.
descrB – [in] descriptor of the sparse CSR matrix \(B\). Currently, only aoclsparse_matrix_type_general is supported.
B – [in] sparse CSR matrix \(B\).
request – [in] Specifies full computation or one stage of the two-stage algorithm:
  aoclsparse_stage_nnz_count – only the rowIndex array of the CSR matrix is computed internally. The output sparse CSR matrix can be extracted to measure the memory required for the full operation.
  aoclsparse_stage_finalize – finalizes computation of the remaining output arrays (column indices and values of the output matrix entries). Must be called only after an aoclsparse_sp2m call with the aoclsparse_stage_nnz_count parameter.
  aoclsparse_stage_full_computation – performs the entire computation in a single step.
*C – [out] Pointer to sparse CSR matrix \(C\). The arrays of matrix \(C\) always use zero-based indexing, irrespective of whether matrix \(A\) or matrix \(B\) uses one-based or zero-based indexing. The column indices of the output matrix in CSR format can appear unsorted.
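Because the column indices of \(C\) can appear unsorted, code that needs ordered rows (for example, an element-by-element comparison against a reference) may want to sort each row first. The following is a minimal sketch that works on copies of the CSR arrays held in std::vector. The function name sort_csr_rows is hypothetical and not part of AOCL-Sparse, and zero-based indexing is assumed, which matches the output of aoclsparse_sp2m.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Sorts the column indices of every CSR row in ascending order and applies
// the same permutation to the values. T can be any copyable value type.
template <typename T>
void sort_csr_rows(const std::vector<int> &row_ptr, std::vector<int> &col_ind, std::vector<T> &val)
{
    assert(!row_ptr.empty() && col_ind.size() == val.size());
    std::vector<int> perm, cols;
    std::vector<T>   vals;
    for(std::size_t i = 0; i + 1 < row_ptr.size(); ++i)
    {
        const int begin = row_ptr[i];
        const int len   = row_ptr[i + 1] - begin;

        // Permutation that orders this row by column index.
        perm.resize(len);
        std::iota(perm.begin(), perm.end(), 0);
        std::sort(perm.begin(), perm.end(), [&](int a, int b) {
            return col_ind[begin + a] < col_ind[begin + b];
        });

        // Apply the permutation through temporary copies of the row.
        cols.assign(col_ind.begin() + begin, col_ind.begin() + begin + len);
        vals.assign(val.begin() + begin, val.begin() + begin + len);
        for(int p = 0; p < len; ++p)
        {
            col_ind[begin + p] = cols[perm[p]];
            val[begin + p]     = vals[perm[p]];
        }
    }
}
```

Copying the exported arrays before sorting keeps the data owned by the library untouched, which is the conservative choice when it is unclear whether the exported pointers may be modified.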
- Return values:
aoclsparse_status_success – the operation completed successfully.
aoclsparse_status_invalid_pointer – descrA, descrB, A, B, or C is invalid.
aoclsparse_status_invalid_size – input size parameters contain an invalid value.
aoclsparse_status_invalid_value – input parameters contain an invalid value.
aoclsparse_status_wrong_type – A and B matrix datatypes do not match.
aoclsparse_status_memory_error – Memory allocation failure.
aoclsparse_status_not_implemented – aoclsparse_matrix_type is not aoclsparse_matrix_type_general, or input matrix A or B is not in CSR format.