The following table gives benchmark results for the Matrix Multiply function across a range of supported parameter configurations. The parameters are defined in Matrix Multiply Configuration Parameters.
Library Element | AIE_VARIANT | TT_DATA_A | TT_DATA_B | TP_DIM_A | TP_DIM_AB | TP_DIM_B | TP_ADD_TILING_A | TP_ADD_TILING_B | TP_ADD_DETILING_OUT | TP_INPUT_WINDOW_VSIZE_A | TP_INPUT_WINDOW_VSIZE_B | TP_CASC_LEN | TP_SSR | Dynamic Power (W) | Latency (ns) | Throughput (MSa/s) | NUM_BANKS | NUM_AIE | DATA_MEMORY | PROGRAM_MEMORY |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
matrix_mult | AIE | cint16 | cint16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 1 | 1 | 1.072 | 4829 | 137 | 7 | 1 | 20617 | 2214 |
matrix_mult | AIE | cint16 | cint16 | 16 | 64 | 16 | 1 | 1 | 1 | 1024 | 1024 | 1 | 1 | 1.142 | 9399 | 122 | 12 | 3 | 43412 | 3766 2216 3410 |
matrix_mult | AIE | cint16 | cint16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 1 | 2 | 1.237 | 2523 | 267 | 14 | 2 | 30994 | 2146 2146 |
matrix_mult | AIE | cint16 | cint16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 2 | 1 | 1.267 | 2986 | 238 | 12 | 2 | 22801 | 2156 2296 |
matrix_mult | AIE | cint16 | cint16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 4 | 2 | 1.902 | 1236 | 703 | 41 | 8 | 44098 | 2114 2202 2202 2228 2114 2202 2202 2228 |
matrix_mult | AIE | cint16 | cint16 | 4 | 64 | 8 | 0 | 0 | 0 | 256 | 512 | 4 | 1 | 1.177 | 562 | 217 | 22 | 4 | 15137 | 1746 1768 1768 1838 |
matrix_mult | AIE | cint16 | int16 | 8 | 64 | 4 | 0 | 0 | 0 | 512 | 256 | 4 | 1 | 1.161 | 366 | 313 | 20 | 4 | 14107 | 1736 1752 1752 1824 |
matrix_mult | AIE | cint32 | cint16 | 4 | 64 | 8 | 0 | 0 | 0 | 256 | 512 | 4 | 1 | 1.371 | 734 | 160 | 21 | 4 | 17435 | 1780 1796 1796 1872 |
matrix_mult | AIE | cint32 | cint32 | 16 | 32 | 16 | 0 | 0 | 0 | 512 | 512 | 1 | 1 | 1.053 | 10142 | 70 | 7 | 1 | 22662 | 2174 |
matrix_mult | AIE | cint32 | cint32 | 16 | 32 | 16 | 1 | 1 | 1 | 512 | 512 | 1 | 1 | 1.121 | 17501 | 69 | 11 | 3 | 47505 | 1570 2176 1602 |
matrix_mult | AIE | cint32 | cint32 | 16 | 32 | 16 | 0 | 0 | 0 | 512 | 512 | 1 | 2 | 1.289 | 5202 | 138 | 14 | 2 | 33036 | 2174 2174 |
matrix_mult | AIE | cint32 | cint32 | 16 | 32 | 16 | 0 | 0 | 0 | 512 | 512 | 2 | 1 | 1.254 | 6080 | 122 | 11 | 2 | 24845 | 2088 2208 |
matrix_mult | AIE | cint32 | cint32 | 16 | 32 | 16 | 0 | 0 | 0 | 512 | 512 | 4 | 2 | 1.991 | 2213 | 373 | 41 | 8 | 46134 | 2104 2176 2176 2224 2104 2176 2176 2224 |
matrix_mult | AIE | cint32 | cint32 | 4 | 64 | 8 | 0 | 0 | 0 | 256 | 512 | 4 | 1 | 1.474 | 1264 | 101 | 22 | 4 | 21531 | 2098 2146 2146 2176 |
matrix_mult | AIE | float | cfloat | 16 | 32 | 16 | 0 | 0 | 0 | 512 | 512 | 1 | 1 | 1.011 | 11688 | 63 | 7 | 1 | 18824 | 2598 |
matrix_mult | AIE | float | cfloat | 16 | 32 | 16 | 1 | 1 | 1 | 512 | 512 | 1 | 1 | 1.072 | 19413 | 63 | 12 | 3 | 40083 | 1602 2600 2926 |
matrix_mult | AIE | float | cfloat | 16 | 32 | 16 | 0 | 0 | 0 | 512 | 512 | 1 | 2 | 1.2 | 5975 | 125 | 14 | 2 | 29456 | 2598 2598 |
matrix_mult | AIE | float | cfloat | 16 | 32 | 16 | 0 | 0 | 0 | 512 | 512 | 2 | 1 | 1.17 | 7034 | 109 | 11 | 2 | 21262 | 2464 2662 |
matrix_mult | AIE | float | cfloat | 16 | 32 | 16 | 0 | 0 | 0 | 512 | 512 | 4 | 2 | 1.872 | 3172 | 257 | 42 | 8 | 44084 | 2658 2530 2530 3154 2658 2530 2530 3154 |
matrix_mult | AIE | int16 | int16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 1 | 1 | 1.061 | 1282 | 472 | 7 | 1 | 11398 | 1968 |
matrix_mult | AIE | int16 | int16 | 16 | 64 | 16 | 1 | 1 | 1 | 1024 | 1024 | 1 | 1 | 1.063 | 8040 | 129 | 12 | 3 | 24977 | 3762 1952 4256 |
matrix_mult | AIE | int16 | int16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 1 | 2 | 1.197 | 1095 | 621 | 14 | 2 | 17676 | 1900 1900 |
matrix_mult | AIE | int16 | int16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 2 | 1 | 1.122 | 972 | 725 | 11 | 2 | 13579 | 1844 2000 |
matrix_mult | AIE | int16 | int16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 4 | 2 | 1.645 | 554 | 1706 | 42 | 8 | 30762 | 1804 1836 1836 1932 1804 1836 1836 1932 |
matrix_mult | AIE | int16 | int16 | 8 | 64 | 4 | 0 | 0 | 0 | 512 | 256 | 4 | 1 | 1.193 | 405 | 376 | 20 | 4 | 11925 | 1426 1426 1426 1762 |
matrix_mult | AIE | int16 | int32 | 8 | 64 | 4 | 0 | 0 | 0 | 512 | 256 | 4 | 1 | 1.226 | 431 | 320 | 21 | 4 | 13082 | 1666 1698 1698 1786 |
matrix_mult | AIE | int32 | int16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 1 | 1 | 1.047 | 2256 | 254 | 7 | 1 | 16521 | 2016 |
matrix_mult | AIE | int32 | int16 | 16 | 64 | 16 | 1 | 1 | 1 | 1024 | 1024 | 1 | 1 | 1.098 | 8884 | 117 | 11 | 3 | 35220 | 4070 2000 3410 |
matrix_mult | AIE | int32 | int16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 1 | 2 | 1.306 | 1235 | 484 | 14 | 2 | 22802 | 1948 1948 |
matrix_mult | AIE | int32 | int16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 2 | 1 | 1.247 | 1615 | 409 | 11 | 2 | 18703 | 1882 2000 |
matrix_mult | AIE | int32 | int16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 4 | 2 | 1.762 | 765 | 1080 | 40 | 8 | 35894 | 1840 1856 1856 1932 1840 1856 1856 1932 |
matrix_mult | AIE | int32 | int16 | 8 | 64 | 4 | 0 | 0 | 0 | 512 | 256 | 4 | 1 | 1.233 | 366 | 313 | 20 | 4 | 14107 | 1736 1752 1752 1824 |
matrix_mult | AIE | int32 | int32 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 1 | 1 | 1.072 | 4829 | 137 | 7 | 1 | 20617 | 2214 |
matrix_mult | AIE | int32 | int32 | 16 | 64 | 16 | 1 | 1 | 1 | 1024 | 1024 | 1 | 1 | 1.141 | 9399 | 122 | 12 | 3 | 43412 | 3766 2216 3410 |
matrix_mult | AIE | int32 | int32 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 1 | 2 | 1.232 | 2523 | 267 | 14 | 2 | 30994 | 2146 2146 |
matrix_mult | AIE | int32 | int32 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 2 | 1 | 1.267 | 2986 | 238 | 12 | 2 | 22801 | 2156 2296 |
matrix_mult | AIE | int32 | int32 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 4 | 2 | 1.886 | 1236 | 703 | 41 | 8 | 44098 | 2114 2202 2202 2228 2114 2202 2202 2228 |
matrix_mult | AIE | int32 | int32 | 8 | 64 | 4 | 0 | 0 | 0 | 512 | 256 | 4 | 1 | 1.313 | 616 | 231 | 21 | 4 | 15137 | 2022 2110 2110 2154 |
matrix_mult | AIE-ML | cint16 | cint16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 1 | 1 | 0.591 | 2203 | 260 | 6 | 1 | 20625 | 1728 |
matrix_mult | AIE-ML | cint16 | cint16 | 16 | 64 | 16 | 1 | 1 | 1 | 1024 | 1024 | 1 | 1 | 0.745 | 3796 | 254 | 12 | 3 | 43422 | 1744 1728 1088 |
matrix_mult | AIE-ML | cint16 | cint16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 1 | 2 | 0.749 | 2130 | 311 | 14 | 2 | 31010 | 1728 1728 |
matrix_mult | AIE-ML | cint16 | cint16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 2 | 1 | 0.784 | 1524 | 426 | 10 | 2 | 22808 | 1664 1792 |
matrix_mult | AIE-ML | cint16 | cint16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 4 | 2 | 1.795 | 742 | 1147 | 42 | 8 | 44108 | 1648 1664 1664 1792 1648 1664 1664 1792 |
matrix_mult | AIE-ML | cint32 | cint32 | 16 | 32 | 16 | 0 | 0 | 0 | 512 | 512 | 1 | 1 | 0.617 | 4758 | 139 | 7 | 1 | 22673 | 1968 |
matrix_mult | AIE-ML | cint32 | cint32 | 16 | 32 | 16 | 1 | 1 | 1 | 512 | 512 | 1 | 1 | 0.764 | 7976 | 139 | 12 | 3 | 47518 | 1776 1968 1088 |
matrix_mult | AIE-ML | cint32 | cint32 | 16 | 32 | 16 | 0 | 0 | 0 | 512 | 512 | 1 | 2 | 0.859 | 2565 | 262 | 12 | 2 | 33058 | 1968 1968 |
matrix_mult | AIE-ML | cint32 | cint32 | 16 | 32 | 16 | 0 | 0 | 0 | 512 | 512 | 2 | 1 | 0.844 | 2928 | 243 | 11 | 2 | 24856 | 1840 2016 |
matrix_mult | AIE-ML | cint32 | cint32 | 16 | 32 | 16 | 0 | 0 | 0 | 512 | 512 | 4 | 2 | 1.948 | 1222 | 711 | 42 | 8 | 46156 | 1840 1872 1872 2032 1840 1872 1872 2032 |
matrix_mult | AIE-ML | int16 | int16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 1 | 1 | 0.523 | 785 | 619 | 6 | 1 | 11407 | 1536 |
matrix_mult | AIE-ML | int16 | int16 | 16 | 64 | 16 | 1 | 1 | 1 | 1024 | 1024 | 1 | 1 | 0.643 | 2276 | 413 | 11 | 3 | 24987 | 1952 1744 1536 |
matrix_mult | AIE-ML | int16 | int16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 1 | 2 | 0.668 | 1111 | 621 | 14 | 2 | 17694 | 1504 1504 |
matrix_mult | AIE-ML | int16 | int16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 2 | 1 | 0.68 | 560 | 1168 | 10 | 2 | 13588 | 1456 1584 |
matrix_mult | AIE-ML | int16 | int16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 4 | 2 | 1.624 | 459 | 2438 | 42 | 8 | 30780 | 1216 1440 1440 1552 1216 1440 1440 1552 |
matrix_mult | AIE-ML | int32 | int16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 1 | 1 | 0.538 | 1435 | 311 | 7 | 1 | 16527 | 1776 |
matrix_mult | AIE-ML | int32 | int16 | 16 | 64 | 16 | 1 | 1 | 1 | 1024 | 1024 | 1 | 1 | 0.724 | 2410 | 311 | 11 | 3 | 35226 | 1776 1776 1344 |
matrix_mult | AIE-ML | int32 | int16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 1 | 2 | 0.713 | 769 | 619 | 12 | 2 | 22814 | 1728 1728 |
matrix_mult | AIE-ML | int32 | int16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 2 | 1 | 0.72 | 872 | 621 | 11 | 2 | 18708 | 1648 1824 |
matrix_mult | AIE-ML | int32 | int16 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 4 | 2 | 1.684 | 511 | 1828 | 42 | 8 | 35900 | 1616 1648 1648 1776 1616 1648 1648 1776 |
matrix_mult | AIE-ML | int32 | int32 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 1 | 1 | 0.586 | 2362 | 250 | 6 | 1 | 20623 | 1984 |
matrix_mult | AIE-ML | int32 | int32 | 16 | 64 | 16 | 1 | 1 | 1 | 1024 | 1024 | 1 | 1 | 0.775 | 4263 | 247 | 12 | 3 | 43418 | 1536 1984 1344 |
matrix_mult | AIE-ML | int32 | int32 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 1 | 2 | 0.755 | 2129 | 311 | 14 | 2 | 31006 | 1936 1936 |
matrix_mult | AIE-ML | int32 | int32 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 2 | 1 | 0.773 | 1622 | 408 | 10 | 2 | 22804 | 1824 2016 |
matrix_mult | AIE-ML | int32 | int32 | 16 | 64 | 16 | 0 | 0 | 0 | 1024 | 1024 | 4 | 2 | 1.803 | 800 | 1201 | 42 | 8 | 44092 | 1808 1856 1856 1968 1808 1856 1856 1968 |
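
When post-processing these results, it can help to load each row into a dictionary keyed by the column names and sanity-check it. The sketch below is an illustration only: `parse_row` and `check_row` are hypothetical helper names, not part of the library. The checks encode patterns observed in the rows above rather than rules stated on this page: each input window size appears to equal the product of the corresponding matrix dimensions, and PROGRAM_MEMORY appears to list one value per kernel, matching NUM_AIE.

```python
# Column names copied from the table header above.
COLUMNS = [
    "Library Element", "AIE_VARIANT", "TT_DATA_A", "TT_DATA_B",
    "TP_DIM_A", "TP_DIM_AB", "TP_DIM_B",
    "TP_ADD_TILING_A", "TP_ADD_TILING_B", "TP_ADD_DETILING_OUT",
    "TP_INPUT_WINDOW_VSIZE_A", "TP_INPUT_WINDOW_VSIZE_B",
    "TP_CASC_LEN", "TP_SSR",
    "Dynamic Power (W)", "Latency (ns)", "Throughput (MSa/s)",
    "NUM_BANKS", "NUM_AIE", "DATA_MEMORY", "PROGRAM_MEMORY",
]


def parse_row(line):
    """Split one pipe-delimited table row into a dict of strings (hypothetical helper)."""
    # Drop the trailing pipe, then split on the remaining pipes.
    cells = [cell.strip() for cell in line.strip().rstrip("|").split("|")]
    if len(cells) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} cells, got {len(cells)}")
    return dict(zip(COLUMNS, cells))


def check_row(row):
    """Check patterns observed in the table; these are assumptions, not documented rules."""
    dim_a = int(row["TP_DIM_A"])
    dim_ab = int(row["TP_DIM_AB"])
    dim_b = int(row["TP_DIM_B"])
    return {
        # Observed: window A holds a full TP_DIM_A x TP_DIM_AB matrix.
        "window_a_ok": int(row["TP_INPUT_WINDOW_VSIZE_A"]) == dim_a * dim_ab,
        # Observed: window B holds a full TP_DIM_AB x TP_DIM_B matrix.
        "window_b_ok": int(row["TP_INPUT_WINDOW_VSIZE_B"]) == dim_ab * dim_b,
        # Observed: PROGRAM_MEMORY has one space-separated entry per kernel.
        "program_memory_count_ok":
            len(row["PROGRAM_MEMORY"].split()) == int(row["NUM_AIE"]),
    }


if __name__ == "__main__":
    sample = (
        "matrix_mult | AIE | cint16 | cint16 | 16 | 64 | 16 | 0 | 0 | 0 | "
        "1024 | 1024 | 4 | 2 | 1.902 | 1236 | 703 | 41 | 8 | 44098 | "
        "2114 2202 2202 2228 2114 2202 2202 2228 |"
    )
    print(check_row(parse_row(sample)))
```

Values are kept as strings in `parse_row` so that the multi-valued PROGRAM_MEMORY cell survives intact; numeric conversion happens only where a check needs it.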