The Model Composer THROUGHPUT_FACTOR pragma provides some control over the throughput of an xmcImportFunction block. You can add the THROUGHPUT_FACTOR pragma to your function header file, along with the SUPPORTS_STREAMING pragma, as shown in the following example:
#pragma XMC THROUGHPUT_FACTOR TF_param: 1,2,4
#pragma XMC SUPPORTS_STREAMING
template<int ROWS, int COLS, int TF_param>
void DilationWrap(const uint8_t src[ROWS][COLS], uint8_t dst[ROWS][COLS])
The syntax of the pragma, as shown in the prior example, is:
#pragma XMC THROUGHPUT_FACTOR TF_param: 1,2,4
Where:
- TF_param must be an int type template parameter, as in the example above.
- It is optional, though recommended, to specify the throughput factors that the function supports. In the example above, 1,2,4 specifies the supported throughput factors. The factors are expressed as positive integers and must include the value 1. If you do not explicitly specify the throughput factors, TF_param is assumed to be valid for any positive throughput factor up to the upper limit of 16 that is supported by Model Composer.
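The rules for the factor list can be checked mechanically. The following is a minimal sketch, not part of Model Composer: the helper name is_valid_tf_list and the constant kMaxThroughputFactor are hypothetical, and the sketch assumes that the upper limit of 16 also applies to explicitly listed factors.

```cpp
#include <vector>

// Upper limit quoted in the text above. Applying it to an explicit list
// is an assumption of this sketch.
const int kMaxThroughputFactor = 16;

// Hypothetical helper: returns true when every listed factor is a positive
// integer no larger than the limit and the list includes the value 1.
inline bool is_valid_tf_list(const std::vector<int>& factors) {
    bool has_one = false;
    for (int f : factors) {
        if (f < 1 || f > kMaxThroughputFactor) {
            return false;  // not a positive factor within the limit
        }
        if (f == 1) {
            has_one = true;
        }
    }
    return has_one;  // also false for an empty list
}
```

Under these assumptions, the list 1,2,4 from the example passes, while a list such as 2,4 fails because it omits 1.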
As discussed in Controlling the Throughput of the Implementation, you specify the throughput factor for the model in the Model Composer Hub block. You can specify a throughput factor for the Hub block that divides evenly into one of the THROUGHPUT_FACTOR values on the xmcImportFunction block.
Important: If the throughput factor of the Hub block does not match, or does not divide evenly into, the THROUGHPUT_FACTOR specified by the xmcImportFunction block, then the throughput is reduced to 1 for the block function.
Please note the following requirements:
- The THROUGHPUT_FACTOR pragma must be used on template functions.
- The THROUGHPUT_FACTOR pragma must be used with the SUPPORTS_STREAMING pragma.
- Only one THROUGHPUT_FACTOR pragma can be specified for an xmcImportFunction block.
- The block function will be called with actual arguments that have cyclic ARRAY_RESHAPE directives with factor=TF (see example below). For more information on the ARRAY_RESHAPE pragma, refer to HLS Pragmas in the Vitis Unified Software Platform Documentation (UG1416).
- The read accesses from a non-scalar input argument of the function should be compliant with the requirements for streaming, and Vitis HLS should be able to combine groups of TF reads into 1 read of the reshaped array.
- The write accesses into a non-scalar output argument of the function should be compliant with the requirements for streaming, and Vitis HLS should be able to combine groups of TF writes into 1 write of the reshaped array.
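To see why groups of TF accesses can be combined, it helps to look at where each element lands after a cyclic reshape. The sketch below is illustrative only: the struct ReshapedIndex and the function cyclic_reshape_index are hypothetical names, and the code models the index arithmetic on the host rather than anything Vitis HLS exposes.

```cpp
#include <cstddef>

// Location of one element inside a cyclically reshaped array.
struct ReshapedIndex {
    std::size_t word;  // index of the wide word in the reshaped array
    std::size_t lane;  // slot of the element inside that word
};

// Hypothetical helper: maps flat index i of the original array to its
// location after a cyclic reshape with factor tf. Consecutive elements
// fill the lanes of one word before the next word starts.
inline ReshapedIndex cyclic_reshape_index(std::size_t i, std::size_t tf) {
    return ReshapedIndex{i / tf, i % tf};
}
```

With this mapping, the accesses at indices k0 * TF + k1 for k1 = 0 to TF - 1 all fall into word k0, one per lane, which is the access pattern that allows TF reads or writes to become a single access of the reshaped array.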
The following is an example function specifying both the SUPPORTS_STREAMING and THROUGHPUT_FACTOR pragmas:
#include <stdint.h>
#pragma XMC THROUGHPUT_FACTOR TF: 1, 2, 4, 8, 16
#pragma XMC SUPPORTS_STREAMING
template<int TF>
void mac(const int32_t In1[240], const int32_t In2[240], const int32_t In3[240],
         int32_t Out1[240])
{
#pragma HLS ARRAY_RESHAPE variable=In1 cyclic factor=TF
#pragma HLS ARRAY_RESHAPE variable=In2 cyclic factor=TF
#pragma HLS ARRAY_RESHAPE variable=In3 cyclic factor=TF
#pragma HLS ARRAY_RESHAPE variable=Out1 cyclic factor=TF
    // Each outer iteration processes one group of TF consecutive elements.
    for (uint32_t k0 = 0; k0 < 240 / TF; ++k0) {
#pragma HLS pipeline II=1
        int32_t Product_in2m[TF];
        int32_t Sum_in2m[TF];
        int32_t Product_in1m[TF];
        int32_t Sum_outm[TF];
        // Read TF elements from each input into local buffers.
        for (uint32_t k1 = 0; k1 < TF; ++k1) {
            Product_in2m[k1] = In2[(k0 * TF + k1)];
        }
        for (uint32_t k1 = 0; k1 < TF; ++k1) {
            Sum_in2m[k1] = In3[(k0 * TF + k1)];
        }
        for (uint32_t k1 = 0; k1 < TF; ++k1) {
            Product_in1m[k1] = In1[(k0 * TF + k1)];
        }
        // Multiply-accumulate per element: In1 * In2 + In3.
        for (uint32_t k1 = 0; k1 < TF; ++k1) {
            int32_t Product_in2s;
            int32_t Sum_in2s;
            int32_t Product_in1s;
            int32_t Product_outs;
            int32_t Sum_outs;
            Product_in2s = Product_in2m[k1];
            Sum_in2s = Sum_in2m[k1];
            Product_in1s = Product_in1m[k1];
            Product_outs = Product_in1s * Product_in2s;
            Sum_outs = Product_outs + Sum_in2s;
            Sum_outm[k1] = Sum_outs;
        }
        // Write the TF results back to the output.
        for (uint32_t k1 = 0; k1 < TF; ++k1) {
            Out1[(k0 * TF + k1)] = Sum_outm[k1];
        }
    }
}
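Because the throughput factor only changes how elements are grouped, the numerical result should be the same for every supported value of TF. A host-side check along the following lines can confirm this before the function is imported. It is a sketch under stated assumptions: mac_sketch is a condensed stand-in written for this illustration (same interface and arithmetic as mac, with the HLS pragmas omitted), and matches_reference and all_factors_match are hypothetical helper names.

```cpp
#include <stdint.h>

// Condensed stand-in for the mac function: same interface and arithmetic,
// HLS pragmas omitted because they do not affect the host-side result.
template <int TF>
void mac_sketch(const int32_t In1[240], const int32_t In2[240],
                const int32_t In3[240], int32_t Out1[240]) {
    for (int k0 = 0; k0 < 240 / TF; ++k0) {
        for (int k1 = 0; k1 < TF; ++k1) {
            const int i = k0 * TF + k1;
            Out1[i] = In1[i] * In2[i] + In3[i];
        }
    }
}

// Runs the kernel with one throughput factor and compares every output
// element against a plain scalar computation.
template <int TF>
bool matches_reference() {
    int32_t a[240], b[240], c[240], out[240];
    for (int i = 0; i < 240; ++i) {
        a[i] = i - 120;    // small test values, no overflow in a * b + c
        b[i] = 3 * i + 1;
        c[i] = 7 - i;
        out[i] = 0;
    }
    mac_sketch<TF>(a, b, c, out);
    for (int i = 0; i < 240; ++i) {
        if (out[i] != a[i] * b[i] + c[i]) {
            return false;
        }
    }
    return true;
}

// Checks every factor listed in the THROUGHPUT_FACTOR pragma of the example.
inline bool all_factors_match() {
    return matches_reference<1>() && matches_reference<2>() &&
           matches_reference<4>() && matches_reference<8>() &&
           matches_reference<16>();
}
```

Each listed factor divides the array length of 240 evenly, so the outer loop covers every element; a factor that does not divide the length would leave trailing elements unprocessed in a loop of this shape.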