The Model Composer THROUGHPUT_FACTOR pragma provides some control over the throughput of an xmcImportFunction block. You can add the THROUGHPUT_FACTOR pragma to your function header file, along with the SUPPORTS_STREAMING pragma, as shown in the following example:
#pragma XMC THROUGHPUT_FACTOR TF_param: 1,2,4
#pragma XMC SUPPORTS_STREAMING
template<int ROWS, int COLS, int TF_param>
void DilationWrap(const uint8_t src[ROWS][COLS], uint8_t dst[ROWS][COLS])
The syntax of the pragma, as shown in the preceding example, is:
#pragma XMC THROUGHPUT_FACTOR TF_param: 1,2,4
Where:
- TF_param must be an int type template parameter, as in the example above.
- It is optional, though recommended, to specify the throughput factors that the function supports. In the example above, 1,2,4 specifies the supported throughput factors in the pragma. They are expressed as positive integers and must include the value 1. If you do not explicitly specify the throughput factors, TF_param is assumed to be valid for any positive throughput factor up to the upper limit of 16 that is supported by Model Composer.
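As an illustration of the two rules above, the following is a minimal sketch of a header-style function that also rejects unsupported template values at compile time. The function name CopyWrap, the helper isSupportedTF, and the use of static_assert are assumptions made for this example and are not part of Model Composer; the source does not state whether such guards are needed or how the import tool treats them.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a Model Composer API): mirrors the list 1,2,4
// given in the THROUGHPUT_FACTOR pragma below.
constexpr bool isSupportedTF(int tf) { return tf == 1 || tf == 2 || tf == 4; }

#pragma XMC THROUGHPUT_FACTOR TF_param: 1,2,4
#pragma XMC SUPPORTS_STREAMING
template<int ROWS, int COLS, int TF_param>
void CopyWrap(const uint8_t src[ROWS][COLS], uint8_t dst[ROWS][COLS])
{
    // Assumed guards: fail the build if the function is instantiated with a
    // factor that is not in the pragma list or does not divide the row length.
    static_assert(isSupportedTF(TF_param), "TF_param must be 1, 2, or 4");
    static_assert(COLS % TF_param == 0, "COLS must be a multiple of TF_param");

    // Placeholder body: copy the input to the output in row-major order.
    for (int r = 0; r < ROWS; ++r) {
        for (int c = 0; c < COLS; ++c) {
            dst[r][c] = src[r][c];
        }
    }
}
```

With these guards, an instantiation such as CopyWrap<2, 4, 3> fails to compile, whereas CopyWrap<2, 4, 2> is accepted.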
As discussed in Controlling the Throughput of the Implementation, you specify the throughput factor for the model in the Model Composer Hub block. You can specify a throughput factor for the Hub block that divides evenly into one of the THROUGHPUT_FACTOR values on the xmcImportFunction block.
Important: If the throughput factor of the Hub block does not match, or does not divide evenly into, the THROUGHPUT_FACTOR specified by the xmcImportFunction block, then the throughput is reduced to 1 for the block function.
Note the following requirements:
- The THROUGHPUT_FACTOR pragma must be used on template functions.
- The THROUGHPUT_FACTOR pragma must be used with the SUPPORTS_STREAMING pragma.
- Only one THROUGHPUT_FACTOR pragma can be specified for an xmcImportFunction block.
- The block function will be called with actual arguments that have cyclic ARRAY_RESHAPE directives with factor=TF (see the mac example below). For more information on the ARRAY_RESHAPE pragma, refer to HLS Pragmas in the Vitis Unified Software Platform Acceleration Development Reference Guide (UG1702).
- The read accesses from a non-scalar input argument of the function should be compliant with the requirements for streaming, and AMD Vitis™ HLS should be able to combine groups of TF reads into 1 read of the reshaped array.
- The write accesses into a non-scalar output argument of the function should be compliant with the requirements for streaming, and AMD Vitis™ HLS should be able to combine groups of TF writes into 1 write of the reshaped array.
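The last three requirements rely on how a cyclic reshape with factor=TF lays out the data: TF consecutive elements of the original array end up side by side in one wide word. The following host-side sketch shows that index mapping only. The type ReshapedIndex and the function cyclicReshapeIndex are hypothetical names introduced for this illustration and are not part of Vitis HLS or Model Composer.

```cpp
#include <cassert>
#include <cstddef>

// Position of one original array element inside the reshaped array.
struct ReshapedIndex {
    std::size_t word;  // index of the wide word in the reshaped array
    std::size_t lane;  // slot of the element within that word
};

// Cyclic reshape with the given factor: element i is placed in
// word i / factor, at lane i % factor. Elements k*factor .. k*factor+factor-1
// therefore share word k, which is why a group of TF accesses to consecutive
// indices can be served by a single access to the reshaped array.
constexpr ReshapedIndex cyclicReshapeIndex(std::size_t i, std::size_t factor)
{
    return ReshapedIndex{i / factor, i % factor};
}
```

For example, with factor 4, elements 4 through 7 all map to word 1, at lanes 0 through 3.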
The following is an example function specifying both SUPPORTS_STREAMING and THROUGHPUT_FACTOR pragmas:
#include <stdint.h>
#pragma XMC THROUGHPUT_FACTOR TF: 1, 2, 4, 8, 16
#pragma XMC SUPPORTS_STREAMING
template<int TF>
void mac(const int32_t In1[240], const int32_t In2[240], const int32_t In3[240],
         int32_t Out1[240])
{
#pragma HLS ARRAY_RESHAPE variable=In1 cyclic factor=TF
#pragma HLS ARRAY_RESHAPE variable=In2 cyclic factor=TF
#pragma HLS ARRAY_RESHAPE variable=In3 cyclic factor=TF
#pragma HLS ARRAY_RESHAPE variable=Out1 cyclic factor=TF
    for (uint32_t k0 = 0; k0 < 240 / TF; ++k0) {
#pragma HLS pipeline II=1
        int32_t Product_in2m[TF];
        int32_t Sum_in2m[TF];
        int32_t Product_in1m[TF];
        int32_t Sum_outm[TF];
        for (uint32_t k1 = 0; k1 < TF; ++k1) {
            Product_in2m[k1] = In2[(k0 * TF + k1)];
        }
        for (uint32_t k1 = 0; k1 < TF; ++k1) {
            Sum_in2m[k1] = In3[(k0 * TF + k1)];
        }
        for (uint32_t k1 = 0; k1 < TF; ++k1) {
            Product_in1m[k1] = In1[(k0 * TF + k1)];
        }
        for (uint32_t k1 = 0; k1 < TF; ++k1) {
            int32_t Product_in2s;
            int32_t Sum_in2s;
            int32_t Product_in1s;
            int32_t Product_outs;
            int32_t Sum_outs;
            Product_in2s = Product_in2m[k1];
            Sum_in2s = Sum_in2m[k1];
            Product_in1s = Product_in1m[k1];
            Product_outs = Product_in1s * Product_in2s;
            Sum_outs = Product_outs + Sum_in2s;
            Sum_outm[k1] = Sum_outs;
        }
        for (uint32_t k1 = 0; k1 < TF; ++k1) {
            Out1[(k0 * TF + k1)] = Sum_outm[k1];
        }
    }
}
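Before importing a function like this, it can be useful to confirm on the host that grouping the work into blocks of TF elements does not change the result. The following is a minimal sketch of such a check, not a Model Composer or Vitis HLS test flow. The names macGrouped, macReference, and matchesReference are hypothetical; macGrouped is a condensed stand-in that reproduces the k0 * TF + k1 access pattern of mac, with the HLS pragmas omitted because the sketch only runs on the host.

```cpp
#include <cassert>
#include <cstdint>

constexpr int N = 240;  // array length used by the mac example

// Condensed stand-in for mac: each outer iteration handles TF consecutive
// elements, using the same k0 * TF + k1 indexing.
template<int TF>
void macGrouped(const int32_t in1[N], const int32_t in2[N],
                const int32_t in3[N], int32_t out[N])
{
    static_assert(N % TF == 0, "N must be a multiple of TF");
    for (int k0 = 0; k0 < N / TF; ++k0) {
        for (int k1 = 0; k1 < TF; ++k1) {
            const int i = k0 * TF + k1;
            out[i] = in1[i] * in2[i] + in3[i];
        }
    }
}

// Plain element-by-element multiply-accumulate used as the reference.
inline void macReference(const int32_t in1[N], const int32_t in2[N],
                         const int32_t in3[N], int32_t out[N])
{
    for (int i = 0; i < N; ++i) {
        out[i] = in1[i] * in2[i] + in3[i];
    }
}

// Returns true when the grouped loop with factor TF matches the reference
// on a small deterministic data set.
template<int TF>
bool matchesReference()
{
    int32_t a[N], b[N], c[N], got[N], want[N];
    for (int i = 0; i < N; ++i) {
        a[i] = i;
        b[i] = 2 * i + 1;
        c[i] = -i;
    }
    macGrouped<TF>(a, b, c, got);
    macReference(a, b, c, want);
    for (int i = 0; i < N; ++i) {
        if (got[i] != want[i]) {
            return false;
        }
    }
    return true;
}
```

Calling matchesReference<TF>() for each factor listed in the pragma (1, 2, 4, 8, and 16) exercises every instantiation that the pragma declares as supported.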