Resource utilization and power are measured using the Vivado Design Suite, vcdanalyze, and the Xilinx Power Estimator (XPE) for Versal (2024.2 version).
Use either of the following methods to find the register and CLB LUT utilization in the Vivado project:
Open the Vivado project $(BUILD_TARGET_DIR)/_x/link/vivado/vpl/prj/prj.xpr. Go to Open Implemented Design, then click Report Utilization. In the Utilization tab shown in the following figure, select ai_engine_0 and view the Registers and CLB LUTs for gemm_32x32x32:
Alternatively, run make report_metrics TARGET=hw (recipe expanded below), along with any relevant options, to generate utilization_hierarchical.txt under the $(BLD_REPORTS_DIR)/ directory:
...
VIVADO_METRICS_SCRIPTS_REPO := $(DESIGN_REPO)/vivado_metrics_scripts
...
REPORTS_REPO := $(PROJECT_REPO)/reports_dir
BLD_REPORTS_DIR := $(REPORTS_REPO)/gemm_$(MAT_DIMS)/x$(GEMM_INSTS)
...
report_metrics: xsa $(BLD_REPORTS_DIR)
ifeq ($(TARGET),hw_emu)
	@echo "This build target (report-metrics) not valid when design target is hw_emu"
else
	rm -rf $(BLD_REPORTS_DIR)
	mkdir -p $(BLD_REPORTS_DIR)
	cd $(BLD_REPORTS_DIR); \
	vivado -mode batch -source $(VIVADO_METRICS_SCRIPTS_REPO)/report_metrics.tcl $(BUILD_TARGET_DIR)/_x/link/vivado/vpl/prj/prj.xpr
endif
...
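Once utilization_hierarchical.txt exists, the row for ai_engine_0 can be pulled out with a short script rather than read by eye. The following is a minimal sketch, assuming the report uses Vivado's usual ASCII table layout (rows delimited by `|`, with a header row whose first column is `Instance`). The function name `parse_hierarchical_utilization` is hypothetical, and the exact column names depend on the device and tool version, so the sketch prints whatever columns the report contains instead of assuming specific ones.

```python
import sys


def parse_hierarchical_utilization(text):
    """Return {instance_name: {column_name: cell_text}} from an ASCII table.

    Assumption: the header row contains a cell named "Instance" and every
    data row has the same number of cells as the header.
    """
    header = None
    rows = {}
    for line in text.splitlines():
        if not line.lstrip().startswith("|"):
            continue  # skip border lines (+----+) and free text
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if header is None:
            if "Instance" in cells:
                header = cells
            continue
        if len(cells) != len(header):
            continue  # not a data row of this table
        # Hierarchy indentation is stripped; if a name repeats, the last row wins.
        rows[cells[0]] = dict(zip(header, cells))
    return rows


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python parse_util.py <path to utilization_hierarchical.txt>
    with open(sys.argv[1]) as f:
        table = parse_hierarchical_utilization(f.read())
    print(table.get("ai_engine_0"))
```

Keeping the cells as strings avoids guessing at number formats; convert the register and LUT columns with `int()` once you have confirmed their names in your own report.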
The vcdanalyze tool generates a graph.xpe file. You can load this file into XPE to view the AI Engine resource utilization and power. The steps are as follows:
Run make vcd (recipe expanded below) to create the graph.xpe file under $(BUILD_TARGET_DIR)/aiesim_xpe/:
vcd: graph create_ioFiles $(XPE_FILE)

$(XPE_FILE): $(BLD_TGT_VCD_FILE)
	cd $(BUILD_TARGET_DIR); \
	vcdanalyze --vcd $(VCD_FILE_NAME).vcd --xpe

$(BLD_TGT_VCD_FILE): $(AIE_SRC_REPO)/aiesim_data/*
	cd $(BUILD_TARGET_DIR); \
	aiesimulator $(AIE_SIM_FLAGS) --profile --dump-vcd $(VCD_FILE_NAME) 2>&1 | tee -a vcd.log
Load graph.xpe into PDM to see the AI Engine power consumption and resource utilization for the gemm_32x32x32 design.
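The make vcd recipe chains two steps: aiesimulator produces the VCD file, and vcdanalyze turns it into graph.xpe. Before loading the result into the power tool, it can be worth confirming that graph.xpe is at least as new as the VCD it was derived from, which is the same timestamp rule make applies. The sketch below illustrates that check; the function name `xpe_is_fresh` is hypothetical, and the paths passed to it are whatever your build produced.

```python
import os


def xpe_is_fresh(vcd_path, xpe_path):
    """Return True if both files exist and the .xpe is not older than the .vcd.

    This mirrors the make dependency $(XPE_FILE): $(BLD_TGT_VCD_FILE):
    a target older than its prerequisite is considered stale.
    """
    if not (os.path.isfile(vcd_path) and os.path.isfile(xpe_path)):
        return False
    return os.path.getmtime(xpe_path) >= os.path.getmtime(vcd_path)
```

If the check returns False, rerunning make vcd regenerates the missing or stale file.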
A summary of resource utilization and power for all variations is given in the following table.
| GeMM Configuration | Number of Compute Cores | Vector Load | Number of Active Memory Banks | Mem R/W Rate | Active AI Engine Tiles | Interconnect Load | FF (Regs) | CLB LUTs | Dynamic Power |
|---|---|---|---|---|---|---|---|---|---|
| 32x32x32 | 24 | 14.54% | 231 | 3.575% | 44 | 12.87% | 13633 | 2934 | 2741 |
| 64x64x64 | 24 | 35.53% | 252 | 6.345% | 43 | 13.10% | 13579 | 2881 | 3401 |
| 128x128x128 | 24 | 36.62% | 231 | 8.910% | 43 | 13.10% | 13507 | 3006 | 3563 |
| 256x256x256 | 24 | 61.82% | 231 | 14.725% | 43 | 13.10% | 13576 | 2880 | 4432 |
| 512x512x512 | 24 | 71.42% | 252 | 12.125% | 43 | 13.41% | 13507 | 3006 | 4524 |
| 1024x1024x1024 | 24 | 82.96% | 252 | 13.980% | 43 | 12.57% | 13540 | 2834 | 4876 |
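The table shows that register and LUT counts stay nearly flat across configurations while vector load and dynamic power rise with matrix size. To make that trend easier to compare, the sketch below normalizes each configuration's dynamic power to the 32x32x32 case. The vector load and dynamic power values are copied from the table above; the names `ROWS` and `relative_power` are hypothetical and exist only for this illustration.

```python
# (GeMM configuration, vector load in %, dynamic power) copied from the table.
ROWS = [
    ("32x32x32", 14.54, 2741),
    ("64x64x64", 35.53, 3401),
    ("128x128x128", 36.62, 3563),
    ("256x256x256", 61.82, 4432),
    ("512x512x512", 71.42, 4524),
    ("1024x1024x1024", 82.96, 4876),
]


def relative_power(rows, baseline="32x32x32"):
    """Return {configuration: dynamic power divided by the baseline's power}."""
    base = next(power for name, _, power in rows if name == baseline)
    return {name: power / base for name, _, power in rows}


if __name__ == "__main__":
    ratios = relative_power(ROWS)
    for name, load, power in ROWS:
        print(f"{name:>16}  vector load {load:6.2f}%  "
              f"power {power}  ({ratios[name]:.2f}x baseline)")
```

On these figures the 1024x1024x1024 configuration draws about 1.78 times the dynamic power of the 32x32x32 configuration, with the same 24 compute cores.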