Machine HPM Event Register (mhpmevent) - 2025.1 English - UG1629

MicroBlaze V Processor Reference Guide (UG1629)

Document ID: UG1629
Release Date: 2025-07-09
Version: 2025.1 English

The hardware performance monitor (HPM) includes up to 29 64-bit event counters, mhpmcounter3 – mhpmcounter31. The event selector registers, mhpmevent3 – mhpmevent31, are read/write registers that control which event causes the corresponding counter to increment. If more than one event is enabled in the event register and two or more of the enabled events occur simultaneously, the counter only increments by one.
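The increment rule can be illustrated with a small software model. This is only a sketch of the behavior described above, not of the hardware implementation; the function name `hpm_counter_step` and the `occurred` argument (one bit per event seen at a given moment) are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of one counter update.
 * enabled:  event bits set in the event selector register
 * occurred: event bits of the selected class that occur simultaneously
 * The counter advances by exactly one if any enabled event occurred,
 * no matter how many of the enabled events occurred at the same time. */
static inline uint64_t hpm_counter_step(uint64_t counter,
                                        uint32_t enabled,
                                        uint32_t occurred)
{
    return counter + (((enabled & occurred) != 0u) ? 1u : 0u);
}
```

With two events enabled (for example bits 5 and 6), the modeled counter advances by one whether one or both of them occur together.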

Each event selector register is divided into a class field and a mask of individual events. Within each class, one or more of the events can be counted. Only events that are defined within a class can be set.

For the latency class, the latency sum and the max/min values require a counter implemented as a latency counter, whereas the number of events should be counted with an event counter. Counting more than one event with each counter is not recommended for this class. The class of a latency counter is fixed to 5.

Event 0 is defined to mean “no event.”

If the class is set to an undefined value, or an individual event bit that is not defined for the class is set, the counter does not increment.
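The rule above can be checked in software before a selector value is written. The following is a minimal sketch: the per-class masks are transcribed from the event bits listed in Tables 2 to 7 of this section, while the names `hpm_defined_events` and `hpm_selector_can_count` are hypothetical. Bit 0 and bits above 24 are not examined.

```c
#include <stdbool.h>
#include <stdint.h>

/* Event bits defined for classes 0..5, transcribed from Tables 2 to 7. */
static const uint32_t hpm_defined_events[6] = {
    0x01FFFFE0u, /* class 0: bits 5..24 */
    0x000007E0u, /* class 1: bits 5..10 */
    0x000007E0u, /* class 2: bits 5..10 */
    0x000001C0u, /* class 3: bits 6..8 */
    0x00000060u, /* class 4: bits 5..6 */
    0x0000AAA0u, /* class 5: bits 5, 7, 9, 11, 13, 15 */
};

/* Returns true if the selector value names a defined class and at least
 * one event, and every selected event bit is defined for that class. */
static bool hpm_selector_can_count(uint32_t mhpmevent)
{
    uint32_t cls    = (mhpmevent >> 1) & 0xFu;  /* Class, bits 4:1  */
    uint32_t events = mhpmevent & 0x01FFFFE0u;  /* Event, bits 24:5 */

    if (cls > 5u) {
        return false;   /* undefined class */
    }
    if (events == 0u) {
        return false;   /* no event selected */
    }
    return (events & ~hpm_defined_events[cls]) == 0u;
}
```

For example, class 4 with event bit 7 set is rejected, because Table 6 defines only bits 5 and 6 for that class.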

Figure 1. Machine HPM Event Register

Table 1. Machine HPM Event Register
Bits Name Description Reset Value
24:5 Event Event mask. Defined for each event class. 0
4:1 Class Event class (0 = Retired instructions, 1 = Branches and traps, 2 = Instruction and data cache, 3 = Pipeline stalls, 4 = Miscellaneous, 5 = Latency) 0
0 No Event Set to represent no event 0
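As an illustration of the field layout in Table 1, the sketch below builds a selector value from a class number and a set of event bits. The macro and function names are hypothetical, and leaving bit 0 clear is an assumption based on its "no event" description. The `csrw` wrapper uses the standard RISC-V CSR name `mhpmevent3`, compiles only for RISC-V targets, and must run in machine mode.

```c
#include <stdint.h>

#define MHPMEVENT_CLASS_SHIFT 1u
#define MHPMEVENT_CLASS_MASK  0x0000001Eu  /* Class, bits 4:1  */
#define MHPMEVENT_EVENT_MASK  0x01FFFFE0u  /* Event, bits 24:5 */

/* Mask for a single event bit; returns 0 for bit numbers outside 24:5. */
static inline uint32_t mhpmevent_bit(unsigned bit)
{
    return (bit >= 5u && bit <= 24u) ? (UINT32_C(1) << bit) : 0u;
}

/* Combine a class number and an event mask.
 * Bit 0 is left clear (assumption, see lead-in). */
static inline uint32_t mhpmevent_encode(uint32_t cls, uint32_t events)
{
    return ((cls << MHPMEVENT_CLASS_SHIFT) & MHPMEVENT_CLASS_MASK)
         | (events & MHPMEVENT_EVENT_MASK);
}

#if defined(__riscv)
/* Machine mode only: write the event selector for mhpmcounter3. */
static inline void mhpmevent3_write(uint32_t value)
{
    __asm__ volatile ("csrw mhpmevent3, %0" : : "r"(value));
}
#endif
```

For example, `mhpmevent_encode(1, mhpmevent_bit(5) | mhpmevent_bit(6))` gives 0x62, which selects class 1 with both the taken and not taken conditional branch events, so the counter counts all conditional branches.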
Table 2. Event Class 0 – Retired Instructions
Event Bit Description
5 Integer load instruction retired
6 Integer store instruction retired
7 Atomic instruction retired
8 System instruction retired, including ECALL and EBREAK
9 Integer arithmetic instruction retired, including C.NOP
10 Integer multiply instruction retired
11 Integer divide/remainder instruction retired
12 Custom instruction retired
13 Bit manipulation instruction retired
14 Compressed instructions retired
15 JAL or C.J instruction retired
16 JALR or C.JR instruction retired
17 Floating point load instruction retired
18 Floating point store instruction retired
19 Floating point add/sub instruction retired
20 Floating point multiply instruction retired
21 Floating point divide instruction retired
22 Floating point fused instruction retired
23 Floating point other instruction retired
24 Cache invalidate or flush retired
Table 3. Event Class 1 – Branches and Traps
Event Bit Description
5 Taken conditional branch
6 Not taken conditional branch
7 Exception taken
8 Interrupt occurred
9 Branch target cache hit
10 Branch target mispredict
Table 4. Event Class 2 – Instruction and Data Cache
Event Bit Description
5 Data request from instruction cache
6 Hit in instruction cache
7 Read data requested from data cache
8 Read data hit in data cache
9 Write data request from data cache
10 Write data hit in data cache
Table 5. Event Class 3 – Pipeline Stalls
Event Bit Description
6 Pipeline stalled due to operand fetch stage (OF)
7 Pipeline stalled due to execute stage (EX)
8 Pipeline stalled due to memory stage:
  • 5-stage pipeline: MEM
  • 8-stage pipeline: M0, M1, M2, or M3
Table 6. Event Class 4 – Miscellaneous
Event Bit Description
5 Divide/remainder by zero operation
6 Floating-point subnormal result
Table 7. Event Class 5 – Latency
Event Bit Description
5 Interrupt: total sum; max (31:16) and min (15:0)
7 Data cache memory read: total sum; max (31:16) and min (15:0)
9 Data cache memory write: total sum; max (31:16) and min (15:0)
11 Instruction cache memory read: total sum; max (31:16) and min (15:0)
13 Peripheral AXI data read: total sum; max (31:16) and min (15:0)
15 Peripheral AXI data write: total sum; max (31:16) and min (15:0)
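The max and min fields listed in Table 7 can be separated in software as sketched below. The type and function names are hypothetical. The mean calculation assumes that the number of events was collected separately with an event counter, as this section recommends; which event to count for that purpose is left to the user.

```c
#include <stdint.h>

typedef struct {
    uint16_t max;  /* bits 31:16 of the min/max latency counter */
    uint16_t min;  /* bits 15:0 of the min/max latency counter  */
} hpm_latency_minmax_t;

/* Split the low 32 bits of a min/max latency counter value. */
static inline hpm_latency_minmax_t hpm_latency_unpack(uint32_t raw)
{
    hpm_latency_minmax_t r;
    r.max = (uint16_t)(raw >> 16);
    r.min = (uint16_t)(raw & 0xFFFFu);
    return r;
}

/* Mean latency from the total-sum counter and a separately counted
 * number of events; returns 0 when no events were counted. */
static inline uint64_t hpm_latency_mean(uint64_t total_sum,
                                        uint64_t event_count)
{
    return (event_count != 0u) ? (total_sum / event_count) : 0u;
}
```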

The number of event counters and event selector registers is C_DEBUG_EVENT_COUNTERS + 2 * C_DEBUG_LATENCY_COUNTERS. The lower-numbered registers are implemented as event counters, whereas the higher-numbered registers are implemented as pairs of latency counters, each pair consisting of the latency sum and the min/max latency. The event for a pair of latency counters is selected in the event selector register that corresponds to the first counter in the pair.

An example with C_DEBUG_EVENT_COUNTERS = 5 and C_DEBUG_LATENCY_COUNTERS = 2 illustrates how the event counters are allocated.

Table 8. Event Counter Allocation
Event Counter Kind Description
mhpmcounter3 – mhpmcounter7 Event Counters Used to count events from Class 0 – 5.
mhpmcounter8 Latency Counter: Total sum Used to count a latency event from Class 5. Use mhpmevent8 to set events for both counters (mhpmcounter8 and mhpmcounter9).
mhpmcounter9 Latency Counter: Min/Max See mhpmcounter8.
mhpmcounter10 Latency Counter: Total sum Used to count a latency event from Class 5. Use mhpmevent10 to set events for both counters (mhpmcounter10 and mhpmcounter11).
mhpmcounter11 Latency Counter: Min/Max See mhpmcounter10.
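The allocation rule can also be expressed as a small helper that maps a counter number to its kind for given parameter values. This is a sketch based only on the description in this section; the type and function names are hypothetical.

```c
typedef enum {
    HPM_NOT_IMPLEMENTED,
    HPM_EVENT_COUNTER,
    HPM_LATENCY_SUM,
    HPM_LATENCY_MINMAX
} hpm_kind_t;

/* Kind of mhpmcounter<n>, given the values of C_DEBUG_EVENT_COUNTERS
 * (event_counters) and C_DEBUG_LATENCY_COUNTERS (latency_counters). */
static hpm_kind_t hpm_counter_kind(unsigned n,
                                   unsigned event_counters,
                                   unsigned latency_counters)
{
    unsigned idx;

    if (n < 3u || n > 31u) {
        return HPM_NOT_IMPLEMENTED;
    }
    idx = n - 3u;                      /* position among implemented counters */
    if (idx < event_counters) {
        return HPM_EVENT_COUNTER;      /* lower registers: event counters */
    }
    idx -= event_counters;             /* position among latency counters */
    if (idx >= 2u * latency_counters) {
        return HPM_NOT_IMPLEMENTED;
    }
    /* Latency counters come in pairs: total sum first, then min/max. */
    return (idx % 2u == 0u) ? HPM_LATENCY_SUM : HPM_LATENCY_MINMAX;
}

/* Number of the mhpmevent register that selects the event for
 * mhpmcounter<n>: a min/max counter uses the selector of the
 * total-sum counter that precedes it in the pair. */
static unsigned hpm_selector_index(unsigned n,
                                   unsigned event_counters,
                                   unsigned latency_counters)
{
    return (hpm_counter_kind(n, event_counters, latency_counters)
            == HPM_LATENCY_MINMAX) ? (n - 1u) : n;
}
```

With `event_counters = 5` and `latency_counters = 2`, the helper reproduces Table 8: counters 3 to 7 are event counters, 8 and 10 are total-sum latency counters, 9 and 11 are min/max latency counters, and counter 12 is not implemented.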