Figure 1. wdc
Description
Write into the data cache tag to invalidate or flush a cache line. The mnemonic wdc.flush is used to set the F bit, wdc.clear is used to set the T bit, wdc.clear.ea is used to set the T and EA bits, wdc.ext.flush is used to set the E, F, and T bits, and wdc.ext.clear is used to set the E and T bits.
When C_DCACHE_USE_WRITEBACK is set to 1:
- If the F bit is set, the instruction will flush and invalidate the cache line.
- Otherwise, the instruction will only invalidate the cache line and discard any data that has not been written to memory.
- If the T bit is set, only a cache line with a matching address is invalidated:
  - If the EA bit is set, register rA concatenated with rB is the extended address of the affected cache line.
  - Otherwise, register rA added with rB is the address of the affected cache line.
- The EA bit is only taken into account when the parameter C_ADDR_SIZE > 32.
- The E bit is not taken into account.
- The F and T bits cannot be used at the same time.
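The address selection for the write-back case can be modeled on a host machine as in the sketch below. This is not MicroBlaze code: the function name `wdc_wb_address` and its parameters are hypothetical, and the sketch assumes that "rA concatenated with rB" places rA in the upper 32 bits and rB in the lower 32 bits, and that the sum rA + rB wraps at 32 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-side model of how wdc forms its address when
 * C_DCACHE_USE_WRITEBACK = 1.
 * Assumption: rA supplies the upper 32 bits of the extended address
 * and rB the lower 32 bits. */
uint64_t wdc_wb_address(uint32_t ra, uint32_t rb, bool t, bool ea,
                        unsigned c_addr_size)
{
    /* EA is only honored together with T, and only when C_ADDR_SIZE > 32. */
    if (t && ea && c_addr_size > 32)
        return ((uint64_t)ra << 32) | rb;

    /* Otherwise the address is rA added with rB (assumed 32-bit wrap). */
    return (uint32_t)(ra + rb);
}
```

For example, with C_ADDR_SIZE = 40, rA = 0x1 and rB = 0x80000000, the model yields the extended address 0x180000000 when T and EA are both set, and the plain sum 0x80000001 otherwise.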
When C_DCACHE_USE_WRITEBACK is cleared to 0:
- If the E bit is not set, the instruction will invalidate the cache line. Register rA contains the address of the affected cache line, and the register rB value is not used.
- Otherwise, MicroBlaze will request that the matching address in an external cache be invalidated or flushed, depending on the value of the F bit, and will invalidate the affected internal cache line. Register rA added with rB is the address used both for the external cache and for the affected internal cache line.
- The E bit is only taken into account when the parameter C_INTERCONNECT is set to 3 (ACE).
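The write-through rules above can be summarized as a small decision function. The sketch below is a host-side model only; the type and function names (`wdc_wt_action`, `wdc_wt_decode`) are hypothetical, and it assumes that an E bit which is not taken into account behaves exactly like E = 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* What, if anything, is requested from the external cache. */
enum wdc_ext_request { WDC_EXT_NONE, WDC_EXT_INVALIDATE, WDC_EXT_FLUSH };

/* Hypothetical summary of one wdc instruction with
 * C_DCACHE_USE_WRITEBACK = 0. The internal cache line at `address`
 * is invalidated in every case. */
struct wdc_wt_action {
    uint32_t address;            /* address of the affected cache line */
    enum wdc_ext_request ext;    /* request sent to the external cache */
};

struct wdc_wt_action wdc_wt_decode(uint32_t ra, uint32_t rb,
                                   bool e, bool f,
                                   unsigned c_interconnect)
{
    struct wdc_wt_action a;
    /* E only matters when C_INTERCONNECT = 3 (ACE). */
    bool use_e = e && (c_interconnect == 3);

    if (!use_e) {
        a.address = ra;          /* rB is not used */
        a.ext = WDC_EXT_NONE;
    } else {
        a.address = ra + rb;     /* same address for both caches */
        a.ext = f ? WDC_EXT_FLUSH : WDC_EXT_INVALIDATE;
    }
    return a;
}
```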
When MicroBlaze is configured to use an MMU (C_USE_MMU >= 1) the instruction is privileged. This means that if the instruction is attempted in User Mode (MSR[UM] = 1) a Privileged Instruction exception occurs.
Pseudocode
if MSR[UM] = 1 then
  ESR[EC] ← 00111
else
  if C_DCACHE_USE_WRITEBACK = 1 then
    if T = 1 and EA = 1 then
      address ← (rA) & (rB)
    else
      address ← (rA) + (rB)
  else if E = 0 then
    address ← (rA)
  else
    address ← (rA) + (rB)
  if C_DCACHE_LINE_LEN = 4 then
    cacheline_mask ← (1 << log2(C_DCACHE_BYTE_SIZE) - 4) - 1
    cacheline ← (DCache Line)[(address >> 4) ˄ cacheline_mask]
    cacheline_addr ← address & 0xfffffff0
  if C_DCACHE_LINE_LEN = 8 then
    cacheline_mask ← (1 << log2(C_DCACHE_BYTE_SIZE) - 5) - 1
    cacheline ← (DCache Line)[(address >> 5) ˄ cacheline_mask]
    cacheline_addr ← address & 0xffffffe0
  if C_DCACHE_LINE_LEN = 16 then
    cacheline_mask ← (1 << log2(C_DCACHE_BYTE_SIZE) - 6) - 1
    cacheline ← (DCache Line)[(address >> 6) ˄ cacheline_mask]
    cacheline_addr ← address & 0xffffffc0
  if E = 0 and F = 1 and cacheline.Dirty then
    for i = 0 .. C_DCACHE_LINE_LEN - 1 loop
      if cacheline.Valid[i] then
        Mem(cacheline_addr + i * 4) ← cacheline.Data[i]
  if T = 0 or C_DCACHE_USE_WRITEBACK = 0 then
    cacheline.Tag ← 0
  else if cacheline.Address = cacheline_addr then
    cacheline.Tag ← 0
  if E = 1 then
    if F = 1 then
      request external cache flush with address
    else
      request external cache invalidate with address
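The three line-length cases in the pseudocode differ only in the shift amount (4, 5, or 6 bits, that is, log2 of the line size in bytes). A host-side C sketch of the index and line-address computation might look like the following. The names `wdc_line_ref` and `wdc_locate` are hypothetical, `byte_size` stands in for C_DCACHE_BYTE_SIZE (assumed to be a power of two), and `line_len` stands in for C_DCACHE_LINE_LEN in 32-bit words.

```c
#include <stdint.h>

/* Result of locating a cache line for a given 32-bit address. */
struct wdc_line_ref {
    uint32_t index;      /* index into the data cache line array */
    uint32_t line_addr;  /* address aligned down to the line start */
};

/* Hypothetical model of the cacheline_mask / cacheline_addr steps.
 * byte_size: cache size in bytes, assumed to be a power of two.
 * line_len:  cache line length in 32-bit words (4, 8, or 16). */
struct wdc_line_ref wdc_locate(uint32_t address, uint32_t byte_size,
                               uint32_t line_len)
{
    uint32_t line_bytes = line_len * 4u;        /* 16, 32, or 64 */
    uint32_t shift = 0;
    struct wdc_line_ref ref;

    while ((1u << shift) < line_bytes)          /* shift = log2(line_bytes) */
        shift++;

    /* (1 << (log2(byte_size) - shift)) - 1, i.e. number of lines - 1 */
    uint32_t mask = (byte_size >> shift) - 1u;

    ref.index = (address >> shift) & mask;
    ref.line_addr = address & ~(line_bytes - 1u);
    return ref;
}
```

With an 8 KiB cache and 4-word lines, address 0x12345678 maps to line index 0x167 and line address 0x12345670 in this model.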
Registers Altered
- ESR[EC], in case a privileged instruction exception is generated
Latency
- 2 cycles for wdc.clear
- 2 cycles for wdc with C_AREA_OPTIMIZED=0 or 2
- 3 cycles for wdc with C_AREA_OPTIMIZED=1
- 2 + N cycles for wdc.flush, where N is the number of clock cycles required to flush the cache line to memory when necessary
Notes
- The wdc, wdc.flush, wdc.clear and wdc.clear.ea instructions are independent of data cache enable (MSR[DCE]), and can be used either with the data cache enabled or disabled.
- The wdc.clear and wdc.clear.ea instructions are intended to invalidate a specific area in memory, for example a buffer to be written by a Direct Memory Access device. Using these instructions ensures that other cache lines are not inadvertently invalidated, erroneously discarding data that has not yet been written to memory.
- The address of the affected cache line is always the physical address, independent of the parameter C_USE_MMU and whether the MMU is in virtual mode or real mode.
- When using wdc.flush in a loop to flush the entire cache, the loop can be optimized by using rA as the cache base address and rB as the loop counter:
        addik r5,r0,C_DCACHE_BASEADDR
        addik r6,r0,C_DCACHE_BYTE_SIZE-C_DCACHE_LINE_LEN*4
  loop: wdc.flush r5,r6
        bgtid r6,loop
        addik r6,r6,-C_DCACHE_LINE_LEN*4
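To check that the flush loop touches every cache line exactly once, its control flow can be modeled on a host machine. The sketch below is an illustration under stated assumptions, not code for the target: the function name `wdc_flush_loop_addresses` is hypothetical, `line_bytes` stands for C_DCACHE_LINE_LEN*4, and the model assumes that the addik after bgtid sits in the branch delay slot and therefore runs on every iteration, including the last.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side model of the wdc.flush loop. It records the
 * address rA + rB used by each iteration in out[] (up to out_cap
 * entries) and returns the total number of iterations. */
size_t wdc_flush_loop_addresses(uint32_t base, uint32_t byte_size,
                                uint32_t line_bytes,
                                uint32_t *out, size_t out_cap)
{
    size_t count = 0;
    /* addik r6,r0,C_DCACHE_BYTE_SIZE-C_DCACHE_LINE_LEN*4 */
    int32_t rb = (int32_t)(byte_size - line_bytes);
    int taken;

    do {
        if (count < out_cap)
            out[count] = base + (uint32_t)rb;  /* wdc.flush r5,r6 */
        count++;
        taken = rb > 0;                        /* bgtid tests r6 first */
        rb -= (int32_t)line_bytes;             /* delay-slot addik */
    } while (taken);

    return count;
}
```

For a 128-byte cache with 32-byte lines, the model visits offsets 0x60, 0x40, 0x20, and 0x00 in that order, so the last line is handled first and the line at the base address last.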
- When using wdc.clear in a loop to invalidate a memory area in the cache, the loop can be optimized by using rA as the memory area base address and rB as the loop counter:
        addik r5,r0,memory_area_base_address
        addik r6,r0,memory_area_byte_size-C_DCACHE_LINE_LEN*4
  loop: wdc.clear r5,r6
        bgtid r6,loop
        addik r6,r6,-C_DCACHE_LINE_LEN*4
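The wdc.clear loop steps through the memory area in whole cache lines, so the values loaded into rA and rB are easiest to reason about when the area starts and ends on line boundaries. The following C sketch computes such values for an arbitrary buffer. It is an assumption-laden helper, not part of any MicroBlaze library: the names `wdc_range` and `wdc_clear_range` are hypothetical, `line_bytes` stands for C_DCACHE_LINE_LEN*4, and `size` is assumed to be nonzero. Widening an unaligned buffer to line boundaries also invalidates neighbouring data that shares its first or last cache line, which may discard data not yet written to memory.

```c
#include <stdint.h>

/* Initial register values for the wdc.clear loop. */
struct wdc_range {
    uint32_t base;   /* value for rA: address of the first cache line */
    uint32_t last;   /* initial value for rB: offset of the last line */
};

/* Hypothetical helper: widen [addr, addr + size) to whole cache lines.
 * line_bytes must be a power of two; size must be greater than zero. */
struct wdc_range wdc_clear_range(uint32_t addr, uint32_t size,
                                 uint32_t line_bytes)
{
    uint32_t mask = line_bytes - 1u;
    uint32_t first = addr & ~mask;                 /* round start down */
    uint32_t end = (addr + size + mask) & ~mask;   /* round end up */
    struct wdc_range r;

    r.base = first;
    r.last = end - first - line_bytes;
    return r;
}
```

For a 0x40-byte buffer at 0x1004 with 32-byte lines, the helper returns base 0x1000 and last offset 0x40, so the loop would cover the lines at 0x1000, 0x1020, and 0x1040.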