Multimedia use-cases that involve video codecs, such as audio/video conferencing, video-on-demand, playback, and record, also involve several other peripherals: Ethernet, video capture pipeline IP (image sensors and image signal processing engines), DMA engines, and display pipeline IP (video mixers and HDMI transmitters). Each of these uses its own interrupt line to communicate with the CPU.
In these scenarios, it is important to distribute the interrupt processing load across multiple CPU cores instead of using the same core for all peripherals/IP. Distributing the IRQs across CPU cores improves the latency and performance of the running use-case because the IRQ context switching and ISR handling load is spread across the cores.
The current distribution of interrupts across CPU cores can be viewed with:
$ cat /proc/interrupts
The Versal has 2 CPU cores available. When running a plain PetaLinux image without an irqbalance daemon, all IRQ requests are processed by CPU 0 by default. To have a different CPU core process a particular IRQ number, change the IRQ affinity for that interrupt. The IRQ affinity value defines which CPU cores are allowed to process that particular IRQ. For more information, see https://www.kernel.org/doc/Documentation/IRQ-affinity.txt.
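The IRQ affinity value for a particular interrupt number, for example IRQ 42, can be read with:
$ cat /proc/irq/42/smp_affinity
output: f
The IRQ affinity value can be changed by writing a new value, for example to route IRQ 42 to CPU 1 only:
echo 2 > /proc/irq/42/smp_affinity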
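The value written to smp_affinity is a hexadecimal bitmask in which bit N stands for CPU N, so 1 selects CPU 0, 2 selects CPU 1, and 3 allows both. As a minimal sketch, the helper below builds that mask from a list of CPU indices. The function name cpus_to_mask is hypothetical and is not part of any kernel or PetaLinux tooling.

```shell
# cpus_to_mask: print the hex smp_affinity mask for the given CPU indices.
# Hypothetical helper for illustration; bit N of the mask stands for CPU N.
cpus_to_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))
    done
    printf '%x\n' "$mask"
}

# Example (run as root on the target):
#   echo "$(cpus_to_mask 1)" > /proc/irq/42/smp_affinity
```

For example, `cpus_to_mask 0 1` prints 3, the mask that allows both cores of a two-core system.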
The following section shows how IRQ balancing can be performed before running a multistream video conferencing use-case that involves multiple peripherals and video IP.
Suppose several DMA channels capture different video streams, each using its own interrupt line, as depicted by the versal-dma blocks in the following figure.
As seen in the previous figure, all interrupt requests from the different peripherals go to CPU 0 by default.
To distribute the interrupt requests across different CPU cores as shown in the following figure, follow these steps:
- Find the IRQ numbers for each of the above peripherals.
root@vek280:~/# cat /proc/interrupts | grep al5
49: 1250127 47679 GICv2 127 Level a0120000.al5d
root@vek280:~/# cat /proc/interrupts | grep xilinx_frame
52: 18662 0 GICv2 122 Level xilinx_framebuffer
53: 19170 0 interrupt-controller@a0055000 3 Level -level xilinx_framebuffer
54: 18825 0 interrupt-controller@a0055000 0 Level -level xilinx_framebuffer
55: 18463 0 interrupt-controller@a0055000 1 Level -level xilinx_framebuffer
57: 0 0 GICv2 121 Level xilinx_framebuffer
root@vek280:~/# cat /proc/interrupts | grep xilinx-hdmi
56: 544834 0 GICv2 123 Level xilinx-hdmi-rx
58: 86730 0 GICv2 125 Level xilinx-hdmitxss
root@vek280:~/# cat /proc/interrupts | grep mixer
59: 86752 0 GICv2 128 Level xlnx-mixer
root@vek280:~/# cat /proc/interrupts | grep versal-dma
12: 42151036 0 GICv2 156 Level versal-dma
13: 31494805 10644207 GICv2 157 Level versal-dma
14: 31483922 0 GICv2 158 Level versal-dma
15: 31518024 0 GICv2 159 Level versal-dma
NOTE: There are multiple versal-dma interrupt lines. To find out which ones are in use, first run the use-case and then check which interrupt lines are being utilized.
The numbers on the left are the IRQ numbers for the respective peripherals.
- Assign CPU 0 to the VDU IRQ, number 49.
echo 1 > /proc/irq/49/smp_affinity #VDU
- Assign CPU 0 to HDMI RX and the framebuffer write IP.
echo 1 > /proc/irq/52/smp_affinity #Frame buffer
echo 1 > /proc/irq/56/smp_affinity #Primary HDMI Rx
- Assign CPU 1 to HDMI TX and the video mixer IP.
echo 2 > /proc/irq/58/smp_affinity #Tx
echo 2 > /proc/irq/59/smp_affinity #Mixer
By default, the interrupts for the video1 xilinx_framebuffer DMA engine and various other peripherals are already processed by CPU 0, so there is no need to modify smp_affinity for them. With the previous commands, the IRQs are distributed according to the scheme in the previous figure. To confirm this, run the following command while the use-case is running and check that the interrupts for each peripheral go to the intended CPU core:
$ cat /proc/interrupts
A similar scheme for distributing interrupts can be followed for other use-cases, depending on the peripherals being used, system load, and intended performance.
- Assign a unique CPU to each versal-dma channel if possible.
echo 1 > /proc/irq/12/smp_affinity #versal-dma1
echo 1 > /proc/irq/13/smp_affinity #versal-dma2
echo 2 > /proc/irq/14/smp_affinity #versal-dma3
echo 2 > /proc/irq/15/smp_affinity #versal-dma4
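The steps above can be sketched as a small script that looks up IRQ numbers by device name instead of hard-coding them, since the numbers can differ between designs. This is a sketch under stated assumptions, not the documented procedure: the helper names irqs_for, set_affinity, and pin_device are hypothetical, and the optional file and directory arguments exist only so the helpers can be tried against copies of the /proc data.

```shell
# irqs_for: print the IRQ numbers whose /proc/interrupts line contains $1.
# $2 optionally names a file to read instead of /proc/interrupts.
irqs_for() {
    awk -v pat="$1" '
        $1 ~ /^[0-9]+:$/ && index($0, pat) {
            sub(/:$/, "", $1)   # strip the trailing colon from "59:"
            print $1
        }
    ' "${2:-/proc/interrupts}"
}

# set_affinity: write hex CPU mask $2 to the smp_affinity file of IRQ $1.
# $3 optionally names a base directory to use instead of /proc/irq.
set_affinity() {
    target="${3:-/proc/irq}/$1/smp_affinity"
    if [ ! -w "$target" ]; then
        echo "IRQ $1: cannot write $target (not root, or no such IRQ)" >&2
        return 1
    fi
    echo "$2" > "$target"
}

# pin_device: route every IRQ whose line matches $1 to CPU mask $2.
pin_device() {
    for irq in $(irqs_for "$1"); do
        set_affinity "$irq" "$2"
    done
}

# Example, mirroring some of the assignments above (run as root on the target):
#   pin_device xilinx-hdmi-rx  1   # HDMI RX on CPU 0
#   pin_device xilinx-hdmitxss 2   # HDMI TX on CPU 1
#   pin_device xlnx-mixer      2   # video mixer on CPU 1
```

Because irqs_for matches by substring, a name such as xilinx_framebuffer or versal-dma selects every interrupt line that carries it. When only some of those lines should move, calling set_affinity with explicit IRQ numbers is the safer choice.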
By default, the interrupts for other peripherals are processed by CPU 0, so there is no need to modify smp_affinity for them. With the preceding commands, the IRQs are distributed according to the intended scheme, which can be confirmed by running the following command while the use-case is running:
cat /proc/interrupts
12: 42151036 0 0 0 GICv2 156 Level zynqmp-dma
13: 31494805 10644207 0 0 GICv2 157 Level zynqmp-dma
14: 31483922 0 10643127 0 GICv2 158 Level zynqmp-dma
15: 31518024 0 0 10595920 GICv2 159 Level versal-dma
49: 1250127 47679 0 0 GICv2 127 Level a0120000.al5d, a0100000.al5e
52: 18662 0 822 0 GICv2 122 Level xilinx_framebuffer
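Reading the per-CPU columns by eye gets tedious when many interrupt lines are involved. As a hedged sketch, the helper below prints the per-CPU counts for one IRQ number, taking the CPU column labels from the header row of /proc/interrupts. The name irq_counts and its optional file argument are hypothetical additions for illustration.

```shell
# irq_counts: print one "CPUn=count" line per CPU for IRQ number $1.
# $2 optionally names a file to read instead of /proc/interrupts.
irq_counts() {
    awk -v irq="$1:" '
        NR == 1 {                      # header row: CPU0 CPU1 ...
            ncpu = NF
            for (i = 1; i <= NF; i++) label[i] = $i
            next
        }
        $1 == irq {                    # counts follow the "NN:" field
            for (i = 1; i <= ncpu; i++)
                printf "%s=%s\n", label[i], $(i + 1)
        }
    ' "${2:-/proc/interrupts}"
}

# Example (on the target, while the use-case is running):
#   irq_counts 13
```

Running it twice a few seconds apart while the use-case is active shows which CPU's counter is increasing for that IRQ, which is the signal that the affinity change has taken effect.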