Video compression standards often use block-based motion estimation. To decode a 16x16 pixel block in the current frame, the decoder may receive a motion vector that points into a past or future reference frame, from which the best approximation of the current block can be fetched. As a result, the majority of DRAM accesses made by a video decoder are rectangular block fetches from an image frame buffer, at an arbitrary offset and alignment. The offset depends on the speed and direction of the motion in the video.
- DRAM pages are shaped into rectangular image regions, referred to as tiles.
- Each tile is large enough to maximize the probability of a block fetch being fully contained within the tile.
- Adjacent tiles in any image direction (left, right, top, bottom) belong to alternate bank groups.
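
The text does not say how bank groups are assigned to tiles. As a minimal sketch, assuming four bank groups selected by two bits, one taken from the tile column and one from the tile row (the function names and this bit assignment are hypothetical), the following shows one scheme in which every left, right, top, and bottom neighbor lands in a different bank group:

```python
def tile_bank_group(tile_x: int, tile_y: int) -> int:
    """Return a 2-bit bank group for the tile at (tile_x, tile_y).

    Assumption (not from the source): bit 0 comes from the tile column
    and bit 1 from the tile row.
    """
    return (tile_x & 1) | ((tile_y & 1) << 1)


def neighbors_alternate(tiles_wide: int, tiles_high: int) -> bool:
    """Check that each horizontally or vertically adjacent tile pair differs."""
    for ty in range(tiles_high):
        for tx in range(tiles_wide):
            here = tile_bank_group(tx, ty)
            # Checking the right and bottom neighbor of every tile
            # covers all adjacent pairs in the grid exactly once.
            if tx + 1 < tiles_wide and tile_bank_group(tx + 1, ty) == here:
                return False
            if ty + 1 < tiles_high and tile_bank_group(tx, ty + 1) == here:
                return False
    return True


if __name__ == "__main__":
    print(neighbors_alternate(8, 6))
```

With this assignment, a block that crosses a tile boundary always continues into a different bank group than the one it started in.
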
The following figure illustrates video decode frame buffer tiling. In the buffer, 50 8x8 blocks are randomly placed to model a video decoder motion-compensated block fetch.
It can be seen that most blocks fall within a single tile and therefore exhibit very efficient DRAM access. Some blocks span two tiles either horizontally or vertically, and exhibit somewhat reduced efficiency. On rare occasions, a block may fall on a corner of four tiles and will exhibit yet lower efficiency.
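
The observation above can be reproduced with a small simulation. The sketch below places 8x8 blocks at random positions and counts how many tiles each one touches. The tile size (64x32 pixels), the frame size, and all function names are assumptions made for illustration; the actual tile dimensions depend on the DRAM page size and pixel format.

```python
import random
from collections import Counter


def tiles_spanned(x, y, block_w, block_h, tile_w, tile_h):
    """Return the number of tiles touched by a block with top-left corner (x, y)."""
    cols = (x + block_w - 1) // tile_w - x // tile_w + 1
    rows = (y + block_h - 1) // tile_h - y // tile_h + 1
    return cols * rows


def simulate(num_blocks=50, frame_w=1920, frame_h=1080,
             block=8, tile_w=64, tile_h=32, seed=0):
    """Place blocks uniformly at random and tally tiles touched per block.

    tile_w and tile_h are hypothetical values, not taken from the source.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(num_blocks):
        x = rng.randrange(frame_w - block + 1)
        y = rng.randrange(frame_h - block + 1)
        counts[tiles_spanned(x, y, block, block, tile_w, tile_h)] += 1
    return counts


if __name__ == "__main__":
    for n_tiles, n_blocks in sorted(simulate().items()):
        print(f"{n_blocks} blocks touch {n_tiles} tile(s)")
```

A block that touches one tile stays within a single tile, two tiles means it crosses one horizontal or vertical boundary, and four tiles is the corner case.
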
- The frame buffer start address must be page-aligned.
- If the video decode line size is not a power-of-2 value (as in HD: 1920), the line length is rounded up to the next power-of-2 value (2048 in this example), and therefore some portion of each line is unused. The rounded-up value is referred to as the line stride, which is larger than the line length.
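
The two constraints above reduce to simple arithmetic. The following is a minimal sketch with hypothetical helper names; the page size is passed in by the caller because it is not stated here.

```python
def line_stride(line_length: int) -> int:
    """Round line_length up to the next power of two (unchanged if already one)."""
    if line_length < 1:
        raise ValueError("line_length must be positive")
    return 1 << (line_length - 1).bit_length()


def is_page_aligned(start_address: int, page_size: int) -> bool:
    """Check the frame buffer start address against a caller-supplied page size."""
    return start_address % page_size == 0


if __name__ == "__main__":
    length = 1920  # HD line length from the example above
    stride = line_stride(length)
    unused = stride - length
    print(f"stride={stride}, unused={unused} ({100 * unused / stride:.2f}% of stride)")
```

For the HD example, the stride is 2048 and 128 positions per line (6.25% of the stride) go unused.
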
Address Mapping | Efficiency [%] |
---|---|
BG-optimized RBC (16R-2B-1BG-7C-1BG-3C) | 64 |
RBC (16R-2B-2BG-10C) | 64 |
D4_64t4k (14R-2B-5C-1BG-3R-2C-1BG-3C) | 75 |
D4_64t2k (15R-2B-5C-1BG-2R-2C-1BG-3C) | 58 |
RCB (16R-7C-2B-2BG-3C) | 40 |
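
To make the mapping notation in the table concrete, the sketch below parses a string such as `16R-7C-2B-2BG-3C` into bit fields and splits a linear address accordingly. It assumes that fields are listed from most significant to least significant bit, and that repeated fields of the same kind (for example `7C` and `3C`) concatenate in the order listed; both are assumptions about the notation, and the function names are hypothetical.

```python
import re


def parse_mapping(spec: str):
    """Parse e.g. '16R-2B-2BG-10C' into [('R', 16), ('B', 2), ('BG', 2), ('C', 10)]."""
    fields = []
    for token in spec.split("-"):
        match = re.fullmatch(r"(\d+)(R|BG|B|C)", token)
        if match is None:
            raise ValueError(f"unrecognized field: {token!r}")
        fields.append((match.group(2), int(match.group(1))))
    return fields


def decode(address: int, spec: str):
    """Split a linear address into row, bank, bank group, and column values.

    Assumes the first field in spec occupies the most significant bits.
    """
    fields = parse_mapping(spec)
    shift = sum(width for _, width in fields)
    result = {"R": 0, "B": 0, "BG": 0, "C": 0}
    for name, width in fields:
        shift -= width
        chunk = (address >> shift) & ((1 << width) - 1)
        # Append this chunk below any bits already collected for the field.
        result[name] = (result[name] << width) | chunk
    return result


if __name__ == "__main__":
    print(decode(0x8, "16R-7C-2B-2BG-3C"))
    print(decode(0x8, "16R-2B-2BG-10C"))
```

Under these assumptions, address bit 3 selects a bank group in the RCB mapping but is an ordinary column bit in the RBC mapping, which is the kind of difference the efficiency figures reflect.
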