H.264/H.265 Video Codec Unit v1.2
Introduction
Features
IP Facts
Overview
Applications
Unsupported Features
Licensing and Ordering
Product Specification
Standards
Performance
Resource Utilization
Core Interfaces
Port Descriptions
Common Interface Signals
Register Space
Core Architecture
Encoder Block
Features
Functional Description
Interfaces and Ports
Clocking
Reset
MCU Subsystem
Data Path
Control Path
Encoder Buffer
Encoder Buffer Requirements
Memory Requirements
DDR Memory Footprint Requirements
CMA Size Requirements
Memory Footprint Calculation
Memory Bandwidth
Source Frame Format
Encoder Block Register Overview
Improving VCU Encoder Quality
Decoder Block
Features
Functional Description
Interfaces and Ports
Clocking
Reset
Data Path
Control Path
Decoder Buffer Requirements
Memory Footprint Requirements
CMA Size Requirements
Memory Bandwidth
Memory Format
VCU Decoder
Decoder Block Register Overview
Microcontroller Unit Overview
Functional Description
Interfaces and Ports
Control Flow
MCU Register Overview
AXI Performance Monitor
Overview
Functional Description
Operating Timing Window Generation
Start/Stop Mode
Fixed Duration Timing Window
APM Registers
Designing with the Core
General Design Guidelines
Interrupts
Clocking and Resets
Functional Description
Clocking
PLL Overview
Generation of Primary Clock
VCO Frequency and MF Value
Reset Sequence
PLL Integer Divider Programming
Reset
Clocking and Reset Registers
Design Flow Steps
Customizing and Generating the Core
Basic Configuration Tab
Advanced Configuration Tab
Decoder Configuration for Multi-Stream Use Cases
Interfacing the Core with Zynq UltraScale+ MPSoC Devices
Enabling PL-DDR for VCU
Build Steps
Steps to Run The Commands
Constraining the Core
Synthesis and Implementation
Simulation
Zynq UltraScale+ EV Architecture Video Codec Unit DDR4 LogiCORE IP v1.1
Introduction
IP Facts
Overview
DDR4 SDRAM Feature Summary
Licensing and Ordering
Product Specification
Standards
Performance
Resource Utilization
Port Descriptions
AXI Ports
Core Design Signals
Core Architecture
Overview
Memory Controller
Read and Write Coalescing
Reordering
Read-Modify-Write Flow
PHY
Designing with the Core
Clocking
Requirements
Connectivity with VCU DDR4 Controller IP
Design Flow Steps
Customizing the VCU DDR4 Controller
Supported Configuration
Connecting with VCU LogiCORE IP
Constraining the Core
VCU Sync IP v1.0
Introduction
IP Facts
Overview
Features
Licensing and Ordering
Product Specification
Standards
Performance
Resource Utilization
Port Descriptions
Controller AXI Ports
Consumer AXI Ports
Producer AXI Ports
Consumer Master AXI Ports
Mapping for AXI Ports
Core Architecture
Sync IP Implementation Overview
Designing with the Core
Clocking
Resets
Connectivity with VCU Sync IP
Software Flow Overview
Software Architecture and Buffer Synchronization Mechanism
Sync IP Software Programming Model
Design Flow Steps
Customizing and Generating the Core
Customizing the VCU Sync IP
Supported Configuration
Connecting the Sync IP with the VCU LogiCORE IP
Constraining the Core
Using the Release Package
Performance and Debugging
Latency in the VCU Pipeline
Glass-to-Glass Latency
VCU Latency Modes
Xilinx Low Latency Limitations
Encoder and Decoder Latencies with Xilinx Low Latency Mode
Recommended Parameters for Xilinx Low-latency Mode
VCU End-to-End Latency
Usage
Latency
Measuring Total Latency of a Pipeline
Checking Reported Latency of the Pipeline
Checking Instantaneous Latencies
Checking Reported Latencies of Individual Elements of Pipeline
VCU Encoder Latency
VCU Decoder Latency
Debugging
Finding Help on Xilinx.com
Documentation
Answer Records
Master Answer Record for the Core
Technical Support
Debug Tools
Vivado Design Suite Debug Feature
Reference Boards
Hardware Debug
General Checks
Debugging a VCU-based System
Debug Flow
Troubleshooting
Debugging External Interfaces
Debugging the VCU Interface
Debugging Control Software Application
Debugging GStreamer Based Application
Debugging Performance Issues
Debugging Latency Issues
Interface Debug
AXI4-Lite Interfaces
AXI4-Stream Interfaces
Application Software Development
Overview
Software Prerequisites
Encoder and Decoder Software Features
Encoder Software Features
GStreamer Encoding Parameters
Tier and Level Limits
Max Supported Num Slices for 1080p and 4k Resolution
Decoder Software Features
GStreamer Decoding Parameters
Decoder Maximum Bit Rate Tests
Max-Bitrate Benchmarking
GStreamer and V4L2 Formats
Preparing PetaLinux to Run VCU Applications
Integrating the VCU and GStreamer Patches
VCU Encoder Features
Dynamic-Bitrate
Dynamic GOP
Region of Interest Encoding
LongTerm Reference Picture
Adaptive GOP
Insertion of SEI Data at Encoder
SEI Decoder API
Dual Pass Encoding
Scene Change Detection
Interlaced Video
GDR Intra Refresh
Dynamic Resolution Change — VCU Encoder/Decoder
VCU Decoder
VCU Encoder
Frameskip Support for the VCU Encoder
GStreamer Pipeline
32-Streams Support
DCI-4k Encode/Decode
Temporal-ID support for VCU Encoder
Intra MB forcing using External QP-Table/LOAD_QP Mode
Decoder Meta-data Transfer Using 1-to-1 Relation Between Input and Output Buffer
DEFAULT_GOP_B and PYRAMIDAL_GOP_B GOP Control Modes
Adaptive Deblocking Filter Parameters Update
LOAD_QP Support at GStreamer
DMA_PROXY Module Usage
XAVC
Supported Profiles for XAVC Encoder
Supported Profiles for XAVC Decoder
XAVC Examples
HDR10
GStreamer Pipelines
H.264 Decoding
H.265 Decoding
High Bitrate Bitstream Decoding
H.264 Encoding
H.265 Encoding
Transcode from H.264 to H.265
Transcode from H.265 to H.264
Multistream Encoding Decoding Simultaneously
Multistream Decoding
Multistream Encoding
Transcoding and Streaming via Ethernet
Streaming via Ethernet and Decoding to the Display Pipeline
Recommended Settings for Streaming
Verified GStreamer Elements
Verified Containers Using GStreamer
Verified Streaming Protocols Using GStreamer
OpenMax Integration Layer
OpenMax Integration Layer Sample Applications
Sync IP API
Structure and Function Definitions
SyncIp
SyncChannel
EncSyncChannel, DecSyncChannel
int xvfbsync_syncip_chan_populate (SyncIp *syncip, SyncChannel * syncip_chan, uint32_t fd)
int xvfbsync_enc_sync_chan_populate (EncSyncChannel * enc_sync_chan, SyncChannel *sync_chan, uint32_t hardware_horizontal_stride_alignment, uint32_t hardware_vertical_stride_alignment)
int xvfbsync_enc_sync_chan_enable (EncSyncChannel * enc_sync_chan)
int xvfbsync_enc_sync_chan_set_intr_mask (EncSyncChannel * enc_sync_chan, ChannelIntr * intr_mask)
int xvfbsync_enc_sync_chan_add_buffer (EncSyncChannel * enc_sync_chan, XLNXLLBuf * buf)
int xvfbsync_syncip_reset_err_status (EncSyncChannel * enc_sync_chan, ChannelErrIntr *err_intr)
int xvfbsync_enc_sync_chan_depopulate (EncSyncChannel * enc_sync_chan)
Programming Sequence of Synchronization IP
VCU Control Software
Xilinx VCU Control Software API
Error Checking and Reporting
Memory Management
AL_Buffer
void AL_Buffer_Ref(AL_TBuffer* hBuf)
void AL_Buffer_Unref(AL_TBuffer* hBuf)
AL_TAllocator
AL_AllocatorVtable
void AL_Buffer_SetData(const AL_TBuffer* hBuf, uint8_t* pData)
AL_HANDLE AL_Allocator_Alloc(AL_TAllocator* pAllocator, size_t zSize)
AL_VADDR AL_Allocator_GetVirtualAddr(AL_TAllocator* pAllocator, AL_HANDLE hBuf)
AL_PADDR AL_Allocator_GetPhysicalAddr(AL_TAllocator* pAllocator, AL_HANDLE hBuf)
AL_EMetaType
AL_TStreamMetaData* AL_StreamMetaData_Create(uint16_t uMaxNumSection)
void AL_StreamMetaData_ClearAllSections(AL_TStreamMetaData* pMetaData)
uint16_t AL_StreamMetaData_AddSection(AL_TStreamMetaData* pMetaData, uint32_t uOffset, uint32_t uLength, uint32_t uFlags)
AL_Section_Flags
bool AL_Buffer_AddMetaData(AL_TBuffer* pBuf, AL_TMetaData* pMeta)
AL_TMetaData* AL_Buffer_GetMetaData(AL_TBuffer* pBuf, AL_EMetaType eType)
bool AL_Buffer_RemoveMetaData(AL_TBuffer* pBuf, AL_TMetaData* pMeta)
AL_TBuffer* AL_Buffer_Create_And_Allocate(AL_TAllocator* pAllocator, size_t zSize, PFN_RefCount_CallBack pCallBack)
void AL_Buffer_Destroy(AL_TBuffer* pBuf)
AL_TBuffer* AL_Buffer_WrapData(uint8_t* pData, size_t zSize, PFN_RefCount_CallBack pCallBack)
bool AL_Allocator_Free(AL_TAllocator* pAllocator, AL_HANDLE hBuf)
AL_TAllocator* DmaAlloc_Create(const char* deviceFile)
bool AL_Allocator_Destroy(AL_TAllocator* pAllocator)
int AL_GetAllocSize_DecReference(AL_TDimension tDim, AL_EChromaMode eChromaMode, uint8_t uBitDepth, AL_EFbStorageMode eFrameBufferStorageMode)
int AL_CalculatePitchValue(int iWidth, uint8_t uBitDepth, AL_EFbStorageMode eStorageMode)
Data Format Conversion
void I0AL_To_I420(AL_TBuffer const* pSrc, AL_TBuffer* pDst)
AL_EChromaMode AL_GetChromaMode(TFourCC tFourCC)
FOURCC(A)
AL_EPicFormat
AL_GET_BITDEPTH_LUMA(PicFmt)
AL_GET_BITDEPTH_CHROMA(PicFmt)
AL_GET_BITDEPTH(PicFmt)
AL_GET_CHROMA_MODE(PicFmt)
AL_SET_BITDEPTH_LUMA(PicFmt, BitDepth)
AL_SET_BITDEPTH_CHROMA(PicFmt, BitDepth)
AL_SET_BITDEPTH(PicFmt, BitDepth)
AL_SET_CHROMA_MODE(PicFmt, BitDepth)
uint8_t AL_GetBitDepth(TFourCC tFourCC)
int GetPixelSize(uint8_t uBitDepth)
AL_EFbStorageMode GetStorageMode(TFourCC tFourCC)
AL_EFbStorageMode AL_GetSrcStorageMode(AL_ESrcMode eSrcMode)
int GetNumLinesInPitch(AL_EFbStorageMode eFrameBufferStorageMode)
AL_IS_AVC(Prof)
AL_IS_HEVC(Prof)
AL_IS_STILL_PROFILE(Prof)
bool AL_IsSemiPlanar(TFourCC tFourCC)
bool AL_IsTiled(TFourCC tFourCC)
bool AL_Is32x4Tiled(TFourCC tFourCC)
bool AL_Is64x4Tiled(TFourCC tFourCC)
bool AL_Is10bitPacked(TFourCC tFourCC)
void AL_GetSubsampling(TFourCC fourcc, int* sx, int* sy)
TFourCC AL_EncGetSrcFourCC(AL_TPicFormat const picFmt)
Driver* AL_GetHardwareDriver()
TScheduler* AL_SchedulerMcu_Create(Driver* driver, AL_TAllocator* pDmaAllocator)
TScheduler* AL_SchedulerMcu_Destroy(AL_TSchedulerMcu* schedulerMcu)
AL_ECodec AL_GetCodec(AL_EProfile eProf)
VCU Control Software Sample Applications
VCU Control Software Encoder Parameters
Input Parameters
Dynamic Input Section Parameters
Output Parameters
Rate Control Parameters
Group of Pictures Parameters
Settings Parameters
Run Parameters
Quantization Parameter (QP) File Format
Load QP
Lambda File Format
Command File Format
ROI File Format
Driver
MCU Firmware
Encoder and Decoder Stacks
Decoder Stack
Encoder Stack
Encoder Flow
Encoder API
AL_ERR AL_Encoder_GetLastError(AL_HEncoder hEnc)
AL_TEncSettings
AL_EChEncOptions
AL_ERR AL_Encoder_Create(AL_HEncoder* hEnc, TScheduler* pScheduler, AL_TAllocator* pAlloc, AL_TEncSettings const* pSettings, AL_CB_EndEncoding callback)
AL_CB_EndEncoding
void AL_Settings_SetDefaults(AL_TEncSettings* pSettings)
void AL_Settings_SetDefaultParam(AL_TEncSettings* pSettings)
int AL_Settings_CheckValidity(AL_TEncSettings* pSettings, AL_TEncChanParam * pChParam, FILE* pOut)
int AL_Settings_CheckCoherency(AL_TEncSettings * pSettings, AL_TEncChanParam * pChParam, TFourCC tFourCC, FILE * pOut)
int AL_GetMitigatedMaxNalSize(AL_TDimension tDim, AL_EChromaMode eMode, int iBitDepth)
AL_TBufPoolConfig
bool AL_Encoder_PutStreamBuffer(AL_HEncoder hEnc, AL_TBuffer* pStream)
bool AL_Encoder_Process(AL_HEncoder hEnc, AL_TBuffer* pFrame, AL_TBuffer* pQpTable)
void AL_Encoder_Destroy(AL_HEncoder hEnc)
AL_EBufMode
int AL_Encoder_AddSei(AL_HEncoder hEnc, AL_TBuffer* pStream, bool isPrefix, int iPayloadType, uint8_t* pPayload, int iPayloadSize, int iTempId);
int AL_Encoder_NotifySceneChange(AL_HEncoder hEnc, int iAhead)
void AL_Encoder_NotifyUseLongTerm(AL_HEncoder hEnc)
void AL_Encoder_NotifyIsLongTerm(AL_HEncoder hEnc)
bool AL_Encoder_RestartGop(AL_HEncoder hEnc);
bool AL_Encoder_SetGopLength(AL_HEncoder hEnc, int iGopLength)
bool AL_Encoder_SetGopNumB(AL_HEncoder hEnc, int iNumB)
bool AL_Encoder_SetBitRate(AL_HEncoder hEnc, int iBitRate)
bool AL_Encoder_SetFrameRate(AL_HEncoder hEnc, uint16_t uFrameRate, uint16_t uClkRatio)
bool AL_Encoder_SetInputResolution (AL_HEncoder hEnc, AL_TDimension tDim);
bool AL_Encoder_SetQP(AL_HEncoder hEnc, int16_t iQP)
bool AL_Encoder_GetRecPicture(AL_HEncoder hEnc, TRecPic* pRecPic);
void AL_Encoder_ReleaseRecPicture(AL_HEncoder hEnc, TRecPic* pRecPic)
Decoder Flow
Decoder API
AL_ERR AL_Decoder_Create(AL_HDecoder* hDec, AL_TIDecChannel* pDecChannel, AL_TAllocator* pAllocator, AL_TDecSettings* pSettings, AL_TDecCallBacks* pCB)
AL_TDecCallBacks
AL_CB_EndDecoding
AL_CB_Display
AL_CB_ParsedSei
AL_CB_ResolutionFound
AL_TIDecChannel* AL_DecChannelMcu_Create()
AL_TIDecChannel
AL_CB_EndFrameDecoding
void AL_Decoder_Destroy(AL_HDecoder hDec)
void AL_Decoder_PutDisplayPicture(AL_HDecoder hDec, AL_TBuffer* pDisplay)
uint32_t AL_Decoder_GetMinPitch(uint32_t uWidth, uint8_t uBitDepth, AL_EFbStorageMode eFrameBufferStorageMode);
uint32_t AL_Decoder_GetMinStrideHeight(uint32_t uHeight);
bool AL_Decoder_PreallocateBuffers(AL_HDecoder hDec)
void AL_Default_Decoder_EndDecoding(void* pUserParam, AL_TDecPicStatus* pStatus)
AL_TDecPicStatus
AL_TDecSettings
AL_TSrcMetaData* AL_SrcMetaData_Create(AL_TDimension tDim, AL_TPlane tYPlane, AL_TPlane tUVPlane, TFourCC tFourCC);
int AL_SrcMetaData_GetLumaSize(AL_TSrcMetaData* pMeta);
int AL_SrcMetaData_GetChromaSize(AL_TSrcMetaData* pMeta);
bool AL_Decoder_PushBuffer(AL_HDecoder hDec, AL_TBuffer* pBuf, size_t uSize)
void AL_Decoder_Flush(AL_HDecoder hDec)
void AL_Decoder_GetMaxBD(AL_HDecoder hDec)
int32_t RndPitch(int32_t iWidth, uint8_t uBitDepth, AL_EFbStorageMode eFrameBufferStorageMode)
int32_t RndHeight(int32_t iHeight)
int AL_GetNumLCU(AL_TDimension tDim, uint8_t uLCUSize)
int AL_GetAllocSize_HevcCompData(AL_TDimension tDim, AL_EChromaMode eChromaMode)
int AL_GetAllocSize_AvcCompData(AL_TDimension tDim, AL_EChromaMode eChromaMode)
int AL_GetAllocSize_DecCompMap(AL_TDimension tDim);
int AL_GetAllocSize_HevcMV(AL_TDimension tDim)
int AL_GetAllocSize_AvcMV(AL_TDimension tDim)
int AL_GetAllocSize_Frame(AL_TDimension tDim, AL_EChromaMode eChromaMode, uint8_t uBitDepth, bool bFrameBufferCompression, AL_EFbStorageMode eFrameBufferStorageMode)
int AL_GetAllocSize_DecReference(AL_TDimension tDim, int iPitch, AL_EChromaMode eChromaMode, AL_EFbStorageMode eFrameBufferStorageMode)
uint32_t GetAllocSizeEP2(AL_TDimension tDim, uint8_t uMaxCuSize)
uint32_t AL_GetAllocSize(AL_TDimension tDim, uint8_t uBitDepth, AL_EChromaMode eChromaMode, AL_EFbStorageMode eStorageMode)
uint32_t GetAllocSize_Src(AL_TDimension tDim, uint8_t uBitDepth, AL_EChromaMode eChromaMode, AL_ESrcMode eSrcFmt)
int AL_CalculatePitchValue(int iWidth, uint8_t uBitDepth, AL_EFbStorageMode eStorageMode)
AL_EFbStorageMode AL_GetSrcStorageMode(AL_ESrcMode eSrcMode)
AL_ERR AL_Decoder_GetFrameError(AL_HDecoder hDec, AL_TBuffer* pBuf);
AL_ERR AL_Decoder_GetLastError(AL_HDecoder hDec)
Tuning Visual Quality
Optimum VCU Encoder Parameters for Use Cases
Video Streaming
AVC Encoder Performance Settings
Low Bitrate AVC Encoding
Example Design
VCU Out of the Box Examples
Verifying the Examples
Boot the Board (ZCU106/ZCU104) Using the PetaLinux BSP
Example-1: (Decode → Display)
Example-2: (USB-Camera → Encode → Decode → Display)
Example-3: (USB-Camera → Decode → Display)
Example-4: (Transcode → to File)
Example-5: (Transcode → Stream out using Ethernet ... Streaming In → Decode → Display)
Example-6: (Camera Audio/Video → Encode → File A/V Record)
Example-7: (Camera Audio/Video → Stream out using Ethernet ... Streaming In → Decode → Display with Audio)
Running Examples with Jupyter Notebook
Determining Pulse Audio and ALSA Device Names
Appendices
Codec Parameters for Different Use Cases
QoS Configurations
IRQ Balancing
Upgrading
2020.2 VCU Ctrl-SW API Migration
Modified APIs
New APIs
2020.1 VCU Ctrl-SW API Migration
Modified APIs
New APIs
Removed APIs
2019.2 VCU Ctrl-SW API Migration
Modified APIs
New APIs
2019.1 VCU Ctrl-SW API Migration
Modified APIs
New APIs
Removed APIs
Verification, Compliance, and Interoperability
Additional Resources and Legal Notices
Xilinx Resources
Documentation Navigator and Design Hubs
References
Training Resources
Revision History
Please Read: Important Legal Notices
Video streaming use cases require a very stable bitrate across all pictures.
Avoid periodic large Intra pictures during the encoding session.
Low-latency rate control (hardware RC) is the preferred control-rate for video streaming because it tries to keep frame sizes equal across all pictures.
Avoid periodic Intra frames. Instead, use low-delay-p (IPPPPP…) with Intra refresh enabled (gdr-mode=horizontal or vertical).
VBR is not a preferred mode for streaming.
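The recommendation against periodic Intra pictures can be illustrated with a little arithmetic. The following is a minimal sketch, not part of the VCU software: the function names and the assumption that one Intra frame per GOP is a fixed multiple of a P-frame size are hypothetical and chosen only for illustration. It compares the average per-frame bit budget with the size of a periodic Intra frame, and estimates how long each takes to send over a link running at the target bitrate.

```c
#include <assert.h>
#include <stdint.h>

/* Average bits available per frame at a given target bitrate and frame rate. */
static uint64_t avg_frame_bits(uint64_t bitrate_bps, uint32_t fps)
{
    assert(fps > 0);
    return bitrate_bps / fps;
}

/*
 * Size in bits of the single Intra frame in a GOP, assuming (hypothetically)
 * that it is `ratio` times larger than each of the (gop_len - 1) P frames
 * and that the GOP as a whole still meets the target bitrate.
 */
static uint64_t periodic_intra_frame_bits(uint64_t bitrate_bps, uint32_t fps,
                                          uint32_t gop_len, uint32_t ratio)
{
    assert(fps > 0 && gop_len > 0 && ratio > 0);
    uint64_t gop_bits = bitrate_bps * gop_len / fps;     /* bit budget of one GOP */
    uint64_t p_bits = gop_bits / (gop_len - 1 + ratio);  /* size of one P frame   */
    return p_bits * ratio;
}

/* Time in milliseconds needed to send `bits` over a link of `bitrate_bps`. */
static uint64_t transmit_ms(uint64_t bits, uint64_t bitrate_bps)
{
    assert(bitrate_bps > 0);
    return bits * 1000 / bitrate_bps;
}
```

For example, at 6 Mb/s and 30 frames per second the average budget is 200,000 bits per frame, which takes about 33 ms to send. With a GOP length of 30 and an assumed Intra-to-P size ratio of 11, the periodic Intra frame is 1,650,000 bits and takes 275 ms to send, roughly eight frame intervals. Spreading the Intra refresh across frames with low-delay-p and gdr-mode avoids this burst.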