The Semi-Ternary CAM Search IP core (STCAM) is a member of the family of CAMs provided by Xilinx®. The family consists of four members:
- Binary CAM (BCAM)
  - Used for exact matching. See the Binary CAM Search LogiCORE IP Product Guide (PG317).
- Semi TCAM (STCAM)
  - Described in this document. The STCAM is fully flexible in terms of the number, size, and position of wildcard (don't care) fields. Every key bit has a corresponding mask bit. The number of unique masks is limited, however, which allows considerable memory and logic optimizations.
- Ternary CAM (TCAM)
  - The TCAM is primarily used for tables that require full flexibility in the size and position of wildcard (don't care) fields. Every key bit has a corresponding mask bit stored together with the key, and every entry can have a different mask. TCAMs are used for Access Control List (ACL) type lookups, which require a large number of different masks. See the Ternary CAM Search LogiCORE IP Product Guide (PG318).
One or multiple instances of each type can be used inside the same FPGA. Different types can also be mixed inside the same FPGA. Each CAM type is optimized for its specific task in terms of hardware resource usage.
The STCAM stores (key, mask, priority, response) entries in either UltraRAM (URAM) or block RAM. The STCAM provides efficient use of FPGA resources compared to basic TCAM implementations that store the keys in flip-flops and use logic resources for parallel key comparison.
The Lookup interface of the STCAM receives a lookup key and outputs a result that contains a match flag indicating whether the masked lookup key matches the masked key of any entry in the STCAM. The width of the mask is the same as the key width. A cleared mask bit invalidates the corresponding key bit and makes it "don't care". Both the lookup key and the stored key are bit-wise ANDed with the mask prior to the bit-wise matching. The STCAM is pipelined so that it can process a Lookup Request every clock cycle.
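
The matching rule described above can be illustrated with a small software model. This is only a sketch of the comparison semantics, not the hardware implementation or part of the delivered API; the function name `masked_match` and the byte-array key layout are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns true when lookup_key and stored_key agree on
 * every bit whose mask bit is set. Bits with a cleared mask bit are
 * "don't care". All three buffers hold num_bytes bytes. */
bool masked_match(const uint8_t *lookup_key,
                  const uint8_t *stored_key,
                  const uint8_t *mask,
                  size_t num_bytes)
{
    assert(lookup_key != NULL && stored_key != NULL && mask != NULL);
    for (size_t i = 0; i < num_bytes; i++) {
        /* AND both keys with the mask, then compare bit-wise. */
        if ((lookup_key[i] & mask[i]) != (stored_key[i] & mask[i])) {
            return false;
        }
    }
    return true;
}
```

With a stored key of `0xAB 0xCD` and a mask of `0xFF 0xF0`, the lookup key `0xAB 0xC7` matches because the low four bits of the second byte are masked out, while `0xAA 0xCD` does not match.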
If multiple entries match, the response value of the matching entry with the lowest priority is output. If two matching entries have the same priority, one of them is arbitrarily picked as the winner. The API software ensures that two entries with the same masked key value cannot be inserted.
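
The priority rule can be sketched as a sequential scan over a table of entries. The struct `model_entry_t` and the function `model_lookup` are hypothetical names for a behavioral model, with keys limited to 64 bits for brevity; the hardware evaluates entries in a pipeline rather than in a loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of one (key, mask, priority, response) entry. */
typedef struct {
    uint64_t key;
    uint64_t mask;
    uint32_t priority;
    uint32_t response;
} model_entry_t;

/* Scans all entries and writes the response of the matching entry with the
 * lowest priority value to *response. Returns false when no entry matches,
 * in which case *response is left unchanged. On a priority tie this model
 * keeps the first entry found; the text above only promises an arbitrary
 * winner. */
bool model_lookup(const model_entry_t *entries, size_t num_entries,
                  uint64_t lookup_key, uint32_t *response)
{
    assert(entries != NULL || num_entries == 0);
    assert(response != NULL);
    bool found = false;
    uint32_t best_priority = 0;
    for (size_t i = 0; i < num_entries; i++) {
        const model_entry_t *e = &entries[i];
        if ((lookup_key & e->mask) != (e->key & e->mask)) {
            continue; /* masked keys differ: not a match */
        }
        if (!found || e->priority < best_priority) {
            found = true;
            best_priority = e->priority;
            *response = e->response;
        }
    }
    return found;
}
```

Code that depends on a particular winner among equal-priority matches should assign distinct priorities instead of relying on the tie behavior of this model.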
The entries are read and written using a set of high-level Application Programming Interface (API) functions. The API functions are written in C and delivered as part of the IP. The API encapsulates the details of memory management and register access and provides a simple and efficient management interface. The API software, with detailed documentation, is found on the CAM IP product page. The user only provides the API with functions for basic hardware reads and writes. This allows flexible hardware mapping: the communication link between the API software and the hardware can be designed to the user's specifications. The link could, for instance, be AXI4-Lite or PCIe®.
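
The idea of user-supplied hardware access functions can be sketched as a pair of callbacks behind a small struct. Every name below (`hw_io_t`, `hw_write_fn`, `hw_read_fn`, `sim_device_t`, and the helpers) is hypothetical and chosen for illustration; the actual callback signatures are defined by the API documentation on the CAM IP product page. The device is simulated with an array in host memory so the example runs without hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback types; the delivered API defines its own. */
typedef void (*hw_write_fn)(void *ctx, uint32_t address, uint32_t data);
typedef uint32_t (*hw_read_fn)(void *ctx, uint32_t address);

typedef struct {
    void *ctx;          /* opaque handle passed back to the callbacks */
    hw_write_fn write;
    hw_read_fn read;
} hw_io_t;

/* Stand-in for the device: a small register file in host memory. A real
 * backend would issue AXI4-Lite or PCIe accesses here instead. */
#define SIM_NUM_REGS 64u
typedef struct {
    uint32_t regs[SIM_NUM_REGS];
} sim_device_t;

void sim_write(void *ctx, uint32_t address, uint32_t data)
{
    sim_device_t *dev = ctx;
    assert(address % 4u == 0u && address / 4u < SIM_NUM_REGS);
    dev->regs[address / 4u] = data;
}

uint32_t sim_read(void *ctx, uint32_t address)
{
    sim_device_t *dev = ctx;
    assert(address % 4u == 0u && address / 4u < SIM_NUM_REGS);
    return dev->regs[address / 4u];
}

/* Management code sees only the hw_io_t, never the transport behind it. */
void io_write(const hw_io_t *io, uint32_t address, uint32_t data)
{
    io->write(io->ctx, address, data);
}

uint32_t io_read(const hw_io_t *io, uint32_t address)
{
    return io->read(io->ctx, address);
}
```

Swapping the transport then means supplying a different pair of functions and context pointer, while the code that calls `io_write` and `io_read` stays the same.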
Parameter Name | Valid Range | Description |
---|---|---|
KEY_WIDTH | 10-992 bits | The width of the lookup key. KEY_WIDTH + RESPONSE_WIDTH + PRIORITY_WIDTH + 1 cannot exceed 1536 (block RAM) or 2048 (UltraRAM). |
RESPONSE_WIDTH | 1-1024 bits | The width of the lookup response. RESPONSE_WIDTH + PRIORITY_WIDTH cannot exceed 1024. KEY_WIDTH + RESPONSE_WIDTH + PRIORITY_WIDTH + 1 cannot exceed 1536 (block RAM) or 2048 (UltraRAM). |
PRIORITY_WIDTH | 0-32 bits | The width of the priority assigned to each entry. RESPONSE_WIDTH + PRIORITY_WIDTH cannot exceed 1024. |
NUM_MASKS | 2-255 | The number of unique masks. The CAM compiler generates a CAM supporting both the specified number of unique masks and the specified number of entries at the same time. |
NUM_ENTRIES | 1 - 1.25M | The supported number of entries (depth). The number of entries is only limited by the amount of memory available in an FPGA SLR. The CAM compiler generates a CAM supporting both the specified number of unique masks and the specified number of entries at the same time. |
MEMORY_PRIMITIVE | BLOCK or ULTRA or AUTO | The compiler selects the best suited type automatically. This can however be overridden as a user preference. |
LOOKUP_RATE | 15 - 600 Mlps | This is the supported lookup rate of the instance (expressed in million lookups per second). In order to save resources it is important not to set the lookup rate higher than required. |
LOOKUP_INTERFACE_FREQ | 15-600 MHz | This is the clock frequency of the Lookup Request and response interfaces. LOOKUP_INTERFACE_FREQ >= LOOKUP_RATE |
RAM_FREQ | 15-600 MHz | This is the clock frequency of the memories and the internal datapath. An optional, high frequency RAM clock enables time division of the hardware resources, leading to significant savings. See the TDM_FACTOR parameter. |
TDM_FACTOR | 1, 2, 4, 8, 16, or 32 | The TDM_FACTOR is calculated as RAM_FREQ / LOOKUP_RATE, rounded down to the nearest power of two. Example: RAM clock frequency = 600 MHz, lookup rate = 150 Mlps → TDM_FACTOR = 600 / 150 = 4. The RAM can be accessed four times per lookup, reducing RAM and logic resources by up to a factor of four for small table configurations. |
CLOCKING_MODE | SINGLE-CLOCK or DUAL_CLOCK | The use of a separate RAM clock is optional. If RAM_FREQ = LOOKUP_INTERFACE_FREQ, then the single clock mode is enabled. In single clock mode only the lookup interface clock is used for lookup interfaces, RAM and match logic. |
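
The TDM_FACTOR rule from the table can be worked through in a few lines of C. This is a sketch of the arithmetic only; the function name `tdm_factor` is hypothetical. Capping the result at 32 and requiring RAM_FREQ >= LOOKUP_RATE are assumptions inferred from the listed valid values, not statements taken from the tool.

```c
#include <assert.h>
#include <stdint.h>

/* Returns RAM_FREQ / LOOKUP_RATE rounded down to a power of two.
 * Assumptions: ram_freq_mhz >= lookup_rate_mlps, and the result is capped
 * at 32, the largest value listed for TDM_FACTOR. */
uint32_t tdm_factor(uint32_t ram_freq_mhz, uint32_t lookup_rate_mlps)
{
    assert(lookup_rate_mlps > 0u && ram_freq_mhz >= lookup_rate_mlps);
    /* Integer division floors the ratio, which does not change the
     * largest power of two that fits below it. */
    uint32_t ratio = ram_freq_mhz / lookup_rate_mlps;
    uint32_t factor = 1u;
    while (factor * 2u <= ratio && factor < 32u) {
        factor *= 2u;
    }
    return factor;
}
```

For the example in the table, `tdm_factor(600, 150)` yields 4. A ratio that is not a power of two is rounded down, so `tdm_factor(600, 100)` also yields 4 even though the ratio is 6.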
All of these parameters are extracted from the P4 code and VitisNetP4 tool during compilation. If the STCAM is used without P4, these parameters need to be set prior to generating the hardware STCAM or calling the software STCAM API. VitisNetP4 ensures that the parameters used to generate the hardware STCAM and those used to create the software STCAM instance are synchronized. For standalone usage, the user must guarantee that the parameters used to generate the hardware STCAM and the parameters used to call the software STCAM API are identical.
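
For standalone usage, where the parameters are set by hand, a sanity check along the following lines can catch values that violate the ranges and sum constraints in the table before hardware generation or API setup. The struct `stcam_params_t`, the enum `mem_primitive_t`, and the function `stcam_params_valid` are hypothetical and not part of the delivered API. The sketch assumes the memory type has already been resolved to block RAM or UltraRAM (AUTO is not modeled), and it omits NUM_ENTRIES because that limit depends on the memory available in the target device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { MEM_BLOCK, MEM_ULTRA } mem_primitive_t;

/* Hypothetical container; field names mirror the parameter table. */
typedef struct {
    uint32_t key_width;             /* bits */
    uint32_t response_width;        /* bits */
    uint32_t priority_width;        /* bits */
    uint32_t num_masks;
    uint32_t lookup_rate;           /* Mlps */
    uint32_t lookup_interface_freq; /* MHz */
    uint32_t ram_freq;              /* MHz */
    mem_primitive_t memory_primitive;
} stcam_params_t;

static bool in_range(uint32_t v, uint32_t lo, uint32_t hi)
{
    return v >= lo && v <= hi;
}

/* Returns true when every range and sum constraint from the table holds. */
bool stcam_params_valid(const stcam_params_t *p)
{
    assert(p != NULL);
    /* Total entry width limit: 1536 for block RAM, 2048 for UltraRAM. */
    uint32_t total_limit = (p->memory_primitive == MEM_ULTRA) ? 2048u : 1536u;

    if (!in_range(p->key_width, 10u, 992u)) return false;
    if (!in_range(p->response_width, 1u, 1024u)) return false;
    if (p->priority_width > 32u) return false;
    if (p->response_width + p->priority_width > 1024u) return false;
    if (p->key_width + p->response_width + p->priority_width + 1u > total_limit)
        return false;
    if (!in_range(p->num_masks, 2u, 255u)) return false;
    if (!in_range(p->lookup_rate, 15u, 600u)) return false;
    if (!in_range(p->lookup_interface_freq, 15u, 600u)) return false;
    if (!in_range(p->ram_freq, 15u, 600u)) return false;
    /* The interface clock must be able to carry the requested lookup rate. */
    if (p->lookup_interface_freq < p->lookup_rate) return false;
    return true;
}
```

Running the same check on the parameter set used for hardware generation and on the one passed to the software side is one simple way to notice a mismatch early, although it does not replace comparing the two sets field by field.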