Overview - 2.5 English

Semi-Ternary CAM Search v2.5 LogiCORE IP Product Guide (PG319)

Document ID: PG319
Release Date: 2023-05-16
Version: 2.5 English

The Semi-Ternary CAM Search IP core (STCAM) is a member of the family of CAMs provided by AMD. The family consists of four members:

Binary CAM (BCAM)
Used for exact matching. See the Binary CAM Search LogiCORE IP Product Guide (PG317).
Hardware-managed binary CAM (CBCAM)
Entries can be inserted and deleted through a hardware interface, without a software driver. Software-managed inserts and deletes are also supported. The CBCAM software driver keeps no shadow memory of the CAM contents and therefore requires less memory and fewer CPU resources. See the Binary CAM Search LogiCORE IP Product Guide (PG317).
Semi TCAM (STCAM)
Described in this document. The STCAM is fully flexible in terms of number, size and position of wildcard (don't care) fields. Every key bit has a corresponding mask bit. The number of allowed unique masks is however limited. This allows for considerable memory and logic optimizations.
Ternary CAM (TCAM)
The primary usage of TCAM is tables requiring full flexibility in terms of size and position of wildcard (don't care) fields. Every key bit has a corresponding mask bit stored together with the key. All entries can have different masks. TCAMs are used for Access Control List (ACL) type of lookups, requiring a large number of different masks. See the Ternary CAM Search LogiCORE IP Product Guide (PG318).

One or multiple instances of each type can be used inside the same FPGA. Different types can also be mixed inside the same FPGA. Each CAM type is optimized for its specific task in terms of hardware resource usage.

The STCAM stores (key, mask, priority, response) entries in either UltraRAM (URAM) or block RAM. The STCAM provides efficient use of FPGA resources compared to basic TCAM implementations that store the keys in flip-flops and use logic resources for parallel key comparison.

The Lookup interface of the STCAM receives a lookup key and outputs a result that contains a match flag indicating whether the masked lookup key matches the masked key of any entry in the STCAM. The width of the mask is the same as the key width. A cleared mask bit invalidates the corresponding key bit and makes it "don't care". Both the lookup key and the stored key are bit-wise ANDed with the mask prior to the bit-wise matching. The STCAM is pipelined so that it can process a Lookup Request every clock cycle.
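The masked comparison described above can be sketched as a small software model. This is only an illustration of the matching rule, not the hardware implementation; the function name `masked_match` and the 8-bit example values are assumptions made for the example.

```python
def masked_match(lookup_key: int, stored_key: int, mask: int) -> bool:
    """Return True when the lookup key matches the stored key under the mask.

    A cleared (0) mask bit makes the corresponding key bit "don't care".
    Both keys are ANDed with the mask before the bit-wise comparison.
    """
    return (lookup_key & mask) == (stored_key & mask)


# Hypothetical 8-bit entry: the upper nibble must equal 0b1010,
# the lower nibble is wildcarded by the cleared mask bits.
STORED_KEY = 0b1010_0000
MASK = 0b1111_0000

hit = masked_match(0b1010_1111, STORED_KEY, MASK)   # lower nibble ignored
miss = masked_match(0b1011_1111, STORED_KEY, MASK)  # upper nibble differs
```

In this sketch `hit` is true because only the masked upper nibble is compared, and `miss` is false because one of the unmasked bits differs.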

If multiple entries match, the response value of the matching entry with the lowest priority is output. If two entries have the same priority, one of them is arbitrarily picked as the winner. The API software ensures that two entries with the same masked key value cannot be inserted.
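The resolution rule for multiple matches can be modeled in a few lines. This is a behavioral sketch under stated assumptions, not the delivered API: the class `StcamModel` and its methods are hypothetical, "lowest priority" is interpreted as the smallest numeric priority value, and the duplicate check treats two entries as equal when both the mask and the masked key are identical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Entry(NamedTuple):
    key: int
    mask: int
    priority: int
    response: int


class StcamModel:
    """Hypothetical software model of STCAM lookup semantics."""

    def __init__(self) -> None:
        self.entries: List[Entry] = []

    def insert(self, key: int, mask: int, priority: int, response: int) -> None:
        # Assumption: a duplicate means same mask and same masked key.
        masked = key & mask
        for e in self.entries:
            if e.mask == mask and (e.key & e.mask) == masked:
                raise ValueError("entry with the same masked key already exists")
        self.entries.append(Entry(key, mask, priority, response))

    def lookup(self, lookup_key: int) -> Tuple[bool, Optional[int]]:
        # Collect every entry whose masked key equals the masked lookup key.
        hits = [e for e in self.entries
                if (lookup_key & e.mask) == (e.key & e.mask)]
        if not hits:
            return False, None
        # Assumption: the smallest numeric priority wins. On a tie, min()
        # returns the first hit, which stands in for the arbitrary choice.
        winner = min(hits, key=lambda e: e.priority)
        return True, winner.response
```

For example, with a wildcard entry `(key=0xA0, mask=0xF0, priority=5)` and an exact entry `(key=0xAB, mask=0xFF, priority=1)`, a lookup of `0xAB` matches both and returns the response of the exact entry, while `0xAC` matches only the wildcard entry.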

The entries are read and written using a set of high-level API functions. The API functions are written in C and delivered as part of the IP. The API encapsulates the details of memory management and register access and provides a simple and efficient management interface. The API software, with detailed documentation, is found on the CAM IP product page. The user only provides the API with functions for basic hardware reads and writes. This allows for flexible hardware mapping, and the communication link between the API software and the hardware can be designed to the user's specifications. The communication link could, for instance, be AXI4-Lite or PCIe®.
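The division of labor described above, where the user supplies only basic read and write primitives, can be illustrated with a small callback-injection sketch. This is not the delivered C API and does not reproduce its function names or signatures; `ManagementLayer`, `FakeRegisterBus`, and the 32-bit registers with a 4-byte address stride are all assumptions made for the example.

```python
from typing import Callable, Dict, List

HwWrite = Callable[[int, int], None]  # (address, value)
HwRead = Callable[[int], int]         # (address) -> value


class ManagementLayer:
    """Hypothetical stand-in for management software that only touches
    hardware through the two user-supplied callbacks."""

    def __init__(self, hw_write: HwWrite, hw_read: HwRead) -> None:
        self._hw_write = hw_write
        self._hw_read = hw_read

    def write_words(self, base_address: int, words: List[int]) -> None:
        # Assumption: 32-bit registers at consecutive 4-byte addresses.
        for offset, word in enumerate(words):
            self._hw_write(base_address + 4 * offset, word)

    def read_words(self, base_address: int, count: int) -> List[int]:
        return [self._hw_read(base_address + 4 * i) for i in range(count)]


class FakeRegisterBus:
    """In-memory substitute for a real link such as AXI4-Lite or PCIe."""

    def __init__(self) -> None:
        self.registers: Dict[int, int] = {}

    def write(self, address: int, value: int) -> None:
        self.registers[address] = value & 0xFFFFFFFF

    def read(self, address: int) -> int:
        return self.registers.get(address, 0)
```

Because the management layer never assumes a particular transport, swapping `FakeRegisterBus` for a real bus driver would only change the two callbacks that are passed in.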

The STCAM design is highly configurable at compile time to make it suitable for a large variety of applications. The table below lists the configuration parameters.
Table 1. Configuration Parameters
KEY_WIDTH
Valid range: 10-992 bits. The width of the lookup key. KEY_WIDTH + RESPONSE_WIDTH + PRIORITY_WIDTH + 1 cannot exceed 1536 (block RAM) or 2048 (URAM).
RESPONSE_WIDTH
Valid range: 1-1024 bits. The width of the lookup response. KEY_WIDTH + RESPONSE_WIDTH + PRIORITY_WIDTH + 1 cannot exceed 1536 (block RAM) or 2048 (URAM).
PRIORITY_WIDTH
Valid range: 0-32 bits. The width of the priority assigned to each entry.
NUM_MASKS
Valid range: 2-255. The number of unique masks. The CAM compiler generates a CAM supporting both the specified number of unique masks and the specified number of entries at the same time.
NUM_ENTRIES
Valid range: 1-1.25M. The supported number of entries (depth). The number of entries is only limited by the available amount of memory in an FPGA SLR. The CAM compiler generates a CAM supporting both the specified number of unique masks and the specified number of entries at the same time.
MEMORY_PRIMITIVE
Valid values: BLOCK, ULTRA, or AUTO. The compiler selects the best suited type automatically. This can, however, be overridden as a user preference.
LOOKUP_RATE
Valid range: 15-600 Mlps. The supported lookup rate of the instance, expressed in million lookups per second. To save resources, it is important not to set the lookup rate higher than required.
LOOKUP_INTERFACE_FREQ
Valid range: 15-600 MHz. The clock frequency of the Lookup Request and response interfaces. LOOKUP_INTERFACE_FREQ >= LOOKUP_RATE.
RAM_FREQ
Valid range: 15-600 MHz. The clock frequency of the memories and the internal datapath. An optional, high frequency RAM clock enables time division of the hardware resources, leading to significant savings. See the TDM_FACTOR parameter.
TDM_FACTOR
Valid values: 1, 2, 4, 8, 16, or 32. The TDM_FACTOR is calculated as RAM_FREQ / LOOKUP_RATE, with the ratio rounded downwards to the nearest power of two. Example: RAM clock frequency = 600 MHz, lookup rate = 150 Mlps → TDM_FACTOR = 600 / 150 = 4. The RAM can be accessed four times per lookup, reducing RAM and logic resources by up to a factor of four for small table configurations.
CLOCKING_MODE
Valid values: SINGLE-CLOCK or DUAL_CLOCK. The use of a separate RAM clock is optional. If RAM_FREQ = LOOKUP_INTERFACE_FREQ, then the single clock mode is enabled. In single clock mode, only the lookup interface clock is used for lookup interfaces, RAM and match logic.
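The arithmetic relationships between the parameters above can be checked with a short script before generating an instance. This is a sketch under stated assumptions, not part of the CAM compiler: the function names are hypothetical, the AUTO memory primitive is not modeled, a ratio above 32 is assumed to clamp to 32, and a RAM frequency below the lookup rate is assumed to be invalid.

```python
# Entry width limits from Table 1, in bits, per memory primitive.
MAX_ENTRY_BITS = {"BLOCK": 1536, "ULTRA": 2048}
VALID_TDM_FACTORS = (1, 2, 4, 8, 16, 32)


def entry_width_ok(key_width: int, response_width: int,
                   priority_width: int, memory_primitive: str) -> bool:
    """Check KEY_WIDTH + RESPONSE_WIDTH + PRIORITY_WIDTH + 1 against the limit."""
    total = key_width + response_width + priority_width + 1
    return total <= MAX_ENTRY_BITS[memory_primitive]


def tdm_factor(ram_freq_mhz: int, lookup_rate_mlps: int) -> int:
    """Round RAM_FREQ / LOOKUP_RATE down to the nearest valid power of two."""
    if lookup_rate_mlps <= 0 or ram_freq_mhz < lookup_rate_mlps:
        # Assumption: a ratio below 1 has no valid TDM_FACTOR.
        raise ValueError("RAM_FREQ must be at least LOOKUP_RATE")
    factor = 1
    for candidate in VALID_TDM_FACTORS:
        # Multiplying instead of dividing avoids floating-point rounding.
        if candidate * lookup_rate_mlps <= ram_freq_mhz:
            factor = candidate
    return factor


def clocking_mode(ram_freq_mhz: int, lookup_interface_freq_mhz: int) -> str:
    """Single clock mode applies when both frequencies are equal."""
    if ram_freq_mhz == lookup_interface_freq_mhz:
        return "SINGLE-CLOCK"
    return "DUAL_CLOCK"


example_factor = tdm_factor(600, 150)  # the worked example from Table 1
```

With the values from the table's example, a 600 MHz RAM clock and a 150 Mlps lookup rate give a factor of 4. A ratio that is not a power of two, such as 600 / 200 = 3, rounds down to 2.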

All of these parameters are extracted from the P4 code and VitisNetP4 tool during compilation. If the STCAM is used without P4, these parameters need to be set prior to generating the hardware STCAM or calling the software STCAM API. VitisNetP4 ensures that the parameters used to generate the hardware STCAM and those used to create the software STCAM instance are synchronized. For standalone usage, you must guarantee that the parameters used to generate the hardware STCAM and the parameters used to call the software STCAM API are identical.
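For standalone usage, one way to guard against mismatched settings is to compare the two parameter sets before bring-up. The helper below is a minimal sketch; the function name and the dictionary representation of the parameters are assumptions, not something the IP or its API provides.

```python
from typing import Any, Dict, Tuple


def parameter_mismatches(hw_params: Dict[str, Any],
                         sw_params: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {name: (hardware_value, software_value)} for every difference.

    A parameter missing on one side is reported with None for that side.
    """
    names = sorted(set(hw_params) | set(sw_params))
    return {
        name: (hw_params.get(name), sw_params.get(name))
        for name in names
        if hw_params.get(name) != sw_params.get(name)
    }


# Hypothetical settings: the software side was configured with fewer masks.
hardware = {"KEY_WIDTH": 128, "RESPONSE_WIDTH": 32, "NUM_MASKS": 16}
software = {"KEY_WIDTH": 128, "RESPONSE_WIDTH": 32, "NUM_MASKS": 8}
differences = parameter_mismatches(hardware, software)
```

An empty result means the two parameter sets agree. Keeping a single source for both sets avoids the comparison altogether.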