The Semi-Ternary CAM Search IP core (STCAM) is a member of the family of CAMs provided by AMD. The family consists of four members:
- Binary CAM (BCAM)
  - Used for exact matching. BCAM is available in two versions: a software-managed version and a hardware-managed version (CBCAM). CBCAM offers you the flexibility to insert or delete entries using a hardware interface, with or without a software driver. For more information, see Binary CAM Search LogiCORE IP Product Guide (PG317).
- Cached DRAM Binary CAM (CDBCAM)
  - Used for exact matching. CDBCAM is similar to BCAM except that it uses DRAM as the primary storage for entries, whereas BCAM uses URAM or block RAM (BRAM). CDBCAM can store more entries and, in combination with its on-chip BCAM cache, can achieve lookup rates comparable to the BCAM. Like the BCAM, the CDBCAM supports both a software-managed and a hardware-managed interface. For further information, see Cached DRAM Binary CAM LogiCORE IP Product Guide (PG427).
- Semi-Ternary CAM (STCAM)
  - Described in this document. It is also available in two versions: one with fixed rate and latency, and one with variable rate and latency for low-cost applications. The low-cost version also supports ranges, which avoids the costly explosion of entries otherwise needed to cover a range. The STCAM is fully flexible in terms of the number, size, and position of wildcard (ignored) fields. Every key bit has a corresponding mask bit. The number of allowed unique masks is, however, limited, which allows for considerable memory and logic optimizations.
- Ternary CAM (TCAM)
  - Primarily used for tables requiring full flexibility in terms of the size and position of wildcard (ignored) fields. Every key bit has a corresponding mask bit stored together with the key. All entries can have different masks. TCAMs are used for Access Control List (ACL) type lookups, which require a large number of different masks. See the Ternary CAM Search LogiCORE IP Product Guide (PG318).
One or multiple instances of each type can be used inside the same FPGA. Different types can also be mixed inside the same FPGA. Each CAM type is optimized for its specific task in terms of hardware resource usage.
The STCAM stores (key, mask, priority, response) entries in either UltraRAM (URAM) or block RAM.
The Lookup interface of the STCAM receives a lookup key and outputs a result that contains a match flag indicating whether the masked lookup key matches the masked key of any entry in the STCAM. The width of the mask is the same as the key width. A cleared mask bit marks the corresponding key bit as a wildcard, so that bit is ignored in the comparison. Both the lookup key and the stored key are bit-wise ANDed with the mask prior to the bit-wise matching. The STCAM is pipelined so that it can process a Lookup Request every clock cycle.
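
The masked comparison described above can be modeled in software as follows. This is a minimal sketch for illustration only: the function name `stcam_masked_match` and the byte-array key layout are assumptions and are not part of the delivered STCAM API or hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model, not part of the STCAM API.
 * Returns true when the lookup key and the stored key agree on every
 * bit whose mask bit is set. Cleared mask bits act as wildcards. */
bool stcam_masked_match(const uint8_t *lookup_key,
                        const uint8_t *stored_key,
                        const uint8_t *mask,
                        size_t num_bytes)
{
    assert(lookup_key != NULL && stored_key != NULL && mask != NULL);

    for (size_t i = 0; i < num_bytes; i++) {
        /* AND both keys with the mask, then compare bit-wise. */
        if ((lookup_key[i] & mask[i]) != (stored_key[i] & mask[i])) {
            return false;
        }
    }
    return true;
}
```

With a stored key of `{0xAB, 0xCD}` and a mask of `{0xFF, 0x00}`, any lookup key whose first byte is `0xAB` matches, because the second byte is fully wildcarded.
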
If multiple entries are matched, the response value of the matching entry with the lowest priority is output. If two entries have the same priority, one of them is arbitrarily picked as the winner. The API software ensures that two entries with the same masked key value cannot be inserted.
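
The priority rule can be illustrated with a small behavioral model. This is a sketch under stated assumptions: the `model_entry_t` layout, the 32-bit field widths, and the linear scan are hypothetical and chosen for readability; they do not reflect how the core stores or searches entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative entry layout (assumption); the real core stores wider,
 * packed (key, mask, priority, response) fields. */
typedef struct {
    uint32_t key;
    uint32_t mask;
    uint32_t priority;
    uint32_t response;
} model_entry_t;

/* Returns true on a match and writes the response of the matching
 * entry with the lowest priority value to *response. */
bool model_lookup(const model_entry_t *table, size_t num_entries,
                  uint32_t lookup_key, uint32_t *response)
{
    assert(table != NULL && response != NULL);

    bool matched = false;
    uint32_t best_priority = 0;

    for (size_t i = 0; i < num_entries; i++) {
        const model_entry_t *e = &table[i];
        if ((lookup_key & e->mask) != (e->key & e->mask)) {
            continue; /* masked keys differ: no match for this entry */
        }
        /* Strict '<' keeps the first of several equal-priority matches.
         * This tie-break is a modeling choice; the text above only says
         * the winner is arbitrary in that case. */
        if (!matched || e->priority < best_priority) {
            matched = true;
            best_priority = e->priority;
            *response = e->response;
        }
    }
    return matched;
}
```

For example, with one entry matching on the top 8 bits at priority 5 and another matching on the top 16 bits at priority 2, a lookup key that satisfies both returns the response of the priority 2 entry.
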
The entries are read and written using a set of high-level API functions. The API functions are written in C and delivered as part of the IP. The API encapsulates the details of memory management and register access and provides a simple and efficient management interface. The API software, with detailed documentation, is found on the CAM IP product page. The user only provides the functions for basic hardware reads and writes to the API. This allows for flexible hardware mapping, and the communication link between the API software and the hardware can be designed to the user's specifications. The communication link could, for instance, be AXI4-Lite or PCIe®.
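
As a rough illustration of what user-provided hardware access functions might look like, the sketch below defines a pair of read and write callbacks backed by a memory-mapped register window. All type and function names here (`hw_access_t`, `mmio_ctx_t`, `mmio_write`, `mmio_read`) are hypothetical; the actual callback signatures are defined in the API documentation delivered with the IP and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback types; the delivered API may use different
 * names and signatures. */
typedef void (*hw_write_fn)(void *ctx, uint32_t address, uint32_t data);
typedef uint32_t (*hw_read_fn)(void *ctx, uint32_t address);

/* Bundle handed to the management software (assumption). */
typedef struct {
    void *ctx;
    hw_write_fn write;
    hw_read_fn read;
} hw_access_t;

/* Example backend: a memory-mapped window of 32-bit registers, for
 * instance one reached over AXI4-Lite. */
typedef struct {
    volatile uint32_t *base;
    size_t num_words;
} mmio_ctx_t;

void mmio_write(void *ctx, uint32_t address, uint32_t data)
{
    mmio_ctx_t *m = (mmio_ctx_t *)ctx;
    /* Byte address must be word aligned and inside the window. */
    assert(address % 4u == 0u && address / 4u < m->num_words);
    m->base[address / 4u] = data;
}

uint32_t mmio_read(void *ctx, uint32_t address)
{
    mmio_ctx_t *m = (mmio_ctx_t *)ctx;
    assert(address % 4u == 0u && address / 4u < m->num_words);
    return m->base[address / 4u];
}
```

Keeping the transport behind two small functions means the same management code could run over a different link, such as PCIe, by swapping only the backend.
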
Parameter Name | Valid Range | Description |
---|---|---|
KEY_WIDTH | 10–992 bits | The width of the lookup key. KEY_WIDTH + RESPONSE_WIDTH + PRIORITY_WIDTH + CTRL cannot exceed 2048, where CTRL = 1 (VARIABLE_RATE = FALSE) or CTRL = up to 11 + total width of ranges (VARIABLE_RATE = TRUE). |
RESPONSE_WIDTH | 1–1024 bits | The width of the lookup response. KEY_WIDTH + RESPONSE_WIDTH + PRIORITY_WIDTH + CTRL cannot exceed 2048, where CTRL = 1 (VARIABLE_RATE = FALSE) or CTRL = up to 11 + total width of ranges (VARIABLE_RATE = TRUE). |
VARIABLE_RATE | TRUE/FALSE | There are two use cases of the STCAM. |
FORMAT_STRING | NA | When a format string is used, ranges are enabled and the key width is specified by means of the format string. For more information, see the Format String section in Designing with the Core. |
PRIORITY_WIDTH | 0–32 bits | The width of the priority assigned to each entry. |
NUM_MASKS | 1–256 (VARIABLE_RATE = FALSE); 1–1024 (VARIABLE_RATE = TRUE) | The number of unique masks. The CAM compiler generates a CAM supporting both the specified number of unique masks and the specified number of entries at the same time. NUM_MASKS can be omitted if a format string is used. If omitted, NUM_MASKS is set to a default value based on conservative ACLs using NUM_ENTRIES rules. |
NUM_ENTRIES | 1–1M | The supported number of entries (depth). The CAM compiler generates a CAM supporting both the specified number of unique masks and the specified number of entries at the same time. |
MEMORY_PRIMITIVE | BLOCK or ULTRA or AUTO | The compiler selects the best suited type automatically. This can however be overridden as a user preference. |
LOOKUP_RATE | 1–600 Mlps | This is the supported lookup rate of the instance (expressed in million lookups per second). To save resources, it is important not to set the lookup rate higher than required. For VARIABLE_RATE = TRUE, the specified LOOKUP_RATE is sustained if the following conditions are met for the lookup key: a maximum of 2 overlapping rules on average for TDM_FACTOR = 1, or a maximum of TDM_FACTOR overlapping rules on average for TDM_FACTOR > 1. |
LOOKUP_INTERFACE_FREQ | 1–600 MHz | This is the clock frequency of the Lookup Request and response interfaces. LOOKUP_INTERFACE_FREQ >= LOOKUP_RATE. |
RAM_FREQ | 1–600 MHz | This is the clock frequency of the memories and the internal datapath. An optional, high frequency RAM clock enables time division of the hardware resources, leading to significant savings. See the TDM_FACTOR parameter. |
TDM_FACTOR | 1–256 | The TDM_FACTOR is calculated as RAM_FREQ / LOOKUP_RATE. The ratio is rounded downwards to the nearest power of two. Example: RAM clock frequency = 600, Lookup rate = 150 → TDM_FACTOR = 600 / 150 = 4. The RAM can be accessed four times per lookup, saving up to four times the RAM and logic resources for small table configurations. |
CLOCKING_MODE | SINGLE_CLOCK or DUAL_CLOCK | The use of a separate RAM clock is optional. If RAM_FREQ = LOOKUP_INTERFACE_FREQ, then the single clock mode is enabled. In single clock mode, only the lookup interface clock is used for lookup interfaces, RAM and match logic. |
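
The TDM_FACTOR rule from the table can be sketched as a small helper. The function name `tdm_factor` is hypothetical, and clamping the result to 256 is an assumption based on the valid range listed for the parameter; the CAM compiler performs the authoritative calculation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: RAM_FREQ / LOOKUP_RATE rounded downwards to the
 * nearest power of two. The upper clamp of 256 is an assumption taken
 * from the parameter's valid range. */
uint32_t tdm_factor(uint32_t ram_freq_mhz, uint32_t lookup_rate_mlps)
{
    assert(lookup_rate_mlps > 0u);

    /* Integer division floors the ratio, which does not change the
     * largest power of two that fits below it. */
    uint32_t ratio = ram_freq_mhz / lookup_rate_mlps;
    uint32_t factor = 1u;

    while (factor < 256u && factor * 2u <= ratio) {
        factor *= 2u;
    }
    return factor;
}
```

For the example in the table, `tdm_factor(600, 150)` gives 4. A ratio that is not a power of two, such as 600 / 100 = 6, is also rounded down to 4.
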
All of these parameters are extracted from the P4 code and VitisNetP4 tool during compilation. If the STCAM is used without P4, these parameters need to be set prior to generating the hardware STCAM or calling the software STCAM API. VitisNetP4 ensures that the parameters used to generate the hardware STCAM and those used to create the software STCAM instance are synchronized. For standalone usage, you must guarantee that the parameters used to generate the hardware STCAM and the parameters used to call the software STCAM API are identical.
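
For standalone usage, one way to reduce the risk of mismatched settings is to keep the parameters in a single structure and sanity-check it against the constraints in the table before using it for either hardware generation or API setup. The sketch below is illustrative only: `stcam_params_t` and `stcam_params_valid` are hypothetical names, only a subset of the parameters is covered, and the CTRL width is supplied by the caller because it depends on VARIABLE_RATE and the configured ranges.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameter bundle covering a subset of the table. */
typedef struct {
    uint32_t key_width;                 /* 10 to 992 bits */
    uint32_t response_width;            /* 1 to 1024 bits */
    uint32_t priority_width;            /* 0 to 32 bits */
    uint32_t ctrl_width;                /* 1 when VARIABLE_RATE = FALSE */
    uint32_t lookup_rate_mlps;          /* 1 to 600 Mlps */
    uint32_t lookup_interface_freq_mhz; /* 1 to 600 MHz */
} stcam_params_t;

/* Returns true when the subset of constraints modeled here holds. */
bool stcam_params_valid(const stcam_params_t *p)
{
    assert(p != NULL);

    if (p->key_width < 10u || p->key_width > 992u) return false;
    if (p->response_width < 1u || p->response_width > 1024u) return false;
    if (p->priority_width > 32u) return false;
    if (p->ctrl_width > 2048u) return false;

    /* KEY_WIDTH + RESPONSE_WIDTH + PRIORITY_WIDTH + CTRL <= 2048 */
    uint32_t total = p->key_width + p->response_width +
                     p->priority_width + p->ctrl_width;
    if (total > 2048u) return false;

    if (p->lookup_rate_mlps < 1u || p->lookup_rate_mlps > 600u) return false;
    if (p->lookup_interface_freq_mhz > 600u) return false;

    /* LOOKUP_INTERFACE_FREQ >= LOOKUP_RATE */
    return p->lookup_interface_freq_mhz >= p->lookup_rate_mlps;
}
```

Note that the individual maxima cannot all be used together: 992 + 1024 + 32 + 1 = 2049, which exceeds the 2048 limit, so such a configuration is rejected by the combined-width check.
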