Implementation on FPGA - 2023.1 English

Vitis Libraries

Release Date: 2023-12-20
Version: 2023.1 English

The summary of reference 2.5.1 above describes the processing steps of the Poly1305 algorithm. To improve timing and reduce latency in the HLS implementation, the multiplication and modulo operations of the algorithm are optimized as follows:

  • To improve the timing of the multiplication, the multiplier and the multiplicand are split into arrays of 27-bit and 18-bit segments. The segments are multiplied pairwise, and the partial products are shifted and accumulated to obtain the final result.
  • For the modulo operation \(X=mod(A,P)\), where \(P=2^{130}-5\), the special form of \(P\) can be exploited to reduce latency. First, let \(N=\frac{A-X}{P}\), so that \(X=A-NP=A-2^{130}N+5N\). Then, let \(X1=mod(A,2^{130})\) and \(N1=\lfloor\frac{A}{2^{130}}\rfloor\), which gives \(X2=X1+5*N1\). Because \(2^{130}\equiv 5 \pmod{P}\), \(X2\) is congruent to \(A\) modulo \(P\) but is much smaller. Finally, if \(X2<P\), then \(X=X2\); otherwise, let \(A=X2\) and repeat the previous step (once \(P\le X2<2^{130}\), folding no longer changes the value, and subtracting \(P\) once gives \(X\)). For more information, please refer to the code.
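
The two optimizations above can be illustrated with a small software model. This is a minimal sketch in Python rather than the library's HLS code: the helper names `split_limbs`, `mul_27x18`, and `mod_p1305` are hypothetical, and assigning 27-bit segments to the first operand and 18-bit segments to the second is an assumption made for illustration. The reduction folds the bits above position 130 back in with a factor of 5 and finishes with one conditional subtraction of \(P\).

```python
P = (1 << 130) - 5          # Poly1305 prime
MASK130 = (1 << 130) - 1    # keeps the low 130 bits


def split_limbs(value, width):
    """Split a non-negative integer into little-endian limbs of `width` bits."""
    mask = (1 << width) - 1
    limbs = []
    while True:
        limbs.append(value & mask)
        value >>= width
        if value == 0:
            return limbs


def mul_27x18(a, b):
    """Multiply a and b using only 27-bit by 18-bit partial products.

    Assumption: `a` is cut into 27-bit segments and `b` into 18-bit
    segments; each partial product is shifted to its weight and summed.
    """
    total = 0
    for i, ai in enumerate(split_limbs(a, 27)):
        for j, bj in enumerate(split_limbs(b, 18)):
            total += (ai * bj) << (27 * i + 18 * j)
    return total


def mod_p1305(a):
    """Reduce a non-negative integer modulo P = 2**130 - 5 without division."""
    x = a
    # Fold: X2 = X1 + 5 * N1, with X1 the low 130 bits and N1 the rest.
    while x >> 130:
        x = (x & MASK130) + 5 * (x >> 130)
    # Here x < 2**130, so at most one subtraction of P remains.
    if x >= P:
        x -= P
    return x
```

For any non-negative integers `a` and `b`, `mul_27x18(a, b)` should equal `a * b` and `mod_p1305(a)` should equal `a % P`; comparing against Python's built-in operators is a convenient way to check a hardware model against this sketch.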

For Poly1305, two APIs are provided: poly1305 and poly1305MultiChan.

  • poly1305 takes a 32-byte one-time key and a message and produces a 16-byte tag. This tag is used to authenticate the message.
  • poly1305MultiChan takes N 32-byte one-time keys and N messages and produces N 16-byte tags. These tags are used to authenticate the corresponding messages.
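
To make the input and output sizes concrete, the following is a minimal software reference model in Python. It is not the signature of the HLS functions: the names `poly1305_ref` and `poly1305_multi_chan_ref` are hypothetical, and the multi-channel version simply processes N key and message pairs independently, which is an assumption about the behavior based on the description above. The clamping mask and the block handling follow the standard Poly1305 definition.

```python
P = (1 << 130) - 5
# Standard Poly1305 clamp applied to the first half of the key (r).
R_CLAMP = 0x0ffffffc0ffffffc0ffffffc0fffffff


def poly1305_ref(key: bytes, msg: bytes) -> bytes:
    """Return the 16-byte tag for `msg` under a 32-byte one-time key."""
    if len(key) != 32:
        raise ValueError("key must be exactly 32 bytes")
    r = int.from_bytes(key[:16], "little") & R_CLAMP
    s = int.from_bytes(key[16:], "little")
    acc = 0
    for i in range(0, len(msg), 16):
        # Each block is read little-endian with an extra 0x01 byte appended.
        block = msg[i:i + 16] + b"\x01"
        acc = (acc + int.from_bytes(block, "little")) * r % P
    # Add s and keep the low 128 bits as the tag.
    return ((acc + s) & ((1 << 128) - 1)).to_bytes(16, "little")


def poly1305_multi_chan_ref(keys, msgs):
    """Return one tag per (key, message) pair, in order."""
    if len(keys) != len(msgs):
        raise ValueError("need exactly one key per message")
    return [poly1305_ref(k, m) for k, m in zip(keys, msgs)]
```

With the test vector from RFC 8439 Section 2.5.2 (key `85d6be7857556d337f4452fe42d506a80103808afb0db2fd4abff6af4149f51b`, message `Cryptographic Forum Research Group`), `poly1305_ref` returns the tag `a8061dc1305136c6c22b8baf0c0127a9`.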