While FPGAs can be programmed using lower-level Hardware Description Languages (HDLs) such as Verilog or VHDL, several High-Level Synthesis (HLS) tools can now take an algorithmic description written in a higher-level language like C/C++ and convert it into an HDL. The resulting HDL can then be processed by downstream tools to program the FPGA device. The main benefit of this flow is that you retain the advantages of a language like C/C++ for writing efficient code, which is then translated into hardware. Additionally, writing good code is the software designer's forte and is easier than learning a new hardware description language.
A program written in C/C++ is essentially written for the von Neumann style of architecture, where each instruction in the user's program is executed sequentially. To achieve high performance, the HLS tool must infer parallelism in the sequential code and exploit it. This is not an easy problem to solve. In addition, a good software programmer commonly relies on language features and practices such as run-time type information (RTTI), recursion, and dynamic memory allocation. Many of these techniques have no direct equivalent in hardware and present challenges for the HLS tool. This also means that arbitrary, off-the-shelf software cannot be efficiently converted into hardware. At a bare minimum, such software needs to be examined for non-synthesizable constructs, and the code needs to be refactored to make it synthesizable.
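As an illustration of the kind of refactoring described above, the following minimal sketch contrasts a software-style function that uses recursion and dynamic memory allocation with a rewrite that uses only fixed-size storage and loops with compile-time bounds. The function names, the `MAX_N` bound, and the `self_check` helper are hypothetical and chosen for illustration only. The sketch does not target any specific HLS tool, and whether a given tool accepts a construct depends on that tool.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical compile-time upper bound on the input length (an assumption
// made for this sketch; a real design would pick it from its requirements).
constexpr std::size_t MAX_N = 16;

// Software style: recursion whose depth depends on run-time data.
static std::int64_t sum_rec(const std::vector<std::int64_t>& v, std::size_t i) {
    return i == v.size() ? 0 : v[i] + sum_rec(v, i + 1);
}

// Software style: scratch buffer allocated on the heap, sized at run time.
std::int64_t sum_of_squares_sw(const std::int32_t* in, std::size_t n) {
    std::vector<std::int64_t> squares(n);  // dynamic memory allocation
    for (std::size_t i = 0; i < n; ++i) {
        squares[i] = static_cast<std::int64_t>(in[i]) * in[i];
    }
    return sum_rec(squares, 0);  // recursion
}

// Refactored style: fixed-size buffer and loops with constant trip counts.
// Assumes the caller guarantees n <= MAX_N; elements at index n and beyond
// are treated as zero and never read from the input.
std::int64_t sum_of_squares_hw(const std::int32_t in[MAX_N], std::size_t n) {
    std::int64_t squares[MAX_N];  // size known at compile time
    for (std::size_t i = 0; i < MAX_N; ++i) {
        squares[i] = (i < n) ? static_cast<std::int64_t>(in[i]) * in[i] : 0;
    }
    std::int64_t acc = 0;
    for (std::size_t i = 0; i < MAX_N; ++i) {  // loop replaces the recursion
        acc += squares[i];
    }
    return acc;
}

// Checks in software that both versions agree on a small input.
void self_check() {
    const std::int32_t in[MAX_N] = {1, 2, 3, 4};
    assert(sum_of_squares_sw(in, 4) == sum_of_squares_hw(in, 4));
}
```

Both versions compute the same result for inputs of at most `MAX_N` elements. The refactored version trades generality (an arbitrary run-time length) for storage and loop bounds that are fixed at compile time, which is the kind of property a tool needs in order to map the code onto a fixed amount of hardware.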
Even if the software program can be automatically converted (or synthesized) into hardware, achieving acceptable quality of results (QoR) will require additional work, such as rewriting the software to help the HLS tool achieve the desired performance goals. To do this, you need to understand the best practices for writing good software for execution on the FPGA device. The next few sections discuss how you can first identify macro-level architectural optimizations to structure your program, and then focus on fine-grained, micro-level architectural optimizations to meet your performance goals.