The general structure of the solver is shown below:
The hardware design is parameterized by the sizes N and M, which are set via the Makefile (see README.md in the FDBlackScholesLocalVolatility test directory). N is the number of grid points in the spatial (x) direction and governs the sizing of all internal vectors and matrices. M is the maximum number of time steps supported by a given parameterized build; a smaller number of time steps (tSteps) can be passed in at runtime, with the data vectors shortened correspondingly. The host code provides the discrete grid points x[N] and t[M] (recall that x = log(S) for this solver). The host also provides the maturity date T (in years), the risk-free rate r[M], the volatility sigma[N*M] evaluated at each point of the (x, t) mesh, the initial condition u[N], and upper and lower boundary conditions that match the initial condition.
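As an illustration of the inputs listed above, the following Python sketch builds one plausible set of host-side arrays. It is not the library's host code: the function name make_host_inputs, the arguments S_min, S_max, K and vol, the uniform grids, the call-option payoff, and the flattening order of sigma (index i*M + j) are all assumptions made for the example.

```python
import math

def make_host_inputs(N, M, T, S_min, S_max, K, rate, vol):
    """Build a hypothetical set of host inputs for an N x M build.

    vol is a callable vol(x, t) returning the local volatility.
    """
    # Spatial grid, uniform in x = log(S) (uniform spacing is an assumption).
    x_lo, x_hi = math.log(S_min), math.log(S_max)
    h = (x_hi - x_lo) / (N - 1)
    x = [x_lo + i * h for i in range(N)]

    # Time grid, assumed here to run uniformly from 0 to T inclusive.
    dt = T / (M - 1)
    t = [j * dt for j in range(M)]

    # Risk-free rate per time point (constant in this sketch).
    r = [rate] * M

    # Volatility at every mesh point, flattened to length N*M.
    # The layout sigma[i*M + j] is an assumption, not taken from the library.
    sigma = [vol(x[i], t[j]) for i in range(N) for j in range(M)]

    # Initial condition: European call payoff at maturity, expressed in x.
    u0 = [max(math.exp(xi) - K, 0.0) for xi in x]

    # Dirichlet boundary values chosen to match the initial condition.
    return {
        "x": x, "t": t, "T": T, "r": r, "sigma": sigma, "u": u0,
        "boundary_lower": u0[0], "boundary_upper": u0[-1],
    }

inputs = make_host_inputs(N=5, M=3, T=1.0, S_min=50.0, S_max=200.0,
                          K=100.0, rate=0.05, vol=lambda x, t: 0.2)
```

A run with fewer time steps than M would simply use shorter t, r and sigma arrays, in line with the tSteps behaviour described above.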
The engine first calculates the dt and h deltas (a one-off calculation). Then, at each time step, the left- and right-hand matrices of the linear system Lu’ = Ru are calculated, where u is the current solution on the grid and u’ is the solution at the next time step. Because central differencing and Dirichlet boundary conditions are used, both matrices are tridiagonal. The right-hand side Ru is calculated (a simple tridiagonal matrix-by-vector multiplication) and the suitably discounted boundary conditions are applied. Finally, the linear system is solved for u’, which becomes u in the next iteration. Because the matrices are tridiagonal, the system is solved using the efficient Parallel Cyclic Reduction (PCR) engine found in the L1 library.
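The two tridiagonal operations in this loop can be sketched in a few lines of Python. Note that this sketch solves the system with the sequential Thomas algorithm, not the PCR method used by the hardware engine; for a well-conditioned tridiagonal system both give the same solution, but PCR is better suited to parallel hardware. The function names and the three-list storage of the diagonals are assumptions made for the example.

```python
def tridiag_matvec(lower, diag, upper, v):
    """Return A*v for a tridiagonal A stored as three diagonals.

    lower[0] and upper[-1] are ignored (they lie outside the matrix).
    """
    n = len(v)
    out = []
    for i in range(n):
        acc = diag[i] * v[i]
        if i > 0:
            acc += lower[i] * v[i - 1]
        if i < n - 1:
            acc += upper[i] * v[i + 1]
        out.append(acc)
    return out

def thomas_solve(lower, diag, upper, rhs):
    """Solve A*x = rhs for tridiagonal A (Thomas algorithm, no pivoting)."""
    n = len(rhs)
    c = [0.0] * n  # modified upper diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

# Small check: multiply by a tridiagonal matrix, then solve to recover v.
lower = [0.0, -1.0, -1.0]
diag = [2.0, 2.0, 2.0]
upper = [-1.0, -1.0, 0.0]
v = [1.0, 2.0, 3.0]
rhs = tridiag_matvec(lower, diag, upper, v)
recovered = thomas_solve(lower, diag, upper, rhs)
```

In a time-stepping loop, tridiag_matvec would play the role of forming Ru and thomas_solve the role of obtaining u’, with the boundary values applied to the right-hand side in between.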
Note that the engine supports selection of the solver method via the Theta parameter, which should be set to 0 for explicit, 1 for fully implicit, or 0.5 for Crank-Nicolson. Other values in the range 0 to 1 can be selected but are not commonly used.
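To make the role of Theta concrete, the sketch below builds the interior rows of the two tridiagonal matrices using the standard theta-method form L = I - Theta*dt*A and R = I + (1 - Theta)*dt*A, where A is a central-difference discretization of the Black-Scholes operator in x = log(S). This is a textbook formulation under simplifying assumptions (constant sigma and r, uniform spacing h, boundary rows omitted); the coefficient formulas used inside the engine may differ, and the function name theta_matrices is hypothetical.

```python
def theta_matrices(n, dt, h, sigma, r, theta):
    """Return (L, R), each a tuple of (lower, diag, upper) diagonals.

    Assumes constant sigma and r and a uniform grid; interior rows only.
    """
    mu = r - 0.5 * sigma ** 2            # drift in log-price coordinates
    diffusion = 0.5 * sigma ** 2 / h ** 2
    a_lo = diffusion - mu / (2.0 * h)    # coefficient of u[i-1]
    a_di = -2.0 * diffusion - r          # coefficient of u[i]
    a_up = diffusion + mu / (2.0 * h)    # coefficient of u[i+1]

    # Left-hand matrix: implicit share of the spatial operator.
    L = ([-theta * dt * a_lo] * n,
         [1.0 - theta * dt * a_di] * n,
         [-theta * dt * a_up] * n)
    # Right-hand matrix: explicit share of the spatial operator.
    w = 1.0 - theta
    R = ([w * dt * a_lo] * n,
         [1.0 + w * dt * a_di] * n,
         [w * dt * a_up] * n)
    return L, R

L_exp, R_exp = theta_matrices(4, 0.01, 0.1, 0.2, 0.05, theta=0.0)
L_imp, R_imp = theta_matrices(4, 0.01, 0.1, 0.2, 0.05, theta=1.0)
L_cn, R_cn = theta_matrices(4, 0.01, 0.1, 0.2, 0.05, theta=0.5)
```

With Theta = 0 the left-hand matrix reduces to the identity, so no linear solve is needed; with Theta = 1 the right-hand matrix is the identity; Theta = 0.5 weights the two sides equally, which is the Crank-Nicolson scheme.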