Fitting methods

AOCL API Guide (68552)

Document ID
68552
Release Date
2025-12-29
Version
5.2 English

Several methods are available to compute the models. By default, the method is chosen automatically, but it can be set manually using the optional parameter optim method (see Linear model options).

Direct solvers

  • QR: the standard MSE linear regression model can be computed using the QR factorization of the data matrix if no regularization term is required.

    \[X = QR,\]

    where \(Q\) is an \(n_{\mathrm{samples}} \times n_{\mathrm{features}}\) matrix with orthogonal columns and \(R\) is an \(n_{\mathrm{features}}\times n_{\mathrm{features}}\) upper triangular matrix.
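As an illustrative sketch with plain NumPy (not the AOCL API), the unregularized MSE coefficients follow from the thin QR factorization by solving the triangular system \(R\beta = Q^Ty\):

```python
import numpy as np

# Illustrative sketch with plain NumPy (not the AOCL API): solving the
# unregularized least-squares problem min ||X beta - y||^2 via a thin QR
# factorization of the data matrix X.
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 3))        # n_samples x n_features, full rank
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true                       # noise-free, so beta is recovered exactly

Q, R = np.linalg.qr(X)                  # Q: 10x3 orthogonal columns, R: 3x3 upper triangular
beta = np.linalg.solve(R, Q.T @ y)      # X beta = y  =>  R beta = Q^T y
print(np.allclose(beta, beta_true))     # True
```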

  • SVD: the singular value decomposition can be used to compute standard MSE and ridge regression models.

    \[X = UDV^T,\]

    where \(U\) is an orthogonal matrix of size \(n_{\mathrm{samples}}\times n_{\mathrm{samples}}\), \(D\) is an \(n_{\mathrm{samples}}\times n_{\mathrm{features}}\) diagonal matrix whose elements are the non-negative singular values of \(X\), and \(V^T\) is an orthogonal matrix of size \(n_{\mathrm{features}} \times n_{\mathrm{features}}\).
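A sketch of how the SVD yields the ridge coefficients (plain NumPy, not the AOCL implementation): with \(X = UDV^T\), the solution is \(\beta = V\,\mathrm{diag}\!\big(d_i/(d_i^2+\lambda)\big)\,U^Ty\), which reduces to the standard MSE solution when \(\lambda = 0\):

```python
import numpy as np

# Sketch with plain NumPy (not the AOCL implementation): ridge coefficients
# from the thin SVD X = U D V^T, via
#   beta = V diag(d_i / (d_i^2 + lambda)) U^T y,
# which reduces to the standard MSE solution when lambda = 0.
rng = np.random.default_rng(1)
X = rng.standard_normal((20, 4))
y = rng.standard_normal(20)
lam = 0.1

U, d, Vt = np.linalg.svd(X, full_matrices=False)   # thin SVD: d holds singular values
beta_svd = Vt.T @ ((d / (d**2 + lam)) * (U.T @ y))

# Cross-check against the closed-form normal-equations solution.
beta_ref = np.linalg.solve(X.T @ X + lam * np.eye(4), X.T @ y)
print(np.allclose(beta_svd, beta_ref))  # True
```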

  • Cholesky: the Cholesky decomposition can be used for standard MSE and ridge regression when the data matrix has full column rank. It factorizes the symmetric positive-definite normal-equations matrix into the product of a lower triangular matrix and its transpose. In linear models it can be used to find coefficients expressed as:

    \[\beta = (X^TX+\lambda I)^{-1}X^Ty,\]

    where \(I\) is the identity matrix. Left-multiplying both sides by the expression inside the inverse, \(X^TX+\lambda I\), yields a system of linear equations of the form \(Ax=b\). The left-hand side can be factorized using the Cholesky decomposition as follows:

    \[X^TX+\lambda I = LL^T,\]

    where \(L\) is a lower triangular matrix with real and positive diagonal entries. The system of linear equations is then solved by forward and backward substitution with \(L\).
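A minimal sketch of this route using SciPy (for illustration only; not the AOCL implementation): factorize \(A = X^TX + \lambda I\) once, then solve \(A\beta = X^Ty\) by substitution:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

# Sketch using SciPy (for illustration; not the AOCL implementation):
# factorize A = X^T X + lambda*I = L L^T once, then solve A beta = X^T y
# by forward and backward substitution.
rng = np.random.default_rng(2)
X = rng.standard_normal((15, 3))
y = rng.standard_normal(15)
lam = 0.5

A = X.T @ X + lam * np.eye(3)       # symmetric positive definite
c, low = cho_factor(A)              # Cholesky factor of A
beta = cho_solve((c, low), X.T @ y)

beta_ref = np.linalg.solve(A, X.T @ y)
print(np.allclose(beta, beta_ref))  # True
```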

Iterative solvers

  • L-BFGS-B: a solver aimed at minimizing smooth nonlinear functions (Liu and Nocedal [1989]). It can be used to compute both MSE and logistic models with or without \(\ell_2\) regularization. It is not suitable when an \(\ell_1\) regularization term is required.

    Note

    Our prebuilt Windows Python wheels (https://www.amd.com/en/developer/aocl.html) do not include the L-BFGS-B solver. To access it, build the library from source; source code and compilation instructions are available at amd/aocl-data-analytics. If you encounter issues, please e-mail us at toolchainsupport@amd.com.
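The same idea can be sketched with SciPy's L-BFGS-B (not the AOCL solver): minimize the smooth \(\ell_2\)-regularized MSE objective with its exact gradient, then compare against the closed-form ridge solution:

```python
import numpy as np
from scipy.optimize import minimize

# Sketch using SciPy's L-BFGS-B (not the AOCL solver): fit an
# l2-regularized MSE model by minimizing the smooth objective
#   f(beta) = ||X beta - y||^2 + lambda * ||beta||^2
# with its exact gradient supplied.
rng = np.random.default_rng(3)
X = rng.standard_normal((30, 4))
y = rng.standard_normal(30)
lam = 0.2

def f(b):
    r = X @ b - y
    return r @ r + lam * (b @ b)

def grad(b):
    return 2.0 * X.T @ (X @ b - y) + 2.0 * lam * b

res = minimize(f, np.zeros(4), jac=grad, method="L-BFGS-B",
               options={"ftol": 1e-12, "gtol": 1e-10})

# The minimizer should match the closed-form ridge solution.
beta_ref = np.linalg.solve(X.T @ X + lam * np.eye(4), X.T @ y)
print(np.allclose(res.x, beta_ref, atol=1e-4))
```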

  • Coordinate descent: a solver aimed at minimizing nonlinear functions. It is suitable for linear models with an \(\ell_1\) regularization term or elastic-net (Friedman et al. [2010], Friedman et al. [2007]).

    Note

    The coordinate descent implementation is optimized for solving lasso and elastic-net problems. For ridge or unregularized problems, any of the other methods is recommended.
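A from-scratch toy sketch of cyclic coordinate descent for the lasso (not the AOCL implementation), using the standard soft-thresholding update for one coordinate at a time:

```python
import numpy as np

# From-scratch sketch (not the AOCL implementation) of cyclic coordinate
# descent for the lasso, min 0.5*||y - X b||^2 + lam*||b||_1, using the
# soft-thresholding update for one coordinate at a time.
def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_sweeps=200):
    n, p = X.shape
    b = np.zeros(p)
    col_norm2 = (X ** 2).sum(axis=0)
    r = y - X @ b                        # running residual
    for _ in range(n_sweeps):
        for j in range(p):
            r += X[:, j] * b[j]          # remove coordinate j from the fit
            b[j] = soft_threshold(X[:, j] @ r, lam) / col_norm2[j]
            r -= X[:, j] * b[j]          # restore it with the updated value
    return b

rng = np.random.default_rng(4)
X = rng.standard_normal((50, 5))
y = X @ np.array([2.0, 0.0, -1.0, 0.0, 0.0]) + 0.01 * rng.standard_normal(50)
b = lasso_cd(X, y, lam=5.0)
print(b)  # with a large enough penalty, inactive coefficients shrink to (near-)zero
```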

  • Conjugate gradient: a solver aimed at finding the solution of a system of linear equations. It can be used to compute linear regression models with or without \(\ell_2\) regularization.
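A sketch using SciPy's conjugate gradient (not the AOCL solver): the ridge normal equations \((X^TX+\lambda I)\beta = X^Ty\) form a symmetric positive-definite system, so CG applies directly, and \(\lambda = 0\) recovers the unregularized MSE model for full-rank \(X\):

```python
import numpy as np
from scipy.sparse.linalg import cg

# Sketch with SciPy's CG (not the AOCL solver): the ridge normal equations
#   (X^T X + lambda*I) beta = X^T y
# are symmetric positive definite, so conjugate gradient applies directly.
rng = np.random.default_rng(5)
X = rng.standard_normal((40, 6))
y = rng.standard_normal(40)
lam = 0.3

A = X.T @ X + lam * np.eye(6)           # SPD left-hand side
beta, info = cg(A, X.T @ y)             # info == 0 signals convergence
beta_ref = np.linalg.solve(A, X.T @ y)  # dense reference solution
print(info, np.allclose(beta, beta_ref, atol=1e-3))
```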