Linear Model APIs - 5.2 English - 68552

AOCL API Guide (68552)

Document ID
68552
Release Date
2025-12-29
Version
5.2 English
class aoclda.linear_model.linmod(mod, intercept=False, solver='auto', scaling='auto', max_iter=None, constraint='ssc', reg_lambda=0.0, reg_alpha=0.0, warm_start=False, tol=1.0e-4, progress_factor=None, check_data=False)

Linear models.

Note that linear models currently do not accept array slices (e.g. X[0:2, 0:3]) as input. Please use copies (e.g. X[0:2, 0:3].copy()) instead.
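The difference between a slice and its copy can be seen with NumPy alone. This is a minimal sketch with illustrative array names; it only shows how a view differs from a copy in memory and makes no claim about how the library detects slices.

```python
import numpy as np

X_full = np.arange(20, dtype=np.float64).reshape(4, 5)

X_view = X_full[0:2, 0:3]          # a slice: a view into X_full's buffer
X_copy = X_full[0:2, 0:3].copy()   # a copy: owns its own contiguous buffer

# The view shares memory with the parent and is not contiguous.
print(np.shares_memory(X_full, X_view), X_view.flags["C_CONTIGUOUS"])

# The copy holds the same values in a fresh, contiguous block.
print(np.shares_memory(X_full, X_copy), X_copy.flags["C_CONTIGUOUS"])
```

The copy is the array that would be passed to fit or predict.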

Parameters:
  • mod (str) –

    Which linear model to compute.

    • If mod='mse' then \(\ell_2\) norm linear regression is calculated.

    • If mod='logistic' then logistic regression is calculated.

  • intercept (bool, optional) – Controls whether to add an intercept variable to the model. Default=False.

  • solver (str, optional) –

    Which solver to use to compute the coefficients. It can take values ‘auto’, ‘svd’, ‘cholesky’, ‘sparse_cg’, ‘qr’, ‘coord’, ‘lbfgs’. Some solvers are not suitable for some regularization types.

    • 'auto' chooses the best solver based on regression type and data.

    • 'svd' works with normal and ridge regression. The most robust solver but the least efficient.

    • 'cholesky' works with normal and ridge regression. Will return an error when a singular matrix is encountered.

    • 'sparse_cg' works with normal and ridge regression. You may need to set a smaller tol if a badly conditioned matrix is encountered.

    • 'qr' works with normal linear regression only. Will return an error when an underdetermined system is encountered.

    • 'coord' works with all regression types. Requires each column of the data to have unit variance (this can be achieved by setting the scaling option to ‘scale_only’). In the case of normal linear regression and an underdetermined system it will converge to a solution that is not necessarily a minimum norm solution.

    • 'lbfgs' works with normal and ridge regression. In the case of normal linear regression and an underdetermined system it will converge to a solution that is not necessarily a minimum norm solution.

  • scaling (str, optional) – The type of preprocessing to apply to the dataset. Available options are: ‘none’, ‘centering’, ‘scale_only’, ‘standardize’. Default='auto'.

  • max_iter (int, optional) – Maximum number of iterations. Applies only to iterative solvers: ‘sparse_cg’, ‘coord’, ‘lbfgs’. The default value depends on the solver. For ‘sparse_cg’ it is 500, for ‘lbfgs’ and ‘coord’ it is 10000.

  • constraint (str, optional) –

    Affects only multinomial logistic regression. The type of constraint put on coefficients. This will affect the number of coefficients returned.

    • 'rsc' means a reference category is chosen whose coefficients are all set to 0. This results in K-1 class coefficients for K class problems.

    • 'ssc' means the sum of coefficients class-wise for each feature is 0. It will result in K class coefficients for K class problems.

  • reg_lambda (float, optional) – \(\lambda\), the magnitude of the regularization term. Default=0.0.

  • reg_alpha (float, optional) – \(\alpha\), the share of the \(\ell_1\) term in the regularization. Default=0.0.

  • warm_start (bool, optional) – Reuse coefficients from the previous run when using the same object to compute coefficients on new data. Applies only to iterative solvers. For more details refer to the initial coefficients section of the linear models documentation.

  • tol (float, optional) – Convergence tolerance for iterative solvers. Applies only to iterative solvers: ‘sparse_cg’, ‘coord’, ‘lbfgs’. Default=1.0e-4.

  • progress_factor (float, optional) – Applies only to ‘lbfgs’ and ‘coord’ solver. Factor used to detect convergence of the iterative optimization step. Default=None.

  • check_data (bool, optional) – Whether to check the data for NaNs. Default=False.
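The solver descriptions above can be condensed into a small lookup. The sketch below is not part of the library: the helper names and the regression-type labels are hypothetical, and mapping reg_lambda and reg_alpha to "normal", "ridge", "lasso" and "elastic_net" is an assumption based on the usual meaning of those parameters.

```python
# Regularization types each solver is described as supporting in the list above.
SOLVER_SUPPORT = {
    "svd": {"normal", "ridge"},
    "cholesky": {"normal", "ridge"},
    "sparse_cg": {"normal", "ridge"},
    "qr": {"normal"},
    "coord": {"normal", "ridge", "lasso", "elastic_net"},
    "lbfgs": {"normal", "ridge"},
}


def regression_type(reg_lambda=0.0, reg_alpha=0.0):
    """Classify the regularization (assumed convention, see lead-in)."""
    if reg_lambda == 0.0:
        return "normal"        # no regularization term
    if reg_alpha == 0.0:
        return "ridge"         # pure l2 term
    if reg_alpha == 1.0:
        return "lasso"         # pure l1 term
    return "elastic_net"       # mix of l1 and l2


def compatible_solvers(reg_lambda=0.0, reg_alpha=0.0):
    """Return the explicitly named solvers that match the regularization."""
    kind = regression_type(reg_lambda, reg_alpha)
    return sorted(s for s, kinds in SOLVER_SUPPORT.items() if kind in kinds)


print(compatible_solvers())            # no regularization
print(compatible_solvers(0.1, 0.0))    # ridge
print(compatible_solvers(0.1, 0.5))    # elastic net
```

Passing solver='auto' leaves this choice to the library.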

property coef

Contains the output coefficients of the model. Its shape depends on the problem being solved. When mod='mse', or when mod='logistic' and the data has 2 classes, it is a 1D ndarray of shape (ncoef,), where ncoef is n_features + 1 if intercept=True and n_features otherwise. Otherwise, for a K-class problem, it is a 2D ndarray of shape (nrows, ncoef), where nrows is K-1 if constraint='rsc' and K otherwise.

Type:

numpy.ndarray
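The shape rule for coef can be written as a short helper. This is a sketch only: expected_coef_shape is a hypothetical name, not a library function, and n_classes=None is used here to stand for the 'mse' model.

```python
def expected_coef_shape(n_features, intercept=False, n_classes=None, constraint="ssc"):
    """Shape of coef as described above (n_classes=None means mod='mse')."""
    ncoef = n_features + (1 if intercept else 0)
    # Linear regression and 2-class logistic regression give a 1D array.
    if n_classes is None or n_classes == 2:
        return (ncoef,)
    # Multinomial case: 'rsc' drops the reference class, otherwise K rows.
    nrows = n_classes - 1 if constraint == "rsc" else n_classes
    return (nrows, ncoef)


print(expected_coef_shape(4, intercept=True))                       # (5,)
print(expected_coef_shape(4, n_classes=3, constraint="rsc"))        # (2, 4)
print(expected_coef_shape(4, intercept=True, n_classes=3))          # (3, 5)
```

The same shape applies to the x0 starting point accepted by fit.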

property dual_coef

Contains the dual coefficients of the model. Only valid for the CG solver (‘sparse_cg’) and underdetermined problems.

Type:

numpy.ndarray

fit(X, y, x0=None)

Computes the chosen linear model on the feature matrix X and response vector y.

Parameters:
  • X (array-like) – The feature matrix on which to compute the model. Its shape is (n_samples, n_features).

  • y (array-like) – The response vector. Its shape is (n_samples,).

  • x0 (array-like, optional) – Initial guess for the solution. Applies only to iterative solvers. The required shape depends on the problem being solved (see the coef property). If None then x0 is set to a vector of zeros. Default=None.

Returns:

Returns the instance itself.

Return type:

self (object)

property loss

The value of loss function \(L(\beta_0, \beta)\).

Type:

numpy.ndarray of shape (1, )

property n_iter

The number of iterations performed to find the solution. Only valid for iterative solvers.

Type:

int

property nrm_gradient_loss

The norm of the gradient of the loss function. Only valid for iterative solvers.

Type:

numpy.ndarray of shape (1, )

predict(X)

Evaluate the model on a data set X.

Parameters:

X (array-like) – The feature matrix to evaluate the model on. It must have n_features columns.

Returns:

The prediction vector, where n_samples is the number of rows of X.

Return type:

numpy.ndarray of length n_samples
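A minimal end-to-end sketch of the Python interface, using only the constructor, fit, predict and coef as documented above. The synthetic data and variable names are illustrative, and the import is guarded so that the snippet still runs where the aoclda package is not installed.

```python
import numpy as np

# Synthetic regression data: 100 samples, 3 features, known coefficients.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 4.0 + 0.01 * rng.standard_normal(100)
X_new = rng.standard_normal((5, 3))

try:
    from aoclda.linear_model import linmod
except ImportError:
    linmod = None  # AOCL-DA Python package not available

if linmod is not None:
    model = linmod("mse", intercept=True)
    model.fit(X, y)                  # X and y are full arrays, not slices
    print(model.coef)                # n_features + 1 values (intercept=True)
    print(model.predict(X_new))      # one prediction per row of X_new
```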

property time

Compute time (wall clock time in seconds).

Type:

numpy.ndarray of shape (1, )

da_status da_linmod_select_model_s(da_handle handle, linmod_model mod)
da_status da_linmod_select_model_d(da_handle handle, linmod_model mod)

Select which linear model to compute.

The suffix of the function name (_s or _d) indicates the floating-point precision, single or double, on which the handle operates (see the precision section).

The model definition can be further enhanced with elements such as a regularization term by setting up optional parameters. See the linear model options section for more information.

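Parameters:
  • handle[inout] a da_handle object, initialized with type da_handle_linmod.

  • mod[in] a linmod_model enum value selecting the linear model to compute.

Returns:

da_status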

da_status da_linmod_define_features_s(da_handle handle, da_int n_samples, da_int n_features, const float *X, da_int ldx, const float *y)
da_status da_linmod_define_features_d(da_handle handle, da_int n_samples, da_int n_features, const double *X, da_int ldx, const double *y)

Define the data to train a linear model.

The suffix of the function name (_s or _d) indicates the floating-point precision, single or double, on which the handle operates (see the precision section).

Pass pointers to a data matrix X containing n_samples observations (rows) over n_features features (columns) and a response vector y of size n_samples.

Only the pointers to X and y are stored; an internal copy may be made depending on the model, solver and scaling method selected.

Parameters:
  • handle[inout] a da_handle object, initialized with type da_handle_linmod.

  • n_samples[in] the number of observations (rows) of the data matrix X. Constraint: n_samples \(\ge\) 1.

  • n_features[in] the number of features (columns) of the data matrix, X. Constraint: n_features \(\ge\) 1.

  • X[in] the n_samples \(\times\) n_features data matrix. For best performance, store it in column-major order; row-major data can be used by setting the storage order option to row-major.

  • ldx[in] the leading dimension of the data matrix X. Constraint: ldx \(\ge\) n_samples if the data is stored in column-major order, or ldx \(\ge\) n_features if the data is stored in row-major order.

  • y[in] the response vector, of size n_samples.

Returns:

da_status
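The ldx constraint follows from how a matrix is laid out in a flat buffer. The NumPy sketch below illustrates the two storage orders; min_ldx and flat_index are hypothetical helpers and the order strings are illustrative labels, not values taken from the library.

```python
import numpy as np


def min_ldx(n_samples, n_features, order="column-major"):
    """Smallest leading dimension allowed by the constraint above."""
    if n_samples < 1 or n_features < 1:
        raise ValueError("n_samples and n_features must be >= 1")
    return n_samples if order == "column-major" else n_features


def flat_index(i, j, ldx, order="column-major"):
    """Offset of element (i, j) in the flat buffer for a given ldx."""
    return i + j * ldx if order == "column-major" else i * ldx + j


n_samples, n_features = 4, 3
X = np.arange(12, dtype=np.float64).reshape(n_samples, n_features)

col_buf = X.ravel(order="F")   # column-major buffer, ldx = n_samples
row_buf = X.ravel(order="C")   # row-major buffer, ldx = n_features

ldx_col = min_ldx(n_samples, n_features, "column-major")
ldx_row = min_ldx(n_samples, n_features, "row-major")
print(col_buf[flat_index(2, 1, ldx_col, "column-major")] == X[2, 1])
print(row_buf[flat_index(2, 1, ldx_row, "row-major")] == X[2, 1])
```

A larger ldx corresponds to a matrix stored inside a bigger allocation, with padding between consecutive columns (or rows).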

da_status da_linmod_fit_s(da_handle handle)
da_status da_linmod_fit_d(da_handle handle)

Fit the linear model defined in the handle.

Compute the linear model defined by da_linmod_select_model_? on the data passed by the last call to the function da_linmod_define_features_?.

Note that you can customize the model before calling the fit function through the use of optional parameters (e.g., the regularization terms); see the linear model options section for a list of available options.

Parameters:

handle[inout] a da_handle object, initialized with type da_handle_linmod.

Returns:

da_status

da_status da_linmod_fit_start_s(da_handle handle, da_int n_coefs, const float *coefs)
da_status da_linmod_fit_start_d(da_handle handle, da_int n_coefs, const double *coefs)

Fit the linear model defined in the handle using a custom starting estimate for the model coefficients.

Compute the same model as da_linmod_fit_?, starting the fitting process with the custom values defined in coefs.

Parameters:
  • handle[inout] a da_handle object, initialized with type da_handle_linmod.

  • n_coefs[in] the number of coefficients provided in coefs. The starting values are taken into account only if n_coefs matches the number of coefficients expected for the model defined in handle.

  • coefs[in] the initial coefficients.

Returns:

da_status

da_status da_linmod_evaluate_model_s(da_handle handle, da_int n_samples, da_int n_features, const float *X, da_int ldx, float *predictions, float *observations, float *loss)
da_status da_linmod_evaluate_model_d(da_handle handle, da_int n_samples, da_int n_features, const double *X, da_int ldx, double *predictions, double *observations, double *loss)

Evaluate the model previously computed on a new set of data X and observations y.

After a model has been fitted using da_linmod_fit_?, it can be evaluated on a new set of data and observations. This function returns the model evaluation (loss) in the array loss and the predictions in predictions.

In the case where the model chosen solves a classification problem (e.g., logistic regression), the predictions computed will be categorical. For each data point i, predictions[i] will contain the index of the most likely class according to the model.
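The idea of "index of the most likely class" can be sketched in NumPy. This is an illustration of the general technique, not the library's implementation: it assumes a K-row coefficient matrix with no intercept, and most_likely_class is a hypothetical helper.

```python
import numpy as np


def most_likely_class(X, coef):
    """Return, for each row of X, the index of the class with highest probability."""
    scores = X @ coef.T                             # (n_samples, K) linear scores
    scores = scores - scores.max(axis=1, keepdims=True)  # stabilize the softmax
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)       # softmax probabilities
    return probs.argmax(axis=1)


coef = np.eye(3)                                    # 3 classes, 3 features (toy values)
X = np.array([[2.0, 0.0, 0.0],
              [0.0, 0.0, 3.0],
              [0.0, 1.0, 0.0]])
print(most_likely_class(X, coef))                   # [0 2 1]
```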

Parameters:
  • handle[inout] a da_handle object, initialized with type da_handle_linmod.

  • n_samples[in] number of rows of X or equivalently the number of samples to evaluate the model on.

  • n_features[in] number of columns of X or equivalently the number of features of the test data. It must match the number of features of the data defined in the handle.

  • X[in] the n_samples \(\times\) n_features data matrix. For best performance, store it in column-major order; row-major data can be used by setting the storage order option to row-major.

  • ldx[in] the leading dimension of the data matrix X. Constraint: ldx \(\ge\) n_samples if the data is stored in column-major order, or ldx \(\ge\) n_features if the data is stored in row-major order.

  • predictions[out] vector of size n_samples containing the model’s prediction.

  • observations[in] vector of size n_samples containing new observations; may be NULL if none are provided.

  • loss[out] scalar containing the model’s loss given the new data X and the new observations y; may be NULL if no observations are provided. Note that either both observations and loss parameters are NULL or both must contain a valid address.

Returns:

da_status

typedef enum linmod_model_ linmod_model

Alias for the linmod_model_ enum.

enum linmod_model_

Defines which linear model is computed.

Values:

enumerator linmod_model_undefined

No linear model set.

enumerator linmod_model_mse

\(L_2\) norm linear regression.

enumerator linmod_model_logistic

Logistic regression.

enum da_linmod_info_t_

Indices of the information vector containing metrics from optimization solvers.

The information vector can be retrieved after a successful return from the fit function da_linmod_fit_? by querying the handle, using da_handle_get_result_? and passing the query da_result_::da_rinfo.

Values:

enumerator linmod_info_objective

objective value

enumerator linmod_info_grad_norm

norm of the objective gradient

enumerator linmod_info_iter

number of iterations

enumerator linmod_info_time

current time

enumerator linmod_info_nevalf

number of objective function callback evaluations

enumerator linmod_info_inorm

infinity norm of gradient

enumerator linmod_info_inorm_init

infinity norm of gradient at the initial iteration

enumerator linmod_info_ncheap

number of objective function callback evaluations requesting “cheap” update (only coordinate descent solver)

enumerator linmod_info_optim

optimality measure (only coordinate descent solver)

enumerator linmod_info_optimcnt

number of optimality checks (only coordinate descent solver)

enumerator linmod_info_nsamples

number of rows in the input matrix

enumerator linmod_info_nfeat

number of columns in the input matrix

enumerator linmod_info_nclass

number of classes in the input data (0 when regression problem)

enumerator linmod_info_nrow_coef

number of rows of the coefficient array

enumerator linmod_info_ncol_coef

number of columns of the coefficient array

enumerator linmod_info_well_determined

flag indicating if the problem is well determined

enumerator linmod_info_number

for internal use
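The enumerators above act as named indices into the information vector. The sketch below mirrors them in a Python IntEnum to show the indexing pattern. It rests on two assumptions not stated on this page: that the values start at 0 and increase in the order listed, and that linmod_info_number equals the vector length. The LinmodInfo class and the values stored in info are placeholders, not library output.

```python
from enum import IntEnum

import numpy as np


class LinmodInfo(IntEnum):
    # Assumed numbering: 0-based, in the order the enumerators are listed.
    OBJECTIVE = 0
    GRAD_NORM = 1
    ITER = 2
    TIME = 3
    NEVALF = 4
    INORM = 5
    INORM_INIT = 6
    NCHEAP = 7
    OPTIM = 8
    OPTIMCNT = 9
    NSAMPLES = 10
    NFEAT = 11
    NCLASS = 12
    NROW_COEF = 13
    NCOL_COEF = 14
    WELL_DETERMINED = 15
    NUMBER = 16  # assumed to be the length of the information vector


# Placeholder vector standing in for the one retrieved from the handle.
info = np.zeros(int(LinmodInfo.NUMBER))
info[LinmodInfo.ITER] = 12
info[LinmodInfo.NSAMPLES] = 100

print(int(info[LinmodInfo.ITER]), int(info[LinmodInfo.NSAMPLES]))
```

In C code, use the enumerator names directly rather than hard-coded integers.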