- class aoclda.linear_model.linmod(mod, intercept=False, solver='auto', scaling='auto', max_iter=None, constraint='ssc', reg_lambda=0.0, reg_alpha=0.0, warm_start=False, tol=1.0e-4, progress_factor=None, check_data=False)#
Linear models.
Note that linear models currently do not accept array slices (e.g.
X[0:2, 0:3]) as input. Please use copies (e.g. X[0:2, 0:3].copy()) instead.
- Parameters:
mod (str) –
Which linear model to compute.
If mod='mse' then \(\ell_2\) norm linear regression is calculated.
If mod='logistic' then logistic regression is calculated.
intercept (bool, optional) – Controls whether to add an intercept variable to the model. Default=False.
solver (str, optional) –
Which solver to use to compute the coefficients. It can take values ‘auto’, ‘svd’, ‘cholesky’, ‘sparse_cg’, ‘qr’, ‘coord’, ‘lbfgs’. Some solvers are not suitable for some regularization types.
'auto' chooses the best solver based on regression type and data.
'svd' works with normal and ridge regression. The most robust solver but the least efficient.
'cholesky' works with normal and ridge regression. Will return an error when a singular matrix is encountered.
'sparse_cg' works with normal and ridge regression. You may need to set a smaller tol if a badly conditioned matrix is encountered.
'qr' works with normal linear regression only. Will return an error when an underdetermined system is encountered.
'coord' works with all regression types. Requires data to have variance of 1 column-wise (this can be achieved with the scaling option set to scale only). In the case of normal linear regression and an underdetermined system it will converge to a solution that is not necessarily a minimum norm solution.
'lbfgs' works with normal and ridge regression. In the case of normal linear regression and an underdetermined system it will converge to a solution that is not necessarily a minimum norm solution.
scaling (str, optional) – The type of preprocessing to apply to the dataset. Available options are: ‘none’, ‘centering’, ‘scale_only’, ‘standardize’.
max_iter (int, optional) – Maximum number of iterations. Applies only to iterative solvers: ‘sparse_cg’, ‘coord’, ‘lbfgs’. The default value depends on the solver. For ‘sparse_cg’ it is 500, for ‘lbfgs’ and ‘coord’ it is 10000.
constraint (str, optional) –
Affects only multinomial logistic regression. The type of constraint put on coefficients. This will affect the number of coefficients returned.
'rsc' means a reference category is chosen whose coefficients are all set to 0. This results in K-1 class coefficients for K class problems.
'ssc' means the sum of coefficients class-wise for each feature is 0. This results in K class coefficients for K class problems.
reg_lambda (float, optional) – \(\lambda\), the magnitude of the regularization term. Default=0.0.
reg_alpha (float, optional) – \(\alpha\), the share of the \(\ell_1\) term in the regularization. Default=0.0.
warm_start (bool, optional) – Reuse coefficients from the previous run when using the same object to compute coefficients on new data. Applies only to iterative solvers. For more details refer to the initial coefficients section of the linear models documentation.
tol (float, optional) – Convergence tolerance for iterative solvers. Applies only to iterative solvers: ‘sparse_cg’, ‘coord’, ‘lbfgs’. Default=1.0e-4.
progress_factor (float, optional) – Applies only to ‘lbfgs’ and ‘coord’ solver. Factor used to detect convergence of the iterative optimization step. Default=None.
check_data (bool, optional) – Whether to check the data for NaNs. Default=False.
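The reg_lambda and reg_alpha parameters above combine into a single penalty term. As a minimal sketch, assuming the common elastic-net convention \(\lambda\left(\alpha\|\beta\|_1 + \frac{1-\alpha}{2}\|\beta\|_2^2\right)\) (the exact scaling used by the library is not stated on this page, so both the formula and the helper name elastic_net_penalty are illustrative assumptions), the penalty could be computed as:

```python
def elastic_net_penalty(coef, reg_lambda=0.0, reg_alpha=0.0):
    """Hypothetical helper: elastic-net penalty under an ASSUMED convention.

    penalty = reg_lambda * (reg_alpha * ||coef||_1
                            + (1 - reg_alpha) / 2 * ||coef||_2^2)
    """
    l1 = sum(abs(c) for c in coef)
    l2_squared = sum(c * c for c in coef)
    return reg_lambda * (reg_alpha * l1 + 0.5 * (1.0 - reg_alpha) * l2_squared)


coef = [1.0, -2.0, 0.5]
ridge_only = elastic_net_penalty(coef, reg_lambda=0.1, reg_alpha=0.0)
lasso_only = elastic_net_penalty(coef, reg_lambda=0.1, reg_alpha=1.0)
no_penalty = elastic_net_penalty(coef)  # defaults: no regularization
```

With reg_alpha=0.0 only the \(\ell_2\) (ridge) term remains, with reg_alpha=1.0 only the \(\ell_1\) term remains, and intermediate values mix the two.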
- property coef#
Contains the output coefficients of the model. Its shape depends on the problem being solved. When
mod='mse', or when mod='logistic' but the data has 2 classes, it is a 1D ndarray of shape (ncoef,) where ncoef is nfeat+intercept. Otherwise, for a K-class problem, it is a 2D ndarray of shape (nrows, ncoef) where nrows is K-1 if constraint='rsc' and K otherwise.
- Type:
numpy.ndarray
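To illustrate how the constraint option changes the number of coefficient rows without changing the model itself, here is a small pure-Python sketch. It assumes class probabilities come from a softmax over per-class linear scores (the standard multinomial logistic form, not confirmed by this page) and that the last class acts as the reference; the helper names softmax_probs and to_rsc are hypothetical. Subtracting the reference row from every row turns a K-row 'ssc'-style array into one whose reference row is all zeros, leaving the predicted probabilities unchanged.

```python
import math


def softmax_probs(coef_rows, x):
    """Class probabilities from per-class coefficient rows (no intercept)."""
    scores = [sum(b * xi for b, xi in zip(row, x)) for row in coef_rows]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def to_rsc(coef_rows, ref=-1):
    """Shift all rows so that the reference class row becomes all zeros."""
    ref_row = coef_rows[ref]
    return [[b - r for b, r in zip(row, ref_row)] for row in coef_rows]


# K = 3 classes, 2 features; each column sums to zero ('ssc'-style).
ssc_coef = [[1.0, -0.5], [0.5, 1.5], [-1.5, -1.0]]
shifted = to_rsc(ssc_coef)   # last row is now [0.0, 0.0]
rsc_coef = shifted[:-1]      # K-1 rows are enough to store the model

x = [0.3, -1.2]
p_ssc = softmax_probs(ssc_coef, x)
p_rsc = softmax_probs(rsc_coef + [[0.0, 0.0]], x)  # same probabilities
```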
- property dual_coef#
Contains the dual coefficients of the model. Only valid for the CG solver and underdetermined problems.
- Type:
numpy.ndarray
- fit(X, y, x0=None)#
Computes the chosen linear model on the feature matrix X and response vector y.
- Parameters:
X (array-like) – The feature matrix on which to compute the model. Its shape is (n_samples, n_features).
y (array-like) – The response vector. Its shape is (n_samples,).
x0 (array-like, optional) – Initial guess for the solution. Applies only to iterative solvers. The required shape depends on the problem being solved (see the coef attribute). If None, x0 is set to a vector of zeros. Default=None.
- Returns:
Returns the instance itself.
- Return type:
self (object)
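Because the shape required for x0 follows the shape of coef, it can help to compute that shape up front. The following sketch encodes the shape rules stated for the coef property; expected_coef_shape is a hypothetical helper, not part of aoclda, and n_classes is an assumed argument name.

```python
def expected_coef_shape(n_features, intercept=False, mod="mse",
                        n_classes=2, constraint="ssc"):
    """Hypothetical helper mirroring the documented shape rules for coef."""
    ncoef = n_features + (1 if intercept else 0)
    if mod == "mse" or n_classes == 2:
        return (ncoef,)                      # 1D array
    nrows = n_classes - 1 if constraint == "rsc" else n_classes
    return (nrows, ncoef)                    # 2D array for K-class problems


shape_mse = expected_coef_shape(4, intercept=True)
shape_rsc = expected_coef_shape(4, mod="logistic", n_classes=3, constraint="rsc")
shape_ssc = expected_coef_shape(4, mod="logistic", n_classes=3, constraint="ssc")
```

An initial guess for a given problem could then be built as a zero-filled array of the returned shape.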
- property loss#
The value of loss function \(L(\beta_0, \beta)\).
- Type:
numpy.ndarray of shape (1, )
- property n_iter#
The number of iterations performed to find the solution. Only valid for iterative solvers.
- Type:
int
- property nrm_gradient_loss#
The norm of the gradient of the loss function. Only valid for iterative solvers.
- Type:
numpy.ndarray of shape (1, )
- predict(X)#
Evaluate the model on a data set X.
- Parameters:
X (array-like) – The feature matrix to evaluate the model on. It must have n_features columns.
- Returns:
The prediction vector, where n_samples is the number of rows of X.
- Return type:
numpy.ndarray of length n_samples
- property time#
Compute time (wall clock time in seconds).
- Type:
numpy.ndarray of shape (1, )
-
da_status da_linmod_select_model_s(da_handle handle, linmod_model mod)#
-
da_status da_linmod_select_model_d(da_handle handle, linmod_model mod)#
Select which linear model to compute.
The last suffix of the function name marks the floating point precision on which the handle operates (see precision section).
The model definition can be further enhanced with elements such as a regularization term by setting up optional parameters. See the linear model options section for more information.
- Parameters:
handle – [inout] a da_handle object, initialized with type da_handle_linmod.
mod – [in] a linmod_model enum type to select the linear model.
- Returns:
da_status. The function returns:
da_status_success - the operation was successfully completed.
da_status_wrong_type - the floating point precision of the arguments is incompatible with the handle initialization.
da_status_invalid_pointer - the handle has not been correctly initialized.
-
da_status da_linmod_define_features_s(da_handle handle, da_int n_samples, da_int n_features, const float *X, da_int ldx, const float *y)#
-
da_status da_linmod_define_features_d(da_handle handle, da_int n_samples, da_int n_features, const double *X, da_int ldx, const double *y)#
Define the data to train a linear model.
The last suffix of the function name marks the floating point precision on which the handle operates (see precision section).
Pass pointers to a data matrix X containing n_samples observations (rows) over n_features features (columns) and a response vector y of size n_samples.
Only the pointers to X and y are stored; an internal copy may be made depending on the model, solver and scaling method selected.
- Parameters:
handle – [inout] a da_handle object, initialized with type da_handle_linmod.
n_samples – [in] the number of observations (rows) of the data matrix X. Constraint: n_samples \(\ge\) 1.
n_features – [in] the number of features (columns) of the data matrix X. Constraint: n_features \(\ge\) 1.
X – [in] the n_samples \(\times\) n_features data matrix. For best performance, store the data in column-major order; row-major storage can be selected by setting the storage order option.
ldx – [in] the leading dimension of the data matrix X. Constraint: ldx \(\ge\) n_samples if the data is stored in column-major order, or ldx \(\ge\) n_features if the data is stored in row-major order.
y – [in] the response vector, of size n_samples.
- Returns:
da_status. The function returns:
da_status_success - the operation was successfully completed.
da_status_wrong_type - the floating point precision of the arguments is incompatible with the handle initialization.
da_status_invalid_pointer - the handle has not been correctly initialized.
da_status_invalid_input - one of the arguments had an invalid value. You can obtain further information using da_handle_print_error_message.
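The role of ldx can be made concrete with a small sketch of flat-array indexing. It assumes the usual BLAS/LAPACK-style convention, in which element (i, j) of a column-major matrix lives at offset i + j*ldx and that of a row-major matrix at offset i*ldx + j. The helper names are hypothetical, and the snippet is written in Python purely to show the arithmetic.

```python
def offset_col_major(i, j, ldx):
    """Offset of element (i, j) in column-major storage; needs ldx >= n_samples."""
    return i + j * ldx


def offset_row_major(i, j, ldx):
    """Offset of element (i, j) in row-major storage; needs ldx >= n_features."""
    return i * ldx + j


n_samples, n_features = 3, 2
matrix = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]

# Column-major buffer with one padding entry per column (ldx = 4 > n_samples).
ldx = 4
buf = [0.0] * (ldx * n_features)
for i in range(n_samples):
    for j in range(n_features):
        buf[offset_col_major(i, j, ldx)] = matrix[i][j]
```

Setting ldx larger than the minimum allows a sub-block of a larger allocation to be described without copying it into a tightly packed buffer.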
-
da_status da_linmod_fit_d(da_handle handle)#
Fit the linear model defined in the handle.
Compute the linear model defined by da_linmod_select_model_? on the data passed by the last call to the function da_linmod_define_features_?.
Note that you can customize the model (e.g., the regularization terms) before calling the fit function by setting optional parameters; see the linear model options section for a list of available options.
- Parameters:
handle – [inout] a da_handle object, initialized with type da_handle_linmod.
- Returns:
da_status. The function returns:
da_status_success - the operation was successfully completed.
da_status_wrong_type - the floating point precision of the arguments is incompatible with the handle initialization.
da_status_invalid_pointer - the handle has not been correctly initialized.
da_status_incompatible_options - some of the options set are incompatible with the model defined in handle. You can obtain further information using da_handle_print_error_message.
da_status_memory_error - internal memory allocation encountered a problem.
da_status_internal_error - an unexpected error occurred.
-
da_status da_linmod_fit_start_d(da_handle handle, da_int n_coefs, const double *coefs)#
Fit the linear model defined in the handle using a custom starting estimate for the model coefficients.
Compute the same model as da_linmod_fit_?, starting the fitting process with the custom values defined in coefs.
- Parameters:
handle – [inout] a da_handle object, initialized with type da_handle_linmod.
n_coefs – [in] the number of coefficients provided in coefs. It must match the number of expected coefficients for the model defined in handle to be taken into account.
coefs – [in] the initial coefficients.
- Returns:
da_status. The function returns:
da_status_success - the operation was successfully completed.
da_status_wrong_type - the floating point precision of the arguments is incompatible with the handle initialization.
da_status_invalid_pointer - the handle has not been correctly initialized.
da_status_incompatible_options - some of the options set are incompatible with the model defined in handle. You can obtain further information using da_handle_print_error_message.
da_status_memory_error - internal memory allocation encountered a problem.
da_status_internal_error - an unexpected error occurred.
-
da_status da_linmod_evaluate_model_s(da_handle handle, da_int n_samples, da_int n_features, const float *X, da_int ldx, float *predictions, float *observations, float *loss)#
-
da_status da_linmod_evaluate_model_d(da_handle handle, da_int n_samples, da_int n_features, const double *X, da_int ldx, double *predictions, double *observations, double *loss)#
Evaluate the model previously computed on a new set of data X and observations y.
After a model has been fitted using da_linmod_fit_?, it can be evaluated on a new set of data and observations. This function returns the model evaluation (loss) in the array loss and the predictions in predictions.
In the case where the model chosen solves a classification problem (e.g., logistic regression), the predictions computed will be categorical. For each data point i, predictions[i] will contain the index of the most likely class according to the model.
- Parameters:
handle – [inout] a da_handle object, initialized with type da_handle_linmod.
n_samples – [in] number of rows of X or equivalently the number of samples to evaluate the model on.
n_features – [in] number of columns of X or equivalently the number of features of the test data. It must match the number of features of the data defined in the handle.
X – [in] the n_samples \(\times\) n_features data matrix. For best performance, store the data in column-major order; row-major storage can be selected by setting the storage order option.
ldx – [in] the leading dimension of the data matrix X. Constraint: ldx \(\ge\) n_samples if the data is stored in column-major order, or ldx \(\ge\) n_features if the data is stored in row-major order.
predictions – [out] vector of size n_samples containing the model’s predictions.
observations – [in] vector of size n_samples containing new observations; may be NULL if none are provided.
loss – [out] scalar containing the model’s loss given the new data X and the new observations y; may be NULL if no observations are provided. Note that the observations and loss parameters must either both be NULL or both contain a valid address.
- Returns:
da_status. The function returns:
da_status_success - the operation was successfully completed.
da_status_wrong_type - the floating point precision of the arguments is incompatible with the handle initialization.
da_status_invalid_pointer - the handle has not been correctly initialized.
da_status_invalid_input - one of the arguments had an invalid value. You can obtain further information using da_handle_print_error_message.
da_status_out_of_date - the model has not been trained yet.
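For classification models, the statement that predictions[i] holds the index of the most likely class can be pictured as an argmax over per-class probabilities. The sketch below is illustrative only: the probability values are made up, zero-based class indices are assumed, and most_likely_class is a hypothetical helper rather than a library function.

```python
def most_likely_class(class_probs):
    """Return the index of the largest probability (first one on ties)."""
    return max(range(len(class_probs)), key=lambda k: class_probs[k])


# Made-up class probabilities for three samples of a 3-class problem.
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.25, 0.5, 0.25],
]
predictions = [most_likely_class(p) for p in probs]
```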
-
typedef enum linmod_model_ linmod_model#
Alias for the linmod_model_ enum.
-
enum linmod_model_#
Defines which linear model is computed.
Values:
-
enumerator linmod_model_undefined#
No linear model set.
-
enumerator linmod_model_mse#
\(L_2\) norm linear regression.
-
enumerator linmod_model_logistic#
Logistic regression.
-
enum da_linmod_info_t_#
Indices of the information vector containing metrics from optimization solvers.
The information vector can be retrieved after a successful return from the fit function da_linmod_fit_? by querying the handle, using da_handle_get_result_? and passing the query da_result_::da_rinfo.
Values:
-
enumerator linmod_info_objective#
objective value
-
enumerator linmod_info_grad_norm#
norm of the objective gradient
-
enumerator linmod_info_iter#
number of iterations
-
enumerator linmod_info_time#
current time
-
enumerator linmod_info_nevalf#
number of objective function callback evaluations
-
enumerator linmod_info_inorm#
infinity norm of gradient
-
enumerator linmod_info_inorm_init#
infinity norm of gradient at the initial iteration
-
enumerator linmod_info_ncheap#
number of objective function callback evaluations requesting “cheap” update (only coordinate descent solver)
-
enumerator linmod_info_optim#
optimality measure (only coordinate descent solver)
-
enumerator linmod_info_optimcnt#
number of optimality checks (only coordinate descent solver)
-
enumerator linmod_info_nsamples#
number of rows in the input matrix
-
enumerator linmod_info_nfeat#
number of columns in the input matrix
-
enumerator linmod_info_nclass#
number of classes in the input data (0 when regression problem)
-
enumerator linmod_info_nrow_coef#
number of rows of the coefficient array
-
enumerator linmod_info_ncol_coef#
number of columns of the coefficient array
-
enumerator linmod_info_well_determined#
flag indicating if the problem is well determined
-
enumerator linmod_info_number#
for internal use
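As a sketch of how the information vector might be indexed, the enumerators can be mirrored in a Python IntEnum. This assumes the enumerators are declared in the order listed above and numbered from 0 (the usual C default, not confirmed by this page); the LinmodInfo class and the values stored in info are made up for illustration.

```python
from enum import IntEnum

# ASSUMPTION: declaration order matches the listing above, starting at 0.
LinmodInfo = IntEnum(
    "LinmodInfo",
    [
        "objective", "grad_norm", "iter", "time", "nevalf", "inorm",
        "inorm_init", "ncheap", "optim", "optimcnt", "nsamples", "nfeat",
        "nclass", "nrow_coef", "ncol_coef", "well_determined", "number",
    ],
    start=0,
)

# Made-up information vector with LinmodInfo.number entries.
info = [0.0] * LinmodInfo.number
info[LinmodInfo.objective] = 1.25
info[LinmodInfo.iter] = 17.0

n_iter = int(info[LinmodInfo.iter])
```

Under this assumption the final enumerator, linmod_info_number, doubles as the length of the vector, which is a common C idiom for sizing such arrays.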