Logistic regression is a supervised classification model for assigning labels to observations. In AOCL-DA, the labels are expected to be provided in a categorical response variable, \(y\), encoded as \(\{0, 1, 2, \ldots, K-1\}\). The fit is based on maximizing the log-likelihood (loss function) of the probabilities that each observation \(i\) belongs to a given class.
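For reference, a sketch of the standard multinomial (softmax) parameterization commonly used for these class probabilities; the feature vectors \(x_i\) and per-class coefficient vectors \(\beta_k\) are assumptions of this sketch, not symbols defined in the surrounding text:

\[
p(y_i = k \mid x_i) = \frac{e^{x_i^T \beta_k}}{\sum_{j=0}^{K-1} e^{x_i^T \beta_j}},
\qquad
L(\beta) = \sum_{i=1}^{n} \log p(y_i \mid x_i).
\]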
As an example, if \(K=2\), the loss function simplifies to the standard binary cross-entropy.
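Under the same assumed notation, with \(p_i = 1/(1 + e^{-x_i^T \beta})\) denoting the probability that observation \(i\) belongs to class 1, the binary case takes the familiar form:

\[
L(\beta) = \sum_{i=1}^{n} \left[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \right].
\]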
As in the linear regression model, \(\ell_1\) or \(\ell_2\) regularization can be applied by adding the corresponding penalty term to the cost function.
When the model uses \(\ell_1\) regularization it is also known as a lasso model, while with \(\ell_2\) regularization it is called a ridge model. When both regularization terms are present, it is termed an elastic-net model.
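The penalized objective described above can be sketched in a few lines of NumPy. This is an illustrative stand-alone implementation of the binary elastic-net logistic loss, not the AOCL-DA API; the function name, the penalty scaling (\(\lambda_1 \|\beta\|_1 + \tfrac{\lambda_2}{2} \|\beta\|_2^2\)), and the absence of an intercept term are all assumptions made for the sketch:

```python
import numpy as np

def logistic_loss_elastic_net(beta, X, y, lam1=0.0, lam2=0.0):
    """Negative log-likelihood of binary logistic regression plus an
    elastic-net penalty lam1*||beta||_1 + (lam2/2)*||beta||_2^2.
    Illustrative sketch only; names and scaling are assumptions,
    not the AOCL-DA interface."""
    z = X @ beta
    # Stable evaluation of log(1 + exp(z)) via logaddexp(0, z);
    # the binary negative log-likelihood is sum_i [log(1+e^{z_i}) - y_i z_i].
    nll = np.sum(np.logaddexp(0.0, z) - y * z)
    # Elastic-net penalty: lasso (l1) term plus ridge (l2) term.
    penalty = lam1 * np.sum(np.abs(beta)) + 0.5 * lam2 * np.sum(beta**2)
    return nll + penalty
```

Setting `lam2=0` recovers the lasso objective and `lam1=0` the ridge objective, mirroring the naming convention in the text.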