k-means clustering aims to partition a set of \(n_{\mathrm{samples}}\) data points \(\{x_1, x_2, \dots, x_{n_{\mathrm{samples}}}\}\) into \(n_{\mathrm{clusters}}\) groups, where each group is described by the mean of the data points assigned to it, giving means \(\{\mu_1, \mu_2, \dots, \mu_{n_{\mathrm{clusters}}}\}\). These means are commonly known as the cluster centres or centroids; note that they are not, in general, points from the original data matrix, although they live in the same space.
Various k-means algorithms are available, but each one proceeds by attempting to minimize a quantity known as the inertia, or the within-cluster sum-of-squares:

\[
\sum_{i=1}^{n_{\mathrm{samples}}} \min_{j \in \{1, \dots, n_{\mathrm{clusters}}\}} \lVert x_i - \mu_j \rVert^2
\]
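As a concrete illustration of the within-cluster sum-of-squares, the sketch below computes it with numpy for a hypothetical assignment of points to centroids (the data, labels, and centroid values are invented for the example):

```python
import numpy as np

def inertia(X, labels, centroids):
    """Within-cluster sum of squared distances from each point
    to the centroid of the cluster it is assigned to."""
    return float(np.sum((X - centroids[labels]) ** 2))

# Tiny example: four 1-D points forming two obvious clusters.
X = np.array([[0.0], [1.0], [10.0], [11.0]])
labels = np.array([0, 0, 1, 1])            # cluster index of each point
centroids = np.array([[0.5], [10.5]])      # mean of each cluster

print(inertia(X, labels, centroids))       # 4 * 0.25 = 1.0
```

Each point here lies 0.5 away from its centroid, so the inertia is four times \(0.5^2\), i.e. 1.0.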
Since minimizing the inertia is an NP-hard problem, the algorithms are heuristic in nature and converge only to local optima. It is therefore often desirable to run the algorithm several times from different initialisations and keep the result with the smallest inertia.
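The restart strategy above can be sketched with a minimal implementation of Lloyd's algorithm, the classic k-means heuristic: alternate assignment and mean-update steps until the centroids stop moving, repeat from several random initialisations, and keep the run with the lowest inertia. The function names, iteration caps, and restart count below are illustrative choices, not a reference implementation:

```python
import numpy as np

def lloyd(X, k, rng, max_iter=100):
    """One run of Lloyd's algorithm from a random initialisation."""
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid becomes the mean of its points
        # (an empty cluster keeps its previous centroid).
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    inertia = float(np.sum((X - centroids[labels]) ** 2))
    return labels, centroids, inertia

def kmeans(X, k, n_init=10, seed=0):
    """Run Lloyd's algorithm n_init times; keep the smallest-inertia run."""
    rng = np.random.default_rng(seed)
    return min((lloyd(X, k, rng) for _ in range(n_init)),
               key=lambda run: run[2])

X = np.array([[0.0, 0.0], [0.0, 1.0], [9.0, 9.0], [9.0, 10.0]])
labels, centroids, best_inertia = kmeans(X, k=2)
```

Running several initialisations matters because a single unlucky start (e.g. both initial centroids drawn from the same true cluster) can leave Lloyd's algorithm stuck at a clearly worse local optimum.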