S. Adachi, S. Iwata, Y. Nakatsukasa, and A. Takeda. Solving the trust region subproblem by a generalized eigenvalue problem. Technical Report, Mathematical Engineering, The University of Tokyo, 2015. URL: https://www.keisu.t.u-tokyo.ac.jp/data/2015/METR15-14.pdf.
Christopher M Bishop. Pattern recognition and machine learning. Number 4. Springer, 2006. https://www.microsoft.com/en-us/research/uploads/prod/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf.
A. R. Conn, N. I. M. Gould, and Ph. L. Toint. Trust Region Methods. SIAM, Philadelphia, 2000.
C. Elkan. Using the triangle inequality to accelerate k-means. Proceedings of the 20th international conference on Machine Learning, pages 147–153, 2003.
Jerome Friedman, Trevor Hastie, Holger Höfling, and Robert Tibshirani. Pathwise coordinate optimization. The Annals of Applied Statistics, 1(2):302–332, 2007. URL: https://doi.org/10.1214/07-AOAS131, doi:10.1214/07-AOAS131.
Jerome H. Friedman, Trevor Hastie, and Rob Tibshirani. Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 2010.
N. I. M. Gould, D. Orban, and Ph. L. Toint. GALAHAD, a library of thread-safe Fortran 90 packages for large-scale nonlinear optimization. ACM Transactions on Mathematical Software (TOMS), 29(4):353–372, 2003.
N. I. M. Gould, T. Rees, and J. A. Scott. A higher order method for solving nonlinear least-squares problems. Technical Report, STFC Rutherford Appleton Laboratory, 2017.
John A Hartigan and Manchek A Wong. Algorithm AS 136: a k-means clustering algorithm. Journal of the Royal Statistical Society. Series C (Applied Statistics), 28(1):100–108, 1979.
Trevor Hastie, Robert Tibshirani, and Jerome H Friedman. The elements of statistical learning: data mining, inference, and prediction. Springer, 2009.
Rob J Hyndman and Yanan Fan. Sample quantiles in statistical packages. The American Statistician, 50(4):361–365, 1996.
C. Kanzow, N. Yamashita, and M. Fukushima. Levenberg-Marquardt methods with strong local convergence properties for solving nonlinear equations with convex constraints. Journal of Computational and Applied Mathematics, 174:375–397, 2004.
Stephen Kokoska and Daniel Zwillinger. CRC Standard probability and statistics tables and formulae. CRC Press, 2000.
Dong C. Liu and Jorge Nocedal. On the limited memory BFGS method for large scale optimization. Mathematical Programming, 45:503–528, 1989.
Stuart Lloyd. Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2):129–137, 1982.
James MacQueen. Some methods for classification and analysis of multivariate observations. In Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, volume 1, 281–297. Oakland, CA, USA, 1967.
J. Nocedal and S. J. Wright. Numerical Optimization. Springer Series in Operations Research, Springer, New York, 2nd edition, 2006.
John Rice. Mathematical statistics and data analysis. Duxbury Press, 1995.
Zeyi Wen, Jiashuai Shi, Qinbin Li, Bingsheng He, and Jian Chen. ThunderSVM: a fast SVM library on GPUs and CPUs. Journal of Machine Learning Research, 19:797–801, 2018.