Massart, Estelle
[UCL]
Abrol, Vinayak
[CSE Department Infosys Centre for AI, IIIT Delhi - India]
To alleviate the cost incurred by orthogonality constraints in optimization and model training, we propose a stochastic coordinate descent algorithm on the Stiefel manifold. We compute expressions for geodesics on the Stiefel manifold with initial velocity aligned with coordinates of the tangent space and show that, analogously to the orthogonal group, iterate updates of coordinate descent methods can be efficiently implemented in terms of multiplications by Givens matrices. We illustrate our proposed algorithm on deep neural network training.
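As an illustration of the idea summarized above, the following is a minimal sketch and not the authors' implementation. It assumes a point X on the Stiefel manifold (an n x p matrix with orthonormal columns), a hypothetical test objective f(X) = -0.5 trace(X^T A X), uniformly random selection of the index pair, and a fixed step size. The function names `givens_rotate_rows` and `coordinate_descent_step` are invented for this example. Each update multiplies X on the left by a Givens rotation, which changes only two rows and keeps the columns orthonormal; whether this curve matches the geodesics derived in the paper is not claimed here.

```python
import numpy as np


def givens_rotate_rows(X, i, j, theta):
    """Return G(i, j, theta) @ X, where G is a Givens rotation acting on rows i and j."""
    c, s = np.cos(theta), np.sin(theta)
    Y = X.copy()
    Y[i, :] = c * X[i, :] - s * X[j, :]
    Y[j, :] = s * X[i, :] + c * X[j, :]
    return Y


def coordinate_descent_step(X, euclid_grad, i, j, step_size):
    """One coordinate update: rotate rows i and j against the slope of f along the rotation."""
    G = euclid_grad(X)
    # Derivative of theta -> f(G(i, j, theta) @ X) at theta = 0.
    slope = G[j, :] @ X[i, :] - G[i, :] @ X[j, :]
    return givens_rotate_rows(X, i, j, -step_size * slope)


# Hypothetical test problem (an assumption for this sketch, not taken from the paper).
rng = np.random.default_rng(0)
n, p = 8, 3
A = np.diag(np.arange(1, n + 1) / n)  # symmetric matrix with spectral norm 1


def f(X):
    return -0.5 * np.trace(X.T @ A @ X)


def euclid_grad(X):
    return -A @ X


# Random starting point with orthonormal columns (reduced QR factor).
X0, _ = np.linalg.qr(rng.standard_normal((n, p)))
X = X0.copy()

for _ in range(500):
    # Stochastic choice of the coordinate: a random pair of distinct row indices.
    i, j = rng.choice(n, size=2, replace=False)
    X = coordinate_descent_step(X, euclid_grad, i, j, step_size=0.2)

orthogonality_error = np.linalg.norm(X.T @ X - np.eye(p))
print("f(X0) =", f(X0), " f(X) =", f(X), " ||X^T X - I|| =", orthogonality_error)
```

Each step costs O(p) arithmetic for the rotation itself, since only two rows of X are modified; the sketch recomputes the full Euclidean gradient at every step for simplicity, which a practical implementation would likely avoid.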
Bibliographic reference
Massart, Estelle ; Abrol, Vinayak. Coordinate descent on the Stiefel manifold for deep neural network training. 31st European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (Bruges, Belgium, from 04/10/2023 to 06/10/2023). In: ESANN 2023 proceedings, 2023
Permanent URL
http://hdl.handle.net/2078.1/289245