Van Dessel, Guillaume [UCL]
Glineur, François [UCL]
Stich, Sebastian [UCL]
With the advent of massive data collection and increasingly powerful machines, interest in large-scale optimization problems has grown rapidly. Typical machine learning tasks require solving an optimization problem over a composite loss function whose number of terms scales with the number $n$ of observations. Since $n$ can reach orders of magnitude of $10^9$ or even $10^{12}$, traditional techniques relying on full gradient information at every update step are no longer time-efficient. This motivates replacing these plain methods with strategies that keep a light per-iteration cost. This is the aim of stochastic gradient methods, the topic of this master thesis. In summary, the main contributions of this master thesis are a review and unification of stochastic gradient based methods, built on personal definitions. A new importance sampling framework, namely K-greedy sampling, is introduced; it is shown to compete with the reference importance sampling scheme for stochastic optimization first introduced in 2014 by Zhao and Zhang. Finally, the most promising contribution of this work is the development of new optimization methods that exploit knowledge of a quadratic upper bound on the smooth part of the objective function: KProx-SVRG (for which a linear rate of convergence is proved), KProx-SARAH, and a heuristic, AdapKProx-SVRG, are motivated and tested.
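To make the light per-iteration cost and the importance sampling idea concrete, here is a minimal sketch, not taken from the thesis: it runs stochastic gradient descent on a finite-sum least-squares loss, drawing each component with probability proportional to its smoothness constant in the spirit of Zhao and Zhang (2014) and reweighting the sampled gradient so the estimator stays unbiased. All problem sizes, constants, and variable names are illustrative.

```python
import numpy as np

# Finite-sum objective f(x) = (1/n) * sum_i (a_i^T x - b_i)^2.
# Each update touches a single component, so the per-iteration cost is O(d)
# instead of the O(n*d) of a full-gradient step.
rng = np.random.default_rng(0)
n, d = 1000, 20                       # n observations, d parameters (illustrative)
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

L = 2.0 * np.sum(A**2, axis=1)        # per-component smoothness constants L_i
p = L / L.sum()                       # Zhao-Zhang-style sampling: p_i proportional to L_i

x = np.zeros(d)
step = 0.5 / L.mean()                 # constant step size (illustrative choice)
for t in range(20000):
    i = rng.choice(n, p=p)                     # importance-sampled index
    grad_i = 2.0 * (A[i] @ x - b[i]) * A[i]    # gradient of the i-th term
    x -= step * grad_i / (n * p[i])            # reweight so E[update] = full gradient

print("final loss:", np.mean((A @ x - b) ** 2))
```

Dividing the sampled gradient by $n p_i$ keeps the estimator unbiased for any sampling distribution $p$; choosing $p_i \propto L_i$ reduces its variance relative to uniform sampling when the smoothness constants are heterogeneous, which is the motivation behind importance sampling schemes of this kind.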
Bibliographic reference: Van Dessel, Guillaume. Stochastic gradient based methods for large-scale optimization: review, unification and new approaches. Ecole polytechnique de Louvain, Université catholique de Louvain, 2019. Prom.: Glineur, François; Stich, Sebastian.
Permanent URL: http://hdl.handle.net/2078.1/thesis:19568