We analyze the adaptive first-order algorithm AMSGrad for solving a constrained stochastic optimization problem with a weakly convex objective. We prove a $\tilde{O}(t^{-1/2})$ rate of convergence for the squared norm of the gradient of the Moreau envelope, which is the standard stationarity measure for this class of problems. This rate matches the known rates that adaptive algorithms enjoy in the special case of unconstrained smooth nonconvex stochastic optimization. Our analysis works with a mini-batch size of 1, constant first- and second-order moment parameters, and possibly unbounded optimization domains. Finally, we illustrate the applications and extensions of our results to specific problems and algorithms.
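To make the setting concrete, the following is a minimal sketch of a projected AMSGrad iteration with mini-batch size 1 and constant moment parameters. It is not taken from the paper: the function names, the toy objective, the box constraint, and the $\alpha/\sqrt{t}$ step size are all illustrative assumptions. A box is used because, for diagonal scaling, the scaled projection reduces to coordinate-wise clipping; other constraint sets would generally need a projection in the scaled norm.

```python
import numpy as np

def projected_amsgrad(grad, project, x0, steps=2000, alpha=0.1,
                      beta1=0.9, beta2=0.999, eps=1e-8, seed=0):
    """Illustrative projected AMSGrad loop (hypothetical helper, not the paper's code).

    grad(x, rng) returns one stochastic gradient sample (mini-batch size 1).
    project(x) maps a point back onto the constraint set.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    m = np.zeros_like(x)      # first-moment estimate
    v = np.zeros_like(x)      # second-moment estimate
    v_hat = np.zeros_like(x)  # running coordinate-wise maximum of v
    for t in range(1, steps + 1):
        g = grad(x, rng)
        m = beta1 * m + (1 - beta1) * g        # constant beta1
        v = beta2 * v + (1 - beta2) * g * g    # constant beta2
        v_hat = np.maximum(v_hat, v)           # AMSGrad max step
        step = alpha / np.sqrt(t)              # assumed decaying step size
        x = project(x - step * m / (np.sqrt(v_hat) + eps))
    return x

# Toy problem (assumption): minimize E[0.5 * ||x - c||^2] over the box [-1, 1]^2.
c = np.array([2.0, -3.0])

def noisy_grad(x, rng):
    # Exact gradient x - c plus Gaussian noise, standing in for a single sample.
    return x - c + 0.1 * rng.standard_normal(x.shape)

def project_box(x):
    # Coordinate-wise clipping onto [-1, 1]^2.
    return np.clip(x, -1.0, 1.0)

x_final = projected_amsgrad(noisy_grad, project_box, x0=np.zeros(2))
```

On this toy problem the constrained minimizer is the projection of `c` onto the box, so `x_final` can be compared against `project_box(c)` as a sanity check. The quadratic here is convex, which is a special case of weak convexity; it is chosen only because its solution is known in closed form.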
Funding Agencies|NSF [2023239]; DOE ASCR from Argonne National Laboratory [8F-30039]; Wallenberg AI, Autonomous Systems and Software Program (WASP) - Knut and Alice Wallenberg Foundation [305286]; European Research Council (ERC) under the European Union [725594]; Swiss National Science Foundation (SNSF) [200021_178865/1]; Department of the Navy, Office of Naval Research (ONR) [N62909-17-1-2111]; Hasler Foundation Program: Cyber Human Systems [16066]