CHAPTER 1 Introduction
CHAPTER 2 Accelerated Algorithms for Unconstrained Convex Optimization
1. Preliminaries
2. Accelerated Gradient Method for smooth optimization
3. Extension to the Composite Optimization
3.1. Nesterov's First Scheme
3.2. Nesterov's Second Scheme
3.2.1. A Primal Dual Perspective
3.3. Nesterov's Third Scheme
4. Inexact Proximal and Gradient Computing
4.1. Inexact Accelerated Gradient Descent
4.2. Inexact Accelerated Proximal Point Method
5. Restart
6. Smoothing for Nonsmooth Optimization
7. Higher Order Accelerated Method
8. Explanation: An Variational Perspective 8.1. Discretization
CHAPTER 3 Accelerated Algorithms for Constrained Convex Optimization
1. Preliminaries
1.1. Case Study: Linear Equality Constraint
2. Accelerated Penalty Method
2.1. Non-strongly Convex Objectives
2.2. Strong Convex Objectives
3. Accelerated Lagrange Multiplier Method 3.1. Recovering the Primal Solution
3.2. Accelerated Augmented Lagrange Multiplier Method
4. Accelerated Alternating Direction Method of Multipliers
4.1. Non-strongly Convex and Non-smooth
4.2. Strongly Convex and Non-smooth
4.3. Non-strongly Convex and Smooth
4.4. Strongly Convex and Smooth
4.5. Non-ergodic Convergence Rate
4.5.1. Original ADMM
4.5.2. ADMM with Extrapolation and Increasing Penalty Parameter
5. Accelerated Primal Dual Method
5.1. Case 1
5.2. Case 2
5.3. Case 3
5.4. Case 4
CHAPTER 4 Accelerated Algorithms for Nonconvex Optimization 1. Proximal Gradient with Momentum
1.1. Basic Assumptions
1.2. Convergence Theorem
1.3. Another Method: Monotone APG
2. AGD Achieves the Critical Points Quickly
2.1. AGD as a Convexity Monitor
2.2. Negative Curvature
2.3. Accelerating Nonconvex Optimization 3. AGD Escapes the Saddle Points Quickly
3.1. Almost Convex
3.2. Negative Curvature Descent
3.3. AGD for Non-Convex Problem
3.3.1. Locally Almost Convex! Globally Almost Convex
3.3.2. Outer Iterations
3.3.3. Inner Iterations
CHAPTER 5 Accelerated Stochastic Algorithms 1. The Individual Convexity Case
1.1. Accelerated Stochastic Coordinate Descent
1.2. Background for Variance Reduction Methods
1.3. Accelerated Stochastic Variance Reduction Method
1.4. Black-Box Acceleration
2. The Individual Non-convexity Case
2.1. Individual Non-convex but Integrally Convex
3. The Non-Convexity Case
3.1. SPIDER
3.2. Momentum Acceleration
4. Constrained Problem
5. Infinity Case
CHAPTER 6 Paralleling Algorithms
1. Accelerated Asynchronous Algorithms
1.1. Asynchronous Accelerated Gradient Descent
1.2. Asynchronous Accelerated Stochastic Coordinate Descent
2. Accelerated Distributed Algorithms
2.1. Centralized Topology
2.1.1. Large Mini-batch Algorithms
2.1.2. Dual Communication-Efficient Methods
2.2. Decentralized Topology
CHAPTER 7 Conclusions
APPENDIX Mathematical Preliminaries
About the Author: Zhouchen Lin is a leading expert in the fields of machine learning and computer vision. He is currently a Professor at the Key Laboratory of Machine Perception (Ministry of Education), School of EECS, Peking University. He served as an area chair for several prestigious conferences, including CVPR, ICCV, ICML, NIPS, AAAI and IJCAI. He is an associate editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence and the International Journal of Computer Vision. He is a Fellow of IAPR and IEEE.
Huan Li received his Ph.D. degree in machine learning from Peking University in 2019. He is currently an Assistant Professor at the College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics. His current research interests include optimization and machine learning.
Cong Fang received his Ph.D. degree from Peking University in 2019. He is currently a Postdoctoral Researcher at Princeton University. His research interests include machine learning and optimization.