Dynamic Decoupling of Placid Terminal Attractor-based Gradient Descent Algorithm
Abstract
Gradient descent (GD) and stochastic gradient descent (SGD) are widely used across a large number of application domains, so understanding the dynamics of GD and improving its convergence speed remain of great importance. This paper analyzes the dynamics of GD based on the terminal attractor at different stages of its gradient flow. Building on terminal sliding mode theory and terminal attractor theory, four adaptive learning rates are designed. Their performance is examined through a detailed theoretical analysis, and both the running times and the total times of the resulting learning procedures are evaluated and compared. Their effectiveness is assessed through simulations on a function approximation problem and an image classification problem.
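As a rough illustration of the general terminal attractor idea mentioned in the abstract, the sketch below uses one textbook-style adaptive learning rate, eta = gamma * E^beta / ||grad E||^2 with 0 < beta < 1, which in continuous time gives dE/dt = -gamma * E^beta and drives the loss E to zero in finite time T = E0^(1-beta) / (gamma * (1-beta)). This is an assumption chosen for illustration and is not one of the four learning rates designed in the paper; the toy quadratic loss, the function name `terminal_attractor_gd`, and the parameters `gamma`, `beta`, `dt`, and `eps` are all hypothetical.

```python
# Minimal sketch of a terminal-attractor-style adaptive learning rate.
# Assumption: a toy quadratic loss whose minimum value is exactly 0,
# which the E**beta term relies on. Not the paper's algorithm.

def loss(w):
    # E(w) = 0.5 * (w1^2 + 4 * w2^2), minimized at w = (0, 0) with E = 0
    return 0.5 * (w[0] ** 2 + 4.0 * w[1] ** 2)


def grad(w):
    # Analytic gradient of the toy loss above
    return [w[0], 4.0 * w[1]]


def terminal_attractor_gd(w0, gamma=1.0, beta=0.5, dt=0.01, steps=1000, eps=1e-12):
    """Euler discretization of dw/dt = -gamma * E^beta * g / ||g||^2."""
    w = list(w0)
    history = [loss(w)]
    for _ in range(steps):
        e = loss(w)
        g = grad(w)
        g2 = sum(gi * gi for gi in g)
        # Adaptive rate: grows relative to the gradient as the gradient
        # vanishes, so the loss itself follows dE/dt ~ -gamma * E^beta.
        # eps guards against division by zero exactly at the minimum.
        eta = dt * gamma * e ** beta / (g2 + eps)
        w = [wi - eta * gi for wi, gi in zip(w, g)]
        history.append(loss(w))
    return w, history


w_final, history = terminal_attractor_gd([2.0, 1.0])
print("initial loss:", history[0])
print("final loss:", history[-1])
```

With these settings the continuous-time model predicts a settling time of about T = 4 time units (roughly 400 steps at dt = 0.01). Because the update is discretized, the iterate does not land exactly on the minimum; it ends up hovering in a small neighborhood whose size shrinks with `dt`.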
- Publication: arXiv e-prints
- Pub Date: September 2024
- DOI: 10.48550/arXiv.2409.06542
- arXiv: arXiv:2409.06542
- Bibcode: 2024arXiv240906542Z
- Keywords: Computer Science - Machine Learning
- E-Print: 8 pages, 4 figures