Nesterov Momentum update policy for Stochastic Gradient Descent (SGD).
Public Member Functions | |
| NesterovMomentumUpdate (const double momentum=0.5) | |
| Construct the Nesterov Momentum update policy with the given parameters. | |
| void | Initialize (const size_t rows, const size_t cols) |
| The Initialize method is called by the SGD Optimizer method before the start of the iteration update process. | |
| double | Momentum () const |
| Get the value used to initialize the momentum coefficient. | |
| double & | Momentum () |
| Modify the value used to initialize the momentum coefficient. | |
| void | Update (arma::mat &iterate, const double stepSize, const arma::mat &gradient) |
| Update step for SGD. | |
Nesterov Momentum update policy for Stochastic Gradient Descent (SGD).
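Learning with plain SGD can be slow. Standard (classical) momentum can accelerate convergence by accumulating a velocity across iterations. Nesterov momentum can improve the convergence rate further, to O(1/k^2) after k iterations; this rate is the classical result for smooth convex objectives with exact gradients.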
Definition at line 38 of file nesterov_momentum_update.hpp.
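The difference between the two momentum schemes mentioned above can be written out as follows. This is the commonly cited textbook formulation, not a transcription of the header; the symbols are assumptions introduced here: \theta_t is the iterate, v_t the velocity, \mu the momentum coefficient, \epsilon the step size, and f the objective.

```latex
% Classical momentum: gradient evaluated at the current iterate.
\begin{aligned}
v_{t+1} &= \mu\, v_t - \epsilon\, \nabla f(\theta_t), \\
\theta_{t+1} &= \theta_t + v_{t+1}.
\end{aligned}

% Nesterov momentum: gradient evaluated at the look-ahead point.
\begin{aligned}
v_{t+1} &= \mu\, v_t - \epsilon\, \nabla f(\theta_t + \mu\, v_t), \\
\theta_{t+1} &= \theta_t + v_{t+1}.
\end{aligned}
```

The only change is where the gradient is evaluated: Nesterov momentum first moves along the current velocity and then corrects, which tends to damp overshoot.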
| NesterovMomentumUpdate (const double momentum=0.5) | inline |
Construct the Nesterov Momentum update policy with the given parameters.
Definition at line 45 of file nesterov_momentum_update.hpp.
| void | Initialize (const size_t rows, const size_t cols) | inline |
The Initialize method is called by the SGD Optimizer method before the start of the iteration update process.
In this update policy, the velocity matrix is initialized to a zero matrix of the same size as the gradient matrix (see mlpack::optimization::SGD::Optimizer).
| rows | Number of rows in the gradient matrix. |
| cols | Number of columns in the gradient matrix. |
Definition at line 60 of file nesterov_momentum_update.hpp.
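A minimal sketch of what this initialization amounts to, assuming a flat std::vector<double> in place of arma::mat so the example needs no external library. The type name VelocitySketch and its members are hypothetical and are not part of mlpack.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the policy's velocity storage. The real class
// holds an arma::mat; a flat std::vector<double> is assumed here instead.
struct VelocitySketch
{
  std::size_t rows = 0;
  std::size_t cols = 0;
  std::vector<double> velocity;

  // Give the velocity the same shape as the gradient and fill it with zeros.
  void Initialize(const std::size_t r, const std::size_t c)
  {
    rows = r;
    cols = c;
    velocity.assign(r * c, 0.0);
  }
};
```

Starting from a zero velocity means the first update reduces to a gradient step scaled by the step size, with momentum taking effect only from the second iteration on.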
| double | Momentum () const | inline |
Get the value used to initialize the momentum coefficient.
Definition at line 85 of file nesterov_momentum_update.hpp.
| double & | Momentum () | inline |
Modify the value used to initialize the momentum coefficient.
Definition at line 87 of file nesterov_momentum_update.hpp.
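The getter and modifier overloads follow a common C++ accessor pattern: a const overload that returns a copy and a non-const overload that returns a reference. The sketch below illustrates that pattern only; the class name MomentumHolder is hypothetical and the code is not taken from the header.

```cpp
#include <cassert>

// Hypothetical class illustrating the const / non-const accessor pair.
class MomentumHolder
{
 public:
  explicit MomentumHolder(const double momentum = 0.5) : momentum(momentum) {}

  // Read-only access: usable on const objects, returns a copy.
  double Momentum() const { return momentum; }

  // Read-write access: returns a reference so the caller can assign to it.
  double& Momentum() { return momentum; }

 private:
  double momentum;
};
```

With this pattern, `holder.Momentum() = 0.9;` changes the stored coefficient, while a const reference to the same object can only read it.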
| void | Update (arma::mat &iterate, const double stepSize, const arma::mat &gradient) | inline |
Update step for SGD.
The momentum term speeds up convergence: it grows for dimensions whose gradients keep pointing in the same direction and shrinks the updates for dimensions whose gradients change direction.
| iterate | Parameters that minimize the function. |
| stepSize | Step size to be used for the given iteration. |
| gradient | The gradient matrix. |
Definition at line 75 of file nesterov_momentum_update.hpp.
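The update step can be sketched as below, assuming the widely used reformulation of Nesterov momentum in which the gradient is taken at the stored iterate and the look-ahead is folded into the parameter update. Whether the header uses exactly this form is an assumption. NesterovSketch is a hypothetical type, and std::vector<double> replaces arma::mat to keep the example dependency-free.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical, simplified Nesterov momentum policy (not the mlpack class).
struct NesterovSketch
{
  double momentum;
  std::vector<double> velocity;

  explicit NesterovSketch(const double momentum = 0.5) : momentum(momentum) {}

  // Zero velocity with one entry per parameter.
  void Initialize(const std::size_t n) { velocity.assign(n, 0.0); }

  // Assumed update rule:
  //   velocity = momentum * velocity - stepSize * gradient
  //   iterate += momentum * velocity - stepSize * gradient
  void Update(std::vector<double>& iterate,
              const double stepSize,
              const std::vector<double>& gradient)
  {
    for (std::size_t i = 0; i < iterate.size(); ++i)
    {
      velocity[i] = momentum * velocity[i] - stepSize * gradient[i];
      iterate[i] += momentum * velocity[i] - stepSize * gradient[i];
    }
  }
};
```

When the gradient keeps the same sign across calls, the velocity accumulates and each step grows; when the sign flips, the old velocity partly cancels the new gradient term and the step shrinks.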