Implementation of the AdaGrad update policy.
Public Member Functions
| | AdaGradUpdate (const double epsilon=1e-8) |
| | Construct the AdaGrad update policy with the given epsilon parameter. |
| double | Epsilon () const |
| | Get the value used to initialise the squared gradient parameter. |
| double & | Epsilon () |
| | Modify the value used to initialise the squared gradient parameter. |
| void | Initialize (const size_t rows, const size_t cols) |
| | Called by the SGD optimizer before the iteration update process starts. |
| void | Update (arma::mat &iterate, const double stepSize, const arma::mat &gradient) |
| | Update step for SGD. |
The AdaGrad update policy chooses the learning rate dynamically by adapting to the data: each parameter's effective step shrinks as its gradients accumulate. This largely removes the need to tune the learning rate manually, although a base step size is still passed to Update().
For more information, see the following.
Definition at line 41 of file ada_grad_update.hpp.
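As a rough guide to what "adapting to the data" means, the commonly cited form of the AdaGrad rule is sketched below. The symbols are not taken from this page and are assumptions for illustration: \theta_t stands for the iterate, g_t for the gradient, \alpha for the step size, and G_t for the accumulated squared gradient. The exact placement of \epsilon in ada_grad_update.hpp is not confirmed here.

```latex
\begin{aligned}
G_0 &= 0, \\
G_t &= G_{t-1} + g_t \odot g_t, \\
\theta_{t+1} &= \theta_t - \frac{\alpha \, g_t}{\sqrt{G_t} + \epsilon}
\end{aligned}
```

All operations are elementwise, so coordinate i sees an effective learning rate of \alpha / (\sqrt{G_{t,i}} + \epsilon), which decreases as that coordinate's squared gradients accumulate.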
| AdaGradUpdate (const double epsilon=1e-8) | inline |
Construct the AdaGrad update policy with the given epsilon parameter.
| epsilon | The epsilon value used to initialise the squared gradient parameter. |
Definition at line 50 of file ada_grad_update.hpp.
| double Epsilon () const | inline |
Get the value used to initialise the squared gradient parameter.
Definition at line 88 of file ada_grad_update.hpp.
| double & Epsilon () | inline |
Modify the value used to initialise the squared gradient parameter.
Definition at line 90 of file ada_grad_update.hpp.
| void Initialize (const size_t rows, const size_t cols) | inline |
The Initialize method is called by the SGD optimizer before the iteration update process starts.
In the AdaGrad update policy, the squared gradient matrix is initialized to a zero matrix of the same size as the gradient matrix (see mlpack::optimization::SGD::Optimizer).
| rows | Number of rows in the gradient matrix. |
| cols | Number of columns in the gradient matrix. |
Definition at line 64 of file ada_grad_update.hpp.
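The calling order described above (Initialize once, then Update on every iteration) can be sketched with a small driver. This is not the mlpack SGD code: `CountingPolicy` and `RunIterations` are hypothetical names, parameters are held in a `std::vector<double>` rather than `arma::mat` to keep the sketch dependency-free, and the objective is fixed to f(x) = 0.5 * ||x||^2 so that the gradient equals the iterate.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical update policy that takes plain gradient steps and records
// how often each method is called.
struct CountingPolicy
{
  std::size_t initializeCalls = 0;
  std::size_t updateCalls = 0;

  void Initialize(const std::size_t /* rows */, const std::size_t /* cols */)
  {
    ++initializeCalls;
  }

  void Update(std::vector<double>& iterate,
              const double stepSize,
              const std::vector<double>& gradient)
  {
    assert(iterate.size() == gradient.size());
    ++updateCalls;
    for (std::size_t i = 0; i < iterate.size(); ++i)
      iterate[i] -= stepSize * gradient[i];
  }
};

// Hypothetical driver loop: the policy is initialized exactly once with the
// shape of the parameters, then asked to update the iterate each iteration.
template<typename UpdatePolicyType>
void RunIterations(UpdatePolicyType& policy,
                   std::vector<double>& iterate,
                   const std::size_t rows,
                   const std::size_t cols,
                   const double stepSize,
                   const std::size_t maxIterations)
{
  policy.Initialize(rows, cols);

  for (std::size_t i = 0; i < maxIterations; ++i)
  {
    // Gradient of f(x) = 0.5 * ||x||^2 is x itself.
    const std::vector<double> gradient = iterate;
    policy.Update(iterate, stepSize, gradient);
  }
}
```

Because the driver only relies on the two method names, any policy exposing Initialize and Update with compatible signatures could be substituted for `CountingPolicy`.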
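| void Update (arma::mat &iterate, const double stepSize, const arma::mat &gradient) | inline |
Update step for SGD.
The AdaGrad update adapts the learning rate per parameter, performing larger updates for sparse (infrequently updated) parameters and smaller updates for frequently updated ones.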
| iterate | Parameters that minimize the function. |
| stepSize | Step size to be used for the given iteration. |
| gradient | The gradient matrix. |
Definition at line 79 of file ada_grad_update.hpp.
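A minimal sketch of an AdaGrad-style policy with the same member names is given below. It is not the contents of ada_grad_update.hpp: `AdaGradSketch` is a hypothetical class, it stores values in a flat `std::vector<double>` instead of `arma::mat`, and it assumes epsilon is added to the denominator to guard the division, which is a common formulation that this page does not confirm.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the policy documented above.
class AdaGradSketch
{
 public:
  explicit AdaGradSketch(const double epsilon = 1e-8) : epsilon(epsilon) { }

  // Reset the squared gradient accumulator to zeros, one entry per parameter.
  void Initialize(const std::size_t rows, const std::size_t cols)
  {
    squaredGradient.assign(rows * cols, 0.0);
  }

  // One elementwise AdaGrad-style step.
  void Update(std::vector<double>& iterate,
              const double stepSize,
              const std::vector<double>& gradient)
  {
    assert(iterate.size() == gradient.size());
    assert(squaredGradient.size() == gradient.size());

    for (std::size_t i = 0; i < gradient.size(); ++i)
    {
      // Accumulate the squared gradient for this coordinate.
      squaredGradient[i] += gradient[i] * gradient[i];

      // Assumption: epsilon is added to the denominator for stability.
      iterate[i] -= stepSize * gradient[i] /
          (std::sqrt(squaredGradient[i]) + epsilon);
    }
  }

  // Getter and modifier pair, mirroring the accessors listed above.
  double Epsilon() const { return epsilon; }
  double& Epsilon() { return epsilon; }

 private:
  double epsilon;
  std::vector<double> squaredGradient;
};
```

Under these assumptions, starting from (1.0, 1.0) with step size 0.1 and gradient (10, 0.1), the first call moves both coordinates by roughly 0.1, because each gradient entry is divided by its own magnitude. A second call with the same gradient moves them less, since the accumulator has grown.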