Stochastic Variance Reduced Gradient is a technique for minimizing a function which can be expressed as a sum of other functions.
| Return type | Member | Description |
|---|---|---|
| | SVRGType (const double stepSize=0.01, const size_t batchSize=32, const size_t maxIterations=1000, const size_t innerIterations=0, const double tolerance=1e-5, const bool shuffle=true, const UpdatePolicyType &updatePolicy=UpdatePolicyType(), const DecayPolicyType &decayPolicy=DecayPolicyType(), const bool resetPolicy=true) | Construct the SVRG optimizer with the given parameters. |
| size_t | BatchSize () const | Get the batch size. |
| size_t & | BatchSize () | Modify the batch size. |
| const DecayPolicyType & | DecayPolicy () const | Get the step size decay policy. |
| DecayPolicyType & | DecayPolicy () | Modify the step size decay policy. |
| size_t | InnerIterations () const | Get the number of inner iterations (0 indicates the default, n / b). |
| size_t & | InnerIterations () | Modify the number of inner iterations (0 indicates the default, n / b). |
| size_t | MaxIterations () const | Get the maximum number of iterations (0 indicates no limit). |
| size_t & | MaxIterations () | Modify the maximum number of iterations (0 indicates no limit). |
| double | template<typename DecomposableFunctionType> Optimize (DecomposableFunctionType &function, arma::mat &iterate) | Optimize the given function using SVRG. |
| bool | ResetPolicy () const | Get whether or not the update policy parameters are reset before each Optimize() call. |
| bool & | ResetPolicy () | Modify whether or not the update policy parameters are reset before each Optimize() call. |
| bool | Shuffle () const | Get whether or not the individual functions are shuffled. |
| bool & | Shuffle () | Modify whether or not the individual functions are shuffled. |
| double | StepSize () const | Get the step size. |
| double & | StepSize () | Modify the step size. |
| double | Tolerance () const | Get the tolerance for termination. |
| double & | Tolerance () | Modify the tolerance for termination. |
| const UpdatePolicyType & | UpdatePolicy () const | Get the update policy. |
| UpdatePolicyType & | UpdatePolicy () | Modify the update policy. |
template<typename UpdatePolicyType, typename DecayPolicyType>
class mlpack::optimization::SVRGType< UpdatePolicyType, DecayPolicyType >
Stochastic Variance Reduced Gradient is a technique for minimizing a function which can be expressed as a sum of other functions.
That is, suppose we have

$$f(A) = \sum_{i=0}^{n-1} f_i(A)$$

and our task is to minimize $f(A)$ over $A$. Stochastic Variance Reduced Gradient iterates over each function $f_i(A)$, based on the specified update policy. By default, the vanilla update policy is used. The SVRG class supports scanning through the $n$ functions $f_i(A)$ either linearly or in a random sequence. The algorithm continues until the iteration count $j$ reaches the maximum number of iterations, or until a full sequence of updates through each of the $n$ functions $f_i(A)$ produces an improvement within a certain tolerance $\epsilon$. That is,

$$| f(A_{j+n}) - f(A_j) | < \epsilon.$$

The parameter $\epsilon$ is specified by the tolerance parameter to the constructor; the maximum value of $j$ is specified by the maxIterations parameter.
This class is useful for data-dependent functions whose objective function can be expressed as a sum of objective functions, each operating on an individual point. SVRG then uses the gradient of the objective function on an individual point in its update of $A$.
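To make the variance-reduction idea concrete, here is a minimal sketch of the update from the Johnson and Zhang paper cited below. It is not mlpack's implementation: it assumes batch size 1, no step size decay, and sampling with replacement, and it uses a plain `std::vector<double>` in place of `arma::mat`. The names `SvrgSketch`, `Vec`, and `ComponentGradient` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

// Hypothetical stand-in for the parameter matrix (the library uses arma::mat).
using Vec = std::vector<double>;

// Writes the gradient of the i-th component function f_i at point a into g.
// g is pre-sized to a.size() by the caller.
using ComponentGradient =
    std::function<void(const Vec& a, std::size_t i, Vec& g)>;

// Sketch of SVRG with batch size 1 and a fixed step size.
// n is the number of component functions; returns the final iterate.
Vec SvrgSketch(const ComponentGradient& grad, std::size_t n, Vec a,
               double stepSize, std::size_t outerIterations,
               std::size_t innerIterations, unsigned seed = 42)
{
  assert(n > 0);
  std::mt19937 rng(seed);
  std::uniform_int_distribution<std::size_t> pick(0, n - 1);
  const std::size_t d = a.size();
  Vec snapshot(d), mu(d), gA(d), gS(d);

  for (std::size_t s = 0; s < outerIterations; ++s)
  {
    // Snapshot the iterate and average the gradients of all n functions there.
    snapshot = a;
    mu.assign(d, 0.0);
    for (std::size_t i = 0; i < n; ++i)
    {
      grad(snapshot, i, gS);
      for (std::size_t k = 0; k < d; ++k)
        mu[k] += gS[k] / static_cast<double>(n);
    }

    // Inner loop: stochastic steps corrected by the snapshot gradients.
    for (std::size_t t = 0; t < innerIterations; ++t)
    {
      const std::size_t i = pick(rng);
      grad(a, i, gA);
      grad(snapshot, i, gS);
      for (std::size_t k = 0; k < d; ++k)
        a[k] -= stepSize * (gA[k] - gS[k] + mu[k]);
    }
  }
  return a;
}
```

The correction term `gA - gS + mu` has the same expectation as the averaged full gradient at `a`, but its variance shrinks as the iterate and the snapshot approach the optimum, which is what allows a fixed step size. Averaging instead of summing the component gradients only rescales the step size relative to the sum form of $f(A)$ above.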
For SVRG to work, a DecomposableFunctionType template parameter is required. This class must implement the following functions:
size_t NumFunctions();
double Evaluate(const arma::mat& coordinates, const size_t i, const size_t batchSize);
void Gradient(const arma::mat& coordinates, const size_t i, arma::mat& gradient, const size_t batchSize);
NumFunctions() should return the number of functions ($n$), and in the other two functions, the parameter i refers to which individual function (or gradient) is being evaluated. So, for the case of a data-dependent function, such as NCA (see mlpack::nca::NCA), NumFunctions() should return the number of points in the dataset, and Evaluate(coordinates, 0, 1) will evaluate the objective function on the first point in the dataset (presumably, the dataset is held internally in the DecomposableFunctionType).
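As an illustration of the interface described above, the following sketch defines a small data-dependent function with one term per data point. It is a dependency-free analogue, not a class that can be passed to Optimize() as written: it uses `std::vector<double>` where the real interface uses `arma::mat`, and it assumes that a batch covers the functions `i` through `i + batchSize - 1`. The class name `ScalarLeastSquares` and its members are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical example: f(w) = sum_i 0.5 * (w * x_i - y_i)^2 for a scalar w.
// Parameters are passed as a one-element std::vector<double>, standing in
// for the arma::mat used by the real interface.
class ScalarLeastSquares
{
 public:
  ScalarLeastSquares(std::vector<double> xIn, std::vector<double> yIn) :
      x(std::move(xIn)), y(std::move(yIn))
  {
    assert(x.size() == y.size());
  }

  // One component function per data point.
  std::size_t NumFunctions() const { return x.size(); }

  // Sum of f_i over the batch starting at function i (assumed batch layout).
  double Evaluate(const std::vector<double>& coordinates,
                  const std::size_t i,
                  const std::size_t batchSize) const
  {
    assert(i + batchSize <= x.size());
    double total = 0.0;
    for (std::size_t j = i; j < i + batchSize; ++j)
    {
      const double residual = coordinates[0] * x[j] - y[j];
      total += 0.5 * residual * residual;
    }
    return total;
  }

  // Sum of the gradients of f_i over the same batch, written into gradient.
  void Gradient(const std::vector<double>& coordinates,
                const std::size_t i,
                std::vector<double>& gradient,
                const std::size_t batchSize) const
  {
    assert(i + batchSize <= x.size());
    gradient.assign(1, 0.0);
    for (std::size_t j = i; j < i + batchSize; ++j)
      gradient[0] += (coordinates[0] * x[j] - y[j]) * x[j];
  }

 private:
  std::vector<double> x;  // Inputs, held internally as the text suggests.
  std::vector<double> y;  // Targets.
};
```

Holding the dataset inside the function object keeps the optimizer generic: it only needs to know how many terms there are and how to query a contiguous batch of them.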
For more information, please refer to:
@inproceedings{Johnson2013,
author = {Johnson, Rie and Zhang, Tong},
title = {Accelerating Stochastic Gradient Descent Using Predictive
Variance Reduction},
booktitle = {Proceedings of the 26th International Conference on Neural
Information Processing Systems - Volume 1},
series = {NIPS'13},
year = {2013},
location = {Lake Tahoe, Nevada},
pages = {315--323},
numpages = {9},
publisher = {Curran Associates Inc.},
}
Template Parameters
| UpdatePolicyType | Update policy used by SVRG during the iterative update process. By default, the vanilla update policy (see mlpack::optimization::VanillaUpdate) is used. |
| DecayPolicyType | Decay policy used during the iterative update process to adjust the step size. By default, the step size is not adjusted (i.e. NoDecay is used). |
Definition at line 101 of file svrg.hpp.