Big-batch Stochastic Gradient Descent is a technique for minimizing a function which can be expressed as a sum of other functions.

Public Member Functions
	BigBatchSGD (const size_t batchSize=1000, const double stepSize=0.01, const double batchDelta=0.1, const size_t maxIterations=100000, const double tolerance=1e-5, const bool shuffle=true)
	Construct the BigBatchSGD optimizer with the given parameters.

double	BatchDelta () const
	Get the batch delta.

double &	BatchDelta ()
	Modify the batch delta.

size_t	BatchSize () const
	Get the batch size.

size_t &	BatchSize ()
	Modify the batch size.

size_t	MaxIterations () const
	Get the maximum number of iterations (0 indicates no limit).

size_t &	MaxIterations ()
	Modify the maximum number of iterations (0 indicates no limit).

template < typename DecomposableFunctionType >
double	Optimize (DecomposableFunctionType &function, arma::mat &iterate)
	Optimize the given function using big-batch SGD.

bool	Shuffle () const
	Get whether or not the individual functions are shuffled.

bool &	Shuffle ()
	Modify whether or not the individual functions are shuffled.

double	StepSize () const
	Get the step size.

double &	StepSize ()
	Modify the step size.

double	Tolerance () const
	Get the tolerance for termination.

double &	Tolerance ()
	Modify the tolerance for termination.

UpdatePolicyType	UpdatePolicy () const
	Get the update policy.

UpdatePolicyType &	UpdatePolicy ()
	Modify the update policy.

Detailed Description

template<typename UpdatePolicyType = AdaptiveStepsize>

class mlpack::optimization::BigBatchSGD< UpdatePolicyType >

Big-batch Stochastic Gradient Descent is a technique for minimizing a function which can be expressed as a sum of other functions.

That is, suppose we have

$f(A) = \sum_{i = 0}^{n} f_i(A)$

and our task is to minimize $f(A)$ with respect to $A$. Big-batch SGD iterates over batches of functions $\{ f_{i0}(A), f_{i1}(A), \ldots, f_{i(m - 1)}(A) \}$ for some batch size $m$, producing the following update scheme:

$A_{j + 1} = A_j - \alpha \left(\sum_{k = 0}^{m - 1} \nabla f_{ik}(A_j) \right)$

where $\alpha$ is a parameter which specifies the step size. The big-batches are visited either sequentially or in random order. The algorithm continues until $j$ reaches the maximum number of iterations, or until a full sequence of updates through each of the big-batches produces an improvement within a certain tolerance $\epsilon$.
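The update loop and its two stopping conditions can be sketched in plain C++ as follows. This is a minimal illustration under stated assumptions, not mlpack's implementation: the parameter is a scalar, the batch size is fixed (the adaptive batch growth described in the referenced paper is omitted), batches are visited in linear order, and each step moves against the summed batch gradient as in ordinary gradient descent. The names `Term` and `BatchDescent` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for one term f_i(a) = 0.5 * (a - target)^2.
struct Term
{
  double target;
  double Evaluate(const double a) const { return 0.5 * (a - target) * (a - target); }
  double Gradient(const double a) const { return a - target; }
};

// Minimizes sum_i f_i(a) using fixed-size batches. The final point is stored
// in `a` and the final objective is returned. maxIterations counts processed
// batches; 0 means no limit.
double BatchDescent(const std::vector<Term>& terms,
                    double& a,
                    const std::size_t batchSize,
                    const double stepSize,
                    const std::size_t maxIterations,
                    const double tolerance)
{
  assert(batchSize > 0 && !terms.empty());

  // Full objective f(a) = sum_i f_i(a).
  auto objective = [&terms](const double x)
  {
    double sum = 0.0;
    for (const Term& t : terms)
      sum += t.Evaluate(x);
    return sum;
  };

  double lastObjective = std::numeric_limits<double>::max();
  std::size_t iteration = 0;
  while (maxIterations == 0 || iteration < maxIterations)
  {
    // One full pass over all batches, in linear order.
    for (std::size_t start = 0; start < terms.size(); start += batchSize)
    {
      if (maxIterations != 0 && iteration >= maxIterations)
        break;
      const std::size_t end = std::min(start + batchSize, terms.size());

      // Sum the gradients of the functions in this batch.
      double gradient = 0.0;
      for (std::size_t i = start; i < end; ++i)
        gradient += terms[i].Gradient(a);

      a -= stepSize * gradient;
      ++iteration;
    }

    // Stop when a full pass changes the objective by less than the tolerance.
    const double current = objective(a);
    if (std::abs(lastObjective - current) < tolerance)
      break;
    lastObjective = current;
  }
  return objective(a);
}
```

With a constant step size the iterate settles near, but not exactly at, the minimizer of the full sum, which is why the tolerance test compares objective values between passes rather than waiting for a zero gradient.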

The parameter $\epsilon$ is specified by the tolerance parameter to the constructor, and the maximum number of iterations is specified by the maxIterations parameter.

This class is useful for data-dependent functions whose objective function can be expressed as a sum of objective functions, each operating on an individual point. Big-batch SGD then uses the gradient of the objective function on an individual big-batch of points in its update of $A$.

For more information, please refer to:

@article{De2017,
  title   = {Big Batch {SGD:} Automated Inference using Adaptive Batch
             Sizes},
  author  = {Soham De and Abhay Kumar Yadav and David W. Jacobs and
             Tom Goldstein},
  journal = {CoRR},
  year    = {2017},
  url     = {http://arxiv.org/abs/1610.05792},
}

For big-batch SGD to work, a DecomposableFunctionType template parameter is required. This class must implement the following functions:

size_t NumFunctions();
double Evaluate(const arma::mat& coordinates, const size_t i);
void Gradient(const arma::mat& coordinates, const size_t i, arma::mat& gradient);

NumFunctions() should return the number of functions, and in the other two functions, the parameter i refers to which individual function (or gradient) is being evaluated. So, for the case of a data-dependent function, such as NCA (see mlpack::nca::NCA), NumFunctions() should return the number of points in the dataset, and Evaluate(coordinates, 0) will evaluate the objective function on the first point in the dataset (presumably, the dataset is held internally in the DecomposableFunctionType).
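As an illustration of this interface, the sketch below defines a hypothetical data-dependent function, `SquaredDistanceFunction`, whose i-th term is half the squared distance from the coordinates to the i-th stored point. To keep the example compilable without Armadillo, it uses `std::vector<double>` (aliased as `Mat`) as a stand-in for `arma::mat`; that substitution and all names here are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for arma::mat so the sketch compiles without Armadillo (assumption).
using Mat = std::vector<double>;

// Hypothetical decomposable function: f_i(c) = 0.5 * ||c - x_i||^2, where x_i
// is the i-th point of a dataset held internally.
class SquaredDistanceFunction
{
 public:
  explicit SquaredDistanceFunction(std::vector<Mat> data) :
      points(std::move(data)) { }

  // Number of individual functions: one per point in the dataset.
  std::size_t NumFunctions() const { return points.size(); }

  // Objective of the i-th function at the given coordinates.
  double Evaluate(const Mat& coordinates, const std::size_t i) const
  {
    assert(i < points.size() && coordinates.size() == points[i].size());
    double sum = 0.0;
    for (std::size_t d = 0; d < coordinates.size(); ++d)
    {
      const double diff = coordinates[d] - points[i][d];
      sum += 0.5 * diff * diff;
    }
    return sum;
  }

  // Gradient of the i-th function, written into `gradient`.
  void Gradient(const Mat& coordinates, const std::size_t i, Mat& gradient) const
  {
    assert(i < points.size() && coordinates.size() == points[i].size());
    gradient.resize(coordinates.size());
    for (std::size_t d = 0; d < coordinates.size(); ++d)
      gradient[d] = coordinates[d] - points[i][d];
  }

 private:
  std::vector<Mat> points;
};
```

Here `Evaluate(coordinates, 0)` evaluates the objective on the first stored point, matching the convention described above.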

Template Parameters

UpdatePolicyType Update policy used during the iterative update process. By default the AdaptiveStepsize update policy is used.

Definition at line 93 of file bigbatch_sgd.hpp.

Constructor & Destructor Documentation

◆ BigBatchSGD()

BigBatchSGD	(	const size_t	batchSize = `1000`,
		const double	stepSize = `0.01`,
		const double	batchDelta = `0.1`,
		const size_t	maxIterations = `100000`,
		const double	tolerance = `1e-5`,
		const bool	shuffle = `true`
	)

Construct the BigBatchSGD optimizer with the given parameters.

The defaults here are not necessarily good for the given problem, so it is suggested that the values used be tailored for the task at hand. The maximum number of iterations refers to the maximum number of batches that are processed.

Parameters

batchSize	Initial batch size.
stepSize	Step size for each iteration.
batchDelta	Factor for the batch update step.
maxIterations	Maximum number of iterations allowed (0 means no limit).
tolerance	Maximum absolute tolerance to terminate algorithm.
shuffle	If true, the batch order is shuffled; otherwise, each batch is visited in linear order.
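The `shuffle` and `maxIterations` conventions can be sketched with two small helpers. This is only an illustration of the documented semantics, not mlpack's code: `VisitationOrder` and `MayContinue` are hypothetical names, and the use of `std::mt19937` as the random source is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper: the order in which numBatches batches are visited.
// Linear order is 0, 1, ..., numBatches - 1; shuffling permutes that order.
std::vector<std::size_t> VisitationOrder(const std::size_t numBatches,
                                         const bool shuffle,
                                         std::mt19937& rng)
{
  assert(numBatches > 0);
  std::vector<std::size_t> order(numBatches);
  std::iota(order.begin(), order.end(), std::size_t{0});
  if (shuffle)
    std::shuffle(order.begin(), order.end(), rng);
  return order;
}

// Hypothetical helper: true while another batch may be processed.
// A maxIterations of 0 means there is no limit.
bool MayContinue(const std::size_t processedBatches,
                 const std::size_t maxIterations)
{
  return maxIterations == 0 || processedBatches < maxIterations;
}
```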

Member Function Documentation

◆ BatchDelta() [1/2]

double BatchDelta ( ) const

inline

Get the batch delta.

Definition at line 143 of file bigbatch_sgd.hpp.

◆ BatchDelta() [2/2]

double& BatchDelta ( )

inline

Modify the batch delta.

Definition at line 145 of file bigbatch_sgd.hpp.

◆ BatchSize() [1/2]

size_t BatchSize ( ) const

inline

Get the batch size.

Definition at line 133 of file bigbatch_sgd.hpp.

◆ BatchSize() [2/2]

size_t& BatchSize ( )

inline

Modify the batch size.

Definition at line 135 of file bigbatch_sgd.hpp.

◆ MaxIterations() [1/2]

size_t MaxIterations ( ) const

inline

Get the maximum number of iterations (0 indicates no limit).

Definition at line 148 of file bigbatch_sgd.hpp.

◆ MaxIterations() [2/2]

size_t& MaxIterations ( )

inline

Modify the maximum number of iterations (0 indicates no limit).

Definition at line 150 of file bigbatch_sgd.hpp.

◆ Optimize()

double Optimize	(	DecomposableFunctionType &	function,
		arma::mat &	iterate
	)

Optimize the given function using big-batch SGD.

The given starting point will be modified to store the finishing point of the algorithm, and the final objective value is returned.

Template Parameters

DecomposableFunctionType Type of the function to be optimized.

Parameters

function	Function to optimize.
iterate	Starting point (will be modified).

Returns: Objective value of the final point.

◆ Shuffle() [1/2]

bool Shuffle ( ) const

inline

Get whether or not the individual functions are shuffled.

Definition at line 158 of file bigbatch_sgd.hpp.

◆ Shuffle() [2/2]

bool& Shuffle ( )

inline

Modify whether or not the individual functions are shuffled.

Definition at line 160 of file bigbatch_sgd.hpp.

◆ StepSize() [1/2]

double StepSize ( ) const

inline

Get the step size.

Definition at line 138 of file bigbatch_sgd.hpp.

◆ StepSize() [2/2]

double& StepSize ( )

inline

Modify the step size.

Definition at line 140 of file bigbatch_sgd.hpp.

◆ Tolerance() [1/2]

double Tolerance ( ) const

inline

Get the tolerance for termination.

Definition at line 153 of file bigbatch_sgd.hpp.

◆ Tolerance() [2/2]

double& Tolerance ( )

inline

Modify the tolerance for termination.

Definition at line 155 of file bigbatch_sgd.hpp.

◆ UpdatePolicy() [1/2]

UpdatePolicyType UpdatePolicy ( ) const

inline

Get the update policy.

Definition at line 163 of file bigbatch_sgd.hpp.

◆ UpdatePolicy() [2/2]

UpdatePolicyType& UpdatePolicy ( )

inline

Modify the update policy.

Definition at line 165 of file bigbatch_sgd.hpp.

The documentation for this class was generated from the following file:

src/mlpack/core/optimizers/bigbatch_sgd/bigbatch_sgd.hpp
