\section{Overview}\label{function_Overview}
To represent the various types of loss functions encountered in machine learning problems, mlpack provides the {\ttfamily Function\+Type} template parameter in the optimizer interface. The optimizers available in the core library rely on this policy to obtain the information that the optimization algorithm needs.

The {\ttfamily Function\+Type} template parameter required by the Optimizer class can have additional requirements imposed on it, depending on the type of optimizer used.

\section{Interface requirements}\label{function_requirements}
The most basic requirement for the {\ttfamily Function\+Type} parameter is that it implements two public member functions with the following interface and semantics:


\begin{DoxyCode}
\textcolor{comment}{// Evaluate the loss function at the given coordinates.}
\textcolor{keywordtype}{double} Evaluate(\textcolor{keyword}{const} arma::mat& coordinates);
\end{DoxyCode}



\begin{DoxyCode}
\textcolor{comment}{// Evaluate the gradient at the given coordinates, where 'gradient' is an}
\textcolor{comment}{// output parameter for the required gradient.}
\textcolor{keywordtype}{void} Gradient(\textcolor{keyword}{const} arma::mat& coordinates, arma::mat& gradient);
\end{DoxyCode}


Optimizers such as S\+GD and R\+M\+S\+Prop require a {\ttfamily Decomposable\+Function\+Type} with the following requirements:


\begin{DoxyCode}
\textcolor{comment}{// Return the number of functions. In a data-dependent function, this would}
\textcolor{comment}{// return the number of points in the dataset.}
\textcolor{keywordtype}{size\_t} NumFunctions();
\end{DoxyCode}



\begin{DoxyCode}
\textcolor{comment}{// Evaluate the 'i' th loss function. For example, for a data-dependent}
\textcolor{comment}{// function, Evaluate(coordinates, 0) should evaluate the loss function at the}
\textcolor{comment}{// first point in the dataset.}
\textcolor{keywordtype}{double} Evaluate(\textcolor{keyword}{const} arma::mat& coordinates, \textcolor{keyword}{const} \textcolor{keywordtype}{size\_t} i);
\end{DoxyCode}



\begin{DoxyCode}
\textcolor{comment}{// Evaluate the gradient of the 'i' th loss function at the given coordinates,}
\textcolor{comment}{// where 'gradient' is an output parameter for the required gradient.}
\textcolor{keywordtype}{void} Gradient(\textcolor{keyword}{const} arma::mat& coordinates, \textcolor{keyword}{const} \textcolor{keywordtype}{size\_t} i, arma::mat& gradient);
\end{DoxyCode}


The {\ttfamily Parallel\+S\+GD} optimizer requires a {\ttfamily Sparse\+Function\+Type} interface. {\ttfamily Sparse\+Function\+Type} requires the gradient to be stored in a sparse matrix ({\ttfamily arma\+::sp\+\_\+mat}), because {\ttfamily Parallel\+S\+GD}, implemented with the H\+O\+G\+W\+I\+L\+D! scheme of unsynchronised updates, is expected to be relevant only in situations where the individual gradients are sparse. The interface therefore requires functions with the following signatures:


\begin{DoxyCode}
\textcolor{comment}{// Return the number of functions. In a data-dependent function, this would}
\textcolor{comment}{// return the number of points in the dataset.}
\textcolor{keywordtype}{size\_t} NumFunctions();
\end{DoxyCode}



\begin{DoxyCode}
\textcolor{comment}{// Evaluate the loss function at the given coordinates.}
\textcolor{keywordtype}{double} Evaluate(\textcolor{keyword}{const} arma::mat& coordinates);
\end{DoxyCode}



\begin{DoxyCode}
\textcolor{comment}{// Evaluate the (sparse) gradient of the 'i' th loss function at the given}
\textcolor{comment}{// coordinates, where 'gradient' is an output parameter for the required}
\textcolor{comment}{// gradient.}
\textcolor{keywordtype}{void} Gradient(\textcolor{keyword}{const} arma::mat& coordinates, \textcolor{keyword}{const} \textcolor{keywordtype}{size\_t} i, arma::sp\_mat& gradient);
\end{DoxyCode}


The {\ttfamily S\+CD} optimizer requires a {\ttfamily Resolvable\+Function\+Type} interface, to calculate partial gradients with respect to individual features. The optimizer requires the decision variable to be arranged in a particular fashion to allow for disjoint updates. The features should be arranged columnwise in the decision variable. For example, in {\ttfamily Softmax\+Regression\+Function} the decision variable has size {\ttfamily num\+Classes} x {\ttfamily feature\+Size} (+ 1 if an intercept also needs to be fit). Similarly, for {\ttfamily Logistic\+Regression}, the decision variable is a row vector, with the number of columns determined by the dimensionality of the dataset.

The interface expects the following member functions from the function class:


\begin{DoxyCode}
\textcolor{comment}{// Return the number of features in the decision variable.}
\textcolor{keywordtype}{size\_t} NumFeatures();
\end{DoxyCode}



\begin{DoxyCode}
\textcolor{comment}{// Evaluate the loss function at the given coordinates.}
\textcolor{keywordtype}{double} Evaluate(\textcolor{keyword}{const} arma::mat& coordinates);
\end{DoxyCode}



\begin{DoxyCode}
\textcolor{comment}{// Evaluate the partial gradient of the loss function with respect to the 'j' th}
\textcolor{comment}{// coordinate at the given coordinates, where 'gradient' is an output parameter}
\textcolor{comment}{// for the required gradient. The 'gradient' matrix is supposed to be non-zero}
\textcolor{comment}{// in the jth column, which contains the relevant partial gradient.}
\textcolor{keywordtype}{void} PartialGradient(\textcolor{keyword}{const} arma::mat& coordinates, \textcolor{keyword}{const} \textcolor{keywordtype}{size\_t} j, arma::sp\_mat& gradient);
\end{DoxyCode}
 