\section{Introduction}\label{hpt_guide_hptintro}
{\bfseries mlpack} implements a generic hyperparameter tuner that can tune both continuous and discrete parameters of many different algorithms. This is an important task because the performance of many machine learning algorithms depends heavily on the hyperparameters chosen for them. (One example\+: the choice of $k$ for a $k$-\/nearest-\/neighbors classifier.)

This hyperparameter tuner is built on the same general concept as the cross-\/validation classes (see the \doxyref{cross-\/validation tutorial}{p.}{cv})\+: given some machine learning algorithm, some data, some performance measure, and a set of candidate hyperparameter values, attempt to find the combination of values that optimizes the performance measure on the given data with the given algorithm.
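
The search described above can be sketched in plain C++ without any mlpack types. This is only an illustration of the idea, not mlpack\textquotesingle{}s implementation\+: the function name {\ttfamily GridSearch1D} is hypothetical, and the {\ttfamily score} callback stands in for training the algorithm with one candidate value and evaluating the performance measure on held-out data.

```cpp
#include <cassert>
#include <functional>
#include <limits>
#include <vector>

// Hypothetical helper: returns the candidate with the lowest score.
// 'score' stands in for "train with this hyperparameter value, then
// evaluate the performance measure on held-out data".
double GridSearch1D(const std::vector<double>& candidates,
                    const std::function<double(double)>& score)
{
  assert(!candidates.empty());
  double best = candidates.front();
  double bestScore = std::numeric_limits<double>::infinity();
  for (const double candidate : candidates)
  {
    const double s = score(candidate);
    if (s < bestScore)
    {
      bestScore = s;
      best = candidate;
    }
  }
  return best;
}
```

The sketch minimizes the score, which suits an error measure such as MSE. For a measure that should be maximized, such as accuracy, the comparison would be reversed.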

{\bfseries mlpack\textquotesingle{}s} implementation of hyperparameter tuning is flexible, and is built in a way that supports many algorithms and many optimizers. At the time of this writing, complex hyperparameter optimization techniques are not available, but the design of the hyperparameter tuner can accommodate them should they be implemented in the future.

In this tutorial we will see usage examples of the hyperparameter tuning module and look in more detail at the {\ttfamily \doxyref{Hyper\+Parameter\+Tuner}{p.}{classmlpack_1_1hpt_1_1HyperParameterTuner}} class.
\section{Basic Usage}\label{hpt_guide_hptbasic}
The interface of the hyper-\/parameter tuning module is quite similar to the interface of the \doxyref{cross-\/validation module}{p.}{cv}. To construct a {\ttfamily \doxyref{Hyper\+Parameter\+Tuner}{p.}{classmlpack_1_1hpt_1_1HyperParameterTuner}} object you need to specify as template parameters what machine learning algorithm, cross-\/validation strategy, performance measure, and optimization strategy ({\ttfamily ens\+::\+Grid\+Search} will be used by default) you are going to use. Then, you must pass the same arguments as for the cross-\/validation classes\+: the data and labels (or responses) to use are given to the constructor, and the possible hyperparameter values are given to the {\ttfamily \doxyref{Hyper\+Parameter\+Tuner\+::\+Optimize()}{p.}{classmlpack_1_1hpt_1_1HyperParameterTuner_a4e04da235ec0434d69613c547b20dbea}} method, which returns the best algorithm configuration as a {\ttfamily std\+::tuple$<$$>$}.

Let\textquotesingle{}s see some examples.

Suppose we have the following data to train and validate on. 
\begin{DoxyCode}
\textcolor{comment}{// 100-point 5-dimensional random dataset.}
arma::mat data = arma::randu<arma::mat>(5, 100);
\textcolor{comment}{// Noisy responses retrieved by a random linear transformation of data.}
arma::rowvec responses = arma::randu<arma::rowvec>(5) * data +
    0.1 * arma::randn<arma::rowvec>(100);
\end{DoxyCode}


Given the dataset above, we can use the following code to try to find a good {\ttfamily lambda} value for \doxyref{Linear\+Regression}{p.}{classmlpack_1_1regression_1_1LinearRegression}. Here we use \doxyref{Simple\+CV}{p.}{classmlpack_1_1cv_1_1SimpleCV} instead of k-\/fold cross-\/validation to save computation time.


\begin{DoxyCode}
\textcolor{comment}{// Using 80% of data for training and remaining 20% for assessing MSE.}
\textcolor{keywordtype}{double} validationSize = 0.2;
HyperParameterTuner<LinearRegression, MSE, SimpleCV> hpt(validationSize,
    data, responses);

\textcolor{comment}{// Finding a good value for lambda from the discrete set of values 0.0, 0.001,}
\textcolor{comment}{// 0.01, 0.1, and 1.0.}
arma::vec lambdas\{0.0, 0.001, 0.01, 0.1, 1.0\};
\textcolor{keywordtype}{double} bestLambda;
std::tie(bestLambda) = hpt.Optimize(lambdas);
\end{DoxyCode}


In this example we have used {\ttfamily ens\+::\+Grid\+Search} (the default optimizer) to find a good value for the {\ttfamily lambda} hyperparameter. To do so, we specified the values that should be tried.
\section{Fixed Arguments}\label{hpt_guide_hptfixed}
When some hyperparameters should not be optimized, you can specify values for them with the {\ttfamily \doxyref{Fixed()}{p.}{namespacemlpack_1_1hpt_ad773f4d1def8deb412ffbf37bdf289ec}} function, as in the following example, which tries to find good {\ttfamily lambda1} and {\ttfamily lambda2} values for \doxyref{L\+A\+RS}{p.}{classmlpack_1_1regression_1_1LARS} (least-\/angle regression).


\begin{DoxyCode}
HyperParameterTuner<LARS, MSE, SimpleCV> hpt2(validationSize, data,
    responses);

\textcolor{comment}{// The hyper-parameter tuner should not try to change the transposeData or}
\textcolor{comment}{// useCholesky parameters.}
\textcolor{keywordtype}{bool} transposeData = \textcolor{keyword}{true};
\textcolor{keywordtype}{bool} useCholesky = \textcolor{keyword}{false};

\textcolor{comment}{// We wish only to search for the best lambda1 and lambda2 values.}
arma::vec lambda1Set\{0.0, 0.001, 0.01, 0.1, 1.0\};
arma::vec lambda2Set\{0.0, 0.002, 0.02, 0.2, 2.0\};

\textcolor{keywordtype}{double} bestLambda1, bestLambda2;
std::tie(bestLambda1, bestLambda2) = hpt2.Optimize(Fixed(transposeData),
    Fixed(useCholesky), lambda1Set, lambda2Set);
\end{DoxyCode}


Note that in the call to {\ttfamily hpt2.\+Optimize()}, the arguments are given in the same order as they appear in the corresponding \doxyref{L\+A\+RS}{p.}{classmlpack_1_1regression_1_1LARS} constructor (after the data and responses)\+:


\begin{DoxyCode}
LARS(\textcolor{keyword}{const} arma::mat& data,
     \textcolor{keyword}{const} arma::rowvec& responses,
     \textcolor{keyword}{const} \textcolor{keywordtype}{bool} transposeData = \textcolor{keyword}{true},
     \textcolor{keyword}{const} \textcolor{keywordtype}{bool} useCholesky = \textcolor{keyword}{false},
     \textcolor{keyword}{const} \textcolor{keywordtype}{double} lambda1 = 0.0,
     \textcolor{keyword}{const} \textcolor{keywordtype}{double} lambda2 = 0.0,
     \textcolor{keyword}{const} \textcolor{keywordtype}{double} tolerance = 1e-16);
\end{DoxyCode}
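
The effect of fixing some arguments can be sketched in plain C++ as follows. This is a simplified illustration under stated assumptions, not mlpack\textquotesingle{}s implementation\+: {\ttfamily Score4} and {\ttfamily SearchWithFixed} are hypothetical names, and the scoring callback stands in for building the model and evaluating the performance measure. The callback takes its parameters in constructor order, the two flags are held constant, and only the tuned values are returned.

```cpp
#include <cassert>
#include <functional>
#include <limits>
#include <tuple>
#include <vector>

// Hypothetical scoring callback; its parameters follow the constructor order
// (transposeData, useCholesky, lambda1, lambda2).
using Score4 = std::function<double(bool, bool, double, double)>;

// Exhaustive search over lambda1Set x lambda2Set with both flags held
// constant. Only the tuned values are returned, not the fixed ones.
std::tuple<double, double> SearchWithFixed(
    const bool transposeData,
    const bool useCholesky,
    const std::vector<double>& lambda1Set,
    const std::vector<double>& lambda2Set,
    const Score4& score)
{
  assert(!lambda1Set.empty() && !lambda2Set.empty());
  std::tuple<double, double> best{lambda1Set.front(), lambda2Set.front()};
  double bestScore = std::numeric_limits<double>::infinity();
  for (const double lambda1 : lambda1Set)
  {
    for (const double lambda2 : lambda2Set)
    {
      const double s = score(transposeData, useCholesky, lambda1, lambda2);
      if (s < bestScore)
      {
        bestScore = s;
        best = std::make_tuple(lambda1, lambda2);
      }
    }
  }
  return best;
}
```

With five candidates in each set, this sketch evaluates the score 25 times, so the cost of an exhaustive search grows with the product of the set sizes.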
\section{Gradient-\/\+Based Optimization}\label{hpt_guide_hptgradient}
In some cases we may wish to optimize a hyperparameter over the space of all possible real values, instead of providing a grid in which to search. Alternatively, we may know approximately optimal values from a grid search for real-\/valued hyperparameters, but wish to further tune those values.

In this case, we can use a gradient-\/based optimizer for hyperparameter search. In the following example, we try to optimize the {\ttfamily lambda1} and {\ttfamily lambda2} hyper-\/parameters for \doxyref{L\+A\+RS}{p.}{classmlpack_1_1regression_1_1LARS} with the {\ttfamily ens\+::\+Gradient\+Descent} optimizer.


\begin{DoxyCode}
HyperParameterTuner<LARS, MSE, SimpleCV, GradientDescent> hpt3(validationSize,
    data, responses);

\textcolor{comment}{// GradientDescent can be adjusted in the following way.}
hpt3.Optimizer().StepSize() = 0.1;
hpt3.Optimizer().Tolerance() = 1e-15;

\textcolor{comment}{// We can set up values used for calculating gradients.}
hpt3.RelativeDelta() = 0.01;
hpt3.MinDelta() = 1e-10;

\textcolor{keywordtype}{double} initialLambda1 = 0.001;
\textcolor{keywordtype}{double} initialLambda2 = 0.002;

\textcolor{keywordtype}{double} bestGDLambda1, bestGDLambda2;
std::tie(bestGDLambda1, bestGDLambda2) = hpt3.Optimize(Fixed(transposeData),
    Fixed(useCholesky), initialLambda1, initialLambda2);
\end{DoxyCode}
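
To give an intuition for the settings used above, the following plain C++ sketch estimates a gradient by forward differences and runs a basic gradient descent loop. It is not mlpack or ensmallen code. The names {\ttfamily Objective}, {\ttfamily NumericGradient} and {\ttfamily Descend} are hypothetical, and the rule that the increment for each coordinate is the larger of the relative increment and the minimum increment is an assumption about how the two delta settings interact.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using Objective = std::function<double(const std::vector<double>&)>;

// Forward-difference estimate of the gradient of f at x. The increment for
// coordinate i is max(|x[i]| * relativeDelta, minDelta) (an assumption).
std::vector<double> NumericGradient(const Objective& f,
                                    const std::vector<double>& x,
                                    const double relativeDelta,
                                    const double minDelta)
{
  assert(relativeDelta >= 0.0 && minDelta > 0.0);
  const double fx = f(x);
  std::vector<double> gradient(x.size());
  for (std::size_t i = 0; i < x.size(); ++i)
  {
    const double delta = std::max(std::abs(x[i]) * relativeDelta, minDelta);
    std::vector<double> shifted = x;
    shifted[i] += delta;
    gradient[i] = (f(shifted) - fx) / delta;
  }
  return gradient;
}

// Plain gradient descent; stops when the objective changes by less than
// 'tolerance' or after 'maxIterations' steps.
std::vector<double> Descend(const Objective& f,
                            std::vector<double> x,
                            const double stepSize,
                            const double tolerance,
                            const std::size_t maxIterations,
                            const double relativeDelta,
                            const double minDelta)
{
  double previous = f(x);
  for (std::size_t it = 0; it < maxIterations; ++it)
  {
    const std::vector<double> gradient =
        NumericGradient(f, x, relativeDelta, minDelta);
    for (std::size_t i = 0; i < x.size(); ++i)
      x[i] -= stepSize * gradient[i];
    const double current = f(x);
    if (std::abs(previous - current) < tolerance)
      break;
    previous = current;
  }
  return x;
}
```

Because the increment scales with the magnitude of the value, the forward-difference estimate is slightly biased. A smaller relative increment reduces that bias but makes the estimate more sensitive to rounding error.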
\section{The Hyper\+Parameter\+Tuner class}\label{hpt_guide_hpt_class}
The {\ttfamily \doxyref{Hyper\+Parameter\+Tuner}{p.}{classmlpack_1_1hpt_1_1HyperParameterTuner}} class is very similar to the \doxyref{K\+Fold\+CV}{p.}{classmlpack_1_1cv_1_1KFoldCV} and \doxyref{Simple\+CV}{p.}{classmlpack_1_1cv_1_1SimpleCV} classes (see the \doxyref{cross-\/validation tutorial}{p.}{cv} for more information on those two classes), but there are a few important differences.

First, the {\ttfamily \doxyref{Hyper\+Parameter\+Tuner}{p.}{classmlpack_1_1hpt_1_1HyperParameterTuner}} accepts five template parameters; only the first three of these are required\+:


\begin{DoxyItemize}
\item {\ttfamily M\+L\+Algorithm}\+: This is the algorithm to be used.
\item {\ttfamily Metric}\+: This is the performance measure to be used; see \doxyref{Performance measures}{p.}{cv_cvbasic_metrics} for more information.
\item {\ttfamily C\+V\+Type}\+: This is the type of cross-\/validation to be used for evaluating the performance measure; this should be \doxyref{K\+Fold\+CV}{p.}{classmlpack_1_1cv_1_1KFoldCV} or \doxyref{Simple\+CV}{p.}{classmlpack_1_1cv_1_1SimpleCV}.
\item {\ttfamily Optimizer\+Type}\+: This is the type of optimizer to use; it can be {\ttfamily Grid\+Search} or a gradient-\/based optimizer.
\item {\ttfamily Mat\+Type}\+: This is the type of data matrix to use. The default is {\ttfamily arma\+::mat}. This only needs to be changed if you are specifically using sparse data, or if you want to use a numeric type other than {\ttfamily double}.
\end{DoxyItemize}

The last two template parameters are automatically inferred by the {\ttfamily \doxyref{Hyper\+Parameter\+Tuner}{p.}{classmlpack_1_1hpt_1_1HyperParameterTuner}} and should not need to be manually specified, unless an unconventional data type like {\ttfamily arma\+::fmat} is being used for data points.

Typically, \doxyref{Simple\+CV}{p.}{classmlpack_1_1cv_1_1SimpleCV} is a good choice for {\ttfamily C\+V\+Type} because it takes much less time to compute than full \doxyref{K\+Fold\+CV}{p.}{classmlpack_1_1cv_1_1KFoldCV}; the disadvantage is that \doxyref{Simple\+CV}{p.}{classmlpack_1_1cv_1_1SimpleCV} might give a somewhat noisier estimate of the performance measure on unseen test data.
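
The difference in cost can be seen from a plain C++ sketch of the two splitting schemes. The helper names {\ttfamily HoldOutValidationRange} and {\ttfamily KFoldValidationRanges} are hypothetical, and the way points are assigned to the validation part is an assumption made for illustration; mlpack\textquotesingle{}s classes may split the data differently. A hold-out split yields one validation range, so the model is trained once per candidate, whereas a k-fold split yields k ranges and therefore k training runs per candidate.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Half-open index range [first, second) of the validation points.
using Range = std::pair<std::size_t, std::size_t>;

// Hold-out validation: a single split, so one training run per candidate.
// Here the last points are used for validation (an assumption).
Range HoldOutValidationRange(const std::size_t n, const double validationSize)
{
  assert(validationSize > 0.0 && validationSize < 1.0);
  const auto nValid =
      static_cast<std::size_t>(std::round(n * validationSize));
  return {n - nValid, n};
}

// k-fold validation: k splits, so k training runs per candidate.
std::vector<Range> KFoldValidationRanges(const std::size_t n,
                                         const std::size_t k)
{
  assert(k >= 2 && k <= n);
  std::vector<Range> folds;
  for (std::size_t i = 0; i < k; ++i)
    folds.push_back({i * n / k, (i + 1) * n / k});
  return folds;
}
```

Averaging the measure over k validation ranges is what makes the k-fold estimate less noisy, at roughly k times the training cost.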

The constructor for the {\ttfamily \doxyref{Hyper\+Parameter\+Tuner}{p.}{classmlpack_1_1hpt_1_1HyperParameterTuner}} is called with exactly the same arguments as the corresponding {\ttfamily C\+V\+Type} that has been chosen. For more information on that, please see the \doxyref{cross-\/validation constructor tutorial}{p.}{cv_cvbasic_api}. As an example, if we are using \doxyref{Simple\+CV}{p.}{classmlpack_1_1cv_1_1SimpleCV} and wish to hold out 20\% of the dataset as a validation set, we might construct a {\ttfamily \doxyref{Hyper\+Parameter\+Tuner}{p.}{classmlpack_1_1hpt_1_1HyperParameterTuner}} like this\+:


\begin{DoxyCode}
\textcolor{comment}{// We will use LinearRegression as the MLAlgorithm, and MSE as the performance}
\textcolor{comment}{// measure.  Our dataset is 'dataset' and the responses are 'responses'.}
HyperParameterTuner<LinearRegression, MSE, SimpleCV> hpt(0.2, dataset,
    responses);
\end{DoxyCode}


Next, we must set up the hyperparameters to be optimized. If we are doing a grid search with the {\ttfamily ens\+::\+Grid\+Search} optimizer (the default), then we only need to pass a {\ttfamily std\+::vector} (for non-\/numeric hyperparameters) or an {\ttfamily arma\+::vec} (for numeric hyperparameters) containing all of the possible choices that we wish to search over.

For instance, a set of numeric values might be chosen like this, for the {\ttfamily lambda} parameter (of type {\ttfamily double})\+:


\begin{DoxyCode}
arma::vec lambdaSet = arma::vec(\textcolor{stringliteral}{"0.0 0.1 0.5 1.0"});
\end{DoxyCode}


Similarly, a set of non-\/numeric values might be chosen like this, for the {\ttfamily intercept} parameter\+:


\begin{DoxyCode}
std::vector<bool> interceptSet = \{ \textcolor{keyword}{false}, \textcolor{keyword}{true} \};
\end{DoxyCode}


Once all of these are set up, the {\ttfamily \doxyref{Hyper\+Parameter\+Tuner\+::\+Optimize()}{p.}{classmlpack_1_1hpt_1_1HyperParameterTuner_a4e04da235ec0434d69613c547b20dbea}} method may be called to find the best set of hyperparameters\+:


\begin{DoxyCode}
\textcolor{keywordtype}{bool} intercept;
\textcolor{keywordtype}{double} lambda;
std::tie(lambda, intercept) = hpt.Optimize(lambdaSet, interceptSet);
\end{DoxyCode}
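
How a numeric set and a non-numeric set can be searched together, with the result unpacked from a tuple, can be sketched in plain C++. The template {\ttfamily GridSearch2} is a hypothetical helper written for this illustration and is not part of mlpack; its {\ttfamily score} argument stands in for training and evaluating the model with one combination of values.

```cpp
#include <cassert>
#include <limits>
#include <tuple>
#include <vector>

// Hypothetical helper: tries every pair from the two candidate sets and
// returns the pair with the lowest score. The element types may differ,
// for example double for the first set and bool for the second.
template<typename A, typename B, typename Score>
std::tuple<A, B> GridSearch2(const std::vector<A>& firstSet,
                             const std::vector<B>& secondSet,
                             Score score)
{
  assert(!firstSet.empty() && !secondSet.empty());
  A bestA = firstSet.front();
  B bestB = secondSet.front();
  double bestScore = std::numeric_limits<double>::infinity();
  for (const A a : firstSet)
  {
    for (const B b : secondSet)
    {
      const double s = score(a, b);
      if (s < bestScore)
      {
        bestScore = s;
        bestA = a;
        bestB = b;
      }
    }
  }
  return std::make_tuple(bestA, bestB);
}
```

The returned tuple lists the values in the same order as the candidate sets were passed, which is why the variables given to {\ttfamily std\+::tie} must follow that order as well.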


Alternatively, the {\ttfamily \doxyref{Fixed()}{p.}{namespacemlpack_1_1hpt_ad773f4d1def8deb412ffbf37bdf289ec}} function (detailed in the \doxyref{Fixed arguments}{p.}{hpt_guide_hptfixed} section) can be used to fix the values of some parameters.

For continuous optimizers like {\ttfamily ens\+::\+Gradient\+Descent}, a set of candidate values is not needed; instead, a single initial value is given for each hyperparameter. See the \doxyref{Gradient-\/\+Based Optimization}{p.}{hpt_guide_hptgradient} section for more details.
\section{Further documentation}\label{hpt_guide_hptfurther}
For more information on the {\ttfamily \doxyref{Hyper\+Parameter\+Tuner}{p.}{classmlpack_1_1hpt_1_1HyperParameterTuner}} class, see the \doxyref{mlpack\+::hpt\+::\+Hyper\+Parameter\+Tuner}{p.}{classmlpack_1_1hpt_1_1HyperParameterTuner} class documentation and the \doxyref{cross-\/validation tutorial}{p.}{cv}. 