\section{Greedy\+Policy$<$ Environment\+Type $>$ Class Template Reference}
\label{classmlpack_1_1rl_1_1GreedyPolicy}\index{Greedy\+Policy$<$ Environment\+Type $>$@{Greedy\+Policy$<$ Environment\+Type $>$}}


Implementation of the epsilon-\/greedy policy.


\subsection*{Public Types}
\begin{DoxyCompactItemize}
\item 
using \textbf{ Action\+Type} = typename Environment\+Type\+::\+Action
\begin{DoxyCompactList}\small\item\em Convenient typedef for action. \end{DoxyCompactList}\end{DoxyCompactItemize}
\subsection*{Public Member Functions}
\begin{DoxyCompactItemize}
\item 
\textbf{ Greedy\+Policy} (const double initial\+Epsilon, const size\+\_\+t anneal\+Interval, const double min\+Epsilon, const double decay\+Rate=1.\+0)
\begin{DoxyCompactList}\small\item\em Constructor for the epsilon-\/greedy policy class. \end{DoxyCompactList}\item 
void \textbf{ Anneal} ()
\begin{DoxyCompactList}\small\item\em Anneal (decrease) the exploration probability by one step. \end{DoxyCompactList}\item 
const double \& \textbf{ Epsilon} () const
\item 
\textbf{ Action\+Type} \textbf{ Sample} (const arma\+::colvec \&action\+Value, bool deterministic=false)
\begin{DoxyCompactList}\small\item\em Sample an action based on given action values. \end{DoxyCompactList}\end{DoxyCompactItemize}


\subsection{Detailed Description}
\subsubsection*{template$<$typename Environment\+Type$>$\newline
class mlpack\+::rl\+::\+Greedy\+Policy$<$ Environment\+Type $>$}

Implementation of the epsilon-\/greedy policy. 

With probability $1 - \epsilon$ the policy selects the action with the highest action value (greedy selection); with probability $\epsilon$ it selects an action at random to encourage exploration.


\begin{DoxyTemplParams}{Template Parameters}
{\em Environment\+Type} & The reinforcement learning task; it must define a nested {\ttfamily Action} type. \\
\hline
\end{DoxyTemplParams}


Definition at line 31 of file greedy\+\_\+policy.\+hpp.


\subsection{Member Typedef Documentation}
\mbox{\label{classmlpack_1_1rl_1_1GreedyPolicy_aaf7b2dc5d49d01961601c7c16be76777}} 
\index{mlpack\+::rl\+::\+Greedy\+Policy@{mlpack\+::rl\+::\+Greedy\+Policy}!Action\+Type@{Action\+Type}}
\index{Action\+Type@{Action\+Type}!mlpack\+::rl\+::\+Greedy\+Policy@{mlpack\+::rl\+::\+Greedy\+Policy}}
\subsubsection{Action\+Type}
{\footnotesize\ttfamily using \textbf{ Action\+Type} =  typename Environment\+Type\+::\+Action}


Convenient typedef for action. 


Definition at line 35 of file greedy\+\_\+policy.\+hpp.


\subsection{Constructor \& Destructor Documentation}
\mbox{\label{classmlpack_1_1rl_1_1GreedyPolicy_a7e04af56c8b5bb57890640e7fcb6b676}} 
\index{mlpack\+::rl\+::\+Greedy\+Policy@{mlpack\+::rl\+::\+Greedy\+Policy}!Greedy\+Policy@{Greedy\+Policy}}
\index{Greedy\+Policy@{Greedy\+Policy}!mlpack\+::rl\+::\+Greedy\+Policy@{mlpack\+::rl\+::\+Greedy\+Policy}}
\subsubsection{Greedy\+Policy()}
{\footnotesize\ttfamily \textbf{ Greedy\+Policy} (\begin{DoxyParamCaption}\item[{const double}]{initial\+Epsilon,  }\item[{const size\+\_\+t}]{anneal\+Interval,  }\item[{const double}]{min\+Epsilon,  }\item[{const double}]{decay\+Rate = {\ttfamily 1.0} }\end{DoxyParamCaption})\hspace{0.3cm}{\ttfamily [inline]}}


Constructor for the epsilon-\/greedy policy class. 


\begin{DoxyParams}{Parameters}
{\em initial\+Epsilon} & The initial probability of exploring (selecting a random action). \\
\hline
{\em anneal\+Interval} & The number of steps over which the exploration probability is annealed. \\
\hline
{\em min\+Epsilon} & Epsilon will never be less than this value. \\
\hline
{\em decay\+Rate} & Scaling factor applied to the rate at which epsilon is annealed (default 1.\+0). \\
\hline
\end{DoxyParams}


Definition at line 48 of file greedy\+\_\+policy.\+hpp.
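
As a hedged illustration of how these parameters could interact, assume a linear schedule (this page does not state the schedule, so the formula below is an assumption). Writing $\epsilon_0$ for {\ttfamily initial\+Epsilon}, $\epsilon_{\min}$ for {\ttfamily min\+Epsilon}, $N$ for {\ttfamily anneal\+Interval} and $r$ for {\ttfamily decay\+Rate}, each annealing step would subtract a fixed amount $\delta$ and clamp the result:
\[
\delta = \frac{(\epsilon_0 - \epsilon_{\min})\, r}{N},
\qquad
\epsilon_{t+1} = \max\left(\epsilon_{\min},\; \epsilon_t - \delta\right).
\]
For example, with $\epsilon_0 = 1.0$, $\epsilon_{\min} = 0.1$, $N = 1000$ and $r = 1.0$, this gives $\delta = 0.9 / 1000 = 0.0009$, so epsilon would reach its minimum after 1000 steps.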


\subsection{Member Function Documentation}
\mbox{\label{classmlpack_1_1rl_1_1GreedyPolicy_a280278726ff7d32f2b7eff5c92a1767a}} 
\index{mlpack\+::rl\+::\+Greedy\+Policy@{mlpack\+::rl\+::\+Greedy\+Policy}!Anneal@{Anneal}}
\index{Anneal@{Anneal}!mlpack\+::rl\+::\+Greedy\+Policy@{mlpack\+::rl\+::\+Greedy\+Policy}}
\subsubsection{Anneal()}
{\footnotesize\ttfamily void Anneal (\begin{DoxyParamCaption}{ }\end{DoxyParamCaption})\hspace{0.3cm}{\ttfamily [inline]}}


Anneal (decrease) the exploration probability by one step. 


Definition at line 80 of file greedy\+\_\+policy.\+hpp.
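
The following is a minimal, self-contained sketch of one way such an annealing step could work. It is not mlpack's implementation: the class name {\ttfamily LinearEpsilonSchedule}, its private members, and the linear, clamped decrement are all assumptions made for illustration. Only the constructor parameter names and the {\ttfamily Anneal()} and {\ttfamily Epsilon()} method names are taken from this page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the annealing state; NOT mlpack's implementation.
class LinearEpsilonSchedule
{
 public:
  LinearEpsilonSchedule(const double initialEpsilon,
                        const std::size_t annealInterval,
                        const double minEpsilon,
                        const double decayRate = 1.0) :
      epsilon(initialEpsilon),
      minEps(minEpsilon),
      delta(0.0)
  {
    assert(annealInterval > 0);
    // Assumed per-step decrement: the total drop spread evenly over the
    // interval, scaled by decayRate.
    delta = (initialEpsilon - minEpsilon) * decayRate /
        static_cast<double>(annealInterval);
  }

  // Lower epsilon by one step, never going below the configured minimum.
  void Anneal() { epsilon = std::max(minEps, epsilon - delta); }

  // Current exploration probability.
  const double& Epsilon() const { return epsilon; }

 private:
  double epsilon;
  double minEps;
  double delta;
};
```

Under these assumptions, a schedule built as {\ttfamily LinearEpsilonSchedule(1.0, 10, 0.1)} lowers epsilon by 0.09 per call and stays at 0.1 once the minimum has been reached.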

\mbox{\label{classmlpack_1_1rl_1_1GreedyPolicy_a3ababd597760bb1f9782ad2c17aadb41}} 
\index{mlpack\+::rl\+::\+Greedy\+Policy@{mlpack\+::rl\+::\+Greedy\+Policy}!Epsilon@{Epsilon}}
\index{Epsilon@{Epsilon}!mlpack\+::rl\+::\+Greedy\+Policy@{mlpack\+::rl\+::\+Greedy\+Policy}}
\subsubsection{Epsilon()}
{\footnotesize\ttfamily const double\& Epsilon (\begin{DoxyParamCaption}{ }\end{DoxyParamCaption}) const\hspace{0.3cm}{\ttfamily [inline]}}

\begin{DoxyReturn}{Returns}
Current probability of exploring (epsilon). 
\end{DoxyReturn}


Definition at line 89 of file greedy\+\_\+policy.\+hpp.

\mbox{\label{classmlpack_1_1rl_1_1GreedyPolicy_a631fe506e7d81fba96697ba11c6ace84}} 
\index{mlpack\+::rl\+::\+Greedy\+Policy@{mlpack\+::rl\+::\+Greedy\+Policy}!Sample@{Sample}}
\index{Sample@{Sample}!mlpack\+::rl\+::\+Greedy\+Policy@{mlpack\+::rl\+::\+Greedy\+Policy}}
\subsubsection{Sample()}
{\footnotesize\ttfamily \textbf{ Action\+Type} Sample (\begin{DoxyParamCaption}\item[{const arma\+::colvec \&}]{action\+Value,  }\item[{bool}]{deterministic = {\ttfamily false} }\end{DoxyParamCaption})\hspace{0.3cm}{\ttfamily [inline]}}


Sample an action based on given action values. 


\begin{DoxyParams}{Parameters}
{\em action\+Value} & Values for each action. \\
\hline
{\em deterministic} & If true, always select the action greedily (no random exploration). \\
\hline
\end{DoxyParams}
\begin{DoxyReturn}{Returns}
Sampled action. 
\end{DoxyReturn}


Definition at line 64 of file greedy\+\_\+policy.\+hpp.


References mlpack\+::math\+::\+Rand\+Int(), and mlpack\+::math\+::\+Random().
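
The selection rule described for {\ttfamily Sample()} can be sketched without any mlpack or Armadillo dependency. The free function {\ttfamily SampleEpsilonGreedy} below is hypothetical: it takes a {\ttfamily std::vector} and an explicit random engine and returns an index, whereas the documented member takes an {\ttfamily arma::colvec} and returns {\ttfamily Action\+Type}. Breaking ties in favour of the first maximal value is also an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <random>
#include <vector>

// Hypothetical free function illustrating epsilon-greedy selection;
// NOT the mlpack member function.
std::size_t SampleEpsilonGreedy(const std::vector<double>& actionValue,
                                const double epsilon,
                                std::mt19937& rng,
                                const bool deterministic = false)
{
  assert(!actionValue.empty());

  // Explore: with probability epsilon, pick an action uniformly at random.
  std::uniform_real_distribution<double> unit(0.0, 1.0);
  if (!deterministic && unit(rng) < epsilon)
  {
    std::uniform_int_distribution<std::size_t> pick(0,
        actionValue.size() - 1);
    return pick(rng);
  }

  // Exploit: pick the first action with the largest value.
  return static_cast<std::size_t>(std::distance(actionValue.begin(),
      std::max_element(actionValue.begin(), actionValue.end())));
}
```

With action values {\ttfamily \{0.1, 0.7, 0.2\}}, passing {\ttfamily deterministic = true} returns index 1 regardless of epsilon, which is the behaviour one would typically want when evaluating a trained agent rather than training it.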


The documentation for this class was generated from the following file\+:\begin{DoxyCompactItemize}
\item 
/var/www/mlpack.\+ratml.\+org/mlpack.\+org/\+\_\+src/mlpack-\/3.\+3.\+1/src/mlpack/methods/reinforcement\+\_\+learning/policy/\textbf{ greedy\+\_\+policy.\+hpp}\end{DoxyCompactItemize}