\section{Prioritized\+Replay$<$ Environment\+Type $>$ Class Template Reference}
\label{classmlpack_1_1rl_1_1PrioritizedReplay}\index{Prioritized\+Replay$<$ Environment\+Type $>$@{Prioritized\+Replay$<$ Environment\+Type $>$}}


Implementation of prioritized experience replay.  


\subsection*{Classes}
\begin{DoxyCompactItemize}
\item 
struct \textbf{ Transition}
\end{DoxyCompactItemize}
\subsection*{Public Types}
\begin{DoxyCompactItemize}
\item 
using \textbf{ Action\+Type} = typename Environment\+Type\+::\+Action
\begin{DoxyCompactList}\small\item\em Convenient typedef for action. \end{DoxyCompactList}\item 
using \textbf{ State\+Type} = typename Environment\+Type\+::\+State
\begin{DoxyCompactList}\small\item\em Convenient typedef for state. \end{DoxyCompactList}\end{DoxyCompactItemize}
\subsection*{Public Member Functions}
\begin{DoxyCompactItemize}
\item 
\textbf{ Prioritized\+Replay} ()
\begin{DoxyCompactList}\small\item\em Default constructor. \end{DoxyCompactList}\item 
\textbf{ Prioritized\+Replay} (const size\+\_\+t batch\+Size, const size\+\_\+t capacity, const double alpha, const size\+\_\+t n\+Steps=1, const size\+\_\+t dimension=State\+Type\+::dimension)
\begin{DoxyCompactList}\small\item\em Construct an instance of prioritized experience replay class. \end{DoxyCompactList}\item 
void \textbf{ Beta\+Anneal} ()
\begin{DoxyCompactList}\small\item\em Annealing the beta. \end{DoxyCompactList}\item 
void \textbf{ Get\+N\+Step\+Info} (double \&reward, \textbf{ State\+Type} \&next\+State, bool \&is\+End, const double \&discount)
\begin{DoxyCompactList}\small\item\em Get the reward, next state and terminal boolean for nth step. \end{DoxyCompactList}\item 
const size\+\_\+t \& \textbf{ N\+Steps} () const
\begin{DoxyCompactList}\small\item\em Get the number of steps for n-\/step agent. \end{DoxyCompactList}\item 
void \textbf{ Sample} (arma\+::mat \&sampled\+States, std\+::vector$<$ \textbf{ Action\+Type} $>$ \&sampled\+Actions, arma\+::rowvec \&sampled\+Rewards, arma\+::mat \&sampled\+Next\+States, arma\+::irowvec \&is\+Terminal)
\begin{DoxyCompactList}\small\item\em Sample some experience according to their priorities. \end{DoxyCompactList}\item 
arma\+::ucolvec \textbf{ Sample\+Proportional} ()
\begin{DoxyCompactList}\small\item\em Sample some experience according to their priorities. \end{DoxyCompactList}\item 
const size\+\_\+t \& \textbf{ Size} ()
\begin{DoxyCompactList}\small\item\em Get the number of transitions in the memory. \end{DoxyCompactList}\item 
void \textbf{ Store} (\textbf{ State\+Type} state, \textbf{ Action\+Type} action, double reward, \textbf{ State\+Type} next\+State, bool is\+End, const double \&discount)
\begin{DoxyCompactList}\small\item\em Store the given experience and set the priorities for the given experience. \end{DoxyCompactList}\item 
void \textbf{ Update} (arma\+::mat target, std\+::vector$<$ \textbf{ Action\+Type} $>$ sampled\+Actions, arma\+::mat next\+Action\+Values, arma\+::mat \&gradients)
\begin{DoxyCompactList}\small\item\em Update the priorities of transitions and Update the gradients. \end{DoxyCompactList}\item 
void \textbf{ Update\+Priorities} (arma\+::ucolvec \&indices, arma\+::colvec \&priorities)
\begin{DoxyCompactList}\small\item\em Update priorities of sampled transitions. \end{DoxyCompactList}\end{DoxyCompactItemize}


\subsection{Detailed Description}
\subsubsection*{template$<$typename Environment\+Type$>$\newline
class mlpack\+::rl\+::\+Prioritized\+Replay$<$ Environment\+Type $>$}

Implementation of prioritized experience replay. 

Prioritized experience replay can replay important transitions more frequently by prioritizing transitions, and make agent learn more efficiently.


\begin{DoxyCode}
@article\{schaul2015prioritized,
 title   = \{Prioritized experience replay\},
 author  = \{Schaul, Tom and Quan, John and Antonoglou,
            Ioannis and Silver, David\},
 journal = \{arXiv preprint arXiv:1511.05952\},
 year    = \{2015\}
 \}
\end{DoxyCode}



\begin{DoxyTemplParams}{Template Parameters}
{\em Environment\+Type} & Desired task. \\
\hline
\end{DoxyTemplParams}


Definition at line 39 of file prioritized\+\_\+replay.\+hpp.



\subsection{Member Typedef Documentation}
\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_aaf7b2dc5d49d01961601c7c16be76777}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!Action\+Type@{Action\+Type}}
\index{Action\+Type@{Action\+Type}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{Action\+Type}
{\footnotesize\ttfamily using \textbf{ Action\+Type} =  typename Environment\+Type\+::\+Action}



Convenient typedef for action. 



Definition at line 43 of file prioritized\+\_\+replay.\+hpp.

\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_ada68ef405b7c331a2bee337614f00088}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!State\+Type@{State\+Type}}
\index{State\+Type@{State\+Type}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{State\+Type}
{\footnotesize\ttfamily using \textbf{ State\+Type} =  typename Environment\+Type\+::\+State}



Convenient typedef for state. 



Definition at line 46 of file prioritized\+\_\+replay.\+hpp.



\subsection{Constructor \& Destructor Documentation}
\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_a2d2ee6b689ad5f996c939be2f1f61ba0}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!Prioritized\+Replay@{Prioritized\+Replay}}
\index{Prioritized\+Replay@{Prioritized\+Replay}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{Prioritized\+Replay()\hspace{0.1cm}{\footnotesize\ttfamily [1/2]}}
{\footnotesize\ttfamily \textbf{ Prioritized\+Replay} (\begin{DoxyParamCaption}{ }\end{DoxyParamCaption})\hspace{0.3cm}{\ttfamily [inline]}}



Default constructor. 



Definition at line 60 of file prioritized\+\_\+replay.\+hpp.



References alpha().

\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_a2754147b888db0a76084538e3426c4f9}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!Prioritized\+Replay@{Prioritized\+Replay}}
\index{Prioritized\+Replay@{Prioritized\+Replay}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{Prioritized\+Replay()\hspace{0.1cm}{\footnotesize\ttfamily [2/2]}}
{\footnotesize\ttfamily \textbf{ Prioritized\+Replay} (\begin{DoxyParamCaption}\item[{const size\+\_\+t}]{batch\+Size,  }\item[{const size\+\_\+t}]{capacity,  }\item[{const double}]{alpha,  }\item[{const size\+\_\+t}]{n\+Steps = {\ttfamily 1},  }\item[{const size\+\_\+t}]{dimension = {\ttfamily StateType\+:\+:dimension} }\end{DoxyParamCaption})\hspace{0.3cm}{\ttfamily [inline]}}



Construct an instance of prioritized experience replay class. 


\begin{DoxyParams}{Parameters}
{\em batch\+Size} & Number of examples returned at each sample. \\
\hline
{\em capacity} & Total memory size in terms of number of examples. \\
\hline
{\em alpha} & How much prioritization is used. \\
\hline
{\em n\+Steps} & Number of steps to look in the future. \\
\hline
{\em dimension} & The dimension of an encoded state. \\
\hline
\end{DoxyParams}


Definition at line 82 of file prioritized\+\_\+replay.\+hpp.



\subsection{Member Function Documentation}
\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_a26967aa9c873e7085b621d541d4120e0}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!Beta\+Anneal@{Beta\+Anneal}}
\index{Beta\+Anneal@{Beta\+Anneal}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{Beta\+Anneal()}
{\footnotesize\ttfamily void Beta\+Anneal (\begin{DoxyParamCaption}{ }\end{DoxyParamCaption})\hspace{0.3cm}{\ttfamily [inline]}}



Annealing the beta. 



Definition at line 276 of file prioritized\+\_\+replay.\+hpp.



Referenced by Prioritized\+Replay$<$ Environment\+Type $>$\+::\+Sample().

\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_abf36129b66f5b30a65d96d11ebfde027}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!Get\+N\+Step\+Info@{Get\+N\+Step\+Info}}
\index{Get\+N\+Step\+Info@{Get\+N\+Step\+Info}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{Get\+N\+Step\+Info()}
{\footnotesize\ttfamily void Get\+N\+Step\+Info (\begin{DoxyParamCaption}\item[{double \&}]{reward,  }\item[{\textbf{ State\+Type} \&}]{next\+State,  }\item[{bool \&}]{is\+End,  }\item[{const double \&}]{discount }\end{DoxyParamCaption})\hspace{0.3cm}{\ttfamily [inline]}}



Get the reward, next state and terminal boolean for nth step. 


\begin{DoxyParams}{Parameters}
{\em reward} & Given reward. \\
\hline
{\em next\+State} & Given next state. \\
\hline
{\em is\+End} & Whether next state is terminal state. \\
\hline
{\em discount} & The discount parameter. \\
\hline
\end{DoxyParams}


Definition at line 171 of file prioritized\+\_\+replay.\+hpp.



Referenced by Prioritized\+Replay$<$ Environment\+Type $>$\+::\+Store().

\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_a48a86a6254329a98e1f15d4722c4e85b}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!N\+Steps@{N\+Steps}}
\index{N\+Steps@{N\+Steps}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{N\+Steps()}
{\footnotesize\ttfamily const size\+\_\+t\& N\+Steps (\begin{DoxyParamCaption}{ }\end{DoxyParamCaption}) const\hspace{0.3cm}{\ttfamily [inline]}}



Get the number of steps for n-\/step agent. 



Definition at line 308 of file prioritized\+\_\+replay.\+hpp.



References alpha().

\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_a6ecc6da2d5f83f0eefdc74be3465925a}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!Sample@{Sample}}
\index{Sample@{Sample}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{Sample()}
{\footnotesize\ttfamily void Sample (\begin{DoxyParamCaption}\item[{arma\+::mat \&}]{sampled\+States,  }\item[{std\+::vector$<$ \textbf{ Action\+Type} $>$ \&}]{sampled\+Actions,  }\item[{arma\+::rowvec \&}]{sampled\+Rewards,  }\item[{arma\+::mat \&}]{sampled\+Next\+States,  }\item[{arma\+::irowvec \&}]{is\+Terminal }\end{DoxyParamCaption})\hspace{0.3cm}{\ttfamily [inline]}}



Sample some experience according to their priorities. 


\begin{DoxyParams}{Parameters}
{\em sampled\+States} & Sampled encoded states. \\
\hline
{\em sampled\+Actions} & Sampled actions. \\
\hline
{\em sampled\+Rewards} & Sampled rewards. \\
\hline
{\em sampled\+Next\+States} & Sampled encoded next states. \\
\hline
{\em is\+Terminal} & Indicate whether corresponding next state is terminal state. \\
\hline
\end{DoxyParams}


Definition at line 221 of file prioritized\+\_\+replay.\+hpp.



References Prioritized\+Replay$<$ Environment\+Type $>$\+::\+Beta\+Anneal(), and Prioritized\+Replay$<$ Environment\+Type $>$\+::\+Sample\+Proportional().

\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_a1a45c1e17aad599a64fa6f941979ad10}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!Sample\+Proportional@{Sample\+Proportional}}
\index{Sample\+Proportional@{Sample\+Proportional}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{Sample\+Proportional()}
{\footnotesize\ttfamily arma\+::ucolvec Sample\+Proportional (\begin{DoxyParamCaption}{ }\end{DoxyParamCaption})\hspace{0.3cm}{\ttfamily [inline]}}



Sample some experience according to their priorities. 

\begin{DoxyReturn}{Returns}
The indices to be chosen. 
\end{DoxyReturn}


Definition at line 198 of file prioritized\+\_\+replay.\+hpp.



Referenced by Prioritized\+Replay$<$ Environment\+Type $>$\+::\+Sample().

\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_ab8983dc8f7847b4c77148b86d0e7fc8d}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!Size@{Size}}
\index{Size@{Size}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{Size()}
{\footnotesize\ttfamily const size\+\_\+t\& Size (\begin{DoxyParamCaption}{ }\end{DoxyParamCaption})\hspace{0.3cm}{\ttfamily [inline]}}



Get the number of transitions in the memory. 

\begin{DoxyReturn}{Returns}
Actual used memory size. 
\end{DoxyReturn}


Definition at line 268 of file prioritized\+\_\+replay.\+hpp.

\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_ab17ee90540cf7b26647b57acf16116d5}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!Store@{Store}}
\index{Store@{Store}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{Store()}
{\footnotesize\ttfamily void Store (\begin{DoxyParamCaption}\item[{\textbf{ State\+Type}}]{state,  }\item[{\textbf{ Action\+Type}}]{action,  }\item[{double}]{reward,  }\item[{\textbf{ State\+Type}}]{next\+State,  }\item[{bool}]{is\+End,  }\item[{const double \&}]{discount }\end{DoxyParamCaption})\hspace{0.3cm}{\ttfamily [inline]}}



Store the given experience and set the priorities for the given experience. 


\begin{DoxyParams}{Parameters}
{\em state} & Given state. \\
\hline
{\em action} & Given action. \\
\hline
{\em reward} & Given reward. \\
\hline
{\em next\+State} & Given next state. \\
\hline
{\em is\+End} & Whether next state is terminal state. \\
\hline
{\em discount} & The discount parameter. \\
\hline
\end{DoxyParams}


Definition at line 122 of file prioritized\+\_\+replay.\+hpp.



References Prioritized\+Replay$<$ Environment\+Type $>$\+::\+Transition\+::action, alpha(), Prioritized\+Replay$<$ Environment\+Type $>$\+::\+Get\+N\+Step\+Info(), Prioritized\+Replay$<$ Environment\+Type $>$\+::\+Transition\+::is\+End, Prioritized\+Replay$<$ Environment\+Type $>$\+::\+Transition\+::next\+State, Prioritized\+Replay$<$ Environment\+Type $>$\+::\+Transition\+::reward, and Prioritized\+Replay$<$ Environment\+Type $>$\+::\+Transition\+::state.

\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_a9f4cfe4f0266408c291e51db0606b8e0}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!Update@{Update}}
\index{Update@{Update}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{Update()}
{\footnotesize\ttfamily void Update (\begin{DoxyParamCaption}\item[{arma\+::mat}]{target,  }\item[{std\+::vector$<$ \textbf{ Action\+Type} $>$}]{sampled\+Actions,  }\item[{arma\+::mat}]{next\+Action\+Values,  }\item[{arma\+::mat \&}]{gradients }\end{DoxyParamCaption})\hspace{0.3cm}{\ttfamily [inline]}}



Update the priorities of transitions and Update the gradients. 


\begin{DoxyParams}{Parameters}
{\em target} & The learned value. \\
\hline
{\em sampled\+Actions} & Agent\textquotesingle{}s sampled action. \\
\hline
{\em next\+Action\+Values} & Agent\textquotesingle{}s next action. \\
\hline
{\em gradients} & The model\textquotesingle{}s gradients. \\
\hline
\end{DoxyParams}


Definition at line 289 of file prioritized\+\_\+replay.\+hpp.



References Prioritized\+Replay$<$ Environment\+Type $>$\+::\+Transition\+::action, and Prioritized\+Replay$<$ Environment\+Type $>$\+::\+Update\+Priorities().

\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_a3b512739fae1601beafbb89d51c40a7d}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!Update\+Priorities@{Update\+Priorities}}
\index{Update\+Priorities@{Update\+Priorities}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{Update\+Priorities()}
{\footnotesize\ttfamily void Update\+Priorities (\begin{DoxyParamCaption}\item[{arma\+::ucolvec \&}]{indices,  }\item[{arma\+::colvec \&}]{priorities }\end{DoxyParamCaption})\hspace{0.3cm}{\ttfamily [inline]}}



Update priorities of sampled transitions. 


\begin{DoxyParams}{Parameters}
{\em indices} & The indices of sample to be updated. \\
\hline
{\em priorities} & Their corresponding priorities. \\
\hline
\end{DoxyParams}


Definition at line 256 of file prioritized\+\_\+replay.\+hpp.



References alpha().



Referenced by Prioritized\+Replay$<$ Environment\+Type $>$\+::\+Update().



The documentation for this class was generated from the following file\+:\begin{DoxyCompactItemize}
\item 
/var/www/mlpack.\+ratml.\+org/mlpack.\+org/\+\_\+src/mlpack-\/git/src/mlpack/methods/reinforcement\+\_\+learning/replay/\textbf{ prioritized\+\_\+replay.\+hpp}\end{DoxyCompactItemize}
