\section{Prioritized\+Replay$<$ Environment\+Type $>$ Class Template Reference}
\label{classmlpack_1_1rl_1_1PrioritizedReplay}\index{Prioritized\+Replay$<$ Environment\+Type $>$@{Prioritized\+Replay$<$ Environment\+Type $>$}}


Implementation of prioritized experience replay.  


\subsection*{Public Types}
\begin{DoxyCompactItemize}
\item 
using \textbf{ Action\+Type} = typename Environment\+Type\+::\+Action
\begin{DoxyCompactList}\small\item\em Convenient typedef for action. \end{DoxyCompactList}\item 
using \textbf{ State\+Type} = typename Environment\+Type\+::\+State
\begin{DoxyCompactList}\small\item\em Convenient typedef for state. \end{DoxyCompactList}\end{DoxyCompactItemize}
\subsection*{Public Member Functions}
\begin{DoxyCompactItemize}
\item 
\textbf{ Prioritized\+Replay} ()
\begin{DoxyCompactList}\small\item\em Default constructor. \end{DoxyCompactList}\item 
\textbf{ Prioritized\+Replay} (const size\+\_\+t batch\+Size, const size\+\_\+t capacity, const double alpha, const size\+\_\+t dimension=State\+Type\+::dimension)
\begin{DoxyCompactList}\small\item\em Construct an instance of prioritized experience replay class. \end{DoxyCompactList}\item 
void \textbf{ Beta\+Anneal} ()
\begin{DoxyCompactList}\small\item\em Anneal the beta parameter. \end{DoxyCompactList}\item 
void \textbf{ Sample} (arma\+::mat \&sampled\+States, arma\+::icolvec \&sampled\+Actions, arma\+::colvec \&sampled\+Rewards, arma\+::mat \&sampled\+Next\+States, arma\+::icolvec \&is\+Terminal)
\begin{DoxyCompactList}\small\item\em Sample experiences according to their priorities. \end{DoxyCompactList}\item 
arma\+::ucolvec \textbf{ Sample\+Proportional} ()
\begin{DoxyCompactList}\small\item\em Sample experiences according to their priorities. \end{DoxyCompactList}\item 
const size\+\_\+t \& \textbf{ Size} ()
\begin{DoxyCompactList}\small\item\em Get the number of transitions in the memory. \end{DoxyCompactList}\item 
void \textbf{ Store} (const \textbf{ State\+Type} \&state, \textbf{ Action\+Type} action, double reward, const \textbf{ State\+Type} \&next\+State, bool is\+End)
\begin{DoxyCompactList}\small\item\em Store the given experience and set the priorities for the given experience. \end{DoxyCompactList}\item 
void \textbf{ Update} (arma\+::mat target, arma\+::icolvec sampled\+Actions, arma\+::mat next\+Action\+Values, arma\+::mat \&gradients)
\begin{DoxyCompactList}\small\item\em Update the priorities of transitions and update the gradients. \end{DoxyCompactList}\item 
void \textbf{ Update\+Priorities} (arma\+::ucolvec \&indices, arma\+::colvec \&priorities)
\begin{DoxyCompactList}\small\item\em Update priorities of sampled transitions. \end{DoxyCompactList}\end{DoxyCompactItemize}


\subsection{Detailed Description}
\subsubsection*{template$<$typename Environment\+Type$>$\newline
class mlpack\+::rl\+::\+Prioritized\+Replay$<$ Environment\+Type $>$}

Implementation of prioritized experience replay. 

Prioritized experience replay replays important transitions more frequently by assigning each transition a priority, allowing the agent to learn more efficiently.


\begin{DoxyCode}
@article\{schaul2015prioritized,
 title   = \{Prioritized experience replay\},
 author  = \{Schaul, Tom and Quan, John and Antonoglou,
            Ioannis and Silver, David\},
 journal = \{arXiv preprint arXiv:1511.05952\},
 year    = \{2015\}
 \}
\end{DoxyCode}



\begin{DoxyTemplParams}{Template Parameters}
{\em Environment\+Type} & Desired task. \\
\hline
\end{DoxyTemplParams}
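For intuition, the core sampling rule behind this class can be sketched independently of mlpack: each transition $i$ with priority $p_i$ is drawn with probability $p_i^\alpha / \sum_k p_k^\alpha$, so $\alpha = 0$ recovers uniform replay. The function below is an illustrative stand-alone sketch, not part of the class interface.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Probability of drawing transition i under proportional prioritization,
// given the raw priorities and the exponent alpha.
// alpha = 0 reduces to uniform sampling; alpha = 1 is fully proportional.
double SampleProbability(const std::vector<double>& priorities,
                         const size_t i,
                         const double alpha)
{
  double total = 0.0;
  for (const double p : priorities)
    total += std::pow(p, alpha);
  return std::pow(priorities[i], alpha) / total;
}
```

With priorities $\{1, 1, 2\}$ and $\alpha = 1$, the third transition is drawn half the time; with $\alpha = 0$, every transition is drawn with probability $1/3$.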


Definition at line 39 of file prioritized\+\_\+replay.\+hpp.



\subsection{Member Typedef Documentation}
\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_aaf7b2dc5d49d01961601c7c16be76777}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!Action\+Type@{Action\+Type}}
\index{Action\+Type@{Action\+Type}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{Action\+Type}
{\footnotesize\ttfamily using \textbf{ Action\+Type} =  typename Environment\+Type\+::\+Action}



Convenient typedef for action. 



Definition at line 43 of file prioritized\+\_\+replay.\+hpp.

\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_ada68ef405b7c331a2bee337614f00088}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!State\+Type@{State\+Type}}
\index{State\+Type@{State\+Type}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{State\+Type}
{\footnotesize\ttfamily using \textbf{ State\+Type} =  typename Environment\+Type\+::\+State}



Convenient typedef for state. 



Definition at line 46 of file prioritized\+\_\+replay.\+hpp.



\subsection{Constructor \& Destructor Documentation}
\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_a2d2ee6b689ad5f996c939be2f1f61ba0}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!Prioritized\+Replay@{Prioritized\+Replay}}
\index{Prioritized\+Replay@{Prioritized\+Replay}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{Prioritized\+Replay()\hspace{0.1cm}{\footnotesize\ttfamily [1/2]}}
{\footnotesize\ttfamily \textbf{ Prioritized\+Replay} (\begin{DoxyParamCaption}{ }\end{DoxyParamCaption})\hspace{0.3cm}{\ttfamily [inline]}}



Default constructor. 



Definition at line 51 of file prioritized\+\_\+replay.\+hpp.

\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_ae269b586dd95c4bc1b66860c318e0711}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!Prioritized\+Replay@{Prioritized\+Replay}}
\index{Prioritized\+Replay@{Prioritized\+Replay}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{Prioritized\+Replay()\hspace{0.1cm}{\footnotesize\ttfamily [2/2]}}
{\footnotesize\ttfamily \textbf{ Prioritized\+Replay} (\begin{DoxyParamCaption}\item[{const size\+\_\+t}]{batch\+Size,  }\item[{const size\+\_\+t}]{capacity,  }\item[{const double}]{alpha,  }\item[{const size\+\_\+t}]{dimension = {\ttfamily StateType\+:\+:dimension} }\end{DoxyParamCaption})\hspace{0.3cm}{\ttfamily [inline]}}



Construct an instance of prioritized experience replay class. 


\begin{DoxyParams}{Parameters}
{\em batch\+Size} & Number of examples returned at each sample. \\
\hline
{\em capacity} & Total memory size in terms of number of examples. \\
\hline
{\em alpha} & How much prioritization is used. \\
\hline
{\em dimension} & The dimension of an encoded state. \\
\hline
\end{DoxyParams}


Definition at line 62 of file prioritized\+\_\+replay.\+hpp.



\subsection{Member Function Documentation}
\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_a26967aa9c873e7085b621d541d4120e0}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!Beta\+Anneal@{Beta\+Anneal}}
\index{Beta\+Anneal@{Beta\+Anneal}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{Beta\+Anneal()}
{\footnotesize\ttfamily void Beta\+Anneal (\begin{DoxyParamCaption}{ }\end{DoxyParamCaption})\hspace{0.3cm}{\ttfamily [inline]}}



Anneal the beta parameter. 



Definition at line 203 of file prioritized\+\_\+replay.\+hpp.



Referenced by Prioritized\+Replay$<$ Environment\+Type $>$\+::\+Sample().
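In prioritized replay (Schaul et al., 2015), beta controls the strength of the importance-sampling correction and is annealed toward 1 over the course of training. The sketch below shows a common linear annealing scheme and the resulting importance weight; the function names and schedule are illustrative assumptions, not necessarily the exact rule used by this implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>

// Linearly anneal beta from its initial value toward 1 over totalSteps.
// (A common scheme; the exact schedule here is an illustrative assumption.)
double AnnealedBeta(const double initialBeta,
                    const size_t step,
                    const size_t totalSteps)
{
  const double fraction = std::min(1.0, double(step) / double(totalSteps));
  return initialBeta + fraction * (1.0 - initialBeta);
}

// Importance-sampling weight for a transition sampled with probability
// `prob` from a buffer holding n transitions: w = (n * P(i))^(-beta).
double ImportanceWeight(const double prob, const size_t n, const double beta)
{
  return std::pow(double(n) * prob, -beta);
}
```

At beta = 1 the correction is full: a transition sampled uniformly (prob = 1/n) gets weight 1, while over-sampled transitions are down-weighted.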

\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_aacbf6723cb49d015e918c82011889dcc}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!Sample@{Sample}}
\index{Sample@{Sample}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{Sample()}
{\footnotesize\ttfamily void Sample (\begin{DoxyParamCaption}\item[{arma\+::mat \&}]{sampled\+States,  }\item[{arma\+::icolvec \&}]{sampled\+Actions,  }\item[{arma\+::colvec \&}]{sampled\+Rewards,  }\item[{arma\+::mat \&}]{sampled\+Next\+States,  }\item[{arma\+::icolvec \&}]{is\+Terminal }\end{DoxyParamCaption})\hspace{0.3cm}{\ttfamily [inline]}}



Sample experiences according to their priorities. 


\begin{DoxyParams}{Parameters}
{\em sampled\+States} & Sampled encoded states. \\
\hline
{\em sampled\+Actions} & Sampled actions. \\
\hline
{\em sampled\+Rewards} & Sampled rewards. \\
\hline
{\em sampled\+Next\+States} & Sampled encoded next states. \\
\hline
{\em is\+Terminal} & Indicates whether the corresponding next state is a terminal state. \\
\hline
\end{DoxyParams}


Definition at line 149 of file prioritized\+\_\+replay.\+hpp.



References Prioritized\+Replay$<$ Environment\+Type $>$\+::\+Beta\+Anneal(), Sum\+Tree$<$ T $>$\+::\+Get(), Prioritized\+Replay$<$ Environment\+Type $>$\+::\+Sample\+Proportional(), and Sum\+Tree$<$ T $>$\+::\+Sum().

\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_a1a45c1e17aad599a64fa6f941979ad10}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!Sample\+Proportional@{Sample\+Proportional}}
\index{Sample\+Proportional@{Sample\+Proportional}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{Sample\+Proportional()}
{\footnotesize\ttfamily arma\+::ucolvec Sample\+Proportional (\begin{DoxyParamCaption}{ }\end{DoxyParamCaption})\hspace{0.3cm}{\ttfamily [inline]}}



Sample experiences according to their priorities. 

\begin{DoxyReturn}{Returns}
The indices of the chosen transitions. 
\end{DoxyReturn}


Definition at line 126 of file prioritized\+\_\+replay.\+hpp.



References Sum\+Tree$<$ T $>$\+::\+Find\+Prefix\+Sum(), and Sum\+Tree$<$ T $>$\+::\+Sum().



Referenced by Prioritized\+Replay$<$ Environment\+Type $>$\+::\+Sample().
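Proportional sampling works by drawing a random mass in $[0, \sum_k p_k)$ and finding the first transition whose prefix sum of priorities exceeds it; Sum\+Tree\+::\+Find\+Prefix\+Sum() performs this search in $O(\log n)$. The linear-scan stand-in below illustrates the same idea outside of mlpack.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Return the smallest index i such that the running prefix sum of
// priorities[0..i] exceeds `mass`. This is a linear-scan stand-in for the
// sum-tree search, which returns the same index in O(log n).
size_t FindPrefixSum(const std::vector<double>& priorities, const double mass)
{
  double running = 0.0;
  for (size_t i = 0; i + 1 < priorities.size(); ++i)
  {
    running += priorities[i];
    if (mass < running)
      return i;
  }
  return priorities.size() - 1;  // Remaining mass falls in the last slot.
}
```

With priorities $\{1, 2, 3\}$, masses in $[0,1)$ select index 0, masses in $[1,3)$ select index 1, and masses in $[3,6)$ select index 2, so each transition is chosen in proportion to its priority.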

\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_ab8983dc8f7847b4c77148b86d0e7fc8d}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!Size@{Size}}
\index{Size@{Size}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{Size()}
{\footnotesize\ttfamily const size\+\_\+t\& Size (\begin{DoxyParamCaption}{ }\end{DoxyParamCaption})\hspace{0.3cm}{\ttfamily [inline]}}



Get the number of transitions in the memory. 

\begin{DoxyReturn}{Returns}
The number of transitions currently stored. 
\end{DoxyReturn}


Definition at line 195 of file prioritized\+\_\+replay.\+hpp.

\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_a2a1280d285e5c5dc89392ae9a9bb2c97}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!Store@{Store}}
\index{Store@{Store}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{Store()}
{\footnotesize\ttfamily void Store (\begin{DoxyParamCaption}\item[{const \textbf{ State\+Type} \&}]{state,  }\item[{\textbf{ Action\+Type}}]{action,  }\item[{double}]{reward,  }\item[{const \textbf{ State\+Type} \&}]{next\+State,  }\item[{bool}]{is\+End }\end{DoxyParamCaption})\hspace{0.3cm}{\ttfamily [inline]}}



Store the given experience and set the priorities for the given experience. 


\begin{DoxyParams}{Parameters}
{\em state} & Given state. \\
\hline
{\em action} & Given action. \\
\hline
{\em reward} & Given reward. \\
\hline
{\em next\+State} & Given next state. \\
\hline
{\em is\+End} & Whether the next state is a terminal state. \\
\hline
\end{DoxyParams}


Definition at line 99 of file prioritized\+\_\+replay.\+hpp.



References Sum\+Tree$<$ T $>$\+::\+Set().
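The priority bookkeeping behind Store() can be sketched with a minimal ring buffer: a newly stored transition receives the maximum priority seen so far, guaranteeing it will be sampled at least once before its priority is revised. The struct below is an illustrative sketch (transitions themselves are elided; only priorities are tracked), not the class's actual layout.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative sketch of Store()'s priority handling over a ring buffer.
struct PrioritySketch
{
  std::vector<double> priorities;
  size_t capacity;
  size_t position = 0;   // Next slot to write; wraps around when full.
  size_t size = 0;       // Number of transitions currently stored.
  double maxPriority = 1.0;

  explicit PrioritySketch(const size_t cap) :
      priorities(cap, 0.0), capacity(cap) { }

  void Store()
  {
    priorities[position] = maxPriority;    // New experience gets max priority.
    position = (position + 1) % capacity;  // Overwrite the oldest when full.
    size = std::min(size + 1, capacity);
  }

  void UpdatePriority(const size_t i, const double p)
  {
    priorities[i] = p;
    maxPriority = std::max(maxPriority, p);
  }
};
```

Once the buffer reaches capacity, new experiences overwrite the oldest slots while the running maximum priority persists.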

\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_a8c1b8d14b9e93e488e4f38cb7d8c899c}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!Update@{Update}}
\index{Update@{Update}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{Update()}
{\footnotesize\ttfamily void Update (\begin{DoxyParamCaption}\item[{arma\+::mat}]{target,  }\item[{arma\+::icolvec}]{sampled\+Actions,  }\item[{arma\+::mat}]{next\+Action\+Values,  }\item[{arma\+::mat \&}]{gradients }\end{DoxyParamCaption})\hspace{0.3cm}{\ttfamily [inline]}}



Update the priorities of transitions and update the gradients. 


\begin{DoxyParams}{Parameters}
{\em target} & The learned value. \\
\hline
{\em sampled\+Actions} & Agent\textquotesingle{}s sampled actions. \\
\hline
{\em next\+Action\+Values} & Agent\textquotesingle{}s next action values. \\
\hline
{\em gradients} & The model\textquotesingle{}s gradients. \\
\hline
\end{DoxyParams}


Definition at line 216 of file prioritized\+\_\+replay.\+hpp.



References Prioritized\+Replay$<$ Environment\+Type $>$\+::\+Update\+Priorities().

\mbox{\label{classmlpack_1_1rl_1_1PrioritizedReplay_a3b512739fae1601beafbb89d51c40a7d}} 
\index{mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}!Update\+Priorities@{Update\+Priorities}}
\index{Update\+Priorities@{Update\+Priorities}!mlpack\+::rl\+::\+Prioritized\+Replay@{mlpack\+::rl\+::\+Prioritized\+Replay}}
\subsubsection{Update\+Priorities()}
{\footnotesize\ttfamily void Update\+Priorities (\begin{DoxyParamCaption}\item[{arma\+::ucolvec \&}]{indices,  }\item[{arma\+::colvec \&}]{priorities }\end{DoxyParamCaption})\hspace{0.3cm}{\ttfamily [inline]}}



Update priorities of sampled transitions. 


\begin{DoxyParams}{Parameters}
{\em indices} & The indices of the samples to be updated. \\
\hline
{\em priorities} & Their corresponding priorities. \\
\hline
\end{DoxyParams}


Definition at line 183 of file prioritized\+\_\+replay.\+hpp.



References Sum\+Tree$<$ T $>$\+::\+Batch\+Update().



Referenced by Prioritized\+Replay$<$ Environment\+Type $>$\+::\+Update().
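In the scheme of Schaul et al. (2015), the new priority of a replayed transition is typically derived from the magnitude of its TD error, with a small constant added so that no transition's sampling probability collapses to zero. The function and epsilon below are illustrative assumptions about that convention, not this class's exact formula.

```cpp
#include <cassert>
#include <cmath>

// Priority derived from the absolute TD error, plus a small epsilon so
// every transition keeps a nonzero chance of being resampled.
// (Illustrative convention; epsilon is a hypothetical constant.)
double PriorityFromTDError(const double tdError, const double epsilon = 1e-6)
{
  return std::abs(tdError) + epsilon;
}
```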



The documentation for this class was generated from the following file\+:\begin{DoxyCompactItemize}
\item 
/var/www/mlpack.\+ratml.\+org/mlpack.\+org/\+\_\+src/mlpack-\/3.\+3.\+1/src/mlpack/methods/reinforcement\+\_\+learning/replay/\textbf{ prioritized\+\_\+replay.\+hpp}\end{DoxyCompactItemize}
