GreedyPolicy< EnvironmentType > Class Template Reference

Implementation for epsilon greedy policy.

Public Types

using ActionType = typename EnvironmentType::Action
 Convenient typedef for action.

 

Public Member Functions

 GreedyPolicy (const double initialEpsilon, const size_t annealInterval, const double minEpsilon)
 Constructor for epsilon greedy policy class.

 
void Anneal ()
 Exploration probability will anneal at each step.

 
const double & Epsilon () const
 
ActionType Sample (const arma::colvec &actionValue, bool deterministic=false)
 Sample an action based on given action values.

 

Detailed Description


template<typename EnvironmentType>

class mlpack::rl::GreedyPolicy< EnvironmentType >

Implementation for epsilon greedy policy.

In general, an action is selected greedily based on the action values; occasionally, however, an action is selected at random to encourage exploration.

Template Parameters
EnvironmentType    The reinforcement learning task.

Definition at line 30 of file greedy_policy.hpp.
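
The description above can be written as a probability over actions. The following is a sketch under two assumptions that this page does not state: the random action is drawn uniformly from all actions (including the greedy one), and the greedy action is unique. The symbols \pi, Q, \mathcal{A}, and \epsilon are introduced here for illustration only.

```latex
\pi(a) =
\begin{cases}
  1 - \epsilon + \dfrac{\epsilon}{|\mathcal{A}|} & \text{if } a = \arg\max_{a' \in \mathcal{A}} Q(a'), \\[6pt]
  \dfrac{\epsilon}{|\mathcal{A}|} & \text{otherwise.}
\end{cases}
```

Under these assumptions the probabilities sum to one, and setting \epsilon = 0 recovers the purely greedy choice.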

Member Typedef Documentation

◆ ActionType

using ActionType = typename EnvironmentType::Action

Convenient typedef for action.

Definition at line 34 of file greedy_policy.hpp.

Constructor & Destructor Documentation

◆ GreedyPolicy()

GreedyPolicy ( const double  initialEpsilon,
const size_t  annealInterval,
const double  minEpsilon 
)
inline

Constructor for epsilon greedy policy class.

Parameters
initialEpsilon    The initial probability of exploring (selecting a random action).
annealInterval    The number of steps over which the exploration probability anneals.
minEpsilon    Epsilon will never be less than this value.

Definition at line 45 of file greedy_policy.hpp.
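
The three constructor parameters describe an exploration schedule, but this page does not say what shape the schedule takes. A minimal sketch, assuming a linear schedule that lowers epsilon by a fixed amount per step and clamps it at minEpsilon, might look like the following. The class name EpsilonSchedule and the member delta are hypothetical and are not part of mlpack.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the exploration schedule; this is not mlpack code.
// Assumption: epsilon decreases linearly from initialEpsilon to minEpsilon
// over annealInterval steps.
class EpsilonSchedule
{
 public:
  EpsilonSchedule(const double initialEpsilon,
                  const std::size_t annealInterval,
                  const double minEpsilon) :
      epsilon(initialEpsilon),
      minEpsilon(minEpsilon),
      delta((initialEpsilon - minEpsilon) / annealInterval)
  {
    assert(annealInterval > 0);
    assert(initialEpsilon >= minEpsilon);
  }

  // Take one annealing step: lower epsilon by delta, never below minEpsilon.
  void Anneal() { epsilon = std::max(minEpsilon, epsilon - delta); }

  // Current probability of exploring.
  const double& Epsilon() const { return epsilon; }

 private:
  double epsilon;
  double minEpsilon;
  double delta;
};
```

With initialEpsilon = 1.0, annealInterval = 10, and minEpsilon = 0.1, each call to Anneal() in this sketch lowers epsilon by 0.09 until it settles at 0.1.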

Member Function Documentation

◆ Anneal()

void Anneal ( )
inline

Exploration probability will anneal at each step.

Definition at line 76 of file greedy_policy.hpp.

◆ Epsilon()

const double& Epsilon ( ) const
inline
Returns
Current probability of exploring.

Definition at line 85 of file greedy_policy.hpp.

◆ Sample()

ActionType Sample ( const arma::colvec &  actionValue,
bool  deterministic = false 
)
inline

Sample an action based on given action values.

Parameters
actionValue    Values for each action.
deterministic    If true, always select the action greedily.
Returns
Sampled action.

Definition at line 60 of file greedy_policy.hpp.

References mlpack::math::RandInt(), and mlpack::math::Random().
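
To illustrate how the two parameters of Sample() interact, here is a self-contained sketch of epsilon-greedy selection. It uses only the C++ standard library: std::vector<double> stands in for arma::colvec, an index stands in for ActionType, and std::mt19937 stands in for mlpack's random helpers. The function name SampleEpsilonGreedy and its epsilon and rng parameters are hypothetical; this is not the library's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <random>
#include <vector>

// Hypothetical helper, not part of mlpack: epsilon-greedy choice over a
// plain vector of action values. Returns the index of the chosen action.
std::size_t SampleEpsilonGreedy(const std::vector<double>& actionValue,
                                const double epsilon,
                                std::mt19937& rng,
                                const bool deterministic = false)
{
  assert(!actionValue.empty());

  // Draw a uniform number in [0, 1) to decide whether to explore.
  std::uniform_real_distribution<double> unit(0.0, 1.0);
  if (!deterministic && unit(rng) < epsilon)
  {
    // Explore: pick any action uniformly at random.
    std::uniform_int_distribution<std::size_t> pick(0, actionValue.size() - 1);
    return pick(rng);
  }

  // Exploit: pick the first action with the largest value.
  const auto best = std::max_element(actionValue.begin(), actionValue.end());
  return static_cast<std::size_t>(std::distance(actionValue.begin(), best));
}
```

In this sketch, passing deterministic = true bypasses exploration regardless of epsilon, which is the behavior one would typically want when evaluating a trained agent rather than training it.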


The documentation for this class was generated from the following file:
  • /home/ryan/src/mlpack.org/_src/mlpack-3.0.4/src/mlpack/methods/reinforcement_learning/policy/greedy_policy.hpp