Implementation of various Q-learning algorithms, such as DQN and double DQN.

Public Types
using	ActionType = typename EnvironmentType::Action
	Convenience typedef for the action type.

using	StateType = typename EnvironmentType::State
	Convenience typedef for the state type.

Public Member Functions
	QLearning (TrainingConfig config, NetworkType network, PolicyType policy, ReplayType replayMethod, UpdaterType updater=UpdaterType(), EnvironmentType environment=EnvironmentType())
	Create the QLearning object with the given settings.

bool &	Deterministic ()
	Modify the training mode / test mode indicator.

const bool &	Deterministic () const
	Get the training mode / test mode indicator.

double	Episode ()
	Execute an episode.

double	Step ()
	Execute a step in an episode.

const size_t &	TotalSteps () const
	Get the total number of steps taken since the beginning.

Detailed Description

template<typename EnvironmentType, typename NetworkType, typename UpdaterType, typename PolicyType, typename ReplayType = RandomReplay<EnvironmentType>>
class mlpack::rl::QLearning< EnvironmentType, NetworkType, UpdaterType, PolicyType, ReplayType >

Implementation of various Q-learning algorithms, such as DQN and double DQN.

For more details, see the following:

@article{Mnih2013,
 author    = {Volodymyr Mnih and
              Koray Kavukcuoglu and
              David Silver and
              Alex Graves and
              Ioannis Antonoglou and
              Daan Wierstra and
              Martin A. Riedmiller},
 title     = {Playing Atari with Deep Reinforcement Learning},
 journal   = {CoRR},
 year      = {2013},
 url       = {http://arxiv.org/abs/1312.5602}
}
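
As a rough illustration of how DQN and double DQN differ, the sketch below computes the bootstrap target for one transition in both styles, as the two targets are commonly formulated in the literature. It is not taken from q_learning.hpp: the function names, the plain std::vector representation of action values, and the split into a "learning" and a "target" network are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Index of the largest action value (the first one on ties).
inline std::size_t ArgMax(const std::vector<double>& values)
{
  assert(!values.empty());
  return static_cast<std::size_t>(
      std::max_element(values.begin(), values.end()) - values.begin());
}

// DQN-style target: the target network both selects and evaluates the
// next action, i.e. reward + discount * max_a targetNext[a].
inline double DqnTarget(double reward, double discount, bool terminal,
                        const std::vector<double>& targetNext)
{
  if (terminal)
    return reward;  // No bootstrapping past the end of an episode.
  return reward + discount * targetNext[ArgMax(targetNext)];
}

// Double-DQN-style target: the learning network selects the next action,
// and the target network evaluates that action.
inline double DoubleDqnTarget(double reward, double discount, bool terminal,
                              const std::vector<double>& learningNext,
                              const std::vector<double>& targetNext)
{
  assert(learningNext.size() == targetNext.size());
  if (terminal)
    return reward;
  return reward + discount * targetNext[ArgMax(learningNext)];
}
```

With reward 1.0, discount 0.5, target-network values {2.0, 4.0} and learning-network values {3.0, 1.0}, the first function bootstraps from 4.0 and the second from 2.0, which shows how decoupling selection from evaluation can lower the target.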

Template Parameters

EnvironmentType	The environment of the reinforcement learning task.
NetworkType	The network used to compute action values.
UpdaterType	How gradients are applied during training.
PolicyType	The behavior policy of the agent.
ReplayType	The experience replay method.

Definition at line 57 of file q_learning.hpp.

Member Typedef Documentation

◆ ActionType

using ActionType = typename EnvironmentType::Action

Convenience typedef for the action type.

Definition at line 64 of file q_learning.hpp.

◆ StateType

using StateType = typename EnvironmentType::State

Convenience typedef for the state type.

Definition at line 61 of file q_learning.hpp.
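
Both aliases only require that EnvironmentType exposes nested State and Action types. The sketch below shows how they resolve for a made-up environment. ToyEnvironment and AgentSketch are hypothetical names introduced for this example; neither is an mlpack class, and the members of State and Action are placeholders.

```cpp
#include <type_traits>

// Hypothetical environment, for illustration only (not an mlpack class).
struct ToyEnvironment
{
  struct State { double position = 0.0; };
  enum class Action { Left, Right };
};

// Stand-in for the agent class that reproduces only the two aliases.
template<typename EnvironmentType>
struct AgentSketch
{
  using StateType = typename EnvironmentType::State;
  using ActionType = typename EnvironmentType::Action;
};

// The aliases name exactly the environment's nested types.
static_assert(std::is_same<AgentSketch<ToyEnvironment>::StateType,
                           ToyEnvironment::State>::value,
              "StateType should be the environment's State");
static_assert(std::is_same<AgentSketch<ToyEnvironment>::ActionType,
                           ToyEnvironment::Action>::value,
              "ActionType should be the environment's Action");
```

An environment that lacks either nested type would fail to compile at the point where the agent template is instantiated.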

Constructor & Destructor Documentation

◆ QLearning()

QLearning	(	TrainingConfig	config,
		NetworkType	network,
		PolicyType	policy,
		ReplayType	replayMethod,
		UpdaterType	updater = UpdaterType(),
		EnvironmentType	environment = EnvironmentType()
	)

Create the QLearning object with the given settings.

All arguments are taken by value. If you no longer need the original object after passing it in, wrap it in std::move to avoid an unnecessary copy.

Parameters

config	Hyper-parameters for training.
network	The network used to compute action values.
policy	The behavior policy of the agent.
replayMethod	The experience replay method.
updater	How gradients are applied during training.
environment	The reinforcement learning task.
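
The effect of std::move on a by-value constructor parameter can be seen with a small counting type. This is a sketch under assumptions: TrackedNetwork and AgentSketch are hypothetical stand-ins rather than mlpack classes, and the sketch assumes the constructor moves its parameter into a member.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for a heavy object such as a network.
// `copies` records how often this object's lineage has been copied.
struct TrackedNetwork
{
  int copies = 0;
  TrackedNetwork() = default;
  TrackedNetwork(const TrackedNetwork& other) : copies(other.copies + 1) {}
  TrackedNetwork(TrackedNetwork&& other) noexcept : copies(other.copies) {}
};

// Hypothetical agent with a by-value constructor parameter that is
// moved into the member (an assumption about the real class).
template<typename NetworkType>
class AgentSketch
{
 public:
  explicit AgentSketch(NetworkType network) : network(std::move(network)) {}
  const NetworkType& Network() const { return network; }

 private:
  NetworkType network;
};

// Passing a named object copies it once into the parameter.
inline int CopiesWhenPassedByName()
{
  TrackedNetwork net;
  AgentSketch<TrackedNetwork> agent(net);
  assert(net.copies == 0);  // The caller's object is untouched.
  return agent.Network().copies;
}

// Passing with std::move transfers the object without any copy.
inline int CopiesWhenMoved()
{
  TrackedNetwork net;
  AgentSketch<TrackedNetwork> agent(std::move(net));
  return agent.Network().copies;
}
```

In this sketch CopiesWhenPassedByName() reports one copy and CopiesWhenMoved() reports none. After the move, the caller's object should be treated as discarded.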

Member Function Documentation

◆ Deterministic() [1/2]

bool& Deterministic ( )

inline

Modify the training mode / test mode indicator. Returns a mutable reference to the flag, so it can be assigned to directly.

Definition at line 104 of file q_learning.hpp.

◆ Deterministic() [2/2]

const bool& Deterministic ( ) const

inline

Get the training mode / test mode indicator. Returns a read-only reference to the flag.

Definition at line 106 of file q_learning.hpp.
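
The two Deterministic() overloads follow the usual reference-accessor pattern: the non-const overload allows assignment, the const overload only allows reading. The sketch below reproduces just that mechanism. ModeFlagSketch is a hypothetical class, its default value of false is an assumption, and which value corresponds to training or test mode is not stated on this page.

```cpp
#include <cassert>

// Hypothetical class reproducing only the accessor pair.
class ModeFlagSketch
{
 public:
  // Mutable access: callers can write `agent.Deterministic() = true;`.
  bool& Deterministic() { return deterministic; }
  // Read-only access, selected when the object is const.
  const bool& Deterministic() const { return deterministic; }

 private:
  bool deterministic = false;  // Default chosen for the sketch only.
};

// Writes through the non-const overload, reads through the const one.
inline bool SetThenRead(bool value)
{
  ModeFlagSketch agent;
  agent.Deterministic() = value;
  const ModeFlagSketch& view = agent;
  return view.Deterministic();
}
```

Because the accessor returns a reference rather than a copy, the assignment changes the flag stored in the object itself.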

◆ Episode()

double Episode ( )

Execute an episode.

Returns: The return (accumulated reward) of the episode.

◆ Step()

double Step ( )

Execute a step in an episode.

Returns: Reward for the step.

◆ TotalSteps()

const size_t& TotalSteps ( ) const

inline

Returns: The total number of steps taken since the beginning.

Definition at line 101 of file q_learning.hpp.
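
The relationship between Episode(), Step() and TotalSteps() can be sketched with a toy agent. Everything here is hypothetical: EpisodeSketch is not an mlpack class, the reward is a constant 1.0, an episode ends after a fixed number of steps, and the episode return is assumed to be the undiscounted sum of the step rewards.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical toy agent; it learns nothing and only counts steps.
class EpisodeSketch
{
 public:
  explicit EpisodeSketch(std::size_t horizon) : horizon(horizon) {}

  // One interaction with the toy task; returns the reward for the step.
  double Step()
  {
    ++stepsInEpisode;
    ++totalSteps;
    return 1.0;  // Constant reward, chosen for the sketch.
  }

  // Runs steps until the toy terminal condition and sums their rewards.
  double Episode()
  {
    stepsInEpisode = 0;
    double totalReturn = 0.0;
    while (stepsInEpisode < horizon)
      totalReturn += Step();
    return totalReturn;
  }

  // Steps taken across all episodes since construction.
  const std::size_t& TotalSteps() const { return totalSteps; }

 private:
  std::size_t horizon;
  std::size_t stepsInEpisode = 0;
  std::size_t totalSteps = 0;
};

// Runs several episodes and reports the accumulated step counter.
inline std::size_t TotalStepsAfter(std::size_t horizon, std::size_t episodes)
{
  EpisodeSketch agent(horizon);
  for (std::size_t i = 0; i < episodes; ++i)
    agent.Episode();
  return agent.TotalSteps();
}
```

In the sketch the step counter keeps growing across episodes, while the per-episode return starts from zero each time.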

The documentation for this class was generated from the following file:

src/mlpack/methods/reinforcement_learning/q_learning.hpp
