Implementation of various Q-Learning algorithms, such as DQN and double DQN.

Public Types
using	ActionType = typename EnvironmentType::Action
	Convenience alias for the environment's action type.

using	StateType = typename EnvironmentType::State
	Convenience alias for the environment's state type.

Public Member Functions
	QLearning (TrainingConfig config, NetworkType network, PolicyType policy, ReplayType replayMethod, UpdaterType updater=UpdaterType(), EnvironmentType environment=EnvironmentType())
	Create the QLearning object with the given settings.

	~QLearning ()
	Clean up memory.

bool &	Deterministic ()
	Modify the training mode / test mode indicator.

const bool &	Deterministic () const
	Get the training mode / test mode indicator.

EnvironmentType &	Environment ()
	Modify the environment the agent is in.

const EnvironmentType &	Environment () const
	Get the environment the agent is in.

double	Episode ()
	Execute an episode.

const NetworkType &	Network () const
	Return the learning network.

NetworkType &	Network ()
	Modify the learning network.

StateType &	State ()
	Modify the state of the agent.

const StateType &	State () const
	Get the state of the agent.

double	Step ()
	Execute a step in an episode.

const size_t &	TotalSteps () const
	Get the total number of steps taken since the beginning.

Detailed Description

template<typename EnvironmentType, typename NetworkType, typename UpdaterType, typename PolicyType, typename ReplayType = RandomReplay<EnvironmentType>>
class mlpack::rl::QLearning< EnvironmentType, NetworkType, UpdaterType, PolicyType, ReplayType >

Implementation of various Q-Learning algorithms, such as DQN and double DQN.

For more details, see the following:

@article{Mnih2013,
 author    = {Volodymyr Mnih and
              Koray Kavukcuoglu and
              David Silver and
              Alex Graves and
              Ioannis Antonoglou and
              Daan Wierstra and
              Martin A. Riedmiller},
 title     = {Playing Atari with Deep Reinforcement Learning},
 journal   = {CoRR},
 year      = {2013},
 url       = {http://arxiv.org/abs/1312.5602}
}
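As a rough illustration of how DQN and double DQN differ, the sketch below computes the one-step learning target in each variant, using the commonly used formulation with a separate target network. It is not taken from q_learning.hpp: the function names (ArgMax, DqnTarget, DoubleDqnTarget) and the use of std::vector for action values are assumptions made for this example only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Index of the largest action value (the first one on ties).
// Hypothetical helper, not part of mlpack.
std::size_t ArgMax(const std::vector<double>& q)
{
  assert(!q.empty());
  return static_cast<std::size_t>(
      std::distance(q.begin(), std::max_element(q.begin(), q.end())));
}

// DQN target: the target network both selects and evaluates the next action.
double DqnTarget(double reward, double discount, bool terminal,
                 const std::vector<double>& targetNextQ)
{
  if (terminal)
    return reward;
  return reward + discount * targetNextQ[ArgMax(targetNextQ)];
}

// Double DQN target: the learning network selects the next action and the
// target network evaluates it.
double DoubleDqnTarget(double reward, double discount, bool terminal,
                       const std::vector<double>& learningNextQ,
                       const std::vector<double>& targetNextQ)
{
  assert(learningNextQ.size() == targetNextQ.size());
  if (terminal)
    return reward;
  return reward + discount * targetNextQ[ArgMax(learningNextQ)];
}
```

When the two networks disagree about the best next action, the double DQN target is never larger than the DQN target, which is the usual motivation for the variant.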

Template Parameters

EnvironmentType	The environment of the reinforcement learning task.
NetworkType	The network used to compute action values.
UpdaterType	How to apply gradients when training.
PolicyType	Behavior policy of the agent.
ReplayType	Experience replay method.

Definition at line 58 of file q_learning.hpp.

Member Typedef Documentation

◆ ActionType

using ActionType = typename EnvironmentType::Action

Convenience alias for the environment's action type.

Definition at line 65 of file q_learning.hpp.

◆ StateType

using StateType = typename EnvironmentType::State

Convenience alias for the environment's state type.

Definition at line 62 of file q_learning.hpp.

Constructor & Destructor Documentation

◆ QLearning()

QLearning	(	TrainingConfig	config,
		NetworkType	network,
		PolicyType	policy,
		ReplayType	replayMethod,
		UpdaterType	updater = UpdaterType(),
		EnvironmentType	environment = EnvironmentType()
	)

Create the QLearning object with the given settings.

The constructor takes its arguments by value. If you do not need the original object after passing it in, wrap it in std::move to avoid an unnecessary copy.

Parameters

config	Hyper-parameters for training.
network	The network used to compute action values.
policy	Behavior policy of the agent.
replayMethod	Experience replay method.
updater	How to apply gradients when training.
environment	Reinforcement learning task.
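The advice about std::move can be illustrated with a small, self-contained sketch. ToyNetwork and ToyAgent are hypothetical stand-ins invented for this example, not mlpack types; the only thing they share with the constructor above is that the argument is taken by value. ToyNetwork counts how often it is copied.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a large object such as a network.
// The static counter records every copy construction.
struct ToyNetwork
{
  static int copies;
  std::vector<double> weights;

  explicit ToyNetwork(std::size_t n) : weights(n, 0.0) {}
  ToyNetwork(const ToyNetwork& other) : weights(other.weights) { ++copies; }
  ToyNetwork(ToyNetwork&&) noexcept = default;
};
int ToyNetwork::copies = 0;

// Hypothetical agent whose constructor takes its argument by value.
class ToyAgent
{
 public:
  // The by-value parameter is moved into the member, so the only possible
  // copy happens at the call site.
  explicit ToyAgent(ToyNetwork net) : network(std::move(net))
  {
    assert(!network.weights.empty());
  }

  const ToyNetwork& Network() const { return network; }

 private:
  ToyNetwork network;
};
```

Under these assumptions, `ToyAgent agent(net);` copies `net` once, while `ToyAgent agent(std::move(net));` performs no copy and leaves `net` in a moved-from state that should not be relied on afterwards.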

◆ ~QLearning()

~QLearning ( )

Clean up memory.

Member Function Documentation

◆ Deterministic() [1/2]

bool& Deterministic ( )

inline

Modify the training mode / test mode indicator.

Definition at line 120 of file q_learning.hpp.

◆ Deterministic() [2/2]

const bool& Deterministic ( ) const

inline

Get the training mode / test mode indicator.

Definition at line 122 of file q_learning.hpp.
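Both overloads return a reference, so the non-const one is used by assigning to the result of the call. The sketch below mirrors that accessor pair on a hypothetical ToyModeAgent class; it assumes that true corresponds to test mode, which this page does not state explicitly.

```cpp
#include <cassert>

// Hypothetical minimal class mirroring the two Deterministic() overloads.
class ToyModeAgent
{
 public:
  // Non-const overload: returns a reference that can be assigned to.
  bool& Deterministic() { return deterministic; }
  // Const overload: read-only access.
  const bool& Deterministic() const { return deterministic; }

 private:
  bool deterministic = false;
};

// Reading through a const reference selects the const overload.
bool IsDeterministic(const ToyModeAgent& agent)
{
  return agent.Deterministic();
}

// Switch the agent to test mode (assumption: true means test mode).
void EnableTestMode(ToyModeAgent& agent)
{
  agent.Deterministic() = true;
  assert(IsDeterministic(agent));
}
```

The same assign-through-reference pattern applies to the other non-const accessors on this page, such as Environment(), Network() and State().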

◆ Environment() [1/2]

EnvironmentType& Environment ( )

inline

Modify the environment the agent is in.

Definition at line 115 of file q_learning.hpp.

◆ Environment() [2/2]

const EnvironmentType& Environment ( ) const

inline

Get the environment the agent is in.

Definition at line 117 of file q_learning.hpp.

◆ Episode()

double Episode ( )

Execute an episode.

Returns: The return (cumulative reward) of the episode.
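One way to picture how Episode(), Step() and TotalSteps() relate is the toy class below. It is a sketch under stated assumptions, not the code in q_learning.hpp: ToyEpisodeAgent is hypothetical, it replays a fixed list of rewards instead of interacting with an environment, the return is assumed to be the undiscounted sum of step rewards, and the step counter is assumed to accumulate across episodes.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical toy agent: replays a fixed list of rewards, one per step.
class ToyEpisodeAgent
{
 public:
  explicit ToyEpisodeAgent(std::vector<double> stepRewards) :
      rewards(std::move(stepRewards)) {}

  // Execute one step and return its reward.
  double Step()
  {
    assert(!Terminal());
    ++totalSteps;
    return rewards[position++];
  }

  // Execute steps until the episode terminates and return the summed reward.
  double Episode()
  {
    position = 0;  // Start again from the initial state.
    double totalReturn = 0.0;
    while (!Terminal())
      totalReturn += Step();
    return totalReturn;
  }

  // The episode ends once every reward has been replayed.
  bool Terminal() const { return position >= rewards.size(); }

  // Steps taken since construction, across all episodes.
  const std::size_t& TotalSteps() const { return totalSteps; }

 private:
  std::vector<double> rewards;
  std::size_t position = 0;
  std::size_t totalSteps = 0;
};
```

A typical training loop would call Episode() repeatedly and track the returned values, for example as a running average, to decide when to stop.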

◆ Network() [1/2]

const NetworkType& Network ( ) const

inline

Return the learning network.

Definition at line 125 of file q_learning.hpp.

◆ Network() [2/2]

NetworkType& Network ( )

inline

Modify the learning network.

Definition at line 127 of file q_learning.hpp.

◆ State() [1/2]

StateType& State ( )

inline

Modify the state of the agent.

Definition at line 110 of file q_learning.hpp.

◆ State() [2/2]

const StateType& State ( ) const

inline

Get the state of the agent.

Definition at line 112 of file q_learning.hpp.

◆ Step()

double Step ( )

Execute a step in an episode.

Returns: Reward for the step.

◆ TotalSteps()

const size_t& TotalSteps ( ) const

inline

Returns: Total number of steps taken since the beginning.

Definition at line 107 of file q_learning.hpp.

The documentation for this class was generated from the following file:

src/mlpack/methods/reinforcement_learning/q_learning.hpp
