Implementation of Acrobot game. More...

Classes

class  State
 

Public Types

enum  Action { negativeTorque, zeroTorque, positiveTorque, size }
 

Public Member Functions

 Acrobot (const double gravity=9.81, const double linkLength1=1.0, const double linkLength2=1.0, const double linkMass1=1.0, const double linkMass2=1.0, const double linkCom1=0.5, const double linkCom2=0.5, const double linkMoi=1.0, const double maxVel1=4 *M_PI, const double maxVel2=9 *M_PI, const double dt=0.2, const double doneReward=0)
 Construct an Acrobot instance using the given constants. More...

 
arma::colvec Dsdt (arma::colvec state, const double torque) const
 These are the ordinary differential equations required to estimate nextState with the RK4 method. More...

 
State InitialSample () const
 This function randomly initializes the state. More...

 
bool IsTerminal (const State &state) const
 This function checks if the acrobot has reached the terminal state. More...

 
arma::colvec Rk4 (const arma::colvec state, const double torque) const
 This function calls the RK4 iterative method to estimate the next state based on the given ordinary differential equations. More...

 
double Sample (const State &state, const Action &action, State &nextState) const
 Dynamics of the Acrobot System. More...

 
double Sample (const State &state, const Action &action) const
 Dynamics of the Acrobot System. More...

 
double Torque (const Action &action) const
 This function calculates the torque for a particular action. More...

 
double Wrap (double value, const double minimum, const double maximum) const
 The Wrap function is required to keep the angle value within the range -180 to 180. More...

 

Detailed Description

Implementation of Acrobot game.

Acrobot is a 2-link pendulum with only the second joint actuated. Initially, both links point downwards. The goal is to swing the end-effector to a height at least the length of one link above the base. Both links can swing freely and can pass by each other, i.e., they don't collide when they have the same angle.

Definition at line 28 of file acrobot.hpp.

Member Enumeration Documentation

◆ Action

enum Action
Enumerator
negativeTorque 
zeroTorque 
positiveTorque 
size 

Definition at line 88 of file acrobot.hpp.

Constructor & Destructor Documentation

◆ Acrobot()

Acrobot ( const double  gravity = 9.81,
const double  linkLength1 = 1.0,
const double  linkLength2 = 1.0,
const double  linkMass1 = 1.0,
const double  linkMass2 = 1.0,
const double  linkCom1 = 0.5,
const double  linkCom2 = 0.5,
const double  linkMoi = 1.0,
const double  maxVel1 = 4 * M_PI,
const double  maxVel2 = 9 * M_PI,
const double  dt = 0.2,
const double  doneReward = 0 
)
inline

Construct an Acrobot instance using the given constants.

Parameters
gravity      The gravity parameter.
linkLength1  The length of link 1.
linkLength2  The length of link 2.
linkMass1    The mass of link 1.
linkMass2    The mass of link 2.
linkCom1     The position of the center of mass of link 1.
linkCom2     The position of the center of mass of link 2.
linkMoi      The moment of inertia for both links.
maxVel1      The maximum angular velocity of link 1.
maxVel2      The maximum angular velocity of link 2.
dt           The time step (differential value) used for integration.
doneReward   The reward returned when a terminal state is reached.

Definition at line 113 of file acrobot.hpp.

Member Function Documentation

◆ Dsdt()

arma::colvec Dsdt ( arma::colvec  state,
const double  torque 
) const
inline

These are the ordinary differential equations required to estimate nextState with the RK4 method.

Parameters
state   The current state.
torque  The torque applied.

Definition at line 218 of file acrobot.hpp.

References M_PI.

Referenced by Acrobot::Rk4().

◆ InitialSample()

State InitialSample ( ) const
inline

This function randomly initializes the state.

Definition at line 195 of file acrobot.hpp.

References Acrobot::State::State().

◆ IsTerminal()

bool IsTerminal ( const State & state) const
inline

This function checks if the acrobot has reached the terminal state.

Parameters
state  The current state.

Definition at line 205 of file acrobot.hpp.

References Acrobot::State::Theta1(), and Acrobot::State::Theta2().

Referenced by Acrobot::Sample().
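
The goal stated in the class description (end-effector at least one link length above the base) can be expressed as a height test on the two angles. The sketch below is not taken from acrobot.hpp. It assumes unit link lengths, angles in radians, Theta1 measured from the downward vertical, and Theta2 measured relative to the first link. TipHeight and ReachedGoal are hypothetical names.

```cpp
#include <cmath>

// Height of the end-effector relative to the base joint.
// Assumptions (not confirmed by this page): both links have length 1,
// theta1 is measured from the downward vertical, and theta2 is measured
// relative to the first link.
double TipHeight(const double theta1, const double theta2)
{
  return -std::cos(theta1) - std::cos(theta1 + theta2);
}

// The goal is reached once the tip is at least one link length above the base.
bool ReachedGoal(const double theta1, const double theta2)
{
  return TipHeight(theta1, theta2) >= 1.0;
}
```

With both angles at zero the links hang straight down, so the tip sits two link lengths below the base and the test fails. With the first link pointing straight up and the second aligned with it, the tip is two link lengths above the base and the test passes.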

◆ Rk4()

arma::colvec Rk4 ( const arma::colvec  state,
const double  torque 
) const
inline

This function calls the RK4 iterative method to estimate the next state based on the given ordinary differential equations.

Parameters
state   The current state.
torque  The torque applied.

Definition at line 305 of file acrobot.hpp.

References Acrobot::Dsdt().

Referenced by Acrobot::Sample().

◆ Sample() [1/2]

double Sample ( const State & state,
const Action & action,
State & nextState 
) const
inline

Dynamics of the Acrobot System.

Computes the reward and the next state from the current state and the current action. Always returns a reward of -1.

Parameters
state      The current state.
action     The action taken.
nextState  The next state.
Returns
The reward, which is always -1.0.

The angular velocity is bounded between its minimum and maximum values.

If the acrobot reaches a terminal state, it should be given a positive reward. This will ensure that the agent learns the goal of the game.

Definition at line 148 of file acrobot.hpp.

References Acrobot::State::AngularVelocity1(), Acrobot::State::AngularVelocity2(), Acrobot::IsTerminal(), M_PI, Acrobot::Rk4(), Acrobot::State::Theta1(), Acrobot::State::Theta2(), Acrobot::Torque(), and Acrobot::Wrap().

Referenced by Acrobot::Sample().

◆ Sample() [2/2]

double Sample ( const State & state,
const Action & action 
) const
inline

Dynamics of the Acrobot System.

Computes the reward for taking a particular action in the current state. This function calls the three-argument Sample overload to estimate the next state and returns the resulting reward.

Parameters
state   The current state.
action  The action taken.

Definition at line 186 of file acrobot.hpp.

References Acrobot::Sample().

◆ Torque()

double Torque ( const Action & action) const
inline

This function calculates the torque for a particular action.

0 : negative torque, 1 : zero torque, 2 : positive torque.

Parameters
action  The action taken.

Definition at line 291 of file acrobot.hpp.

References mlpack::math::Random().

Referenced by Acrobot::Sample().
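
One way to realize the mapping "0 : negative torque, 1 : zero torque, 2 : positive torque" is to subtract one from the enumerator value. The sketch below does that and optionally adds uniform noise, since this function references a random number generator. The unit torque magnitudes, the noise amplitude parameter, and the names BaseTorque and NoisyTorque are assumptions for illustration, not details confirmed by this page.

```cpp
#include <random>

enum Action { negativeTorque, zeroTorque, positiveTorque, size };

// Map the enumerator values 0, 1, 2 onto the torques -1, 0, +1.
// The unit magnitude is an assumption made for this sketch.
double BaseTorque(const Action action)
{
  return static_cast<double>(action) - 1.0;
}

// Add uniform noise in [-amplitude, amplitude) to the base torque.
// The amplitude is an illustrative parameter, not a value from acrobot.hpp.
double NoisyTorque(const Action action,
                   const double amplitude,
                   std::mt19937& rng)
{
  std::uniform_real_distribution<double> noise(-amplitude, amplitude);
  return BaseTorque(action) + noise(rng);
}
```

Because size is the last enumerator, its integer value equals the number of real actions, which is convenient when sizing an agent's output layer or looping over all actions.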

◆ Wrap()

double Wrap ( double  value,
const double  minimum,
const double  maximum 
) const
inline

The Wrap function is required to keep the angle value within the range -180 to 180.

This function will make sure that value will always be between minimum to maximum.

Parameters
value    Scalar value to wrap.
minimum  Minimum range of wrap.
maximum  Maximum range of wrap.

Definition at line 267 of file acrobot.hpp.

Referenced by Acrobot::Sample().
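
As an illustration of wrapping, the sketch below maps any value into the half-open interval [minimum, maximum) using a floating-point remainder. It is one possible way to wrap a value and is not claimed to match the implementation in acrobot.hpp. WrapToRange is a hypothetical name, and the function assumes maximum > minimum.

```cpp
#include <cmath>

// Wrap value into the half-open interval [minimum, maximum).
// Assumes maximum > minimum.
double WrapToRange(const double value,
                   const double minimum,
                   const double maximum)
{
  const double width = maximum - minimum;
  // std::fmod keeps the sign of its first argument, so shift negative
  // remainders up by one full width.
  double shifted = std::fmod(value - minimum, width);
  if (shifted < 0)
    shifted += width;
  return shifted + minimum;
}
```

For example, wrapping 190 into the range -180 to 180 gives -170, and wrapping -190 gives 170. Unlike truncation, the result keeps the same angular position rather than snapping to the nearest bound.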


The documentation for this class was generated from the following file:
  • /home/ryan/src/mlpack.org/_src/mlpack-3.1.0/src/mlpack/methods/reinforcement_learning/environment/acrobot.hpp