@rl-js/redux-mdp

0.9.5 • Public • Published

Classes

MdpFactory ⇐ EnvironmentFactory

Class for constructing an Environment implemented as a ReduxMDP

ReduxMDP ⇐ Environment

Class representing an Environment as an MDP using Redux.

Typedefs

State : *

The underlying state representation of the environment. Should be a serializable object, i.e. state => JSON.parse(JSON.stringify(state)) should be the identity function.

MdpAction : *

An object representing an action in an MDP. The type is specific to the MDP.

Observation : *

An object representing the observation of an agent in the current state. The type is specific to the MDP.

ReduxAction : Object

A Redux action, e.g. a Flux Standard Action (https://github.com/redux-utilities/flux-standard-action). Your MdpAction will be converted into a ReduxAction by resolveAction.

reducer ⇒ State

A Redux reducer. Computes the next state without mutating the previous state object.

getObservation ⇒ Observation

A function to get the observation of the agent given the current state.

computeReward ⇒ number

A function to compute the reward given a state transition, i.e. (s, a, s'). This function should be completely deterministic; any non-determinism should be handled by resolveAction.

isTerminated ⇒ boolean

A function to compute whether the environment is terminated, i.e. the current episode is over.

resolveAction ⇒ ReduxAction

A function to resolve an MdpAction into a ReduxAction. Any non-determinism in your environment should go here, as your Redux reducer should be completely deterministic.

MdpFactory ⇐ EnvironmentFactory

Class for constructing an Environment implemented as a ReduxMDP

Kind: global class
Extends: EnvironmentFactory

new MdpFactory(params)

Create a factory for a particular MDP

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| params | object | | Parameters for constructing the MDP |
| params.reducer | Reducer | | Redux reducer representing the state of the MDP |
| params.getObservation | getObservation | | Compute the current observation |
| params.computeReward | computeReward | | Compute the current reward |
| params.isTerminated | isTerminated | | Compute whether the environment is terminated |
| [params.resolveAction] | resolveAction | | Resolve the MdpAction into a ReduxAction |
| [params.gamma] | number | 1 | Reward discounting factor for the MDP |
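As a rough illustration of how these parameters fit together, the sketch below assembles a `params` object for a hypothetical one-dimensional corridor MDP (the agent moves left or right until it reaches `GOAL`). The corridor, the `MOVE` action type, and the `step` helper are all assumptions made for this example. In particular, `step` only guesses at the order in which the callbacks might be invoked; it is not the library's implementation. The export name in the commented-out constructor call is also assumed.

```javascript
// Hypothetical corridor MDP: start at position 0, episode ends at GOAL.
const GOAL = 3;

const params = {
  // Pure reducer: returns a new state object, never mutates the old one.
  reducer: (state = { position: 0 }, action) =>
    action.type === "MOVE"
      ? { ...state, position: state.position + action.payload.delta }
      : state,
  getObservation: (state) => ({ position: state.position }),
  computeReward: (state, action, nextState) =>
    nextState.position === GOAL ? 10 : -1,
  isTerminated: (state, action, nextState, time) =>
    nextState.position === GOAL,
  // Deterministic here; a stochastic environment would sample in this function.
  resolveAction: (state, action) => ({
    type: "MOVE",
    payload: { delta: action === "right" ? 1 : -1 },
  }),
  gamma: 0.99,
};

// Assumed usage (export name not confirmed by this README):
// const { MdpFactory } = require("@rl-js/redux-mdp");
// const environment = new MdpFactory(params).createEnvironment();

// Hand-rolled single transition, for illustration only.
function step(state, mdpAction, time) {
  const reduxAction = params.resolveAction(state, mdpAction);
  const nextState = params.reducer(state, reduxAction);
  return {
    nextState,
    observation: params.getObservation(nextState),
    reward: params.computeReward(state, reduxAction, nextState),
    terminated: params.isTerminated(state, reduxAction, nextState, time),
  };
}
```

Calling `step({ position: 2 }, "right", 0)` in this sketch moves the agent onto the goal, so the transition is rewarded and marked as terminal.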

mdpFactory.createEnvironment() ⇒ ReduxMDP

Create an instance of the environment.

Kind: instance method of MdpFactory

mdpFactory.setMdpMiddleware(middleware)

Configure any MdpMiddleware that should be part of the next invocation of createEnvironment()

Kind: instance method of MdpFactory

| Param | Type |
| --- | --- |
| middleware | function |

mdpFactory.setReduxMiddleware(middleware)

Configure any ReduxMiddleware that should be part of the next invocation of createEnvironment()

Kind: instance method of MdpFactory

| Param | Type |
| --- | --- |
| middleware | function |
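For reference, a standard Redux middleware has the curried shape `store => next => action => result`. The sketch below shows a hypothetical `recordActions` middleware that logs each action type before passing the action along. This README does not say whether `setReduxMiddleware` expects a single middleware of this shape or something already composed, so treat the commented-out call as an assumption to verify against the source.

```javascript
// Collected action types, in dispatch order.
const actionLog = [];

// Standard Redux middleware signature: store => next => action => result.
const recordActions = (store) => (next) => (action) => {
  actionLog.push(action.type);
  // Always forward the action so the reducer still runs.
  return next(action);
};

// Assumed usage (argument shape not confirmed by this README):
// mdpFactory.setReduxMiddleware(recordActions);
```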

ReduxMDP ⇐ Environment

Class representing an Environment as an MDP using Redux.

Kind: global class
Extends: Environment

State : *

The underlying state representation of the environment. Should be a serializable object, i.e. state => JSON.parse(JSON.stringify(state)) should be the identity function.

Kind: global typedef
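One way to check the serializability requirement during development is to round-trip a state through JSON and compare the result with the original. The helper below, `isSerializableState`, is a hypothetical name introduced for this sketch and is not part of the package; it relies on Node's built-in `util.isDeepStrictEqual`.

```javascript
const { isDeepStrictEqual } = require("util");

// Returns true when the JSON round trip reproduces the state exactly.
const isSerializableState = (state) =>
  isDeepStrictEqual(JSON.parse(JSON.stringify(state)), state);

// Plain objects, arrays, numbers, strings and booleans survive the round trip.
const plainState = { position: 0, visited: [0, 1] };

// Dates become strings and Sets become empty objects, so these do not.
const stateWithDate = { createdAt: new Date(0) };
const stateWithSet = { seen: new Set([1]) };
```

In this sketch, `isSerializableState(plainState)` holds, while the states containing a `Date` or a `Set` fail the check.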

MdpAction : *

An object representing an action in an MDP. The type is specific to the MDP.

Kind: global typedef

Observation : *

An object representing the observation of an agent in the current state. The type is specific to the MDP.

Kind: global typedef

ReduxAction : Object

A Redux action, e.g. a Flux Standard Action (https://github.com/redux-utilities/flux-standard-action). Your MdpAction will be converted into a ReduxAction by resolveAction.

Kind: global typedef
Properties

| Name | Type | Description |
| --- | --- | --- |
| type | string | Each action must have a type associated with it. |
| [payload] | * | Any data associated with the action goes here. |
| [error] | boolean | Should be true if and only if the action represents an error. |
| [meta] | * | Any data that is not explicitly part of the payload. |
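To make the property list concrete, here are two example actions plus a small shape check. The `MOVE` type, the payload contents, and the `isFluxStandardAction` helper are hypothetical and exist only for this sketch; the check mirrors the four properties listed above rather than any validation the package itself performs.

```javascript
// A normal action: `type` is required, `payload` carries the data.
const move = { type: "MOVE", payload: { delta: 1 } };

// An error action: `error` is true and the payload describes the failure.
const moveFailed = {
  type: "MOVE",
  payload: new Error("blocked by wall"),
  error: true,
  meta: { timestep: 4 },
};

const ALLOWED_KEYS = ["type", "payload", "error", "meta"];

// True when the value is an object with a string `type` and no extra keys.
const isFluxStandardAction = (action) =>
  typeof action === "object" &&
  action !== null &&
  typeof action.type === "string" &&
  Object.keys(action).every((key) => ALLOWED_KEYS.includes(key));
```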

reducer ⇒ State

A Redux reducer. Computes the next state without mutating the previous state object

Kind: global typedef
Returns: State - The new state object after the action is applied

| Param | Type | Description |
| --- | --- | --- |
| state | State | The current state of the MDP |
| action | ReduxAction | The resolved action for the MDP |
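A minimal reducer sketch for a hypothetical corridor environment follows. The state shape (`{ position }`) and the `MOVE` action type are assumptions for illustration. The point is that the reducer builds a new object instead of modifying `state`, and returns the same object for actions it does not recognize.

```javascript
const initialState = { position: 0 };

function reducer(state = initialState, action) {
  switch (action.type) {
    case "MOVE":
      // Copy the old state and overwrite only the field that changes.
      return { ...state, position: state.position + action.payload.delta };
    default:
      // Unknown actions leave the state untouched.
      return state;
  }
}
```

Because the previous state is never mutated, it can safely be frozen with `Object.freeze` in tests to catch accidental writes.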

getObservation ⇒ Observation

A function to get the observation of the agent given the current state.

Kind: global typedef
Returns: Observation - The observation for the current state

| Param | Type | Description |
| --- | --- | --- |
| state | State | The current state of the MDP |
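The observation does not have to expose the whole state. In the sketch below, a hypothetical state also stores the goal location and a slip probability, but the agent only observes its own position. All field names are assumptions for this example.

```javascript
// Expose only what the agent is allowed to see.
const getObservation = (state) => ({ position: state.position });

// Hypothetical full state with fields hidden from the agent.
const fullState = { position: 2, goal: 3, slipProbability: 0.1 };
const observation = getObservation(fullState);
```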

computeReward ⇒ number

A function to compute the reward given a state transition, i.e. (s, a, s'). This function should be completely deterministic; any non-determinism should be handled by resolveAction.

Kind: global typedef
Returns: number - The reward for the given state transition.

| Param | Type | Description |
| --- | --- | --- |
| state | State | The current state of the MDP |
| action | ReduxAction | The resolved action taken in the current state |
| nextState | State | The next state of the MDP |
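A deterministic reward function for the hypothetical corridor environment might look like the sketch below: a fixed bonus for arriving at the goal and a small cost for every other step. The constants and the state shape are assumptions for illustration.

```javascript
const GOAL = 3;
const GOAL_REWARD = 10;
const STEP_COST = -1;

// Depends only on its arguments, so the same transition always yields the same reward.
const computeReward = (state, action, nextState) =>
  nextState.position === GOAL ? GOAL_REWARD : STEP_COST;
```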

isTerminated ⇒ boolean

A function to compute whether the environment is terminated, i.e. the current episode is over.

Kind: global typedef
Returns: boolean - True if the environment is terminated, false otherwise.

| Param | Type | Description |
| --- | --- | --- |
| state | State | The current state of the MDP |
| action | ReduxAction | The resolved action taken in the current state |
| nextState | State | The next state of the MDP |
| time | number | The current timestep of the MDP, useful for finite horizon MDPs |
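The `time` argument makes it straightforward to combine a goal condition with a finite horizon. In this sketch, `GOAL`, `MAX_STEPS`, and the state shape are hypothetical, and whether `time` starts at 0 or 1 is not stated in this README, so the exact cutoff should be checked against the library's behavior.

```javascript
const GOAL = 3;
const MAX_STEPS = 100;

// The episode ends when the goal is reached or the horizon runs out.
const isTerminated = (state, action, nextState, time) =>
  nextState.position === GOAL || time >= MAX_STEPS;
```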

resolveAction ⇒ ReduxAction

A function to resolve an MdpAction into a ReduxAction. Any non-determinism in your environment should go here, as your Redux reducer should be completely deterministic.

Kind: global typedef
Returns: ReduxAction - The resolved Redux action to dispatch to the reducer

| Param | Type | Description |
| --- | --- | --- |
| state | State | The current state of the MDP |
| action | MdpAction | The MDP action to resolve |
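As a sketch of where randomness could live, the example below resolves a `"left"` or `"right"` MdpAction into a `MOVE` ReduxAction, occasionally slipping in the opposite direction. The `makeResolveAction` wrapper, the slip model, and the action names are assumptions for illustration. Injecting the random source is a design choice that keeps the function easy to test; the returned function still has the `(state, action)` signature documented above.

```javascript
// Builds a resolveAction whose random source can be swapped out in tests.
const makeResolveAction = (random = Math.random, slipProbability = 0.1) =>
  (state, action) => {
    const intended = action === "right" ? 1 : -1;
    // With probability `slipProbability`, the agent moves the wrong way.
    const delta = random() < slipProbability ? -intended : intended;
    return { type: "MOVE", payload: { delta } };
  };

// Production-style resolver using Math.random.
const resolveAction = makeResolveAction();
```

All sampling happens before the reducer runs, so the reducer only ever sees a fully specified `delta`.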

Install

npm i @rl-js/redux-mdp

License

MIT

Collaborators

  • cpnota