openrl.envs.wrappers package

Submodules

openrl.envs.wrappers.atari_wrappers module

class openrl.envs.wrappers.atari_wrappers.ClipRewardEnv(env, cfg=None)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

reward(reward)[source]

Bin reward to {+1, 0, -1} by its sign.
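As a minimal sketch of the sign binning described above: the helper name `clip_reward` is hypothetical and not part of openrl, and the real wrapper may operate on NumPy values instead of plain floats.

```python
def clip_reward(reward: float) -> float:
    """Map a raw reward to -1.0, 0.0 or +1.0 according to its sign."""
    if reward > 0:
        return 1.0
    if reward < 0:
        return -1.0
    return 0.0


# Large score changes and small ones end up with the same magnitude.
clipped = [clip_reward(r) for r in (250.0, 0.0, -3.5)]
```

Binning by sign keeps reward magnitudes comparable across games whose raw scores differ by orders of magnitude.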

class openrl.envs.wrappers.atari_wrappers.EpisodicLifeEnv(env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

reset(**kwargs)[source]

Reset only when lives are exhausted. This way all states are still reachable even though lives are episodic, and the learner need not know about any of this behind-the-scenes.
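The episodic-life idea can be sketched without any library dependency. Everything below is an assumption made for illustration: `LivesEnv` is a hypothetical toy game, and `EpisodicLifeSketch` is not the openrl class, only one plausible way to end episodes on life loss while performing a full reset only on game over.

```python
class LivesEnv:
    """Hypothetical toy game with 2 lives; action 1 loses a life, action 0 is a no-op."""

    def __init__(self):
        self.lives = 0
        self.full_resets = 0

    def reset(self):
        self.lives = 2
        self.full_resets += 1
        return "start", {}

    def step(self, action):
        if action == 1:
            self.lives -= 1
        game_over = self.lives == 0
        return "frame", 0.0, game_over, False, {}


class EpisodicLifeSketch:
    """Sketch only: end the episode on life loss, fully reset only on game over."""

    def __init__(self, env):
        self.env = env
        self.lives = 0
        self.was_real_done = True

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        self.was_real_done = terminated or truncated
        lives = self.env.lives
        if 0 < lives < self.lives:
            # A life was lost but the game continues: report an episode end.
            terminated = True
        self.lives = lives
        return obs, reward, terminated, truncated, info

    def reset(self):
        if self.was_real_done:
            obs, info = self.env.reset()
        else:
            # Continue from the current game state with a no-op step.
            obs, _, _, _, info = self.env.step(0)
        self.lives = self.env.lives
        return obs, info


env = EpisodicLifeSketch(LivesEnv())
env.reset()
_, _, first_loss_done, _, _ = env.step(1)   # lose the first life
env.reset()                                 # does not restart the game
resets_after_life_loss = env.env.full_resets
_, _, game_over_done, _, _ = env.step(1)    # lose the last life
env.reset()                                 # now the game really restarts
resets_after_game_over = env.env.full_resets
```

The learner sees an episode end after each lost life, yet the underlying game is restarted only once all lives are gone.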

step(action)[source]

Uses the step() of the env that can be overwritten to change the returned data.

class openrl.envs.wrappers.atari_wrappers.FireResetEnv(env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

reset(**kwargs)[source]

Uses the reset() of the env that can be overwritten to change the returned data.

class openrl.envs.wrappers.atari_wrappers.NoopResetEnv(env, noop_max=30)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

reset(**kwargs)[source]

Do no-op action for a number of steps in [1, noop_max].
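A rough sketch of such a reset follows. `CountingEnv` and `noop_reset` are hypothetical names, and treating action 0 as the no-op is an assumption borrowed from the usual Atari convention.

```python
import random


class CountingEnv:
    """Hypothetical stand-in env that counts the steps it receives."""

    def __init__(self):
        self.steps = 0

    def reset(self):
        self.steps = 0
        return 0, {}

    def step(self, action):
        self.steps += 1
        return self.steps, 0.0, False, False, {}


def noop_reset(env, noop_max=30, noop_action=0, rng=random):
    """Reset env, then take a random number of no-op steps in [1, noop_max]."""
    obs, info = env.reset()
    noops = rng.randint(1, noop_max)  # inclusive on both ends
    for _ in range(noops):
        obs, _, terminated, truncated, info = env.step(noop_action)
        if terminated or truncated:
            obs, info = env.reset()
    return obs, info


env = CountingEnv()
obs, info = noop_reset(env, noop_max=5, rng=random.Random(0))
```

Randomizing the number of initial no-ops gives the agent a different starting state on each episode.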

class openrl.envs.wrappers.atari_wrappers.WarpFrame(env, width=84, height=84)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

observation(frame)[source]

Returns a modified observation.

Args:

frame: The env observation

Returns:

The modified observation

openrl.envs.wrappers.base_wrapper module

class openrl.envs.wrappers.base_wrapper.BaseObservationWrapper(env, cfg=None, reward_class=None)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

observation(observation: gymnasium.core.ObsType) gymnasium.core.WrapperObsType[source]

Returns a modified observation.

Args:

observation: The env observation

Returns:

The modified observation

reset(*, seed: Optional[int] = None, options: Optional[Dict[str, Any]] = None) Tuple[gymnasium.core.WrapperObsType, Dict[str, Any]][source]

Modifies the env after calling reset(), returning a modified observation using self.observation().

step(action: gymnasium.core.ActType) Tuple[gymnasium.core.WrapperObsType, SupportsFloat, bool, bool, Dict[str, Any]][source]

Modifies the env after calling step() using self.observation() on the returned observations.
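The pattern described for reset() and step() can be sketched with plain Python stand-ins. `ToyEnv`, `ObservationWrapperSketch` and `HalveObservation` are hypothetical names used only for illustration; the real class additionally accepts `cfg` and `reward_class`, which are not modelled here.

```python
class ToyEnv:
    """Hypothetical stand-in for a gymnasium environment."""

    def reset(self, *, seed=None, options=None):
        return [1.0, 2.0], {}

    def step(self, action):
        return [3.0, 4.0], 1.0, False, False, {}


class ObservationWrapperSketch:
    """Sketch only: route every observation through self.observation()."""

    def __init__(self, env):
        self.env = env

    def observation(self, observation):
        raise NotImplementedError

    def reset(self, *, seed=None, options=None):
        obs, info = self.env.reset(seed=seed, options=options)
        return self.observation(obs), info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        return self.observation(obs), reward, terminated, truncated, info


class HalveObservation(ObservationWrapperSketch):
    """Subclasses only need to override observation()."""

    def observation(self, observation):
        return [x / 2 for x in observation]


env = HalveObservation(ToyEnv())
first_obs, _ = env.reset(seed=0)
next_obs, reward, *_ = env.step(0)
```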

class openrl.envs.wrappers.base_wrapper.BaseRewardWrapper(env, cfg=None)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

reward(reward: openrl.envs.wrappers.base_wrapper.ArrayType) openrl.envs.wrappers.base_wrapper.ArrayType[source]

Returns a modified environment reward.

Args:

reward: The env step() reward

Returns:

The modified reward

step(action: gymnasium.core.ActType) Tuple[gymnasium.core.ObsType, SupportsFloat, bool, bool, Dict[str, Any]][source]

Modifies the env step() reward using self.reward().
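In the same spirit, a reward wrapper only needs to override reward(). The classes below are hypothetical stand-ins, not the openrl implementation, and ignore the `cfg` argument.

```python
class ToyEnv:
    """Hypothetical stand-in env that always returns a reward of 10.0."""

    def step(self, action):
        return "obs", 10.0, False, False, {}


class RewardWrapperSketch:
    """Sketch only: pass the step() reward through self.reward()."""

    def __init__(self, env):
        self.env = env

    def reward(self, reward):
        raise NotImplementedError

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        return obs, self.reward(reward), terminated, truncated, info


class ScaleReward(RewardWrapperSketch):
    """Example subclass: divide every reward by 10."""

    def reward(self, reward):
        return reward / 10


obs, scaled, terminated, truncated, info = ScaleReward(ToyEnv()).step(0)
```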

class openrl.envs.wrappers.base_wrapper.BaseWrapper(env, cfg=None, reward_class=None)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

property agent_num

property env_name

property has_auto_reset

set_render_mode(render_mode: Union[None, str])[source]

step(action)[source]

Uses the step() of the env that can be overwritten to change the returned data.

property use_monitor

openrl.envs.wrappers.extra_wrappers module

class openrl.envs.wrappers.extra_wrappers.AddStep(env: gymnasium.core.Env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

observation(observation)[source]

Flattens an observation.

Args:

observation: The observation to flatten

Returns:

The flattened observation

reset(*, seed: Optional[int] = None, options: Optional[Dict[str, Any]] = None) Tuple[gymnasium.core.WrapperObsType, Dict[str, Any]][source]

Modifies the env after calling reset(), returning a modified observation using self.observation().

step(action: gymnasium.core.ActType) Tuple[gymnasium.core.WrapperObsType, SupportsFloat, bool, bool, Dict[str, Any]][source]

Modifies the env after calling step() using self.observation() on the returned observations.

class openrl.envs.wrappers.extra_wrappers.AutoReset(env: gymnasium.core.Env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

property has_auto_reset

class openrl.envs.wrappers.extra_wrappers.ConvertEmptyBoxWrapper(env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

observation(observation)[source]

Returns a modified observation.

Args:

observation: The env observation

Returns:

The modified observation

class openrl.envs.wrappers.extra_wrappers.DictWrapper(env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

observation(observation)[source]

Returns a modified observation.

Args:

observation: The env observation

Returns:

The modified observation

class openrl.envs.wrappers.extra_wrappers.FlattenObservation(env: gymnasium.core.Env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

observation(observation)[source]

Flattens an observation.

Args:

observation: The observation to flatten

Returns:

The flattened observation

class openrl.envs.wrappers.extra_wrappers.FrameSkip(env, num_frames: int = 8)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

step(action)[source]

Uses the step() of the env that can be overwritten to change the returned data.

class openrl.envs.wrappers.extra_wrappers.GIFWrapper(env, gif_path: str, fps: int = 30)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

close()[source]

Closes the wrapper and env.

reset(**kwargs)[source]

Uses the reset() of the env that can be overwritten to change the returned data.

step(action)[source]

Uses the step() of the env that can be overwritten to change the returned data.

class openrl.envs.wrappers.extra_wrappers.MoveActionMask2InfoWrapper(env: gymnasium.core.Env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

reset(**kwargs)[source]

Uses the reset() of the env that can be overwritten to change the returned data.

step(action)[source]

Uses the step() of the env that can be overwritten to change the returned data.

class openrl.envs.wrappers.extra_wrappers.RecordReward(env, cfg=None, reward_class=None)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

property has_auto_reset

class openrl.envs.wrappers.extra_wrappers.RemoveTruncated(env: gymnasium.core.Env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

step(action)[source]

Steps through the environment, returning 5 or 4 items depending on output_truncation_bool.

Args:

action: action to step through the environment with

Returns:

(observation, reward, terminated, truncated, info) or (observation, reward, done, info)
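The relation between the two return shapes can be sketched as follows. `to_done_step_api` is a hypothetical helper, not the function documented here; real implementations may also record the truncation flag in `info`, which is left out of this sketch.

```python
def to_done_step_api(step_returns):
    """Collapse a 5-item step result into the 4-item (obs, reward, done, info) form."""
    if len(step_returns) == 4:
        return step_returns  # already in the done-based form
    obs, reward, terminated, truncated, info = step_returns
    done = terminated or truncated
    return obs, reward, done, info


truncated_step = to_done_step_api(("obs", 1.0, False, True, {}))
running_step = to_done_step_api(("obs", 1.0, False, False, {}))
```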

class openrl.envs.wrappers.extra_wrappers.ZeroRewardWrapper(env, cfg=None)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

reward(reward: openrl.envs.wrappers.base_wrapper.ArrayType) openrl.envs.wrappers.base_wrapper.ArrayType[source]

Returns a modified environment reward.

Args:

reward: The env step() reward

Returns:

The modified reward

openrl.envs.wrappers.extra_wrappers.convert_to_done_step_api(step_returns, is_vector_env: bool = False)[source]

openrl.envs.wrappers.extra_wrappers.step_api_compatibility(step_returns, output_truncation_bool: bool = True, is_vector_env: bool = False)[source]

openrl.envs.wrappers.flatten module

openrl.envs.wrappers.flatten.flatten(space: gymnasium.spaces.space.Space[openrl.envs.wrappers.flatten.T], agent_num: int, x: openrl.envs.wrappers.flatten.T) Union[numpy.ndarray[Any, numpy.dtype[Any]], Dict[str, Any], Tuple[Any, ...], gymnasium.spaces.graph.GraphInstance][source]

Flatten a data point from a space.

This is useful when e.g. points from spaces must be passed to a neural network, which only understands flat arrays of floats.

Args:

space: The space that x is flattened by

x: The value to flatten

Returns:

The flattened datapoint

  • For gymnasium.spaces.Box and gymnasium.spaces.MultiBinary, this is a flattened array

  • For gymnasium.spaces.Discrete and gymnasium.spaces.MultiDiscrete, this is a flattened one-hot array of the sample

  • For gymnasium.spaces.Tuple and gymnasium.spaces.Dict, this is a concatenated array of the flattened subspaces (does not support graph subspaces)

  • For graph spaces, returns GraphInstance where:
    • GraphInstance.nodes are n x k arrays

    • GraphInstance.edges are either:
      • m x k arrays

      • None

    • GraphInstance.edge_links are either:
      • m x 2 arrays

      • None

Raises:

NotImplementedError: If the space is not defined in gymnasium.spaces.

Example:
>>> from gymnasium.spaces import Box, Discrete, Tuple
>>> space = Box(0, 1, shape=(3, 5))
>>> flatten(space, space.sample()).shape
(15,)
>>> space = Discrete(4)
>>> flatten(space, 2)
array([0, 0, 1, 0])
>>> space = Tuple((Box(0, 1, shape=(2,)), Box(0, 1, shape=(3,)), Discrete(3)))
>>> example = ((.5, .25), (1., 0., .2), 1)
>>> flatten(space, example)
array([0.5 , 0.25, 1.  , 0.  , 0.2 , 0.  , 1.  , 0.  ])
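To make the Tuple case concrete, the last result in the example can be reproduced by hand: the two Box values are kept as they are, the Discrete value is one-hot encoded, and the pieces are concatenated. The helper names below are hypothetical and use plain lists instead of NumPy arrays.

```python
def one_hot(index, n):
    """One-hot encode a Discrete(n) sample as a list of floats."""
    return [1.0 if i == index else 0.0 for i in range(n)]


def flatten_tuple_example(box_a, box_b, discrete_value, discrete_n):
    """Concatenate two flat Box samples with a one-hot Discrete sample."""
    return list(box_a) + list(box_b) + one_hot(discrete_value, discrete_n)


flat = flatten_tuple_example((0.5, 0.25), (1.0, 0.0, 0.2), 1, 3)
```

The result has 2 + 3 + 3 = 8 entries, matching the array shown in the example.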

openrl.envs.wrappers.image_wrappers module

class openrl.envs.wrappers.image_wrappers.TransposeImage(env=None, op=[2, 0, 1])[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

observation(ob)[source]

Returns a modified observation.

Args:

ob: The env observation

Returns:

The modified observation
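Assuming `op` is an axis permutation with the same meaning as the axes argument of a NumPy transpose (an assumption, since the docstring does not say so), the default `[2, 0, 1]` would turn a height x width x channel image into channel x height x width. The hypothetical helper below only computes the resulting shape.

```python
def transposed_shape(shape, op=(2, 0, 1)):
    """Return the shape after permuting axes so that new axis i is old axis op[i]."""
    return tuple(shape[axis] for axis in op)


# An 84 x 84 RGB frame in (height, width, channel) layout.
chw_shape = transposed_shape((84, 84, 3))
```

Channel-first layouts are what convolutional layers in PyTorch expect, which is the usual reason for this kind of wrapper.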

openrl.envs.wrappers.mat_wrapper module

class openrl.envs.wrappers.mat_wrapper.MATWrapper(env: openrl.envs.vec_env.base_venv.BaseVecEnv)[source]

Bases: openrl.envs.vec_env.wrappers.base_wrapper.VectorObservationWrapper

observation(observation)[source]

Defines the observation transformation.

Args:

observation (object): the observation from the environment

Returns:

observation (object): the transformed observation

property observation_space

Returns the env observation_space unless it has been overwritten, in which case the wrapper's observation_space is used.

openrl.envs.wrappers.monitor module

class openrl.envs.wrappers.monitor.Monitor(env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

A monitor wrapper for Gym environments. It records the episode reward, length, time and other data.

Parameters

env – The environment

get_episode_lengths() List[int][source]

Returns the number of timesteps of all the episodes

get_episode_rewards() List[float][source]

Returns the rewards of all the episodes

get_episode_times() List[float][source]

Returns the runtime in seconds of all the episodes

get_total_steps() int[source]

Returns the total number of timesteps

reset(**kwargs)[source]

Calls the Gym environment reset.

Parameters

kwargs – Extra keywords saved for the next episode, only if defined by reset_keywords

Returns

the first observation of the environment

step(action: Union[numpy.ndarray, int])[source]

Step the environment with the given action

Parameters

action – the action

Returns

(observation, reward, done, info) or (observation, reward, terminated, truncated, info)
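The bookkeeping behind the getters can be sketched as below. `ThreeStepEnv` and `MonitorSketch` are hypothetical stand-ins; in particular, measuring each episode's time from its own reset is an assumption and may differ from what the real Monitor reports.

```python
import time


class ThreeStepEnv:
    """Hypothetical stand-in env: every episode lasts 3 steps with reward 1.0 per step."""

    def __init__(self):
        self.t = 0

    def reset(self, **kwargs):
        self.t = 0
        return 0, {}

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 3, False, {}


class MonitorSketch:
    """Sketch only: record reward, length and duration of each finished episode."""

    def __init__(self, env):
        self.env = env
        self.episode_rewards = []
        self.episode_lengths = []
        self.episode_times = []
        self.total_steps = 0
        self._current_rewards = []
        self._episode_start = time.monotonic()

    def reset(self, **kwargs):
        self._current_rewards = []
        self._episode_start = time.monotonic()
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        self._current_rewards.append(reward)
        self.total_steps += 1
        if terminated or truncated:
            # Episode finished: store its summary statistics.
            self.episode_rewards.append(sum(self._current_rewards))
            self.episode_lengths.append(len(self._current_rewards))
            self.episode_times.append(time.monotonic() - self._episode_start)
        return obs, reward, terminated, truncated, info


monitor = MonitorSketch(ThreeStepEnv())
for _ in range(2):  # run two complete episodes
    monitor.reset()
    done = False
    while not done:
        _, _, terminated, truncated, _ = monitor.step(0)
        done = terminated or truncated
```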

openrl.envs.wrappers.multiagent_wrapper module

class openrl.envs.wrappers.multiagent_wrapper.Single2MultiAgentWrapper(env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

property agent_num

reset(*, seed=None, options=None)[source]

Uses the reset() of the env that can be overwritten to change the returned data.

step(action)[source]

Uses the step() of the env that can be overwritten to change the returned data.

openrl.envs.wrappers.pettingzoo_wrappers module

openrl.envs.wrappers.util module

openrl.envs.wrappers.util.is_wrapped(env: gymnasium.core.Env, wrapper_class: Type[openrl.envs.wrappers.base_wrapper.BaseWrapper]) bool[source]

Check if a given environment has been wrapped with a given wrapper.

Parameters
  • env – Environment to check

  • wrapper_class – Wrapper class to look for

Returns

True if environment has been wrapped with wrapper_class.

openrl.envs.wrappers.util.nest_expand_dim(input: Any) Any[source]

openrl.envs.wrappers.util.unwrap_wrapper(env: gymnasium.core.Env, wrapper_class: Type[openrl.envs.wrappers.base_wrapper.BaseWrapper]) Optional[openrl.envs.wrappers.base_wrapper.BaseWrapper][source]

Retrieve a BaseWrapper object by recursively searching.

Parameters
  • env – Environment to unwrap

  • wrapper_class – Wrapper to look for

Returns

The environment unwrapped down to wrapper_class if it has been wrapped with it, otherwise None
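The search through a wrapper chain can be sketched with plain classes. All names below (`Wrapper`, `Inner`, `Outer`, `Unused` and the `_sketch` functions) are hypothetical; the sketch assumes each wrapper stores the wrapped environment in an `env` attribute and walks the chain with a loop.

```python
class BaseEnv:
    """Hypothetical unwrapped environment."""


class Wrapper:
    """Hypothetical wrapper base class holding the wrapped env in .env."""

    def __init__(self, env):
        self.env = env


class Inner(Wrapper):
    pass


class Outer(Wrapper):
    pass


class Unused(Wrapper):
    pass


def unwrap_wrapper_sketch(env, wrapper_class):
    """Walk down the wrapper chain and return the first wrapper_class instance, or None."""
    current = env
    while isinstance(current, Wrapper):
        if isinstance(current, wrapper_class):
            return current
        current = current.env
    return None


def is_wrapped_sketch(env, wrapper_class):
    """True if some wrapper in the chain is an instance of wrapper_class."""
    return unwrap_wrapper_sketch(env, wrapper_class) is not None


inner = Inner(BaseEnv())
stacked = Outer(inner)
found = unwrap_wrapper_sketch(stacked, Inner)
```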

Module contents

class openrl.envs.wrappers.AutoReset(env: gymnasium.core.Env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

property has_auto_reset

class openrl.envs.wrappers.BaseObservationWrapper(env, cfg=None, reward_class=None)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

observation(observation: gymnasium.core.ObsType) gymnasium.core.WrapperObsType[source]

Returns a modified observation.

Args:

observation: The env observation

Returns:

The modified observation

reset(*, seed: Optional[int] = None, options: Optional[Dict[str, Any]] = None) Tuple[gymnasium.core.WrapperObsType, Dict[str, Any]][source]

Modifies the env after calling reset(), returning a modified observation using self.observation().

step(action: gymnasium.core.ActType) Tuple[gymnasium.core.WrapperObsType, SupportsFloat, bool, bool, Dict[str, Any]][source]

Modifies the env after calling step() using self.observation() on the returned observations.

class openrl.envs.wrappers.BaseRewardWrapper(env, cfg=None)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

reward(reward: openrl.envs.wrappers.base_wrapper.ArrayType) openrl.envs.wrappers.base_wrapper.ArrayType[source]

Returns a modified environment reward.

Args:

reward: The env step() reward

Returns:

The modified reward

step(action: gymnasium.core.ActType) Tuple[gymnasium.core.ObsType, SupportsFloat, bool, bool, Dict[str, Any]][source]

Modifies the env step() reward using self.reward().

class openrl.envs.wrappers.BaseWrapper(env, cfg=None, reward_class=None)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

property agent_num

property env_name

property has_auto_reset

set_render_mode(render_mode: Union[None, str])[source]

step(action)[source]

Uses the step() of the env that can be overwritten to change the returned data.

property use_monitor

class openrl.envs.wrappers.DictWrapper(env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

observation(observation)[source]

Returns a modified observation.

Args:

observation: The env observation

Returns:

The modified observation

class openrl.envs.wrappers.FlattenObservation(env: gymnasium.core.Env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

observation(observation)[source]

Flattens an observation.

Args:

observation: The observation to flatten

Returns:

The flattened observation

class openrl.envs.wrappers.GIFWrapper(env, gif_path: str, fps: int = 30)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

close()[source]

Closes the wrapper and env.

reset(**kwargs)[source]

Uses the reset() of the env that can be overwritten to change the returned data.

step(action)[source]

Uses the step() of the env that can be overwritten to change the returned data.

class openrl.envs.wrappers.MoveActionMask2InfoWrapper(env: gymnasium.core.Env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

reset(**kwargs)[source]

Uses the reset() of the env that can be overwritten to change the returned data.

step(action)[source]

Uses the step() of the env that can be overwritten to change the returned data.

class openrl.envs.wrappers.RemoveTruncated(env: gymnasium.core.Env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

step(action)[source]

Steps through the environment, returning 5 or 4 items depending on output_truncation_bool.

Args:

action: action to step through the environment with

Returns:

(observation, reward, terminated, truncated, info) or (observation, reward, done, info)

class openrl.envs.wrappers.Single2MultiAgentWrapper(env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

property agent_num

reset(*, seed=None, options=None)[source]

Uses the reset() of the env that can be overwritten to change the returned data.

step(action)[source]

Uses the step() of the env that can be overwritten to change the returned data.