openrl.envs.wrappers package

Submodules

openrl.envs.wrappers.atari_wrappers module

class openrl.envs.wrappers.atari_wrappers.ClipRewardEnv(env, cfg=None)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

reward(reward)[source]

Bin reward to {+1, 0, -1} by its sign.
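
A minimal sketch of the sign binning described above, written as a standalone function rather than as the wrapper itself. The name clip_reward is hypothetical, and the wrapper's actual implementation may differ.

```python
def clip_reward(reward: float) -> int:
    """Bin a reward to +1, 0, or -1 according to its sign."""
    # Each comparison yields 0 or 1; their difference is the sign of the reward.
    return int(reward > 0) - int(reward < 0)


print([clip_reward(r) for r in (7.5, 0.0, -0.2)])  # [1, 0, -1]
```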

class openrl.envs.wrappers.atari_wrappers.EpisodicLifeEnv(env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

reset(**kwargs)[source]

Reset only when lives are exhausted. This way all states are still reachable even though lives are episodic, and the learner need not know about any of this behind the scenes.

step(action)[source]

Uses the step() of the env that can be overwritten to change the returned data.
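
The reset and step behaviour described for EpisodicLifeEnv can be sketched as follows. This is an illustration under stated assumptions, not the library's code: ToyLivesEnv, its lives attribute, and the convention that action 1 loses a life while action 0 is a no-op are all hypothetical.

```python
class ToyLivesEnv:
    """Hypothetical stand-in for an Atari env: action 1 loses a life, action 0 does nothing."""

    def __init__(self, lives=3):
        self.start_lives = lives
        self.lives = lives

    def reset(self):
        self.lives = self.start_lives
        return "start", {}

    def step(self, action):
        if action == 1:
            self.lives -= 1
        terminated = self.lives == 0  # the game is really over only at zero lives
        return "obs", 0.0, terminated, False, {}


class EpisodicLifeSketch:
    """Report each lost life as an episode end, but fully reset only on game over."""

    def __init__(self, env):
        self.env = env
        self.lives = 0
        self.was_real_done = True

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        self.was_real_done = terminated or truncated
        lives = self.env.lives
        if 0 < lives < self.lives:
            terminated = True  # a life was lost: signal an episode boundary to the learner
        self.lives = lives
        return obs, reward, terminated, truncated, info

    def reset(self):
        if self.was_real_done:
            obs, info = self.env.reset()  # lives exhausted: reset the underlying game
        else:
            obs, _, _, _, info = self.env.step(0)  # otherwise continue from the current state
        self.lives = self.env.lives
        return obs, info


env = EpisodicLifeSketch(ToyLivesEnv(lives=2))
env.reset()
print(env.step(1)[2], env.was_real_done)  # True False
```

After the life is lost, calling reset() again leaves the toy game at one remaining life instead of restarting it.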

class openrl.envs.wrappers.atari_wrappers.FireResetEnv(env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

reset(**kwargs)[source]

Uses the reset() of the env that can be overwritten to change the returned data.

class openrl.envs.wrappers.atari_wrappers.NoopResetEnv(env, noop_max=30)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

reset(**kwargs)[source]

Do no-op action for a number of steps in [1, noop_max].
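
A rough sketch of a reset that takes a random number of no-op steps in [1, noop_max]. CountingEnv, the noop_reset function, and the choice of action 0 as the no-op are assumptions made for illustration; the wrapper itself may select and apply the no-op action differently.

```python
import random


class CountingEnv:
    """Hypothetical toy env that counts the steps taken since the last reset."""

    def __init__(self):
        self.steps = 0

    def reset(self):
        self.steps = 0
        return self.steps, {}

    def step(self, action):
        self.steps += 1
        return self.steps, 0.0, False, False, {}


def noop_reset(env, noop_max=30, noop_action=0, rng=random):
    """Reset env, then take between 1 and noop_max no-op steps (both inclusive)."""
    obs, info = env.reset()
    noops = rng.randint(1, noop_max)  # randint includes both endpoints
    for _ in range(noops):
        obs, _, terminated, truncated, info = env.step(noop_action)
        if terminated or truncated:
            obs, info = env.reset()  # restart if the no-ops happened to end the episode
    return obs, info


obs, _ = noop_reset(CountingEnv(), noop_max=30)
print(1 <= obs <= 30)  # True
```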

class openrl.envs.wrappers.atari_wrappers.WarpFrame(env, width=84, height=84)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

observation(frame)[source]

Returns a modified observation.

Args:

frame: The env observation

Returns:

The modified observation

openrl.envs.wrappers.base_wrapper module

class openrl.envs.wrappers.base_wrapper.BaseObservationWrapper(env, cfg=None, reward_class=None)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

observation(observation: gymnasium.core.ObsType) -> gymnasium.core.WrapperObsType[source]

Returns a modified observation.

Args:

observation: The env observation

Returns:

The modified observation

reset(*, seed: Optional[int] = None, options: Optional[Dict[str, Any]] = None) -> Tuple[gymnasium.core.WrapperObsType, Dict[str, Any]][source]

Modifies the env after calling reset(), returning a modified observation using self.observation().

step(action: gymnasium.core.ActType) -> Tuple[gymnasium.core.WrapperObsType, SupportsFloat, bool, bool, Dict[str, Any]][source]

Modifies the env after calling step() using self.observation() on the returned observations.
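
The pattern described for reset() and step() can be sketched without any library dependency. ObservationWrapperSketch, ScaleObservation, and ToyEnv are hypothetical names introduced here; the real base class also accepts cfg and reward_class, which this sketch leaves out.

```python
class ObservationWrapperSketch:
    """Apply self.observation() to every observation returned by the wrapped env."""

    def __init__(self, env):
        self.env = env

    def observation(self, observation):
        raise NotImplementedError  # subclasses define the transformation

    def reset(self, *, seed=None, options=None):
        obs, info = self.env.reset(seed=seed, options=options)
        return self.observation(obs), info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        return self.observation(obs), reward, terminated, truncated, info


class ScaleObservation(ObservationWrapperSketch):
    """Hypothetical subclass: scale byte-valued observations to [0, 1]."""

    def observation(self, observation):
        return [value / 255.0 for value in observation]


class ToyEnv:
    """Hypothetical env whose observation is always [0, 255]."""

    def reset(self, *, seed=None, options=None):
        return [0, 255], {}

    def step(self, action):
        return [0, 255], 1.0, False, False, {}


env = ScaleObservation(ToyEnv())
print(env.reset()[0])  # [0.0, 1.0]
print(env.step(0)[0])  # [0.0, 1.0]
```

Only observation() needs to be overridden in a subclass; reward, termination flags, and info pass through unchanged.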

class openrl.envs.wrappers.base_wrapper.BaseRewardWrapper(env, cfg=None)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

reward(reward: openrl.envs.wrappers.base_wrapper.ArrayType) -> openrl.envs.wrappers.base_wrapper.ArrayType[source]

Returns a modified environment reward.

Args:

reward: The env step() reward

Returns:

The modified reward

step(action: gymnasium.core.ActType) -> Tuple[gymnasium.core.ObsType, SupportsFloat, bool, bool, Dict[str, Any]][source]

Modifies the env step() reward using self.reward().
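
As an illustration of a step() that routes the reward through self.reward(), here is a dependency-free sketch. RewardWrapperSketch, HalveReward, and ConstantRewardEnv are hypothetical names, and the cfg argument of the real class is omitted.

```python
class RewardWrapperSketch:
    """Pass every step() reward through self.reward(); leave everything else untouched."""

    def __init__(self, env):
        self.env = env

    def reward(self, reward):
        raise NotImplementedError  # subclasses define the transformation

    def reset(self, **kwargs):
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        return obs, self.reward(reward), terminated, truncated, info


class HalveReward(RewardWrapperSketch):
    """Hypothetical subclass: scale each reward by 0.5."""

    def reward(self, reward):
        return 0.5 * reward


class ConstantRewardEnv:
    """Hypothetical env that always returns a reward of 2.0."""

    def reset(self, **kwargs):
        return 0, {}

    def step(self, action):
        return 0, 2.0, False, False, {}


env = HalveReward(ConstantRewardEnv())
print(env.step(0)[1])  # 1.0
```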

class openrl.envs.wrappers.base_wrapper.BaseWrapper(env, cfg=None, reward_class=None)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

property agent_num

property env_name

property has_auto_reset

set_render_mode(render_mode: Union[None, str])[source]

step(action)[source]

Uses the step() of the env that can be overwritten to change the returned data.

property use_monitor

openrl.envs.wrappers.extra_wrappers module

class openrl.envs.wrappers.extra_wrappers.AddStep(env: gymnasium.core.Env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

observation(observation)[source]

Flattens an observation.

Args:

observation: The observation to flatten

Returns:

The flattened observation

reset(*, seed: Optional[int] = None, options: Optional[Dict[str, Any]] = None) -> Tuple[gymnasium.core.WrapperObsType, Dict[str, Any]][source]

Modifies the env after calling reset(), returning a modified observation using self.observation().

step(action: gymnasium.core.ActType) -> Tuple[gymnasium.core.WrapperObsType, SupportsFloat, bool, bool, Dict[str, Any]][source]

Modifies the env after calling step() using self.observation() on the returned observations.

class openrl.envs.wrappers.extra_wrappers.AutoReset(env: gymnasium.core.Env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

property has_auto_reset

class openrl.envs.wrappers.extra_wrappers.ConvertEmptyBoxWrapper(env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

observation(observation)[source]

Returns a modified observation.

Args:

observation: The env observation

Returns:

The modified observation

class openrl.envs.wrappers.extra_wrappers.DictWrapper(env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

observation(observation)[source]

Returns a modified observation.

Args:

observation: The env observation

Returns:

The modified observation

class openrl.envs.wrappers.extra_wrappers.FlattenObservation(env: gymnasium.core.Env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

observation(observation)[source]

Flattens an observation.

Args:

observation: The observation to flatten

Returns:

The flattened observation

class openrl.envs.wrappers.extra_wrappers.FrameSkip(env, num_frames: int = 8)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

step(action)[source]

Uses the step() of the env that can be overwritten to change the returned data.
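
The docstring above does not say how FrameSkip combines the skipped frames. One common form of frame skipping repeats the action num_frames times, sums the rewards, and stops early when the episode ends; the sketch below shows that form as an assumption, not as OpenRL's actual behaviour. TickEnv and frame_skip_step are hypothetical names.

```python
class TickEnv:
    """Hypothetical env: reward 1.0 per step, terminates after max_steps steps."""

    def __init__(self, max_steps=20):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t, {}

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= self.max_steps, False, {}


def frame_skip_step(env, action, num_frames=8):
    """Repeat action up to num_frames times (num_frames >= 1), summing the rewards."""
    total_reward = 0.0
    for _ in range(num_frames):
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        if terminated or truncated:
            break  # do not step past the end of the episode
    return obs, total_reward, terminated, truncated, info


env = TickEnv(max_steps=10)
env.reset()
print(frame_skip_step(env, 0)[:3])  # (8, 8.0, False)
print(frame_skip_step(env, 0)[:3])  # (10, 2.0, True)
```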

class openrl.envs.wrappers.extra_wrappers.GIFWrapper(env, gif_path: str, fps: int = 30)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

close()[source]

Closes the wrapper and env.

reset(**kwargs)[source]

Uses the reset() of the env that can be overwritten to change the returned data.

step(action)[source]

Uses the step() of the env that can be overwritten to change the returned data.

class openrl.envs.wrappers.extra_wrappers.MoveActionMask2InfoWrapper(env: gymnasium.core.Env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

reset(**kwargs)[source]

Uses the reset() of the env that can be overwritten to change the returned data.

step(action)[source]

Uses the step() of the env that can be overwritten to change the returned data.

class openrl.envs.wrappers.extra_wrappers.RecordReward(env, cfg=None, reward_class=None)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

property has_auto_reset

class openrl.envs.wrappers.extra_wrappers.RemoveTruncated(env: gymnasium.core.Env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

step(action)[source]

Steps through the environment, returning 5 or 4 items depending on output_truncation_bool.

Args:

action: action to step through the environment with

Returns:

(observation, reward, terminated, truncated, info) or (observation, reward, done, info)
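
The relationship between the two return shapes can be sketched as below: the 4-item form merges terminated and truncated into a single done flag. The function name to_done_step is hypothetical, and the library's own conversion helpers may record additional details (for example in info) that this sketch omits.

```python
def to_done_step(step_returns):
    """Collapse a 5-item step result into the 4-item (observation, reward, done, info) form."""
    if len(step_returns) == 4:
        return tuple(step_returns)  # already in the 4-item form
    obs, reward, terminated, truncated, info = step_returns
    # An episode is "done" if it ended for either reason.
    return obs, reward, terminated or truncated, info


print(to_done_step(("obs", 1.0, False, True, {})))  # ('obs', 1.0, True, {})
print(to_done_step(("obs", 1.0, False, {})))        # ('obs', 1.0, False, {})
```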

class openrl.envs.wrappers.extra_wrappers.ZeroRewardWrapper(env, cfg=None)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

reward(reward: openrl.envs.wrappers.base_wrapper.ArrayType) -> openrl.envs.wrappers.base_wrapper.ArrayType[source]

Returns a modified environment reward.

Args:

reward: The env step() reward

Returns:

The modified reward

openrl.envs.wrappers.extra_wrappers.convert_to_done_step_api(step_returns, is_vector_env: bool = False)[source]

openrl.envs.wrappers.extra_wrappers.step_api_compatibility(step_returns, output_truncation_bool: bool = True, is_vector_env: bool = False)[source]

openrl.envs.wrappers.flatten module

openrl.envs.wrappers.flatten.flatten(space: gymnasium.spaces.space.Space[openrl.envs.wrappers.flatten.T], agent_num: int, x: openrl.envs.wrappers.flatten.T) -> Union[numpy.ndarray[Any, numpy.dtype[Any]], Dict[str, Any], Tuple[Any, ...], gymnasium.spaces.graph.GraphInstance][source]

Flatten a data point from a space.

This is useful when e.g. points from spaces must be passed to a neural network, which only understands flat arrays of floats.

Args:

space: The space that x is flattened by

x: The value to flatten

Returns:

The flattened datapoint

  • For gymnasium.spaces.Box and gymnasium.spaces.MultiBinary, this is a flattened array

  • For gymnasium.spaces.Discrete and gymnasium.spaces.MultiDiscrete, this is a flattened one-hot array of the sample

  • For gymnasium.spaces.Tuple and gymnasium.spaces.Dict, this is a concatenated array of the flattened subspaces (does not support graph subspaces)

  • For graph spaces, returns GraphInstance where:
    • GraphInstance.nodes are n x k arrays

    • GraphInstance.edges are either:
      • m x k arrays

      • None

    • GraphInstance.edge_links are either:
      • m x 2 arrays

      • None

Raises:

NotImplementedError: If the space is not defined in gymnasium.spaces.

Example (the calls below omit the agent_num argument shown in the signature):
>>> from gymnasium.spaces import Box, Discrete, Tuple
>>> space = Box(0, 1, shape=(3, 5))
>>> flatten(space, space.sample()).shape
(15,)
>>> space = Discrete(4)
>>> flatten(space, 2)
array([0, 0, 1, 0])
>>> space = Tuple((Box(0, 1, shape=(2,)), Box(0, 1, shape=(3,)), Discrete(3)))
>>> example = ((.5, .25), (1., 0., .2), 1)
>>> flatten(space, example)
array([0.5 , 0.25, 1.  , 0.  , 0.2 , 0.  , 1.  , 0.  ])
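
To make the Discrete and Tuple cases concrete, here is a dependency-free sketch of one-hot encoding and concatenation using plain lists. The helper names one_hot and flatten_tuple are hypothetical, and the sketch does not model the agent_num argument or NumPy array outputs.

```python
def one_hot(index, n):
    """One-hot encode a sample from a Discrete(n) space."""
    encoded = [0] * n
    encoded[index] = 1
    return encoded


def flatten_tuple(parts):
    """Concatenate already-flattened subspace samples into one flat list."""
    flat = []
    for part in parts:
        flat.extend(part)
    return flat


print(one_hot(2, 4))  # [0, 0, 1, 0]
# Two Box samples followed by a Discrete(3) sample with value 1.
print(flatten_tuple([[0.5, 0.25], [1.0, 0.0, 0.2], one_hot(1, 3)]))
# [0.5, 0.25, 1.0, 0.0, 0.2, 0, 1, 0]
```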

openrl.envs.wrappers.image_wrappers module

class openrl.envs.wrappers.image_wrappers.TransposeImage(env=None, op=[2, 0, 1])[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

observation(ob)[source]

Returns a modified observation.

Args:

ob: The env observation

Returns:

The modified observation
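
Assuming op is an axis permutation in the sense of numpy.transpose (which the docstring does not confirm), the default op=[2, 0, 1] would turn a height x width x channel image into channel x height x width. The sketch below shows that reordering with plain lists; transpose_shape and transpose_hwc_to_chw are hypothetical helper names.

```python
def transpose_shape(shape, op=(2, 0, 1)):
    """Shape after reordering axes so that new axis i is old axis op[i]."""
    return tuple(shape[axis] for axis in op)


def transpose_hwc_to_chw(image):
    """Reorder a nested-list image from [height][width][channel] to [channel][height][width]."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    return [
        [[image[h][w][c] for w in range(width)] for h in range(height)]
        for c in range(channels)
    ]


print(transpose_shape((84, 84, 3)))  # (3, 84, 84)
# A 1 x 2 image with 3 channels per pixel.
print(transpose_hwc_to_chw([[[1, 2, 3], [4, 5, 6]]]))  # [[[1, 4]], [[2, 5]], [[3, 6]]]
```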

openrl.envs.wrappers.mat_wrapper module

class openrl.envs.wrappers.mat_wrapper.MATWrapper(env: openrl.envs.vec_env.base_venv.BaseVecEnv)[source]

Bases: openrl.envs.vec_env.wrappers.base_wrapper.VectorObservationWrapper

observation(observation)[source]

Defines the observation transformation.

Args:

observation (object): the observation from the environment

Returns:

observation (object): the transformed observation

property observation_space

Returns the env's observation_space unless it has been overwritten, in which case the wrapper's observation_space is used.

openrl.envs.wrappers.monitor module

class openrl.envs.wrappers.monitor.Monitor(env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

A monitor wrapper for Gym environments, used to track the episode reward, length, time, and other data.

Parameters

env -- The environment

get_episode_lengths() -> List[int][source]

Returns the number of timesteps of all the episodes.

get_episode_rewards() -> List[float][source]

Returns the rewards of all the episodes.

get_episode_times() -> List[float][source]

Returns the runtime in seconds of all the episodes.

get_total_steps() -> int[source]

Returns the total number of timesteps.

reset(**kwargs)[source]

Calls the Gym environment reset.

Parameters

kwargs -- Extra keywords saved for the next episode, only if defined by reset_keywords

Returns

the first observation of the environment

step(action: Union[numpy.ndarray, int])[source]

Step the environment with the given action.

Parameters

action -- the action

Returns

(observation, reward, done, information) or (observation, reward, terminated, truncated, information)
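
The bookkeeping behind the getters above can be sketched as follows. This is a simplified illustration, not the Monitor implementation: MonitorSketch and ThreeStepEnv are hypothetical, only the 5-item step form is handled, and the results are exposed as plain attributes rather than getter methods.

```python
import time


class ThreeStepEnv:
    """Hypothetical env: reward 1.0 per step, terminates on the third step."""

    def __init__(self):
        self.t = 0

    def reset(self, **kwargs):
        self.t = 0
        return 0, {}

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t == 3, False, {}


class MonitorSketch:
    """Record the total reward, length, and wall-clock time of each finished episode."""

    def __init__(self, env):
        self.env = env
        self.episode_rewards = []
        self.episode_lengths = []
        self.episode_times = []
        self.total_steps = 0
        self._rewards = []
        self._start = time.time()

    def reset(self, **kwargs):
        self._rewards = []  # start accumulating a new episode
        self._start = time.time()
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        self._rewards.append(reward)
        self.total_steps += 1
        if terminated or truncated:
            self.episode_rewards.append(sum(self._rewards))
            self.episode_lengths.append(len(self._rewards))
            self.episode_times.append(time.time() - self._start)
        return obs, reward, terminated, truncated, info


env = MonitorSketch(ThreeStepEnv())
for _ in range(2):
    env.reset()
    done = False
    while not done:
        _, _, terminated, truncated, _ = env.step(0)
        done = terminated or truncated
print(env.episode_rewards, env.episode_lengths, env.total_steps)  # [3.0, 3.0] [3, 3] 6
```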

openrl.envs.wrappers.multiagent_wrapper module

class openrl.envs.wrappers.multiagent_wrapper.Single2MultiAgentWrapper(env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

property agent_num

reset(*, seed=None, options=None)[source]

Uses the reset() of the env that can be overwritten to change the returned data.

step(action)[source]

Uses the step() of the env that can be overwritten to change the returned data.

openrl.envs.wrappers.pettingzoo_wrappers module

openrl.envs.wrappers.util module

openrl.envs.wrappers.util.is_wrapped(env: gymnasium.core.Env, wrapper_class: Type[openrl.envs.wrappers.base_wrapper.BaseWrapper]) -> bool[source]

Check if a given environment has been wrapped with a given wrapper.

Parameters
  • env -- Environment to check

  • wrapper_class -- Wrapper class to look for

Returns

True if environment has been wrapped with wrapper_class.

openrl.envs.wrappers.util.nest_expand_dim(input: Any) -> Any[source]

openrl.envs.wrappers.util.unwrap_wrapper(env: gymnasium.core.Env, wrapper_class: Type[openrl.envs.wrappers.base_wrapper.BaseWrapper]) -> Optional[openrl.envs.wrappers.base_wrapper.BaseWrapper][source]

Retrieve a BaseWrapper object by recursively searching.

Parameters
  • env -- Environment to unwrap

  • wrapper_class -- Wrapper to look for

Returns

Environment unwrapped till wrapper_class if it has been wrapped with it
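
The search performed by is_wrapped and unwrap_wrapper can be sketched as a walk down the chain of wrappers. The classes below are hypothetical stand-ins, the _sketch suffix marks the functions as illustrations, and the assumption that each wrapper stores the wrapped environment in an env attribute follows the usual Gymnasium convention.

```python
class WrapperBase:
    """Hypothetical minimal wrapper: keeps the wrapped env in self.env."""

    def __init__(self, env):
        self.env = env


class Outer(WrapperBase):
    pass


class Inner(WrapperBase):
    pass


class Unused(WrapperBase):
    pass


class CoreEnv:
    """Hypothetical innermost environment (not a wrapper)."""


def unwrap_wrapper_sketch(env, wrapper_class):
    """Return the first layer that is an instance of wrapper_class, or None."""
    while isinstance(env, WrapperBase):
        if isinstance(env, wrapper_class):
            return env
        env = env.env  # peel off one layer and keep searching
    return None


def is_wrapped_sketch(env, wrapper_class):
    """True if some layer of env is an instance of wrapper_class."""
    return unwrap_wrapper_sketch(env, wrapper_class) is not None


env = Outer(Inner(CoreEnv()))
print(is_wrapped_sketch(env, Inner))                   # True
print(is_wrapped_sketch(env, Unused))                  # False
print(unwrap_wrapper_sketch(env, Inner) is env.env)    # True
```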

Module contents

class openrl.envs.wrappers.AutoReset(env: gymnasium.core.Env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

property has_auto_reset

class openrl.envs.wrappers.BaseObservationWrapper(env, cfg=None, reward_class=None)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

observation(observation: gymnasium.core.ObsType) -> gymnasium.core.WrapperObsType[source]

Returns a modified observation.

Args:

observation: The env observation

Returns:

The modified observation

reset(*, seed: Optional[int] = None, options: Optional[Dict[str, Any]] = None) -> Tuple[gymnasium.core.WrapperObsType, Dict[str, Any]][source]

Modifies the env after calling reset(), returning a modified observation using self.observation().

step(action: gymnasium.core.ActType) -> Tuple[gymnasium.core.WrapperObsType, SupportsFloat, bool, bool, Dict[str, Any]][source]

Modifies the env after calling step() using self.observation() on the returned observations.

class openrl.envs.wrappers.BaseRewardWrapper(env, cfg=None)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

reward(reward: openrl.envs.wrappers.base_wrapper.ArrayType) -> openrl.envs.wrappers.base_wrapper.ArrayType[source]

Returns a modified environment reward.

Args:

reward: The env step() reward

Returns:

The modified reward

step(action: gymnasium.core.ActType) -> Tuple[gymnasium.core.ObsType, SupportsFloat, bool, bool, Dict[str, Any]][source]

Modifies the env step() reward using self.reward().

class openrl.envs.wrappers.BaseWrapper(env, cfg=None, reward_class=None)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

property agent_num

property env_name

property has_auto_reset

set_render_mode(render_mode: Union[None, str])[source]

step(action)[source]

Uses the step() of the env that can be overwritten to change the returned data.

property use_monitor

class openrl.envs.wrappers.DictWrapper(env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

observation(observation)[source]

Returns a modified observation.

Args:

observation: The env observation

Returns:

The modified observation

class openrl.envs.wrappers.FlattenObservation(env: gymnasium.core.Env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

observation(observation)[source]

Flattens an observation.

Args:

observation: The observation to flatten

Returns:

The flattened observation

class openrl.envs.wrappers.GIFWrapper(env, gif_path: str, fps: int = 30)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

close()[source]

Closes the wrapper and env.

reset(**kwargs)[source]

Uses the reset() of the env that can be overwritten to change the returned data.

step(action)[source]

Uses the step() of the env that can be overwritten to change the returned data.

class openrl.envs.wrappers.MoveActionMask2InfoWrapper(env: gymnasium.core.Env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

reset(**kwargs)[source]

Uses the reset() of the env that can be overwritten to change the returned data.

step(action)[source]

Uses the step() of the env that can be overwritten to change the returned data.

class openrl.envs.wrappers.RemoveTruncated(env: gymnasium.core.Env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

step(action)[source]

Steps through the environment, returning 5 or 4 items depending on output_truncation_bool.

Args:

action: action to step through the environment with

Returns:

(observation, reward, terminated, truncated, info) or (observation, reward, done, info)

class openrl.envs.wrappers.Single2MultiAgentWrapper(env)[source]

Bases: gymnasium.core.Env[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType], Generic[gymnasium.core.WrapperObsType, gymnasium.core.WrapperActType, gymnasium.core.ObsType, gymnasium.core.ActType]

property agent_num

reset(*, seed=None, options=None)[source]

Uses the reset() of the env that can be overwritten to change the returned data.

step(action)[source]

Uses the step() of the env that can be overwritten to change the returned data.