openrl.envs.vec_env.wrappers package

Submodules

openrl.envs.vec_env.wrappers.base_wrapper module

class openrl.envs.vec_env.wrappers.base_wrapper.VecEnvWrapper(env: openrl.envs.vec_env.base_venv.BaseVecEnv)[source]

Bases: openrl.envs.vec_env.base_venv.BaseVecEnv, abc.ABC

Wraps the vectorized environment to allow a modular transformation.

This is the base class for all wrappers around vectorized environments. Subclasses can override individual methods to change the behavior of the wrapped vectorized environment without touching its original code.

Note:

Don’t forget to call super().__init__(env) if the subclass overrides __init__().
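The note above can be illustrated with a minimal sketch. To keep it runnable without OpenRL installed, `ToyVecEnv` and `ToyVecEnvWrapper` are hypothetical stand-ins for BaseVecEnv and VecEnvWrapper, and `ScaledWrapper` is an invented subclass. Only the pattern of forwarding `env` to `super().__init__()` is taken from the documentation.

```python
class ToyVecEnv:
    """Hypothetical stand-in for a vectorized environment."""

    def __init__(self, parallel_env_num):
        self.parallel_env_num = parallel_env_num


class ToyVecEnvWrapper:
    """Hypothetical stand-in for VecEnvWrapper: stores the wrapped env."""

    def __init__(self, env):
        self.env = env

    @property
    def parallel_env_num(self):
        # Delegate to the wrapped environment.
        return self.env.parallel_env_num


class ScaledWrapper(ToyVecEnvWrapper):
    """Hypothetical subclass that adds its own constructor argument."""

    def __init__(self, env, scale):
        # Without this call, self.env is never set and delegation fails.
        super().__init__(env)
        self.scale = scale


wrapped = ScaledWrapper(ToyVecEnv(parallel_env_num=4), scale=0.5)
print(wrapped.parallel_env_num)
```

If the `super().__init__(env)` line is dropped in this sketch, accessing `wrapped.parallel_env_num` raises an AttributeError because `self.env` was never assigned.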

property action_space: Union[gymnasium.spaces.space.Space[gymnasium.core.ActType], gymnasium.spaces.space.Space[gymnasium.core.WrapperActType]]

Return the Env action_space unless it has been overwritten, in which case the wrapper's action_space is used.

property agent_num

call(name, *args, **kwargs)[source]

Call a method, or get a property, from each parallel environment.

Args:

name (str): Name of the method or property to call.
*args: Arguments to apply to the method call.
**kwargs: Keyword arguments to apply to the method call.

Returns:

List of the results of the individual calls to the method or property for each environment.
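The behavior described for call() can be sketched as follows. This is an illustration under assumptions, not OpenRL's implementation: `ToyEnv` and the free function `call` are hypothetical, and a real vectorized environment may dispatch the lookup to worker processes rather than loop over local objects.

```python
class ToyEnv:
    """Hypothetical sub-environment with one property and one method."""

    def __init__(self, index):
        self.index = index

    @property
    def name(self):
        return f"env-{self.index}"

    def scaled_index(self, factor=1):
        return self.index * factor


def call(envs, name, *args, **kwargs):
    """Look up `name` on each env; call it if callable, else return the value."""
    results = []
    for env in envs:
        attr = getattr(env, name)
        results.append(attr(*args, **kwargs) if callable(attr) else attr)
    return results


envs = [ToyEnv(i) for i in range(3)]
names = call(envs, "name")                       # property lookup
scaled = call(envs, "scaled_index", factor=10)   # method call with kwargs
```

Either way, the result is one list entry per parallel environment, in environment order.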

close(**kwargs)[source]

Clean up the environment’s resources.

close_extras(**kwargs)[source]

Clean up any extra resources, i.e. those beyond what this base class manages.

env_is_wrapped(wrapper_class: Type[openrl.envs.wrappers.base_wrapper.BaseWrapper], indices: Union[None, int, Iterable[int]] = None) -> List[bool][source]

Check whether the worker environments are wrapped with a given wrapper.

property env_name

property metadata: Dict[str, Any]

Returns the Env metadata.

property np_random: numpy.random._generator.Generator

Returns the environment's internal _np_random generator, which is initialised with a random seed if it has not been set.

Returns:

An instance of np.random.Generator.

property observation_space: Union[gymnasium.spaces.space.Space[gymnasium.core.ObsType], gymnasium.spaces.space.Space[gymnasium.core.WrapperObsType]]

Return the Env observation_space unless it has been overwritten, in which case the wrapper's observation_space is used.

property parallel_env_num: int

random_action(infos=None)[source]

Get a random action from the action space.

property render_mode: Optional[str]

Returns the Env render_mode.

reset(**kwargs)[source]

Reset all environments.

property reward_range: Tuple[SupportsFloat, SupportsFloat]

Return the Env reward_range unless it has been overwritten, in which case the wrapper's reward_range is used.

set_attr(name, values)[source]

Set a property in each sub-environment.

Args:

name (str): Name of the property to be set in each individual environment.
values (list, tuple, or object): Values to set the property to. If values is a list or tuple, its elements correspond to the individual environments; otherwise a single value is set for all environments.
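The two forms of `values` accepted by set_attr() can be sketched as below. `ToyEnv` and the free function `set_attr` are hypothetical stand-ins, and the length check is an assumption of this sketch rather than documented OpenRL behavior.

```python
class ToyEnv:
    """Hypothetical sub-environment that accepts arbitrary attributes."""


def set_attr(envs, name, values):
    """Set `name` on every env, broadcasting a single value if needed."""
    if not isinstance(values, (list, tuple)):
        # A single object is applied to all environments.
        values = [values] * len(envs)
    if len(values) != len(envs):
        # Assumption: a length mismatch is treated as an error.
        raise ValueError(f"expected {len(envs)} values, got {len(values)}")
    for env, value in zip(envs, values):
        setattr(env, name, value)


envs = [ToyEnv() for _ in range(3)]
set_attr(envs, "difficulty", [1, 2, 3])   # one value per environment
set_attr(envs, "max_steps", 100)          # same value for all environments
per_env = [env.difficulty for env in envs]
shared = [env.max_steps for env in envs]
```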

step(actions, *args, **kwargs)[source]

Step all environments.

property unwrapped

property use_monitor

class openrl.envs.vec_env.wrappers.base_wrapper.VectorActionWrapper(env: openrl.envs.vec_env.base_venv.BaseVecEnv)[source]

Bases: openrl.envs.vec_env.wrappers.base_wrapper.VecEnvWrapper

Wraps the vectorized environment to allow a modular transformation of the actions. Equivalent of ActionWrapper for vectorized environments.

actions(actions: gymnasium.core.ActType) -> gymnasium.core.ActType[source]

Transform the actions before sending them to the environment.

Args:

actions (ActType): the actions to transform

Returns:

ActType: the transformed actions

step(actions: gymnasium.core.ActType, *args, **kwargs)[source]

Steps through the environment using the actions as modified by actions().
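A minimal sketch of the action-wrapper pattern follows. `ToyVecEnv`, `ToyVectorActionWrapper`, and `ClipActions` are hypothetical stand-ins written so the example runs without OpenRL, and the 4-tuple returned by `step()` is an assumption of the sketch, not a statement about OpenRL's return format.

```python
class ToyVecEnv:
    """Hypothetical vectorized env that records the actions it receives."""

    def step(self, actions):
        self.last_actions = actions
        obs = [0.0 for _ in actions]
        rewards = [0.0 for _ in actions]
        dones = [False for _ in actions]
        infos = [{} for _ in actions]
        return obs, rewards, dones, infos


class ToyVectorActionWrapper:
    """Hypothetical stand-in for VectorActionWrapper."""

    def __init__(self, env):
        self.env = env

    def actions(self, actions):
        raise NotImplementedError

    def step(self, actions, *args, **kwargs):
        # Transform the actions first, then forward them to the wrapped env.
        return self.env.step(self.actions(actions), *args, **kwargs)


class ClipActions(ToyVectorActionWrapper):
    """Clip every action into the interval [low, high]."""

    def __init__(self, env, low, high):
        super().__init__(env)
        self.low = low
        self.high = high

    def actions(self, actions):
        return [min(max(a, self.low), self.high) for a in actions]


env = ToyVecEnv()
wrapped = ClipActions(env, low=-1.0, high=1.0)
wrapped.step([-3.0, 0.5, 2.0])
received = env.last_actions
```

Only `actions()` is overridden in the subclass; the inherited `step()` takes care of applying it.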

class openrl.envs.vec_env.wrappers.base_wrapper.VectorObservationWrapper(env: openrl.envs.vec_env.base_venv.BaseVecEnv)[source]

Bases: openrl.envs.vec_env.wrappers.base_wrapper.VecEnvWrapper

Wraps the vectorized environment to allow a modular transformation of the observation. Equivalent to gym.ObservationWrapper for vectorized environments.

observation(observation: gymnasium.core.ObsType) -> gymnasium.core.ObsType[source]

Defines the observation transformation.

Args:

observation (object): the observation from the environment

Returns:

observation (object): the transformed observation

reset(**kwargs)[source]

Modifies the observation returned by the environment's reset using observation().

step(actions, *args, **kwargs)[source]

Modifies the observation returned by the environment's step using observation().
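The observation-wrapper pattern, where one observation() method is applied on both reset and step, can be sketched as below. All class names are hypothetical stand-ins, and the return shapes (a 2-tuple from `reset()`, a 4-tuple from `step()`) are assumptions made so the sketch runs without OpenRL.

```python
class ToyVecEnv:
    """Hypothetical vectorized env with two parallel environments."""

    def reset(self, **kwargs):
        return [1.0, 2.0], [{}, {}]

    def step(self, actions):
        obs = [float(a) for a in actions]
        rewards = [0.0 for _ in actions]
        dones = [False for _ in actions]
        infos = [{} for _ in actions]
        return obs, rewards, dones, infos


class ToyVectorObservationWrapper:
    """Hypothetical stand-in for VectorObservationWrapper."""

    def __init__(self, env):
        self.env = env

    def observation(self, observation):
        raise NotImplementedError

    def reset(self, **kwargs):
        obs, infos = self.env.reset(**kwargs)
        return self.observation(obs), infos

    def step(self, actions, *args, **kwargs):
        obs, rewards, dones, infos = self.env.step(actions, *args, **kwargs)
        return self.observation(obs), rewards, dones, infos


class NegateObservation(ToyVectorObservationWrapper):
    """Flip the sign of every observation."""

    def observation(self, observation):
        return [-o for o in observation]


wrapped = NegateObservation(ToyVecEnv())
reset_obs, _ = wrapped.reset()
step_obs, _, _, _ = wrapped.step([3, 4])
```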

class openrl.envs.vec_env.wrappers.base_wrapper.VectorRewardWrapper(env: openrl.envs.vec_env.base_venv.BaseVecEnv)[source]

Bases: openrl.envs.vec_env.wrappers.base_wrapper.VecEnvWrapper

Wraps the vectorized environment to allow a modular transformation of the reward. Equivalent of RewardWrapper for vectorized environments.

reward(reward: openrl.envs.vec_env.wrappers.base_wrapper.ArrayType) -> openrl.envs.vec_env.wrappers.base_wrapper.ArrayType[source]

Transform the reward before returning it.

Args:

reward (array): the reward to transform

Returns:

array: the transformed reward

step(actions, *args, **kwargs)[source]

Steps through the environment returning a reward modified by reward().
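As a rough sketch of the reward-wrapper pattern, the example below scales the rewards returned by a toy environment. The class names are hypothetical stand-ins, plain lists are used in place of arrays, and the 4-tuple returned by `step()` is an assumption of the sketch.

```python
class ToyVecEnv:
    """Hypothetical vectorized env whose reward equals the action taken."""

    def step(self, actions):
        obs = [0.0 for _ in actions]
        rewards = [float(a) for a in actions]
        dones = [False for _ in actions]
        infos = [{} for _ in actions]
        return obs, rewards, dones, infos


class ToyVectorRewardWrapper:
    """Hypothetical stand-in for VectorRewardWrapper."""

    def __init__(self, env):
        self.env = env

    def reward(self, reward):
        raise NotImplementedError

    def step(self, actions, *args, **kwargs):
        obs, rewards, dones, infos = self.env.step(actions, *args, **kwargs)
        # Only the reward is transformed; everything else passes through.
        return obs, self.reward(rewards), dones, infos


class ScaleReward(ToyVectorRewardWrapper):
    """Multiply every reward by a constant factor."""

    def __init__(self, env, scale):
        super().__init__(env)
        self.scale = scale

    def reward(self, reward):
        return [r * self.scale for r in reward]


wrapped = ScaleReward(ToyVecEnv(), scale=0.5)
_, scaled_rewards, _, _ = wrapped.step([10, 20, 30])
```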

openrl.envs.vec_env.wrappers.gen_data module

class openrl.envs.vec_env.wrappers.gen_data.GenDataWrapper(env: openrl.envs.vec_env.base_venv.BaseVecEnv, data_save_path: str, total_episode: int)[source]

Bases: openrl.envs.vec_env.wrappers.base_wrapper.VecEnvWrapper

close(**kwargs)[source]

Clean up the environment’s resources.

reset(**kwargs)[source]

Reset all environments.

step(action: gymnasium.core.ActType, *args, **kwargs)[source]

Step all environments.

class openrl.envs.vec_env.wrappers.gen_data.GenDataWrapper_v1(env: openrl.envs.vec_env.base_venv.BaseVecEnv, data_save_path: str, total_episode: int)[source]

Bases: openrl.envs.vec_env.wrappers.base_wrapper.VecEnvWrapper

close(**kwargs)[source]

Clean up the environment’s resources.

reset(**kwargs)[source]

Reset all environments.

step(action: gymnasium.core.ActType, *args, **kwargs)[source]

Step all environments.

class openrl.envs.vec_env.wrappers.gen_data.TrajectoryData(env_num, total_episode, observation_space, action_space, agent_num: int)[source]

Bases: object

dump(save_path)[source]
init_empty_dict(source_dict={})[source]
reset(reset_data)[source]
step(step_data)[source]

openrl.envs.vec_env.wrappers.reward_wrapper module

class openrl.envs.vec_env.wrappers.reward_wrapper.RewardWrapper(env: openrl.envs.vec_env.base_venv.BaseVecEnv, reward_class: openrl.rewards.base_reward.BaseReward)[source]

Bases: openrl.envs.vec_env.wrappers.base_wrapper.VecEnvWrapper

batch_rewards(buffer)[source]

step(action: gymnasium.core.ActType, extra_data: Optional[Dict[str, Any]]) -> Union[Any, numpy.ndarray, List[Dict[str, Any]]][source]

Step all environments.

openrl.envs.vec_env.wrappers.vec_monitor_wrapper module

class openrl.envs.vec_env.wrappers.vec_monitor_wrapper.VecMonitorWrapper(vec_info: openrl.envs.vec_env.vec_info.base_vec_info.BaseVecInfo, env: openrl.envs.vec_env.base_venv.BaseVecEnv)[source]

Bases: openrl.envs.vec_env.wrappers.base_wrapper.VecEnvWrapper

statistics(buffer)[source]

step(action: gymnasium.core.ActType, extra_data: Optional[Dict[str, Any]] = None)[source]

Step all environments.

property use_monitor

openrl.envs.vec_env.wrappers.zero_reward_wrapper module

class openrl.envs.vec_env.wrappers.zero_reward_wrapper.ZeroRewardWrapper(env: openrl.envs.vec_env.base_venv.BaseVecEnv)[source]

Bases: openrl.envs.vec_env.wrappers.base_wrapper.VectorRewardWrapper

reward(reward: openrl.envs.vec_env.wrappers.base_wrapper.ArrayType) -> openrl.envs.vec_env.wrappers.base_wrapper.ArrayType[source]

Transform the reward before returning it.

Args:

reward (array): the reward to transform

Returns:

array: the transformed reward

Module contents