openrl.utils.callbacks package
Submodules
openrl.utils.callbacks.callbacks module
- class openrl.utils.callbacks.callbacks.BaseCallback(verbose: int = 0)
Bases: abc.ABC
Base class for callbacks.
- Parameters
verbose -- Verbosity level: 0 for no output, 1 for info messages, 2 for debug messages
- init_callback(agent: openrl.runners.common.base_agent.BaseAgent) -> None
Initialize the callback by saving references to the RL model and the training environment for convenience.
- logger: openrl.utils.logger.Logger
- on_step() -> bool
This method will be called by the model after each call to env.step().
For a child callback (of an EventCallback), this will be called when the event is triggered.
- Returns
If the callback returns False, training is aborted early.
- set_parent(parent: openrl.utils.callbacks.callbacks.BaseCallback) -> None
Set the parent of the callback.
- Parameters
parent -- The parent callback.
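The on_step() contract described above can be illustrated without OpenRL installed. The following is a minimal sketch, not OpenRL's implementation: SketchCallback, StopAfter, and run_training are hypothetical names, and the loop merely stands in for a trainer that calls the callback after each env.step().

```python
from abc import ABC, abstractmethod


class SketchCallback(ABC):
    """Hypothetical stand-in for BaseCallback; not the OpenRL class."""

    def __init__(self, verbose: int = 0):
        self.verbose = verbose
        self.n_calls = 0  # number of times on_step() has been invoked

    def on_step(self) -> bool:
        self.n_calls += 1
        return self._on_step()

    @abstractmethod
    def _on_step(self) -> bool:
        """Return False to abort training early."""


class StopAfter(SketchCallback):
    """Abort once the callback has been called `limit` times."""

    def __init__(self, limit: int, verbose: int = 0):
        super().__init__(verbose)
        self.limit = limit

    def _on_step(self) -> bool:
        return self.n_calls < self.limit


def run_training(callback: SketchCallback, total_steps: int) -> int:
    """Toy loop: one iteration stands in for one env.step() call."""
    steps_done = 0
    for _ in range(total_steps):
        steps_done += 1
        if not callback.on_step():
            break  # the callback returned False, so training stops early
    return steps_done


steps = run_training(StopAfter(limit=3), total_steps=10)
```

Here the toy loop stops on the third step because the callback returns False at that point, even though ten steps were requested.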
- class openrl.utils.callbacks.callbacks.CallbackList(callbacks: List[openrl.utils.callbacks.callbacks.BaseCallback], stop_logic: str = 'OR')
Bases: openrl.utils.callbacks.callbacks.BaseCallback
Class for chaining callbacks.
- Parameters
callbacks -- A list of callbacks that will be called sequentially.
- set_parent(parent: openrl.utils.callbacks.callbacks.BaseCallback) -> None
Set the parent of the callback.
- Parameters
parent -- The parent callback.
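This page does not describe the stop_logic argument of CallbackList. One plausible reading, offered here purely as an assumption, is that 'OR' stops training as soon as any chained callback asks to stop, while 'AND' stops only when all of them do. The sketch below illustrates that reading with a hypothetical combined_on_step helper; check the OpenRL source before relying on it.

```python
from typing import Callable, List


def combined_on_step(callbacks: List[Callable[[], bool]], stop_logic: str = "OR") -> bool:
    """Return True to continue training, False to stop.

    Assumed semantics (not confirmed by this page):
    "OR"  -> stop if any callback asks to stop.
    "AND" -> stop only if every callback asks to stop.
    """
    # Call every callback in order, as a chained list of callbacks would.
    results = [cb() for cb in callbacks]
    if stop_logic == "OR":
        return all(results)
    if stop_logic == "AND":
        return any(results)
    raise ValueError(f"Unknown stop_logic: {stop_logic!r}")


def wants_stop() -> bool:
    return False  # False means "abort training"


def wants_continue() -> bool:
    return True


or_result = combined_on_step([wants_stop, wants_continue], "OR")
and_result = combined_on_step([wants_stop, wants_continue], "AND")
```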
- class openrl.utils.callbacks.callbacks.ConvertCallback(callback: Callable[[Dict[str, Any], Dict[str, Any]], bool], verbose: int = 0)
Bases: openrl.utils.callbacks.callbacks.BaseCallback
Convert functional callback (old-style) to object.
- Parameters
callback --
verbose -- Verbosity level: 0 for no output, 1 for info messages, 2 for debug messages
- class openrl.utils.callbacks.callbacks.EventCallback(callback: Optional[openrl.utils.callbacks.callbacks.BaseCallback] = None, verbose: int = 0)
Bases: openrl.utils.callbacks.callbacks.BaseCallback
Base class for triggering a callback on an event.
- Parameters
callback -- Callback that will be called when an event is triggered.
verbose -- Verbosity level: 0 for no output, 1 for info messages, 2 for debug messages
- init_callback(agent: openrl.runners.common.base_agent.BaseAgent) -> None
Initialize the callback by saving references to the RL model and the training environment for convenience.
- class openrl.utils.callbacks.callbacks.EveryNTimesteps(n_steps: int, callbacks: Union[List[Dict[str, Any]], Dict[str, Any], openrl.utils.callbacks.callbacks.BaseCallback], stop_logic: str = 'OR')
Bases: openrl.utils.callbacks.callbacks.EventCallback
Trigger a callback every n_steps timesteps.
- Parameters
n_steps -- Number of timesteps between two triggers.
callbacks -- Callback that will be called when the event is triggered.
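The "every n_steps timesteps" behaviour can be sketched as a small counter. This is an illustrative guess at the bookkeeping, not OpenRL's code: EveryNTrigger and its attributes are hypothetical names, and the example assumes the timestep counter advances by the number of parallel environments on each call.

```python
class EveryNTrigger:
    """Hypothetical helper mirroring the 'every n_steps timesteps' idea."""

    def __init__(self, n_steps: int):
        self.n_steps = n_steps
        self.last_time_trigger = 0  # timestep at which the event last fired

    def should_trigger(self, num_timesteps: int) -> bool:
        # Fire once at least n_steps timesteps have passed since the last trigger.
        if num_timesteps - self.last_time_trigger >= self.n_steps:
            self.last_time_trigger = num_timesteps
            return True
        return False


trigger = EveryNTrigger(n_steps=5)
# With 2 parallel environments the timestep counter grows by 2 per call.
fired_at = [t for t in range(2, 21, 2) if trigger.should_trigger(t)]
```

Under these assumptions the event fires at timesteps 6, 12, and 18: the counter never lands exactly on a multiple of 5, so the trigger fires at the first call past each threshold.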
openrl.utils.callbacks.callbacks_factory module
- class openrl.utils.callbacks.callbacks_factory.CallbackFactory
Bases: object
- static get_callback(callback: Dict[str, Any]) -> openrl.utils.callbacks.callbacks.BaseCallback
- static get_callbacks(callbacks: Union[Dict[str, Any], List[Dict[str, Any]]], stop_logic: str = 'OR') -> openrl.utils.callbacks.callbacks.CallbackList
- static register(id: str, callback_class: Type[openrl.utils.callbacks.callbacks.BaseCallback])
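The signatures above suggest a registry that maps a string id to a callback class and builds instances from a dictionary. The sketch below shows that general pattern only. SketchFactory and PrintEvery are hypothetical, and the "id" and "args" dictionary keys are assumptions, since this page does not specify the dictionary layout.

```python
from typing import Any, Dict, Type


class SketchFactory:
    """Hypothetical registry; the 'id' and 'args' keys are assumptions."""

    _registry: Dict[str, Type] = {}

    @staticmethod
    def register(id: str, callback_class: Type) -> None:
        # Map a string identifier to the class that should be instantiated.
        SketchFactory._registry[id] = callback_class

    @staticmethod
    def get_callback(callback: Dict[str, Any]) -> Any:
        # Look up the class by id and pass the remaining arguments through.
        callback_class = SketchFactory._registry[callback["id"]]
        return callback_class(**callback.get("args", {}))


class PrintEvery:
    """Toy callback class used only for this illustration."""

    def __init__(self, freq: int = 1):
        self.freq = freq


SketchFactory.register("PrintEvery", PrintEvery)
cb = SketchFactory.get_callback({"id": "PrintEvery", "args": {"freq": 10}})
```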
openrl.utils.callbacks.checkpoint_callback module
- class openrl.utils.callbacks.checkpoint_callback.CheckpointCallback(save_freq: int, save_path: Union[str, pathlib.Path], name_prefix: str = 'rl_model', save_replay_buffer: bool = False, verbose: int = 0)
Bases: openrl.utils.callbacks.callbacks.BaseCallback
Callback for saving a model every save_freq calls to env.step(). By default, it only saves model checkpoints; pass save_replay_buffer=True to also save replay buffer checkpoints.
Warning
When using multiple environments, each call to env.step() will effectively correspond to n_envs steps. To account for that, you can use save_freq = max(save_freq // n_envs, 1).
- Parameters
save_freq -- Save checkpoints every save_freq calls of the callback.
save_path -- Path to the folder where the model will be saved.
name_prefix -- Common prefix to the saved models.
save_replay_buffer -- Save the model replay buffer.
verbose -- Verbosity level: 0 for no output, 2 for indicating when saving model checkpoint
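The adjustment recommended in the warning is plain integer arithmetic. A short worked example follows, where per_call_frequency is a hypothetical helper name introduced only for illustration:

```python
def per_call_frequency(freq_in_timesteps: int, n_envs: int) -> int:
    """Convert a frequency in timesteps into a frequency in callback calls."""
    # Each callback call covers n_envs timesteps; never go below one call.
    return max(freq_in_timesteps // n_envs, 1)


# Save roughly every 10,000 timesteps when running 8 environments in parallel.
save_freq = per_call_frequency(10_000, n_envs=8)

# A target smaller than n_envs is clamped to one call.
clamped = per_call_frequency(4, n_envs=8)
```

With 8 environments, a target of 10,000 timesteps becomes 1,250 callback calls, and the max(..., 1) guard keeps the frequency from collapsing to zero.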
openrl.utils.callbacks.eval_callback module
- class openrl.utils.callbacks.eval_callback.EvalCallback(eval_env: Union[str, Dict[str, Any], gymnasium.core.Env, openrl.envs.vec_env.base_venv.BaseVecEnv], callbacks_on_new_best: Optional[Union[List[Dict[str, Any]], Dict[str, Any], openrl.utils.callbacks.callbacks.BaseCallback]] = None, callbacks_after_eval: Optional[Union[List[Dict[str, Any]], Dict[str, Any], openrl.utils.callbacks.callbacks.BaseCallback]] = None, n_eval_episodes: int = 5, eval_freq: int = 10000, log_path: Optional[Union[str, pathlib.Path]] = None, best_model_save_path: Optional[Union[str, pathlib.Path]] = None, deterministic: bool = True, render: bool = False, asynchronous: bool = True, verbose: int = 1, warn: bool = True, stop_logic: str = 'OR', close_env_at_end: bool = True)
Bases: openrl.utils.callbacks.callbacks.EventCallback
Callback for evaluating an agent.
Warning
When using multiple environments, each call to env.step() will effectively correspond to n_envs steps. To account for that, you can use eval_freq = max(eval_freq // n_envs, 1).
- Parameters
eval_env -- The environment used for initialization.
callbacks_on_new_best -- Callback to trigger when there is a new best model according to the mean_reward.
callbacks_after_eval -- Callback to trigger after every evaluation.
n_eval_episodes -- The number of episodes used to test the agent.
eval_freq -- Evaluate the agent every eval_freq calls of the callback.
log_path -- Path to a folder where the evaluations (evaluations.npz) will be saved. It will be updated at each evaluation.
best_model_save_path -- Path to a folder where the best model according to performance on the eval env will be saved.
deterministic -- Whether the evaluation should use deterministic or stochastic actions.
render -- Whether to render the environment during evaluation.
verbose -- Verbosity level: 0 for no output, 1 for indicating information about evaluation results
warn -- Passed to evaluate_policy (warns if eval_env has not been wrapped with a Monitor wrapper).
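The "new best model according to the mean_reward" rule can be sketched as simple bookkeeping over evaluation results. This is an assumption about the general idea rather than OpenRL's implementation: BestRewardTracker is a hypothetical class, and the reward lists are made-up numbers standing in for n_eval_episodes = 5 episode returns.

```python
from statistics import mean
from typing import List


class BestRewardTracker:
    """Hypothetical bookkeeping for 'new best mean reward' events."""

    def __init__(self) -> None:
        self.best_mean_reward = float("-inf")
        self.n_new_best = 0  # how many times a new best was recorded

    def update(self, episode_rewards: List[float]) -> bool:
        """Return True when this evaluation sets a new best mean reward."""
        mean_reward = mean(episode_rewards)
        if mean_reward > self.best_mean_reward:
            self.best_mean_reward = mean_reward
            self.n_new_best += 1
            return True  # this is where an on-new-best callback would fire
        return False


tracker = BestRewardTracker()
evaluations = [
    [1.0, 2.0, 3.0, 4.0, 5.0],  # mean 3.0: first evaluation, new best
    [2.0, 2.0, 2.0, 2.0, 2.0],  # mean 2.0: no improvement
    [4.0, 4.0, 4.0, 4.0, 4.0],  # mean 4.0: new best
]
new_best_flags = [tracker.update(rewards) for rewards in evaluations]
```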
openrl.utils.callbacks.processbar_callback module
- class openrl.utils.callbacks.processbar_callback.ProgressBarCallback
Bases: openrl.utils.callbacks.callbacks.BaseCallback
Display a progress bar when training an SB3 agent, using the tqdm and rich packages.
openrl.utils.callbacks.stop_callback module
- class openrl.utils.callbacks.stop_callback.StopTrainingOnMaxEpisodes(max_episodes: int, verbose: int = 0)
Bases: openrl.utils.callbacks.callbacks.BaseCallback
Stop the training once a maximum number of episodes have been played.
For multiple environments, it presumes that the desired behavior is for the agent to train on each env for max_episodes, and therefore for max_episodes * n_envs episodes in total.
- Parameters
max_episodes -- Maximum number of episodes to stop training.
verbose -- Verbosity level: 0 for no output, 1 for indicating information about when training ended by reaching max_episodes
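The episode budget for multiple environments is the product described above. A minimal sketch of the stopping test, with should_continue as a hypothetical helper name and the episode count assumed to be summed across all environments:

```python
def should_continue(n_episodes: int, max_episodes: int, n_envs: int) -> bool:
    """Continue until the total episode count reaches max_episodes * n_envs."""
    total_budget = max_episodes * n_envs
    return n_episodes < total_budget


# With max_episodes=10 and 4 environments, the total budget is 40 episodes.
still_training = should_continue(39, max_episodes=10, n_envs=4)
finished = should_continue(40, max_episodes=10, n_envs=4)
```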
- class openrl.utils.callbacks.stop_callback.StopTrainingOnNoModelImprovement(max_no_improvement_evals: int, min_evals: int = 0, verbose: int = 1)
Bases: openrl.utils.callbacks.callbacks.BaseCallback
Stop the training early if there is no new best model (new best mean reward) after more than N consecutive evaluations.
A minimum number of evaluations can be required before evaluations without improvement start being counted.
It must be used with the EvalCallback.
- Parameters
max_no_improvement_evals -- Maximum number of consecutive evaluations without a new best model.
min_evals -- Number of evaluations before starting to count evaluations without improvement.
verbose -- Verbosity level: 0 for no output, 1 for indicating when training ended because no new best model
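The counting rule described above ("more than N consecutive evaluations", with a warm-up of min_evals) can be sketched as follows. NoImprovementCounter is a hypothetical class written for illustration; it tracks the best mean reward itself, whereas the documented callback is stated to rely on EvalCallback for that.

```python
class NoImprovementCounter:
    """Hypothetical sketch of the 'no new best model' stopping rule."""

    def __init__(self, max_no_improvement_evals: int, min_evals: int = 0):
        self.max_no_improvement_evals = max_no_improvement_evals
        self.min_evals = min_evals
        self.n_evals = 0
        self.no_improvement_evals = 0
        self.best = float("-inf")

    def after_eval(self, mean_reward: float) -> bool:
        """Return True to continue training, False to stop."""
        self.n_evals += 1
        continue_training = True
        # Only start counting once the warm-up evaluations have passed.
        if self.n_evals > self.min_evals:
            if mean_reward > self.best:
                self.no_improvement_evals = 0
            else:
                self.no_improvement_evals += 1
                if self.no_improvement_evals > self.max_no_improvement_evals:
                    continue_training = False
        self.best = max(self.best, mean_reward)
        return continue_training


counter = NoImprovementCounter(max_no_improvement_evals=2, min_evals=1)
decisions = [counter.after_eval(r) for r in [1.0, 2.0, 2.0, 1.5, 1.0]]
```

In this run the best mean reward (2.0) is set at the second evaluation; the next three evaluations fail to beat it, and the third consecutive miss exceeds the limit of 2, so the final decision is to stop.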
- class openrl.utils.callbacks.stop_callback.StopTrainingOnRewardThreshold(reward_threshold: float, verbose: int = 0)
Bases: openrl.utils.callbacks.callbacks.BaseCallback
Stop the training once a threshold in episodic reward has been reached (i.e. when the model is good enough).
It must be used with the EvalCallback.
- Parameters
reward_threshold -- Minimum expected reward per episode to stop training.
verbose -- Verbosity level: 0 for no output, 1 for indicating when training ended because episodic reward threshold reached
Module contents
- class openrl.utils.callbacks.CallbackFactory
Bases: object
- static get_callback(callback: Dict[str, Any]) -> openrl.utils.callbacks.callbacks.BaseCallback
- static get_callbacks(callbacks: Union[Dict[str, Any], List[Dict[str, Any]]], stop_logic: str = 'OR') -> openrl.utils.callbacks.callbacks.CallbackList
- static register(id: str, callback_class: Type[openrl.utils.callbacks.callbacks.BaseCallback])