openrl.algorithms package

Submodules

openrl.algorithms.base_algorithm module

class openrl.algorithms.base_algorithm.BaseAlgorithm(cfg, init_module, agent_num: int, device=device(type='cpu'))
    Bases: abc.ABC

    prep_rollout()

    prep_training()

    abstract train(buffer, turn_on=True)

openrl.algorithms.ppo module

class openrl.algorithms.ppo.PPOAlgorithm(cfg, init_module, agent_num: int = 1, device: Union[str, torch.device] = 'cpu')
    Bases: openrl.algorithms.base_algorithm.BaseAlgorithm

    cal_value_loss(value_normalizer, values, value_preds_batch, return_batch, active_masks_batch)

    ppo_update(sample, turn_on=True)

    prepare_loss(critic_obs_batch, obs_batch, rnn_states_batch, rnn_states_critic_batch, actions_batch, masks_batch, available_actions_batch, old_action_log_probs_batch, adv_targ, value_preds_batch, return_batch, active_masks_batch, turn_on)

    to_single_np(input)

    train(buffer, turn_on=True)

Module contents
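
Example (illustrative sketch, not OpenRL code)

The listing above gives only signatures, so the following is a minimal sketch of the abstract-base-class pattern they imply: a base class with an abstract train(buffer, turn_on=True) method that a concrete algorithm overrides. It does not import openrl. The names SketchBaseAlgorithm and MeanRewardAlgorithm, the training flag, the behavior of prep_rollout() and prep_training(), and the meaning given to turn_on are all assumptions made for illustration; the real classes may behave differently.

```python
from abc import ABC, abstractmethod


class SketchBaseAlgorithm(ABC):
    """Hypothetical stand-in that mirrors the documented BaseAlgorithm signature."""

    def __init__(self, cfg, init_module, agent_num: int, device="cpu"):
        self.cfg = cfg
        self.module = init_module
        self.agent_num = agent_num
        self.device = device
        self.training = False  # assumed flag, not part of the documented API

    def prep_rollout(self):
        # Assumption: switch to evaluation mode before collecting data.
        self.training = False

    def prep_training(self):
        # Assumption: switch to training mode before an update.
        self.training = True

    @abstractmethod
    def train(self, buffer, turn_on=True):
        """Concrete algorithms must implement the update step."""


class MeanRewardAlgorithm(SketchBaseAlgorithm):
    """Toy subclass: 'training' only averages the rewards held in the buffer."""

    def train(self, buffer, turn_on=True):
        # Assumption: turn_on=False skips the update entirely.
        if not turn_on:
            return {"mean_reward": 0.0}
        return {"mean_reward": sum(buffer) / len(buffer)}


algo = MeanRewardAlgorithm(cfg={}, init_module=None, agent_num=1)
algo.prep_training()
info = algo.train([1.0, 2.0, 3.0])
print(info)
```

Because train() is declared abstract, instantiating SketchBaseAlgorithm directly raises TypeError; only subclasses that override train() can be constructed, which matches the "abstract" marker on BaseAlgorithm.train in the listing.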