openrl.algorithms package
Submodules
openrl.algorithms.a2c module
- class openrl.algorithms.a2c.A2CAlgorithm(cfg, init_module, agent_num: int = 1, device: Union[str, torch.device] = 'cpu')
openrl.algorithms.base_algorithm module
openrl.algorithms.behavior_cloning module
- class openrl.algorithms.behavior_cloning.BCAlgorithm(cfg, init_module, agent_num: int = 1, device: Union[str, torch.device] = 'cpu')
openrl.algorithms.ddpg module
- class openrl.algorithms.ddpg.DDPGAlgorithm(cfg, init_module, agent_num: int = 1, device: Union[str, torch.device] = 'cpu')
Bases: openrl.algorithms.base_algorithm.BaseAlgorithm
- cal_value_loss(value_normalizer, values, value_preds_batch, return_batch, active_masks_batch)
- prepare_actor_loss(obs_batch, next_obs_batch, rnn_states_batch, actions_batch, masks_batch, action_masks_batch, value_preds_batch, rewards_batch, active_masks_batch, turn_on)
openrl.algorithms.dqn module
- class openrl.algorithms.dqn.DQNAlgorithm(cfg, init_module, agent_num: int = 1, device: Union[str, torch.device] = 'cpu')
openrl.algorithms.gail module
openrl.algorithms.mat module
openrl.algorithms.ppo module
- class openrl.algorithms.ppo.PPOAlgorithm(cfg, init_module, agent_num: int = 1, device: Union[str, torch.device] = 'cpu')
Bases: openrl.algorithms.base_algorithm.BaseAlgorithm
- cal_value_loss(value_normalizer, values, value_preds_batch, return_batch, active_masks_batch)
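The page does not describe what cal_value_loss computes. As a rough illustration only, the sketch below shows a clipped value loss of the kind commonly used with PPO, taking arguments named after those in the signature. The function name clipped_value_loss and the clip_param argument are hypothetical, the value_normalizer step is left out, and plain Python lists stand in for tensors. It is not OpenRL's implementation.

```python
from typing import Sequence


def clipped_value_loss(
    values: Sequence[float],
    value_preds_batch: Sequence[float],
    return_batch: Sequence[float],
    active_masks_batch: Sequence[float],
    clip_param: float = 0.2,  # hypothetical parameter, not taken from the docs
) -> float:
    """Illustrative PPO-style clipped value loss (assumed behaviour)."""
    total = 0.0
    weight = 0.0
    for v, v_old, ret, mask in zip(
        values, value_preds_batch, return_batch, active_masks_batch
    ):
        # Keep the new prediction within clip_param of the old prediction.
        v_clipped = v_old + max(-clip_param, min(clip_param, v - v_old))
        err_clipped = (ret - v_clipped) ** 2
        err_original = (ret - v) ** 2
        # Use the larger of the two squared errors, counted only where the
        # active mask is non-zero.
        total += max(err_clipped, err_original) * mask
        weight += mask
    # Average over active entries; return 0.0 if every entry is masked out.
    return total / weight if weight > 0 else 0.0
```

In this sketch, entries whose active mask is 0 contribute nothing to the loss or to the averaging weight.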
openrl.algorithms.sac module
- class openrl.algorithms.sac.SACAlgorithm(cfg, init_module, agent_num: int = 1, device: Union[str, torch.device] = 'cpu')
Bases: openrl.algorithms.base_algorithm.BaseAlgorithm
- cal_value_loss(value_normalizer, values, value_preds_batch, return_batch, active_masks_batch)
- prepare_actor_loss(obs_batch, next_obs_batch, rnn_states_batch, actions_batch, masks_batch, action_masks_batch, value_preds_batch, rewards_batch, active_masks_batch, turn_on)
openrl.algorithms.vdn module
- class openrl.algorithms.vdn.VDNAlgorithm(cfg, init_module, agent_num: int = 1, device: Union[str, torch.device] = 'cpu')
Bases: openrl.algorithms.base_algorithm.BaseAlgorithm
- cal_value_loss(value_normalizer, values, value_preds_batch, return_batch, active_masks_batch)