Stable Baselines3

Reinforcement Learning

Documents

The official documentation is linked below (as of November 2023).

Stable-Baselines3 Docs - Reliable Reinforcement Learning Implementations — Stable Baselines3 2.5.0a0 documentation

Environments (training environments)

GitHub - DLR-RM/rl-baselines3-zoo: A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.

Classic Control Environments

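Well-known environments such as CartPole (an inverted pendulum) are provided.

The suffix at the end of each environment ID, such as "-v0", appears to denote that environment's version.

This is presumably for the sake of experimental reproducibility.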
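
As a small illustration of this naming convention, the sketch below splits an environment ID into its base name and version number. The helper `split_env_id` is a hypothetical name used only here (it is not part of Gymnasium or Stable Baselines3), and it assumes IDs of the form `<Name>-v<integer>`.

```python
import re

# Hypothetical helper for illustration; not part of Gymnasium or Stable Baselines3.
# Assumes IDs shaped like "<Name>-v<integer>", e.g. "CartPole-v1".
_ENV_ID = re.compile(r"^(?P<name>.+)-v(?P<version>\d+)$")


def split_env_id(env_id: str) -> tuple[str, int]:
    """Split an environment ID into (base name, version number)."""
    match = _ENV_ID.match(env_id)
    if match is None:
        raise ValueError(f"no version suffix found in {env_id!r}")
    return match.group("name"), int(match.group("version"))


if __name__ == "__main__":
    for env_id in ("CartPole-v1", "MountainCar-v0", "MountainCarContinuous-v0"):
        name, version = split_env_id(env_id)
        print(f"{env_id}: name={name}, version={version}")
```

Recording the full ID, version included, alongside experiment results makes it clear which definition of the environment a result refers to.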

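| RL Algo | CartPole-v1 | MountainCar-v0 | Acrobot-v1 | Pendulum-v1 | MountainCarContinuous-v0 |
|---|---|---|---|---|---|
| ARS | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| A2C | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| PPO | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| DQN | ✔️ | ✔️ | ✔️ | N/A | N/A |
| QR-DQN | ✔️ | ✔️ | ✔️ | N/A | N/A |
| DDPG | N/A | N/A | N/A | ✔️ | ✔️ |
| SAC | N/A | N/A | N/A | ✔️ | ✔️ |
| TD3 | N/A | N/A | N/A | ✔️ | ✔️ |
| TQC | N/A | N/A | N/A | ✔️ | ✔️ |
| TRPO | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |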
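
The Classic Control matrix above can also be kept as data and queried. The following is a minimal sketch that transcribes the table into a dictionary; `ENVS`, `SUPPORT`, and `algos_for` are illustrative names and not part of rl-baselines3-zoo. The N/A entries line up with action-space type: DQN and QR-DQN handle only discrete actions, while DDPG, SAC, TD3, and TQC handle only continuous ones.

```python
# Illustrative transcription of the Classic Control table. The names ENVS,
# SUPPORT and algos_for are hypothetical, not an rl-baselines3-zoo API.
ENVS = (
    "CartPole-v1",
    "MountainCar-v0",
    "Acrobot-v1",
    "Pendulum-v1",
    "MountainCarContinuous-v0",
)

# One entry per column of ENVS: True is a check mark, None encodes "N/A".
ALL = (True, True, True, True, True)
DISCRETE_ONLY = (True, True, True, None, None)
CONTINUOUS_ONLY = (None, None, None, True, True)

SUPPORT = {
    "ARS": ALL,
    "A2C": ALL,
    "PPO": ALL,
    "DQN": DISCRETE_ONLY,
    "QR-DQN": DISCRETE_ONLY,
    "DDPG": CONTINUOUS_ONLY,
    "SAC": CONTINUOUS_ONLY,
    "TD3": CONTINUOUS_ONLY,
    "TQC": CONTINUOUS_ONLY,
    "TRPO": ALL,
}


def algos_for(env_id: str) -> list[str]:
    """Return the algorithms with a check mark for env_id, sorted by name."""
    column = ENVS.index(env_id)  # raises ValueError for an unknown ID
    return sorted(algo for algo, row in SUPPORT.items() if row[column])


if __name__ == "__main__":
    print(algos_for("CartPole-v1"))
    print(algos_for("Pendulum-v1"))
```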

Box2D Environments

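Columns, in order: BipedalWalker-v3, LunarLander-v2, LunarLanderContinuous-v2, BipedalWalkerHardcore-v3, CarRacing-v0. Each row lists its entries in that order, with empty cells omitted.

| RL Algo | Entries |
|---|---|
| ARS | ✔️ ✔️ |
| A2C | ✔️ ✔️ ✔️ ✔️ |
| PPO | ✔️ ✔️ ✔️ ✔️ |
| DQN | N/A ✔️ N/A N/A N/A |
| QR-DQN | N/A ✔️ N/A N/A N/A |
| DDPG | ✔️ N/A ✔️ |
| SAC | ✔️ N/A ✔️ ✔️ |
| TD3 | ✔️ N/A ✔️ ✔️ |
| TQC | ✔️ N/A ✔️ ✔️ |
| TRPO | ✔️ ✔️ |

(Figure: Lunar Lander)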

Atari Games

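| RL Algo | BeamRider | Breakout | Enduro | Pong | Qbert | Seaquest | SpaceInvaders |
|---|---|---|---|---|---|---|---|
| A2C | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| PPO | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| DQN | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| QR-DQN | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |

(Figure: Atari Games)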