LunarLander (Gymnasium)
PPO experiments and learning notes.
Overview
- Goal: Train an agent that lands consistently with high reward (Gymnasium's documentation treats an episode scoring at least 200 points as a solution)
- Algorithm: PPO (Proximal Policy Optimization)
- Tracking: TensorBoard curves and diagnostics
- Repo: Coming soon
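For readers new to PPO, the core of the algorithm listed above is its clipped surrogate loss, which can be sketched in plain Python. This is only a minimal illustration: the function name `ppo_clip_loss`, its arguments, and the default clip range of 0.2 are assumptions made for the example, not settings taken from these experiments.

```python
import math


def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Negated PPO clipped surrogate objective, averaged over samples.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same samples.
    clip_eps: clip range; 0.2 is an assumed default for this sketch.
    """
    total = 0.0
    for nlp, olp, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(nlp - olp)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    # Negate so that minimizing the loss maximizes the surrogate objective.
    return -total / len(advantages)
```

The clipping limits how much a single update can gain from pushing the policy away from the one that collected the data: once the ratio leaves the clip range in the direction the advantage favors, the loss stops improving.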
Results
Coming soon.