LunarLander (Gymnasium)

PPO experiments and learning notes.

Overview

Goal: Train an agent that lands consistently with high reward (Gymnasium counts an episode scoring at least 200 points as a solution)
Algorithm: PPO
Tracking: TensorBoard curves and diagnostics
Repo: Coming soon
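
Since the repo is not published yet, here is a minimal sketch of the clipped surrogate loss that PPO is built around, written in plain Python so it runs without any dependencies. It is not the code used for these experiments: the function name `ppo_clip_loss`, its argument names, and the default `clip_eps=0.2` are assumptions made for illustration.

```python
import math


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate loss over a batch (a value to minimize).

    Hypothetical helper for illustration only; names and the 0.2 default
    are assumptions, not taken from the experiments on this page.
    """
    if not advantages:
        raise ValueError("batch must not be empty")

    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probs.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic choice: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)

    # PPO maximizes the surrogate, so the loss is its negative mean.
    return -total / len(advantages)


if __name__ == "__main__":
    # One sample whose action became twice as likely, with positive advantage:
    # the gain is capped at (1 + clip_eps) * advantage.
    print(ppo_clip_loss([math.log(2.0)], [0.0], [1.0]))
```

The clipping only limits how much the policy can gain from moving the ratio away from 1. When the advantage is negative and the ratio has grown, the unclipped term is the smaller one, so the penalty is applied in full.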

Results

Coming soon.