Reinforcement Learning for Quadrupedal Locomotion
Training robust walking policies for the Unitree Go2 robot using Proximal Policy Optimization (PPO)
Link to GitHub Repo
Demo
Description
- Trained a locomotion policy for the Unitree Go2 in Isaac Lab with Proximal Policy Optimization (PPO) to produce robust gait controllers
- Implemented a custom actuator friction model that adds static and viscous friction terms to the low-level PD control loop, narrowing the sim-to-real gap by accounting explicitly for hardware nonlinearities
- Integrated Raibert-heuristic gait shaping and dynamic foot-clearance constraints into the reward formulation, producing stable trotting gaits that track planar velocity commands under stochastic joint disturbances
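
The friction and gait-shaping ideas above can be illustrated with a minimal, framework-free sketch. It is not the project's code: the function names, the tanh smoothing of the static term, and every gain and coefficient are placeholder assumptions. The friction torque is subtracted from the PD output here, as one would when simulating friction losses; a controller that compensates for friction would add a similar term as feedforward instead. The Raibert function uses the classic foot-placement form (half the stance travel plus a velocity-error correction), which a reward term could then compare against the actual footstep.

```python
import math


def friction_torque(joint_vel, static_coeff=0.3, viscous_coeff=0.02, vel_scale=0.1):
    """Static plus viscous friction torque for one joint.

    All coefficients are hypothetical placeholders, not identified values.
    tanh(joint_vel / vel_scale) is a smooth stand-in for sign(joint_vel),
    so the static term does not jump discontinuously at zero velocity.
    """
    static_term = static_coeff * math.tanh(joint_vel / vel_scale)
    viscous_term = viscous_coeff * joint_vel
    return static_term + viscous_term


def pd_torque_with_friction(q_des, q, qd, kp=25.0, kd=0.5, effort_limit=20.0):
    """PD torque with the friction torque subtracted, then clipped.

    kp, kd, and effort_limit are assumed placeholder values.
    """
    tau_pd = kp * (q_des - q) - kd * qd
    tau = tau_pd - friction_torque(qd)
    return max(-effort_limit, min(effort_limit, tau))


def raibert_foot_offset(v, v_cmd, stance_duration=0.25, k_v=0.03):
    """Classic Raibert foot-placement offset along one planar axis.

    The first term centers the stance under the hip at the current speed;
    the second shifts the foothold to correct the velocity error.
    stance_duration and k_v are assumed placeholder values.
    """
    return 0.5 * stance_duration * v + k_v * (v - v_cmd)


if __name__ == "__main__":
    # One joint tracking a small position error while moving at 1 rad/s.
    print(pd_torque_with_friction(q_des=0.1, q=0.0, qd=1.0))
    # Desired foothold offset when the base moves slower than commanded.
    print(raibert_foot_offset(v=0.8, v_cmd=1.0))
```

In a vectorized simulator these scalar functions would operate on batched tensors across environments and joints, but the arithmetic per joint stays the same.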