Reinforcement Learning for Quadrupedal Locomotion | Yipeng Wang

Trained a locomotion policy for the Unitree Go2 in Isaac Lab, using Proximal Policy Optimization (PPO) to learn robust gait controllers
Implemented a custom actuator friction model that adds static and viscous friction terms to the low-level PD control loop, narrowing the sim-to-real gap by explicitly accounting for hardware nonlinearities
Integrated Raibert-heuristic gait shaping and dynamic foot-clearance constraints into the reward formulation, producing stable trotting gaits that track planar velocity commands under stochastic joint disturbances
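
The friction and gait-shaping terms described above can be illustrated with a minimal sketch. It is not the project's actual implementation: the function names, the tanh smoothing of the static term, the choice to subtract friction from the PD output, and every numeric default are assumptions chosen for illustration, not Go2 or Isaac Lab parameters.

```python
import math


def friction_torque(joint_vel, static_coeff=0.3, viscous_coeff=0.05, vel_scale=0.1):
    """Friction torque opposing joint motion (all coefficients are hypothetical).

    The static term uses tanh as a smooth stand-in for sign(joint_vel), which
    avoids a discontinuity at zero velocity. The viscous term grows linearly
    with joint velocity.
    """
    static_term = static_coeff * math.tanh(joint_vel / vel_scale)
    viscous_term = viscous_coeff * joint_vel
    return static_term + viscous_term


def pd_torque_with_friction(q_des, q, qd, kp=25.0, kd=0.5):
    """PD torque for one joint, reduced by the modeled friction loss.

    Assumes a zero desired joint velocity, so the damping term is -kd * qd.
    """
    tau_pd = kp * (q_des - q) - kd * qd
    return tau_pd - friction_torque(qd)


def raibert_foot_offset(v, v_cmd, stance_time=0.25, k_fb=0.03):
    """Raibert-style foot placement offset along one planar axis.

    Feedforward term: half the distance the body travels during stance.
    Feedback term: proportional to the velocity tracking error.
    A reward could penalize the distance between the actual footstep and
    this target; that reward shape is an assumption, not from the source.
    """
    return 0.5 * stance_time * v + k_fb * (v - v_cmd)


if __name__ == "__main__":
    # At rest the friction model contributes nothing, so only the P term acts.
    print(pd_torque_with_friction(q_des=0.1, q=0.0, qd=0.0))
    # When the body velocity matches the command, only the feedforward term remains.
    print(raibert_foot_offset(v=1.0, v_cmd=1.0))
```

Subtracting the friction torque inside the simulated control loop means the policy is trained against the same torque loss it would see on hardware, which is one plausible reading of how such a model narrows the sim-to-real gap.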