14. 2D bouldering game with RL agent
Prologue
I have been bouldering a few times per week for approximately one year at this point, and I have always wondered how I could apply machine learning to it. The ideal project would be to create a 3D scan of a route and let a digital robot solve it for me. It would be cool to see if it could determine the “gimmick” of a route, such as having to hang by your feet to get into the right end position:

I have no idea how that would turn out, but it was probably a bit ambitious for now, so my first shot at it is instead a 2D bouldering “game”/environment with a reinforcement-learning agent playing to get up as high as possible within a time limit. There are many small details, so I’ll intentionally simplify the descriptions - especially of the agent-environment interaction.
Bouldering Environment
The bouldering environment is built in PyGame with Gymnasium so that it works with standard reinforcement-learning models. It consists of a collection of circles that correspond to the holds, as well as our stick-figure player. It’s not a game as such, but rather a control task, as there are way too many inputs, and the inputs are continuous (in theory) rather than discrete.
The player has control over 9 joints: shoulders, elbows, hips, knees, and an abdominal joint so that the torso is not stiff. The following figure highlights joints (red) and hands/feet used to grab holds (blue):

Getting the player to feel just a bit natural was very hard, as this is not built on top of any physics engine. Instead, I had to tune different parameters such as strength, torque, etc. for each limb/joint. I also introduced stamina so that the player has to use its entire body and can’t rely on its arms alone. This also mimics real life in a way.
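To give a rough idea of what a stamina mechanic like this can look like, here is a minimal sketch. It is not the actual code from the project: the `Limb` class, the drain and recovery rates, and the choice to scale torque linearly with stamina are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Limb:
    """Hypothetical limb with a stamina value kept in [0, 1]."""

    max_torque: float              # torque available at full stamina (assumed unit)
    stamina: float = 1.0
    drain_rate: float = 0.02       # assumed stamina lost per tick at full effort
    recovery_rate: float = 0.005   # assumed stamina regained per tick at rest

    def step(self, effort: float) -> float:
        """Advance one tick with effort in [-1, 1] and return the applied torque."""
        effort = max(-1.0, min(1.0, effort))
        # A tired limb delivers less torque for the same commanded effort.
        torque = effort * self.max_torque * self.stamina
        # Drain in proportion to the effort, recover in proportion to the rest.
        load = abs(effort)
        self.stamina += self.recovery_rate * (1.0 - load) - self.drain_rate * load
        self.stamina = max(0.0, min(1.0, self.stamina))
        return torque
```

With something like this, an arm that pulls at full effort every tick gets weaker and weaker, so a policy that spreads the load across arms and legs ends up with more usable torque overall.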
The mechanics of grabbing a hold were also weird to pull off, because fixing for instance a hand onto a hold means that everything else has to rotate around that point. As you’ll see later, the holds have an invisible field that enables easy attachment and movement within it. This was the best solution I could come up with that gave reasonable results.
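The two ideas in that paragraph, an attachment field around each hold and rotating the body around a fixed hand or foot, can be sketched roughly like this. The function names and the `grab_radius` value are made up for the example, and the real environment almost certainly handles more cases than this.

```python
import math


def find_grabbable_hold(limb_end, holds, grab_radius=20.0):
    """Return the closest hold within grab_radius of limb_end, or None.

    limb_end and each hold are (x, y) tuples; grab_radius is an assumed
    size for the invisible field around a hold.
    """
    best, best_dist = None, grab_radius
    for hold in holds:
        dist = math.dist(limb_end, hold)
        if dist <= best_dist:
            best, best_dist = hold, dist
    return best


def rotate_about(point, pivot, angle):
    """Rotate point around pivot by angle (radians, standard 2D rotation matrix)."""
    dx, dy = point[0] - pivot[0], point[1] - pivot[1]
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return (pivot[0] + cos_a * dx - sin_a * dy,
            pivot[1] + sin_a * dx + cos_a * dy)
```

Once a hand is attached, its hold becomes the pivot, and every other body point can be passed through `rotate_about` with the same angle to swing the whole figure around the grip.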
Reinforcement-Learning Agent
Stable-Baselines3 has the Proximal Policy Optimization (PPO) model, which I’ll use to teach an agent how to move this stick figure up. The overall goal is to get as high up as possible, hence it doesn’t use a simple win/lose reward signal. The initial reward was set as the height of the player, but as long as the player didn’t fall down, it could rack up points while staying at low heights. I then capped the episode length at 1200 ticks (frames), but it was clear that the time limit wasn’t what kept it from climbing. I ended up adding a small penalty for staying idle, as well as a few rewards for making contact with and grabbing onto holds.
The input for the agent was NOT the pure environment state (observation), as that would be way too much information and noise. Instead, it was a reduced version that had information about the closest 4 (or something) holds for each hand and foot, as well as all joint angles and staminas.
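Put together, the reward described above could look something like the sketch below. Only the ingredients (height, idle penalty, contact and grab rewards) and the 1200-tick limit come from the text. The function names, the weights, and how "idle" is detected are assumptions for illustration.

```python
MAX_TICKS = 1200  # episode length mentioned in the text


def shaped_reward(height, is_idle, made_contact, grabbed_hold,
                  height_weight=1.0, idle_penalty=0.05,
                  contact_bonus=0.1, grab_bonus=0.5):
    """Per-tick reward: height term plus small shaping terms (assumed weights)."""
    reward = height_weight * height
    if is_idle:
        reward -= idle_penalty    # discourage hanging around without moving
    if made_contact:
        reward += contact_bonus   # a hand or foot touched a hold this tick
    if grabbed_hold:
        reward += grab_bonus      # a hand or foot attached to a hold this tick
    return reward


def is_truncated(tick):
    """End the episode once the time limit is reached."""
    return tick >= MAX_TICKS
```

The shaping terms are kept small relative to the height term so that the agent is nudged toward touching and grabbing holds without making that the main objective.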
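Here is a minimal sketch of how such a reduced observation could be assembled. The exact layout used in the project isn't described, so the relative (dx, dy) offsets, the zero padding, the ordering, and the helper names are all assumptions.

```python
import math


def nearest_hold_offsets(limb_end, holds, k=4):
    """Return (dx, dy) offsets to the k closest holds as a flat list of 2*k values."""
    ranked = sorted(holds, key=lambda h: math.dist(limb_end, h))[:k]
    offsets = []
    for hx, hy in ranked:
        offsets.extend((hx - limb_end[0], hy - limb_end[1]))
    # Pad with zeros when fewer than k holds exist so the size stays fixed.
    offsets.extend([0.0] * (2 * k - len(offsets)))
    return offsets


def build_observation(limb_ends, holds, joint_angles, staminas, k=4):
    """Concatenate hold offsets per hand/foot, joint angles, and staminas."""
    obs = []
    for end in limb_ends:
        obs.extend(nearest_hold_offsets(end, holds, k))
    obs.extend(joint_angles)
    obs.extend(staminas)
    return [float(v) for v in obs]
```

For two hands and two feet with k = 4, 9 joint angles, and (assuming one stamina value per limb) 4 staminas, this gives a fixed-size vector of 4 * 8 + 9 + 4 = 45 numbers, no matter how many holds are on the wall.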
Results
There were many iterations of experimenting with small details and visual changes. I’ll just show them in order.
This was the very first thing produced. The player is too small, and it automatically attaches to the nearest holds using a wire:

This next one has the player looking more like the drawing from before, even though the legs cross in a weird way. It also extended each limb when it grabbed a hold:

Then I disabled the limb lengthening, giving it a more natural look:

When I tried to add more realistic torque on the joints, it suddenly had a hard time standing properly:

This is after some other fixes, including more natural movement instead of almost teleporting the limbs:

I get that it’s probably hard to tell which changes were made based on single screenshots. Sorry about that. BUT I do have some videos of the newer iterations (I’m skipping many steps now). Major visual changes: the holds visible to the agent are highlighted in green, limbs become more red as they become fatigued, and the information at the top helps us understand stuff about the environment such as height and tick number.
This one is the first time the agent has shown promise:
The next one climbs more naturally, swinging left and right as it ascends. It also utilizes its legs much more:
As it has learned to climb somewhat properly at this point, I wanted to test it on a wall with way fewer holds after some more training, which it does wonderfully:
Conclusion
I know what you’re thinking. “Hey! That looked super weird?!” Yes, the physics are not on point here, and the player has superhuman strength. Buuuuut that’s a small technicality, and it DID learn how to climb given the mechanics of the environment. And it’s really fun to look at. I call this a success!
Next time I should try a proper game engine with good physics (Unity or Unreal Engine), and maybe even make it 3D. But that sounds like a longer project.