Seattle-based software engineer Peter Whidden has trained an artificial intelligence (AI) algorithm to navigate and play Pokémon Red. Over the course of several years, the AI has logged more than 50,000 hours of gameplay. The project has drawn wide attention: Whidden’s YouTube video chronicling the AI’s progress amassed over 2.2 million views within nine days.
Key Takeaway
Seattle-based software engineer Peter Whidden has trained an AI algorithm to play Pokémon Red, accumulating over 50,000 hours of gameplay. His YouTube video chronicling the AI’s journey has drawn over 2.2 million views. The AI’s peculiar in-game behaviors have captivated viewers and showcase the potential of reinforcement learning in gaming.
AI’s Engagement and Interest
Whidden expressed his delight at the level of interest and engagement from people who are keen to create and experiment with similar AI algorithms. He has uploaded the code used in this project to GitHub, along with instructions on how to operate and train the AI. One fan successfully applied the code to Pokémon Crystal, another retro Game Boy installment.
Reinforcement Learning Model
The AI uses a reinforcement learning model that awards it points for leveling up Pokémon, exploring new areas, winning battles, and defeating gym leaders. These incentives do not always line up with actual progress through the game, but the AI’s failures have a peculiar charm that resonates with viewers and helped Whidden’s video go viral.
For instance, in one attempt, the AI gets stuck in the first area of the game, Pallet Town, fixating on the animated water and observing the NPCs’ movements without proceeding further. It appears to find these simple experiences novel and engaging, suggesting an appreciation for the beauty of the Kanto region or, perhaps, an ethical stance against Pokémon battles.
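The reward scheme described above can be sketched as a weighted score over game progress, with the reward for each step being the change in that score. This is only an illustration: the `GameState` fields, the weights, and the function names are assumptions made for this example and are not taken from Whidden’s published code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GameState:
    """Hypothetical snapshot of the progress signals the article lists."""
    level_sum: int     # sum of the levels of all Pokémon in the party
    areas_seen: int    # number of distinct areas explored so far
    battles_won: int   # battles won so far
    badges: int        # gym leaders defeated so far


# Illustrative weights only; the real project's values are not given in the article.
WEIGHTS = {
    "level_sum": 1.0,
    "areas_seen": 0.5,
    "battles_won": 2.0,
    "badges": 10.0,
}


def total_score(state: GameState) -> float:
    """Weighted sum of all progress signals."""
    return sum(weight * getattr(state, name) for name, weight in WEIGHTS.items())


def step_reward(prev: GameState, curr: GameState) -> float:
    """Reward for one step: how much the total score changed."""
    return total_score(curr) - total_score(prev)


# Example: one level gained and one new area discovered in a single step.
before = GameState(level_sum=5, areas_seen=1, battles_won=0, badges=0)
after = GameState(level_sum=6, areas_seen=2, battles_won=0, badges=0)
reward = step_reward(before, after)
```

Because the agent only sees this number, anything that raises the score counts as "good", whether or not it moves the game forward. That is one way a mismatch between incentives and real progress can arise.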
Curiosity and Behavioral Associations
Whidden points out that the AI’s enjoyment of the scenery works against the goal of making progress. He likens this to a paradox familiar from real life: curiosity often leads to significant discoveries, but it also exposes individuals to distractions and potential adverse consequences. The AI’s behavior reflects this dilemma.
Furthermore, the AI exhibits an association formed through a negative reward (a penalty, rather than negative reinforcement in the strict sense). In one incident at a Pokémon Center, the AI mistakenly deposits a Pokémon into storage, which sharply lowers the sum of all Pokémon levels in its party and therefore its reward. This single detrimental event has a lasting impact, causing the AI to avoid Pokémon Centers altogether in subsequent games.
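To see why a single deposit registers so strongly, consider a reward term tied to the sum of party levels. The sketch below is hypothetical: the function name and the level values are invented for illustration, and the real project may compute this term differently.

```python
def level_sum_reward(prev_levels: list[int], curr_levels: list[int]) -> int:
    """Change in the summed party levels between two consecutive steps."""
    return sum(curr_levels) - sum(prev_levels)


# Made-up party levels, before and after depositing the level-9 Pokémon.
party_before = [12, 9, 7]
party_after = [12, 7]

penalty = level_sum_reward(party_before, party_after)  # negative: levels were lost

# For comparison, an ordinary level-up changes the sum by only +1.
party_leveled = [13, 9, 7]
gain = level_sum_reward(party_before, party_leveled)
```

With these numbers, one deposit costs as much as nine level-ups earn, so a learner that links the loss to the building where it happened has a strong incentive to stay away.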
AI Limitations and Progress
Although its behavior can resemble human emotions or reactions, the AI remains a computer program. It cannot read or interpret the in-game dialogue, which causes difficulties in the early stages of the game. To work around this, Whidden modified the program to start from the point after a required item has been delivered to the Pokémon Professor in Pallet Town. Additionally, the AI is given Squirtle as its starter Pokémon, improving its chances of success in the initial stages.
In the aforementioned YouTube video, the AI progresses as far as Mt. Moon, located between the first and second gyms. Cave navigation is notoriously difficult, even for human players. However, Whidden’s more recent adjustments to the reward mechanisms, together with a different learning algorithm, allowed the AI to get through Mt. Moon and reach Cerulean City.
Whidden’s success, and his clear explanation of complex concepts through the familiar medium of Pokémon, have garnered considerable attention. Other researchers have used reinforcement learning to study AI’s capabilities in gaming, but this project resonates because Pokémon enthusiasts worldwide can relate to it.