On an otherwise uneventful Sunday night, a milestone was reached in the application of AI to complex, concrete problems. Let's recap for a second. OpenAI was founded by Elon Musk, the media's favourite billionaire, and Sam Altman, president of Y Combinator. The company aims to promote friendly AI research that benefits humanity while avoiding the existential risks associated with it. To that end, they have been applying AI to an array of problems, and have even built an open-source environment, free for anyone to contribute to, which provides several mini-games on which anyone can try out their algorithms and compare results against the community's.
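The environment suite referred to here is OpenAI Gym, and its whole interface boils down to two calls: `reset()` to start an episode and `step(action)` to advance it. The sketch below mimics that interface with a hypothetical `ToyEnv` stand-in (so it runs without Gym installed); the loop itself is the same one you would write against a real environment such as `gym.make("CartPole-v1")`.

```python
import random

class ToyEnv:
    """Hypothetical stand-in mimicking the Gym reset/step interface.

    A real Gym environment exposes the same two methods; this toy
    version just counts steps so the snippet has no dependencies.
    """
    def __init__(self, horizon=10):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t                        # initial observation

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 1 else 0.0  # reward the "good" action
        done = self.t >= self.horizon         # episode ends at the horizon
        return self.t, reward, done, {}       # obs, reward, done, info

env = ToyEnv()
obs, done, total_reward = env.reset(), False, 0.0
while not done:
    action = random.choice([0, 1])            # a real agent would choose here
    obs, reward, done, info = env.step(action)
    total_reward += reward
print(f"episode return: {total_reward}")
```

Any algorithm that can pick an action from an observation plugs into this loop, which is exactly what makes community comparisons on shared mini-games possible.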
In 2017, they decided to take on Dota 2, a highly complex, strategy-heavy and, more importantly, team-based video game. Each of these properties maps to a well-known issue in AI research: a highly complex environment implies that the set of possible "game states" is so large that it is impossible to explore fully; strategy-heavy implies that long-term and short-term investments must be balanced properly; finally, team-based implies collaboration between agents, possibly the most complicated of the three.
In August 2017, the first milestone was reached, when OpenAI's agent won multiple duels against professional players in a limited version of the game, with several features and characters missing. Of course, the game is not meant to be played that way, and although the achievement was considerable, it wasn't yet enough. They continued improving their AI, and even started beating amateur teams in 2018. Finally, on Sunday the 5th of August, a major event was organised by OpenAI to test their current level. Many of the previous restrictions were lifted, with more characters and features available. Most importantly, the game was now played in teams of 5, against humans all ranked above the 99.5th percentile of players.
Long story short, the professional players tried their best, and had some good moments, but overall they were heavily defeated by the team of AIs. At points the game even became frustrating for the humans, with some admirable attacks being blocked and even turned against them. Only in the last game, after the humans had conceded twice, was the AI given a heavy handicap in the form of a poor choice of characters, which led to an easier, though still hard-fought, victory for the humans.
So, how did this happen? It turns out there is no magic recipe. Most of the techniques used by OpenAI are not novel and are actually well known in the reinforcement learning community. One of the main engineers even stated that she believes the community does not give these simple methods enough credit. It is quite common, especially in machine learning, to search for ever more complex tools instead of refining those already known, and refining is exactly what the engineers at OpenAI did. Refining, improving, and training for a very long time were the ingredients of the win, with the current agents having trained for the equivalent of 160 human years.
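To give a concrete feel for how simple these well-known methods can be, here is tabular Q-learning, a textbook reinforcement learning algorithm, solving a trivial 5-state chain problem. To be clear, this is an illustration of the family of "simple, well-known methods", not OpenAI's actual training setup, and the toy environment is invented for this sketch.

```python
import random

# Toy MDP: a 5-state chain. Action 1 moves right, action 0 moves left;
# reaching state 4 ends the episode with a reward of 1.
N_STATES = 5
ACTIONS = (0, 1)
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1

def chain_step(state, action):
    next_state = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward, next_state == N_STATES - 1

random.seed(0)
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(500):                          # training episodes
    state, done = 0, False
    while not done:
        if random.random() < EPSILON:         # explore occasionally
            action = random.choice(ACTIONS)
        else:                                 # otherwise act greedily (random tie-break)
            best = max(Q[(state, a)] for a in ACTIONS)
            action = random.choice([a for a in ACTIONS if Q[(state, a)] == best])
        next_state, reward, done = chain_step(state, action)
        # The standard Q-learning update rule
        target = reward + GAMMA * max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (target - Q[(state, action)])
        state = next_state

# After training, the greedy policy always moves right, toward the goal.
policy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)  # → [1, 1, 1, 1]
```

The entire algorithm is a dozen lines; the gap between this and OpenAI's result is not conceptual novelty but engineering refinement and a staggering amount of training.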
Video games should not be seen as just games, but rather as a safe environment to experiment.
Now, an important question that I personally hear a lot, and that you might be wondering, is: how is an AI that learns to play a game of any use? One could answer that it can be sold as a training product for professional gamers (after all, the gaming industry is extremely wealthy), but this is a short-sighted view. These developments are a lot more valuable; games should not, in this context, be seen as just games, but rather as environments in which new algorithms can be trained and tested easily, without danger or limitation. They are a sandbox, just like the parking lot where teenagers drive for the first time with their parents, or the wall a kid plays football against alone. These are not the real, complex environments of urban driving and 11-a-side matches, but no one would ever question that the experience gained there is useful.
In this light, Dota 2 offers several of the properties of real-world problems, and achieving impressive results there should be seen as a first large step toward the safe use of AI in more general aspects of everyday life. At the end of the day, much of AI's future role should be as an assistant to humans. Now that we have proof that agents can be trained to collaborate with each other, a natural next step is to see whether they can collaborate efficiently with humans, which implies different constraints, such as less available training time and the need for adapted communication.