Contrary to recent reports, OpenAI’s internal research project known as “Q*” is probably not an AI technology that poses a threat to humanity. While initial headlines may have suggested otherwise, further investigation reveals that Q* may not be as groundbreaking or dangerous as it sounds. In fact, it may not be anything new at all.
Key Takeaway
OpenAI’s Q* project, initially portrayed as a potential threat to humanity, may not be as significant or dangerous as early reports suggested. It appears to be an extension of existing work on Q-learning and A* algorithms. Many researchers note that OpenAI’s ideas are not unique and are actively pursued elsewhere in the field. While the true impact of Q* remains to be seen, it aligns with current trends in AI research and could offer advancements, particularly in improving the reasoning abilities of language models.
Background of the Q* Project
Last week, Reuters and The Information reported that OpenAI staff members had expressed concerns about the “prowess” and “potential danger” of the Q* project in a letter to the company’s board of directors. The letter claimed that Q* had the ability to solve certain math problems, although only at a grade-school level. However, there is uncertainty as to whether the letter was ever received by OpenAI’s board.
Upon closer examination, AI researchers, including Yann LeCun, the Chief AI Scientist at Meta, have expressed skepticism about the significance of Q*. The name Q* potentially refers to “Q-learning,” a reinforcement-learning technique in which an agent learns, through trial and error, which actions yield the most reward on a given task. The asterisk, meanwhile, may be a reference to A*, an algorithm for finding shortest routes through graphs.
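To make the first of those names concrete, here is a minimal sketch of tabular Q-learning. The five-state “corridor” environment and the hyperparameters are invented purely for illustration and have nothing to do with OpenAI’s project:

```python
import random

# Tabular Q-learning on a toy 5-state corridor: the agent starts at
# state 0 and earns a reward of 1.0 for reaching state 4.
N_STATES = 5
ACTIONS = [-1, +1]                  # move left or right
alpha, gamma, epsilon = 0.5, 0.9, 0.1

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    next_state = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

random.seed(0)
for _ in range(500):                # training episodes
    s = 0
    while s != N_STATES - 1:
        # epsilon-greedy action selection: mostly exploit, sometimes explore
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s_next, r = step(s, a)
        # Q-learning update: nudge Q(s,a) toward r + gamma * max_a' Q(s',a')
        best_next = max(Q[(s_next, a2)] for a2 in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s_next

# After training, the greedy policy moves right (+1) from every state.
policy = {s: max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)}
print(policy)
```

The core of the technique is the single update line: the agent keeps a table of estimated action values and repeatedly pulls each estimate toward the observed reward plus the discounted value of the best next action.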
Q* in an Existing Context
Evidence suggests that Q* is not a breakthrough. Researchers have noted that Q-learning and A* have been around for decades. Google DeepMind already applied Q-learning in 2014 to develop an AI algorithm capable of playing Atari 2600 games at a human level, and A* traces its origins back to a 1968 academic paper. Moreover, UC Irvine researchers have previously explored combining Q-learning with A*.
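A*, the other half of the name, is similarly compact. The sketch below searches a small made-up grid using the standard Manhattan-distance heuristic, which is admissible for four-directional movement:

```python
import heapq

def a_star(grid, start, goal):
    """Return the length of the shortest path from start to goal, or None."""
    rows, cols = len(grid), len(grid[0])

    def h(cell):  # Manhattan-distance heuristic (never overestimates)
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]   # entries are (f = g + h, g, cell)
    best_g = {start: 0}
    while open_heap:
        f, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            return g
        if g > best_g.get(cell, float("inf")):
            continue                      # stale heap entry; skip it
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    heapq.heappush(open_heap, (ng + h((nr, nc)), ng, (nr, nc)))
    return None

# 0 = open cell, 1 = wall; the direct route down is blocked,
# so the search must detour around the wall segment.
grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
print(a_star(grid, (0, 0), (2, 0)))
```

The heuristic is what distinguishes A* from plain Dijkstra search: cells estimated to be closer to the goal are expanded first, which typically cuts the number of nodes explored without sacrificing optimality.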
Nathan Lambert, a research scientist at the Allen Institute for AI, believes that Q* is primarily focused on studying high school math problems rather than posing a threat to humanity. Mark Riedl, a computer science professor at Georgia Tech, also criticizes the media’s portrayal of OpenAI’s quest for artificial general intelligence (AGI). Researchers dispute the notion that Q* represents a step towards AGI, emphasizing that OpenAI is a “fast follower” rather than a pioneering organization.
The Current Landscape of AI Research
Riedl, like Lambert, points out that the ideas behind Q* are actively pursued by numerous researchers across academia and industry. OpenAI’s contributions can often be replicated by researchers in other organizations or institutions. While the potential impact of Q* remains to be seen, it aligns with ongoing trends in AI research. Riedl also highlights that dozens of papers on similar topics have been published in the last six months, further indicating that OpenAI’s ideas are not unique.
Potential Benefits of Q*
It is worth noting that, regardless of the exact nature of Q*, it could still bring advancements. If Q* incorporates techniques described in a previous OpenAI research paper on improving the reasoning abilities of language models, it could significantly enhance their capabilities. The paper outlines the potential for controlling the “reasoning chains” of language models, ensuring they follow logical and desirable paths. This approach aims to reduce the risk of models reaching malicious or incorrect conclusions by steering them away from reasoning that is “foreign to human thinking” and from spurious patterns.