Topic of the Month
Chatbot Explosion
Once a rarity, top-shelf AI chatbots are now almost commonplace.
Google launched its most powerful model, Gemini Ultra, in February, followed by Anthropic’s Claude 3 Opus in March and Meta’s Llama 3 last week. While OpenAI’s GPT-4 was unambiguously the top AI model in 2023, Google and Anthropic have since claimed Gemini and Claude rival GPT-4. In addition, Yann LeCun, chief AI scientist at Meta, recently said a model even bigger than Llama 3 is in training—the company has so far launched only small and midsized versions—and it too could be a match for GPT-4.
This race to the top has been fueled by a rapid shift to generative AI overall. According to the Stanford Institute for Human-Centered AI’s (HAI) 2024 AI Index report, the number of foundation models—these are the complex, data-hungry algorithms like GPT-4, Claude, Gemini, and Llama—built per year has grown by a factor of almost 38 between 2019 and 2023. Last year, industry and academia combined to release 149 foundation models in total.
There’s little to indicate the pace will slow this year.
Clearly, models from the likes of OpenAI, Google, Meta, and Anthropic are converging. Mistral, the French AI startup behind the Mixtral AI models, is in the mix too. But it’s Meta's Llama 3 and the company's upcoming 400-billion-parameter algorithm alluded to by LeCun that may unleash a flood.
Llama is a so-called “open-weights” model. It isn’t fully open source—Meta, for example, withholds some training details and restricts commercial use by companies above a set number of users—but it can be tweaked and modified. Developers made over 30,000 new variants based on Llama 2, Hugging Face CEO Clément Delangue told Wired recently. We can likely expect the same for Llama 3 and its larger sibling, only the new models will be more capable.
To date, according to the HAI report, closed algorithms like those from Google and OpenAI outperform open algorithms by a median 24.2 percent. Meta's new releases may help narrow the gap. What can we expect when hundreds or thousands of GPT-4-caliber algorithms arrive?
The open-source AI community is nothing if not experimental. Just weeks after Llama leaked last year, some developers had created versions that could run on laptops and phones. Others later fine-tuned Llama to improve performance, expand context windows, and add new languages. Despite worries the risk of openly releasing advanced models is too great, Meta says this experimentation is what motivates it to go open.
“AI is better when more people look at the code,” LeCun said at an MIT conference this month. “Infrastructure needs to be open source—it just progresses faster.”
For all this, it’s worth remembering the industry has been chasing GPT-4 for over a year.
Viewed from that angle, GPT-4-like performance is already a bit quaint, not least because so many models have attained it. Yes, there’s a lot we can still do with these tools—AI as a whole has reached or surpassed human performance in an impressive range of tasks. But it still struggles in areas like common sense, reasoning, and planning, and it’s still prone to bias and hit-or-miss factuality.
In recognition of both significant progress and the challenges ahead, the HAI report retired a number of benchmarks this year that have become “saturated”—either because models’ scores have plateaued or because researchers have shifted their focus to harder problems. A host of newly introduced benchmarks measuring the likes of agent-based behavior, causal, moral, and mathematical reasoning, coding, and factuality hint at where we’re headed.
These challenges are likely already driving work on future AI models.
OpenAI’s next big update could come as soon as this summer. It seems likely the algorithm will improve on GPT-4, but whether it’s able to make significant inroads on these benchmarks—or run laps around Gemini, Claude, and Llama—remains to be seen. Some problems may require fundamental breakthroughs. Still, OpenAI is no doubt working furiously to level up. And, for that matter, Google, Meta, and others are too.
“Look, I don’t want to downplay the accomplishment of GPT-4, but I don’t want to overstate it either,” OpenAI CEO Sam Altman told Lex Fridman in a March podcast interview. “And I think at this point that we are on an exponential curve, we’ll look back relatively soon at GPT-4 like we look back at GPT-3 now.”
Nothing comes automatically. But the industry has the talent, cash, and motivation to keep pushing the envelope. How far and how fast that push will pay off is still up for debate.