Google’s Sergey Brin Sees A Path To AGI But Not Beyond It

Google co-founder Sergey Brin says AI is heading toward AGI but admits he has no idea what comes next.

In a recent AGI House interview, Sergey Brin described Gemini as a system whose capabilities are not just evolving but integrating world knowledge across languages and modalities. He said the transformer architecture behind today’s AI has also evolved beyond what it was originally designed for. Brin can envision Gemini achieving AGI, but he could not say what comes after it.

AGI: Artificial General Intelligence

AGI is a level of AI that can learn, understand, and apply knowledge across tasks in a manner similar to humans. Today’s AI can produce useful answers, write code, analyze images, and solve many narrow problems, but it does not yet understand the world or independently apply knowledge across domains the way a human can.

OpenAI, Google DeepMind, and Anthropic are all working toward AGI, but each emphasizes a different goal for it. OpenAI focuses on economic benefits, Google DeepMind emphasizes scientific discovery, and Anthropic prioritizes human progress.

Next Big Thing: AI Capabilities Are Converging

Brin said that Google’s earlier AI progress relied on specialized models built for specific tasks, but that Gemini is increasingly achieving state-of-the-art performance across multiple domains, such as mathematics and scientific reasoning. Capabilities that used to require purpose-built models are giving way to general model families that handle them together. That is what Brin means by convergence.

Brin was answering a question about what the next big thing in AI is, and his answer was convergence. He added that convergence was not something he would have predicted when Google began developing AI.

Brin responded:

“I think the exciting thing is that all of these things are converging to the same general models.

In the past, we would have to have specialized models. And in the case of protein folding, we obviously still do.

But increasingly, our main Gemini LLMs can be the state-of-the-art for math, for example, and for other kinds of scientific questions. So that convergence is, I don’t know, I guess it’s not something I really would have predicted at the outset. But it’s been kind of incredible to see.

And I guess baked into that is this concept of transfer, just the idea that when you train for a certain class of problems, let’s say you’re training for coding, that that actually can help your math reasoning and vice versa.

And that’s been really exciting to see… the multimodal capability also is an example of that. Like, can you actually get a transfer from being able to process images to actually being able to think through kind of geometric text problems too.”

Transfer learning is one reason convergence is happening. In transfer learning, training a model on one kind of task also improves its performance on another, seemingly unrelated kind of task. Google is finding that training on vision, mathematics, and reasoning together contributes to improvements across multiple capabilities.
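To make the idea of transfer concrete, here is a minimal sketch in Python. It is a toy illustration only and has nothing to do with how Gemini is actually trained. The two tasks, the tiny linear model, and every name in it (`make_task`, `train`, `task_a`, `task_b`) are assumptions made up for this example. A model that was first trained on one task is given a few training steps on a related task and compared with a model that starts from zero.

```python
# Toy transfer-learning sketch: all tasks, names, and numbers are hypothetical.
XS = [-1.0, -0.5, 0.0, 0.5, 1.0]


def make_task(slope, intercept):
    """Build a small dataset of (x, y) pairs for the line y = slope * x + intercept."""
    return [(x, slope * x + intercept) for x in XS]


def loss(params, data):
    """Mean squared error of the linear model (w, b) on the dataset."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(params, data, steps, lr=0.1):
    """Run plain gradient descent on the mean squared error for a fixed number of steps."""
    w, b = params
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return (w, b)


task_a = make_task(3.0, 1.0)  # the task the model is trained on first
task_b = make_task(3.0, 2.0)  # a related task that shares the same slope

pretrained = train((0.0, 0.0), task_a, steps=200)

# Give both models the same small budget of five steps on task B.
transferred = train(pretrained, task_b, steps=5)
from_scratch = train((0.0, 0.0), task_b, steps=5)

loss_transferred = loss(transferred, task_b)
loss_from_scratch = loss(from_scratch, task_b)
print(loss_transferred < loss_from_scratch)
```

Because the two tasks share structure, the model that starts from what it learned on task A ends up with a lower error on task B after the same five steps. The transfer Brin describes, such as coding practice helping math reasoning, is the same idea at a vastly larger scale.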

Transformers Are “Weirdly Flexible”

Brin was asked if transformers will play a role in AGI. The transformer is the neural network architecture behind most current AI models and the breakthrough that enabled things like ChatGPT. Brin’s answer mentions MOE, which stands for Mixture of Experts. MOE is a technique in which a router sends each input to a small number of specialized internal “experts,” so only part of the model runs for any given input, which increases efficiency.
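The routing idea behind Mixture of Experts can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not Google’s implementation: the four “experts” are trivial hand-written functions, the router weights are made-up numbers, and the names `route`, `moe_layer`, `EXPERTS`, and `ROUTER_WEIGHTS` are hypothetical. In a real model, both the experts and the router are learned neural network layers.

```python
import math


def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(token, router_weights, k=2):
    """Return (expert_index, gate_weight) pairs for the k highest-scoring experts."""
    scores = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    probs = softmax(scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalize over the chosen experts only
    return [(i, probs[i] / norm) for i in top]


def moe_layer(token, router_weights, experts, k=2):
    """Run only the selected experts and blend their outputs by gate weight."""
    out = [0.0] * len(token)
    for idx, gate in route(token, router_weights, k):
        expert_out = experts[idx](token)
        out = [o + gate * e for o, e in zip(out, expert_out)]
    return out


# Hypothetical experts: stand-ins for learned feed-forward sub-networks.
EXPERTS = [
    lambda v: [2.0 * x for x in v],
    lambda v: [x + 1.0 for x in v],
    lambda v: [-x for x in v],
    lambda v: [x * x for x in v],
]
# Hypothetical router weights: one row of scores per expert.
ROUTER_WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.5, 0.5]]

token = [1.0, 0.2]
chosen = route(token, ROUTER_WEIGHTS, k=2)
output = moe_layer(token, ROUTER_WEIGHTS, EXPERTS, k=2)
print(chosen)
print(output)
```

Only two of the four experts run for this input, which is the “sparse” part of what Brin calls sparse MOE transformers: the model can hold many experts while spending compute on only a few of them per input.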

For the question of whether AGI will run on transformers, Brin answered:

“Transformers have been weirdly flexible. We use them for image and video in addition to text. So they’ve exceeded their original capability.

Now, to be fair, along the way, they’ve also changed. I mean, we have whatever, sparse kind of MOE, transformers. I mean, there are a lot of little details that have shifted along the way, so it’s not like the exact same thing as the transformer paper.

If I could guess, could something close to that be AGI? I would say yes.

That’s just my guess, just because they’ve been able to evolve so much.

But like I said, they are changing. It’s not like the exact same thing as the original transformer paper.”

World Models Are Converging With Gemini

Brin was asked whether world models would help AI reach AGI. A world model is an AI’s internal simulation of reality that helps it anticipate what might happen next. By predicting the consequences of different actions, it can make better decisions and plan ahead.
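Here is a rough sketch, in Python, of what “predicting the consequences of different actions” can look like. Everything in it is an assumption made for illustration: the world is a one-dimensional track, the `world_model` function is hand-written rather than learned, and the names `imagined_cost` and `choose_action` are hypothetical. Real world models are learned from data such as video, and this is not a description of Gemini Omni.

```python
from itertools import product

ACTIONS = (-1, 0, 1)  # step left, stay, step right


def world_model(position, action):
    """Predict the next position on a track from 0 to 10 (hand-written stand-in)."""
    return max(0, min(10, position + action))


def imagined_cost(position, plan, goal):
    """Roll a plan forward inside the model and add up the distance to the goal."""
    cost = 0
    for action in plan:
        position = world_model(position, action)
        cost += abs(position - goal)
    return cost


def choose_action(position, goal, horizon=3):
    """Imagine every plan of the given length and return the first action of the best one."""
    best_plan = min(
        product(ACTIONS, repeat=horizon),
        key=lambda plan: imagined_cost(position, plan, goal),
    )
    return best_plan[0]


position, goal = 2, 7
trajectory = [position]
for _ in range(6):
    action = choose_action(position, goal)
    position = world_model(position, action)  # here the real world matches the model
    trajectory.append(position)

print(trajectory)
```

The agent never tries an action blindly. It first imagines each possible three-step plan, scores the imagined outcome, and only then acts, which is the planning loop that becomes important once a model has to operate in the physical world.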

He mentioned Google’s Gemini Omni as an example of this direction in AI. Gemini Omni was introduced in mid-May at Google I/O. Google describes it as their new “any input to output” multimodal AI model family. It combines Gemini’s reasoning abilities with generative media capabilities, starting with video creation and editing. Google describes it as a model that can eventually “create anything from any input.”

The question asked was:

“What’s your perspective on how world models can help reach AGI?”

Brin answered:

“Yeah, I mean, world models are like video, basically, models. And I guess there’s a couple– people talk about AGI pretty broadly.

I think of it as, I think of AGI as the idea of, the AI can actually improve itself.

But other people, and I think probably those people are more correct, sort of think AGI means, well, the AI needs to be able to do anything a person can do.

And those are two different things.

So to do anything a person can do, you absolutely need to be able to understand and interact with the physical world.

So for that, being able to you know, dream, imagine what’s going to happen in the world if you do something and comprehend it is obviously important.

So, I think the world models, yes, if you’re going to do everything and that, you know, extends to robotics and things like that, world models are key.

And yeah, you guys have probably had more time to play with our Gem Omni model honestly than I have, because I’m deep into self-improvement game.

But yeah, we’ve been working on that for a long time, Omni’s the latest version of that.

Omni is also pretty cool because it’s just the same, you know, Gemini, like we trained it also with all the text and all the other things, trains exactly the same way.

The fact that these converge is kind of amazing. But yes, you need that capability for this ability to interact physically.”

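The takeaway is that world-model capabilities are being trained into Gemini itself rather than developed as a separate system. Brin frames this as another instance of convergence, and it may be Gemini’s next stage of growth.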

What Comes After AGI?

Someone asked Brin what comes after AGI. What was interesting about his answer is that he didn’t have one: he said he couldn’t really see beyond it. He compared AI to previous technology waves like the web and mobile computing, but he did not identify what the next paradigm would be.

The implication is that figuring out what comes after AGI would itself be a major opportunity.

He said:

“Wow, that’s a great question.

What’s sort of next after we hit AGI?

I mean, I think everybody is pretty focused on accelerating the growth in AI right now. What comes after?

We started with obviously the web and internet search. We kind of went through the mobile generation, which was another pretty big explosion.

I guess now people are– now AI is a huge new industry trend. And what comes after that?

Boy.. I mean, I think if you can answer that, you’ll have a fantastic company on your hands.”

What It All Means

  • Brin sees AI moving toward AGI through convergence.
  • Capabilities once handled by separate models are merging into broader model families.
  • Transfer learning helps one kind of expertise improve performance in another.
  • Transformers continue to evolve.
  • World models may be Gemini’s next stage of growth.
  • It may be that nobody knows what comes after AGI until they’ve achieved it.

OpenAI, Google DeepMind, and Anthropic are all working toward creating AGI, prioritizing different goals for it.

Brin’s description of Gemini offers a glimpse into how Google thinks AGI may be achieved. He described a process of convergence, where capabilities that once required separate systems are increasingly appearing within the same model family. One reason this is happening is transfer learning, where training a model in one domain improves its abilities in another.

That same convergence is now extending into world models. Rather than treating physical-world understanding as a separate discipline, Google is integrating those capabilities into Gemini itself. Brin pointed to Gemini Omni as an example of how reasoning, multimodal understanding, and world-model capabilities are increasingly becoming part of the same system.

What comes after AGI remains an open question. Brin said he can imagine current AI architectures continuing to evolve toward AGI, but when asked what follows it, he did not have an answer. If AGI is the next frontier, whatever comes after it could be the foundation of an entirely new generation of companies and technologies.

And that is where we are headed with AI.

SEJ STAFF Roger Montti Owner - Martinibuster.com at Martinibuster.com

I have 25 years hands-on experience in SEO, evolving along with the search engines by keeping up with the latest ...