- simple.ai - The Agent AI newsletter
Google's New AI Agents vs. Agent.ai
A look at what's coming (and how we fit in)
Yesterday, Google made their biggest AI announcement of the year.
And it wasn't just another model update — it was their official entry into the agent era that I've been raving about for the past year.
In case you missed it, here's a quick snapshot of the important stuff announced:
Gemini 2.0 Flash: Their most capable AI model yet with a new real-time audio/video streaming feature
Deep Research: An agent that can explore complex topics across the internet and create comprehensive reports
Project Mariner: An AI agent that can navigate websites and complete tasks through Chrome just like a human can
Jules: An AI-powered coding agent that integrates directly with GitHub
Project Astra: A universal AI assistant that 'sees your world' and can use Google Search, Lens, and Maps
In this newsletter, I'm going to break down how these stack up against Agent.ai (spoiler alert: the two platforms are remarkably different, and both have pros and cons).
What’s available from Google now (vs just in preview)
Candidly, there's a lot to unpack here.
I'll start with the stuff that's actually available to try now and finish off with some of the announcements that are only available to a select few.
Gemini 2.0 Flash Experimental (available now)
Gemini 2.0 Flash Experimental is the first model of the new 2.0 lineup that's available to try today.
What's particularly exciting is that this new model outperforms all of Google's previous 1.5 models on key benchmarks at 2x the speed while being significantly smaller in size. This is a big deal because it shows Google has made fundamental improvements in efficiency, not just raw power.
The model also comes with a 'Streaming Mode,' which is a particularly magical experience. It allows the AI to see your screen and chat with you via voice in real time.
And the most impressive part is that all of Gemini 2.0 (including Streaming) is completely free to try in Google AI Studio and through the Gemini API. This kind of enterprise-grade AI capability being available at no cost is amazing for accessibility.
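If you want to kick the tires on the API side, here's a minimal sketch of what a Gemini API call looks like over its public REST endpoint. The endpoint shape and the `gemini-2.0-flash-exp` model name follow Google's published docs at launch; the `GEMINI_API_KEY` environment variable is a placeholder for the free key you'd grab from Google AI Studio.

```python
import json
import os

# Public Gemini API base and the experimental 2.0 Flash model name (per Google's docs).
API_BASE = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-2.0-flash-exp"

def build_request(prompt: str) -> tuple[str, bytes]:
    """Return the (url, body) pair for a generateContent call."""
    key = os.environ.get("GEMINI_API_KEY", "")  # placeholder; set this yourself
    url = f"{API_BASE}/models/{MODEL}:generateContent?key={key}"
    body = json.dumps({"contents": [{"parts": [{"text": prompt}]}]}).encode()
    return url, body

url, body = build_request("Summarize today's AI news in one sentence.")
# To actually send it, POST `body` to `url` with Content-Type: application/json
# (via urllib.request, requests, or the official google-generativeai SDK).
```

This builds the request without sending it, so you can see the wire format; in practice most people will just use the official SDK instead of raw HTTP.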
Deep Research Agents (available now)
Deep Research is Google's new agentic research assistant feature that's available now in Gemini Advanced. It’s impressive.
It creates a multi-step research plan for your approval, then systematically explores the web, going through multiple rounds of searching and analysis. You can think of it like Perplexity (if you’ve tried it), but a bit more in-depth with the output (albeit slower).
The end result is a comprehensive report with citations and sources, all formatted and ready to export to Google Docs.
Unfortunately, this feature is only available for Gemini Advanced (paying) subscribers right now, but it’s probably worth the subscription if you do any extensive research as part of your job.
Project Mariner (trusted testers only)
Project Mariner is Google's new browser-focused AI agent prototype, currently being tested through an experimental Chrome extension.
It's their first real attempt at letting AI interact with web interfaces. This means it can actually navigate websites, click buttons, fill out forms, and complete multi-step processes the same way a human would.
For safety purposes, it can only work in your active tab and requires confirmation before taking any sensitive actions (meaning you can’t have 20 agents running simultaneously).
While it's currently limited to trusted testers, this technology hints at where we're headed—AI that can handle the tedious parts of web browsing and online tasks for us.
Jules coding agent (trusted testers only)
Jules is Google's new experimental AI-powered coding agent that integrates directly with GitHub.
Jules can assist with coding tasks by creating multi-step plans to address issues, modifying multiple files, and even preparing pull requests to land fixes directly back into GitHub.
What's interesting is that it's not just writing code like ChatGPT; it's starting to think like a developer. The agent also works asynchronously, meaning you can assign tasks and continue with your own work while it handles bug fixes and other time-consuming tasks in the background.
It's currently limited to a select group of trusted testers, but Google plans to make it available to more developers in early 2025.
Project Astra (trusted testers only)
Project Astra is Google's vision of a universal AI assistant that can see the world with you in real-time. While it was announced earlier this year at Google I/O, they made some new upgrades.
What's new and particularly notable is its broad integration with Google's ecosystem: it can seamlessly use Search, Lens, and Maps as tools to complete tasks. This means it can understand and act on information from multiple sources and formats.
The agent now also includes some impressive memory capabilities, with up to 10 minutes of in-session memory and the ability to remember past conversations for better personalization.
While it's still in limited testing, Google is already expanding trials to include prototype glasses. Excited to hear more about that!
How Google’s agents compare to Agent.ai
Now, you may be thinking: how are we planning to integrate these features into Agent.ai?
The short answer: We already do! You can build agents that use Google's new Gemini 2.0 Flash model (which is amazing and fast).
Our mission is to build 'The Professional Network For AI Agents'.
While Google is focusing on building broad, general-purpose agents (think Swiss Army knives), we're helping foster a network of specialized expert agents (think master craftsmen) built by citizen developers. Both approaches are valuable, but they serve different purposes.
The agent era is big enough for both approaches, and it’s exciting to see Google validating what we've been saying all along.
The future of AI is agentic.
-@dharmesh
What'd you think of today's email? Click below to let me know.