simple.ai by @dharmesh

What OpenAI Operator Means for AI Agents

...and how you can take advantage of this opportunity

Dharmesh Shah
January 30, 2025

Confession: I’ve been staying up past my usual bedtime this week. I’m a night owl in general (usual bedtime is 2am), but this week was different. So many exciting things going on.

In the span of just a few days, AI agents suddenly gained the ability to browse the web just like we do.

And it’s not just one release — we're seeing this from multiple AI giants simultaneously (which usually means something big is happening).

Here's what went down:

OpenAI launched Operator (their browser-controlling AI)
Perplexity released their mobile Assistant agent
And suddenly, the future of AI agents feels a lot closer than we thought

These developments are particularly fascinating for us at agent.ai, where we've been working on specialized agents for months. The ability to browse the web opens up entirely new possibilities.

In today's newsletter, I'm breaking down what these new AI agents can (and can't) do, why this matters, and what developers need to know about our new AI-first internet.

—Dharmesh

So, what happened this past week?

Picture this: I'm sipping my daily Soylent (I use it to replace lunch) when Perplexity drops their Assistant announcement. Before I could finish, OpenAI countered with Operator.

If you know me, you can guess what happened next — I abandoned my drink and dove straight into testing (sorry, Soylent).

And for those wondering, both tools are ready to use right now. Unlike Claude's "Computer Use" feature, there's no complex setup — you just log in and start using them.

OpenAI Operator

OpenAI Operator is an AI agent available to US-based subscribers on the $200/month Pro plan. I know $200/month is expensive, and I don’t recommend it for everyone, but if OpenAI comes out with a product, I’m buying it, and I don’t regret it one bit. I even got my son (who turned 14) a subscription for his birthday. He was thrilled.

Anyways, back to Operator. It can independently navigate web browsers to complete tasks — which officially marks the company's first major step into autonomous AI agents.

It works like this:

Completes web tasks by typing, clicking, searching, and scrolling in a cloud browser (a browser that runs on OpenAI’s infrastructure, not your local computer).
A computer-using agent (CUA) perceives, reasons, and acts on screen in real time.
Notifies you to “take control” for payments and other sensitive stuff. (This is a bit cumbersome, but there’s no way around it.)
It still gets blocked by certain websites, including Reddit and the New York Times.

Operator is still in “research preview” (not ready for prime time yet), but in about half the cases where I asked it to do something, it actually worked, which is amazing. Watching an agent click around on a website on your behalf and seeing it “reason through” what it needs to do to accomplish a goal is quite fun and gives you a window into the future.
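
For the curious, here's a toy sketch of the perceive, reason, act loop described above, including the hand-off for sensitive steps. To be clear, this is not how OpenAI built Operator. FakeBrowser, decide, run_agent, and the SENSITIVE set are all names I made up for illustration, and the "reasoning" is just a fixed list of steps where a real agent would ask a model.

```python
from dataclasses import dataclass, field


@dataclass
class FakeBrowser:
    """Stand-in for a cloud browser (hypothetical; a real one returns screenshots)."""
    url: str = "about:blank"
    log: list = field(default_factory=list)

    def observe(self) -> str:
        # "Perceive": describe what is currently on screen.
        return self.url

    def act(self, action: str, value: str) -> None:
        # "Act": apply one action and record it.
        self.log.append((action, value))
        if action == "goto":
            self.url = value


# Assumed set of actions that get handed back to the human.
SENSITIVE = {"pay", "login"}


def decide(screen, goal_steps, done_count):
    """'Reason': replay a fixed plan. A real agent would pick the next
    action from the screen contents; here `screen` is ignored."""
    return goal_steps[done_count] if done_count < len(goal_steps) else None


def run_agent(browser, goal_steps, max_steps=10):
    """Loop perceive -> reason -> act until done, paused, or out of steps."""
    done = 0
    for _ in range(max_steps):
        screen = browser.observe()
        step = decide(screen, goal_steps, done)
        if step is None:
            return "done"
        action, value = step
        if action in SENSITIVE:
            # Stop and ask the human to take over.
            return f"take control: {action}"
        browser.act(action, value)
        done += 1
    return "step limit reached"


browser = FakeBrowser()
status = run_agent(browser, [("goto", "https://example.com/cart"),
                             ("click", "Checkout"),
                             ("pay", "card")])
print(status)       # take control: pay
print(browser.url)  # https://example.com/cart
```

The step limit is there because an agent that never decides it is finished should stop on its own rather than click around forever.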

Perplexity Assistant

Perplexity Assistant is a free, agent-like tool for Android that can control phone apps and perform tasks with multimodal and voice capabilities, directly challenging voice assistants like Google’s Gemini and Apple’s Siri.

A quick snapshot of its capabilities:

Calls apps to perform tasks, like booking an Uber or reserving a dinner table
Remembers your whole conversation while getting things done
Takes voice and camera-based instructions
Works great for an early AI agent product, but only available on Android phones

Why is this a big deal?

With Perplexity Assistant and particularly Operator, we're starting to see the early innings of something I've been predicting since starting agent.ai: an internet where AI agents do the heavy lifting while we humans supervise and course-correct.

And while there are still some rough edges (much like the early days of mobile apps), this isn't just hype — it's fundamentally changing how we'll interact with the internet.

Here’s why:

The Entire Internet Becomes Programmable: Instead of being limited to websites with official APIs, Operator can make any website automatable. Now every website (even those without APIs) suddenly becomes programmable, whether it was intended to be or not.
The Web Will Evolve for Agents: I love APIs, but as we move towards a more agent-based world, we'll need to rethink how websites interact with AI. Today we have robots.txt for web crawlers. Tomorrow? Imagine agents.txt (or agents.json) files that tell AI agents exactly how to interact with a site (more on this below). Update: I recently learned of llms.txt — which accomplishes a fair amount of what I was considering here. I don’t like the name, but the idea is spot on.
Agents Working Together: Operator will soon be able to think about how to split up tasks efficiently. For example, imagine you need to process 50 pages of results. Instead of doing it sequentially, it could spawn two instances — one starting from page 1, the other from page 50 — meeting in the middle.
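
To make the agents.json idea a bit more concrete, here's a rough sketch of what a site might publish and how an agent might read it. No such standard exists today. Every field name (allow, disallow, actions) and both helper functions are invented for illustration, so treat this as a thought experiment, not a spec.

```python
import json
from urllib.parse import urlencode

# Hypothetical agents.json content. The schema is made up for this example.
AGENTS_JSON = """
{
  "version": 1,
  "allow": ["/search", "/products"],
  "disallow": ["/checkout"],
  "actions": {
    "search": {"method": "GET", "path": "/search", "params": ["q"]}
  }
}
"""


def is_allowed(policy, path):
    """Disallow rules win; otherwise the path must match an allow prefix."""
    if any(path.startswith(p) for p in policy.get("disallow", [])):
        return False
    return any(path.startswith(p) for p in policy.get("allow", []))


def build_request(policy, action, **params):
    """Turn a declared action into (method, path) so the agent can skip the UI."""
    spec = policy["actions"][action]
    missing = [p for p in spec["params"] if p not in params]
    if missing:
        raise ValueError(f"missing params: {missing}")
    query = urlencode({p: params[p] for p in spec["params"]})
    return spec["method"], f"{spec['path']}?{query}"


policy = json.loads(AGENTS_JSON)
print(is_allowed(policy, "/products/42"))             # True
print(is_allowed(policy, "/checkout/pay"))            # False
print(build_request(policy, "search", q="red shoes")) # ('GET', '/search?q=red+shoes')
```

The point of the actions section is that a site could tell an agent the shortest path to a result, instead of making it find the search box by looking at pixels.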

Soon, AI agents will be able to operate other AI platforms (like agent.ai) and handle multiple tasks simultaneously. The possibilities are endless.
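
The "meet in the middle" split from the 50-page example is easy to sketch. This is a minimal illustration under my own assumptions: process_page is a placeholder for whatever an agent would actually do on a page, and two threads stand in for two agent instances.

```python
from concurrent.futures import ThreadPoolExecutor


def split_meet_in_middle(total_pages):
    """Return two page lists: one ascending from page 1, one descending
    from the last page, together covering every page exactly once."""
    mid = total_pages // 2
    front = list(range(1, mid + 1))
    back = list(range(total_pages, mid, -1))
    return front, back


def process_page(page):
    # Placeholder for the real work an agent would do on one results page.
    return f"page-{page}"


def run_two_workers(total_pages):
    """Run both halves at the same time and combine the results."""
    front, back = split_meet_in_middle(total_pages)
    with ThreadPoolExecutor(max_workers=2) as pool:
        a = pool.submit(lambda: [process_page(p) for p in front])
        b = pool.submit(lambda: [process_page(p) for p in back])
        return a.result() + b.result()


front, back = split_meet_in_middle(50)
print(front[0], front[-1])  # 1 25
print(back[0], back[-1])    # 50 26
print(len(run_two_workers(50)))  # 50
```

With an odd page count the extra page goes to the worker starting from the end, so nothing is skipped or processed twice.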

But here's the thing: We're still very early — think GPT-3 days for agents, before the "ChatGPT moment" that changed everything.

If you want to stay ahead of this curve, here's what I'd recommend:

Start experimenting with tools like Operator, Perplexity Assistant, and agent.ai
Pay attention to what they do well now (and what they still struggle with)
Test with an open mind, knowing that this is just the start
If you're building products, start thinking about how you'll adapt for an agent-first world

The truth is you can read about this stuff all day, but nothing beats hands-on experience. Anyone who starts preparing for this shift now will have a massive advantage when the agent revolution hits full speed.

What this means for developers

The rise of web-browsing AI agents raises a fascinating question: is the internet we built for humans ready for AI agents? (Spoiler alert: not really, but that's where the opportunity lies).

These early agents already show some superhuman capabilities:

Infinite attention spans (no coffee breaks needed)
Parallel processing (imagine having 10 browsers working simultaneously)
Perfect memory (every detail gets tracked)

But today’s websites weren't built for this kind of “user”. It's like we built roads before cars existed, and now we need to rethink the infrastructure.

Websites will need dedicated agent interfaces. Authentication systems will need to verify both humans and agents. Rate limiting will need to account for agent efficiency. Oh, and we’ll need to rethink the whole CAPTCHA thing too, because it turns out AI is pretty good at detecting traffic lights and bridges in an image.
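
As one example of what "rate limiting that accounts for agents" could look like, here's a small token-bucket sketch with separate budgets for humans and agents. The token bucket itself is a standard technique. Everything else is an assumption on my part: the LIMITS numbers are placeholders, and I'm assuming the site already knows (via authentication) whether a caller is a human or an agent.

```python
class TokenBucket:
    """Token bucket: `rate` tokens refill per second, up to `capacity`."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = now

    def allow(self, now):
        # Refill for the elapsed time, then spend one token if available.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Hypothetical (rate per second, burst size) per caller kind.
LIMITS = {"human": (1.0, 5), "agent": (10.0, 20)}


class Limiter:
    """Keeps one bucket per (client, kind) pair."""

    def __init__(self):
        self.buckets = {}

    def allow(self, client_id, kind, now):
        key = (client_id, kind)
        if key not in self.buckets:
            rate, capacity = LIMITS[kind]
            self.buckets[key] = TokenBucket(rate, capacity, now)
        return self.buckets[key].allow(now)


limiter = Limiter()
human = [limiter.allow("alice", "human", now=0.0) for _ in range(6)]
agent = [limiter.allow("bot-1", "agent", now=0.0) for _ in range(21)]
print(human.count(True))  # 5
print(agent.count(True))  # 20
```

The clock is passed in as `now` to keep the example deterministic. A real service would read a monotonic clock instead.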

Here’s how I think you can take advantage of this opportunity as a developer building agents:

Think "Agent-First": Your agent isn't just a script anymore — it's a user that can click, type, and navigate just like a human. But unlike humans, it can do this 24/7 without getting tired. Design for this scale.
Focus on Specialization: Just because your agent can browse the entire internet doesn't mean it should. The most successful agents we've seen on agent.ai are the ones that do one thing exceptionally well. Think specialist, not generalist.
Design for Composability: It’s possible that the most active user of your agent might be another agent.
Test Like Your Business Depends on It: An agent hitting a dead end is like a support ticket waiting to happen. Break it, fix it, break it again — then make it better. The time you spend testing now will save you countless hours later.
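
Here's a tiny sketch of what "design for composability" can mean in practice: give every agent the same dict-in, dict-out interface so one agent can call another. This is not the agent.ai API. REGISTRY, register, call, and both example agents are hypothetical names for illustration only.

```python
from typing import Callable, Dict

# Assumed shared interface: every agent takes a dict and returns a dict.
Agent = Callable[[dict], dict]
REGISTRY: Dict[str, Agent] = {}


def register(name):
    """Decorator that makes an agent discoverable by name."""
    def wrap(fn):
        REGISTRY[name] = fn
        return fn
    return wrap


def call(name, request):
    """Invoke a registered agent; the caller may be a human or another agent."""
    return REGISTRY[name](request)


@register("word_count")
def word_count(request):
    # A specialist: it does one thing only.
    return {"count": len(request["text"].split())}


@register("summary_stats")
def summary_stats(request):
    # This agent's "user" is another agent: it delegates the counting.
    counts = [call("word_count", {"text": t})["count"] for t in request["texts"]]
    return {"total": sum(counts), "longest": max(counts)}


result = call("summary_stats", {"texts": ["a b c", "d e"]})
print(result)  # {'total': 5, 'longest': 3}
```

Because both agents share one interface, summary_stats does not need to know how word_count works, only its name and the shape of its input and output.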

At agent.ai, we're already optimizing our platform for this future, making it easier to build, discover, and hire specialized agents. And the possibilities are incredible.

Curious: What agent feature would you want to see next? Would love to hear your thoughts in the poll below.

—Dharmesh (@dharmesh)