simple.ai by @dharmesh

Cognitive Composability and the AI Tools Marketplace

Only fools ignore the tools and try to build from scratch

Dharmesh Shah
November 30, 2023

Now that some time has passed since those terrible, awful, no good 5 days when Sam Altman was no longer CEO of OpenAI (and then was again), I think it’s time to jump back in.

One thing that doesn’t get talked about a lot (except in geeky AI dev circles) is a feature enhancement that OpenAI announced at Dev Day.

Here’s an excerpt from the announcement.

Function calling lets you describe functions of your app or external APIs to models, and have the model intelligently choose to output a JSON object containing arguments to call those functions. We’re releasing several improvements today, including the ability to call multiple functions in a single message.

OK, let me translate that for you into English.

Developers can create tools and describe them in a way that makes them callable as functions from GPT. So when the LLM needs to do something it can’t natively do, it can see if there’s a tool available to do it.
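Here’s a minimal sketch of what “describing a tool” can look like. The overall shape (a name, a description, and JSON Schema parameters) follows the format OpenAI’s Chat Completions API uses for tools. The tool itself, `get_earnings_transcript`, and its `ticker` and `quarter` parameters are hypothetical and made up for illustration.

```python
import json

# Hypothetical tool: the name, description, and parameters are invented for
# illustration. Only the outer shape follows OpenAI's tool format.
get_earnings_transcript_tool = {
    "type": "function",
    "function": {
        "name": "get_earnings_transcript",
        "description": "Fetch the latest earnings call transcript for a public company.",
        "parameters": {
            "type": "object",
            "properties": {
                "ticker": {
                    "type": "string",
                    "description": "Stock ticker symbol of the company.",
                },
                "quarter": {
                    "type": "string",
                    "description": "Fiscal quarter, for example 2023-Q3.",
                },
            },
            "required": ["ticker"],
        },
    },
}

# The description is plain data, so it serializes to JSON as-is.
tool_json = json.dumps(get_earnings_transcript_tool, indent=2)
print(tool_json)
```

Notice there’s no implementation here. The description only tells the model what the tool is for and which arguments it takes; the code that does the work lives in your application.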

OK, why does this matter and what does it mean for Agent AI?

In my very first post, I shared my totally unofficial definition of Agent AI. It’s been really well received. It even got picked up by the super-awesome and super-discerning HubSpot AI team (hi team!)

Agent AI: Software that uses artificial intelligence to pursue a specified goal. It accomplishes this by decomposing the goal into actionable tasks, monitoring its progress, and engaging with digital resources and other agents as necessary.

Let’s dig into that last part: “…engaging with digital resources…”.

This is where function calling (also known as “tools”) comes in.

Where we are headed is the ability for anyone to define a function/tool that can be made available to an LLM like GPT-4 Turbo. We are seeing this now in the new “Create your own GPT” feature that OpenAI also launched at Dev Day (those folks ship!). The LLM doesn’t call the tool directly (yet), but it does pass back to the application which functions should be called, and with which parameters. And now OpenAI lets multiple function calls be “invoked” at once.
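To make the “passes back to the application” part concrete, here’s a rough sketch of the application side. No API is called: `model_tool_calls` is a hard-coded stand-in for what a model might return, and the two tools (`get_earnings_transcript`, `get_pricing_page`) are hypothetical. The one detail borrowed from real function calling is that arguments arrive as a JSON string the application has to parse.

```python
import json

# Hypothetical local tools owned by the application.
def get_earnings_transcript(ticker: str) -> str:
    return f"(transcript for {ticker})"

def get_pricing_page(domain: str) -> str:
    return f"(pricing page at {domain})"

# Lookup table from tool name to the function that implements it.
TOOLS = {
    "get_earnings_transcript": get_earnings_transcript,
    "get_pricing_page": get_pricing_page,
}

# Stand-in for a model response requesting two calls in a single message.
model_tool_calls = [
    {"id": "call_1", "name": "get_earnings_transcript", "arguments": '{"ticker": "ACME"}'},
    {"id": "call_2", "name": "get_pricing_page", "arguments": '{"domain": "example.com"}'},
]

def run_tool_calls(tool_calls):
    """Execute each requested call and collect the results keyed by call id."""
    results = {}
    for call in tool_calls:
        func = TOOLS.get(call["name"])
        if func is None:
            # The model asked for something the application doesn't have.
            results[call["id"]] = f"error: unknown tool {call['name']}"
            continue
        kwargs = json.loads(call["arguments"])  # arguments arrive as a JSON string
        results[call["id"]] = func(**kwargs)
    return results

results = run_tool_calls(model_tool_calls)
print(results)
```

In a real application, each result would be sent back to the model so it can use it in its next response.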

But, this idea is not just about GPT. The open source world is moving towards this model as well.

This Is The Future…It’s Just Not Here (Yet)

One day any developer will be able to build a tool that accomplishes a particular task. They can define/describe what their tool does, and how to use it (what the inputs and outputs are) in a structured way (similar to how you configure tools in Custom GPTs).

And, one day, there could be an AI tools marketplace. The tools in it all conform to a shared, open standard. As a developer of a tool, you add it to the marketplace.

Now imagine thousands of these tools being both discoverable and invokable over the Internet.
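Since none of this exists yet, here’s a purely hypothetical sketch of the “discoverable” half. Everything in it is an assumption: the `ToolListing` fields, the `Marketplace` class, the example.com endpoints, and the naive keyword search, which stands in for whatever a real shared standard would define.

```python
from dataclasses import dataclass, field

@dataclass
class ToolListing:
    """One entry in the (hypothetical) marketplace."""
    name: str
    description: str
    category: str
    endpoint: str  # where the tool would be invoked over the Internet
    parameters: dict = field(default_factory=dict)  # JSON Schema for the inputs

class Marketplace:
    def __init__(self):
        self._listings = {}

    def publish(self, listing):
        """Add a tool to the marketplace, or replace one with the same name."""
        self._listings[listing.name] = listing

    def search(self, query, category=None):
        """Return listings ranked by how many query words appear in their text."""
        words = query.lower().split()
        hits = []
        for listing in self._listings.values():
            if category and listing.category != category:
                continue
            text = f"{listing.name} {listing.description}".lower()
            score = sum(1 for word in words if word in text)
            if score:
                hits.append((score, listing))
        hits.sort(key=lambda pair: pair[0], reverse=True)
        return [listing for _, listing in hits]

market = Marketplace()
market.publish(ToolListing(
    name="earnings_transcripts",
    description="Retrieve earnings call transcripts for public companies.",
    category="finance",
    endpoint="https://example.com/tools/earnings_transcripts",
))
market.publish(ToolListing(
    name="pricing_watcher",
    description="Detect changes to a company's pricing page.",
    category="web",
    endpoint="https://example.com/tools/pricing_watcher",
))

found = market.search("earnings call transcript")
print([listing.name for listing in found])
```

A real marketplace would need much more than this (authentication, billing, versioning), but the core idea is small: structured listings plus a way to query them.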

At first, these could be used by conversational bots like ChatGPT, LLMs themselves like GPT-4 Turbo, or one of the open source models like Mistral.

But, the next step is for Agent AIs to be able to use them. That’s when things get really interesting.

An Agent AI Powered By Discoverable Tools

I struggle with things that are overly abstract, so let’s work through a high-level example.

Let’s say you’re the head of strategy at a company. Or you’re a startup founder. Part of your job is to keep up with key competitors and give an update to your team once a quarter in the form of an Amazon-style memo.

So, let’s say there’s an Agent AI available for competitive research. Remember, it’s an agent so you give it a goal not a task.

“I’m the head of strategy for Acme Widgets. Write a memo outlining updates from my key competitors. Include snippets from recent earnings calls. Recommend any actions that should be taken.”

Now, it’s unlikely that a single piece of software is going to be able to do everything that needs to be done to accomplish this goal.

It might break it down into tasks that include:

Identify key competitors.
Determine which companies are public and retrieve their latest earnings call transcripts and audio.
Browse the companies’ websites for any recent announcements.
See if there’s been any change to their positioning, pricing or packaging.
Identify the key executives at each company. Determine if any of them have made a public statement on social media.

You get the idea. Coming up with the list of things to be done is hard — but possible. We can sort of do that today. But actually doing those things is much harder.

The Agent AI Tool Marketplace

Enter the Agent AI Tool Marketplace (that’s not a thing — yet). No seriously, it’s not. Though if it did exist, agent.ai would be a super-cool domain name. 🙂

Like any online marketplace, there’s a list of digital tools being offered. There are categories. There’s search. There are ratings and reviews.

The big difference is that this marketplace is not built for a human to peruse but for an AI agent to peruse. It’s not a human buying the tool, it’s an agent. It’s not a human leaving a rating/review, but an agent.

But, I’m getting ahead of myself.

Let’s go back to our example of an agent for competitive research.

Our Agent AI could take each task for our competitive research goal and determine whether or not a tool exists in the marketplace to accomplish that task. If it finds one, it could try it out and review the results. It might even use an LLM to evaluate the results.
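Here’s a toy sketch of that loop: for each task, find a tool, try it, evaluate the result, and leave a rating. All of it is hypothetical. The in-memory `marketplace`, the keyword matching, the made-up competitor names, and especially `evaluate`, which is a trivial placeholder for the step where an agent might ask an LLM to judge the output.

```python
# Hypothetical marketplace: each tool has keywords, a callable, and ratings left by agents.
marketplace = {
    "competitor_finder": {
        "keywords": {"competitors", "identify"},
        "run": lambda task: ["Globex", "Initech"],  # made-up results
        "ratings": [],
    },
    "transcript_fetcher": {
        "keywords": {"earnings", "transcripts"},
        "run": lambda task: [],  # simulates a tool that returns nothing useful
        "ratings": [],
    },
}

def find_tool(task):
    """Return the name of the tool whose keywords best overlap the task, or None."""
    words = set(task.lower().split())
    best_name, best_overlap = None, 0
    for name, tool in marketplace.items():
        overlap = len(words & tool["keywords"])
        if overlap > best_overlap:
            best_name, best_overlap = name, overlap
    return best_name

def evaluate(result):
    """Placeholder for judging a result; an agent might ask an LLM here instead."""
    return 5 if result else 1

def work_through(tasks):
    """Try a tool for each task; return completed results and tasks left undone."""
    done, unmet = {}, []
    for task in tasks:
        name = find_tool(task)
        if name is None:
            unmet.append(task)  # no tool in the marketplace matches this task
            continue
        result = marketplace[name]["run"](task)
        rating = evaluate(result)
        marketplace[name]["ratings"].append(rating)  # the agent leaves the review
        if rating >= 3:
            done[task] = result
        else:
            unmet.append(task)
    return done, unmet

tasks = [
    "Identify key competitors",
    "Retrieve latest earnings call transcripts",
    "Browse company websites for announcements",
]
done, unmet = work_through(tasks)
print("done:", done)
print("unmet:", unmet)
```

In this run, only the first task completes. The second finds a tool that returns nothing useful (and gets a low rating for it), and the third finds no tool at all.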

Side bar: Perhaps the marketplace would provide data to developers (who are the suppliers) of what kinds of tools are being searched for. This helps bridge the gap between demand and supply.

In any case, our agent now has a much better chance of accomplishing more of what needs to be done to progress towards the goal.

In the early stages, the agent is unlikely to be able to do everything for complex goals, even with tools. But, as things advance, it should be able to accomplish more and more.

The Power of Cognitive Composability

Here’s why I’m so excited about this.

This idea of making things modular and raising the level of abstraction at which we work is as old as software development. The difference here is that we’re not just raising the level of abstraction for a human developer but also for AI software itself.

So right now, LLMs (Large Language Models) are all the rage. But in the future, it’s possible that the way we get things done is by composing a combination of LLMs, SMMs (Small, Mighty Models), agents and tools.

It’s what I call Cognitive Composition (because it sounds cool and I have a longtime love affair with alliteration).

This is how we get leverage.

Here’s what this means for software companies: Right now, many of us are providing APIs (Application Programming Interfaces) which expose the capabilities and data of our platform to others, so they can build on top of it.

In the future, I envision AAIs — Application Agent Interfaces. But that’s a topic for another post.

Stay tuned.