Are AI Agents now ready for launch?

Ready for Launch

Overview

Since the start of 2024 I’ve been hearing about AI agents revolutionising how we work and so far that has mostly been a vision of the future. An AI agent, as opposed to a chat assistant such as ChatGPT is an autonomous application that beyond answering questions will undertake an action and can operate autonomously. These actions could be the sorts of things you do on your computer now like sending an email, processing and moving data, or browsing the web. They could also be the sort of thing a script or an application might currently do.

The ability to define an agent that has a focus on a specific task with a definable context, and can interact with other agents to undertake more complex tasks has huge potential. This process of decomposition is familiar to any engineer and the broken down tasks become much more understandable and testable, so can much more practically be to use reliably. The agents then become like building blocks and the opportunity for reuse makes them significantly more powerful again.

Tooling

The tools to enable the development and operation of AI Agents have developed quickly and there is a huge variety available already such as:

developer options like LangChain
visual workflow tools such as Gumloop and (open source) n8n
AI agent via SaaS such as Manus
operational and infrastructure management with the likes of AgentOps, CrewAI, and (open source) OpenDevin

Challenges

The constraint of much enterprise I.T. is interoperability between systems and the lack of shared data models. There are large ecosystems of businesses that have been active solving these challenges such as Mulesoft and Zapier. The advantage foundational service providers have (the likes of Microsoft and SalesForce) is that they can try and solve the interoperability challenge by holding (or vacuuming up) all the data. This is a powerful advantage which they are leveraging now with the likes of Agentforce and MS Agents 365.

Solving interoperability

The initial solution for this was to take a leaf out of past engineering approachs such as automating QA and have the AI Agent drive a browser and replicate (literally) the work of a human to solve the access constraint. This naive approach may work in some cases but is likely to not be desirable for service providers so is not a stable solution to consider. A possible solution for this is a standard for communication called Model Context Protocol (MCP) which was created by Anthropic.

The Model Context Protocol is an open standard that enables developers to build secure, two-way connections between their data sources and AI-powered tools. The architecture is straightforward: developers can either expose their data through MCP servers or build AI applications (MCP clients) that connect to these servers.

Recently OpenAI announced their support for MCP which has driven significant momentum behind this to become a universal standard. Although it’s still early days for this approach it is quite promising.

Is it now ready for use?

The short answer is YES. With a multitude of tooling options, and standards emerging to facilitate interoperability, there is no reason to wait further for the dust to settle on AI Agents. Here is some further guidance on key factors when considering whether a task might be well suited to implementation by an AI Agent:

Task Clarity and Structure There are clear objectives, known constraints, and the activity can be expressed in logical steps or patterns.
Data Availability There’s a dataset or a way to simulate the context for the agent to learn and be developed.
Outcome Measurability You can quantify task performance (e.g., accuracy, cost savings, user satisfaction, time saved).
Autonomy The task benefits from an environment where the outcome can be improved upon so that it learns or adapts over time.
Complexity and Repetition How much reasoning is required or can a pre-defined set of rules be made?
Environment Dynamics Does the context change over time requiring ongoing adaptation?
Risk and Supervision What sort of oversight and traceability is required? What would be the impact of a wrong outcome?

Cloud Native IT, Infrastructure & Security Practice

Toby Vidler

Tags

Overview

Tooling

Challenges

Solving interoperability

Is it now ready for use?