Agentic AI: The Evolution of Application Development
Image: Depositphotos
I’ve been following the evolution of AI since the 1970s, especially the more recent era of data-centric AI systems based on highly sophisticated models trained with large amounts of information and powerful computer technologies. We were wowed when, in 1997, Deep Blue won a celebrated chess match against then-reigning champion Garry Kasparov, meeting one of the earliest and most concrete grand challenges of AI. The 2010s saw increasingly powerful deep learning AI systems surpass human levels of performance in a number of tasks like image and speech recognition, skin and breast cancer detection, and playing championship-level Go.
More recently, the impressive ability of large language models (LLMs) and chatbots to interact with us and generate cogent, articulate sentences has given us the illusion that we’re dealing with a well-educated, intelligent human rather than with a sophisticated stochastic parrot that’s been trained on huge amounts of human language but has no human-like understanding of the ideas underlying the sentences it’s putting together.
Agentic systems are now taking AI to the next level.
“We are beginning an evolution from knowledge-based, gen-AI-powered tools — say, chatbots that answer questions and generate content — to gen AI–enabled agents that use foundation models to execute complex, multistep workflows across a digital world,” wrote Lareina Yee, Michael Chui, Roger Roberts, and Stephen Xu in “Why agents are the next frontier of generative AI,” a McKinsey Digital report published in July of 2024.
“Broadly speaking, agentic systems refer to digital systems that can independently interact in a dynamic world,” said the report. “While versions of these software systems have existed for years, the natural-language capabilities of gen AI unveil new possibilities, enabling systems that can plan their actions, use online tools to complete those tasks, collaborate with other agents and people, and learn to improve their performance. Gen AI agents eventually could act as skilled virtual coworkers, working with humans in a seamless and natural manner. A virtual assistant, for example, could plan and book a complex personalized travel itinerary, handling logistics across multiple travel platforms. Using everyday language, an engineer could describe a new software feature to a programmer agent, which would then code, test, iterate, and deploy the tool it helped create.”
“In short, the technology is moving from thought to action,” noted the authors.
I have long understood what the report calls the thought phase of AI based on AI models that can make reasonable statistical predictions after being trained with large amounts of information. However, it’s taking me a while to figure out how to think of the emerging action phase of AI, where a person tells the AI agent what to do using an LLM and the agent proceeds to do so without human intervention. What does that mean in practice?
Wikipedia defines Agentic AI as “a class of artificial intelligence that focuses on autonomous systems that can make decisions and perform tasks without human intervention.” It further adds that “The core concept of agentic AI is the use of AI agents to perform automated tasks but without human intervention.”
We’ve long been using computers to automate tasks and applications. Over the years, we’ve been doing so with increasingly sophisticated application development platforms and programming languages. For example, I first learned to program in the summer of 1962 using assembly language, which generally had one language statement for each machine instruction. Over the years, I started programming in high-level languages, such as Fortran for scientific applications, PL/I for more general purpose applications, and so on. Depending on the size and complexity of the application, it was all quite labor intensive. It was up to me to decompose the application into a number of modular components, use existing programming libraries like scientific subroutines, and make sure that the overall program had no bugs and worked as I had intended.
Application development platforms have since become increasingly sophisticated, based on significantly higher-level languages and tools which enable us to automate more and more of the labor involved in developing highly complex applications. For example, Robotic Process Automation (RPA) uses so-called software robotics tools to automate relatively simple, repetitive tasks by being shown how a human performs a task and then automatically mimicking those actions in subsequent tasks.
After reading lots of articles on the subject, I’ve concluded that agentic AI should be viewed as the evolution of application development over the sixty years since I learned to program in assembly language. Instead of carefully specifying every step of the application, as in traditional, deterministic programming, agentic application development systems are based on the statistically oriented methods of AI foundation models, which have been trained on vast datasets and can be applied across a wide range of use cases. Programmers are now becoming very high-level application engineers whose job is to design and clearly specify the overall application they're building, including the various agents or assistants needed to develop it. Their job is to define the goals and actions the agents should follow under a variety of conditions using LLMs and other tools, as well as to coordinate the overall execution of the application to make sure that it’s working as intended.
“Agentic systems traditionally have been difficult to implement, requiring laborious, rule-based programming or highly specific training of machine-learning models,” said the McKinsey report. “Gen AI changes that. When agentic systems are built using foundation models (which have been trained on extremely large and varied unstructured data sets) rather than predefined rules, they have the potential to adapt to different scenarios in the same way that LLMs can respond intelligibly to prompts on which they have not been explicitly trained. Furthermore, using natural language rather than programming code, a human user could direct a gen AI–enabled agent system to accomplish a complex workflow. A multiagent system could then interpret and organize this workflow into actionable tasks, assign work to specialized agents, execute these refined tasks using a digital ecosystem of tools, and collaborate with other agents and humans to iteratively improve the quality of its actions.”
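To make this shift concrete, here is a minimal, hypothetical sketch in Python: instead of codifying every step, the engineer specifies a goal and a set of tools, and the agent decides which tool to apply at each step. Everything here (the `Agent` class, the toy tools, the trivial `plan_step` planner) is invented for illustration; in a real agentic system, `plan_step` would be a call to a foundation model rather than a keyword rule.

```python
# Sketch of agentic application development: the "engineer" specifies a
# goal and tools; the agent chooses which tool to invoke at each step.
# All names here are hypothetical; plan_step stands in for an LLM call.

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    goal: str
    tools: dict[str, Callable[[str], str]]
    log: list[str] = field(default_factory=list)

    def plan_step(self, state: str) -> tuple[str, str]:
        """Stand-in for the model: pick the first tool whose name
        still appears in the remaining goal; a real agent would ask
        a foundation model to plan the next action."""
        for name in self.tools:
            if name in state:
                return name, state
        return "done", state

    def run(self, max_steps: int = 5) -> list[str]:
        state = self.goal
        for _ in range(max_steps):
            tool, arg = self.plan_step(state)
            if tool == "done":
                break
            result = self.tools[tool](arg)
            self.log.append(f"{tool} -> {result}")
            state = state.replace(tool, "", 1)  # mark the subtask handled
        return self.log

# Hypothetical tools for the travel-itinerary example from the report.
tools = {
    "flights": lambda g: "booked BOS->SFO",
    "hotel": lambda g: "reserved 3 nights",
}
agent = Agent(goal="book flights and hotel for the trip", tools=tools)
print(agent.run())
```

The point of the sketch is the division of labor: the human declares the goal and supplies the tools; the planning loop, however simple here, is where a foundation model would adapt the sequence of actions to the situation at hand.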
Agents can unlock significant value for businesses based on their potential to automate the many relatively simple variations in complex applications that have historically been difficult to address in a cost- or time-efficient manner. For example, something as simple as organizing a business trip “can involve numerous possible itineraries encompassing different airlines and flights, not to mention hotel rewards programs, restaurant reservations, and off-hours activities, all of which must be handled across different online platforms. While there have been efforts to automate parts of this process, much of it still must be done manually.”
Gen AI–enabled agents can ease the automation of complex and open-ended applications in three important ways:
Agents can manage multiplicity. “Many business use cases and processes are characterized by a linear workflow, with a clear beginning and series of steps that lead to a specific resolution or outcome. This relative simplicity makes them easily codified and automated in rule-based systems.” However, explicit rule-based applications break down when faced with less predictable situations not contemplated by the original application designers. “But gen AI agent systems, because they are based on foundation models, have the potential to handle a wide variety of less-likely situations for a given use case, adapting in real time to perform the specialized tasks required to bring a process to completion.”
Agent systems can be directed with natural language. “Currently, to automate a use case, it first must be broken down into a series of rules and steps that can be codified. These steps are typically translated into computer code and integrated into software systems — an often costly and laborious process that requires significant technical expertise. Because agentic systems use natural language as a form of instruction, even complex workflows can be encoded more quickly and easily. What’s more, the process can potentially be done by nontechnical employees, rather than software engineers.”
Agents can work with existing software tools and platforms. In addition, agents can be trained to work with existing applications, search the web for information, collect human feedback, and cooperate with other agents. “Digital-tool use is both a defining characteristic of agents (it’s one way that they can act in the world) but also a way in which their gen AI capabilities can uniquely be brought to bear. Foundation models can learn how to interface with tools, whether through natural language or other interfaces.”
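As a toy illustration of this tool-use idea, the sketch below registers existing tools with natural-language descriptions and picks one by matching a request against those descriptions. The simple keyword matcher stands in for a foundation model, and all tool names and descriptions are hypothetical.

```python
# Sketch of "digital-tool use": existing tools are registered with
# natural-language descriptions; the agent selects a tool by matching
# the request to a description. The word-overlap scorer below stands
# in for a foundation model; names are invented for illustration.

from typing import Callable

TOOLS: dict[str, tuple[str, Callable[[str], str]]] = {
    "web_search": ("search the web for information",
                   lambda q: f"top results for '{q}'"),
    "calendar": ("check availability and book meetings",
                 lambda q: f"meeting booked: {q}"),
}

def choose_tool(request: str) -> str:
    """Stand-in for the model: score each tool by the number of words
    its description shares with the request."""
    words = set(request.lower().split())
    scores = {name: len(words & set(desc.split()))
              for name, (desc, _) in TOOLS.items()}
    return max(scores, key=scores.get)

def act(request: str) -> str:
    name = choose_tool(request)
    _, fn = TOOLS[name]
    return fn(request)

print(act("search for direct flights to Chicago"))
```

A real system would replace the word-overlap scorer with a model that reads the tool descriptions and the request, but the registry-plus-selector structure is the essence of how agents interface with an existing digital ecosystem of tools.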
“Although agent technology is quite nascent, increasing investments in these tools could result in agentic systems achieving notable milestones and being deployed at scale over the next few years,” said the McKinsey report in conclusion. “As such, it is not too soon for business leaders to learn more about agents and consider whether some of their core processes or business imperatives can be accelerated with agentic systems and capabilities.”
Finally, the report recommends that organizations consider three key factors as they prepare for the advent of agentic systems:
Codification of relevant knowledge: Implementing complex use cases will likely require organizations to define and document business processes into codified workflows that are then used to train agents.
Strategic tech planning: Organizations will need to organize their data and IT systems to ensure that agent systems can interface effectively with existing infrastructure.
Human-in-the-loop control mechanisms: As gen AI agents begin interacting with the real world, control mechanisms are essential to balance autonomy and risk. Humans must validate outputs for accuracy, compliance, and fairness, and work with subject-matter experts to maintain and scale agent systems.
Agentic AI is a very exciting next step in application development, but, given the continuing hype around AI, let me conclude with some words of caution.
Some of the articles I've read feel like magic to me. You tell the system what you want using an LLM and the system automatically does it, much like asking a chatbot a question. While that might work for very simple applications, the knowledge and experience required go up very rapidly with the complexity of the overall system being developed, much as they do with just about any engineering job. It brings to mind a blog post I wrote a few months ago, “Will AI Devour Software Engineering (SE)?,” based on a recent paper by CMU computer scientists Eunsuk Kang and Mary Shaw.
“To the contrary, the engineering discipline of software is rich and robust; it encompasses the full scope of software design, development, deployment, and practical use; and it has regularly assimilated radical new offerings from AI,” said Kang and Shaw. “Current AI innovations such as machine learning, large language models (LLMs) and generative AI will offer new opportunities to extend the models and methods of SE. They may automate some routine development processes, and they will bring new kinds of components and architectures. If we’re fortunate they may force SE to rethink what we mean by correctness and reliability. They will not, however, render SE irrelevant.”
Irving Wladawsky-Berger
Irving Wladawsky-Berger, PhD, is a Research Affiliate at MIT's Sloan School of Management and at Cybersecurity at MIT Sloan (CAMS), and a Fellow of the Initiative on the Digital Economy, of MIT Connection Science, and of the Stanford Digital Economy Lab.
Visit Irving on LinkedIn.