Artificial intelligence expert Andrew Ng gave one of his excellent speeches at Sequoia Capital's AI Ascent 2024, and guess what?
He brought to light a new trend in the AI landscape: Agentic Workflow and AI Agents.
Agentic workflow is an innovative process of interacting with LLMs to complete complex tasks and produce outputs that are significantly more accurate than those produced by traditional methods.
Unlike the zero-shot method, Agentic workflow takes an iterative, multi-step approach that breaks one complex task into several small steps. This allows the model to consider your feedback at every step, self-reflect and collaborate with multiple agents to execute the tasks and produce output that’s over 41% more accurate than the conventional method.
This article aims to decode the concept of Agentic Workflow, its design patterns and workflow pillars.
Agentic workflow is an iterative, multi-step process for interacting with and instructing Large Language Models (LLMs) to complete complex tasks with more accuracy.
In this process, a single task is divided into multiple small tasks that are more manageable and leave scope for improvements throughout the task completion process.
The Agentic workflow also involves deploying several AI agents to carry out specific roles and tasks. These agents are equipped with specific personalities and attributes that make them capable of collaborating and executing defined tasks with high accuracy.
Another key highlight of Agentic Workflow is the use of advanced prompt engineering techniques and frameworks, such as chain of thought, planning and self-reflection.
Together, these prompt engineering techniques and the multi-agent approach enable the AI agents to autonomously plan, collaborate, determine and execute the necessary steps to complete tasks.
Let's assume you ask the LLM to write you a blog. In the traditional approach, you will enter one prompt instructing the LLM to write a blog on a certain topic. It is like asking someone to write a blog from start to finish without reviewing the research sources, checking the outlines and improving the tone and quality of the content.
The traditional zero-shot approach leaves no scope for iterations, feedback and improvements during the process of writing the blog. This significantly reduces the accuracy and quality of the output.
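On the contrary, in the Agentic workflow, we don’t give a single prompt to write the blog. Instead, we break down the task into smaller tasks, such as reviewing research sources, preparing an outline, writing a draft and improving the tone and quality of the content.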
Here the LLM is instructed to complete the bigger task by following a step-by-step process. The output of each step acts as the input for the next task.
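The step-by-step chaining described here can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: `fake_llm` is a hypothetical stand-in for a real model call, and the step wording in `STEPS` is an assumption chosen to mirror the blog-writing example.

```python
from typing import Callable, List


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; it only echoes the prompt."""
    return f"<output for: {prompt}>"


def run_chain(topic: str, steps: List[str], llm: Callable[[str], str]) -> List[str]:
    """Run the steps in order, feeding each step's output into the next prompt."""
    outputs = []
    previous = topic
    for step in steps:
        prompt = f"{step}\n\nInput:\n{previous}"
        previous = llm(prompt)
        outputs.append(previous)
    return outputs


# Assumed step wording, for illustration only.
STEPS = [
    "Research the topic and list key sources.",
    "Draft an outline from the research.",
    "Write a first draft from the outline.",
    "Revise the draft for tone and quality.",
]

results = run_chain("agentic workflows", STEPS, fake_llm)
```

Because every intermediate output is kept in `results`, a human or another agent could review and correct any step before the next one runs.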
In summary, Agentic Workflow is an iterative and collaborative model that transforms the interaction with LLMs into a series of manageable, refinable steps, allowing for continuous improvement and adaptation throughout the task-completion process.
In his speech, Andrew Ng showed a case study where his team evaluated the coding capabilities of the GPT-3.5 and GPT-4 models. The team used the HumanEval coding benchmark to compare the results of the traditional “zero-shot prompting” method with those of the Agentic Workflow method in solving code problems.
The task was: “Given a non-empty list of integers, return the sum of all even-positioned elements.”
This case vividly demonstrates that even earlier, less capable large language models (such as GPT-3.5) can achieve superior performance on complex problems by breaking tasks into multiple steps and repeatedly iterating and optimizing, surpassing what a one-time direct output can achieve.
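To make the iterate-and-test idea concrete, here is a small sketch built on the task as quoted above. It assumes "even-positioned" means 0-based indices 0, 2, 4 and so on, and the two attempts are hand-written stand-ins for successive model outputs, not actual model responses.

```python
from typing import Callable, List, Optional, Tuple

Solution = Callable[[List[int]], int]


def first_attempt(numbers: List[int]) -> int:
    # Stand-in for a flawed zero-shot answer: it starts at index 1 by mistake.
    return sum(numbers[1::2])


def revised_attempt(numbers: List[int]) -> int:
    # Stand-in for the answer after feedback: indices 0, 2, 4, ...
    return sum(numbers[::2])


def passes_tests(candidate: Solution) -> bool:
    """Check a candidate against a few hand-made cases (0-based positions assumed)."""
    cases = [([5, 8, 7, 1], 12), ([3], 3), ([1, 2, 3, 4, 5], 9)]
    return all(candidate(values) == expected for values, expected in cases)


def iterate(candidates: List[Solution]) -> Optional[Tuple[int, Solution]]:
    """Return the first passing candidate and its attempt number, or None."""
    for attempt_number, candidate in enumerate(candidates, start=1):
        if passes_tests(candidate):
            return attempt_number, candidate
    return None


attempt_number, winner = iterate([first_attempt, revised_attempt])
```

In a real agentic loop, the failing test results would be sent back to the model as feedback so that it can produce the next attempt.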
There are three pillars of the Agentic workflow process.
Let's understand each of them.
Andrew Ng explained 4 common AI agent design patterns to use in the Agentic workflow: reflection, tool use, planning and multi-agent collaboration.
In the reflection pattern, an AI system enhances its capabilities through self-feedback and iterative refinement. By reflecting on and analyzing its initial output, the AI system can improve the quality and accuracy of its results.
This method is applicable not only to programming tasks but also to other fields such as writing, design, or any activity that benefits from iterative improvement.
These techniques enable language models to become more adaptive and flexible, effectively catering to users' needs. In real-world applications, this method is frequently employed, involving multiple rounds of interaction and gradual corrections to help the AI deliver more satisfactory responses.
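A reflection loop of this kind can be sketched as "critique, then revise, until nothing is left to fix". In the sketch below, `toy_critique` and `toy_revise` are hypothetical stand-ins for LLM calls, and the list of required parts is an assumption made only so the example runs offline.

```python
from typing import Callable, List


def reflect_and_refine(
    draft: str,
    critique: Callable[[str], List[str]],
    revise: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> str:
    """Alternate critique and revision until no issues remain or rounds run out."""
    for _ in range(max_rounds):
        issues = critique(draft)
        if not issues:
            break
        draft = revise(draft, issues)
    return draft


# Toy stand-ins for the model's self-critique and revision steps.
REQUIRED_PARTS = ["intro", "body", "conclusion"]


def toy_critique(draft: str) -> List[str]:
    return [f"missing {part}" for part in REQUIRED_PARTS if part not in draft]


def toy_revise(draft: str, issues: List[str]) -> str:
    # Fix only the first issue per round so several rounds are visible.
    missing_part = issues[0].split(" ", 1)[1]
    return f"{draft} {missing_part}"


final_draft = reflect_and_refine("intro", toy_critique, toy_revise)
```

The `max_rounds` cap is a design choice: it keeps the loop from running forever if the critic is never satisfied.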
The concept of tool use emerged from early explorations in the field of computer vision. Initially, language models couldn't process images, so the solution was to create functions that could interact with visual APIs for tasks like image generation and object detection.
With the advent of multimodal language models such as GPT, the idea of tool use gained popularity, transforming language models from isolated systems into intelligent agents integrated with external tools and knowledge bases.
Through tool use, language models can now undertake a variety of tasks, including web searches, code generation, and enhancing personal productivity, thereby significantly extending their original natural language processing capabilities.
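One common way to wire up tool use is to let the model emit a structured request that the application dispatches to a registered function. The sketch below assumes a JSON request with `tool` and `args` fields; that format, the tool names and the hard-coded `model_output` are all illustrative assumptions, not any particular vendor's API.

```python
import json
from typing import Any, Callable, Dict

# Registry of tools the model is allowed to call (names are illustrative).
TOOLS: Dict[str, Callable[..., Any]] = {
    "add": lambda a, b: a + b,
    "word_count": lambda text: len(text.split()),
}


def call_tool(request_json: str) -> Any:
    """Parse a model-produced tool request and run the matching function."""
    request = json.loads(request_json)
    name = request["tool"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**request.get("args", {}))


# Stand-in for what a model might emit when it decides to use a tool.
model_output = '{"tool": "add", "args": {"a": 2, "b": 3}}'
result = call_tool(model_output)
```

Rejecting unknown tool names keeps the model limited to the functions the application has explicitly exposed.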
Looking ahead, the integration of tool use is likely to become a crucial direction in the evolution of language models, equipping them with enhanced planning, reasoning, and action capabilities.
Planning involves training language models to reason, devise, and decompose complex tasks. This capability allows language models to go beyond merely answering questions by proactively developing and executing action plans.
With planning abilities, language models can autonomously break down tasks, identify the necessary substeps and tools, and coordinate the use of various models.
For example, as Andrew mentioned, a language model might need to first detect a person's posture in an image, then call an image generation model to create a new image, and finally integrate it with voice synthesis to produce the final output.
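The example above can be sketched as a plan, meaning an ordered list of tool names, plus an executor that passes each result to the next step. Everything here is a toy: the three functions return labelled strings instead of calling real models, and `make_plan` is hard-coded where a real system would ask the LLM to produce the plan.

```python
from typing import Callable, Dict, List


def detect_pose(image: str) -> str:
    return f"pose({image})"


def generate_image(pose: str) -> str:
    return f"image_from({pose})"


def synthesize_voice(image: str) -> str:
    return f"speech_about({image})"


# Hypothetical tool registry; the functions above only simulate real models.
TOOLS: Dict[str, Callable[[str], str]] = {
    "detect_pose": detect_pose,
    "generate_image": generate_image,
    "synthesize_voice": synthesize_voice,
}


def make_plan(goal: str) -> List[str]:
    """Stand-in planner: a real one would ask the LLM to decompose the goal."""
    return ["detect_pose", "generate_image", "synthesize_voice"]


def execute(plan: List[str], initial_input: str) -> str:
    """Run each planned step in order, passing the result along."""
    value = initial_input
    for step in plan:
        value = TOOLS[step](value)
    return value


plan = make_plan("describe a new image in the same pose, out loud")
final_output = execute(plan, "photo.jpg")
```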
Multiagent collaboration involves multiple language models or agents working together through interaction to complete complex tasks.
For instance, simulating experts in different roles, like doctors and nurses, can help in jointly developing diagnostic and treatment plans.
The crucial aspect of this approach is training the agents to collaborate efficiently, ensuring a clear division of labor to prevent conflicts and contradictions.
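A division of labor between agents can be sketched as a set of role-specific responders that take turns on a shared message. The `Agent` class, the doctor and nurse behaviors and the turn-taking scheme below are illustrative assumptions; real agents would wrap LLM calls with role-specific prompts.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Agent:
    role: str
    respond: Callable[[str], str]  # stand-in for a role-specific LLM call


def collaborate(agents: List[Agent], task: str, rounds: int = 1) -> List[Tuple[str, str]]:
    """Let agents take turns, each building on the previous agent's message."""
    transcript = []
    message = task
    for _ in range(rounds):
        for agent in agents:
            message = agent.respond(message)
            transcript.append((agent.role, message))
    return transcript


# Toy agents mirroring the doctor and nurse example.
doctor = Agent("doctor", lambda msg: f"diagnosis for [{msg}]")
nurse = Agent("nurse", lambda msg: f"care plan based on [{msg}]")

transcript = collaborate([doctor, nurse], "patient reports a persistent cough")
```

Keeping a transcript of who said what makes it easier to spot where two agents contradict each other.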
In the future, multiagent systems could become a powerful tool for solving complex problems, showcasing a level of collaborative ability that exceeds that of individual agents.
In conclusion, manually simulating Agentic Workflows within chatbots is not just a stepping stone, but a springboard. It propels us towards a future brimming with intelligent and collaborative AI systems.
Through these simulations, we can unlock the true potential of LLMs and AI agents, fostering innovation and propelling us forward in the ever-evolving realm of artificial intelligence.
The knowledge gleaned from these experiments will serve as the blueprint for crafting Agentic Workflows that revolutionize industries and empower real-world applications.
As we delve deeper into this exploration, we stand poised to unlock a future empowered by effective AI collaboration.