From Chatbots to Autonomous Agents: The Future of AI Is Already Here
From Brain Prediction to Software Autonomy: How Intelligent Agents Will Transform Our World
In the book A Thousand Brains, scientist and entrepreneur Jeff Hawkins, known for inventing the famous Palm Pilot PDA, the precursor to smartphones, presents a vision of the human brain as a prediction machine. Our brain constantly anticipates what might happen, imagining near-term futures in search of potential risks.
Other scientists have suggested that this predictive capability may be the reason our species survived over time, while other species of the Homo genus became extinct. It once kept us alive, but now that technological innovation has brought us a world full of abundance, a new set of problems is beginning to emerge.
FOMO in the realm of technological innovation keeps us constantly wondering what the next great revolution will be and, specifically, what will follow Generative Artificial Intelligence, which has kept us entertained over the last two years.
Agents That Will Take Actions: The Next Technological Evolution
Everything points to the next major evolution in Artificial Intelligence being Agents. This represents an advancement over the current chatbot technology based on LLMs (Large Language Models). In the near future, ChatGPT, Gemini, or Claude could become so intelligent that they develop the ability to reason at a human level to solve problems and be granted the autonomy to take actions on their own.
Evolution and levels of Artificial General Intelligence (AGI)
| Level | Name | Description |
| --- | --- | --- |
| 1 | Chatbots | Conversational language AI |
| 2 | Reasoners | Human-level problem-solving |
| 3 | Agents | Systems that can take actions |
| 4 | Innovators | AI that can assist in invention |
| 5 | Organizations | AI that can perform the work of an organization |
Does this sound too futuristic? It does. However, just a few years ago, we couldn’t have imagined having a tool like ChatGPT at our disposal, one that proves more helpful for all kinds of work-related tasks the more we use it. To understand what we’re talking about, let’s take a first look at the idea of an Agent through the AgentGPT initiative.
Planning and executing tasks to achieve goals
AgentGPT allows users to configure and deploy what they call autonomous AI agents directly from the browser. These agents can be customized with a specific goal and are tasked with planning and executing tasks to achieve that goal autonomously. They use advanced language models, such as GPT-4, to understand and perform tasks without human intervention. Users can employ AgentGPT for a variety of objectives, such as developing marketing strategies, building web applications, and creating content. The platform stands out for its ability to perform multiple iterations to continuously improve its results and its advanced configuration options for specific tasks.
AgentGPT makes this possible by connecting several GPTs that work together in a coordinated way. One GPT analyzes the goal, another proposes tasks, another selects them based on relevance, another executes them, and so on. This division of labor tends to produce better results than asking a single GPT to perform the entire process.
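The division of labor described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not AgentGPT's actual code: the function names (`analyze_goal`, `propose_tasks`, `select_tasks`, `execute_task`, `run_agent`) are hypothetical, and each role is a plain stub standing in for what would be a separate language-model call in a real system.

```python
from collections import deque


def analyze_goal(goal: str) -> str:
    """Role 1 (stub): normalize the goal. A real agent would ask a model."""
    return goal.strip().rstrip(".")


def propose_tasks(goal: str) -> list[str]:
    """Role 2 (stub): propose candidate tasks, including an irrelevant one."""
    return [f"research: {goal}", f"draft: {goal}", "unrelated: tidy the desk"]


def select_tasks(goal: str, tasks: list[str]) -> list[str]:
    """Role 3 (stub): keep only tasks that mention the goal."""
    return [task for task in tasks if goal in task]


def execute_task(task: str) -> str:
    """Role 4 (stub): pretend to carry out the task and report back."""
    return f"done -> {task}"


def run_agent(goal: str, max_steps: int = 5) -> list[str]:
    """Chain the four roles; max_steps caps how much the agent may do."""
    analyzed = analyze_goal(goal)
    queue = deque(select_tasks(analyzed, propose_tasks(analyzed)))
    results = []
    while queue and len(results) < max_steps:
        results.append(execute_task(queue.popleft()))
    return results


if __name__ == "__main__":
    for line in run_agent("launch plan for a travel newsletter."):
        print(line)
```

The `max_steps` cap is a design choice worth keeping even in a toy version: an agent that plans its own tasks needs an explicit limit on how many of them it is allowed to run.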
Agents with autonomy
Autonomy therefore takes center stage when we talk about Agents. However, as we can imagine, granting autonomy to software requires certain precautions, which come down to configuring two key elements in advance:
- The first is to limit their range of action, as initially, these Agents will not be multipurpose but will be integrated within a specific domain, usually within a website. For example, they might handle customer service. In this way, we could have an Agent that can do all sorts of things the user needs, but always within the confines of the website’s theme. In the case of a travel website, it could assist the customer throughout the process, even booking the trip through the chat, but it wouldn’t be able to provide psychological or medical advice, as all topics outside of travel would be restricted.
- The second element in this configuration is granting the Artificial Intelligence access to the company’s internal systems, primarily through APIs, so that autonomy becomes real and the Agent can take actions on its own. This means not only providing the information it has been trained on but also, for example, handling the booking process for flights and accommodations, connecting with the various alert and messaging systems for users, and managing billing, travel insurance contracts, and so on.
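The two elements above, a restricted range of action and access to internal systems, can be sketched as a topic allowlist plus a registry of callable tools. This is a minimal sketch under assumptions: `ScopedAgent`, `handle`, and `book_flight` are hypothetical names, and the booking function only returns a string where a real one would call the company's booking API.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ScopedAgent:
    # Element 1: the only topics the agent is allowed to deal with.
    allowed_topics: set[str]
    # Element 2: the internal actions the agent may trigger, by name.
    tools: dict[str, Callable[..., str]] = field(default_factory=dict)

    def handle(self, topic: str, action: str, **kwargs) -> str:
        """Refuse out-of-scope topics, then dispatch to a registered tool."""
        if topic not in self.allowed_topics:
            return "Sorry, I can only help with topics related to this site."
        tool = self.tools.get(action)
        if tool is None:
            return f"Action '{action}' is not available."
        return tool(**kwargs)


def book_flight(origin: str, destination: str) -> str:
    # Hypothetical stand-in for a call to an internal booking API.
    return f"Flight booked: {origin} -> {destination}"


agent = ScopedAgent(allowed_topics={"travel"}, tools={"book_flight": book_flight})

if __name__ == "__main__":
    print(agent.handle("travel", "book_flight", origin="MAD", destination="LIS"))
    print(agent.handle("medical", "diagnose"))
```

Because the agent can only reach actions that were registered explicitly, widening its autonomy is a deliberate configuration step rather than something the model decides for itself.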
This is how Agents bring GenAI to the entire Internet, rather than confining it to the standalone services offered so far for generating images, text, videos, etc.
Additionally, multimodal and multi-device capabilities will be of great importance for users to access the full potential of this technology. In fact, one of the major challenges developers of LLMs (OpenAI, Google, Meta…) are currently facing is adapting them to function smoothly across all kinds of devices, thereby facilitating their scalability.
Virtual Assistants capable of performing all kinds of tasks
To give you an idea, we’ve had this technology at our disposal for less than two years, and it still shows many consistency issues, but we’re already asking it to go a step further. Multimodality has been a significant advancement, and it arrived in just a year, which is a very short time.
For example, it’s now possible to take a picture of what’s inside your refrigerator and ask it to create a menu and recipe list with the available items. But perhaps the next step we’re waiting for is that, if essential ingredients for preparing a healthy meal are missing, the Agent itself will take care of placing an online order to have those items delivered to your home.
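The step from suggesting a recipe to acting on it can be illustrated with a small sketch. Everything here is an assumption made for illustration: recognizing the fridge contents from a photo is taken as already done, and `place_order` is a hypothetical stub where a real Agent would call a grocery store's ordering API.

```python
def missing_ingredients(recipe: set[str], in_fridge: set[str]) -> list[str]:
    """Return the recipe items that are not in the fridge, sorted by name."""
    return sorted(recipe - in_fridge)


def place_order(items: list[str]) -> str:
    # Hypothetical stand-in for an online grocery order.
    if not items:
        return "Nothing to order"
    return "Ordered: " + ", ".join(items)


if __name__ == "__main__":
    recipe = {"spinach", "eggs", "olive oil", "tomatoes"}
    in_fridge = {"eggs", "tomatoes", "milk"}
    print(place_order(missing_ingredients(recipe, in_fridge)))
```

In this sketch the chatbot's job ends at `missing_ingredients`, while the Agent's job includes the call to `place_order`.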
Until now, AI-based Chatbots have mainly answered our questions and helped us with various tasks related to content creation. When this technology is integrated into all kinds of websites as Agents, it will also be able to perform specific functions related to the content of each website. Finally, when the websites themselves grant access to ChatGPT, Gemini, Claude, and others through their APIs, we will have true virtual assistants that can perform all kinds of tasks for us. How? By navigating the Internet according to pre-established instructions.
Agents as part of our daily lives
At SNGULAR, we have been working with Agents since the appearance of LLMs and have already implemented this type of technology in several client projects. This allows us to say that the trend is real, and that many companies will want to integrate these functions into their software and websites.
For now, we won’t delve into the “Innovators” and “Organizations” levels that OpenAI predicts, but is there any doubt that this trend is on the horizon? Many companies and professionals are already working to make this possible. As a general recommendation, it seems like a good idea to start preparing for this, even if only mentally.
Mastering Artificial Intelligence
At the rate AI is evolving, mastering it is becoming a necessity. At TecnoFor, we offer training to help you develop the skills needed to implement AI and take advantage of its capabilities in your work environment:
- Fundamentals of Prompting in Generative AI
- Generative AI: Practical Course to Improve Productivity and Creativity