AutoGen Agents – Fast overview for beginners
AutoGen Agents: A Beginner’s Guide to Automated Task Handling
Automation has become a cornerstone of efficiency and productivity in software work. AutoGen Agents are a recent development in this field: they let both beginners and experienced developers automate tasks by having AI agents talk to each other. This guide explains what AutoGen Agents are, how they work, what benefits they offer, and how to create your first automated workflow with them.
What Are AutoGen Agents?
AutoGen Agents are digital entities designed to perform specific tasks autonomously, with little or no human intervention. These agents can use tools, language, and communication to interact with other agents and people, solve problems, and complete tasks. They suit repetitive or complex functions and operate through predefined or dynamically generated workflows, depending on the task at hand.
How Do AutoGen Agents Work?
To understand how AutoGen Agents function, consider a simple two-agent workflow example, where one agent communicates with another to perform a task. This process involves several key steps:
- Initiation: A human user initiates a request (e.g., "Generate a word cloud").
- Processing by User Proxy: The first agent, known as the User Proxy, receives this request. Its primary role is to act as a mediator, passing the request to the second agent.
- Handling by the Assistant Agent: The second agent, the Assistant, processes the request. Equipped with capabilities to integrate with large language models, the Assistant understands the request, generates the necessary code, and sends it back to the User Proxy.
- Execution and Response: The User Proxy executes the code and reports the result to the Assistant. If execution fails, the Assistant generates a fix and the exchange repeats; once it succeeds, the Assistant notifies the user that the request has been completed.
This workflow illustrates the basic operating mechanism of AutoGen Agents, showcasing their ability to collaborate and solve tasks efficiently.
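To make the hand-offs concrete, here is a minimal toy sketch of the loop in plain Python. It does not use the AutoGen library: ToyAssistant, ToyUserProxy, and run_workflow are hypothetical names, and the "assistant" returns canned code instead of calling a language model. It only illustrates the pattern of generate, execute, report, and retry.

```python
from dataclasses import dataclass


@dataclass
class ToyAssistant:
    """Hypothetical stand-in for the LLM-backed Assistant."""
    attempts: int = 0

    def reply(self, message: str) -> str:
        self.attempts += 1
        if self.attempts == 1:
            # First draft contains a bug: 'words' is never defined.
            return "result = len(words)"
        # After receiving the error report, send a corrected script.
        return "words = ['auto', 'gen']\nresult = len(words)"


class ToyUserProxy:
    """Hypothetical stand-in for the User Proxy: runs code, reports back."""

    def execute(self, code: str) -> tuple[bool, str]:
        scope: dict = {}
        try:
            exec(code, scope)  # toy only; never exec untrusted code like this
            return True, f"exitcode: 0, result={scope['result']}"
        except Exception as exc:
            return False, f"exitcode: 1, {type(exc).__name__}: {exc}"


def run_workflow(request: str, max_turns: int = 3) -> list[str]:
    """Relay the request, execute each reply, and stop on success."""
    assistant, proxy = ToyAssistant(), ToyUserProxy()
    transcript = [f"user -> proxy: {request}"]
    message = request
    for _ in range(max_turns):
        code = assistant.reply(message)
        ok, feedback = proxy.execute(code)
        transcript.append(f"proxy -> assistant: {feedback}")
        if ok:
            transcript.append("assistant -> user: request completed")
            break
        message = feedback  # the error becomes the next prompt
    return transcript


if __name__ == "__main__":
    for line in run_workflow("Generate a word cloud"):
        print(line)
```

The max_turns cap in this sketch plays the same protective role as a turn limit in a real agent conversation: it stops the exchange if the two sides never converge on working code.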
Creating Your Own AutoGen Agents
Setting up your own AutoGen Agents takes a few structured steps. Here’s a beginner-friendly guide to building a simple application with AutoGen Agents, following our earlier example of generating a word cloud:
Step 1: Setting Up Your Environment
- Open your terminal, create a new directory called autogen_agents_demo, and set up a virtual environment by running:
    python -m venv vm
    source vm/bin/activate
Step 2: Installing AutoGen
- Install AutoGen by running:
pip install pyautogen
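A quick way to confirm the install landed in the active virtual environment is to check whether the module can be found. This is a small sketch using only the standard library; is_installed is a hypothetical helper name, and it relies on the pyautogen package being imported under the module name autogen.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # The pyautogen package is imported as `autogen`.
    print("autogen available:", is_installed("autogen"))
```

If this prints False, the usual cause is that the virtual environment was not activated before running pip.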
Step 3: Writing Your Application Code
- Create a new script, app.py. Start with the import statements and the LLM configuration (here GPT-4), then set up the Assistant and User Proxy agents. Example snippet:
    import os

    import autogen
    from autogen import AssistantAgent, UserProxyAgent

    # Assumes the key was exported as OPENAI_API_KEY before running the script
    llm_config = {"model": "gpt-4", "api_key": os.environ["OPENAI_API_KEY"]}

    assistant = AssistantAgent("assistant", llm_config=llm_config)

    # The User Proxy runs generated code locally, inside the "coding" directory
    user_proxy = UserProxyAgent(
        "user_proxy",
        human_input_mode="NEVER",
        llm_config=False,
        code_execution_config={
            "executor": autogen.coding.LocalCommandLineCodeExecutor(work_dir="coding")
        },
    )

    # Start a conversation between the agents to generate a word cloud
    response = user_proxy.initiate_chat(assistant, message="Generate a word cloud")
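Hard-coding an API key in a script makes it easy to leak, and the video instead exports the key in the terminal and reads it from the environment. The helper below is one possible way to do that. build_llm_config is a hypothetical name, and OPENAI_API_KEY is an assumed variable name, since the source does not state which name was exported.

```python
import os


def build_llm_config(model: str = "gpt-4", env_var: str = "OPENAI_API_KEY") -> dict:
    """Build an LLM config dict, reading the API key from the environment.

    `env_var` is an assumption; use whatever name you exported in the terminal.
    """
    key = os.environ.get(env_var)
    if not key:
        # Fail early with a readable message instead of a late API error.
        raise RuntimeError(f"Set the {env_var} environment variable before running.")
    return {"model": model, "api_key": key}
```

The returned dictionary has the same two keys as the configuration in the snippet, so it can be passed wherever that configuration is used.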
Step 4: Execute and Monitor
- Run your application script and observe how the agents interact to complete your request. This process might involve iterative debugging based on the agents’ feedback on code execution successes or failures.
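The feedback the agents act on is essentially an exit code plus captured output. The sketch below shows that idea with the standard library only: it saves a script into a work directory, runs it, and returns what a reviewer (human or agent) would need to decide on a fix. It is an illustration of the concept, not AutoGen's executor implementation; run_in_workdir and demo are hypothetical names.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_workdir(code: str, work_dir: str, filename: str = "generated.py") -> tuple[int, str]:
    """Save `code` into `work_dir`, run it there, and return (exit code, output)."""
    folder = Path(work_dir)
    folder.mkdir(parents=True, exist_ok=True)
    script = folder / filename
    script.write_text(code, encoding="utf-8")
    completed = subprocess.run(
        [sys.executable, script.name],  # same interpreter as the caller
        cwd=folder,
        capture_output=True,
        text=True,
        timeout=60,
    )
    return completed.returncode, completed.stdout + completed.stderr


def demo() -> tuple[int, str]:
    """Run a script that imports a missing module, as in the video's failure case."""
    with tempfile.TemporaryDirectory() as tmp:
        return run_in_workdir("import module_that_is_not_installed_xyz", tmp)


if __name__ == "__main__":
    exit_code, output = demo()
    print("exit code:", exit_code)
    print(output)
```

A non-zero exit code together with a ModuleNotFoundError in the output is the kind of message that prompts the Assistant to suggest installing the missing package.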
Advantages of Using AutoGen Agents
AutoGen Agents provide numerous benefits that make them an attractive choice for automating various tasks:
- Efficiency: They can perform tasks in significantly less time than manual processing.
- Scalability: Easy to scale up for handling multiple tasks simultaneously.
[h3]Watch this video for the full details:[/h3]
Here’s all you need to know about AutoGen Agents if you’re just starting out!
This tutorial will take you through what AI Agents are, how they communicate with each other, and how you can install and build them yourself using AutoGen.
We’ll look at ConversableAgent and work with two of its subclasses: UserProxyAgent and AssistantAgent and see how they can coordinate to generate a word cloud image based on the contents of a given webpage.
Let’s go!
—
Unlock the source code for free at: https://hey.gettingstarted.ai
—
Connect with me:
Subscribe: @gswithai
Follow on X: https://www.twitter.com/gswithai
Book Consultation: https://forms.gle/XXE8tA2jBWFWYHaf8
—
Other resources:
Read on the blog: https://www.gettingstarted.ai/autogen-agents-overview/
ConversableAgent: https://microsoft.github.io/autogen/docs/reference/agentchat/conversable_agent/
User Proxy: https://microsoft.github.io/autogen/docs/reference/agentchat/user_proxy_agent
Assistant: https://microsoft.github.io/autogen/docs/reference/agentchat/assistant_agent
—
Timeline:
00:00 Introduction
00:37 What are AI Agents?
00:55 AutoGen Workflows
02:41 Overview of UserProxyAgent
03:19 Overview of AssistantAgent
03:45 Creating the demo application
04:35 Writing Python code
04:52 Creating the AssistantAgent
05:14 Creating the UserProxyAgent
06:07 Initiating AutoGen Workflow
06:59 Demo: Running the application
07:58 Exploring the work directory
08:29 Wrapping up & Summary
[h3]Transcript[/h3]
Hey, how’s it going? Today I’m going to ask a couple of AI assistants to do something for me. Check this out: "Hey assistant, generate a word cloud based on the content of this URL." Then the agents are going to talk to each other so that they can generate something that looks like this. Now, if I wanted to do this from scratch using Python, it would have taken me a few hours maybe, but AutoGen agents were able to do this in less than a minute. In this tutorial, I’m going to show you everything you need to know to get started with AutoGen agents. We’re going to go over what they are, how they work, how you can create them, and finally how agents were able to generate this word cloud all by themselves without any human intervention. But before that, make sure you subscribe to the channel right now so you don’t miss any future updates or tutorials.
Let’s define what agents are. You can think of an agent as a digital version of a person. This is because agents can use tools, language, and communication with other agents and people to solve problems.
Now let’s take a look at this diagram. It’s not fancy; I’ve tried my best, so I hope you appreciate the design effort. As you can see, at the top we have a human, and then we have two agents at the bottom. On the left side, the one with the hat is called the user proxy, and on the right side you have another agent with the mustache; he’s called the assistant. This is a two-agent workflow setup, because we have two agents that will work together to solve the task that was initiated by the human.
It starts with a request: the user says "generate a word cloud". The request reaches the first agent, the user proxy, which forwards the message to the assistant. This is because the assistant integrates with a large language model, so it’s able to understand the request and generate the necessary code. Then the assistant sends the code to the user proxy. Once the user proxy executes the function, a PNG file is created and the user proxy lets the assistant know that the code works. Finally, the assistant informs the user that the request succeeded.
Now, during this process the assistant may generate code that doesn’t work, or the user proxy may fail to execute the code. For example, let’s suppose that the user proxy is trying to execute a function that requires a dependency that’s not installed on the system, so it runs into a "module not found" error. Once this happens, the user proxy says to the assistant: "Hey, I ran into this issue, so I wasn’t able to use your code. Check it out and let me know." Then the assistant, using the large language model, generates a fix and sends it back to the user proxy. This back and forth may take a few minutes, but usually the agents figure it out by themselves and come up with a working solution. Awesome.
Now it’s important to point out that both the user proxy agent and the assistant agent are subclasses of a generic agent type within AutoGen called ConversableAgent. "Conversable" means able to have a conversation, so a ConversableAgent is essentially a generic agent type that can be configured to have conversations with other agents and people.
Now let’s zoom into the user proxy agent to see how it’s different from the assistant agent, since both of them are subclasses of ConversableAgent. First, a user proxy is configured by default to prompt for human input every time it receives a message. In my case earlier, I set this to never, so it never asked me for anything, but by default that’s how the user proxy behaves. The second point is that a user proxy is not connected to a large language model. That’s by default; you can obviously override these settings later. Finally, code execution is configured by default. This means that the user proxy is the agent that’s going to execute any code that’s received from other agents within the workflow. Cool.
Now let’s take a look at the assistant agent and see how it’s different from the user proxy agent. By default, the assistant agent will never stop within a conversation to ask for human input. Next, it integrates with a large language model to solve a task. This means that it can use the large language model to generate code and understand the request. Finally, it is not configured to execute any code. This means that it can generate the code, but it’s going to need the help of another agent with code execution capabilities to execute it.
Let’s see how you can build an AutoGen app with a two-agent workflow based on the word cloud example from before. First, we’re going to open the terminal window. Then we’re going to create a new directory, and we’re going to call it autogen_agents_demo. Next, we’re going to create a virtual environment and activate it. To do that, we’re going to run python -m venv and call the environment vm, and to activate it we’re going to run source vm/bin/activate. Now, since our assistant agent is going to be using a large language model, in this case from OpenAI, we’re going to need an OpenAI API key. If you don’t have one, go to the OpenAI console and create a new key. Copy it, and we’re going to add it here in the terminal: to do that, we’re going to type export and add the key here. Now I’m going to use pip to download and install AutoGen and set it up within our project, so I’m going to run pip install pyautogen. The last thing I’m going to do is create a new file, app.py.
Okay, now we’re going to write some code. First, I’m going to do import os. In our case we’re going to be using GPT-4, so we’re going to define llm_config with the model set to GPT-4 and our API key, that’s the variable that we exported in the step before. Next, we’re going to create our assistant agent. To do that, we’re going to import AssistantAgent, and then we’re going to do assistant = AssistantAgent. I’m going to call it "assistant"; this is a friendly name that you give to your agent, and you can call it whatever you want. We’re going to pass in the configuration, llm_config. That’s because the assistant agent is going to be using GPT-4 to generate some code.
Now we’re going to create our user proxy. I’m just going to add it here, and I’m going to do user_proxy = UserProxyAgent with human_input_mode set to NEVER, so I don’t want it to stop and ask me every time it receives a message; I just want it to work autonomously. Then we’re going to say llm_config is False. By default this is False; I’m just adding it here to show you that you can modify these things. Now we’re going to set up our code_execution_config, and that’s going to be an executor, autogen.coding.LocalCommandLineCodeExecutor. This instructs the user proxy agent that it should execute the code in the local command line. We’re also going to specify a directory, coding; you can call it whatever you want. This directory is going to be used by this agent to store code or any other files. All right, for this to work we just need to import autogen here, so we’re going to do import autogen. Easy so far, right?
The last thing that we’re going to do is initiate this conversation and send in our request, and we’re going to see how the agents talk to each other and figure it out. Let’s do that. I’m going to do user_proxy and call initiate_chat. Then we’re going to specify which assistant is going to handle the request, so we’re going to say assistant. This means that the user proxy is going to send our message to the assistant. Then we’re going to specify our message: generate a word cloud (if you’re not subscribed to the blog, make sure to check it out) and save the image as wordcloud.png. That’s it. We can specify extra parameters for initiate_chat here, like max_turns, which sets the maximum number of messages that these agents can use, but for this example we’re not going to do that. So I’m just going to keep it like this, save, and run it.
Let’s go back to our terminal and do python app.py. Now I’m going to go through every interaction between the agents. First, we can see that the user proxy relayed our message to generate the word cloud to the assistant; as you can see, that’s the first interaction. Then the assistant, using GPT-4, made sense of this request and came up with a five-step solution, and then it generated this code, which is kind of impressive to be honest. Then the user proxy tried to execute it, but it failed, as you can see here, and it said to the assistant: "Hey, the code failed because there is no module named wordcloud." Then the assistant said: "You don’t have the wordcloud Python library, so install it," and it gave it the pip command to install wordcloud, Beautiful Soup, and these other packages. Now the user proxy comes back to the assistant and says, "Hey, the execution succeeded this time." So the assistant comes back to the user proxy and says, "Great, the Python script was executed, so this means that you have a wordcloud.png file in the same directory where you ran the script," and terminate.
Now if we peek at the directory that the agents are using, the one that we called coding, we can see the generated files and what the agents did behind the scenes. Check this out: that’s our working directory, and as you can see we have our app.py file, that’s where the code is, and we have our environment file. I’m just interested in the coding directory that we’ve created. This is where the agents did the work: they wrote the code here, and they executed the code from this directory. We can also see that the wordcloud file is saved within this directory. Now let’s take a quick look at the output and see what it looks like. All right, that’s actually impressive for something that took about 20 seconds.
Now obviously that’s a super basic demo of what’s possible with AutoGen agents. You can build more complicated workflows that include not just one assistant but many more, each one with a unique set of skills. The sky’s the limit with what you can build. I’m glad you made it this far. If you enjoyed the video, give it a thumbs up and subscribe to the channel, because there are many more tutorials like this one that I’m working on; that way you don’t miss any updates. Thank you for watching, and I’ll see you soon.