Exploring Multi-Agent AI and AutoGen with Chi Wang
In the ever-evolving domain of artificial intelligence, multi-agent systems stand out as a revolutionary approach to emulate complex behaviors akin to human intelligence. Chi Wang, a principal researcher at Microsoft, is at the forefront of this innovation with the creation of AutoGen. This framework leverages large language models (LLMs), tools, and human inputs to build dynamic multi-agent systems. In this insightful discussion hosted by Joanne Chen, General Partner at Foundation Capital, Chi Wang delves into the intricacies of multi-agent AI, the challenges, and real-world applications facilitated by AutoGen.
Introduction to Multi-Agent Systems
The concept of multi-agent systems is rooted deeply in the theory of intelligence, drawing inspiration from Marvin Minsky’s "Society of Mind" theory (1986). This theory posits that human intelligence comprises numerous simpler processes, which, when combined, exhibit highly intelligent behaviors. Similarly, in multi-agent systems, individual agents might not be inherently powerful, but when strategically composed and interacted, they exhibit a synergistic intelligence.
Core Concepts and Applications of AutoGen
AutoGen is an open-source framework that combines the capabilities of LLMs with tool use and human input. This hybrid setup allows for the creation of robust applications spanning various industries. Chi Wang illustrates how simple agents, when composed into a network, can tackle complex tasks that would be daunting for a single agent. This approach not only simplifies the development process but also enhances the system’s efficiency and capability.
Benefits and Challenges of Multi-Agent Architectures
While multi-agent systems are celebrated for their ability to handle complex, distributed tasks efficiently, they also bring forth challenges, primarily due to their intricate nature. Chi Wang discusses the double-edged sword of complexity in multi-agent systems, highlighting the balance that needs to be maintained to leverage their full potential without succumbing to inefficiencies.
Real-World Use Cases and Enterprise Application
From gaming simulations and social interactions to complex decision-making in business environments, the applications of multi-agent systems are vast. AutoGen in particular has found utility in various sectors, from healthcare, where it is used to generate synthetic data for training AI models, to environmental science, where it aids in calculating wildfire areas from satellite imagery. The adaptability and breadth of AutoGen’s applications have set a new precedent for the practical deployment of AI technologies in enterprise settings.
Future Directions and Innovations
Looking ahead, Chi Wang shares his enthusiasm for exploring brain-inspired architectures to further enhance the reasoning and planning capabilities of AI systems. This initiative aims to mimic human cognitive processes to refine the functionalities of AI agents, pushing the boundaries of what artificial systems can achieve.
Conclusion: The Road Ahead for AI
As AI continues to integrate more deeply into various facets of human activity, the role of frameworks like AutoGen becomes increasingly significant. By bridging the gap between human cognitive processes and artificial intelligence, Chi Wang’s work with AutoGen is paving the way for more intuitive, efficient, and impactful AI applications. The future of AI, spearheaded by innovations such as multi-agent systems, promises a landscape where AI’s potential is boundless, guided by human ingenuity and enhanced by machine precision.
[h3]Watch this video for the full details:[/h3]
In this episode, I’m joined by Chi Wang, a principal researcher at Microsoft and the creator of AutoGen, an open-source framework that allows developers to combine LLMs, tools, and human input to build multi-agent AI systems.
By enabling AI agents to collaborate, learn from each other, and contribute their unique skills, AutoGen is unlocking a new frontier of AI capabilities. It’s quickly gained traction among both academics and enterprises and is currently powering a wide range of use cases, including synthetic data generation, code generation, and pharmaceutical data science.
In our conversation, Chi breaks down the core concepts behind multi-agent AI, the pros and cons of multi-agent architectures, and the real-world use cases enabled by AutoGen. He also shares some of the open research challenges he’s tackling, his perspective on the future of AI, and what excites him most about where the field is headed.
Read takeaways from our conversation here: https://foundationcapital.com/the-promise-of-multi-agent-ai/
(00:00) Intro
(01:27) Chi’s background and early interest in AI
(05:42) Defining agents and their core capabilities
(08:13) Pros and cons of multi-agent systems
(11:23) Multi-agent architectures and the “Society of Mind” theory
(14:36) Real-world use cases enabled by multi-agent systems
(16:45) The backstory and genesis of AutoGen
(19:43) How AutoGen’s architecture leverages language models, tools, and human input
(23:13) More examples of AutoGen’s diverse applications
(29:23) How AutoGen is being used in enterprises and production
(32:42) Advice for AI builders focusing on the enterprise
(40:24) What’s next for AutoGen and open research challenges
(47:17) Resource recommendations for AI builders
(48:09) What excites Chi most about the future of AI
[h3]Transcript[/h3]
Chi: I think it’s very important to recognize that a multi-agent approach is a good way to build a single agent. This has a very good root in the theory of intelligence: back in 1986, Marvin Minsky proposed the Society of Mind theory, in which human intelligence, or any other natural intelligent system, consists of relatively simple processes, and combining them lets the entire system exhibit more intelligent behavior. So when we say multi-agent, we don’t necessarily mean each agent in the system is a powerful agent. You can start from relatively simple agents, but once you compose them and make them interact in a good way, they exhibit a higher level of intelligence.

Joanne: Welcome to AI in the Real World. I’m Joanne Chen, General Partner at Foundation Capital. I work closely with startups that are reshaping business with AI, and in this series I have discussions with leading AI researchers to explore how state-of-the-art AI models are being applied in enterprises today. In this episode I’m joined by Chi Wang, a principal researcher at Microsoft and the creator of AutoGen, an open-source framework that combines LLMs, tools, and human input to build multi-agent systems. AutoGen has quickly gained traction and has been adopted by both academics and enterprises to create applications. In this conversation, Chi breaks down the core concepts behind agents, the pros and cons of multi-agent architectures, and the real-world use cases enabled by AutoGen. He also shares some of the open research challenges he’s tackling and what excites him most about where AI is headed. I had so much fun chatting with Chi, and I think you’ll enjoy this as well. Here’s our conversation.

Joanne: Thank you so much, Chi, for joining us and having this conversation. There’s a lot of interest in the market around real-world applications of agents and multi-agent models. Maybe we’ll start with quick introductions: what sparked your interest in studying and building in this space? What drew you specifically to agents?

Chi: That dates back a very long time, to my childhood. I learned my first BASIC program on a very old Apple device, for a snake game, and soon I learned about other, even more interesting AI systems, like systems that could predict fortunes for people. Those were very early days, with rule-based approaches. In college I participated in competitions using agents to build game players that could compete with other agents. Those are some early examples that got me interested. I studied data mining and text mining topics in my PhD, and at that time, interestingly, we were also studying generative models, including generative language models, but those were very different from today’s large language models. The younger generation probably doesn’t know those earlier generative models: they were small and restricted, mainly for understanding and analyzing text. I’ve been interested in this space for a long time and have seen all the progress, but ChatGPT and GPT-3-class large language models were a big revolution, and that got me back into this space. Right before that happened, I was mainly working on automated machine learning for model selection, hyperparameter tuning, and so on, which is generic for all kinds of machine learning models. But large models were so new and interesting that I soon started applying my approach to them, found some interesting behaviors and applications, and gained insights into the applications built on top of these large models. I found a larger space to work on: a big opportunity to think about a larger design space by combining models with tools, connecting them with humans, and building really complex systems. I was looking for the right way to think about these applications, and the right way to empower developers to build their applications easily, and I found that the agent concept happened to be the most intuitive, natural concept for reasoning about the behaviors of these models and the different types of tools. It also related back to my earlier experience with AI agents, so everything pointed me in the same direction: this was the right direction to go.

Joanne: That’s great. And were you doing this mostly in academia or at Microsoft? What was the timeline?

Chi: I was working on automated machine learning and hyperparameter tuning at Microsoft Research, on an open-source project called FLAML. When I started using FLAML to tune hyperparameters for GPT models, I was still at Microsoft Research, building the same open-source project. Everything about AutoGen was built inside that open-source project until last October, when we moved it to a standalone repo on GitHub. When I started working on FLAML I also started collaborating with academia: I have a collaborator from Penn State University who joined me from the beginning, so from the start its development was a collaboration with the open-source community.

Joanne: Awesome, that’s a comprehensive introduction. Let’s talk about some basics of agents so our audience understands how this works, and then we’ll talk about what you’ve built specifically. Could you explain, in very simple terms, what an agent is and what its core capabilities are?

Chi: There are indeed many different definitions of agents. When I think about it, I was looking for the most generic notion that can incorporate all these different definitions, and to do that I needed to think about the minimal set of concepts required. In our definition, an agent is an entity that can receive messages, send messages, and generate replies to respond to other agents. We think that’s the minimal set of capabilities an agent needs. Underneath, agents can have different types of backends to support these actions: some agents use large models to generate replies, some use tools underneath to generate tool-based replies, and others use human input as the way to reply to other agents. You can also have agents that mix these different backends, or more complex agents that have internal conversations between multiple agents, but on the surface other agents still perceive them as a single entity when communicating with them. With this type of definition we can incorporate both very simple agents that perform relatively simple tasks using a single backend, and agents that actually contain multiple simpler agents, composed into more complex and powerful agents. You can build up this type of agent recursively, so the agent concept covers all these different levels of complexity.
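To make the receive/send/reply definition concrete, here is a minimal sketch of the three backends Chi describes (LLM, tool/code execution, human input), assuming the pyautogen 0.2-style ConversableAgent API; the model name, work directory, and task message are placeholders, not from the conversation.

[code]
# Sketch of the three agent "backends" described above (pyautogen 0.2-style API).
# Model names and messages are illustrative only.
import autogen

llm_config = {"config_list": [{"model": "gpt-4"}]}  # assumes an API key in the environment

# 1. An agent whose replies come from a large language model.
llm_agent = autogen.ConversableAgent(
    name="llm_agent",
    llm_config=llm_config,
    human_input_mode="NEVER",
)

# 2. An agent whose replies come from executing tools/code rather than an LLM.
tool_agent = autogen.ConversableAgent(
    name="tool_agent",
    llm_config=False,  # no model behind this agent
    human_input_mode="NEVER",
    code_execution_config={"work_dir": "scratch", "use_docker": False},
)

# 3. An agent whose replies come from a human typing at the console.
human_agent = autogen.ConversableAgent(
    name="human_agent",
    llm_config=False,
    human_input_mode="ALWAYS",
)

# Each agent exposes the same minimal interface: receive a message, generate a reply.
tool_agent.initiate_chat(
    llm_agent,
    message="Write and run Python that prints 2**10.",
    max_turns=2,
)
[/code]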
Joanne: Got it, that makes sense. Could you share some pros and cons? There’s a lot you gain with a multi-agent system, but there are also downsides that come from the complexity. Could you talk about that a little more?

Chi: It’s a very interesting topic when we relate multi-agent systems to a single agent. There are at least two dimensions to think about for single versus multi-agent. One dimension is the interface: from the user’s point of view, do they interact with the system through a single interaction point, or do they explicitly see multiple agents working and interact with several of them? The other dimension is the architecture: for example, a single-agent interface can actually have multiple agents running underneath, at the back end, and the users don’t need to see that complexity. The benefits and trade-offs are different along each dimension. From the interface point of view, in some applications it’s better to have a single interaction point to keep things simple for the user; but in other applications, for example when multiple agents debate a subject and users need to explicitly see what each agent says, or in social-simulation experiments where people want to observe the different behaviors, it’s beneficial to actually see the multi-agent behavior. On the system side, though, I think a multi-agent design of the architecture is easier to maintain, understand, and extend than a single-agent system. Even for a single-agent interface, a multi-agent implementation makes the system more modular and easier for developers to add or remove components of functionality. I think it’s very important to recognize that the multi-agent approach is a good way to build a single agent. People may not immediately think about it that way, but this idea of composing relatively simple agents into a more powerful agent has a very good root in the theory of intelligence: back in 1986, Marvin Minsky proposed the Society of Mind theory, that human intelligence, or any natural intelligent system, often consists of relatively simple processes, and combining them lets the entire system exhibit more intelligent behavior. So when we say multi-agent, we don’t necessarily mean each agent in the system is powerful; you can start from relatively simple agents, and once you compose them and make them interact in a good way, they exhibit a higher level of intelligence. We don’t yet have a good single agent that can do everything we want, and why is that? It could be because we haven’t figured out the right way of composing multiple agents to build that powerful single agent, and first you need a framework that allows easy experimentation with these different ways of composing agents. I believe that if people adopt the practice of using multiple agents to solve a problem, they’ll also get there more quickly: they can figure out a robust way of building a single agent this way. Otherwise there are just too many possibilities, too many ways to build that single agent, and if you rewrite the entire system every time, it’s not going to be easy to maintain or to make progress. That’s why this modularity, from a programming point of view, is very useful. On the other hand, you don’t have to stop there; you can always think about a multi-agent system as a way to multiply that power, and you can always connect an agent with other agents. Even when the performance of a single agent is good enough, you may still want to, for example, have that single agent teach some other, relatively weaker agent, so the weaker agent becomes better at low cost. We have a technique called EcoAssistant that shows an example of doing that. It uses agents with different power and capabilities: for example, you can mix a GPT-4-based agent with a GPT-3.5-based agent. The GPT-4 agent is more capable but more expensive, but by using the GPT-4 agent to teach the GPT-3.5 agent, you can reduce the cost by a lot while improving overall performance, even compared to using a single GPT-4 agent. I haven’t seen a lot of people using it that way yet, but that’s another potential benefit you can get from multi-agent systems.
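This is not the actual EcoAssistant implementation; the sketch below only illustrates the underlying idea that, in AutoGen, each agent carries its own llm_config, so a cheap agent and a more expensive "teacher" agent can coexist in one workflow. Agent names, system messages, and the example exchange are invented.

[code]
# Sketch only: mixing model tiers in one workflow so most tokens go to the cheap model,
# with the stronger model used as a reviewer/teacher. Model names are placeholders.
import autogen

cheap_config = {"config_list": [{"model": "gpt-3.5-turbo"}]}
strong_config = {"config_list": [{"model": "gpt-4"}]}

junior = autogen.AssistantAgent(
    name="junior_assistant",
    llm_config=cheap_config,
    system_message="Attempt the task. If you are unsure, say so and ask for help.",
)
senior = autogen.AssistantAgent(
    name="senior_reviewer",
    llm_config=strong_config,
    system_message="Review the junior assistant's answer, correct mistakes, "
                   "and explain the fix so it can be reused on similar tasks.",
)

# The cheaper agent drafts first; the expensive agent is consulted only to review.
junior.initiate_chat(
    senior,
    message="Draft answer: the sum of 1..100 is 5050. Please verify and comment.",
    max_turns=2,
)
[/code]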
Joanne: Could you share some more use cases and examples that are enabled by multi-agent systems, especially in the real world, and what do you see in terms of new possibilities in the future?

Chi: Sure. As I mentioned earlier, there are at least two types of applications that benefit from multi-agent systems. One is the single-agent-based interface: when you want the agent to perform increasingly complex tasks, you’ll often find you need to extend the system with different capabilities, tools, and so on, and if you implement that single-agent system with multiple agents, you can often increase its capability to handle more complex tasks or improve the quality of the responses. One example is when you want the agent to give you suggestions about data insights for complex data analytics. That often requires agents with different roles working as a team: some agents are good at retrieving the data and presenting it to others, some are good at deep analytics and providing insights, others can act as critics and suggest further actions, and others can do planning. To accomplish a complex task, you can build the agency this way, with different roles. The other type of application is where people explicitly want multiple entities with different roles. For example, in gaming, if you want to simulate a chess game you need at least two players, and those are naturally built as different agents; a football game would involve even more entities. Like I mentioned earlier, multi-agent debate is another application where you explicitly want to watch the different agents’ behavior, and social simulation can involve an even larger number of agents.

Joanne: Got it. Let’s talk about AutoGen specifically, and then we’ll come back to some of these applications in the enterprise. Could you start by giving us the backstory on AutoGen: the genesis, how it developed, and what the technical architecture looks like? Let’s start with the backstory.

Chi: In the beginning I was thinking about using AutoML techniques to optimize the hyperparameters of these large models: to find high-quality configurations while bringing the cost down, using the minimal cost to answer questions with a satisfactory result. During those experiments I found that, to make the model eventually generate a good result for use in an application, it’s not enough to just do inference once and take the result directly. Often there needs to be post-processing, checking whether the output satisfies certain conditions, and if not, changing the way we prompt the model, and so on. That made me realize there are many different creative ways of using language models. Then I found that the agent abstraction is a very powerful way to leverage the biggest advance in these models’ capabilities: they are so good at carrying on conversations, with humans or with other agents. If we leverage that well, we can potentially build multiple agents with different roles, make them talk to each other, iterate on each other’s feedback, and improve performance. AutoGen is designed to be such a generic framework: it allows easy definition of agents and makes it easy to have them interact with each other, for example through conversations. It also allows easy integration with different types of models, different kinds of tools, and different ways of getting human participation; it unifies these different types of entities and allows natural human participation in these applications, and it’s designed to support very diverse applications on top of it. We aimed for AutoGen to be a generic, fundamental programming framework for agents, much as PyTorch is for deep learning.

Joanne: Got it. The key components, LLMs, external tools, human input, and so on, work together within AutoGen’s technical architecture. How does the system decide when to use the LLM versus outside resources? Is that up to the person using AutoGen to develop?

Chi: To a large degree, yes. We give developers full control over defining the different types of agents, but we do have default behaviors for some built-in agent types. For example, we have an AssistantAgent which by default uses a large model to generate replies, and we provide a default system message that allows it to write code to solve complex tasks, along with instructions to debug the code and handle different situations. We use that mainly as one example demonstrating that you can give an agent very detailed instructions and the ability to exhibit different kinds of behavior; it’s easy to customize that instruction and give the agent a different role. We also have a default UserProxyAgent, which is designed to act on behalf of the human. By default it asks for human input at each round, but if the human decides to skip the input, the user proxy invokes its automatic reply, for example running the code suggested by the other agent, or using the tools suggested by the assistant agent, and then sends the response back to the assistant agent.
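The default pair Chi describes, an AssistantAgent that writes code and a UserProxyAgent that executes it and relays the result, looks roughly like this with the pyautogen 0.2-style API; the task message and work directory are made up for illustration.

[code]
# Sketch of the default two-agent loop: the assistant proposes code, the user proxy
# executes it (or asks the human), and sends the result back for further debugging.
import autogen

llm_config = {"config_list": [{"model": "gpt-4"}]}  # assumes credentials in the environment

assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config=llm_config,  # default system message already instructs it to write and debug code
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="TERMINATE",  # ask the human only before terminating; "ALWAYS" asks every round
    code_execution_config={"work_dir": "coding", "use_docker": False},
)

# The user proxy starts the conversation on behalf of the human.
user_proxy.initiate_chat(
    assistant,
    message="Fetch the AutoGen README from GitHub and count how many times 'agent' appears.",
)
[/code]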
Chi: These are mainly meant as basic examples. Developers can quickly build a useful application just using the default agents: with only these two agents conversing with each other, they can already do a lot, similar to what ChatGPT Plus plugins and Code Interpreter can do. They can add more agents and customize the reply functions of each agent, give them different tools, or add different capabilities through these customized reply functions. One really interesting way to customize the reply function is to use a nested chat as the reply function. For example, in a math application with two agents talking to each other, a user proxy and an assistant agent trying to solve math problems, we find that sometimes the student asks a question the assistant doesn’t know how to answer correctly. Then we can add the capability of having a nested chat inside that assistant agent, so the assistant can consult other experts, both expert agents and expert humans, for example a teacher or lecturer, inside that nested chat. Once the inner conversation finishes, the original student-facing assistant takes the outcome of that conversation and sends it back to the student. This is a good way to extend an agent’s capability by adding a nested chat inside it, and you can do this recursively to enable even more complex patterns.
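A rough sketch of the consult-an-expert-in-a-nested-chat pattern described above, assuming the register_nested_chats hook available in pyautogen 0.2; the agent names, prompts, and trigger condition are illustrative, not taken from the conversation.

[code]
# Sketch of the nested-chat pattern: before the tutor replies to the student, it first
# runs an inner chat with an expert and uses that result to compose its outward reply.
import autogen

llm_config = {"config_list": [{"model": "gpt-4"}]}

student = autogen.UserProxyAgent(
    name="student", human_input_mode="ALWAYS", code_execution_config=False
)
tutor = autogen.AssistantAgent(
    name="tutor",
    llm_config=llm_config,
    system_message="Help the student solve math problems step by step.",
)
expert = autogen.AssistantAgent(
    name="expert",
    llm_config=llm_config,
    system_message="You are a math professor. Verify and improve proposed solutions.",
)

tutor.register_nested_chats(
    [
        {
            "recipient": expert,
            "message": "Please check and improve the draft solution in the context above.",
            "summary_method": "last_msg",
            "max_turns": 1,
        }
    ],
    trigger=student,  # run the nested chat whenever a message from the student arrives
)

student.initiate_chat(tutor, message="Is 2^61 - 1 a prime number? Explain briefly.")
[/code]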
Joanne: I know you mentioned this example, and there’s also coding, gameplay, and all sorts of other things this works really well for. Could you give a few more examples?

Chi: There are indeed quite a few more interesting examples. I can categorize them a bit by the type of conversation. I mentioned nested chat; I want to mention a few others. One is called group chat: people design agents with different roles and put them together in a group, so that each of them can send messages to everyone else, and they all share the same context. Underneath, there is a group chat manager which doesn’t participate in the conversation but performs an orchestration role: it checks the current progress of the chat and decides who speaks next based on the agents’ roles and the current state of the conversation. This is also a very generic conversation pattern used by many different applications, including data analytics, making suggestions such as trip planning, and giving financial advice. Many applications can benefit from this type of group chat, because if you find the agents are missing some aspect in their discussion, you can easily add an agent to make them focus on that particular aspect. For example, in trip planning I’ve seen people define different types of agents: some take care of the tourism side, some of the work schedule, others of the gym schedule, and so on, so that the final result presented is more comprehensive. Multi-agent debate can also easily be implemented this way. But I do want to emphasize that this is not the only way to have multiple agents interact: nested chat is another way, and a third way is to have a hierarchical structure, also using nested chats. And even in group chat you can precisely specify the transition conditions from one agent to another, instead of relying entirely on the group chat manager to decide who speaks next. That way, developers can pre-define under what condition a task is considered done and when to transition to another agent to perform the next task.
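The sketch below shows the group-chat pattern with an orchestrating manager, plus the optional constrained speaker transitions Chi mentions; it assumes the pyautogen 0.2 GroupChat/GroupChatManager API, and the roles, prompts, and transition graph are invented for illustration.

[code]
# Sketch of a group chat with a manager. The allowed-transition constraint mirrors the
# "pre-specified transitions" idea above; parameter names assume pyautogen 0.2.
import autogen

llm_config = {"config_list": [{"model": "gpt-4"}]}

planner = autogen.AssistantAgent("planner", llm_config=llm_config,
                                 system_message="Break the request into analysis steps.")
analyst = autogen.AssistantAgent("analyst", llm_config=llm_config,
                                 system_message="Do the data analysis and report insights.")
critic = autogen.AssistantAgent("critic", llm_config=llm_config,
                                system_message="Critique the analysis and suggest follow-ups.")
user_proxy = autogen.UserProxyAgent("user_proxy", human_input_mode="TERMINATE",
                                    code_execution_config=False)

groupchat = autogen.GroupChat(
    agents=[user_proxy, planner, analyst, critic],
    messages=[],
    max_round=8,
    # Optional: constrain who may follow whom instead of leaving it all to the manager.
    allowed_or_disallowed_speaker_transitions={
        user_proxy: [planner],
        planner: [analyst],
        analyst: [critic],
        critic: [planner, user_proxy],
    },
    speaker_transitions_type="allowed",
)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)

user_proxy.initiate_chat(manager, message="What drove last quarter's change in sign-ups?")
[/code]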
Joanne: Anything surprising or unexpected that you’ve found as you’ve seen people use this, and through your work?

Chi: There have been a lot of different surprises. One is the diversity of applications people are trying. We presented a few diverse examples in our initial study, but the ways people are using it are far more creative, and they often give me new inspiration. One application that’s not so surprising but, I think, very promising is using AutoGen to generate synthetic data, and then using the generated data to train, for example, small language models to give them capabilities they didn’t have before. A recent example is the Orca-Math effort: they used AutoGen to generate data containing different examples of solving math problems, covering very diverse cases, and using multiple agents they could account for the different criteria the training data needed to satisfy and select only the high-quality examples. Eventually that gave a very small model better performance at solving, I think, primary-school and middle-school level math problems. There are a few other examples of using multiple agents to challenge each other and teach each other, so they can simulate teaching-and-learning behavior; you can then apply reinforcement learning techniques to improve the models. That’s one promising direction for multi-agent applications. Another recent example I found is using AutoGen to calculate wildfire area from satellite maps, a very specific and unique application I wouldn’t have thought of before I learned about it. It uses, for example, the Google Earth Engine to look at the map after a wildfire happens, and has agents call a bunch of APIs to calculate the exact area. A related kind of application is embodied agents: using both language models and vision models, combining different modalities, to make interesting applications, for example in robotics, or in designing visual artifacts, where some agents generate images while other agents critique their quality and make suggestions. You can also take user feedback and convert it into agent-understandable ways of improving the content.

Joanne: Makes a lot of sense. Why don’t we talk about the enterprise software ecosystem? There’s a lot of interest in these agent technologies and a lot of builders trying different things, but it’s still very early in terms of adoption within larger companies, in production. From your standpoint, how do you see AutoGen being used in larger organizations, or in production settings?

Chi: That’s very interesting. A common piece of feedback I’ve heard from different companies using it is that they were either thinking about or starting to build a multi-agent framework inside the company to support their applications, and when they found AutoGen they realized this was the kind of framework they were looking for, so they could build their own agent platform directly on top of it instead of developing from scratch. That’s quite common, and the larger the company, the more they need this common agent platform, because large companies have different teams building different kinds of agents, and if each team uses a different technique or framework, it’s hard for them to later join forces and build a more complex system. Having common ground is important for them. So we see two types of use cases: some organizations use AutoGen as the backbone to build their own agent platform, and others use it directly to build applications for diverse scenarios. I think both are pretty common across companies of different sizes.

Joanne: Are there one or two that you’re particularly excited about, or proud of seeing in the wild?

Chi: I didn’t know that many companies were already using AutoGen until, one day, they reached out to say they’d been using it for months and were happy to share. One recent example is Novo Nordisk. I didn’t know they were using AutoGen until their VP of data science said they had been building a production-ready data analytics platform on AutoGen for the pharmaceutical domain. They want to enable everyone who previously wanted to participate but wasn’t able to do that kind of analytics to join, and they’re also trying to extend AutoGen to address industry regulation requirements. That’s a very good example of an enterprise using AutoGen as a backbone: it already offers good value in their prototype or product, but they also extend it to address other issues and make it more mature and more customized to their specific needs. Data analytics is indeed, I think, one of the most common types of applications I’ve seen.
Joanne: Got it, that’s a huge customer win, given the size and scope of the company. Very cool. Any advice you would give to AI builders who are focused on the enterprise? How do they become successful, like the customer you just mentioned?

Chi: There are many lessons to learn from other developers’ experience. I think it’s good to have both creativity and some level of grounding; both are good things to keep in mind when designing applications. On the creative side, I’m seeing that practitioners really have amazing ideas. They probably have thousands of interesting ideas, and everyone has an opinion, for example, about how AGI should be achieved with some specific agent architecture. Those are very good sources of inspiration, and the multi-agent concept opens up a really big space in which everyone can participate and experiment with their ideas, and some very good architectures could emerge from that kind of experimentation. So people should definitely try to be bold and creative, think about novel applications, and experiment with them. On the other hand, the more creative or novel the idea, the more unexpected failures can happen, because if one of the agents doesn’t perform as expected it can affect the entire system. So it’s also good to start at a smaller scale, for example with two agents in the beginning: put all the language-model-based instructions in one agent and the tools in the other, and see how well it works. If it already works pretty well, you don’t need to add more. But we often find that the complexity of the task grows, because once people succeed at simple tasks they always want to go for harder ones, which is very natural, and at some point you’ll find that a single agent isn’t enough to address all the requirements: it may forget instructions and miss steps. Then you may want to decompose the bigger task into smaller ones, add more agents into the flow, and try different ways of making them perform. You can start in a more flexible way and just see if they can figure things out by themselves; if not, you can add more control over the behavior. Or, if you already know the exact right steps, you can go directly to more controlled patterns. For example, you can define sequential chats: a series of steps of different chats, where one chat starts after the previous chat finishes and uses the previous chat’s information to start the next one. By experimenting you’ll come to understand the trade-off: if you give the agents more autonomy, you have the chance to see unexpected behaviors and adjust when you need to; on the other hand, if you use predefined steps, the agents may follow them but become more limited, because if one step fails the agent may get stuck and not be able to recover from the failure. In that case you may want to add back the capability of iterative conversation and give them some autonomy to recover from the failure. So having the right expectation of what the language model can do, and knowing in what cases you need to add external tools to ground its behavior, is quite important.
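The sequential-chats pattern, a predefined series of chats where each step’s summary is carried into the next, can be sketched roughly as below, assuming the pyautogen 0.2 initiate_chats helper; the pipeline of roles, messages, and turn limits are invented for illustration.

[code]
# Sketch of sequential chats: each entry is one chat, started after the previous one
# finishes, with its summary carried over into the next chat's context.
import autogen

llm_config = {"config_list": [{"model": "gpt-4"}]}

retriever = autogen.AssistantAgent("retriever", llm_config=llm_config,
                                   system_message="Describe what data is needed and where to get it.")
analyst = autogen.AssistantAgent("analyst", llm_config=llm_config,
                                 system_message="Analyze the data described in the context.")
writer = autogen.AssistantAgent("writer", llm_config=llm_config,
                                system_message="Write a short executive summary of the findings.")
user_proxy = autogen.UserProxyAgent("user_proxy", human_input_mode="NEVER",
                                    code_execution_config=False)

user_proxy.initiate_chats([
    {"recipient": retriever, "message": "We want to understand churn in Q3.",
     "max_turns": 2, "summary_method": "last_msg"},
    {"recipient": analyst, "message": "Analyze the churn drivers based on the plan above.",
     "max_turns": 2, "summary_method": "reflection_with_llm"},
    {"recipient": writer, "message": "Summarize the findings for executives.",
     "max_turns": 1, "summary_method": "last_msg"},
])
[/code]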
Chi: One final piece of advice: think about the human input mode. In AutoGen you can configure different agents with human input or without it. It’s good practice to really start with a human in the loop, and one benefit is that you can also have the human teach the agents using the teachability capability. A human can watch the behaviors of the originally designed agents and give them instructions; once an agent remembers an instruction, the next time it faces a similar task it can use the previous teaching and behave more in line with what the human wants. Only when you find that the agents perform as expected should you gradually make them more automated. You can also think about using agents built with AutoGen to do the evaluation and testing more automatically, because it takes a large number of tests for a human to be confident that the system performs as expected, and for language-model-based applications it’s often hard to do this kind of evaluation: the output is in text form, and it’s often vague how good the performance is. That’s a big pain point I’ve heard from many application builders. We’re making some efforts here, for example building agent-based tools for evaluation and benchmarking. These agents can, for example, check the logs, try to extract the right criteria or perspectives to analyze quality, and then score it automatically; eventually you can even think about feeding that feedback back to other agents to improve their performance. Having the capability to use agents to, say, simulate different users’ behaviors, so that a diverse set of behaviors is tested before you put the system into actual use, could also be a good way to leverage the large design space of multi-agent systems.
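A sketch of the human-in-the-loop plus teachability setup described above, assuming the Teachability capability shipped in pyautogen 0.2’s contrib package; the database path, agent names, and the taught instruction are placeholders.

[code]
# Sketch of "start with a human in the loop and let the human teach the agent".
# Assumes pyautogen 0.2's contrib Teachability capability; paths and prompts are placeholders.
import autogen
from autogen.agentchat.contrib.capabilities.teachability import Teachability

llm_config = {"config_list": [{"model": "gpt-4"}]}

assistant = autogen.AssistantAgent("assistant", llm_config=llm_config)

# Persist what the human teaches so it can be recalled on similar future tasks.
teachability = Teachability(reset_db=False, path_to_db_dir="./teachability_db")
teachability.add_to_agent(assistant)

# Keep the human in the loop at every round while the workflow is still earning trust;
# later this can be relaxed to "TERMINATE" or "NEVER" once behavior looks reliable.
user_proxy = autogen.UserProxyAgent("user", human_input_mode="ALWAYS",
                                    code_execution_config=False)

user_proxy.initiate_chat(
    assistant,
    message="When you report quarterly numbers, always include the year-over-year change.",
)
[/code]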
Joanne: Super helpful, thank you. Let’s look forward a little bit. AutoGen has seen very strong success so far. What are you exploring next? I know, for example, you’ve looked at some brain-inspired architectures to improve reasoning and planning. Is that a direction you’re going to pursue further, and what else are you doing next?

Chi: Yes, exactly, that’s one aspect. Let me give a high-level overview first, and then we can dive into some of it. At a high level there are a few important open questions, very hard questions: for example, how to design an optimal multi-agent workflow for a specific application or task, and how to build highly capable agents while ensuring safety and human agency. These are open questions and research challenges the whole community is working very hard to solve. The efforts can be categorized along different dimensions, such as evaluation, optimization, learning and teaching, and interface, and I can give some examples along these dimensions. On evaluation, I already mentioned that people are building agent-based evaluation and benchmarking tools, which often come to the rescue for the large scale of testing and evaluation these systems need. On the interface side, we’re making very rapid progress on making it even easier to build agent-based applications. For example, we built a sample application called AutoGen Studio, a no-code interface for quickly prototyping multi-agent applications: you define agents, define different workflows, and put them into quick experimentation, for example by chatting with these multi-agent setups. That lets many more users experiment with their ideas without writing code. For developers, Python developers for example, we also offer higher-level interfaces so they can easily try the different conversation patterns I mentioned earlier: nested chats, sequential chats, group chat with finite-state-machine transitions, and so on. For learning and teaching, which is a very interesting kind of capability, we want agents to improve over time and remember teachings from users or from other agents, so that as people use them more often, the agents become more capable over time. And for integration, we’ve been integrating with new technologies: customized models, customized clients for using models, new things like the OpenAI Assistants API, multimodality, and so on, so we can incorporate the best technology into the framework and offer it to developers. The topic you mentioned, a cognitive architecture inspired by how the human brain works, is a good way to think about enabling more complex task solving: for example, by decomposing tasks, making plans, monitoring the agents’ behavior, replanning when needed, and validating each step. We recently hit a new milestone on a benchmark called GAIA, a benchmark from Meta for measuring progress toward AGI, because it has very challenging problems that often involve multiple steps. We made some initial experiments, and our initial submission to that benchmark turned out to achieve the number-one performance, with a significant margin compared to other approaches.

Joanne: Congratulations.

Chi: Thank you. I think this was the first multi-agent attempt, and it’s very preliminary, because we know there are many other improvements we could make, but it demonstrates the big potential of using AutoGen to solve complex tasks, and we definitely want to see more progress there. I’ve seen other efforts in the community as well. For example, one community member built a multi-agent system to teach agents to resist emotional manipulation of these language models; they leveraged both the teachability capability and the finite-state-machine-based group chat to build multiple agents, where some agents perform the teaching and learning and other agents reinforce that learning. I think that’s a very interesting application in the security and responsible-AI domain. By the way, I think security and responsible AI is another promising application area, and I’ve seen many different efforts, because they also leverage the capability of agents playing different roles: sometimes you can use them for red teaming, and other times you can use them to remind other agents and perform mitigation. There are a lot of interesting ideas there too.

Joanne: That’s great, and congrats on that once again. OK, we just have a couple of minutes, so one or two questions that are maybe less related to agents specifically. Are there one or two resources you’d recommend to folks who are building in this space, books or other content?

Chi: I think Andrew Ng has a newsletter called The Batch, and they also offer many courses online, so that could be a good resource to check. Also, these days I usually just go to our Discord server and see what people are doing with AutoGen. That’s a main source of inspiration for me, seeing what other developers are building; I often get inspired by their ideas, and having conversations in the community is a good way to learn from others and make progress together.
Joanne: Perfect. And anything that excites you most about the future of AI?

Chi: We’ve seen a lot of interesting news recently, including AI building videos and building software, so I think we’ll see more and more such complex tasks tackled by AI. I definitely see a future in which AI is increasingly capable of solving more and more complex tasks, and I think about two directions. One is the capability of the models themselves, because some applications really do benefit from the advancement in capabilities, and once we have new capabilities, what kinds of agents can we build from them? They will offer new ways of decomposing tasks, doing planning, and addressing complex tasks. The other direction is figuring out, as I mentioned earlier, the optimal architecture, the optimal multi-agent workflow, so that we can best leverage different types of models with different capabilities, for different reasons. One reason can be achieving higher quality in the result, but another can be reducing cost and making the system easier to deploy: if we can have some of the agents use smaller but more specialized models, we can potentially make them cost-efficient while still very good at performing certain roles or functions. So both advancement in model capabilities and advancement in agent design, in multi-agent workflow design and the complex-AI-system side, are what I’m looking forward to.

Joanne: Cool, very cool. All right, thank you again. This was super informative, and I really appreciate the time.

Chi: Thank you for having me. I really appreciated talking to you.

[Music]