Multi-Agent RAG

Unleashing the Potential of Multi-Agent RAG for Advanced AI Applications

In the rapidly evolving field of artificial intelligence, multi-agent systems mark a significant step forward. Multi-agent RAG applies Retrieval-Augmented Generation (RAG) across multiple intelligent agents to enhance AI applications, enabling more complex, context-aware, and scalable solutions. Today, let’s look at how multi-agent RAG changes the landscape of machine learning and AI technologies, offering a multifaceted approach to tackling sophisticated tasks.

Introduction to Multi-Agent RAG

Multi-agent RAG harnesses the capabilities of multiple agents within a single framework, enabling them to work collaboratively or competitively to achieve specific goals. This approach leverages the strengths of individual agents, which can specialize in different tasks, thereby improving the overall efficiency and effectiveness of the system.

Understanding the Core Components

  1. The Role of RAG in AI

    • Retrieval-Augmented Generation (RAG) is a technique that combines neural text generation models with information retrieval methods. It enhances the model’s response by providing relevant context from a knowledge base, effectively making AI responses more accurate and contextually enriched.
  2. What Makes Multi-Agent Systems Unique?

    • Unlike traditional single-agent systems, multi-agent systems involve multiple interacting intelligent agents. These agents can be programmed with different capabilities and can either work together in a cooperative manner or compete against each other to foster robust solutions.
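As a rough illustration of the retrieve-then-augment idea described above, here is a minimal sketch in plain Python. It is not taken from any particular library: the `KNOWLEDGE_BASE` list, the word-overlap scoring, and the function names are all assumptions made for the example, and a real system would typically use embeddings and a vector store instead.

```python
from collections import Counter

# Hypothetical in-memory knowledge base; real systems usually use a vector store.
KNOWLEDGE_BASE = [
    "RAG combines retrieval with text generation.",
    "Multi-agent systems coordinate several specialized agents.",
    "Vector stores index documents by embedding similarity.",
]


def tokenize(text: str) -> list[str]:
    """Lowercase the text and strip basic punctuation from each word."""
    return [word.strip(".,?!").lower() for word in text.split()]


def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank documents by word overlap with the query (a stand-in for embeddings)."""
    query_counts = Counter(tokenize(query))
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: sum((query_counts & Counter(tokenize(doc))).values()),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str) -> str:
    """Augment the question with retrieved context before it goes to a model."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

The generation step is deliberately left out: the string returned by `build_prompt` is what would be handed to whichever language model the application uses.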

Building Multi-Agent Applications

Creating a multi-agent application involves several key steps:

  • Defining Agent Roles: Each agent needs to have a clearly defined role and set of responsibilities which align with the overall objective of the application.
  • Establishing Communication Protocols: Agents need effective communication protocols to interact with each other, share data, and make decisions based on collective intelligence.
  • Integration of RAG: Implementing RAG within multi-agent systems involves aligning the retrieval component to effectively support each agent’s requirements, ensuring that all agents can access and utilize the necessary data efficiently.
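The three steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `Agent`, `Message`, `MessageBus`, and `shared_retriever` are hypothetical names invented here, not part of any framework, and the handlers are stubs where a real agent would call a model.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Message:
    sender: str
    recipient: str
    content: str


@dataclass
class Agent:
    name: str
    role: str  # human-readable description of the agent's responsibility
    handle: Callable[[Message], str]  # how the agent responds to a message


def shared_retriever(query: str) -> str:
    """Hypothetical retrieval component that every agent can call."""
    notes = {"rag": "RAG adds retrieved context to a prompt."}
    return notes.get(query, "no context found")


@dataclass
class MessageBus:
    """Minimal communication protocol: deliver a message, log it, return the reply."""
    agents: dict[str, Agent] = field(default_factory=dict)
    log: list[Message] = field(default_factory=list)

    def register(self, agent: Agent) -> None:
        self.agents[agent.name] = agent

    def send(self, message: Message) -> Message:
        self.log.append(message)
        reply_text = self.agents[message.recipient].handle(message)
        reply = Message(message.recipient, message.sender, reply_text)
        self.log.append(reply)
        return reply


bus = MessageBus()
bus.register(Agent("researcher", "retrieves context", lambda m: shared_retriever(m.content)))
bus.register(Agent("writer", "drafts text", lambda m: f"Draft based on: {m.content}"))

reply = bus.send(Message("writer", "researcher", "rag"))
```

Keeping the roles, the message format, and the retriever as separate pieces means any one of them can be swapped out without touching the others.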

Key Advantages of Multi-Agent RAG

  1. Enhanced Problem-Solving Capabilities:

    • By dividing complex tasks into simpler sub-tasks and assigning them to specialized agents, multi-agent RAG systems can solve complex problems more efficiently than single-agent systems.
  2. Scalability and Flexibility:

    • Multi-agent systems are highly scalable. New agents can be added with specific skill sets as needed, and they can be reconfigured for different tasks without redesigning the entire system.
  3. Robustness and Reliability:

    • The decentralized nature of multi-agent systems means the failure of a single agent does not cripple the entire system. This is essential in critical applications where uptime and reliability are paramount.
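To make the decomposition and robustness points concrete, here is a small hedged sketch. The `run_subtasks` helper and the three stub agents are hypothetical, and real agents would call models or tools, but the structure shows how one failing agent can be isolated while the others still deliver results.

```python
from typing import Callable


def run_subtasks(subtasks: dict[str, Callable[[], str]]) -> dict[str, str]:
    """Run each sub-task with its own agent and isolate failures per agent."""
    results: dict[str, str] = {}
    for name, agent in subtasks.items():
        try:
            results[name] = agent()
        except Exception as exc:
            # One agent failing is recorded, but the remaining agents still run.
            results[name] = f"FAILED: {exc}"
    return results


def flaky_agent() -> str:
    """Stub agent that simulates an outage."""
    raise RuntimeError("agent unavailable")


results = run_subtasks({
    "retrieve": lambda: "3 documents found",
    "summarize": flaky_agent,
    "format": lambda: "report formatted",
})
```

In a production setting the failure branch would more likely retry or reroute to a backup agent; recording the error is simply the smallest version of that idea.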

Applications of Multi-Agent RAG

  • Healthcare: Multi-agent RAG systems can be used to design personalized treatment plans where different agents handle specific aspects such as diagnosis, patient history retrieval, and treatment options.
  • Finance: In finance, these systems can be used for real-time trading, where agents can analyze different market aspects simultaneously, from global economic indicators to specific stock performances.
  • Customer Service: Multi-agent systems can enhance customer service by handling multiple customer queries simultaneously, each agent specialized in a different product or service.

Future Prospects and Challenges

As the adoption of multi-agent RAG grows, the technology faces both prospects and challenges. The integration of more sophisticated AI models and the expansion into diverse domains are promising prospects. However, challenges like the complexity in managing multiple agents, ensuring privacy and security in communications, and the need for substantial computational resources must be addressed.

Conclusion

Multi-agent RAG represents a groundbreaking approach in the field of AI, providing a pathway to overcome traditional limitations of machine learning models. By leveraging the collective power of multiple specialized agents, these systems open new avenues in creating intelligent, adaptable, and highly efficient AI applications. As technology progresses, the role of multi-agent systems will become increasingly integral in solving the world’s most complex problems, marking a new era in artificial intelligence.

[h3]Watch this video for the full details:[/h3]


Discover how to integrate multiple independent agents to tackle complex problems effectively using the latest frameworks like AutoGen, Crew AI, and LangGraph. We’ll dive into the innovative multi-agent systems, particularly focusing on the shared scratchpad approach in LangChain, and demonstrate building an advanced Agent Supervisor model. This system enhances problem-solving by coordinating agents, each with their own scratchpads, under a supervisor that manages the final outputs. Whether you’re a developer or just fascinated by AI’s potential, join us to learn, interact, and get hands-on with the future of collaborative AI technology. Click now to be part of this journey into multi-agent systems!
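The supervisor idea described above can be sketched without any framework. The following is a simplified illustration, not the LangGraph implementation from the video: `Worker`, `route`, and `supervise` are hypothetical names, and the routing rule is a fixed plan where a real supervisor would ask an LLM which worker should act next.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    name: str
    scratchpad: list[str] = field(default_factory=list)  # private to this worker

    def run(self, task: str) -> str:
        # Intermediate steps stay in the private scratchpad.
        self.scratchpad.append(f"thinking about: {task}")
        self.scratchpad.append("calling tools (stubbed)")
        # Only the final output is returned to the supervisor.
        return f"{self.name} finished: {task}"


def route(plan: list[str], completed: dict[str, str]) -> str:
    """Hypothetical routing rule: pick the first planned worker that has not run."""
    for name in plan:
        if name not in completed:
            return name
    return "FINISH"


def supervise(task: str, workers: dict[str, Worker], plan: list[str]) -> dict[str, str]:
    """Keep routing to workers until the router says FINISH."""
    outputs: dict[str, str] = {}
    while (next_step := route(plan, outputs)) != "FINISH":
        outputs[next_step] = workers[next_step].run(task)
    return outputs


workers = {"Search": Worker("Search"), "Writer": Worker("Writer")}
outputs = supervise("long-context LLMs", workers, plan=["Search", "Writer"])
```

The supervisor only ever sees each worker’s final output; the scratchpad entries never leave the worker that produced them.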

Event page: https://lu.ma/agentrag

Have a question for a speaker? Drop them here:
https://app.sli.do/event/wPDemMAc9nzV96DFmBzXz5

Speakers:
Dr. Greg, Co-Founder & CEO
https://www.linkedin.com/in/gregloughane

The Wiz, Co-Founder & CTO
https://www.linkedin.com/in/csalexiuk/

Join our community to start building, shipping, and sharing with us today!
https://discord.gg/RzhvYvAwzA

Apply for our new AI Engineering Bootcamp on Maven today!
https://bit.ly/aie1

How’d we do? Share your feedback and suggestions for future events.
https://forms.gle/6NNYtu7MiSUcnWAh6

[h3]Transcript[/h3]
[Music] Hey Wiz, so agents, they’re pretty dope, and we’ve explored them before. Does that mean multi-agents are even more dope?

Yeah, Greg, I think it does mean that. They’re multi-dope. We’ve reached the next level of dopeness.

So you’re saying that we can build something dope today and then use multi-agents to even up the dopeness?

Technically speaking, all of that is true, yes.

Okay, so we’re going to technically increase the dopeness of agents. Wiz, I cannot wait to see what you’ve got in store for us. We’ll see you back in a bit, my man.

Welcome, everybody. I’m Dr. Greg, that’s the Wiz. We’re talking multi-agent RAG today. We’re talking multi-agents today, multi-agent frameworks. We’re going to root this discussion in the patterns of GenAI that you need to know if you’re going to build LLM applications. There’s a lot of complexity to untangle, and we’ve got a short time to spend together today, so if you’ve got questions along the way, please drop them in the YouTube live chat or in the Slido link that we’re dropping below. We will prioritize Slido at the end.

Let’s go ahead and get right into it. Today we’re talking multi-agents, we’re talking multi-agent RAG, and we’re going to be using LangChain and LangGraph to do our build. As we align ourselves on what we’ll get out of this session, we really want to understand multi-agent workflows as an LLM prototyping pattern. Of course we want to build one, and we are going to walk you through exactly how to build a multi-agent application that’s quite complex today. To get into this, we want to root ourselves in the patterns that we’re all so familiar with if you’re joining us for multi-agent RAG today, the patterns of SpongeBob here, and then we want to extend them. These are so important because they don’t go anywhere just because we add a bunch of agents to our applications. Let’s take a closer look. When we talk
about the patterns, we have to start with prompting. The definition of prompting is to lead, to do something, to instruct. Done well, we might call this teaching or training. If we take teaching and training far enough into an LLM, we provide one-shot, two-shot, few-shot examples, and we run out of context window. Where are we left? We’re left with fine-tuning as a method to teach the LLM how to act on the other side of the task-specific spectrum.

Of course, optimizing how the LLM acts is one part of the puzzle. We also want to optimize what we’re putting into the context window. We want to use as much relevant reference material and knowledge as we can get our hands on. We want our applications to incorporate context well, and of course RAG is the focal point of so many of our applications, especially the ones that we’re actually trying to use to create business value today.

When we talk about agents, what we’re typically talking about today is giving the LLM access to tools of various kinds, but not incredibly various. There’s a main category of tools that starts to connect some of our patterns back together again. But let’s think about this fact that agents are a pattern. What pattern are they? Simply put, they are the Reasoning-Action, or ReAct, pattern. The way we want to think about this is to consider a typical simple agent loop. We ask a question, and the question is routed to our LLM. This is where the reasoning takes place. The LLM might decide, “Hey, I know the answer already. Boom, done. Don’t even need to pick up tools. I can solve this with my bare hands.” Or maybe we need to go in and pick up a tool. If you squint closely, you can see we have tools like arXiv, tools like search, like Wikipedia, like, what is that, DuckDuckGo right there. What do all these tools have in common? We’re going to go get some information, you might say retrieve some information, and we’re going to collect it and try to then
incorporate it into our reasoning that we’re doing about answering this question. We might need to go back and grab another tool. We might need to see what it gives us when we engage with it, and then incorporate that back into our reasoning before we give a final answer. So the LLM here is where we’re doing the reasoning part of the Reasoning-Action pattern, and this is important. Now, when we go to the tools, when we go to the retrieval of information, what are we doing? We’re actually augmenting the thread that our LLM is considering and reasoning about with retrieved information. We’re kind of doing RAG, aren’t we? In fact, I’m going to put to you that today, in most cases, agents are just fancy RAG, and we’ll see this as we walk through exactly what we will build today.

Armed with the patterns of prompting, of fine-tuning, of RAG, and of agents, we can put together a more complex picture of what a typical multi-agent system looks like. Let’s think about multi-agents. Why would I need more than one agent, you might ask? Well, remember how the agent in our picture just a minute ago was doing the reasoning. We might want our reasoning machines to be specialized, right? We want our reasoning machines to be specialists, just like out there in the world. Now, if the reasoning machines are to be specialists and I want a bunch of them, where does that ultimately lead in the context of AI? One question you might ask is: does that mean that if I had an AGI LLM, I could just use one agent? I want to bring Wiz up to the stage here for a minute to comment on this. If I’m looking at specialists and connecting them all up together, isn’t it kind of in the limit that the artificial general intelligence is going to understand everything that all of the specialists understand? So it sort of makes the idea of multi-agents not necessary. Is this a crazy way to think about it, Wiz, or is this technically sound?

No, I think that’s probably
true. I mean, eventually, when we have some AGI, we could just have one agent do everything. There’s a lot to be said about this idea that expertise might not ever leave, and so maybe we have specialized intelligences that are better than these generalized intelligences. But I think the way that people use the term AGI means that we would only need that agent or that system. We wouldn’t need those specialists, because they would be no better than this AGI. Of course, it depends on how you define AGI, right?

Yeah, okay, okay. All right, let’s get off our high horse. Thanks, Wiz, thanks for the two cents on that. Let’s go back down to earth, everybody, because we have a job to do today. Let’s talk about multi-agent frameworks, because presumably we don’t have AGI today. What we’re talking about when we’re talking about multi-agent frameworks is using multiple independent agents that are each powered by a language model: an LLM, let’s say, or potentially an SLM in the future, a little small specialized language model. We basically want to consider two things: what are the agents and what are they good at, and how are they connected? If you think too deeply about this, you start to get lost in the patterns a little bit, so we’re going to try to make it easy for you to understand why this is useful. It’s useful because it allows us to do things a little bit more cleanly. In short, we can group tools and responsibilities, almost like job responsibilities, together. We can separate prompts. Of course, the infinite context window issue will tell you that you can just dump everything in there, and maybe you can, but it makes it really hard to debug exactly where things are going wrong. This separation of prompts can also not just provide a cleaner architecture but potentially even give better results,
and then, just conceptually, it’s going to be a lot easier for other people to understand what you’ve built. Now, there are many ways to accomplish lots of things you might try to build. I wouldn’t immediately jump to multi-agent in almost any case. In fact, I would love to hear, if you’re in the audience today, whether you’ve heard of an actual use case that you’re connected to that is creating massive business value from multi-agents. These things are still rare today, and they are difficult to implement, but there are tools out there.

Some of the tools include things like AutoGen. In fact, some of the things we’ll talk about today in LangGraph were inspired by the AutoGen paper. This came from Microsoft, and they call this a conversation framework. We essentially want to allow these multiple agents to converse with one another. That’s what AutoGen is all about. You might have also seen CrewAI, and what CrewAI is all about is getting the crew of agents together and operating as a real cohesive unit that’s crushing it together, just like if you’re on crew. Obviously lots of people are using this stuff. This is more of a low-code solution.

Now, we’re going to use LangGraph today, and LangGraph is all about, quote, “building stateful, multi-actor applications.” You can put many agents within graphs, directed cyclic graphs that track the state of affairs as you move through the graph. We’ve talked about LangGraph before, so we won’t belabor the point, but it’s all about adding cycles to applications built on LangChain. In terms of cognitive architectures that you can leverage within LangGraph, the one we’re going to focus on today is the router. It’s going to be particularly useful. Now, you might say, what’s a router? Well, the TL;DR on routers is that they choose the action. Remember that reasoning that we did in the beginning to choose the tool? You might choose the tool, you might choose the RAG system, or you might choose to go to another agent that is actually just
a different prompt. So when we think about the flows of multi-agent setups today, these are the ones that you’ll see if you go to the LangGraph repo on GitHub: the multi-agent collaboration, the hierarchical team, and the agent supervisor.

When we do multi-agent collaboration, we’re essentially trying to get two agents to share the same context and, just as we heard in the AutoGen paper, to have a conversation. Here we have a researcher and a chart generator, as well as a router. All three of these guys are agents, but I’m going to highlight the routers in our story. The research agent is simply designed with the prompt “You should provide accurate data for the chart generator to use.” The chart agent is designed with the prompt “Any charts you display will be visible to the user.” This is quite a simple setup. The router decides which tool to call, be it the chart generator or the researcher.

We can make this into a slightly different setup by thinking about our router as being a supervisor, and the supervisor might choose to go to any given agent to delegate specific tasks that the user asks for. If we combine these two ideas, we get into something that’s a little bit more hierarchical, where we actually have different graphs nested within our top-level graph, where our top-level supervisor stands. So this is a supervisor that is a router, and the research team and document authoring team here are also both represented as supervisor routers at the mid level. All of these have prompts associated with them.

We’re going to simplify this slightly and use it for our build today. We’re going to build what we’re calling AIM Editorial, AI Makerspace Editorial, and it is a cut across the hierarchical team setup we just saw that combines the supervisor as well as the collaboration idea. We’ll have a top-level supervisor. We’ll have a research team that’s going to leverage Tavily Search, which is a search API specifically designed for
use with LLMs and agents. We will build our own custom RAG system, and then we will create an initial writer, a researcher, a copy editor, and an editor that will make sure our writing is super dope in our document team.

At its core, if we think about what’s actually going on here, at the top level our supervisor is directing, it’s deciding, it’s instructing, it’s delegating. These slightly more specialized research team and document team supervisors are doing something similar. We’ve got retrieval that we can do through Tavily Search or through our custom RAG system, and fundamentally each of the agents in our document team is using prompts. So what you can see here is that if we look closely enough, we can start to see why we might say something like agents are just fancy RAG.

We have a goal today: we want to write a dope blog on something relevant. Well, what’s something relevant? We want to talk about long context. We saw this pretty sweet paper, “Extending Llama-3’s Context Ten-Fold Overnight,” as Llama 3 told us they would do this over the next few months. Also, shout out to Gradient AI for releasing the 1 million context length a week ago. That was a pretty gangster move. This one is a formal paper on it, though: 8K to 80K with QLoRA fine-tuning.

What we’re going to do is set up our research team. We’re going to use Tavily Search, and we’re going to use this long context window paper to build a custom RAG system. We’re going to set up our document team. For the writer, we’re going to tell it something like “You are an expert in writing blogs. Below are your files in the current directory.” For the note taker, we’ll tell it “You are a senior researcher tasked with writing a technical blog outline and taking notes to craft a perfect technical blog.” The copywriter will be our grammar, copy, and punctuation editor, very tactical. And then our dopeness editor is going to be an expert at punching up technical blogs so that they’re more dope, extra lit, more cool. Let’s see
how we can set up these teams using LangGraph right now. Wiz, show us how to do it.

Oh yeah. Okay, so here we go. This is going to be a notebook, so I’m going to zoom in a little bit. First, this is what we want, right? This is the desired output. We have some input to some kind of supervisor agent, and then we receive output from it. It does the task. When it comes to the goal here, we want to combine two things: we want to combine this idea of RAG that we have, and then we want to add this potentially agentic piece on top of it. The way we’re going to achieve this is in parts. We’re going to start with some basic parts and then slowly get crazier and crazier. This is adapted from LangChain’s own documentation. They have a great hierarchical agent example. We’ve made some changes here and there just to demonstrate how easy it is to work with these LangGraph systems, and we’re just going to go ahead and get started. And of course, it’s based off of the AutoGen research that was done.

First things first, we need a bunch of dependencies. It’s not crazy, though. We need LangGraph, LangChain, langchain-openai, and langchain-experimental, and then of course we want the Qdrant client, PyMuPDF, and tiktoken. For those of you familiar with it, this feels a lot like RAG dependencies, and it sure is. We’re also going to grab two API keys: our OpenAI API key and our Tavily API key. The Tavily API key is something that you’ll have to sign up for, and on their free version you get something like a thousand requests per unit time. Basically, the idea is that it’s like Google Search, but through a clean API. So it’s free to trial, and then if you want to go crazy with this, you’re going to have to pay the
piper, as it were.

Okay, so the first thing we’re going to do is RAG. I mean, it’s multi-agent RAG; if we don’t do RAG, it feels like we missed the mark. So we’re just going to set up some simple RAG here, just over a single PDF. We’ve got a lot of content on more advanced RAG and everything like that, and this notebook is already quite long, so we’re going to keep it to simple RAG. But the way that LangGraph and LCEL work, you can extend this to however absurd a chain or RAG system you want, as long as it’s wrappable in a Python function that takes in some text and returns some text. I think this is a very important piece of the puzzle: all of these components that you see are hot-swappable. We can change them however we’d like. That’s kind of the beauty of LangGraph.

First, to put the R in RAG, we need retrieval. We’re just going to load up this long context paper from arXiv. I’ll show you here. This whole context window thing, is it the RAG killer, all of this other stuff. So let’s write a little blog on extending context windows. Next, we’re going to chunk it down to size. This is classic. We’re just going to do some naive splitting, nothing fancy going on here. We’re going to turn our one document into 15 smaller documents. Let’s go. Then of course we need an embedding model. If we’re going to do RAG, we need to have some vector representation of our text, assuming that we care about semantic retrieval, which in this case we definitely do. We’re also going to create a Qdrant-backed vector store. Qdrant is just a great vector store; that’s the reason we’re using it. It’s very good at its job, and it can scale very well. So even though this is clearly a toy example,
Qdrant can handle very much non-toy examples, which is great. And then of course we’re going to grab a retriever, so we’re just going to modify our vector store into a retriever. Thanks, LangChain, nice and easy.

Then we’re going to do the A in RAG, which is augmented. This is where we’re going to add our context to our question, and we’re going to give some instructions saying that it’s a helpful assistant, it uses the available context to answer the question, and if it doesn’t know how to answer, it should say “I don’t know.” And then finally, of course, generation. Because this task doesn’t need a ton of incredible reasoning skills, we can just use GPT-3.5 Turbo for this part. There’s no need to use GPT-4 for this. Then we build a simple RAG chain. This is just an LCEL chain. It’s going to take some question, pass it into the retriever to get the context, and pass the question forward to the next step, which is going to pipe into the RAG prompt, which is going to pipe into the chat model, which is going to be parsed as a string. So we can do things like this: call the RAG chain’s
invoke on a question, “What does ‘context’ in ‘long context’ refer to?” and then we get a string response: “‘Context’ in ‘long context’ refers to a coherent text, such as a book or a long paper, that contains multiple independent text,” yada yada yada. You get the idea.

Okay, so first of all, there are some limitations to this particular approach, and I just want to acknowledge those. Again, it’s just an illustrative example. This is a specific PDF, so we’d love it if it could take dynamic PDFs, and it’s obviously very naive RAG, so we’d love for it to be a little bit more complex. You can do all of those things, and as long as the end result is an LCEL chain, nothing else will change. So if you want to tinker around with this and make a more involved RAG chain, you can do that, absolutely. As long as you can get the whole object to be an LCEL chain, you’re going to be just fine, which is pretty dope.

Then we’re going to make some helper functions. We’re going to do the same thing over and over again, so let’s just wrap it in a helper function. First of all, we’re going to create agent nodes. These are nodes with agents. You’ll notice all this agent node does is wrap calling the agent in a function, take what the agent returns, and add it to the state as a human message with the output. We’re going to get into state in just a second. That’s all it does. The reason we wrap the nodes is so that they work as expected with LangGraph, where it’s going to take some state and agent name, and then it’s going to give us an object that’s compatible with our state. Very cool. So we have this idea of an agent node, and we’re invoking an agent, but how are we creating these agents? With a
create agent helper function, of course. Let’s go. A few key things to keep in mind here. Number one, we want to have this boilerplate on the system prompt for all of our agents, because all of the agents that we’re creating with this are going to be very similar under the hood in terms of their prompts. This doesn’t have to be true; you can custom-make each agent. But for us it makes a lot of sense to just use the same boilerplate at the end of every agent: “Your other team members and other teams will collaborate with you with their own specialties. You were chosen for a reason. You are one of the following team members.” And then this classic “Do not ask for clarification.” We just want to go to the agent, get some response based on the tools that it has, and then send that up the chain. Of course, we’re going to be able to modify that with a system prompt, so we’re going to be able to define more exactly what this agent is. We just have this suffix that’s on each agent prompt. There you go.

Okay, the next piece is big. We have our agent scratchpad. This is unique to this agent right here. Whatever agent we’ve created, this is unique to it. So in our executor, it’s one executor, which is comprised of this create OpenAI functions agent and these tools, and which has its own scratchpad. Now, this is something that’s super important. Each of these little sub-agents is its own agent, so already we’re in multi-agent before we even get out of the first graph. But the idea is that they all have their own scratchpad, and we’re just going to populate the important stuff up the chain to the supervisors. This is the idea of how these systems can work so effectively. Each agent’s going to be able to do a bunch of stuff, but that stuff largely just doesn’t matter to the
supervisor agent. The supervisor, just like real-life supervisors, cares about the output. They’re like, “Yeah, what do I get at the end here, guy?” That’s why we have this individual scratchpad for each agent, and then we’re going to see another layer of that as we continue through the notebook. So that’s going to create all of our agents.

Then we need this idea of a supervisor. I hate to be the bearer of mundane news: a supervisor is just a router. It just routes from one thing to the next thing. It takes in the current context and then makes a decision: where do we go next? Which agent do we go to next? Which worker do we go to next? And if the answer is that we don’t go to another team member, we’re just going to finish. We’re going to be done. So the idea of this particular team supervisor is just to act as a router. It says, “Hey, this looks like it needs this work done,” and then it gets a response back. “Now it looks like it needs this work done,” gets a response back, and now it’s done. This is all it’s doing. It’s a lot of code to get there, but basically this is just a function, and we’re once again going to create this OpenAI function situation. It’s not crazy, it’s not insane. It’s just straight-up routing: where do we go next?

So that’s just the helper functions. It’s a bit of a doozy of a notebook, I know. Now we’re going to talk about the research team. Remember, our total system is going to be comprised of, and we’ll go back to the diagram for a second here, this supervisor agent, which is going to interact with this research team agent and this document team agent. What we’re going to talk about next is back down in the notebook. I know, the scrolling, so sorry about that, guys, but
just wanted to reference that diagram. So first things first: we have a tool-using agent, so what do we need to do? We need to give it some tools. Batman’s got to have his utility belt, or else he’s not Batman. We’re going to start by creating a tool for Tavily. You’ll notice we don’t really have to do anything here. We’re just pulling it from the pre-made integrations of LangChain tools, but we can create our own tools if we want, so we’re going to show that next.

Technically, we don’t need to create a tool from our RAG LCEL chain, because LCEL components can be nodes in a graph. However, we’re going to show it so you can see how that tool creation happens. There’s no real reason other than that to do this. You could just use the LCEL chain, and that’s going to be fine as long as you make sure that the inputs are coming in correctly, so you might have to put a formatting component on the top of your chain. But the idea here is that we’re going to show it as a tool so you can see how easy it is to make tools. We just wrap it in our tool decorator, which modifies the function below, and then we create an annotated parameter: it expects a string, and the annotation is “query to ask the retrieve information tool.” Then we give it this docstring. This docstring is important. One of the things you’re going to notice whenever you’re working with agents, graphs, LangChain, AI engineering: we’re always prompting, all the time. This is a prompt. The LLM is going to see this and use it as a prompt. So remember, when we’re writing those docstrings, it’s not just random text for humans. This is how the LLM is going to interact with our system, so it’s important to write a clear docstring here. Then all we do is return that chain, invoked.

Okay, so now we have our two tools: our Tavily search tool, and we have our
retrieve information tool, which is our RAG pipeline.

Next up, we're going to create some state. We're going to add three objects under state. We're going to have messages, which is just a list of messages, so everything we've done up to this point; team members, which is the members we have in our team, unsurprisingly; and then who's up next. That last one is going to help decide where we're going next: who am I passing the ball to next? We just write about that a little bit there.

We're going to be using GPT-4 1106 preview. The GPT-4 01-something (I can't remember whether the rest of the numbers is exact), the newer version, the January version, is actually a little bit worse than 1106 at this task for some reason. It just gets lazy. I think they attempted to fix that; it didn't work, so we're going to use 1106. So this is our LLM. You'll notice, though, that we are using GPT-4 here. This is no longer a case where GPT-3.5 is going to do. We need a strong reasoner; we need an LLM that can do a great job. That's why we're using GPT-4 here.

Now that we have our LLM, we need to start creating some agents. We're going to create first our search agent, which is going to be based on that, and wrap it into a node. So now we have this search agent, and we have it tucked inside of a node. That's awesome. We do the same thing for our RAG pipeline and then tuck it inside of a node. You love to see it.

Next, we create our supervisor agent. We're going to pass in that same LLM, GPT-4, and we're going to give it this text. Now remember, this text is in addition to the other text that exists in its boilerplate, but the idea is that it's going to be able to pass to these separate tools or finish. We're going to let it know which tools it has access to.

And then we can move on to the best part: making a graph. We initialize our graph with the research team state graph. We're going to add the node that we created to our graph, and we're going to name it Search. We're going to add the research node, which is the LLM, or
the RAG node, to our graph, and we're going to name it PaperInformationRetriever. These names are actually pretty important: they have to be in this format, and they can't have spaces and this kind of thing, so make sure that you're naming these correctly. And then of course we're going to add our supervisor node. So now we just have three nodes chilling in a graph; they're not connected to each other at all.

So we're then going to create edges. The edges are pretty straightforward. If we're at the search node, we're going to return to the supervisor node. If we're at the paper information retriever node, we're going to return to the supervisor node. These nodes all lead back to the supervisor. Now, from the supervisor, what's next in our state (remember, we defined this "next" in our state up here) is going to determine where we go next. If it's Search, we'll go to the search node. If it's PaperInformationRetriever, we'll go to the paper information retriever node. And if it's FINISH, we'll go to the end node.

Now, two things I want to be very clear about here. Basically, we can go to the search or paper information retriever nodes, which are powered by those agents, which have those tools, and then they return to the supervisor; or we can end in the graph. And this graph has state, and that state is its own state. So we have agents with their own scratchpads, then we have these nodes which represent those agents, and the entire graph has its own state. So we've got two layers of keeping information apart here, which is very important to think about.

And then we just compile it, and that part's great, and we set an entry point: we enter through the supervisor. Easy peasy. Then we can use Mermaid to display our graph. It doesn't look beautiful, but it is right. So we have this idea that we can go from our JSON output function parser,
which is like, "Where do I go next?", to the paper information retriever, which goes back to the supervisor agent, or we can go to search, which goes back, and then this can also lead us to the end, or finish. So that's the Mermaid image of our research team graph.

Now, because we intend this to operate with another graph, we have to have some way to get in here, and the way we're going to do that is through this enter chain. We're going to create an LCEL chain from our entire graph. This is the beauty of LCEL: this chain represents our entire graph, our whole graph, but we could just tack on another LCEL component, easy peasy.

Then we can test it out, and we can see things like: we enter, the supervisor says we're going to search, we do some kind of retrieval with search, we come back, and the supervisor says we're going back to search. We do that, and then eventually the supervisor says, "Hey, you know what, now we're going to the paper information retriever." So the graph decided we would go to search twice and the paper information retriever once, and then it felt like it had all the information that it would need. Dope.

Okay, so that's the research team side. We created a graph; the graph does stuff. We're now going to immediately think of this as a single unit. This entire research team is now just this bit right here, where we give it some query, it tells us which tools to use to research information, and then eventually we come up with a final response that we're going to pass back to our supervisor. So this is the research team supervisor; this next one is going to be the CEO, or however you want to think about it.

Next up, we have the document writing team. We're going to go through this a little bit quicker. It's the same idea exactly,
except instead of tools that relate to search and information retrieval, its tools are related to document creation and document editing. We have our create outline tool, which is going to open a file, put an outline into it, and then save that file. Then we have a read document tool, which is going to open a file and read it. Then we have our write document tool, which is going to, unsurprisingly, open a document and write to it. And then we have our edit document tool, which is going to (it's going to blow your mind) open a document and edit it. So we have these few different tools that we can use: we have the ability to create outlines, which are going to be a document; we can read those documents; we can write new documents; or we can edit documents. All awesome stuff to be able to do when you're trying to write a blog.

We're going to create this state for our document writing team, and it's going to be exactly the same as our research team's, except we're going to add this additional current files parameter. What this is going to do is just tell us what files currently exist in the directory it's working in. We're going to also have this prelude. All this is doing is saying, "Hey, by the way, these are the current files you have; this is how many files you have." And that's it.

We create the graph. It's exactly the same as the one we created before, but with some additional nodes. The idea is that every node goes back to the supervisor, so all paths from all of the sub-agents lead back to the supervisor, and then the supervisor can send it to any one of those particular agents. That's it. Then we can look at this and see it happening: you can come in here, and it can go to the doc writer, the note taker, the copy editor, the dopeness editor, and then eventually it can finish.

Now, one thing that I do want to just
keep in mind: when we add these tools up here, each of these entities is going to have access to specific abilities. The idea is that we want to give our specific team members (sorry about all the scrolling here again) access to specific abilities, and that's important.

Okay, that's all great so far. Next step: we're just going to wrap this in the same thing we did before for our research team, and then we're going to see it work. You can see here we ask it to write a short outline on linear regression and write it to disk. And what does it do? It goes to the doc writer node, which does exactly that, and then we get a short outline that's written to disk. And then if we look (this is the worst, for sure), we can see there is a linear regression outline that's created in a text file in that temp directory that we pointed it to. Pretty awesome.

Okay, so that's what we've done up to this point. We've created our research team and we've created our document writing team, and now we're going to go back to Greg, who's going to show us how we can tie these two together into a truly multi-agentic experience. Back to you, Greg.

Awesome. Okay, so we've got our research and our doc teams set up. The ICs are ready to do their work, so you know what time it is: it's time for the supervisors. And the thing about the supervisors that's so interesting (we talked about this before) is that they're just routing. They all have the same prompt: "You are a supervisor tasked with managing a conversation between the following workers: worker one, worker two, worker three, worker four, whatever. Given the following user request, respond with the worker to act next. Each worker will perform a task and respond with their results and status. When finished, respond with FINISH." Okay, seems simple enough. So supervisors are just routers then, right? It's like
they're just asking for TPS reports or something. And it begs the question: are they doing any reasoning? Are they taking any action? What's the role they're playing, these supervisors, exactly? I'll leave that as a thought experiment for all of you watching, but it is worthwhile to think about in the 21st century. We know what the ICs are doing; we can see their output. But for now, it's time to create these supervisors and make sure the work is going to the right place, being routed properly, for both the research team and the documents team, up to the meta-supervisor, who is oh so agentic at the very top. Wiz, back to you to close it out.

Muted, sorry, guys, sorry about that. Thanks, Greg. But yes, all we need to do is get a new LLM; it's just going to be the same one. Then we're going to create our supervisor node. (Thanks for all the reminders in chat, guys, sorry about that. Am I still muted? If I'm still muted, let me know. Okay, good.)

So the idea is that we just need to create one more layer. Before, we created two different graphs; instead of graphs, let's just consider those nodes. So this new supervisor, all it does is tell us when to go to the research team or the blog writing team. That's it. I mean, you can't make this stuff up. This is all it's doing; like Greg said, it's a router.

We create new state, and we have fewer things we need to keep track of, since we're not so worried about the team members; there are only two team members. We have our messages, and then of course we have our next, so that's who we're going to next. And then this next piece is the best: we only care about the last message going into the new graph, and we only care about the last message from that subgraph. So we can think of it this way: we have this parent graph and we have
this child graph, but the only communication between those two layers is going to be the most recent message from either of them. Which means that the parent graph, or the meta-supervisor, the ultimate supervisor, the one right at the top (CEO, whatever you're going to call it), only sees the last piece of work from the research team supervisor or the blog writing supervisor. So this is huge: we only pass that relevant information. This keeps the state very clean and lets it be a very effective reasoner and a powerful tool.

And then of course we need to build the graph. Well, the graph is very easy to build, because there are only two nodes, and they both go back to the supervisor, and then the supervisor decides if it's going to go to the blog writing team, the research team, or be done.

And we can finally use the thing. Ultimately, when we use this, it's going to do this whole flow, and this whole flow is going to go through research team, blog writing team, research team, blog writing team. It probably won't do that many iterations, to be honest with you; usually it does two or three. But at the end we get this output: just a full blog on the paper. I mean, listen, is this the best blog that's ever been created? Okay, I'm not going to say that. But it is a blog, it was created from the paper, and it did go through dopeness editing and copy editing. We can see that this is pretty dope: "results are nothing short of revolutionary." That's pretty hype language; that's something that our dopeness editor helped with.

So this is the idea. This part's very straightforward: each of those sub-nodes or subgraphs, we just consider a node. That's the power of LangGraph: it's an entire agent graph, but we're just like, "It's a node, who cares?" And that is it. So with that, we'll pass you back to Greg. So sorry about being
muted, guys; thanks for calling me out in the chat. And also, don't forget to like, comment, subscribe, smash the notification bell. I know it's kind of silly, but it does help. We're here every Wednesday, we love talking about this kind of stuff, and I'll pass it back to Dr. Greg to bring us to Q&A.

Ring that bell, baby. Yeah, you crushed it, Wiz, thanks, man. So we got agentic with the meta-supervisor, and now we can think about this idea of multi-agent RAG in the context of the patterns of generative AI that we're using to build LLM applications. We saw how this all came together in a single build; we've been through a lot in an hour together. And we can also start to see why we can say things like "agents are fancy RAG." Now remember, these are useful because the grouping is very helpful, the separating of prompts is very helpful, the conceptual models are very helpful. Again, let us know if you come up with any sort of must-have multi-agent use cases; I would love to hear about them. But the patterns, the patterns, the patterns: they're just everywhere. Tools all have prompts, and supervisors are routers, and search is retrieval. It's a lot to get a handle on, and we hope that this helped you out today.

If you have questions, please let us know in Slido; we're going to start taking them now. We look forward to creating more multi-agent content in the future. We want you guys to take this notebook and create your own blogs with it; we will like and sub to those. And maybe, if we can get this good enough, dope enough, Chris, we will create our own AI Makerspace auto-blogger, the AIM editorial.

Okay, so to Slido. Which platform is better for multi-agent RAG, LangChain or LlamaIndex?

LangChain. Boom.

Okay, all right. And can we get just a why, real quick? How come?

I mean, stuff like LCEL is just such an effort multiplier. We make one thing, and we can just straight use it in the next thing. It's tough to beat that right now.

Yeah. I love
the second question so much: "Seems that everything can be done with a single agent. The only difference is the forced sequence of agents or tools. Is there something else I missed?" Anonymous.

Yeah, I think maybe a little bit. There's no forced sequence of tools here; the agent is free to select which tool to use, when, in which order, and how many times. That's the idea. So I would say the different sequence of agents is kind of where we get this. Could it all be done with a single agent? Maybe; you could just add all these tools to one agent. But the idea is that this compartmentalization means the LLM has one fewer decision, or sometimes four fewer decisions if we're using the four writer tools. Instead of choosing between, like, 12 tools, it's choosing between two tools, or three tools, or four tools, and that is supposed to make it better.

Yeah, okay. I go back to the child at the grocery store: "Which kind of mustard do you want, sweetie?" Do you want them to have a hundred different mustards to choose from, or three? And I think it is a great question to always ask, though: can it be done with a single agent? Can it be done with no agents? Of course, we were doing multi-agent RAG today, so we used multiple agents.

Next question: is it possible to share variables like dicts, data frames, or anything else between agents, instead of just making them communicate with natural language?

Yes, absolutely. We can do that by passing different parts of state, different components of state. As you saw in this example, we only pass the last message into state, but we could add more things and add even more state. I think that's going to be a decision that you need to make depending on your use case. But yes, the answer is you can absolutely pass (I'm not going to say whatever you want, because that's of course literally not true) basically whatever you'd like.
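To make that answer about sharing structured data a bit more concrete, here is a minimal sketch in plain Python. It is not the notebook's code: the `artifacts` field, the `research_node` and `writer_node` workers, and the toy `apply_update` merge (a stand-in for how a graph framework would apply state updates) are all assumptions made up for illustration.

```python
import operator
from typing import Annotated, Any, TypedDict


class SharedState(TypedDict):
    # Natural-language channel, as in the walkthrough.
    messages: Annotated[list[str], operator.add]
    # Hypothetical extra field: structured data any agent can read or update.
    artifacts: dict[str, Any]
    # Who the supervisor routes to next.
    next: str


def research_node(state: SharedState) -> dict:
    """Hypothetical worker: returns a partial update carrying structured rows."""
    rows = [{"paper": "example", "year": 2024}]
    return {
        "messages": ["research: found 1 row"],
        "artifacts": {**state["artifacts"], "rows": rows},
    }


def writer_node(state: SharedState) -> dict:
    """Hypothetical worker: reads the structured rows instead of parsing text."""
    rows = state["artifacts"].get("rows", [])
    return {"messages": [f"writer: drafting from {len(rows)} row(s)"]}


def apply_update(state: SharedState, update: dict) -> SharedState:
    """Toy merge: append to messages, overwrite every other key."""
    merged = dict(state)
    for key, value in update.items():
        if key == "messages":
            merged[key] = state[key] + value
        else:
            merged[key] = value
    return merged  # type: ignore[return-value]


if __name__ == "__main__":
    state: SharedState = {"messages": [], "artifacts": {}, "next": ""}
    state = apply_update(state, research_node(state))
    state = apply_update(state, writer_node(state))
    print(state["messages"])
```

The same shape would also fit a hypothetical `sources` list for tracking citations; which fields to add is a per-use-case decision, as the answer says.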
Nice, nice. Okay, so: when dealing with multi-agent RAG, it gets hard to cite or source responses inline. Is there an effective way to do this across all the retrieved sources inline?

Yeah, okay, so citation is a little bit harder. You could add a piece of state that just keeps track of the various sources and citations, and then populate those at the end in some kind of dump. That would be the base way that I would want to approach this if you want to be sure that you're citing everything that you can. Some of these agents aren't going to produce citations, because they're not really looking for those things. But with state, basically, you'd want to manage that context as you're passing it around; you can add that fairly straightforwardly.

Okay. Can agents work in parallel?

Yes, of course. Just to make sure I understand: some of these flows can be parallelized. If you need to search two different tools, you can search them at the same time and then synthesize a response once you receive a response from both of them. That's already built in through LCEL. I believe it's coming to LangGraph "soon (TM)", but for right now it's built into the LCEL components, and I believe it's going to enter into LangGraph soon enough.

Okay. And what are the techniques or design patterns to make our system faster and more responsive? This multi-agent setup can be potentially slow, can't it?

Oh yeah, for sure it's going to be slow. I'm not going to tell you that it's going to be fast. You can make it feel faster using a lot of streaming: streaming the responses is going to mean that the time to first token is very low, but it's still going to take the same amount of time to generate the full final answer. So it is going to be something that takes a little while, especially when we have these six to seven
different calls.

Also, one thing to touch on from that point of view: this is where a tool integration like LangSmith, which we didn't touch on in the notebook but which is easy to integrate, comes in and answers a lot of the questions we've seen in chat. How do we know how many tokens, how many calls, what path it takes? All of those can be tracked through LangSmith if you use that integration.

Yeah, and I just want to give a shout-out to Garrett, big homie in our community. He's building Deep Writer, and if you want to talk about how it's slow and potentially expensive to do multi-agent stuff, he'd be a great resource for you to follow and to start a DM thread with. He's all about all of this and is constantly building, every day.

So it looks like we've got a lot of other questions, but we are about at time. We will collect these questions and try to make a few posts in the week to come on LinkedIn, so give us a follow there. But that's it for today; we wanted to make sure that we end on time. We'll be back with more multi-agent stuff soon; you can count on that. Thanks so much, Wiz, for walking us through that; that was incredible. We will wait on publishing our first blog until we think it is truly dope enough, huh?

And let's go ahead and close it out for the day. If you guys enjoyed this and you don't know AI Makerspace yet, we'd love to see you on Discord real soon. We're building, shipping, and sharing with folks all the time, and we'd love to have you as a part of our community starting now. You can start learning for free, of course, on YouTube. We've got an open-source course on LLM Ops that we taught last year, and we look forward to open-sourcing another course here very soon. And we are always running our bootcamp courses. Our flagship one is the AI Engineering Bootcamp. It is an eight-week course that walks you through everything from your first LLM application through the patterns that you need to leverage and build up to
multi-agent frameworks. We are starting our next cohort on May 28th. It's kind of a high bar and quite a bit of friction to get in, so apply now and start working your way through the AI Engineering Bootcamp challenge.

To check out more events, if you aren't familiar with us, check out our Awesome AIM Index on GitHub. You get direct access to all of the code; there are always concepts and code with every event that you join on YouTube. Again, we're back every week on Wednesday, same time, same place. We hope to see you again soon. Like and sub, and in the meantime, keep building, shipping, and sharing, and we will most certainly do the same. Have a great week, everybody. We'll see you soon.
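For anyone revisiting the walkthrough afterwards: the "supervisor is just a router" idea, together with the rule that only the last message crosses between a team and the meta-supervisor, can be sketched in plain Python. This is not the notebook's LangGraph code. The `decide` functions below are hypothetical rule-based stand-ins for the LLM routing call, and the worker functions and their return strings are made up for illustration.

```python
from typing import Callable, Dict, List

FINISH = "FINISH"
Worker = Callable[[List[str]], str]


def run_team(decide: Callable[[List[str]], str],
             workers: Dict[str, Worker],
             task: str,
             max_steps: int = 10) -> str:
    """Route between workers until the supervisor says FINISH.

    Returns only the last message, which is all the parent layer sees.
    """
    messages = [task]
    for _ in range(max_steps):
        choice = decide(messages)  # stand-in for the LLM router call
        if choice == FINISH or choice not in workers:
            break
        messages.append(workers[choice](messages))
    return messages[-1]


def seen(messages: List[str], prefix: str) -> bool:
    """True if some earlier message starts with the given prefix."""
    return any(m.startswith(prefix) for m in messages)


def research_decide(messages: List[str]) -> str:
    # Hypothetical routing rule: search first, then retrieve, then finish.
    if not seen(messages, "search:"):
        return "Search"
    if not seen(messages, "rag:"):
        return "PaperInformationRetriever"
    return FINISH


research_workers: Dict[str, Worker] = {
    "Search": lambda msgs: f"search: results for {msgs[0]}",
    "PaperInformationRetriever": lambda msgs: f"rag: context for {msgs[0]}",
}


def research_team(messages: List[str]) -> str:
    # The whole team behaves like one node and receives only the last message.
    result = run_team(research_decide, research_workers, messages[-1])
    return "research team: " + result


def blog_team(messages: List[str]) -> str:
    # Hypothetical one-step team: drafts from the last message only.
    return "blog: draft based on " + messages[-1]


def meta_decide(messages: List[str]) -> str:
    if not seen(messages, "research team:"):
        return "ResearchTeam"
    if not seen(messages, "blog:"):
        return "BlogWritingTeam"
    return FINISH


top_workers: Dict[str, Worker] = {
    "ResearchTeam": research_team,
    "BlogWritingTeam": blog_team,
}

if __name__ == "__main__":
    print(run_team(meta_decide, top_workers, "Write a blog on the paper"))
```

The same `run_team` loop serves both layers, which is the point made in the session: a team with its own supervisor is treated as just another node by the layer above it.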