Unlock AI agents' real power?! Long-term memory & self-improvement

Unlocking the Real Power of AI Agents: Long-term Memory and Self-Improvement
In today’s rapidly evolving digital landscape, Artificial Intelligence (AI) agents are increasingly central to our interaction with technology. From virtual assistants to advanced chatbots, AI agents have become an integral part of our day-to-day online engagements. However, a pressing question remains: Can these AI agents continually improve, learn from past interactions, and remember user preferences over time? Let’s dive into the transformative potential of long-term memory and self-improvement in AI agents.
The Challenge with Traditional AI Agents
Traditionally, AI agents are designed to operate without long-term memory, handling each interaction as a standalone event. This approach implies there’s no learning curve; an AI interacting for the hundredth time performs no better than it did on its first. This limitation often leads to user frustration, as the AI fails to recall previous preferences or instructions, like dietary restrictions or favored news sources.
The Power of Long-term Memory in AI Agents
Integrating long-term memory into AI agents can significantly enhance user experience. Imagine an AI that remembers a user’s dislike for fish or their preference for news from certain sources. This capability doesn’t just streamline interactions by reducing repetition—it builds a more personalized user relationship, much like interactions with a human assistant who remembers preferences and dislikes.
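A minimal sketch of this idea, with a plain Python list standing in for a vector database and naive keyword overlap standing in for semantic search (all names here are illustrative, not any particular library's API):

```python
# Minimal long-term preference memory: store user facts, then surface the
# relevant ones alongside each new request. A real system would use
# embeddings and a vector database instead of keyword overlap.

class PreferenceMemory:
    def __init__(self):
        self.facts: list[str] = []  # e.g. "meal preference: no fish"

    def remember(self, fact: str) -> None:
        self.facts.append(fact)

    def recall(self, query: str) -> list[str]:
        # Naive relevance check: any shared word between fact and query.
        query_words = set(query.lower().split())
        return [f for f in self.facts
                if query_words & set(f.lower().split())]

def build_prompt(memory: PreferenceMemory, query: str) -> str:
    # Append recalled facts to the query, RAG-style.
    context = memory.recall(query)
    if not context:
        return query
    return f"{query}\n\nPast context about this user:\n- " + "\n- ".join(context)

memory = PreferenceMemory()
memory.remember("meal preference: no fish")
memory.remember("news preference: avoid CNN")

prompt = build_prompt(memory, "Prepare a meal plan for next week")
```

Only the matching fact is pulled in: the meal-plan query recalls the fish preference but not the unrelated news preference, so the agent sees just the context it needs.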
Case Examples and Applications
- Customer Service: AI agents with long-term memory can provide more personalized support by recalling past issues and preferences, reducing resolution time and increasing customer satisfaction.
- Healthcare Management: AI can track patient treatment plans, medication schedules, and past ailments, offering reminders and personalized healthcare advice based on historical data.
- E-Commerce: Imagine an AI shopping assistant that remembers past purchases and can suggest new products aligned with the user’s tastes and past feedback.
Enhancing AI with Self-Improvement Capabilities
Next, let’s consider self-improvement—AI’s ability to autonomously update and refine its algorithms based on feedback and new data. This feature is akin to how humans learn from mistakes and incorporate feedback.
Continuous Learning and Adaptation
An AI equipped with self-improvement algorithms can adapt to new tasks and environments without extensive manual updates. This adaptability is crucial for scaling AI applications in dynamic fields like digital marketing or content curation, where consumer preferences and trends rapidly change.
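The adaptation loop described here can be sketched as an agent that distills feedback into "lessons" and folds them into its own instructions on the next run. This is a toy illustration, not any specific framework's API; in practice the distillation step would itself be an LLM call:

```python
# Toy self-improvement loop: feedback is distilled into "lessons" that are
# folded into the agent's instructions for the next run. In a real system,
# the distillation step would itself be an LLM call; here it is a stand-in.

class SelfImprovingAgent:
    def __init__(self, base_instructions: str):
        self.base_instructions = base_instructions
        self.lessons: list[str] = []

    def record_feedback(self, feedback: str) -> None:
        # Stand-in for "reflect on the feedback and extract a lesson".
        lesson = f"avoid repeating this mistake: {feedback}"
        if lesson not in self.lessons:  # keep the lesson list deduplicated
            self.lessons.append(lesson)

    def system_prompt(self) -> str:
        # Each new session starts from the base instructions plus everything
        # learned so far, so the agent improves without manual prompt edits.
        prompt = self.base_instructions
        if self.lessons:
            prompt += "\nLessons from past interactions:\n"
            prompt += "\n".join(f"- {lesson}" for lesson in self.lessons)
        return prompt

agent = SelfImprovingAgent("You are a news-summary assistant.")
agent.record_feedback("the user asked not to use CNN as a source")
```

The key design choice is that learning lands in the prompt rather than in model weights, which is why this kind of self-improvement can run continuously in production without retraining.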
Real-World Implementation and Its Impact
Deploying AI agents capable of both long-term memory and self-improvement can revolutionize how businesses interact with customers, offering a more responsive, understanding, and efficient service. For instance, an AI travel assistant that remembers a user’s travel preferences can automatically suggest hotels, flights, and itineraries based on past trips, improving service quality and user satisfaction.
Barriers and Considerations
Despite the clear benefits, integrating long-term memory and self-learning capabilities in AI does pose challenges, including:
- Data Privacy Concerns: Storing personal data over time raises significant privacy issues. Ensuring compliance with global data protection regulations is paramount.
- Increased Complexity: Developing these capabilities requires more sophisticated algorithms and structures, increasing the complexity and potentially the cost of AI systems.
- Performance Optimization: Balancing memory usage and speed of retrieval with accuracy and user experience is a technical challenge that must be carefully managed.
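One way to manage the memory/cost trade-off in that last point is tiered storage: memories that have not been retrieved for a long time (the video's transcript suggests six months) move from the fast vector store to cheaper cold storage and are promoted back on access. A minimal sketch; the class, threshold, and tier names are illustrative:

```python
import time

SIX_MONTHS = 180 * 24 * 3600  # eviction threshold in seconds (illustrative)

class TieredMemory:
    """Hot tier: fast/expensive store (e.g. a vector DB); cold tier: cheap storage."""

    def __init__(self, now=time.time):
        self.now = now                    # injectable clock, to make this testable
        self.hot: dict[str, float] = {}   # memory -> last-access timestamp
        self.cold: set[str] = set()

    def add(self, memory: str) -> None:
        self.hot[memory] = self.now()

    def retrieve(self, memory: str) -> bool:
        if memory in self.cold:            # promote back to the hot tier on access
            self.cold.discard(memory)
            self.hot[memory] = self.now()
            return True
        if memory in self.hot:
            self.hot[memory] = self.now()  # refresh last-access time
            return True
        return False

    def evict_stale(self) -> None:
        # Anything not retrieved within the threshold moves to cold storage.
        cutoff = self.now() - SIX_MONTHS
        for memory, last_used in list(self.hot.items()):
            if last_used < cutoff:
                del self.hot[memory]
                self.cold.add(memory)
```

Unlike human memory, nothing is lost here: stale knowledge just becomes slower and cheaper to reach until it is needed again.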
The Future of AI: Toward More Intelligent Agents
As AI continues to evolve, the emphasis is shifting towards creating agents that not only perform tasks but also understand and adapt to user needs through memories and learning. This advancement could lead to more profound interactions between humans and machines, blurring the lines of what AI can achieve.
Conclusion
The integration of long-term memory and self-improvement mechanisms within AI agents represents a notable shift towards more intelligent, adaptable, and personalized technology. As we continue to develop and refine these capabilities, AI agents are set to become even more integral to our digital lives, promising an era where technology is not just a tool, but a learning and evolving companion.
[h3]Watch this video for the full details:[/h3]
How to build long-term memory & a self-improving ability into your AI agent
Use AI Slide deck builder Gamma for free: http://gamma.1stcollab.com/aijason
🔗 Links
– Follow me on twitter: https://twitter.com/jasonzhou1993
– Join my AI email list: https://www.ai-jason.com/
– My discord: https://discord.gg/eZXprSaCDE
– Autogen teachability: https://microsoft.github.io/autogen/blog/2023/10/26/TeachableAgent/
– Get AI Agent Long term memory source code: https://forms.gle/JwM29rGtjZFf26MF9
– Deploying AI: Build long term memory from scratch: https://www.youtube.com/watch?v=oPCKB9MUP6c&ab_channel=DeployingAI
⏱️ Timestamps
0:00 Intro
2:16 How long-term memory works
5:41 Example: MemGPT
7:17 Example: Support agent self-improving
8:03 Example: CLIN – Continually learning language agent
10:49 Gamma AI co-pilot
13:14 Implementation methods
14:26 Autogen teachability step by step guide
16:48 Demo
17:52 Autogen teachability break down
👋🏻 About Me
My name is Jason Zhou, a product designer who shares interesting AI experiments & products. Email me if you need help building AI apps! ask@ai-jason.com
#gpt5 #autogen #gpt4 #autogpt #ai #artificialintelligence #tutorial #stepbystep #openai #llm #chatgpt #largelanguagemodels #largelanguagemodel #bestaiagent #chatgpt #agentgpt #agent #babyagi
[h3]Transcript[/h3]
One question that I get asked a lot is: can an AI agent get better and better over time, learning from its past mistakes and interactions? If the agent you are building is similar to the structure I show here, the answer by default is no, because most of the agents we're building today are stateless, which means there's no real difference between the agent running for the first time versus the hundredth time, because it has zero context about what has happened in any other session. And this kind of sucks. If the agent had some past conversation with the user where the user already expressed a certain preference, like "I don't eat fish," the next time the user talks to this agent they expect it to remember the preference they gave before, and it is a really bad experience when the agent forgets. Every time we talk to the agent, it almost feels like starting from scratch, and that also means it is quite hard for you to train the agent on a specific standard procedure for different types of tasks. For example, if the user asks the agent for a summary of the news, and this particular user doesn't like a specific data source like CNN, then even though the user gave very explicit instructions, the next time they get the agent to do a similar task they still need to give this instruction again and again, because the agent doesn't really learn this specific SOP. And most importantly, after we deploy AI agents in the real world, there will be millions of different edge cases. For example, I was building a meeting-scheduling agent that could coordinate times and book meetings for the user, and I was expecting user input to be like "how about next Tuesday, 9 a.m. CST," but after deploying in the real world, people were prompting the agent in many different ways, and there were so many complex scenarios that we didn't expect before, like the user being in multiple different time zones across a period of time. If all those iterations rely on a human coming in to make adjustments, this human-driven iteration almost becomes a bottleneck for the agent to deliver great performance. But what if we could actually get the agent to learn from its past interactions, remember the user's preferences, and even update its own system prompt and workflow? That could deliver pretty amazing results, because imagine you could build a marketing or sales agent that can even run A/B tests by itself with different prompt styles and, based on external or human feedback, automatically reflect on and improve its workflow and prompts. So the ability to have long-term memory and learn from those memories is really powerful, and what's really fascinating is that this is exactly how humans learn as well. People normally learn new skills and knowledge through a three-step process: it first requires us to pay attention to the specific things we want to learn; after we pay attention to the specific data we receive, we go through a process where we encode this data into our brain and replay it many, many times through a process called consolidation; and in the end this piece of information becomes a long-term memory stored in our brain. But there's one caveat: if there's some knowledge in your memory that you haven't retrieved or used for a very long time, that information will fade away from your long-term knowledge. This is where AI agents become really interesting, because this won't be a problem for them: the data will always be stored there and can be retrieved by the agent anytime, and this means the number of skills the agent can handle can grow far beyond what a human can manage. So instead of just a simple conversation between the user and the agent,
we can add a new workflow to replicate the process of creating, storing, and retrieving long-term knowledge. You can have a new agent, maybe called a knowledge agent, which looks at the conversation between the user and the agent and tries to decide: is there any interesting information that should be stored for later retrieval? This knowledge agent can then summarize and extract the specific information and store it in a vector database, so that the next time the agent has a similar situation, it can do a vector search and retrieve the relevant information. As a simple example, if there's a conversation where the user specifically said "my name is Jason and I don't really eat fish," then apart from the agent answering the question, there could be a knowledge agent behind the scenes looking at this conversation. First it asks: is there any information in the message worth saving as knowledge? If yes, it can trigger a second tool or process to abstract the learning and save it as knowledge in our database. It can call a function called create_knowledge, where it passes the knowledge type as "dietary requirement" and the detail as "no fish," so that the next time the user comes back and says "hey, prepare a meal plan for me for next week," it will be able to retrieve the relevant knowledge and append it to the user query, pretty much like RAG. The real agent then receives a message that says "the user says: prepare the meal plan for me, and here is some past context: the user doesn't really eat fish," so that this time the agent can respond with this preference in mind. And this is, at a high level, how you can build long-term memory for your agent system. This is probably an oversimplified version; behind the scenes there's a huge amount of optimization that needs to happen to make this production-ready. For example, it probably takes a lot of time to do this process for every single incoming message, and you don't really want to add too much latency to the user experience, so you can add an optimization: have a cheaper and faster model first check whether there is anything interesting worth converting into knowledge; if no, skip the knowledge process, and if yes, go through it. Same thing when the agent receives a new query from the user: a cheaper and faster model, like Haiku or Mistral, can quickly check whether any relevant information requires retrieval this time; if yes, do proper RAG, and if no, just answer the question right away. And as the user has more and more interactions, this knowledge base will become huge too, so we can do additional optimizations, like: if the knowledge and data haven't been used much in the past six months, move them to cold storage, so you can reduce the vector database cost. Those are just some simple implementations. There was one project last year called MemGPT, which stands for memory GPT. The prompt tokens of the large language model in MemGPT are broken down into three parts: one is the system instruction, which is like a system prompt that doesn't really change; the second is the working context, which is retrieved from the long-term memory they call archival storage; and they have a queue manager to achieve some sophisticated prioritization of what information gets put into the prompt tokens. For example, in a conversation where the agent asks "how was your day today" and the user says "my boyfriend James baked me a birthday cake," the agent will trigger a function to add some working context: "birthday is February 7th" and "boyfriend's name is James." In a later conversation, where the agent asks "did you do anything else to celebrate your birthday" and the user says "yeah, we went to Six Flags," the agent will try to retrieve past conversations related to Six Flags, where the user actually mentioned that she and James, her boyfriend, first met at Six Flags, so the agent will be able to generate a response
saying, "Did you go with James? It's so cute that you both met there." And if a couple of months later the agent asks "how's James doing, any special plans today," and the user responds "actually, James and I broke up," then this agent will start updating its knowledge base, marking James as the ex-boyfriend instead of the boyfriend. So you can see how this long-term memory really changes the experience for the user, and a similar setup was also introduced in ChatGPT for some beta users, where they also have very simple, basic memory management. And this goes much further than just remembering the user's preferences. I was using a similar method to enhance a customer support agent. If you've ever built a customer support agent, one challenge is that in the real world, the questions users ask often require knowledge that doesn't exist in the original dataset. So what I did was implement a process where, if a user asks a question the support agent can't answer, at the beginning it can escalate to the manager, which is a human, and the human can give instructions back to the agent so the support agent can answer the user's question. But behind the scenes, it will also try to extract this new knowledge and update its own knowledge base, so that the next time something similar happens, it can retrieve this knowledge to answer the question. On the other hand, I remember seeing a paper last year where they showcased a truly self-evolving agent system. It's a project with the short name CLIN, which stands for continually learning language agent. They put an agent into a simulated environment where the items in this world actually follow real-world science and physics, and the AI agent can interact with different items: if you put fire on wood, it will light up, but if you put a pot of water on the firewood, the water will boil. So it's a simulated environment for the agent to interact with the world, and the goal of this project is to build an agent system that can continuously learn about this world by interacting with it, because once you can build such a continuously learning agent system, you can put it into any digital simulation world, even a totally different game like Minecraft or GTA, and it can start learning just by interacting with the world. What I found really interesting is the way they set it up. The agent is given different types of tasks; it might be given a task called "grow an orange," and it starts by breaking the task down into different actions it can take, observing results, and deciding the next step. Even though, in the end, the agent didn't complete this task on the first try, it made certain progress; for example, when it went to the kitchen, it actually found a seed, which is necessary for growing an orange. It then reflects at the end of this trial and gets a learning: "going to the kitchen may be necessary to find the seeds." This information is fed to the agent the next time it tries to complete the same task, so this time it retrieves the knowledge that going to the kitchen may be necessary to find a seed and goes to the kitchen right away, and on the second trial it actually completes the task. Again it reflects and generates knowledge: this time it finds that not only may going to the kitchen be necessary to find seeds, but moving seeds to the pot may also be necessary for planting them, and with this new knowledge it can complete the task even faster next time. With similar methods, the agent tries to complete similar tasks across many different environments multiple times, then looks at all the sessions it has ever done and abstracts general learnings that can be adopted across different tasks and environments. For example, it might learn from one session that using a lighter on the metal pot should be necessary to heat the water in the pot, and in a different trial it might find that turning on the stove should be necessary to create
a heat source, and with those two learnings it will try to generalize a new learning: using a heat source, like a stove or a lighter, on the container should be necessary to heat a substance. So with this process, you can see that the agent starts really developing an understanding of how the world works, as well as abstract learnings that can be used across multiple different tasks and environments. We've talked a lot about different concepts and methods for creating agent long-term memory, so I'm going to show you a quick example of how you can add long-term memory to your agents in just ten minutes. But before I dive into this: I think most of us here believe AI is going to fundamentally change how we use software, but actually designing a good AI-native product is really hard. One of the platforms with the best AI-native experience, in my opinion, is Gamma. Gamma is an AI-native slide deck and website builder where they reimagined the whole workflow of building slide decks and websites and baked AI into every part of the journey. The part I love the most is how they designed an experience where the AI agent and the human actually collaborate. Here is a quick example. You can visit gamma.app to create an account for free, and let's say I want to create a slide deck about the history of large language models. I can click on this "create with AI" button, select generate, select presentation (you can also choose website or documents), type in "the history of large language models," and click "generate outline." Instead of getting the AI to create the whole slide deck autonomously, they introduce an aligning stage where the AI first proposes an outline of the slide deck, and the user can come in and make changes. You can also set how text-heavy it should be, as well as the image source, and you can even click on advanced mode, where you get a lot more control, like voice and tone, plus additional prompts you want to add. If everything looks right, I click continue, choose a theme, and then you get to this amazing experience where the AI is actually creating the whole slide deck in front of you, live: it drafts the full content and also inserts images. You can see the content here is not just simple placeholders; it actually writes content based on the instructions it was given, and in just ten seconds a beautifully designed slide deck is created. If I want to make a change, I can either do it manually or click on this "edit with AI" button, where they have a copilot built in: I select the slide I want to change and say "I want to turn this into a timeline view and add a bit more details," and boom, it becomes a timeline view that is beautifully designed as well. You might also want to update some images; I can click on an image, click edit, and change the prompt, or just search across the web. The same goes for the text: if the text here is a bit too technical, I can say "rewrite this slide content for a 5-year-old," and now all the content is much easier to understand, with a lot of analogies. So I think Gamma sets a really good example of what an AI-native
experience should look like, and I definitely recommend you go and check it out; you can click on the link below to try Gamma for free. Now let's dive into how we can implement long-term memory in your agents. The good thing is, it's actually very easy: there are many different ways to implement long-term memory in your agent in less than ten minutes, including some open-source options, and I'm going to quickly show you. There are hosted solutions like Zep, where you can basically use their API and Python SDK to store the long-term memory; they really optimize for speed and memory indexing to improve retrieval accuracy, but there are costs that come with using Zep as your volume gets bigger and bigger. On the other hand, there's a video from Deploying AI, which is another AI YouTube channel, where he shows in great detail how to create a memory agent from scratch, one that keeps its own long-term memory of the user's preferences; I definitely recommend watching it if you want to implement this from scratch. But if you're already using a framework like Autogen, there's an extremely easy way to add long-term memory to your agent, because they introduced the concept of teachable agents, which is a very similar setup to what we've discussed so far. Basically, it's a special ability you can add to any Autogen agent, and there's a text analyzer agent that reviews the conversation, extrapolates knowledge, and saves it to the vector database. As I mentioned, the setup is extremely easy; I'm going to show you how to add long-term memory to your Autogen agent in just five minutes. I open Visual Studio Code and add a new file called OAI_CONFIG_LIST, and inside I just paste in the GPT-4 model. If you're using Autogen and you have a list of different models, you can save your model configuration here to be reused. Then I also create a file called .env; this is where we save environment variables like the OpenAI API key, so I put the OpenAI API key here, and I also add another one, TOKENIZERS_PARALLELISM, set to false; this is something we'll use later with the teachable agent to remove some warning noise. After doing this, I create another file called app.py. First, make sure you install the teachable extra, the new ability you can add to your agent for long-term memory, so I run pip install "pyautogen[teachable]", and after it finishes I can close this. First we import a few different libraries from Autogen, and one of them is this one called Teachability, the new ability they added for long-term memory. If you're seeing something like this, where the editor shows an error for the package we're importing, that means you're probably on the wrong environment; on a Mac you can press Cmd+Shift+P, run "Python: Select Interpreter," and choose the right one (for me it's this /usr/local/bin/python3). After you choose the right environment, the warning will go away. I also call load_dotenv, which reads the environment variables we saved in the .env file, and next I load the large language model config so we can define the model for the agent to use. The next step is to create an agent called teachable_agent, and as you can see, this is just a normal ConversableAgent; teachability is an ability you can add to any agent. So I instantiate a Teachability object, the one we imported from this new Autogen library, with reset_db set to False, which means we'll reuse the existing knowledge database we created before, and then this is the path to the knowledge database. Then I add this teachability to the teachable agent we created before, and that's pretty much it; now we can add a user proxy agent and start the conversation. Okay, looks like we have an error, because by default it runs on Docker and we didn't have Docker running, so you can
either open Docker if you already have it installed, or go back to the .env file, set AUTOGEN_USE_DOCKER to false, save it, and run python app.py again. Okay, great. You can see it asks for my information, and it also created a new folder called tmp, inside which a Chroma vector database has been created; this is where it stores the long-term memory. So I say "Jason, I don't eat fish," and as you can see, a database entry has actually been created; it says "I have noticed that you don't eat fish. I will remember this information for future interactions, especially during meal suggestions." If I exit and run python app.py again, this time I ask "give me the meal plan for the next week," and you can see it generates a meal plan for the whole week, and I double-checked: no fish is involved in this meal plan, which is pretty good. I can even ask "why is there no fish?" It remembers my name, even though I didn't mention my name in this conversation at all, and it says "I didn't include any fish in the meal plan because you previously mentioned that you don't eat fish." So here we go: we got an agent working in just five minutes that actually remembers your preferences and can learn from past interactions. We can learn more about how it works exactly by Cmd+clicking on Teachability. You can see there's one class called Teachability, and if you scroll down there are a few different components. If you Cmd+click MemoStore, you'll see that MemoStore is a class created for the agent to interact with the memory knowledge base; it packs a few different functions for updating, retrieving, and creating the database, and by default it uses Chroma. If you want to use a different type of DB, you can create a subclass of this store. If you scroll down, you can also see that it creates a text analyzer agent; if you Cmd+click on that, you'll see that this text analyzer is a subclass of ConversableAgent: the user gives it a text to analyze plus instructions, and it follows the instructions and gives the result back. That's pretty much all it does: given some text and an instruction, it analyzes the text and returns the result. So I close this and go back to Teachability, and you can see that it also adds a new system prompt to the agent saying you've been given the special ability to remember user teachings from prior conversations. There's one function for memo storage; this function is triggered at the end of each agent session, where every single message is passed, line by line, to decide whether it contains any interesting information that should be stored in the knowledge database. It first sends the message to the text analyzer and asks: does any part of the text ask the agent to perform a task or solve a problem? Just return yes or no. If the answer is yes, it then asks the text analyzer to briefly copy any advice from the text that may be useful for a similar but different task in the future, and to just return "no" if no advice is present. If the response is not "no," meaning there actually is advice, it asks the text analyzer again to briefly copy just the task from the text, then stop, without solving it and without including any advice. So this extracts the actual task, and it also asks the text analyzer to figure out what type of task it is, because in the end it wants to store task-advice (or problem-solution) pairs in the vector database; this step generates the title or name of the task itself. Then it saves this pair to our database. On the other hand, it also checks whether there are any facts or information that should be committed to the database, doing something very similar with a different prompt: this time it asks whether the text contains any information that could be committed to memory, answering with just one word, yes or no, and if yes: imagine that the user forgot this information in the text,
how would you ask for this information? This basically comes up with the question we're going to store in the database, and then it asks the text analyzer to generate the answer itself, and in the end saves this question-answer pair into our database. So that's storage, and this function is pretty heavy, as you can see, because it calls the text analyzer agent multiple times, but it only runs once, at the end of the conversation. And there's another function that is called every time, to check whether any information needs to be retrieved and sent to the agent as part of the context. Again, it sends the query to the text analyzer to check whether the user query asks the agent to perform a task or solve a problem; if yes, it tries to extract the actual task, and, quite interestingly, it's asked to come up with a generalized task title, because we're going to do a knowledge retrieval, and a more generalized task title can lead to more accurate retrieval results. Then it calls retrieve_relevant_memos and appends the results after the user query as part of the context. So this is basically how it achieves teachability with long-term memory. It has a few quite interesting abstractions for how to do the retrieval and how to decide whether new knowledge should be saved, and if you're interested, you can even customize this class a little based on your own needs; for example, as we mentioned before, you can do a lot of optimization to reduce latency and save cost. But this is how easily you can add long-term memory to your agent today. As you can see, I definitely think long-term memory is an ability that more agent builders should start including in their agent stack, and it will be very interesting to see whether it unlocks any new interesting use cases. I'm going to continue posting the interesting AI experiments I'm doing, especially with AI agents, so please comment and subscribe if you want to get updates. Thank you, and I'll see you next time.
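The storage-and-retrieval pipeline walked through in the video can be condensed into a few functions. This is an illustrative sketch, not the actual Teachability implementation: the gate function stands in for the cheap/fast-model check, the extraction step stands in for an LLM summarization call, and keyword matching stands in for a vector search.

```python
# Condensed sketch of the long-term-memory pipeline described in the video:
# a "knowledge agent" decides what to store after each message, and a
# retrieval step prepends relevant knowledge to new queries, RAG-style.

knowledge_base: list[dict] = []

def looks_interesting(message: str) -> bool:
    # Gate stand-in for the cheap/fast model: "anything worth saving?"
    return any(w in message.lower() for w in ("don't", "prefer", "always", "never"))

def create_knowledge(knowledge_type: str, detail: str) -> None:
    # In production this would write to a vector database instead of a list.
    knowledge_base.append({"type": knowledge_type, "detail": detail})

def retrieve(query: str) -> list[str]:
    # Stand-in for a vector search: match on words from the knowledge type.
    q = query.lower()
    return [k["detail"] for k in knowledge_base
            if any(word in q for word in k["type"].split())]

# 1) The knowledge agent watches the conversation:
user_msg = "My name is Jason and I don't eat fish"
if looks_interesting(user_msg):
    # Extraction stand-in: a real system would use an LLM to distill this
    # into a (type, detail) pair rather than hard-coding it.
    create_knowledge("meal preference", "the user does not eat fish")

# 2) A later query gets the relevant knowledge appended, RAG-style:
query = "Prepare a meal plan for me for next week"
context = retrieve(query)
prompt = query
if context:
    prompt += "\n\nPast context: " + "; ".join(context)
```

Unrelated queries retrieve nothing, so the gate and the matching step together keep latency and prompt size down, which is exactly the optimization concern raised earlier in the transcript.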