Build Anything with Llama 3.1 Agents, Here’s How
Build Anything with Llama 3.1 Agents: A Comprehensive Guide
Unlock the potential of AI with the new Llama 3.1 model, the revolutionary tool that is leveling the playing field in technology. Whether you’re a tech enthusiast with no prior programming knowledge or a professional looking to streamline processes, this guide will walk you through the steps to harness the capabilities of Llama 3.1 for building sophisticated AI agents. By the end of this guide, you’ll also learn how to run these models locally, keeping your data private and reducing dependency on paid APIs.
Introduction to Llama 3.1
The release of Llama 3.1 marks a significant milestone in the field of artificial intelligence. This model, available in its largest iteration of 405 billion parameters, stands out by being open source while matching the prowess of top-tier, closed-source alternatives. For the first time, the power of high-quality AI tools is accessible to everyone, offering the same competitive edge previously reserved for those with access to closed-source technology.
Getting Started: No Installation or Programming Experience Needed
Starting with AI building might seem daunting, but Llama 3.1 simplifies the process. The first step involves setting up a workspace that requires no prior setup or installation. This can be done using a platform like Google’s free Colab tool, which allows you to write and execute Python code through your browser.
How to Set Up Your AI Building Environment
- Navigate to Google Colab: Create a new notebook on Colab, which lets you run Python without any complex installation.
- Accessing Llama 3.1: Through platforms like Groq and Perplexity, you can access various Llama 3.1 models. Even models that demand substantial computational power, such as the 405-billion-parameter version, are at your fingertips through these tools.
Utilizing AI-Driven Code Assistants
Even if you lack programming expertise, AI-driven tools can generate the necessary code for you. Simply paste documentation or other technical specs into these tools, and they will provide you with workable scripts to run in your Colab environment.
Key Features and Capabilities of Llama 3.1
Llama 3.1 offers a range of features that can significantly enhance your project development:
- High-Speed Performance: Experience blazing speeds of up to 1,233 tokens per second with platforms like Groq, significantly faster than many competitors.
- Diverse Applications: From basic queries to complex problem-solving, Llama 3.1 adapts to a vast array of tasks.
- Privacy and Cost Efficiency: Running models locally on your machine means more control over your data and no ongoing costs for API usage.
Building Your First AI Agent
Understanding the documentation is crucial. The developers’ section in tools like Groq contains comprehensive guides and the API keys necessary for connection and operation. Here’s how to kickstart your project:
- Generate an API Key: Create a new API key in the Groq developer portal to begin making API calls.
- Install and Import Libraries: Use Python commands (e.g., `pip install groq`) to set up your environment in Google Colab.
- Implementation Using Groq: Leverage the fast, efficient, and cost-free features of Groq to implement your AI functionalities.
- Error Troubleshooting: If errors occur, AI-driven tools can help diagnose and solve these issues, guiding you through code adjustments and providing real-time fixes.
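The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the exact notebook from the video: the model identifier and the environment-variable key handling are assumptions, so check Groq’s model list and your own key setup before running it.

```python
# Minimal sketch of the flow above: after `pip install groq`,
# assemble a chat request and (if a key is configured) call the API.
import os


def build_chat_request(model: str, system: str, user: str) -> dict:
    """Assemble the keyword arguments for a chat completion call."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    }


request = build_chat_request(
    model="llama-3.1-70b-versatile",  # assumed model id; check Groq's docs
    system="You are a helpful assistant. Answer briefly.",
    user="Explain what an AI agent is in one sentence.",
)

# Only reach out to the API when a key is actually configured.
if os.environ.get("GROQ_API_KEY"):
    from groq import Groq

    client = Groq(api_key=os.environ["GROQ_API_KEY"])
    completion = client.chat.completions.create(**request)
    print(completion.choices[0].message.content)
```

Keeping the request-building separate from the network call makes it easy to swap in a different model name or provider later without touching the rest of the cell.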
Expanding the Functionality
Shift from basic models to more advanced versions such as Llama 3.1 405B. While some platforms may not initially support the newest models due to their recency, alternative APIs and additional configuration will let you leverage these powerful tools as they become available.
Running Llama 3.1 Locally
By the guide’s conclusion, you will not only understand how to use Llama 3.1 in cloud-based settings but also how to set it up locally. This allows full control over the models while avoiding extra costs:
- Downloading the Model: Download Llama 3.1 directly to your machine from the official repository.
- Local Setup: Install and configure the model on your local server or PC, ensuring you meet the hardware requirements, especially for high-parameter versions.
- Execution and Management: Run the model locally, manage its operations, and utilize its capabilities for personal or professional projects without relying on third-party API providers.
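One common route for the local setup above is Ollama, which downloads and serves Llama 3.1 variants on your own machine. The sketch below assumes the `ollama` Python package and its `llama3.1` model tags; the RAM thresholds are rough rules of thumb, not official requirements, so verify them against the model library and your hardware.

```python
# Sketch of choosing a local Llama 3.1 variant. With Ollama installed,
# `ollama pull llama3.1` fetches the 8B model and the `ollama` Python
# package talks to the local server. Tags and RAM thresholds below are
# assumptions for illustration only.


def pick_local_tag(ram_gb: int) -> str:
    """Pick a roughly appropriate Llama 3.1 variant for available RAM."""
    if ram_gb >= 48:
        return "llama3.1:70b"   # workstation-class machines
    if ram_gb >= 8:
        return "llama3.1:8b"    # typical consumer hardware
    raise ValueError("not enough memory for a local Llama 3.1 model")


tag = pick_local_tag(16)
print(tag)

# Uncomment once Ollama is running and the model is pulled:
# import ollama
# reply = ollama.chat(model=tag, messages=[
#     {"role": "user", "content": "Say hello in five words."},
# ])
# print(reply["message"]["content"])
```

Because the local server speaks a simple chat API, the rest of your agent code can stay largely the same whether it targets a cloud provider or your own machine.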
Conclusion: Welcome to the Future of AI
The introduction of Llama 3.1 opens a world of possibilities for developers, tech enthusiasts, and businesses. By following this guide, you can start utilizing this advanced AI model to create intelligent agents tailored to your needs, enhancing both personal and professional endeavors. Remember, the future of AI is not just for programmers but for anyone willing to explore its potential. Join the AI revolution today and start building with Llama 3.1.
[h3]Watch this video for the full details:[/h3]
Learn how to make $$ with AI Agents: https://www.skool.com/new-society
My Google Colab: https://colab.research.google.com/drive/1o1GTlZyXw77NamwZAp2hFPvUaafwAGTP?usp=sharing
Groq API: https://console.groq.com/keys
Together.ai API: https://api.together.ai/settings/api-keys
I’M HIRING! Do you want to join my team? Apply here: https://forms.gle/2iz4xmFvDCGnj2iZA
Follow me on Instagram – https://www.instagram.com/thedavit
Follow me on Twitter – https://x.com/DavidOndrej1
Please Subscribe.
In this video I teach you how to build AI Agents with the new Llama 3.1 models, including the 405B version!
[h3]Transcript[/h3]
my name is David Andre and in this video I’ll show you how to build AI agents with the new Llama 3.1 model including the biggest 405 billion version you can do this even if you have a bad computer and even if you know nothing about programming and at the end I’ll show you how to run the Llama 3.1 models locally so you don’t have to pay for any API providers or share your private data with anybody now the reason why Llama 3.1 is such a big deal is that it’s the first time when an open source model is just as good as all the closed source models the future is happening right in front of our eyes if you aren’t doing anything about it if you aren’t building AI agents you’re falling behind so I urge you to take the first step and make something to make this as easy as possible for you I’ve designed an entire workshop that will teach you step by step how to build your first AI agent which is now available inside of my community if that sounds interesting make sure to join now the link is in the description to build our AI agents we’ll be using Groq if you’ve never used Groq it’s basically a free AI tool that gives you super fast inference and check this out insane speed 1,233 tokens per second way faster than ChatGPT unfortunately Groq does not offer the 405 billion version just yet but what does offer 405 billion is Perplexity so if you have Perplexity Pro you can go into the settings AI model and you can select Llama 3.1 405B and actually we’re going to be using it to help us build the agents you don’t even have to be a programmer nowadays you can just use the AI tools to write the code for you first off go to this domain colab.research.
google.com and create a new notebook this is a free tool from Google that lets you run Python without having to install anything so how do we get started right the secret is to use documentation documentation is your friend so when we’re in Groq at the bottom you can see Groq for developers click on that and this will bring you to the playground and on the left we can see documentation there is everything you need to know about how to run the models you don’t have to remember any of it just use the documentation okay so let’s follow the steps right first we need to create an API key super simple the best part about Groq is that it’s completely free push through the friction and start building the agents now because this is the best time to get into AI agents all right so let’s create a new API key let’s name it uh subscribe if you’re watching this please subscribe and let’s copy this okay let’s go back to our Google Colab and I’m going to add it as a comment the only two things you have to understand about Colab is code cells and text cells so what I did I added a new code cell so click on that beautiful so we created the API key as they told us let’s go back to documentation and see what’s the next step okay we don’t have to do this actually because we’re using Google Colab so click on Python and then we have to do pip install groq now say you don’t know anything right what you have to do is simply take a screenshot and I’m going to paste it into Perplexity help me implement this in Google Colab this is the beauty you can use the AI models to do the programming for you you don’t have to know how to use Google Colab you don’t have to know how to use any of the APIs just pull up the documentation feed it into an AI model and have it do the coding for you so all I’m going to do is click on this copy button and go back to Colab and simply paste this and run this line of code by the way you have to make sure that you see this green check mark next to each of your code
cells that way you know it executed successfully beautiful in 6 seconds we installed groq so now we can work with it let’s go back to the API and see what the next step is so we need to import Groq let’s do that and I’m just splitting this into individual code cells so that it’s more organized and easier for you to see now we have to create the client right so let’s copy this part of code and here we set the API key but since we are not going to use an environment variable which by the way you probably should but I’m just making it as simple as possible for you I’m just going to set it directly right so remember above this is our API key for Groq let’s copy this and put it right here paste this and then run this cell now this is the part where we actually call the API to get the chat completion so let’s copy this part and just make sure that this matches up with this if you try to rename it for example if you do like Groq client right you have to rename everything all the references of that variable and you also have to rerun the cells anytime you change something so let’s run this and we should hopefully fingers crossed see the AI completion which means that we’re successfully communicating with the Groq API now in this case we’re using this model the Llama 3 8 billion which is the old one and let’s be honest we don’t want to use the old model we want the new model so how do we do this well fortunately everything is given to us in the documentation so let’s go back and simply we have to click on models and as you can see we have all the supported models that Groq offers us there is quite a bit actually um so right now we’re using this one I think we want to use the 3.1 new models right so we have the 405B and then the 70B so let’s create a variable for this I’m just going to do it above create a new coding cell above this one and I’m going to do llama 70b equals and then we have to pass the exact API name that Groq gives us in the documentation I’m going to
copy this exactly paste it in here and then I’m going to create a second variable for the 405 right so llama 405b Groq uh naming so execute the cell and now we can remove this one because we don’t want the 8B and we can just call the variable directly so let’s copy this and boom and now we’re going to be using the Llama 3.1 70B model which is the second best open source model in the world right now and as you can see we got the response but when we try to run the 405B you’ll see something curious happens and I think this is because it literally released a few days ago we get an error and to be honest uh as you can see the model does not exist or you don’t have access to it I just think Groq is probably still working on implementing it because um it’s so new so there is a high chance that when you watch this you can use the 405B but in a second I’ll show you how to actually use the 405B right now even when Groq doesn’t have it implemented so let’s just switch back to 70B that way it works again if you’re confused at any step what you should absolutely do is take a screenshot go into Perplexity and you can just say uh control V paste it in and explain what this code does so many of you are still not using the AI tools to their fullest potential which is a massive mistake you can build anything just by using AI tools all you have to do is have an idea and be willing to go through some discomfort push through some friction to actually build it because you can you don’t have to do any of the coding like oh yeah if you’re trying to do something super advanced some very complicated project then yeah AI models cannot do everything for you but for simple and medium difficulty applications AI models can do all the coding absolutely so as you can see it explains exactly what the code does so even if you didn’t understand it or if something is confusing take a screenshot paste it into ChatGPT or Perplexity doesn’t matter and have it explain it to you okay so now that we’re
getting Groq completions let me show you how to actually get the 405 billion model so to do that we have to go to um llama’s well we don’t have to do this but I’m just going to show you this is the official llama.com article which tells you all the stuff about the new model right everything you need to know possibly the differences you know how to download the models key capabilities making llama your own how to fine tune it which by the way that video is coming soon a lot of you want to know how to fine tune the Llama 3.1 model especially the biggest 405 billion version stay tuned because that video is coming soon so make sure to subscribe so you don’t miss it okay now here is the quick start with the partners you can see everybody who’s um running this model all the cloud providers right we can see that Groq is in here real time inference but uh what we need to do is scroll even further and see the pricing right here model pricing so this is where if you’re deciding where to run it because you know running it locally the 405 billion version it’s kind of challenging at the end I’ll show you how to run the 8 billion version and 70 billion version locally because those you actually can run on consumer hardware 405B you need to have like multiple GPUs and some pretty solid setup anyways this is where we can run the 405B you can choose any of these to be honest but I’m going to be using together.ai so let’s click on this and this is very simple because they give you $5 of API credits for free so again you don’t have to buy anything you don’t have to even enter your credit card you can just use together.
ai to run the 405 billion version so what we should do is click on products and then inference and simply uh click on start building now and this will urge you to log in again you can just log in with your Google takes 10 seconds okay I already logged in so we can see this and on the right this is the playground on the right we have the version of the model so naturally we want the 405 billion version okay so let’s click on that and then we can send it a test message like um who are you say something simple right I’m an artificial model known as llama okay so it is responding beautiful but right here you can click on API and again you don’t have to know how to integrate the APIs because all of them tell you how to do it so when I click on API boom Python and I have all of this stuff laid out for me literally couldn’t be any simpler so let me just copy this from together import Together and I’m going to create a new code cell right here and when I do this we’re going to get an error and say you don’t know anything like oh my God no module named together some of you probably are picking up on what the issue is but let’s assume I don’t know anything and I did this on purpose by the way just take a screenshot and go back to Perplexity or whatever and say okay tell me how to fix this error and see if Perplexity can figure this out right so it’s analyzing the image to fix this error you need to install the together package pip install together boom so we simply run pip install together just as we did pip install groq earlier so let’s go above this code cell and do a new code cell and we have to install the together package by the way quick tip if you don’t want all the you know stuff uh all the text coming from it you can do dash Q which makes it quiet it makes the installation quiet so it doesn’t you know fill up the entire Google Colab with nonsense and then you can actually rerun this code cell as well which will
execute successfully beautiful so we installed the together package and then we import it let’s follow their instructions right next we have to set up a client so let’s copy this line of code you might be noticing patterns with um Groq it’s literally the same thing I’m going to rename it to together client because we’re using multiple AI providers API providers so I just want to have it organized you know that way there’s no chaos and as you can see it’s the same exact thing they want to set it as an environment variable I’m just going to set it directly as a string to make it super easy for you guys so we need to get the API key if we don’t have the API key guess what um we should see it somewhere here let’s see uh settings and by the way this is in real time so you can see me figure this out and beautiful on the left we see API keys so click on that and you can do regenerate API key boom and this is a new one so let’s copy that copy this button API key copied let’s go back and set it directly so as you can see this is like I’m not familiar with together AI and I was able to figure it out in 3 seconds by going to the settings and finding where the API keys are so let’s go back to the playground’s chat now unfortunately this reset so let’s again select 405 billion and no actually this is good because we can set the parameters right so output length let’s do like uh 1,000 why not temperature we want to do whoa up to two no no no let’s do low 0.1 is good top P let’s do like one here and you don’t have to understand these three but basically temperature is how random the model is so I want it to be pretty consistent here and output length is how many tokens it should output so this is like plenty we can even lower it to like 700 I guess and this actually updates it in real time in the API so as you can see right here we see all of the stuff so when we send a message in the UI let’s say teach me um the basics of machine learning answer in short okay it’s responding
and as you can see it’s slower than Groq because this is the 405 billion model so as you can see this is 60 tokens per second earlier we had like 1,000 tokens per second which is just insane so there’s always a trade-off when you use the smaller models you get faster responses and they’re cheaper to run but they’re not as smart as the large models which obviously um are more expensive and slower okay let’s go back to API and now as you can see we have updated um this part the response part okay so by the way uh we set the API key so let’s just continue with what they’re telling us let me close this part you can close this as well okay so I’m going to copy this long part of code and again I’m not even attempting to remember all the API syntax I’m just copying what they tell me this is the beauty of documentation it is literally your friend let’s go back paste it and this long part is just the response right this is the response from the assistant so we don’t need this we can delete this part because we want the assistant to respond anew now actually what we can do is we can set a system prompt so as you can see this is the user prompt actually let me undo this so this is user this is the prompt you normally send right as you can see this was my prompt in together when I asked it something and this is assistant which means the response of the model but there’s also a system prompt so let me show you how to do that you can do role system so we copy this it’s the same syntax so I’m just going to copy this part do comma paste it below and then we change here to system and this is our system prompt right now so here you can say you are a helpful assistant always answer in short use bullet points whatever you want your system prompt to be just set it in here and now when we run this fingers crossed we should get the 405 billion model responding and we don’t let’s see got an unexpected keyword argument top K why is this unexpected okay let’s
comment this out let’s see what happens uh I don’t know what’s going on here we’re getting some errors role system did I set the system prompt wrong hm role system content completions oh yeah basic mistake very basic mistake look at this together client I renamed the client and then I didn’t use the new name that’s embarrassing but hey at least it shows you that you can make errors easily so we have to make sure that we use the same name of the client so let’s run these two cells and now it should work it does not work why not generator object has no choices did I copy it correctly from here choices zero message content seems like I did or wait did we update the API key maybe I’m using the wrong API key actually settings API keys no this is the correct one right yeah this is the correct one hm let’s try to debug this in real time by the way Google Colab has one interesting uh button explain error you can use actually Gemini which is Google’s AI model to explain the error you’re getting so let’s see if this solves it and again uh response stream true parameter okay maybe we cannot stream it so let’s try to comment this out and let’s see if this works and it works okay so apparently we cannot stream the response which means seeing the tokens flow you know step by step we just need to wait until the entire message is generated so thank you Gemini for helping me solve this and again don’t get stuck just use AI tools to solve errors and keep pushing forward you really have to see yourself as the CEO or the manager with AI agents and AI tools working for you you just have to direct them so again I think this is useful seeing that I get errors and how I solve them in real time using AI tools okay so we are using 405B we’re successfully getting responses from 405B just one thing you have to know is that it’s significantly more expensive than 70 billion now together AI gives us free uh $5 right in terms of
API credit so if I go into settings and then billing as you can see I still have $4.98 uh free so I didn’t even upgrade to the paid account which is great so far everything we’ve been using is literally free so anybody can do this but let’s go back and uh what I want to show you is the pricing so click on models and you can see that 70 billion is less than a dollar for 1 million tokens but 405 billion is $5 for 1 million tokens so obviously when you’re trying to use the biggest model it’s going to be a lot more expensive than the cheaper models so with together AI you can only generate 1 million tokens obviously that’s still plenty to test it out and use it for yourself okay so now that we’re successfully using the 405 billion model uh let’s implement some agents like let’s build this into agents right cuz right now they don’t have much agentic ability we’re just getting a response and then we’re getting a response again it’s not integrated into any teams of agents there is no one flow there is no one goal which is what we’re going to do now so what I want to build is I want to build a simple cold email writer so like say you want to start a new business say you want to make money with AI the first step is getting clients obviously cold emails and sending personalized Looms is one of the best ways if you don’t have a large audience if you don’t have money to run Facebook ads right you have to do cold outreach so we’re going to build a cold email agent that’s going to do market research on our company on the industry that we want to target and then write a personalized cold email for us so let’s do that right now so what I want to do is I want to use the Llama 405 billion the smartest model to make the decisions and then we’re going to use actually Perplexity to get the real-time information from the web so obviously um if you’re using just an LLM like for example here we’re using just the 405B it doesn’t have access to the internet so to
make your agents a lot more powerful to unlock so many different capabilities you want to give them real-time web access that should be one of the tools that they have right and Perplexity actually offers a really good API for this which by the way you also get $5 of credits when you have the pro version Perplexity is really a great deal to be honest so yeah you get the $5 here and then you get your API key right here so again we can just go to the docs right here you don’t have to remember any of this just use what they tell you click on get started how to get the API key blah blah blah and this is the code this is all you need so uh scroll down create a new code cell and let’s get user input first we have to install openai boom let’s follow exactly what they tell us client copy the client I’m just going to enter and split it onto separate lines so your API key right here and we need to get our Perplexity API key which again go into the settings if you have Perplexity Pro and then uh click on API and copy your API key right here let’s set a string it has to be a string and then inside of the string let’s run it and we’re getting the client oh I made a mistake accidentally deleted the second quotation mark OpenAI is not defined oops you have to import from openai it’s always better to just use the documentation and actually what I’m going to do is I’m going to rename this to openai client because we’re using multiple APIs just to have it organized just a quick tip in Perplexity if you don’t want it to search the web which is what we’ve been doing so far if you click on all it will search the web I’m just going to say find me the latest or find me the official OpenAI documentation let’s just do Perplexity why not right and it’s going to give us the link to the OpenAI docs boom let’s click it when we click on docs right here on top right docs we can see the developer quick start and since you know Perplexity told us in the docs let’s actually go back to the docs right here it told us
that it’s using the OpenAI compatible client so let’s go to OpenAI since we’re using the same syntax and let’s use their way of doing it right so this is much simpler what I’m going to do is I’m going to simply do messages and then replace this with the stuff I copied from the OpenAI documentation and again same mistake as earlier I renamed the client we have to rename the client and boom we’re getting a response the only problem is that we’re streaming the entire response I mean we don’t need to have it stream that’s the thing I’m just trying to be fancy so maybe we should stop being fancy and just um use the completion without streaming and now it’s generating okay beautiful as you can see this model is online so we can ask it about the latest news right for now we know that we have successful Perplexity integration and we have um 405 billion so let’s use this together right actually why not let me show you how to use Claude you know I’ve showed you already multiple different AI tools let’s describe the project right now okay or I’m going to show you a little hack you can do file you can do download and download it either as a notebook Jupyter notebook or a Python file so I’m going to do a Jupyter notebook and then I’m going to simply drag it into Claude give it access basically it sees my Google Colab you know I exported it it will see all of the cells I’m going to say um start by transcribing all of the cells in the attached file um Google Colab do not add any text simply transcribe the cells into separate code blocks boom let’s send this and it will analyze the file and as you can see uh yeah I was worried for a second that I forgot that our first cell is literally about the code so yeah right now it’s transcribing our entire project and this means it will have exact knowledge of what we’re doing beautiful so now when I’m trying to reference it it will not hallucinate as before because before I didn’t give uh Llama 3 my file
which was probably a mistake on my part as you can see like most of these mistakes like every single error we’ve run into was an error on my end not on the AI side so when something isn’t working you probably want to blame the AI stop blaming the AI see if you can write a better prompt or if you made a mistake yourself somewhere okay now that it has the knowledge let’s describe our actually I don’t want to rewrite it I’m just going to use my previous prompt right here let me copy this great now that you have complete knowledge of what my code looks like then I’m going to paste this boom boom boom okay I’m going to correct my mistake before when I tried to ask it to do everything at once and I’m just going to tell it to do one thing right so step one create a detailed okay create an outline of all the required steps we have to do before we have an MVP of this team of agents step two um give me step-by-step instructions for only the next step just choose one thing we should do and let’s focus exclusively on that you don’t have to correct typos and I’m going to do follow these steps in order and let’s send this prompt hopefully we get a better response because last time I tried to tell it to do everything at once and that was a mistake by the way this is probably going to be too advanced right error handling we don’t have to do that at all um why would we install a new notebook okay let me adjust my prompt right let’s work with this um Google Colab notebook no need to start a new one let’s build on top of what we already have see the code cells transcribed above this is a good thing that you can edit prompts because you see that you know the AI is responding in an incorrect way you can just edit it and simplify it right and adjust it in real time anyways let’s see focus on a single step create a function to get user input I don’t know why it’s doing these uh okay let me call it
out right stop trying to be perfectionistic this is just a simple demo no need to implement error handling and stuff like that obviously if you were doing it in production you would but right here I just want to make it simple the goal is to build a simple cold email writer team of agents that works remove all the unnecessary steps and let’s focus solely on the fundamentals it’s trying to be advanced it’s trying to fix everything at the same time so yeah let’s use this generate okay wait what generating an agent to generate search queries okay so it’s using Perplexity to do this uh it does not need to use Perplexity okay let me remind it of this code I’m going to copy this code boom we will not use Perplexity this is what I mean by being the manager you cannot let the model throw you around you have to be in charge we will not use Perplexity to generate the search queries that will be done by the 405B model which is um this piece of code we will only use the Perplexity API later to actually do the market research part and browse the web step one update the outline based on what I just said step two give me instructions for what I should do next you have to really be careful because the models can point you in the wrong direction you have to have a vision of what you want to build and then again you cannot go from a manager to the intern you have to stay the manager and like hold the frame hold the vision and be in charge of the project right so let’s see I apologize blah blah blah step two let’s focus on creating an agent to generate search queries right so this is going to be okay so what it basically did is it wrapped it into a function and again you don’t have to understand what this is because the AI is doing the coding but I’m just explaining it right so response choices blah blah blah content split okay generate each yeah let’s just copy this I’m going to give it a try let’s copy this let’s go at the
At the bottom of the notebook, I created a new code cell and pasted it in. As I guessed, it splits the model's reply on new lines: the assistant generates specific search queries for researching the target. Let's rename this to agent_01. It takes the target, which is the user input, so we call get_user_input, save the result, and pass it in. Time to test: agent_01(get_user_input()) with something like "digital artists in LA". And it worked, fast: "top digital artists in Los Angeles", "digital artist studios in LA", "hiring freelance artists Los Angeles", "digital art market trends and statistics". Not bad at all. Our first agent works: it generates the queries based on our user input.

The next step, I guess, is to ask Claude what the next step is. (By the way, it doesn't matter which chatbot you use: ChatGPT, Perplexity, Meta AI, Gemini, it's up to you.)
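To make this concrete, here is a rough sketch of what the query agent might look like in code. The client setup, model id, and helper names (parse_queries, get_user_input, query_agent) are my assumptions for illustration, not the notebook's exact code; the Together client is assumed to read TOGETHER_API_KEY from the environment.

```python
def parse_queries(raw: str) -> list[str]:
    """Split the model's reply into one query per non-empty line."""
    return [line.strip() for line in raw.splitlines() if line.strip()]


def get_user_input() -> str:
    return input("Describe the target industry/company: ")


def query_agent(target: str) -> list[str]:
    """Ask Llama 3.1 405B to write market-research search queries."""
    # Lazy import so the rest of the notebook runs without the package.
    from together import Together  # pip install together

    client = Together()  # assumes TOGETHER_API_KEY is set
    response = client.chat.completions.create(
        model="meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo",  # assumed id
        messages=[
            {"role": "system",
             "content": "You are an AI assistant that writes effective "
                        "search queries for market research. Output one "
                        "query per line and nothing else."},
            {"role": "user", "content": target},
        ],
    )
    return parse_queries(response.choices[0].message.content)
```

Calling `query_agent(get_user_input())` reproduces the agent_01 test from the walkthrough.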
Claude's suggestion: create a function to perform web searches using Perplexity. That's exactly what we want, so I copied the code and split it into two cells to keep things simple. The first cell defines the function that calls Perplexity and returns the response; the second is the example usage, which starts from an empty list and loops through the generated queries, which we should save into a variable. Pick whatever target you like. If you have a business in any industry, this applies to you, which is why I'm keeping it general. Or say you're trying to sell ready-made cold emails to somebody else: it doesn't matter, you can use this with any company or any business. It will help you write solid, personalized cold emails, or personalized Looms, which is what I actually recommend for getting your first client (more on that in the community).

Let's try "video editors in Singapore". The queries get saved into the variable; then we call the search function, which I renamed to web_search_agent (updating the references accordingly), appending each result to a search_results list. Printing search_results gives one long output, since it combines several searches; it's worth checking generated_queries before running so you know how many searches you're about to make (three, in this case). Quick tip: in Google Colab, if you just type the name of a variable in a cell, you can see what's stored inside it. So the first agent generates the queries from the user input, and the web search agent does the market research on them.
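The web_search_agent described above can be sketched like this, assuming Perplexity's OpenAI-compatible chat endpoint. The model id, environment variable name, and function name are my guesses, so check Perplexity's API docs for the current details.

```python
import os


def web_search_agent(queries: list[str]) -> list[str]:
    """Run each query through a Perplexity online model; collect answers."""
    # Lazy import: pip install openai; Perplexity exposes a compatible API.
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ["PERPLEXITY_API_KEY"],  # assumed env var
        base_url="https://api.perplexity.ai",
    )
    search_results = []
    for query in queries:
        response = client.chat.completions.create(
            model="llama-3.1-sonar-small-128k-online",  # assumed model id
            messages=[{"role": "user", "content": query}],
        )
        search_results.append(response.choices[0].message.content)
    return search_results
```

Then `search_results = web_search_agent(generated_queries)` mirrors the loop in the notebook's second cell.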
Maybe we can give the query agent more instructions, too. I turned its prompt into a multi-line string by adding two more quotation marks (triple quotes), so we can split it across multiple lines and keep it organized: "You are an AI assistant that writes effective search queries for market research. Create five specific search queries based on the given target industry. Here is the industry/company to perform market research on:", followed by the input wrapped in delimiters. Delimiters like this are just good prompting practice: a delimiter is a separate token, so it tells the AI that what follows is something different; the model sees it separately, the same idea as visually splitting text onto new lines. And let's be exact rather than leaving things up to interpretation. Query 1: typical pain points and problems faced by this avatar. Query 2: biggest companies in this industry. Query 3: how companies in this industry get clients or leads. Query 4: where to find companies in this industry online. Give it more instruction so it's not random, and it should output those four queries for any industry.

Running it again on "video editors in Singapore": "common challenges faced by freelance video editors in Singapore", and so on. These are much better searches, which is going to greatly improve our web search agent. Then I tried "real estate agents in Cape Town", and the queries came out far too long, so I tightened the prompt: "You are an assistant that writes concise search queries for market research. Create four short...". Now that I've specified short and concise, the queries come out much shorter and much better.

One more touch: since the web search agent loops through all the queries, let's print which one it's working on. I interrupted the run and added a simple print statement ("Searching for: ...") so we see the updates happening in real time.
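The refined prompt from the iterations above might end up roughly like this. The wording is a close paraphrase rather than a verbatim copy, and build_query_messages is a helper name I made up.

```python
# Multi-line system prompt: concise queries, and exactly four of them,
# so the output is not left up to the model's interpretation.
QUERY_SYSTEM_PROMPT = """You are an AI assistant that writes concise search queries for market research.
Create four short, specific search queries for the target industry/company given below.

Query 1: typical pain points / problems faced by this avatar
Query 2: biggest companies in this industry
Query 3: how companies in this industry get clients or leads
Query 4: where to find companies in this industry online

Output only the four queries, one per line."""


def build_query_messages(target: str) -> list[dict]:
    """Wrap the target in delimiters so the model reads it as data."""
    return [
        {"role": "system", "content": QUERY_SYSTEM_PROMPT},
        {"role": "user", "content": f"### TARGET ###\n{target}\n### END ###"},
    ]
```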
It's nice to know which part of the loop we're in and to see what the agent is doing: "how real estate agents in Cape Town find new clients", and so on. Those are all four of our search queries, and then we get the search results back as one big block of text.

So we have the search results; what's next? I told Claude "I've implemented both", pasted in both functions (it needs to see them, since I changed the naming), and asked what the next step is. We've made significant progress: we get four much more specialized search queries because I improved the prompting, Perplexity searches the web for each of them, and we collect the search results. Now we probably need an agent that takes that input and synthesizes an effective email from it.

Claude agreed: the next step is to create an agent called email_agent, the final component of our simple agent team. Its plan: create a function that takes the original target and the search results as input, and use the 405B model. I do want the 405B eventually, but wait, we can use Groq somewhere, right? Let's use Groq for this part, just to make it faster. I reminded Claude of the Groq syntax, pasted it in again, and said "update the instructions you just gave me based on this." It obliged: "You're right, let's use the Groq API for cold email generation; here's the updated code." So: a function that takes the original target and the search results as input and returns the generated email.
So email_agent takes two inputs: the original target and the search results we collected here. It combines the results into a single string (combined_results) and sends a chat completion through the Groq client. Again I split this into two cells. In the first we define the agent, using Llama 3.1 70B for faster generation, and pass the two parameters into the prompt via an f-string: "Write a cold email for {target} based on this research: {search_results}". This should probably be delimited too, so the research is clearly separated, but let's not be too perfectionistic here. All of this could obviously be improved a lot; I'm just trying to keep it simple so you can see what's possible with Llama 3.1, with Groq, and with AI agents. Once you add Perplexity, the number of agents you can build grows enormously, because you have real-time access to the web. And of course, feel free to adapt this to your own use case.

I should probably read the prompt, too: "You are an AI assistant that writes personalized cold emails based on market research. Create a concise and engaging cold email. Write a cold email for {target} based on this research: {search_results}." These prompts are very basic and could be optimized much further, but let's see if Llama 70B does well with them. Then, in the second cell: cold_email = email_agent(target, search_results). Wait, is target really saved as a variable? I don't think so; we haven't saved the user input yet.
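Here is a sketch of the email agent as described, on Groq's Llama 3.1 70B for speed. The function names, prompt wording, and model id are assumptions, and the Groq client is assumed to read GROQ_API_KEY from the environment.

```python
def combine_results(search_results: list[str]) -> str:
    """Merge the per-query research into one block for the prompt."""
    return "\n\n".join(search_results)


def email_agent(target: str, search_results: list[str]) -> str:
    """Write a cold email for `target` grounded in the research."""
    from groq import Groq  # pip install groq; assumes GROQ_API_KEY is set

    client = Groq()
    response = client.chat.completions.create(
        model="llama-3.1-70b-versatile",  # Groq's 70B id at the time (assumed)
        messages=[
            {"role": "system",
             "content": "You are an AI assistant that writes personalized "
                        "cold emails based on market research. Create a "
                        "concise and engaging cold email."},
            {"role": "user",
             "content": f"Write a cold email for: {target}\n\n"
                        f"Based on this research:\n{combine_results(search_results)}"},
        ],
    )
    return response.choices[0].message.content
```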
Luckily, get_user_input is wrapped in a function, so we can call it as many times as we like: user_input = get_user_input(). Let's change the example again, say, sauna studios in London. Then we call our first agent, which I'm renaming from agent_01 to query_agent, and save its output: generated_queries = query_agent(user_input). (Again, in Colab you can check what's stored in user_input just by typing the variable name on a new line.) Beautiful, these are our queries. Now we call web_search_agent with them, and off it goes, finding the pain points of sauna studios in London.

While that agent is working, look at the email step: it takes the target (which should really be user_input) and the search results, and generates the cold email. With Groq this should be fast, and it is, but the output opens with "Here's a concise, engaging..." before the actual "Boost your sauna studio" email, and that preamble should definitely not be in a cold email. So let's update this prompt too. I turned it into a multi-line string so we have more room to work with: "You are an expert cold email writer. Your task is to write short, concise, and personalized cold emails based on the market research given to you. Do not output any other text, only the cold email itself. Make sure to utilize all four areas of the research (pain points, biggest companies, how they get clients, where to find them online)." This is just simple prompt engineering.

How else can we improve the cold email? Focus on describing what the target avatar will get, add an appealing guarantee, and keep the email concise and in plain English, so it doesn't read like corporate-speak. Then: "Here is the target avatar: ..." and, on a new line, "Here is the market research: ...". This should be a much better prompt. After updating everything below it, I reran and got "seems like there's no industry". Whoa, what happened? Oh: I hadn't entered anything at the input prompt. Let's try again. I'm running out of ideas here, so let's do a construction company, and rather than Salt Lake City, let's pick a bigger city: Austin, Texas. Rerun: successful. The web search agent starts on the biggest pain points, and those results should be better too; it's all going through Perplexity, which handles it nicely. Once the last search results arrive, the cold email writer triggers. It should be fast, given we're using the 70B on Groq, though we are passing in a lot of tokens.

And we're still getting that extra line. That's crazy. "Only output the email itself, no other text." I guess I need to clarify this a hundred times. I reran those parts: the search side is working well, but the cold email still includes extra characters we do not want. I could spend many hours on just the prompt engineering, but this is the real issue: Llama 70B is simply not as good as the 405B. So maybe we should actually use the 405B for this.
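Before swapping models, it's worth seeing the whole flow we keep rerunning in one place. This sketch injects the three agents as plain callables (all names are mine), which also makes the wiring testable without any API keys:

```python
from typing import Callable


def run_pipeline(
    user_input: str,
    query_agent: Callable[[str], list[str]],
    web_search_agent: Callable[[list[str]], list[str]],
    email_agent: Callable[[str, list[str]], str],
) -> str:
    """User input -> search queries -> web research -> cold email."""
    generated_queries = query_agent(user_input)
    search_results = web_search_agent(generated_queries)
    return email_agent(user_input, search_results)
```

With the real agents plugged in, `run_pipeline("construction companies in Austin, Texas", query_agent, web_search_agent, email_agent)` would reproduce the manual cell-by-cell run.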
The 405B will obviously be better at instruction following than a 70-billion-parameter model; that's always the case when you move to smarter models. So let's copy the agent and switch it over to the Together API. The temperature should be fairly low here; why is it 0.7? Let's make it 0.1, save, and rerun. We don't need to rerun the whole pipeline, just the email agent cell. It will be a bit slower, but hopefully better.

Nice. The one remaining extra line comes from our own print statement, which we could honestly comment out or remove. The point is that the model is finally not adding any garbage sentences, and that's a good lesson: smaller models will not be as good at instruction following as bigger models, but bigger models are more expensive. No more "here's your email". We get the subject "Boost your construction business in Austin with proven solutions", then "Dear [owner's name], as a construction company owner in Austin, you're likely no stranger to the challenges of labor shortages...". It leads with the biggest issues, which clearly demonstrates understanding. But then: "Our team has worked with top construction companies in Austin, including...". That is straight-up a lie; apparently the model doesn't mind lying. What you should say instead is something like, "I've studied the top construction companies in Austin, like DPR and White Construction." Don't make claims that aren't true. Further down: "We're so confident in our approach that we guarantee a 20% increase in profitability within the first six months." A specific, measurable guarantee. Nice. This is a solid cold email.
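The model swap described above amounts to a couple of changed lines: point the email agent at Together's 405B endpoint and drop the temperature. The model id, constant names, and prompt wording are my assumptions:

```python
EMAIL_MODEL = "meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo"  # assumed id
EMAIL_TEMPERATURE = 0.1  # low: we want strict instruction following


def email_agent_405b(target: str, research: str) -> str:
    """Same email agent, but on the 405B for better instruction following."""
    from together import Together  # pip install together; TOGETHER_API_KEY

    response = Together().chat.completions.create(
        model=EMAIL_MODEL,
        temperature=EMAIL_TEMPERATURE,
        messages=[
            {"role": "system",
             "content": "You are an expert cold email writer. Output only "
                        "the email itself, no other text."},
            {"role": "user",
             "content": f"Here is the target avatar: {target}\n\n"
                        f"Here is the market research:\n{research}"},
        ],
    )
    return response.choices[0].message.content
```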
The email isn't perfect, and it could be improved a lot (personally I would make it more concise), but if you have no copywriting experience, this agent will help you tremendously.

The last step, as I promised at the start, is to show you how to run these models locally on your computer. Go to ollama.com, click Download, and select your operating system. I'm currently on Windows, but I also run it on macOS; it works fine either way. The download takes a few minutes. Here's the part that confuses a lot of people: Ollama doesn't have a user interface. You simply use it from the terminal. If you're not a programmer, you're probably wary of the terminal, but there's no reason to be. On Windows you can press Win+R and type cmd, or if you have conda, open the Anaconda Prompt; any terminal works. And if you're completely clueless, just ask Perplexity or any AI tool: "I'm on Windows, how do I open the terminal?" Don't let friction stop you. You'll get back several methods: the Start menu, the Run dialog (Win+R, then cmd), File Explorer, and so on. If you can write a simple prompt in plain English, you can build essentially anything; too many people use the slightest friction as an excuse not to do stuff.

Once you have a terminal open: first, install Ollama. Second, go to Models in the top right of ollama.com, click llama3.1, and select the version you want to run. Be careful here, because there are three versions: 8B (the smallest), 70B (the middle one), and 405B. I'm just going to be honest with you: none of you are running the 405B unless you have some crazy setup with sixteen 3090s or eight 4090s. And 99% of you are not going to run the 70B either; you need a lot of VRAM, or a Mac (which has shared memory), realistically at least a $5,000 machine. But most of you should be able to run the 8B model locally. Click the copy button next to the command; you don't need to remember anything, just copy what Ollama gives you, paste it into your terminal, and hit enter. The first time, it will say "pulling manifest" and start downloading the model; you can see the size in gigabytes (4.7 GB for the 8B), so expect 5 to 10 minutes. Once it's downloaded, which I've already done, you'll see the "Send a message" prompt, and you never have to download it again; from then on it just starts. Try something like "teach me about Python, answer in short." As you can see, this is much faster, because it's the smaller 8B version. I'm on Windows right now with 32 or 64 GB of RAM, and I can run the 8B model pretty comfortably on a 3070 Ti.
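The terminal steps above boil down to a couple of commands (model tags may change, so check the llama3.1 page on ollama.com for the current ones):

```shell
# After installing Ollama from ollama.com, pull and chat with the 8B model.
# The first run prints "pulling manifest" and downloads ~4.7 GB; later runs
# start immediately from the local copy.
ollama run llama3.1

# Only if you have serious VRAM (or a high-memory Mac): the 70B version.
ollama run llama3.1:70b
```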
On my MacBook I can even run the 70B, but that's because I have 128 GB of RAM. So just go with the 8B model, and if you have a really good computer, try the 70B. As you can see, we're using Ollama here, and this is Llama 3.1 running on my machine. If I pull up Task Manager and go to Performance, especially the GPU tab, you can see how it taxes my computer (well, OBS is recording, so the GPU is already under load). Let's send another message: "write a long explanation of something advanced," say, string theory in physics. Hit enter, and while the model generates, watch the GPU get blasted even more: 95%. If I weren't recording, I could probably get even faster responses, but that's the downside. The command prompt is using some RAM, but it's mostly GPU, and when generation finishes, the 3D utilization drops significantly.

So that's how to run Llama 3.1 locally: the 8B version for most people, the 70B if you have a really good computer, and for the 405B you'll need to host it somewhere. As we saw earlier, Meta has plenty of options across all the cloud providers, or you can just use Together AI as we did in our agent.

Now, if you want access to all of my agents, code, and prompts, everything I've built in the past and everything I'll build in the future, make sure to join the community. You'll get access to the AI Agents Workshop, which teaches you everything you need to know to build AI agents, and in the next four days we're releasing the Make Money with AI Agents training, which will teach you how to make your first $1,000 with AI. On top of that, in month two you get a one-on-one call with me. You get all of this just by joining the community, so make sure to do it before the end of the month, because after we release the Make Money with AI training, the price is going to double. Join now; the link is in the description. Thank you for watching.