Using a local model via Ollama
If you're happy using OpenAI, you can skip this section, but many people are interested in using models they run themselves. The easiest way to do this is via the great work of our friends at Ollama, who provide a simple-to-use client that will download, install and run a growing range of models for you.
Install Ollama
They provide a one-click installer for Mac, Linux and Windows on their home page.
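On Linux, the installer on the homepage is a one-line shell script; at the time of writing it looks like the command below, but check the Ollama homepage for the current version:
curl -fsSL https://ollama.com/install.sh | sh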
Pick and run a model
Since we're going to be doing agentic work, we'll need a very capable model, but the largest models are hard to run on a laptop. We think Mixtral 8x7b strikes a good balance between capability and resource requirements, though Llama 3 is another great option. You can run Mixtral with:
ollama run mixtral:8x7b
The first time you run this command, it will also automatically download and install the model for you.
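If you'd rather download the model ahead of time without starting an interactive session, you can pull it separately:
ollama pull mixtral:8x7b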
Switch the LLM in your code
There are two changes you need to make to the code we already wrote in 1_agent to get Mixtral 8x7b working. First, you need to switch to that model: replace the call to Settings.llm with this:
Settings.llm = new Ollama({
model: "mixtral:8x7b",
});
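By default this talks to the Ollama server running on your own machine at http://localhost:11434; if you're running Ollama on another host, check the Ollama class options for your version of LlamaIndex.TS to point it elsewhere.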
Swap to a ReActAgent
In our original code we used the OpenAI-specific OpenAIAgent, so we'll need to switch to a more generic agent pattern: the ReAct pattern. This is simple: change the const agent line in your code to read
const agent = new ReActAgent({ tools });
(You will also need to bring in Ollama and ReActAgent in your imports.)
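Putting the two changes together, the top of your agent.mjs will look roughly like this. This is a minimal sketch; the exact import paths may vary with your version of the llamaindex package:
import { Ollama, ReActAgent, Settings } from "llamaindex";

// Use the local Mixtral model served by Ollama instead of OpenAI
Settings.llm = new Ollama({
  model: "mixtral:8x7b",
});

// ...define your tools exactly as before, then:
const agent = new ReActAgent({ tools });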
Run your totally local agent
Because your embeddings were already local, your agent can now run entirely locally without making any API calls.
node agent.mjs
Note that a local model will probably run a lot slower than OpenAI's hosted models, so be prepared to wait a while!
Output
{
response: {
message: {
role: 'assistant',
content: ' Thought: I need to use a tool to add the numbers 101 and 303.\n' +
'Action: sumNumbers\n' +
'Action Input: {"a": 101, "b": 303}\n' +
'\n' +
'Observation: 404\n' +
'\n' +
'Thought: I can answer without using any more tools.\n' +
'Answer: The sum of 101 and 303 is 404.'
},
raw: {
model: 'mixtral:8x7b',
created_at: '2024-05-09T00:24:30.339473Z',
message: [Object],
done: true,
total_duration: 64678371209,
load_duration: 57394551334,
prompt_eval_count: 475,
prompt_eval_duration: 4163981000,
eval_count: 94,
eval_duration: 3116692000
}
},
sources: [Getter]
}
Tada! You can see all of this in the folder 1a_mixtral.
Extending to other examples
You can use a ReActAgent instead of an OpenAIAgent in any of the further examples below, but keep in mind that GPT-4 is a lot more capable than Mixtral 8x7b, so you may see more errors or failures in reasoning if you are using an entirely local setup.
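For instance, wherever a later example constructs an OpenAIAgent, the swap is the same one-liner (tools here is whatever tool list that example defines):
// Before (OpenAI-specific):
const agent = new OpenAIAgent({ tools });

// After (works with any Settings.llm, including Ollama):
const agent = new ReActAgent({ tools });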
Next steps
Now that you've got a local agent, you can add Retrieval-Augmented Generation to it.