I am not really focusing on this anymore. The official library came out a few months ago, and that works a lot better. I am sure more and more of this will break.
The simplest JavaScript library for the easiest way to run LLMs.
This is not the official library (we don't have one), but I am one of the maintainers of Ollama.
import { Ollama } from 'ollama-node';
const ollama = new Ollama();
await ollama.setModel("llama2");
// callback to print each word
const print = (word: string) => {
  process.stdout.write(word);
};
await ollama.streamingGenerate("why is the sky blue", print);
const ollama = new Ollama();
Creates the Ollama object. All methods are called from this object.
Hostname - defaults to 127.0.0.1
await ollama.setModel("llama2");
Sets the model to use for Generation. Unless you override anything, it will use the template, parameters, and system prompt from the Modelfile.
ollama.setSystemPrompt("You are an AI assistant.");
Sets the system prompt to use with this model. Overrides anything set in the Modelfile.
ollama.setTemplate("this is a template")
ollama.addParameter("stop", "User:")
ollama.deleteParameter("stop", "User:")
ollama.deleteParameterByName("stop");
Deletes all parameters with that name.
ollama.deleteAllParameters();
const params = ollama.showParameters();
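To show how the parameter methods relate to each other, here is a minimal in-memory sketch. It is not the library's code: `ParameterStore` and `Parameter` are hypothetical names, and the sketch assumes a name such as `stop` can hold several values at once, which is what "deletes all parameters with that name" suggests.

```typescript
// Hypothetical in-memory model of the parameter methods; not the library's code.
type Parameter = { name: string; value: string };

class ParameterStore {
  private params: Parameter[] = [];

  // Assumption: a name such as "stop" may be added more than once with different values.
  addParameter(name: string, value: string): void {
    this.params.push({ name, value });
  }

  // Removes only the entries that match both the name and the value.
  deleteParameter(name: string, value: string): void {
    this.params = this.params.filter((p) => !(p.name === name && p.value === value));
  }

  // Removes every entry with the given name, whatever its value.
  deleteParameterByName(name: string): void {
    this.params = this.params.filter((p) => p.name !== name);
  }

  // Clears everything.
  deleteAllParameters(): void {
    this.params = [];
  }

  // Returns a shallow copy of the current list.
  showParameters(): Parameter[] {
    return [...this.params];
  }
}

const store = new ParameterStore();
store.addParameter("stop", "User:");
store.addParameter("stop", "Assistant:");
store.addParameter("temperature", "0.7");

store.deleteParameter("stop", "User:");
const afterOne = store.showParameters(); // one "stop" entry and "temperature" remain

store.deleteParameterByName("stop");
const afterByName = store.showParameters(); // only "temperature" remains

store.deleteAllParameters();
const afterAll = store.showParameters(); // nothing remains
```

The difference to keep in mind is that `deleteParameter` is selective by value, while `deleteParameterByName` clears every value under that name.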
const sprompt = await ollama.showSystemPrompt()
Useful if you want to update the system prompt based on the existing one.
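As a rough sketch of that read-then-update pattern, the helper below fetches the current system prompt, appends one instruction, and writes it back. `appendToSystemPrompt`, `PromptClient`, and `FakeClient` are hypothetical names for illustration, and the sketch assumes `showSystemPrompt()` resolves to a plain string.

```typescript
// Minimal interface covering only the two calls used here.
// Assumption: showSystemPrompt() resolves to a string.
interface PromptClient {
  showSystemPrompt(): Promise<string>;
  setSystemPrompt(prompt: string): void;
}

// Hypothetical helper: reads the current system prompt and appends one more instruction.
async function appendToSystemPrompt(client: PromptClient, extra: string): Promise<string> {
  const current = (await client.showSystemPrompt()).trim();
  const updated = current.length > 0 ? `${current} ${extra}` : extra;
  client.setSystemPrompt(updated);
  return updated;
}

// In-memory stand-in so the sketch runs without an Ollama server.
class FakeClient implements PromptClient {
  private prompt: string;

  constructor(prompt: string) {
    this.prompt = prompt;
  }

  async showSystemPrompt(): Promise<string> {
    return this.prompt;
  }

  setSystemPrompt(prompt: string): void {
    this.prompt = prompt;
  }
}

const fake = new FakeClient("You are an AI assistant.");
// A promise for the updated prompt text.
const updatedPrompt = appendToSystemPrompt(fake, "Answer in one sentence.");
```

If the real `ollama` object matches this interface, it could be passed in place of `fake`.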
const template = ollama.showTemplate();
const model = ollama.showModel();
Shows the current model name
const info = await ollama.showModelInfo();
Returns parameters, template, system prompt for the current model.
const models = await ollama.listModels();
Lists all local models already downloaded.
const output = await ollama.generate("Why is the sky blue?");
This will run the generate command and return the output all at once. The output is an object with the output and the stats.
If you want the streaming version, see below.
This is a streaming version of generate, but you don't need to know anything about streaming. Just write a callback function that does what you want to happen.
const printword = (word: string) => {
  process.stdout.write(word);
};
// callback to print each full response on its own line
const printline = (line: string) => {
  console.log(line);
};
await ollama.streamingGenerate("why is the sky blue", printword, null, printline);
There are four potential callbacks, all of which are optional, though their positions matter. Use null if you want a later one and not an earlier one.
The callbacks are:
- responseOutput: outputs just the token in the response.
- contextOutput: outputs the context at the end.
- fullResponseOutput: outputs the full response object.
- statsOutput: outputs the stats object at the end.
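To make the positional rule concrete, here is a small self-contained sketch of how four optional callbacks in fixed positions could be dispatched. It is not the library's implementation: `fakeStream`, `Chunk`, and the field names `response`, `done`, `context`, and `stats` are hypothetical stand-ins, and the chunks are hard-coded rather than coming from a model.

```typescript
// Hypothetical chunk shape, for illustration only.
type Stats = { evalCount: number };
type Chunk = {
  response: string;
  done: boolean;
  context?: number[];
  stats?: Stats;
};

type Callback<T> = ((value: T) => void) | null;

// Stand-in for a streaming call: walks pre-made chunks and fires
// whichever callbacks were supplied, in the documented positions.
function fakeStream(
  chunks: Chunk[],
  responseOutput: Callback<string> = null,
  contextOutput: Callback<number[]> = null,
  fullResponseOutput: Callback<Chunk> = null,
  statsOutput: Callback<Stats> = null,
): void {
  for (const chunk of chunks) {
    if (fullResponseOutput) fullResponseOutput(chunk);
    if (!chunk.done) {
      // Intermediate chunk: pass along just the token.
      if (responseOutput) responseOutput(chunk.response);
      continue;
    }
    // Final chunk: context and stats are only reported at the end.
    if (contextOutput && chunk.context) contextOutput(chunk.context);
    if (statsOutput && chunk.stats) statsOutput(chunk.stats);
  }
}

const chunks: Chunk[] = [
  { response: "Ray", done: false },
  { response: "leigh", done: false },
  { response: "", done: true, context: [1, 2, 3], stats: { evalCount: 2 } },
];

const words: string[] = [];
const seenStats: Stats[] = [];

// Pass null for contextOutput and fullResponseOutput to reach statsOutput.
fakeStream(
  chunks,
  (w) => { words.push(w); },
  null,
  null,
  (s) => { seenStats.push(s); },
);
```

Because the callbacks are matched by position rather than by name, the two `null` placeholders are what let the fourth argument land on `statsOutput`.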
Other functions I need to document
- create
- streamingCreate
- streamingPull
- streamingPush
- copy
This is not in a finished state. It is absolutely a work in progress. I started putting this together and then later saw that someone had put out another library on npm. Bummer, but it's cool that other folks are excited about this.
I just want an easy way to consume this in some examples.
I'll flesh it out soon.