@empiricalrun/llm

Package to connect to LLM providers and trace LLM calls.

Usage

import { LLM } from "@empiricalrun/llm";

const llm = new LLM({
  provider: "openai",
  defaultModel: "gpt-4o",
});
const llmResponse = await llm.createChatCompletion({ ... });
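
The options for createChatCompletion are elided above. A minimal sketch of a call, continuing from the llm instance above; the OpenAI-style messages shape is an assumption, so check the package's type declarations for the exact options:

// Hypothetical request shape, assuming OpenAI-style chat parameters.
const response = await llm.createChatCompletion({
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Summarize this test run in one sentence." },
  ],
});
console.log(response);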

Vision utilities

This package also contains utilities for vision.

Query

Ask a question about the image (e.g. to extract some information or make a decision) and get the answer.

import { query } from "@empiricalrun/llm/vision";

// With Appium
const data = await driver.saveScreenshot("dummy.png");
const instruction =
  "Extract number of ATOM tokens from the image. Return only the number.";

const text = await query(data.toString("base64"), instruction);
// Example response: "0.01"
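
query only needs a base64-encoded image, so it also works outside Appium. For example, with a screenshot read from disk using Node's fs (the file name is illustrative):

import { readFileSync } from "node:fs";
import { query } from "@empiricalrun/llm/vision";

// Read any screenshot from disk and base64-encode it.
const image = readFileSync("screenshot.png").toString("base64");
const answer = await query(
  image,
  "Does the screen show a login form? Answer yes or no."
);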

Get bounding boxes

import { getBoundingBox } from "@empiricalrun/llm/vision/bbox";

// With Appium
const data = await driver.saveScreenshot("dummy.png");
// Describe the screen in one line, then the element that you want to find
const instruction =
  "This screenshot shows a screen to send crypto tokens. What is the bounding box for the dropdown to select the token?";

const bbox = await getBoundingBox(data.toString("base64"), instruction);
const centerToTap = bbox.center; // { x: 342, y: 450 }

// **Note**: These coordinates are relative to the image dimensions; actions like
// tap require scaling them to the device (Appium) coordinate space, as sketched below.
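
A rough sketch of that scaling, assuming a WebdriverIO-based Appium session; the screenshot pixel dimensions below are hypothetical placeholders you would read with your preferred image tooling:

// Continuing the Appium example above.
// Hypothetical pixel dimensions of the screenshot.
const imageWidth = 1170;
const imageHeight = 2532;

// Logical size of the device viewport reported by the driver.
const { width: deviceWidth, height: deviceHeight } = await driver.getWindowRect();

// Scale the bounding box center from image pixels to device coordinates.
const x = Math.round((bbox.center.x / imageWidth) * deviceWidth);
const y = Math.round((bbox.center.y / imageHeight) * deviceHeight);

// Tap at the scaled coordinates using W3C pointer actions (WebdriverIO action API).
await driver
  .action("pointer", { parameters: { pointerType: "touch" } })
  .move({ x, y })
  .down()
  .up()
  .perform();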

Getting a good bounding box can require some prompt iteration, and you can do that with the debug flag. This flag returns a base64 image with the bounding box drawn on top of the original image.

const bbox = await getBoundingBox(data.toString("base64"), instruction, {
  debug: true,
});
console.log(bbox.annotatedImage);
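
To inspect it locally, you can write the annotated image to disk (the file name is illustrative, and the .png extension assumes the image format):

import { writeFileSync } from "node:fs";

// Decode the base64 string and save it so you can open the annotated image.
writeFileSync("bbox-debug.png", Buffer.from(bbox.annotatedImage, "base64"));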

Install

npm i @empiricalrun/llm
