A command-line tool for evaluating and testing Model Context Protocol (MCP) servers. MCP-Evals helps you validate server capabilities, test tool calls, and ensure LLMs are correctly utilizing the provided tooling.
To install this package in your project (needed for tool evaluations), run `npm install @buildwithlayer/mcp-evals`.

To install the CLI globally, run `npm install -g @buildwithlayer/mcp-evals`.
MCP-Evals can be configured using command-line flags, a configuration file, or a combination of both (command-line flags will override configuration file properties).
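For example, assuming a `.mcp-evals.json` file (described below) already defines an SSE transport and URL, a flag passed on the command line takes precedence for that run (the port here is just an illustrative value):

```bash
# The --url flag overrides the url value from .mcp-evals.json for this invocation
mcp-evals connection --url http://localhost:4000/sse
```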
Create a `.mcp-evals.json` file in your project root:
```json
{
  "transport": "sse",
  "url": "http://localhost:3001/sse",
  "tool-evals-directory": "./path/to/evals"
}
```
| Option | Description | Example |
| --- | --- | --- |
| `transport` | Connection transport type (`sse`, `stdio`, or `streamableHTTP`) | `"sse"` |
| `url` | URL for SSE / Streamable HTTP connection | `"http://localhost:3001/sse"` |
| `command` | Command for STDIO connection | `"node ./server.js"` |
| `args` | Arguments for STDIO command | `["--port", "3000"]` |
| `env` | Environment variables for STDIO command | `{"NODE_ENV": "test"}` |
| `tool-evals-directory` | Directory containing tool evaluation files | `"./src/evals"` |
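Putting those options together, a STDIO-based configuration might look like the following sketch (the command, arguments, and environment values are illustrative):

```json
{
  "transport": "stdio",
  "command": "node ./server.js",
  "args": ["--port", "3000"],
  "env": {"NODE_ENV": "test"},
  "tool-evals-directory": "./src/evals"
}
```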
Create a `.env` file for sensitive configuration:

```
OPENAI_API_KEY=your_api_key_here
```
```bash
# Test connection to an MCP server
mcp-evals connection --sse --url http://localhost:3001/sse

# Using STDIO transport
mcp-evals connection --stdio --command node --args ./server.js

# List available tools
mcp-evals list-tools [--verbose]

# List available resources
mcp-evals list-resources [--verbose]

# List available prompts
mcp-evals list-prompts [--verbose]

# Test the connection and list tools, resources, and prompts (whichever are included in server capabilities)
mcp-evals all [--verbose]
```
| Flag | Description |
| --- | --- |
| `--config` | Path to JSON configuration file (defaults to `.mcp-evals.json` in root directory) |
| `--sse` | Connect using SSE transport |
| `--stdio` | Connect using STDIO transport |
| `--streamableHttp` | Connect using Streamable HTTP transport |
| `--url` | URL to connect to (for SSE or Streamable HTTP) |
| `--command` | Command to run (for STDIO) |
| `--args` | Arguments for the command (space-separated) |
| `--env` | Environment variables (format: `KEY=VALUE`) |
| `--verbose` | Enable verbose output |
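As a sketch, the STDIO-related flags above can be combined in a single invocation (the command, arguments, and environment values here are placeholders):

```bash
# Hypothetical STDIO run combining --command, --args, and --env
mcp-evals all --stdio --command node --args ./server.js --env NODE_ENV=test --verbose
```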
MCP-Evals includes a testing framework for validating LLM tool calls.
The testing SDK provides a fluent API for assertions:
| Method | Description |
| --- | --- |
| `expect(MCPAIClient)` | Create an expectation for testing |
| `.toCall(tool/toolName)` | Assert that a specific tool was called |
| `.withArguments(args, {exact})` | Assert the tool was called with specific arguments (deep partial match unless `exact` is set to `true`) |
| `.withArguments(matcherFn)` | Assert the tool was called with arguments satisfying the matcher function: `(args) => boolean` |
| `.withResult(result, {exact})` | Assert the tool call produced a specific result (deep partial match unless `exact` is set to `true`) |
| `.withResult(matcherFn)` | Assert the tool call produced a result satisfying the matcher function: `(result) => boolean` |
Create a TypeScript or JavaScript file (ending in `.tool-eval.ts` or `.tool-eval.js`) with exported functions:
```typescript
// filepath: src/examples/tool-evals/search.tool-eval.ts
import {mcpAi, expect} from '@buildwithlayer/mcp-evals'
import {openai} from '@ai-sdk/openai'

// Test that the "search" tool is called with the correct query and that
// the result contains a text content object including the word "climate"
async function searchToolEval() {
  // Model will default to gpt-4o if not provided
  await mcpAi.send('Find information about climate change', openai('gpt-3.5-turbo'))

  return expect(mcpAi)
    .toCall('search')
    .withArguments({query: 'climate change'}, {exact: true})
    .withResult((result) => result.content.some((c) => c.type === 'text' && c.text.includes('climate')))
}

// Export all evaluation functions
export default {
  searchToolEval,
}
```
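When an exact argument match is too strict, the matcher-function form of `.withArguments` can be used instead. Below is a minimal sketch along the same lines as the example above (the file name, tool name, and fields are illustrative):

```typescript
// filepath: src/examples/tool-evals/search-matcher.tool-eval.ts
import {mcpAi, expect} from '@buildwithlayer/mcp-evals'

async function searchArgumentsMatcherEval() {
  // Model defaults to gpt-4o when none is provided
  await mcpAi.send('Find information about climate change')

  return expect(mcpAi)
    .toCall('search')
    // Accept any arguments whose query mentions "climate"
    .withArguments((args) => typeof args.query === 'string' && args.query.includes('climate'))
}

export default {
  searchArgumentsMatcherEval,
}
```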
```bash
# Run evaluations from a specific file
mcp-evals run-tool-evals --evalsFile ./src/examples/tool-evals/add-tool-eval.ts

# Run all evaluation files within a specific directory
mcp-evals run-tool-evals --evalsDir ./src/examples/tool-evals

# Run all evaluations from the directory listed in your configuration file
mcp-evals run-tool-evals

# Exit with an error status code on the first failed eval
mcp-evals run-tool-evals --bail
```