A powerful Model Context Protocol (MCP) server that enables natural language interaction with GitHub repositories using Retrieval-Augmented Generation (RAG). This tool makes codebases conversational by leveraging AST parsing, semantic embeddings, and natural language interfaces.
- Multi-language Support: Process TypeScript, JavaScript, and Python codebases
- Flexible Embeddings: Choose between OpenAI, Hugging Face, or Xenova embeddings
- Seamless Integration: Works with Claude Desktop, Cursor, VS Code, and other MCP clients
- Smart Chunking: AST-powered semantic code chunking for better context
- Fast Search: Local FAISS index for quick semantic search
- Natural Q&A: Ask questions about your codebase in plain English
- Node.js (v14 or higher)
- Python 3.x (for Python code support)
- Git (for repository cloning)
- Optional: API keys for OpenAI or Hugging Face (required for their respective embeddings)
1. Create a `.env` file in the project root:

   ```bash
   cp .env.example .env
   ```

2. Edit the `.env` file with your API keys:

   ```bash
   # OpenAI API Key (required for OpenAI embeddings)
   OPENAI_API_KEY=your_openai_api_key_here

   # Hugging Face API Key (required for Hugging Face embeddings)
   HUGGINGFACE_API_KEY=your_huggingface_api_key_here
   ```

⚠️ Important: Never commit your `.env` file to version control. It's already in `.gitignore` to prevent accidental commits.
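Before starting the server, you can sanity-check which embedding backends your environment actually supports. This is a minimal illustrative sketch (the `available_backends` helper is hypothetical, not part of the package; the variable names match the `.env` keys above):

```python
import os


def available_backends() -> list[str]:
    """Return the embedding backends usable with the current environment.

    Xenova Transformers run locally and need no key; OpenAI and
    Hugging Face require their respective API keys to be set.
    """
    backends = ["xenova"]  # local default, no API key required
    if os.environ.get("OPENAI_API_KEY"):
        backends.append("openai")
    if os.environ.get("HUGGINGFACE_API_KEY"):
        backends.append("huggingface")
    return backends


print(available_backends())
```

Running this before configuring your MCP client makes "API key errors" (see Troubleshooting below) easier to rule out early.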
1. Install Claude Desktop:

   ```bash
   # Follow instructions at https://github.com/jxnl/cluade-desktop
   ```

2. Configure the MCP server:

   ```bash
   # Open VS Code with the config file
   code ~/Library/Application\ Support/Claude/claude_desktop_config.json
   ```

3. Add the MCP server configuration:

   ```json
   {
     "mcpServers": {
       "github_repo_rag_server": {
         "command": "npx",
         "args": ["-y", "github_repo_rag"],
         "env": {
           "OPENAI_API_KEY": "your_openai_key",
           "HUGGINGFACE_API_KEY": "your_huggingface_key"
         }
       }
     }
   }
   ```
Use the following command in Claude Desktop or any compatible MCP client:
```
process repository https://github.com/owner/repo.git use openai embeddings
```
Embedding options:

- `use openai embeddings`: Use OpenAI's embedding models
- `use huggingface`: Use Hugging Face's embedding models
- Default: Xenova Transformers (no API key required)
The process will:
- Clone the repository
- Parse files to extract functions and classes
- Create embeddings using your chosen model
- Build a local searchable FAISS index
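To illustrate the AST-powered chunking step, here is a minimal Python sketch that splits source code into one chunk per top-level function or class. It mirrors the idea only; the server's actual implementation also handles TypeScript and JavaScript:

```python
import ast


def chunk_python_source(source: str) -> list[dict]:
    """Split Python source into semantic chunks, one per function or class."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "name": node.name,
                "kind": type(node).__name__,
                # Recover the exact source text for this node
                "code": ast.get_source_segment(source, node),
            })
    return chunks


sample = (
    "def add(a, b):\n"
    "    return a + b\n"
    "\n"
    "class Greeter:\n"
    "    def hi(self):\n"
    "        return 'hi'\n"
)
for chunk in chunk_python_source(sample):
    print(chunk["kind"], chunk["name"])
```

Chunking along AST boundaries keeps each embedded unit self-contained (a whole function or class), which is why it gives better retrieval context than fixed-size text windows.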
Query your codebase using natural language:
```
How does the agent handle GitHub API authentication? repo https://github.com/owner/repo.git
```
The server will:
- Search the vector index for semantically relevant code
- Return context-rich answers with relevant functions and logic
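The vector search boils down to ranking stored chunk embeddings by similarity to the query embedding. The server uses a FAISS index for this; the following pure-Python cosine-similarity sketch (with made-up file IDs and toy 3-dimensional vectors) shows the ranking idea only:

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], index: list[tuple], k: int = 2) -> list[str]:
    """Return the IDs of the k stored vectors most similar to the query."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in index]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)[:k]]


# Toy "index": (chunk ID, embedding) pairs
index = [
    ("auth.ts", [0.9, 0.1, 0.0]),
    ("utils.ts", [0.1, 0.8, 0.1]),
    ("client.ts", [0.7, 0.2, 0.1]),
]
print(top_k([1.0, 0.0, 0.0], index))  # → ['auth.ts', 'client.ts']
```

FAISS does the same ranking, but with approximate nearest-neighbor structures that stay fast at hundreds of thousands of vectors.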
| Issue | Solution |
|---|---|
| Invalid repo URL | Ensure the repository is public and the URL is correct |
| Disk space issues | Check available space for cloning and indexing |
| Missing dependencies | Verify Node.js, Python, and Git installations |
| API key errors | Confirm correct API keys are set in environment variables |
| Push protection errors | Ensure no API keys are committed to the repository |
- Check console logs for detailed error messages and stack traces
- Verify all required dependencies are installed
- Ensure proper permissions for repository cloning and file access
- Verify environment variables are properly set
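A quick way to verify the dependency-related rows above is to check that each required tool is on your `PATH`. This is a small hypothetical helper, not part of the package (note it checks `python3`, the usual executable name for Python 3.x):

```python
import shutil


def check_prerequisites() -> dict[str, bool]:
    """Report whether each required CLI tool is available on PATH."""
    tools = ["node", "python3", "git"]
    return {tool: shutil.which(tool) is not None for tool in tools}


for tool, found in check_prerequisites().items():
    print(f"{tool}: {'ok' if found else 'MISSING'}")
```

Any `MISSING` line points directly at the "Missing dependencies" row in the troubleshooting table.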
- Automatic README summarization (when available)
- Support for private repositories (with proper authentication)
- Customizable chunking strategies
- Configurable embedding models
Contributions are welcome! Please feel free to submit a Pull Request.
- Never commit API keys or sensitive information
- Use environment variables for all sensitive data
- Keep your `.env` file in `.gitignore`
- Use `.env.example` as a template for required environment variables
This project is licensed under the MIT License - see the LICENSE file for details.