A command-line tool to combine markdown files from GitHub repositories. It fetches files from a repository directory, optionally recursing into subdirectories, and merges them into a single file, which is useful for building context for Claude, ChatGPT, and other LLM system prompts.
- 🔍 Recursively fetch files from GitHub repositories
- 📂 Support for multiple file extensions
- 🔄 Automatic rate limit handling
- 📝 Generated table of contents
- 🔑 GitHub token support for higher rate limits
You can run this tool directly using npx:
npx md-combine
Or install it globally:
npm install -g md-combine
Basic usage:
npx md-combine -u https://github.com/user/repo/tree/main/docs
Options:
- `-u, --url <url>`: GitHub repository URL (required)
- `-o, --output <path>`: Output file path (default: `combined_output.md`)
- `-r, --recursive`: Recursively fetch files from subdirectories
- `-t, --token <token>`: GitHub personal access token (optional)
- `-e, --extensions <exts...>`: File extensions to include (default: `.md`)
- `--help`: Display help for command
- `-V, --version`: Output the version number
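To make the expected shape of the `--url` value concrete, here is a minimal sketch of how a GitHub tree URL could be split into its parts. `parseGitHubUrl` is a hypothetical helper written for illustration, not necessarily how md-combine parses its input, and it assumes the branch name contains no slashes.

```javascript
// Hypothetical helper (not taken from md-combine's source).
// Assumes URLs of the form
//   https://github.com/<owner>/<repo>[/tree/<branch>[/<path>]]
// and a branch name without slashes.
function parseGitHubUrl(input) {
  const url = new URL(input);
  if (url.hostname !== "github.com") {
    throw new Error(`Not a github.com URL: ${input}`);
  }

  // Drop empty segments produced by leading or trailing slashes.
  const [owner, repo, marker, branch, ...rest] = url.pathname
    .split("/")
    .filter(Boolean);

  if (!owner || !repo) {
    throw new Error(`Missing owner or repository in: ${input}`);
  }
  if (marker !== undefined && marker !== "tree") {
    throw new Error(`Expected a /tree/ URL, got: ${input}`);
  }

  return {
    owner,
    repo,
    branch: branch ?? null, // null: fall back to the repository's default branch
    path: rest.join("/"),   // empty string: repository root
  };
}

console.log(parseGitHubUrl("https://github.com/user/repo/tree/main/docs"));
// { owner: 'user', repo: 'repo', branch: 'main', path: 'docs' }
```

A URL without a `/tree/...` suffix, such as `https://github.com/user/repo`, yields a `null` branch and an empty path in this sketch.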
- Basic usage (markdown files only):
npx md-combine -u https://github.com/user/repo/tree/main/docs
- Recursive search with custom output file:
npx md-combine -u https://github.com/user/repo/tree/main/docs -r -o documentation.md
- Multiple file extensions:
npx md-combine -u https://github.com/user/repo -r -e .md .txt .json
- Using GitHub token for higher rate limits:
npx md-combine -u https://github.com/user/repo -t your_github_token
- Combining specific file types recursively:
npx md-combine -u https://github.com/user/repo -r -e .vue .js .ts
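The `-e` examples above select files by extension. A minimal sketch of that kind of filter follows; `normalizeExtension` and `filterByExtensions` are hypothetical names, and the case-insensitive matching and tolerance for a missing leading dot are assumptions rather than documented md-combine behavior.

```javascript
// Hypothetical helpers (not taken from md-combine's source).
// Assumption: extensions may be given with or without a leading dot,
// and matching ignores case.
function normalizeExtension(ext) {
  const lower = ext.trim().toLowerCase();
  return lower.startsWith(".") ? lower : `.${lower}`;
}

// Keep only the paths that end with one of the requested extensions.
// The default mirrors the documented default of `.md`.
function filterByExtensions(paths, extensions = [".md"]) {
  const wanted = extensions.map(normalizeExtension);
  return paths.filter((filePath) => {
    const lower = filePath.toLowerCase();
    return wanted.some((ext) => lower.endsWith(ext));
  });
}

const files = ["docs/intro.md", "src/App.vue", "README.MD", "package.json"];
console.log(filterByExtensions(files, [".md", "vue"]));
// [ 'docs/intro.md', 'src/App.vue', 'README.MD' ]
```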
While the tool works without a token, the GitHub API enforces rate limits:
- Without token: 60 requests per hour
- With token: 5,000 requests per hour
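GitHub reports the current quota in the `x-ratelimit-remaining` and `x-ratelimit-reset` response headers, the latter as a UTC epoch time in seconds. The sketch below shows one way a client could decide how long to pause once the quota is used up. `msUntilRateLimitReset` is a hypothetical helper, and this is not necessarily how md-combine implements its rate limit handling.

```javascript
// Hypothetical helper (not taken from md-combine's source).
// `headers` is assumed to be a plain object mapping lower-cased
// response header names to their string values.
function msUntilRateLimitReset(headers, nowMs = Date.now()) {
  const remaining = Number(headers["x-ratelimit-remaining"]);
  const resetSeconds = Number(headers["x-ratelimit-reset"]);

  // Missing or malformed headers: assume no wait is needed.
  if (!Number.isFinite(remaining) || !Number.isFinite(resetSeconds)) {
    return 0;
  }
  // Requests are still available in the current window.
  if (remaining > 0) {
    return 0;
  }
  // Quota exhausted: wait until the reset time, never a negative duration.
  return Math.max(0, resetSeconds * 1000 - nowMs);
}

const exampleHeaders = {
  "x-ratelimit-remaining": "0",
  "x-ratelimit-reset": "1700000060",
};
console.log(msUntilRateLimitReset(exampleHeaders, 1700000000000)); // 60000
```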
To use a token:
- Create a token at https://github.com/settings/tokens
- Use it in either of two ways:
  - Pass it on the command line: `-t your_token`
  - Set an environment variable: `export GITHUB_TOKEN=your_token`
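The two ways of supplying a token can be pictured with the sketch below. `resolveToken` and `buildHeaders` are hypothetical helpers. The precedence (command-line value first, then `GITHUB_TOKEN`) and the `User-Agent` value are assumptions, since the tool's actual behavior is not documented here.

```javascript
// Hypothetical helpers (not taken from md-combine's source).
// Assumption: a token passed with -t wins over the GITHUB_TOKEN variable.
function resolveToken(cliToken, env = process.env) {
  const token = cliToken || env.GITHUB_TOKEN || "";
  return token.trim() || null; // null: make unauthenticated requests
}

// Build headers for a GitHub REST API request.
function buildHeaders(token) {
  const headers = {
    Accept: "application/vnd.github+json",
    "User-Agent": "md-combine-example", // placeholder value for this sketch
  };
  if (token) {
    headers.Authorization = `Bearer ${token}`;
  }
  return headers;
}

const token = resolveToken(undefined, { GITHUB_TOKEN: "your_token" });
console.log(buildHeaders(token).Authorization); // Bearer your_token
```

Without a token, the headers simply omit `Authorization`, and requests count against the lower unauthenticated limit.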
The combined file includes:
- Generation timestamp
- Source repository information
- Table of contents with links
- Original file paths as headers
- File contents with proper separation
- Navigation-friendly anchor links
Example output structure:
# Combined Files from owner/repo
Generated on: 2024-11-10T12:00:00.000Z
Source: owner/repo/docs
Extensions: md, txt
## Table of Contents
- [docs/intro.md](#docs-intro-md)
- [docs/api/methods.md](#docs-api-methods-md)
---
<h2 id="docs-intro-md">docs/intro.md</h2>
[Content of intro.md]
---
<h2 id="docs-api-methods-md">docs/api/methods.md</h2>
[Content of methods.md]
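The anchors in the example output follow a visible pattern: `docs/intro.md` becomes `docs-intro-md`. The sketch below reproduces that mapping along with the matching table-of-contents entry and section heading. The function names are hypothetical, and the lowercasing and trimming of leading or trailing dashes are assumptions that go beyond what the example shows.

```javascript
// Hypothetical helpers (not taken from md-combine's source).
// Turn a file path into an anchor id: runs of characters other than
// letters and digits become a single dash.
function toAnchorId(filePath) {
  return filePath
    .toLowerCase()                // assumption: ids are lower-cased
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");     // assumption: no leading or trailing dashes
}

// Table-of-contents line linking to the section for a file.
function tocEntry(filePath) {
  return `- [${filePath}](#${toAnchorId(filePath)})`;
}

// Section heading carrying the id that the TOC entry points at.
function sectionHeading(filePath) {
  return `<h2 id="${toAnchorId(filePath)}">${filePath}</h2>`;
}

console.log(tocEntry("docs/intro.md"));
// - [docs/intro.md](#docs-intro-md)
console.log(sectionHeading("docs/api/methods.md"));
// <h2 id="docs-api-methods-md">docs/api/methods.md</h2>
```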
Contributions are welcome! Please feel free to submit a Pull Request.
License: MIT
Author: Jithin Sha