toak

Version 3.0.3 • Public

Overview

toak reads the tracked files of a git repository, cleans the code, redacts sensitive information, and generates a markdown document with token counts computed by the Llama 3 tokenizer.

$ cd your-git-repo
$ npx toak


Philosophy

Human-first technologies for a better future.


Features

Data Processing

  • Reads tracked files from the git repository
  • Removes comments, imports, and unnecessary whitespace
  • Redacts sensitive information (API keys, tokens, JWT, hashes)
  • Counts tokens using llama3-tokenizer-js
  • Supports nested .toak-ignore files

Token Cleaning

  • Removes single-line and multi-line comments
  • Strips console.log statements
  • Removes import statements
  • Cleans up whitespace and empty lines
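
The cleaning steps listed above can be approximated with a short chain of regular expressions. This is a minimal sketch, not toak's actual implementation: the cleaningPatterns table and the cleanCode function are hypothetical names, and the patterns only handle whole-line comments, single-line console.log calls, and single-line imports.

```typescript
// Hypothetical cleaning rules for illustration; toak's real patterns may differ.
const cleaningPatterns: Array<{ regex: RegExp; replacement: string }> = [
  // Multi-line comments: /* ... */ (non-greedy, spans newlines)
  { regex: /\/\*[\s\S]*?\*\//g, replacement: "" },
  // Whole-line single-line comments: // ...
  { regex: /^[ \t]*\/\/.*$/gm, replacement: "" },
  // console.log(...) calls that fit on one line
  { regex: /^[ \t]*console\.log\(.*\);?[ \t]*$/gm, replacement: "" },
  // import statements that fit on one line
  { regex: /^[ \t]*import\s.*$/gm, replacement: "" },
  // Trailing spaces and tabs
  { regex: /[ \t]+$/gm, replacement: "" },
  // Runs of newlines, which removes the empty lines left by the rules above
  { regex: /\n{2,}/g, replacement: "\n" },
];

// Apply every rule in order and trim the result.
function cleanCode(source: string): string {
  let output = source;
  for (const { regex, replacement } of cleaningPatterns) {
    output = output.replace(regex, replacement);
  }
  return output.trim();
}
```

The order matters: comments and statements are removed first so that the whitespace rules can collapse the empty lines they leave behind. A regex-only approach like this can also strip comment-like text inside string literals, which is one reason to treat it as a sketch.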

Security Features

  • Redacts API keys and secrets
  • Masks JWT tokens
  • Hides authorization tokens
  • Redacts Base64-encoded strings
  • Masks cryptographic hashes
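
To illustrate how pattern-based redaction of this kind can work, here is a minimal sketch. The secretPatterns table, the redact function, and the placeholder strings are all assumptions made for the example; toak's real patterns and replacement text may differ. Base64 redaction is left out because a simple pattern for it produces many false positives.

```typescript
// Hypothetical redaction rules for illustration only.
const secretPatterns: Array<{ regex: RegExp; replacement: string }> = [
  // JWT: three base64url segments separated by dots, header starting with "eyJ"
  {
    regex: /\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g,
    replacement: "[REDACTED_JWT]",
  },
  // Authorization tokens of the form "Bearer <token>"
  {
    regex: /\bBearer\s+[A-Za-z0-9._~+\/-]+=*/g,
    replacement: "Bearer [REDACTED]",
  },
  // Quoted values assigned to names such as apiKey, api_key, secret, token
  {
    regex: /\b(api[_-]?key|secret|token)(\s*[:=]\s*)(["'])[^"']+\3/gi,
    replacement: "$1$2$3[REDACTED]$3",
  },
  // Hex digests with the length of MD5 (32), SHA-1 (40), or SHA-256 (64)
  {
    regex: /\b(?:[a-f0-9]{64}|[a-f0-9]{40}|[a-f0-9]{32})\b/gi,
    replacement: "[REDACTED_HASH]",
  },
];

// Run every rule over the text; JWTs go first so later rules see the placeholder.
function redact(text: string): string {
  let output = text;
  for (const { regex, replacement } of secretPatterns) {
    output = output.replace(regex, replacement);
  }
  return output;
}
```

The key-assignment rule keeps the variable name and quotes and replaces only the value, so the surrounding code stays readable in the generated markdown.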

Requirements

  • Node.js (>=14.0.0)
  • Git repository
  • Bun runtime (for development)

Installation

npm install toak

Usage

CLI

npx toak

Programmatic Usage

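import { MarkdownGenerator } from 'toak';

// All options are optional; see MarkdownGenerator Options below for defaults.
const generator = new MarkdownGenerator({
  dir: './project',               // Repository to process
  outputFilePath: './output.md',  // Where the generated markdown is written
  verbose: true
});

// Top-level await requires an ES module context.
const result = await generator.createMarkdownDocument();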

Configuration

MarkdownGenerator Options

interface MarkdownGeneratorOptions {
  dir?: string;                               // Project directory (default: '.')
  outputFilePath?: string;                    // Output file path (default: './prompt.md')
  fileTypeExclusions?: Set<string>;           // File types to exclude
  fileExclusions?: string[];                  // File patterns to exclude
  customPatterns?: Record<string, any>;       // Custom cleaning patterns
  customSecretPatterns?: Record<string, any>; // Custom redaction patterns
  verbose?: boolean;                          // Enable verbose logging (default: true)
}

Ignore File Configuration

Create a .toak-ignore file in any directory to specify exclusions. Ignore files can be nested: each one applies to its own directory and all of its subdirectories.

Example .toak-ignore:

# Ignore specific files
secrets.json
config.private.ts

# Ignore directories
build/
temp/

# Glob patterns
**/*.test.ts
**/._*
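
The scoping of nested ignore files can be sketched as follows. This is an illustration under stated assumptions, not toak's matching code: the IgnoreFile type, globToRegExp, and isIgnored are hypothetical names, and the pattern semantics (bare names match at any depth, a trailing slash marks a directory, `**/` matches zero or more directories) are assumed to resemble .gitignore.

```typescript
// One parsed ignore file: the directory it lives in ("" for the repo root)
// and its raw lines. Hypothetical shape for this sketch.
interface IgnoreFile {
  dir: string;
  patterns: string[];
}

// Convert a simple glob into an anchored regular expression.
function globToRegExp(glob: string): RegExp {
  let re = "";
  for (let i = 0; i < glob.length; i++) {
    const c = glob[i];
    if (c === "*" && glob[i + 1] === "*") {
      if (glob[i + 2] === "/") {
        re += "(?:.*/)?"; // "**/" matches zero or more directories
        i += 2;
      } else {
        re += ".*"; // bare "**" matches anything
        i += 1;
      }
    } else if (c === "*") {
      re += "[^/]*"; // "*" stays within one path segment
    } else if (c === "?") {
      re += "[^/]";
    } else {
      re += c.replace(/[.+^${}()|[\]\\]/g, "\\$&"); // escape regex metacharacters
    }
  }
  return new RegExp("^" + re + "$");
}

// A file is ignored if any ignore file at or above its directory matches it.
function isIgnored(filePath: string, ignoreFiles: IgnoreFile[]): boolean {
  for (const { dir, patterns } of ignoreFiles) {
    const prefix = dir === "" ? "" : dir + "/";
    if (!filePath.startsWith(prefix)) continue; // out of this file's scope
    const relative = filePath.slice(prefix.length);
    for (const raw of patterns) {
      const pattern = raw.trim();
      if (pattern === "" || pattern.startsWith("#")) continue; // blank or comment
      if (pattern.endsWith("/")) {
        // Directory pattern: match any parent directory segment by name
        const parents = relative.split("/").slice(0, -1);
        if (parents.includes(pattern.slice(0, -1))) return true;
      } else if (pattern.includes("/")) {
        if (globToRegExp(pattern).test(relative)) return true;
      } else {
        // Bare name: compare against the file name only
        const base = relative.slice(relative.lastIndexOf("/") + 1);
        if (globToRegExp(pattern).test(base)) return true;
      }
    }
  }
  return false;
}
```

Paths are matched relative to the directory that contains the ignore file, which is what limits a nested file's effect to its own subtree.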

Default Exclusions

The tool automatically excludes common file types and patterns:

File Types:

  • Images: .jpg, .jpeg, .png, .gif, .bmp, .svg, .webp, etc.
  • Fonts: .ttf, .woff, .woff2, .eot, .otf
  • Binaries: .exe, .dll, .so, .dylib, .bin
  • Archives: .zip, .tar, .gz, .rar, .7z
  • Media: .mp3, .mp4, .avi, .mov, .wav
  • Data: .db, .sqlite, .sqlite3
  • Config: .lock, .yaml, .yml, .toml, .conf

File Patterns:

  • Configuration files: .*rc, tsconfig.json, package-lock.json
  • Version control: .git*, .hg*, .svn*
  • Environment files: .env*
  • Build outputs: build/, dist/, out/
  • Dependencies: node_modules/
  • Documentation: docs/, README*, CHANGELOG*
  • IDE settings: .idea/, .vscode/
  • Test files: test/, spec/, tests/
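
A file type check of this kind can be sketched with a Set of extensions, the same type the fileTypeExclusions option accepts. The extensionOf and isExcludedByType helpers are hypothetical, the list shown is only a small subset of the defaults above, and case-insensitive matching is an assumption rather than documented behavior.

```typescript
// A small subset of the default file types listed above, for illustration.
const sampleTypeExclusions = new Set<string>([
  ".png", ".jpg", ".woff2", ".zip", ".gz", ".sqlite", ".lock", ".yaml",
]);

// Return the lowercased extension including the dot, or "" if there is none.
// Dotfiles such as ".env" are treated as having no extension.
function extensionOf(filePath: string): string {
  const base = filePath.slice(filePath.lastIndexOf("/") + 1);
  const dot = base.lastIndexOf(".");
  return dot > 0 ? base.slice(dot).toLowerCase() : "";
}

// True when the file's extension is in the exclusion set.
function isExcludedByType(filePath: string, exclusions: Set<string>): boolean {
  const ext = extensionOf(filePath);
  return ext !== "" && exclusions.has(ext);
}

// Callers can extend a copy of the set without mutating the original.
const extendedExclusions = new Set<string>(sampleTypeExclusions);
extendedExclusions.add(".csv");
```

Dotfiles fall through this check on purpose; in the lists above they are covered by file patterns such as .env* rather than by file types.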

Development

This project uses Bun for development. To contribute:

Setup

git clone <repository>
cd toak
bun install

Scripts

# Build the project
bun run build

# Run tests
bun test

# Lint code
bun run lint

# Fix linting issues
bun run lint:fix

# Format code
bun run format

# Fix all (format + lint)
bun run fix

# Development mode
bun run dev

# Publish development version
bun run deploy:dev

Project Structure

src/
├── index.ts              # Main exports
├── TokenCleaner.ts       # Code cleaning and redaction
├── MarkdownGenerator.ts  # Markdown generation logic
├── cli.ts               # CLI implementation
├── fileExclusions.ts    # File exclusion patterns
└── fileTypeExclusions.ts # File type exclusions

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Commit your changes
  4. Push to the branch
  5. Open a Pull Request

Guidelines

  • Write TypeScript code following the project's style
  • Include appropriate error handling
  • Add documentation for new features
  • Include tests for new functionality
  • Update the README for significant changes

Note

toak must be run inside a git repository because it uses git ls-files to identify tracked files.
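
As an illustration of working with that command's output, the sketch below parses the NUL-separated form produced by git ls-files -z, which stays unambiguous for paths containing spaces or newlines. The parseTrackedFiles helper is hypothetical, and whether toak itself passes -z is not stated here.

```typescript
// Split the output of `git ls-files -z` into individual paths.
// Each path is terminated by a NUL character, so the final entry is empty
// and gets filtered out.
function parseTrackedFiles(stdout: string): string[] {
  return stdout.split("\0").filter((entry) => entry.length > 0);
}
```

In Node.js the input string could come from execFileSync("git", ["ls-files", "-z"], { cwd: dir, encoding: "utf8" }) in the node:child_process module, which throws if the directory is not inside a git repository.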

License

GNU Affero General Public License, Version 3, 19 November 2007

© 2024 Geoff Seemueller
