CmpStr is a TypeScript library for advanced string comparison, similarity measurement, phonetic indexing, and text analysis. It includes implementations of several established algorithms such as Levenshtein, Dice–Sørensen, Damerau–Levenshtein and Soundex. The library has no external dependencies and allows for the integration of custom metrics, phonetic mappings, and normalization filters.
CmpStr provides a unified API for single, batch and pairwise operations. It is suitable for a range of use cases in application development and research. The package includes support for both ESM and CommonJS environments, TypeScript type declarations and a browser-compatible JavaScript bundle.
Originally launched in 2023 with a minimal feature set, the library was redesigned in 2025 to support a broader set of algorithms and processing features. The current version offers asynchronous operation, configurable normalization and filtering pipelines, phonetic search functionality, and basic tools for string differencing.
Key Features
- Unified API for string similarity, distance measurement and matching
- Modular metric system with support for algorithms such as Levenshtein, Jaro-Winkler and Cosine
- Integrated phonetic algorithms (e.g., Soundex, Metaphone) with configurable registry
- Normalization and filtering pipeline for consistent input processing
- Single, batch and pairwise comparisons with structured, type-safe results
- Phonetic-aware search and comparison
- Utilities for text structure and readability analysis (e.g., syllables, word statistics)
- Diffing tools with CLI-friendly formatting
- TypeScript-native with full type declarations and extensibility
- Supports asynchronous workflows for scalable, non-blocking processing
- Extensible architecture for integrating custom algorithms and filters
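To make the normalization and filtering pipeline from the list above more concrete, here is a minimal, self-contained sketch of the general idea: small string filters composed into one function that is applied to every input before comparison. The names `Filter`, `pipeline`, `stripDiacritics`, `toLowerCase` and `collapseWhitespace` are hypothetical and chosen for illustration only; they are not CmpStr's API.

```typescript
// Illustrative only: a generic filter pipeline, not CmpStr's implementation.
type Filter = ( input: string ) => string;

// Remove combining marks after canonical decomposition (e.g. "é" becomes "e")
const stripDiacritics: Filter = ( s ) =>
    s.normalize( 'NFD' ).replace( /[\u0300-\u036f]/g, '' );

// Fold case so that comparisons become case-insensitive
const toLowerCase: Filter = ( s ) => s.toLowerCase();

// Trim the ends and collapse runs of whitespace into a single space
const collapseWhitespace: Filter = ( s ) => s.trim().replace( /\s+/g, ' ' );

// Compose filters left to right into a single filter
function pipeline ( ...filters: Filter[] ): Filter {
    return ( input ) => filters.reduce( ( acc, f ) => f( acc ), input );
}

const normalize = pipeline( stripDiacritics, toLowerCase, collapseWhitespace );

console.log( normalize( '  Crème   Brûlée ' ) );
// creme brulee
```

Running both strings of a comparison through the same pipeline is what keeps results consistent regardless of casing, accents or stray whitespace in the input.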
CmpStr is installed from npm like any other package:
npm install cmpstr
Minimal usage example:
import { CmpStr } from 'cmpstr';
const cmp = CmpStr.create().setMetric( 'levenshtein' ).setFlags( 'i' );
const result = cmp.test( [ 'hello', 'hola' ], 'Hallo' );
console.log( result );
// { source: 'hello', target: 'Hallo', match: 0.8 }
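The score of 0.8 in the output above is consistent with a normalized Levenshtein similarity, i.e. 1 minus the edit distance divided by the length of the longer string: 'hello' and 'hallo' differ by one substitution over five characters. The following standalone sketch reproduces that arithmetic. It assumes the 'i' flag means case-insensitive comparison, and the helpers `levenshtein`, `similarity` and `bestMatch` are hypothetical names for illustration, not the library's internals.

```typescript
// Illustrative only: plain Levenshtein distance with a rolling row.
function levenshtein ( a: string, b: string ): number {
    // prev[ j ] holds the distance between a[ 0..i-1 ] and b[ 0..j ]
    let prev = Array.from( { length: b.length + 1 }, ( _, j ) => j );
    for ( let i = 1; i <= a.length; i++ ) {
        const curr = [ i ];
        for ( let j = 1; j <= b.length; j++ ) {
            const cost = a[ i - 1 ] === b[ j - 1 ] ? 0 : 1;
            curr[ j ] = Math.min(
                prev[ j ] + 1,        // deletion
                curr[ j - 1 ] + 1,    // insertion
                prev[ j - 1 ] + cost  // substitution or match
            );
        }
        prev = curr;
    }
    return prev[ b.length ];
}

// Similarity in [ 0, 1 ]: 1 - distance / length of the longer string
function similarity ( a: string, b: string, ignoreCase = false ): number {
    if ( ignoreCase ) { a = a.toLowerCase(); b = b.toLowerCase(); }
    const maxLen = Math.max( a.length, b.length );
    return maxLen === 0 ? 1 : 1 - levenshtein( a, b ) / maxLen;
}

// Pick the candidate closest to the target (assumes a non-empty list)
function bestMatch ( sources: string[], target: string ) {
    return sources
        .map( ( source ) => ( { source, target, match: similarity( source, target, true ) } ) )
        .reduce( ( best, cur ) => ( cur.match > best.match ? cur : best ) );
}

console.log( bestMatch( [ 'hello', 'hola' ], 'Hallo' ) );
```

With these inputs, 'hola' needs three edits to become 'hallo', so 'hello' is the closer candidate.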
For asynchronous workloads:
import { CmpStrAsync } from 'cmpstr';
const cmp = CmpStrAsync.create().setProcessors( {
    phonetic: { algo: 'soundex' }
} );
const result = await cmp.searchAsync( 'Maier', [
    'Meyer', 'Müller', 'Miller', 'Meyers', 'Meier'
] );
console.log( result );
// [ 'Meyer', 'Meier' ]
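Why only 'Meyer' and 'Meier' come back can be seen from the phonetic codes. Under classic American Soundex, 'Maier', 'Meyer' and 'Meier' all reduce to M600, while 'Müller' and 'Miller' give M460 and 'Meyers' gives M620. The sketch below is a standalone implementation of that textbook algorithm and is not taken from CmpStr; the function name `soundex` and the choice to strip diacritics before encoding are assumptions made for this example, and the library's own mappings may differ.

```typescript
// Illustrative only: classic American Soundex, not CmpStr's implementation.
const SOUNDEX_CODES: Record<string, string> = {
    B: '1', F: '1', P: '1', V: '1',
    C: '2', G: '2', J: '2', K: '2', Q: '2', S: '2', X: '2', Z: '2',
    D: '3', T: '3',
    L: '4',
    M: '5', N: '5',
    R: '6'
};

function soundex ( input: string ): string {
    // Assumption: strip diacritics ("ü" becomes "u") and drop anything outside A-Z
    const letters = input
        .normalize( 'NFD' )
        .replace( /[\u0300-\u036f]/g, '' )
        .toUpperCase()
        .replace( /[^A-Z]/g, '' );
    if ( letters.length === 0 ) return '';

    // The first letter is kept as-is; its digit still suppresses a following duplicate
    let code = letters.charAt( 0 );
    let prev = SOUNDEX_CODES[ code ] ?? '';

    for ( const ch of letters.slice( 1 ) ) {
        if ( code.length === 4 ) break;
        const digit = SOUNDEX_CODES[ ch ] ?? '';
        if ( digit !== '' && digit !== prev ) code += digit;
        // H and W are transparent; vowels and Y reset the previous digit
        if ( ch !== 'H' && ch !== 'W' ) prev = digit;
    }

    // Pad to one letter plus three digits
    return code.padEnd( 4, '0' );
}

const names = [ 'Meyer', 'Müller', 'Miller', 'Meyers', 'Meier' ];
const hits = names.filter( ( n ) => soundex( n ) === soundex( 'Maier' ) );

console.log( hits );
// [ 'Meyer', 'Meier' ]
```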
The full documentation, API reference and advanced usage examples are available in the GitHub Wiki.
License: MIT © 2023-2025 Paul Köhler (komed3)