@typhonjs-utils/unicode

Provides a fast, space-efficient, ESM-based Unicode grapheme parser, including an iterable parser.

API documentation

Overview:

There are two resources available that work well in the browser via the fflate compression library:

The main use case presently supported is parsing strings for Unicode grapheme clusters.

The following functions are exported from @typhonjs-utils/unicode:

  • graphemeSplit(string): string[]
  • graphemeIterator(string): IterableIterator<string>
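To make "grapheme cluster" concrete, here is a minimal sketch of the difference between UTF-16 code units, code points, and graphemes. It is runnable without installing anything, so it uses the built-in Intl.Segmenter as a stand-in; splitGraphemesStandIn is a hypothetical helper with the same shape as graphemeSplit, not this package's implementation. With the package installed, the two functions would instead come from the named exports shown in the first comment.

```javascript
// With the package installed, the real functions would be imported like this:
//   import { graphemeSplit, graphemeIterator } from '@typhonjs-utils/unicode';

// Stand-in only (assumption): built-in Intl.Segmenter, NOT this package's parser.
const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });

// Hypothetical helper with the same shape as graphemeSplit(string): string[].
function splitGraphemesStandIn(string)
{
   return Array.from(segmenter.segment(string), (entry) => entry.segment);
}

// "e" + combining acute accent, then thumbs up + skin tone modifier.
const text = 'e\u0301\u{1F44D}\u{1F3FD}';

const codeUnits = text.length;                  // UTF-16 code units
const codePoints = [...text].length;            // Unicode code points
const graphemes = splitGraphemesStandIn(text);  // user-perceived characters

console.log(codeUnits, codePoints, graphemes.length); // 6 4 2
```

Neither `string.length` nor spreading the string matches what a reader perceives as two characters, which is the gap a grapheme parser fills.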

For instance, you can use graphemeIterator as a tokenizer for @typhonjs-svelte/trie-search, allowing the trie to be made up of Unicode graphemes. More work remains on this package, especially a complete implementation of graphemeIterator. The current implementation is trivial and eager: it uses graphemeSplit. The goal is a graphemeIterator implementation with full Unicode support and, more importantly, the most compact browser-capable implementation possible.
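The difference between an eager and a non-eager iterator can be sketched as follows. This is an illustration under stated assumptions, not the package's code: splitStandIn, eagerIterator, lazyIterator, and take are hypothetical names, and the built-in Intl.Segmenter stands in for the package's parser.

```javascript
// Stand-in only (assumption): built-in Intl.Segmenter, NOT this package's parser.
const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });

// Hypothetical stand-in for an eager split returning string[].
function splitStandIn(string)
{
   return Array.from(segmenter.segment(string), (entry) => entry.segment);
}

// Eager: the whole string is split into an array first, then iterated.
function eagerIterator(string)
{
   return splitStandIn(string)[Symbol.iterator]();
}

// Non-eager: a generator that yields one grapheme cluster per step, so a
// consumer that stops early never pays for splitting the rest of the string.
function* lazyIterator(string)
{
   for (const { segment } of segmenter.segment(string)) { yield segment; }
}

// Collects at most `count` items, stopping iteration as soon as it has them.
function take(iterable, count)
{
   const result = [];
   if (count <= 0) { return result; }

   for (const item of iterable)
   {
      result.push(item);
      if (result.length >= count) { break; }
   }

   return result;
}

const firstTwo = take(lazyIterator('a\u0301bc'), 2);
console.log(firstTwo.length); // 2
```

Both variants produce the same sequence; they differ only in when the work happens, which matters for long inputs or for a tokenizer that may stop early.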

When you bundle this package for the browser, presumably with Rollup or another bundler, remember to configure the bundle for browser support. For instance, when using Rollup and @rollup/plugin-node-resolve, pass { browser: true } to the Node resolve plugin.
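A minimal Rollup configuration along those lines might look like the sketch below. The input and output paths are placeholders chosen for illustration; only the { browser: true } option comes from the guidance above.

```javascript
// rollup.config.js
import { nodeResolve } from '@rollup/plugin-node-resolve';

export default {
   input: 'src/index.js',           // placeholder entry point (assumption)
   output: {
      file: 'dist/bundle.js',       // placeholder output path (assumption)
      format: 'es'
   },
   plugins: [
      // Tells the plugin to resolve packages for a browser target.
      nodeResolve({ browser: true })
   ]
};
```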

Roadmap:

  • Complete a non-eager implementation of graphemeIterator.

Install

npm i @typhonjs-utils/unicode

Version

0.1.0

License

MPL-2.0

Unpacked Size

382 kB

Total Files

15

Collaborators

  • typhonrt