This package has been deprecated

Author message:

Package no longer supported. Contact Support at https://www.npmjs.com/support for more info.

emoji-patterns

14.0.1 • Public • Published

Emoji Patterns

Description

This Node module returns a JSON-compatible object literal containing both basic and compound emoji pattern strings.

Available Patterns

The following patterns are generated using the information parsed from the Emoji 14.0 data files emoji-data.txt, emoji-sequences.txt and emoji-zwj-sequences.txt:

  • Basic_Emoji
  • Emoji
  • Emoji_Component
  • Emoji_Keycap_Sequence
  • Emoji_Modifier
  • Emoji_Modifier_Base
  • Emoji_Presentation
  • Extended_Pictographic
  • RGI_Emoji_Flag_Sequence
  • RGI_Emoji_Modifier_Sequence
  • RGI_Emoji_Tag_Sequence
  • RGI_Emoji_ZWJ_Sequence

These basic patterns are then used to generate two more complex compound patterns:

  • Emoji_All
  • Emoji_Keyboard
const
{
    Basic_Emoji,
    Emoji,
    Emoji_Component,
    Emoji_Keycap_Sequence,
    Emoji_Modifier,
    Emoji_Modifier_Base,
    Emoji_Presentation,
    Extended_Pictographic,
    RGI_Emoji_Flag_Sequence,
    RGI_Emoji_Modifier_Sequence,
    RGI_Emoji_Tag_Sequence,
    RGI_Emoji_ZWJ_Sequence
} = emojiPatterns;
// Keyboard emoji only (fully-qualified and components)
emojiPatterns["Emoji_Keyboard"] = `(?:${RGI_Emoji_ZWJ_Sequence}|${Emoji_Keycap_Sequence}|${RGI_Emoji_Flag_Sequence}|${RGI_Emoji_Tag_Sequence}|${Emoji_Modifier_Base}${Emoji_Modifier}|${Emoji_Presentation}|${Emoji}\\uFE0F)`;
// All emoji (U+FE0F optional)
emojiPatterns["Emoji_All"] = emojiPatterns["Emoji_Keyboard"].replace (/(\\u{FE0F}|\\uFE0F)/gi, '$1?');

Notes

  • The order of the basic patterns in the compound patterns is critical. Since a regular expression engine is eager and stops searching as soon as it finds a valid match (i.e., it always returns the leftmost match), the longest patterns must come first. The same strategy is also used when generating the RGI_Emoji_ZWJ_Sequence pattern itself.

  • In the compound patterns, ${Emoji_Modifier_Base}${Emoji_Modifier} can be replaced by ${RGI_Emoji_Modifier_Sequence} which is strictly equivalent (but more verbose).

  • Likewise, ${Emoji_Presentation}|${Emoji}\\uFE0F could be replaced by ${Basic_Emoji} (which should actually be called ${Basic_Emoji_Sequence} for the sake of consistency), but the latter is more restrictive since it only contains the 5 skin tone and 4 hairstyle components, excluding the 12 keycap bases and the 26 singleton regional indicators.

  • Providing patterns as strings instead of regular expressions does require the extra step of using new RegExp () to actually make use of them, but it has two main advantages:

    • Flags can be set differently depending on how the patterns are used.

    • The patterns can be further modified before being turned into regular expressions; for instance, unwanted sub-patterns can be discarded by replacing them with an empty string, or the pattern can be embedded into a larger one. See examples below.

Installing

Switch to your project directory (cd) then run:

npm install emoji-patterns

Testing

A basic test can be performed by running the following command line from the package directory:

npm test

Examples

Testing whether an emoji has a keyboard status or not

const emojiPatterns = require ('emoji-patterns');
const emojiKeyboardRegex = new RegExp ('^' + emojiPatterns["Emoji_Keyboard"] + '$', 'u');
console.log (emojiKeyboardRegex.test ("❤️"));
// -> true
console.log (emojiKeyboardRegex.test ("❤"));
// -> false

Extracting all emoji from a string

const emojiPatterns = require ('emoji-patterns');
const emojiAllRegex = new RegExp (emojiPatterns["Emoji_All"], 'gu');
console.log (JSON.stringify ("AaĀā#*0❤🇦愛爱❤️애💜".match (emojiAllRegex)));
// -> ["#","*","0","❤","🇦","❤️","💜"]

Extracting all emoji from a string, except keycap bases and singleton regional indicators

const emojiPatterns = require ('emoji-patterns');
const emojiAllPattern = emojiPatterns["Emoji_All"];
const customPattern = emojiAllPattern.replace (/\\u0023\\u002A\\u0030-\\u0039|\\u\{1F1E6\}-\\u\{1F1FF\}/gi, '');
const customRegex = new RegExp (customPattern, 'gu');
console.log (JSON.stringify ("AaĀā#*0❤🇦愛爱❤️애💜".match (customRegex)));
// -> ["❤","❤️","💜"]

Extracting all keyboard-status emoji from a string

const emojiPatterns = require ('emoji-patterns');
const emojiAllRegex = new RegExp (emojiPatterns["Emoji_All"], 'gu');
const emojiKeyboardRegex = new RegExp ('^' + emojiPatterns["Emoji_Keyboard"] + '$', 'u');
let emojiList = "AaĀā#*0❤🇦愛爱❤️애💜".match (emojiAllRegex);
if (emojiList)
{
    emojiList = emojiList.filter (emoji => emojiKeyboardRegex.test (emoji));
}
console.log (JSON.stringify (emojiList));
// -> ["🇦","❤️","💜"]

Removing all emoji from a string

const emojiPatterns = require ('emoji-patterns');
const emojiAllRegex = new RegExp (emojiPatterns["Emoji_All"], 'gu');
console.log (JSON.stringify ("AaĀā#*0❤🇦愛爱❤️애💜".replace (emojiAllRegex, "")));
// -> "AaĀā愛爱애"

Caveats

  • The basic patterns strictly follow the information extracted from the data files. Therefore, the following characters are considered Emoji in the emoji-data.txt file, although they are omitted in the emoji-test.txt file, as well as in the CLDR annotation files provided in XML format:

    • 12 keycap bases: number sign '#', asterisk '*', digits '0' to '9'
    • 26 singleton regional indicators: '🇦' to '🇿'
  • The regular expressions must include a 'u' flag, since the patterns make use of the new type of Unicode escape sequences: \u{1F4A9}.

  • The two main regular expression patterns Emoji_All and Emoji_Keyboard are pretty big, around 65KB each...

License

The MIT License (MIT).

Copyright © 2018-2021 Michel MARIANI.

Package Sidebar

Install

npm i emoji-patterns

Weekly Downloads

98

Version

14.0.1

License

MIT

Unpacked Size

765 kB

Total Files

10

Last publish

Collaborators

  • npm