Emojibase v0.11.0
This project has been renamed from emoji-database to Emojibase and has moved to emojibase. Please update your dependencies!
A collection of up-to-date, pre-generated, specification-compliant emoji datasets and regex patterns. Data is generated with the unicode-emoji-data, unicode-emoji-annotations, and emojione packages for increased accuracy, interoperability, and customizability.
Installation
npm install emojibase --save
// Or
yarn add emojibase
Documentation
Datasets
Emojis are generated into JSON files called datasets. Each file provides a subset of the data in a specific format (see below), so choose the dataset that best fits your application.
data/<format>/list.json - A list of emoji objects in the specified format.
data/<format>/map.json - A mapping of hexcodes to emoji objects in the specified format.
data/<format>/by-category.json - A list of emoji objects in the specified format, grouped by their category.
Replace <format> with the format of your choosing.
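To make the three shapes concrete, here is a minimal sketch that derives map-style and by-category-style views from a list. The sample records, the category values, and the assumption that by-category.json is an object keyed by category name are illustrative and not taken from the real datasets.

```javascript
// Hypothetical sample records; the real datasets hold many more entries.
const list = [
  { hexcode: '1F98D', unicode: '🦍', shortname: 'gorilla', category: 'nature' },
  { hexcode: '270A', unicode: '✊', shortname: 'fist', category: 'people' },
  { hexcode: '1F412', unicode: '🐒', shortname: 'monkey', category: 'nature' },
];

// map.json-style view: hexcode -> emoji object.
const map = {};
for (const emoji of list) {
  map[emoji.hexcode] = emoji;
}

// by-category.json-style view (assumed shape): category -> list of emoji objects.
const byCategory = {};
for (const emoji of list) {
  if (!byCategory[emoji.category]) {
    byCategory[emoji.category] = [];
  }
  byCategory[emoji.category].push(emoji);
}

console.log(map['1F98D'].shortname);
console.log(Object.keys(byCategory));
```

A list suits iteration and rendering, a map suits lookups by hexcode, and the grouped form suits category pickers.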
Datasets can be used by simply importing their JSON file, and parsing it, unless otherwise configured by a build process.
import json from 'emojibase/data/compact/list.json';
const data = JSON.parse(json);
For more specialized and granular use cases (like reduced filesizes), the following extra datasets are also available.
data/extra/unicode.json - A list of emoji unicode characters.
data/extra/hexcodes.json - A list of emoji hexcode characters.
data/extra/hexcode-to-shortname.json - A mapping of hexcodes to shortnames.
data/extra/shortnames.json - A list of emoji shortnames.
data/extra/shortname-to-unicode.json - A mapping of shortnames to unicode characters.
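As an illustration of why a small mapping can be enough, the sketch below replaces shortnames in a string using an object shaped like data/extra/shortname-to-unicode.json. The excerpt, the `replaceShortnames` helper, and the colon-delimited pattern are assumptions made for this example; keys are assumed to omit the surrounding colons.

```javascript
// Hypothetical excerpt; the real file is assumed to map shortname -> unicode character.
const shortnameToUnicode = { gorilla: '🦍', monkey: '🐒' };

// Hypothetical helper: swaps :shortname: tokens for their unicode character.
// Unknown shortnames are left untouched.
function replaceShortnames(text, mapping) {
  return text.replace(/:([a-z0-9_+-]+):/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(mapping, name) ? mapping[name] : match
  );
}

console.log(replaceShortnames('RIP :gorilla: :unknown:', shortnameToUnicode));
```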
Formats
Datasets are grouped into 3 different formats, with each composed of a subset of properties.
compact - Includes the unicode, hexcode, and shortname properties.
standard - Includes the unicode, hexcode, shortname, codepoint, and name properties.
expanded - Includes all properties mentioned above.
Properties
Emoji objects within a dataset are composed of the following properties.
category (string) - The category the emoji character is grouped under.
codepoint (number[]) - An array of code points, parsed from the hexcode property.
display (string) - The default presentation of the emoji character, either "emoji" or "text".
emoji (string) - The emoji presentation unicode character.
gender (string) - If applicable, the gender of the emoji, either "male" or "female". Only exists for emojis that support genders.
hexcode (string) - The hexadecimal representation of the unicode character, separated by dashes. Does not include zero-width-joiner or variation selectors.
name (string) - The name of the emoji character.
order (number) - The sort order of all emoji characters.
shortnames (string[]) - Short word representations of the emoji character. Does not include surrounding colons.
skin (number) - If applicable, the skin tone, between 1 and 5. Only exists for emojis that support skin tones.
tags (string[]) - Tags and keywords relevant to the emoji character.
text (string) - The text presentation unicode character.
unicode (string) - The emoji or text unicode character depending on display. Only available in non-expanded formats.
Properties with null or undefined values are omitted from the generated dataset.
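The relationship between display, emoji, text, and unicode can be sketched as follows. The record below is a hypothetical expanded-format object with illustrative values, and `presentationFor` and `skinToneOf` are helpers invented for this example. How the generator actually derives unicode is an assumption based on the property descriptions above.

```javascript
// Hypothetical expanded-format record; values are illustrative only.
const record = {
  name: 'WHITE SMILING FACE',
  hexcode: '263A',
  codepoint: [0x263a],
  display: 'text',
  emoji: '\u263A\uFE0F', // emoji presentation (variation selector-16)
  text: '\u263A\uFE0E', // text presentation (variation selector-15)
};

// Assumed rule: pick the character that matches the default presentation.
function presentationFor(emojiRecord) {
  return emojiRecord.display === 'text' ? emojiRecord.text : emojiRecord.emoji;
}

// Optional properties are omitted rather than set to null, so check for presence.
function skinToneOf(emojiRecord) {
  return 'skin' in emojiRecord ? emojiRecord.skin : undefined;
}

console.log(presentationFor(record) === record.text);
console.log(skinToneOf(record));
```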
CDN Support
If you prefer to not inflate your bundle size with these large JSON dumps,
you can fetch them from our CDN (provided by jsdelivr.com) using fetchFromCDN.
This function returns a promise, with the JSON data already parsed.
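import { fetchFromCDN } from 'emojibase';
fetchFromCDN('compact/list.json').then((data) => console.log(data));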
The 1st argument requires a JSON file path, relative to the data folder,
while the 2nd argument is the specified release version (defaults to the latest).
Only JSON datasets can be fetched from our CDN.
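For readers curious what such a fetch could look like, here is a rough sketch. It is not the implementation of fetchFromCDN. `buildCdnUrl` and `fetchDataset` are hypothetical names, and the URL layout assumes jsDelivr's public npm scheme (`cdn.jsdelivr.net/npm/<package>@<version>/<file>`) with the datasets living under data/ in the emojibase package.

```javascript
// Hypothetical helper: builds a jsDelivr URL for a dataset path relative to data/.
function buildCdnUrl(path, version = 'latest') {
  return `https://cdn.jsdelivr.net/npm/emojibase@${version}/data/${path}`;
}

// Hypothetical helper: resolves with the parsed JSON, rejects on HTTP errors.
// fetchImpl is injectable so the function can be exercised without a network.
function fetchDataset(path, version, fetchImpl = globalThis.fetch) {
  return fetchImpl(buildCdnUrl(path, version)).then((response) => {
    if (!response.ok) {
      throw new Error(`Failed to load ${path}: ${response.status}`);
    }
    return response.json();
  });
}

console.log(buildCdnUrl('compact/list.json'));
console.log(buildCdnUrl('compact/list.json', '0.11.0'));
```

Pinning a version keeps the dataset stable across releases, while the default follows whatever the CDN currently serves as latest.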
Regex Patterns
To match emojis and shortnames within a string, multiple regex patterns are available for import.
All imports return a RegExp object, with no flags, and no outer capture group.
regex - Matches both emoji and text presentation characters.
regex/emoji - Matches emoji presentation characters.
regex/text - Matches text presentation characters.
regex/shortname - Matches emoji shortnames.
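import EMOJI_REGEX from 'emojibase/regex';
import SHORTNAME_REGEX from 'emojibase/regex/shortname';
'🦍'.match(EMOJI_REGEX); // Matches Harambe!
To compose new regex patterns, simply use the source property.
const EMOJI_SHORTNAME_REGEX = new RegExp(`^(?:${EMOJI_REGEX.source}|${SHORTNAME_REGEX.source})$`, 'g');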
The u (unicode) and g (global) flags are not required when using these patterns.
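The composition technique can be tried without the package installed. In this sketch both patterns are small stand-ins written for the example (the real exported patterns are far larger and may differ), and the `g` flag is added only because the example collects every match in one call.

```javascript
// Stand-in patterns (assumptions, not the package's exports).
// Emoji stand-in: 🦍 or 🐒 written as UCS-2 surrogate pairs.
const EMOJI_REGEX = /\uD83E\uDD8D|\uD83D\uDC12/;
const SHORTNAME_REGEX = /:[a-z0-9_+-]+:/;

// Wrap each source in a non-capturing group so the alternation stays scoped.
const COMBINED_REGEX = new RegExp(
  `(?:${EMOJI_REGEX.source})|(?:${SHORTNAME_REGEX.source})`,
  'g'
);

const found = 'RIP 🦍 and :monkey:'.match(COMBINED_REGEX);
console.log(found);
```

Grouping each source matters because the exported patterns have no outer group: without it, anchors or suffixes added around the template would bind only to the first or last alternative.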
Unicode
By default, regex patterns are generated as UCS-2 surrogate pairs. If desired, ES2015+
unicode aware regex patterns can be used, which can be found in the regex/es directory.
The unicode aware regex patterns are only supported in Node.js and modern browsers.
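The difference between the two styles can be seen with a single character. This is a general JavaScript illustration, not an excerpt of the generated patterns: the four regexes below are written by hand for the example.

```javascript
const gorilla = '\u{1F98D}'; // 🦍, stored as two UTF-16 code units

// UCS-2 style: the astral character is spelled as a surrogate pair.
const ucs2Pattern = /\uD83E\uDD8D/;
// ES2015+ style: the code point is written directly and requires the u flag.
const esPattern = /\u{1F98D}/u;

// Inside a character class the difference matters: without the u flag the
// class holds two separate code units, so it cannot match the whole character.
const ucs2Class = /^[\uD83E\uDD8D]$/;
const esClass = /^[\u{1F98D}]$/u;

console.log(ucs2Pattern.test(gorilla)); // true
console.log(esPattern.test(gorilla)); // true
console.log(ucs2Class.test(gorilla)); // false
console.log(esClass.test(gorilla)); // true
```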
Helpers
Two helper functions are available for converting between emoji data representations.
The first, fromHexToCodepoint, can be used to convert a dash separated hexcode into an
array of numerical codepoints.
import { fromHexToCodepoint } from 'emojibase';
fromHexToCodepoint('270A-1F3FC'); // [9994, 127996]
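The conversion itself is small enough to sketch by hand. `hexToCodepoints` below is a hypothetical stand-in written for illustration and is not the package's implementation.

```javascript
// Hypothetical stand-in: split on dashes and parse each part as base 16.
function hexToCodepoints(hexcode) {
  return hexcode.split('-').map((point) => parseInt(point, 16));
}

const codepoints = hexToCodepoints('270A-1F3FC');
console.log(codepoints); // [ 9994, 127996 ]

// The code points can be turned back into the character itself.
console.log(String.fromCodePoint(...codepoints));
```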
The second, fromUnicodeToHex, converts a literal unicode character into a dash
separated hexcode. Unless false is passed as the 2nd argument, zero-width joiners
and variation selectors are removed.
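import { fromUnicodeToHex } from 'emojibase';
fromUnicodeToHex('👨‍👩‍👧‍👦'); // 1F468-1F469-1F467-1F466
fromUnicodeToHex('👨‍👩‍👧‍👦', false); // 1F468-200D-1F469-200D-1F467-200D-1F466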
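A comparable conversion can be sketched in a few lines. `unicodeToHex` is a hypothetical stand-in and not the package's implementation. It assumes that the characters to strip are U+200D (zero-width joiner), U+FE0E, and U+FE0F (variation selectors), and that each part is padded to at least four hex digits.

```javascript
// Code points dropped when stripping (assumed set).
const STRIPPED = new Set([0x200d, 0xfe0e, 0xfe0f]);

// Hypothetical stand-in: iterate by code point, optionally strip, format as hex.
function unicodeToHex(unicode, strip = true) {
  return Array.from(unicode)
    .map((char) => char.codePointAt(0))
    .filter((point) => !strip || !STRIPPED.has(point))
    .map((point) => point.toString(16).toUpperCase().padStart(4, '0'))
    .join('-');
}

// Family emoji written with escapes so the joiners are visible.
const family = '\u{1F468}\u200D\u{1F469}\u200D\u{1F467}\u200D\u{1F466}';

console.log(unicodeToHex(family)); // 1F468-1F469-1F467-1F466
console.log(unicodeToHex(family, false)); // 1F468-200D-1F469-200D-1F467-200D-1F466
```

`Array.from` iterates a string by code point rather than by UTF-16 code unit, which is what keeps surrogate pairs together.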
Filesizes
The filesizes of all datasets and regex patterns can be found below, in ascending order.
File | Filesize | Gzipped |
---|---|---|
regex/shortname.js | 30 B | 50 B |
regex/text.js | 1.11 KB | 476 B |
regex/es/text.js | 1.28 KB | 492 B |
regex/emoji.js | 5.75 KB | 1.47 KB |
regex/index.js | 5.77 KB | 1.48 KB |
regex/es/emoji.js | 6.37 KB | 1.5 KB |
regex/es/index.js | 6.38 KB | 1.51 KB |
data/extra/unicode.json | 26.63 KB | 6.4 KB |
data/extra/hexcodes.json | 28.63 KB | 5.85 KB |
data/extra/shortnames.json | 38.26 KB | 9.16 KB |
data/extra/shortname-to-unicode.json | 64.89 KB | 15.72 KB |
data/extra/hexcode-to-shortname.json | 66.9 KB | 15.55 KB |
data/compact/map.json | 149.52 KB | 24.29 KB |
data/compact/list.json | 172.85 KB | 24.02 KB |
data/compact/by-category.json | 172.95 KB | 24.07 KB |
data/standard/map.json | 317.57 KB | 45.19 KB |
data/standard/list.json | 340.9 KB | 45.65 KB |
data/standard/by-category.json | 341 KB | 45.64 KB |
data/expanded/by-category.json | 553.12 KB | 73.12 KB |
data/expanded/map.json | 576.93 KB | 74.43 KB |
data/expanded/list.json | 600.26 KB | 74.03 KB |