👪 🗾 🤵 🐯 Base Emoji 🦧 🥅 🔝 🚏
There is base32, there is base64, now there is base-emoji!
Installation
Install base-emoji as a CLI executable using npm:
npm install -g @oktupol/base-emoji
or as a library inside your JavaScript or TypeScript project:
npm install @oktupol/base-emoji
Usage
CLI
- Encode data from stdin:
echo 'Hello World' | base-emoji
==> 🐅🚓📿🙉🤍🐝🕎🚥🌿🤛🕓
- Decode with the -d flag:
echo '🐎🍻🪖🦭🍃🍻🪶🦈🍆🌗👩🍶🕗' | base-emoji -d
==> I like emojis
- Encode or decode data from a file:
cat.jpg - 2009, Michael Wilson CC BY-NC-ND 2.0
base-emoji cat.jpg
==>
➿🌾📛🤹🤜😡🗻🦕😀😆📖🤹💅😀😀🙂😀🤪🍙🤹😘😀😀😃😀😀🤣🍶😀😀😀😀
😀😀😀🤾🪣🙂🍃😻🧇📺🕎🧾🧇🥻😇🎷👨😁🥄🚇🐪😟🤹😀😀😀🤑🦝😅🍑😀📿
🤘💋👗🤹😀🤨...
cat.jpg.emoji - full output of above command
- Direct the output of any command into a file:
base-emoji -d dog.jpg.emoji > dog.jpg
- When encoding, optionally use the -a flag to armor the output:
base-emoji -a some-document.pdf
==>
🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔢💝🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵
🏦👭🪛👞🤥🍑⏳😀😀🤴🚎😲🦥😀😀🍀😀😀🤙🥃🤪😀😀🏃🧪🚿💾😀😀😦👮🚇
...
🔃😀😀😀🦄😫🪛🦶👪🥃🖤🕓
🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔢💔🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵
- When encoding with armor, optionally use the --descriptor option to specify a descriptor:
gpg --export-secret-key my@email.tld | base-emoji -a --descriptor '🤫🔑🙊'
==>
🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🤫🔑🙊💝🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵
🕧💦🦲👍🕞🧏🪝🫕📤🥯🦭🥬🚸🪦🍇🪶🍯🐸🥊➖🐧➿🪠🎁🪥🥌🐝🔙🍦🧂🕞🐴
...
🚣🚶💒🦔🦃👂🎱😒🌱⛅🌵🕓
🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🤫🔑🙊💔🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵🔵
- For a complete list of available options, run:
base-emoji --help
Inside a Node project
The base-emoji library can be imported using
CommonJS:
const { BaseEmoji } = require('@oktupol/base-emoji');
ES6, TypeScript:
import { BaseEmoji } from '@oktupol/base-emoji';
There are two functions:
BaseEmoji.encode()
Usage:
const result = BaseEmoji.encode(data, options);
Parameters:
- data (required): any of:
  - a string
  - an ArrayBufferLike (e.g. ArrayBuffer, Uint8Array)
- options (optional): an object with the following structure; all keys are optional:
  { armor?: boolean; armorDescriptor?: boolean; wrap?: number; }
  - armor: if true, the resulting output will be armored.
  - armorDescriptor: when armored, the value will be used in the header and footer of the output.
  - wrap: if provided, wrap after n characters.
BaseEmoji.decode()
Usage:
const result = BaseEmoji.decode(data, options);
Parameters:
- data (required): a base-emoji encoded string
- options (optional): an object with the following structure; all keys are optional:
  { output: 'string' | 'binary' }
  - output: return the output as a String if 'string', or as a Uint8Array if 'binary'.
How does it work
The principle is identical to that of base64. In base64, the data bits are regrouped from their original 8-bit bytes into 6-tuples, each of which can take one of 64 values, and each 6-tuple is then represented by one ASCII character.
bytes | 104 = h | 105 = i | 33 = ! | ...
DATA |0 1 1 0 1 0.0 0'0 1 1 0.1 0 0 1'0 0.1 0 0 0 0 1| ...
base64 | 26 = a | 6 = G | 36 = k | 33 = h | ...
Therefore, the base64 representation of hi! is aGkh.
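The regrouping shown in the diagram can be reproduced in a few lines of JavaScript. This is only a sketch: the helper name toSixTuples is made up for illustration, and it handles only inputs whose bit length is a multiple of 6 (such as 3 bytes), so base64's own padding is left out. Node's built-in Buffer is used as a cross-check.

```javascript
// Standard base64 alphabet: index 0..63 -> character.
const BASE64_ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Hypothetical helper: split bytes into 6-bit values.
// Only valid when the total bit count is a multiple of 6.
function toSixTuples(bytes) {
  // Write every byte as 8 binary digits and join them into one bit string.
  const bits = Array.from(bytes, (b) => b.toString(2).padStart(8, '0')).join('');
  const tuples = [];
  for (let i = 0; i < bits.length; i += 6) {
    tuples.push(parseInt(bits.slice(i, i + 6), 2));
  }
  return tuples;
}

const tuples = toSixTuples(Buffer.from('hi!', 'utf8'));
const encoded = tuples.map((t) => BASE64_ALPHABET[t]).join('');

console.log(tuples);  // [ 26, 6, 36, 33 ]
console.log(encoded); // aGkh
console.log(encoded === Buffer.from('hi!').toString('base64')); // true
```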
In base-emoji, 1024 different symbols are used for representing 10-tuples.
bytes | 104 = h | 105 = i | 33 = ! | ...
DATA |0 1 1 0 1 0 0 0'0 1.1 0 1 0 0 1'0 0 1 0.0 0 0 1'0 0 0 0 0 0.0 ...
base-emoji | 417 = 🍒 | 658 = 🌒 | 64 = 😟 | ...
The complete list of emojis is located in emoji-map.json
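The same regrouping with 10-bit tuples might look like the sketch below. The function toTenTuples is hypothetical and not part of the library's API; it returns the numeric values only, which would then be looked up in emoji-map.json. It assumes, as the diagram suggests, that the last tuple is filled up with zero bits.

```javascript
// Hypothetical helper: regroup bytes into 10-bit values with a bit accumulator.
// If the input ends mid-tuple, the last tuple is zero-filled on the right.
function toTenTuples(bytes) {
  const tuples = [];
  let acc = 0;     // pending bits, stored as an integer
  let accBits = 0; // number of pending bits in acc
  for (const byte of bytes) {
    acc = (acc << 8) | byte;
    accBits += 8;
    while (accBits >= 10) {
      accBits -= 10;
      tuples.push((acc >> accBits) & 0x3ff); // take the top 10 pending bits
      acc &= (1 << accBits) - 1;             // keep only the remaining bits
    }
  }
  if (accBits > 0) {
    tuples.push((acc << (10 - accBits)) & 0x3ff);
  }
  return tuples;
}

const tenTuples = toTenTuples(Buffer.from('hi!', 'utf8'));
console.log(tenTuples); // [ 417, 658, 64 ]
```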
Padding
Since 10 is not a multiple of 8, base-emoji encoded data usually
contains a few more bits at the end than the original data. In the
example above, the base-emoji encoded representation of the string
hi! has 6 bits overhanging. This is important to know
especially once there is an overhang of 8 bits, because it would
otherwise be ambiguous whether the last 8 bits are a byte of the original
data or not.
To indicate the length of the overhang, one of the following symbols is appended to the end of the base-emoji encoded string:
Padding character | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Bits of overhang | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
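The padding character for 0 bits of overhang is optional, and the characters for 1, 3, 5, 7 and 9 bits can't realistically occur: whole bytes always contribute a multiple of 8 bits and symbols a multiple of 10, so the overhang is always even.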
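The overhang for a given input length follows from simple arithmetic. A minimal sketch, where the function name overhangBits is made up for illustration:

```javascript
// Hypothetical helper: number of surplus bits when `byteLength` bytes
// are packed into 10-bit symbols.
function overhangBits(byteLength) {
  const dataBits = byteLength * 8;
  const symbols = Math.ceil(dataBits / 10);
  return symbols * 10 - dataBits;
}

console.log(overhangBits(3)); // 6, the hi! example
console.log([1, 2, 3, 4, 5].map(overhangBits)); // [ 2, 4, 6, 8, 0 ]
```

The pattern repeats every 5 bytes (40 bits, exactly 4 symbols), which is why only even values show up.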
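In the example above, there are six bits of overhang, so the base-emoji
representation of hi! consists of the three symbols 🍒🌒😟 followed by the padding character for six bits. The CLI examples show the same mechanism: the encoded Hello World (12 bytes including the trailing newline) ends in 🕓 for 4 bits of overhang, and the input decoded with -d ends in 🕗 for 8 bits.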
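To illustrate why a decoder needs to know the overhang, here is a sketch of the reverse direction. The function fromTenTuples is hypothetical and works on the numeric 10-bit values rather than on emojis, so it does not reflect how the library itself is implemented.

```javascript
// Hypothetical helper: rebuild bytes from 10-bit values,
// discarding `overhang` surplus bits at the end.
function fromTenTuples(tuples, overhang) {
  const bits = tuples.map((t) => t.toString(2).padStart(10, '0')).join('');
  const dataBits = bits.slice(0, bits.length - overhang);
  const bytes = [];
  for (let i = 0; i + 8 <= dataBits.length; i += 8) {
    bytes.push(parseInt(dataBits.slice(i, i + 8), 2));
  }
  return Buffer.from(bytes);
}

console.log(fromTenTuples([417, 658, 64], 6).toString('utf8')); // hi!

// Four symbols carry 40 bits: that is either 4 bytes plus 8 surplus bits,
// or exactly 5 bytes. Only the stated overhang tells the two apart.
console.log(fromTenTuples([417, 658, 64, 0], 8).length); // 4
console.log(fromTenTuples([417, 658, 64, 0], 0).length); // 5
```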
Efficiency
All that being said, base-emoji is horribly inefficient at encoding data.
In base64, where every 6-tuple of bits is encoded as one single-byte ASCII character, the encoded data size is 4/3 times the original data size, i.e. around 33.3% larger.
In base-emoji, we use 1024 symbols to encode 10-tuples. However, these 1024 symbols are Unicode characters! An exact number can't be given because Unicode characters vary in size, but a quick test with 1000 random bytes showed a roughly threefold increase.
head -c 1000 /dev/urandom | base64 | wc -c
==> 1354
head -c 1000 /dev/urandom | base32 | wc -c
==> 1622
head -c 1000 /dev/urandom | base-emoji | wc -c
==> about 3175
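The measured figure can be sanity-checked with a rough upper bound. The sketch below assumes that every symbol is a single code point of at most 4 bytes in UTF-8, and it ignores the padding character and any line breaks; maxEncodedBytes is a made-up helper name.

```javascript
// UTF-8 size of individual symbols taken from the examples in this README.
const utf8Bytes = (s) => Buffer.byteLength(s, 'utf8');
console.log(utf8Bytes('🍒')); // 4
console.log(utf8Bytes('➿')); // 3

// Hypothetical helper: upper bound for the encoded size in bytes,
// assuming at most 4 UTF-8 bytes per 10-bit symbol.
function maxEncodedBytes(byteLength) {
  return Math.ceil((byteLength * 8) / 10) * 4;
}

console.log(maxEncodedBytes(1000)); // 3200
```

Because some symbols take only 3 bytes, the actual size lands a little below this bound, which fits the roughly 3175 bytes measured above.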