regexp-tree-cli
Regular expressions processor in JavaScript
The regexp-tree-cli
is a command line utility fo the regexp-tree package.
Table of Contents
- Installation
- Development
- Usage as a CLI
- Parsing and printing AST
- Capturing locations
- Using optimizer API
- Using compat-transpiler API
- Printing NFA/DFA tables
Installation
The CLI can be installed as an npm module:
npm install -g regexp-tree-cli
regexp-tree-cli --help
Development
- Fork https://github.com/DmitrySoshnikov/regexp-tree-cli repo
- If there is an actual issue from the issues list you'd like to work on, feel free to assign it yourself, or comment on it to avoid collisions (open a new issue if needed)
- Make your changes
- Submit a PR
Usage as a CLI
Check the options available from CLI:
regexp-tree-cli --help
Usage: regexp-tree-cli [options]
Options:
--help, -h Show help [boolean]
--version, -v Show version number [boolean]
--expression, -e A regular expression to be parsed [required]
--loc, -l Whether to capture AST node locations
--optimize, -o Apply optimizer on the passed expression
--compat, -c Apply compat-transpiler on the passed expression
--table, -t Print NFA/DFA transition tables (nfa/dfa/all)
[choices: "nfa", "dfa", "all"]
Parsing and printing AST
To parse a regular expression, pass -e
option:
regexp-tree-cli -e '/a|b/i'
Which produces an AST node corresponding to this regular expression:
{
type: 'RegExp',
body: {
type: 'Disjunction',
left: {
type: 'Char',
value: 'a',
symbol: 'a',
kind: 'simple',
codePoint: 97
},
right: {
type: 'Char',
value: 'b',
symbol: 'b',
kind: 'simple',
codePoint: 98
}
},
flags: 'i',
}
NOTE: the format of a regexp is
/ Body / OptionalFlags
.
Capturing locations
For source code transformation tools it might be useful also to capture locations of the AST nodes. From the command line it's controlled via the -l
option:
regexp-tree-cli -e '/ab/' -l
This attaches loc
object to each AST node:
{
type: 'RegExp',
body: {
type: 'Alternative',
expressions: [
{
type: 'Char',
value: 'a',
symbol: 'a',
kind: 'simple',
codePoint: 97,
loc: {
start: {
line: 1,
column: 1,
offset: 1,
},
end: {
line: 1,
column: 2,
offset: 2,
},
}
},
{
type: 'Char',
value: 'b',
symbol: 'b',
kind: 'simple',
codePoint: 98,
loc: {
start: {
line: 1,
column: 2,
offset: 2,
},
end: {
line: 1,
column: 3,
offset: 3,
},
}
}
],
loc: {
start: {
line: 1,
column: 1,
offset: 1,
},
end: {
line: 1,
column: 3,
offset: 3,
},
}
},
flags: '',
loc: {
start: {
line: 1,
column: 0,
offset: 0,
},
end: {
line: 1,
column: 4,
offset: 4,
},
}
}
Using optimizer API
Optimizer transforms your regexp into an optimized version, replacing some sub-expressions with their idiomatic patterns. This might be good for different kinds of minifiers, as well as for regexp machines.
NOTE: the Optimizer is implemented as a set of regexp-tree plugins.
From CLI the optimizer is available via --optimize
(-o
) option:
regexp-tree-cli -e '/[a-zA-Z_0-9][A-Z_\da-z]*\e{1,}/' -o
Result:
Optimized: /\w+e+/
See the optimizer README for more details.
Using compat-transpiler API
The compat-transpiler module translates your regexp in new format or in new syntax, into an equivalent regexp in a legacy representation, so it can be used in engines which don't yet implement the new syntax.
NOTE: the compat-transpiler is implemented as a set of regexp-tree plugins.
Example, "dotAll" s
flag:
/./s
Is translated into:
/[\0-\uFFFF]/
/(?<value>a)\k<value>\1/
Becomes:
/(a)\1\1/
From CLI the compat-transpiler is available via --compat
(-c
) option:
regexp-tree-cli -e '/(?<all>.)\k<all>/s' -c
Result:
Compat: /([\0-\uFFFF])\1/
Printing NFA/DFA tables
The --table
option allows displaying NFA/DFA transition tables. RegExp Tree also applies DFA minimization (using N-equivalence algorithm), and produces the minimal transition table as its final result.
In the example below for the /a|b|c/
regexp, we first obtain the NFA transition table, which is further converted to the original DFA transition table (down from the 10 non-deterministic states to 4 deterministic states), and eventually minimized to the final DFA table (from 4 to only 2 states).
./bin/regexp-tree-cli -e '/a|b|c/' --table all
Result:
> - starting
✓ - accepting
NFA transition table:
┌─────┬───┬───┬────┬─────────────┐
│ │ a │ b │ c │ ε* │
├─────┼───┼───┼────┼─────────────┤
│ 1 > │ │ │ │ {1,2,3,7,9} │
├─────┼───┼───┼────┼─────────────┤
│ 2 │ │ │ │ {2,3,7} │
├─────┼───┼───┼────┼─────────────┤
│ 3 │ 4 │ │ │ 3 │
├─────┼───┼───┼────┼─────────────┤
│ 4 │ │ │ │ {4,5,6} │
├─────┼───┼───┼────┼─────────────┤
│ 5 │ │ │ │ {5,6} │
├─────┼───┼───┼────┼─────────────┤
│ 6 ✓ │ │ │ │ 6 │
├─────┼───┼───┼────┼─────────────┤
│ 7 │ │ 8 │ │ 7 │
├─────┼───┼───┼────┼─────────────┤
│ 8 │ │ │ │ {8,5,6} │
├─────┼───┼───┼────┼─────────────┤
│ 9 │ │ │ 10 │ 9 │
├─────┼───┼───┼────┼─────────────┤
│ 10 │ │ │ │ {10,6} │
└─────┴───┴───┴────┴─────────────┘
DFA: Original transition table:
┌─────┬───┬───┬───┐
│ │ a │ b │ c │
├─────┼───┼───┼───┤
│ 1 > │ 4 │ 3 │ 2 │
├─────┼───┼───┼───┤
│ 2 ✓ │ │ │ │
├─────┼───┼───┼───┤
│ 3 ✓ │ │ │ │
├─────┼───┼───┼───┤
│ 4 ✓ │ │ │ │
└─────┴───┴───┴───┘
DFA: Minimized transition table:
┌─────┬───┬───┬───┐
│ │ a │ b │ c │
├─────┼───┼───┼───┤
│ 1 > │ 2 │ 2 │ 2 │
├─────┼───┼───┼───┤
│ 2 ✓ │ │ │ │
└─────┴───┴───┴───┘