cspell-grammar
TypeScript icon, indicating that this package has built-in type declarations

8.7.0 • Public • Published

cspell-grammar

CSpell Grammar is used to generate a parser. The Parser is used to add context / scope to parts of a document, making it easier to define the parts to spell spell checked.

This is to address the issues and limitations related to ignoreRegExpList and includeRegExpList.

The parser is use to add scope to sections of a document. The scope can then be used to apply spell checking rules.

Example: Only check comments and strings

rules:
  '*': false
  comment: true
  string: true

It can be even more powerful like controlling the language settings based upon scope.

rules:
  comment:
    language: en
  string:
    language: en,fr
    dictionaries: ['marketing-terms'],
    caseSensitive: true
  string.javascript:
    caseSensitive: false

Rules are applied in the order they match the scope of the text.

When checking JavaScript files with the above example rules:

  • strings will:
    • use the locale en,fr
    • the marketing-terms dictionary will be enabled
    • caseSensitive will be true
  • everything else:
    • locale: en
    • caseSensitive will be false

At its core, cspell-grammar uses a simplified form of the TextMate grammar.

Reasoning

Why use a grammar parser? Couldn't a colorizer / highlighter or a language AST be used? At one level, needs of the spell checker are simpler and different from colorizers or language AST parsers. The goal of a spell checker is to spell check relevant text. The spell check does not need to care about the syntactical correctness of a document or presentation.

The goal of a grammar parser for the spell checker is to allow the user to decide:

  1. What text should be checked.
  2. Which dictionaries (or languages) should be used.
  3. Are accents and case important

Note: CSpell is a pure JavaScript application, so including the Oniguruma is not an option.

Considerations

  • Parsing a document should be fast - meaning the grammar should be as simple as possible to meet the needs of the spell checker and not focus on scope detail. This is where a colorizer grammar is not a good fit to be used.
  • AST's are a bit of an overkill for a spell checker. They provide too much detail while not bringing much benefit from the detail.

Transformation

Consider the following bit of LaTeX:

k\"{o}nnen
können

For the spell checker to work correctly, the \"{o} should be transformed into ö before it is checked against the German dictionary.

This creates a few challenges.

Possible options:

  1. Simple whole document substitution
    • Challenges
      • It is not context aware and might replace the wrong text.
      • It changes the location of the words and messes up issue reporting (some sort of Map would be needed to get the correct line / character offset).
  • Advantages
    • Easy to implement except for the context and mapping.
  1. Scope level substitution Transformations occur at the scope level.
    • Challenges
      • offset mapping is still and issue (maybe)
      • need a way to merge text with adjacent scopes after transformation
    • Advantages
      • it is context aware

Readme

Keywords

Package Sidebar

Install

npm i cspell-grammar

Weekly Downloads

350,237

Version

8.7.0

License

MIT

Unpacked Size

73.3 kB

Total Files

62

Last publish

Collaborators

  • jason-dent