parse-dutch

Natural language parser, for the Dutch language, that produces nlcst.

Contents

- What is this?
- When should I use this?
- Install
- Use
- API
  - `ParseDutch()`
- Algorithm
- Types
- Compatibility
- Security
- Related
- Contribute
- License

What is this?

This package exposes a parser that takes Dutch natural language and produces a syntax tree.

When should I use this?

If you want to handle Dutch natural language as syntax trees manually, use this.

Alternatively, you can use the retext plugin retext-dutch, which wraps this project so you can work with natural language at a higher-level (easier) abstraction.

For English or most Latin-script languages, you can instead use parse-english or parse-latin.

Install

This package is ESM only. In Node.js (version 16.0+), install with npm:

npm install parse-dutch

In Deno with esm.sh:

import {ParseDutch} from 'https://esm.sh/parse-dutch@7'

In browsers with esm.sh:

<script type="module">
  import {ParseDutch} from 'https://esm.sh/parse-dutch@7?bundle'
</script>

Use

import {inspect} from 'unist-util-inspect'
import {ParseDutch} from 'parse-dutch'

const tree = new ParseDutch().parse(
  'Kunt U zich ’s morgens melden bij het afd. hoofd dhr. Venema?'
)

console.log(inspect(tree))

Yields:

RootNode[1] (1:1-1:62, 0-61)
└─0 ParagraphNode[1] (1:1-1:62, 0-61)
    └─0 SentenceNode[24] (1:1-1:62, 0-61)
        ├─0  WordNode[1] (1:1-1:5, 0-4)
        │    └─0 TextNode "Kunt" (1:1-1:5, 0-4)
        ├─1  WhiteSpaceNode " " (1:5-1:6, 4-5)
        ├─2  WordNode[1] (1:6-1:7, 5-6)
        │    └─0 TextNode "U" (1:6-1:7, 5-6)
        ├─3  WhiteSpaceNode " " (1:7-1:8, 6-7)
        ├─4  WordNode[1] (1:8-1:12, 7-11)
        │    └─0 TextNode "zich" (1:8-1:12, 7-11)
        ├─5  WhiteSpaceNode " " (1:12-1:13, 11-12)
        ├─6  WordNode[2] (1:13-1:15, 12-14)
        │    ├─0 PunctuationNode "’" (1:13-1:14, 12-13)
        │    └─1 TextNode "s" (1:14-1:15, 13-14)
        ├─7  WhiteSpaceNode " " (1:15-1:16, 14-15)
        ├─8  WordNode[1] (1:16-1:23, 15-22)
        │    └─0 TextNode "morgens" (1:16-1:23, 15-22)
        ├─9  WhiteSpaceNode " " (1:23-1:24, 22-23)
        ├─10 WordNode[1] (1:24-1:30, 23-29)
        │    └─0 TextNode "melden" (1:24-1:30, 23-29)
        ├─11 WhiteSpaceNode " " (1:30-1:31, 29-30)
        ├─12 WordNode[1] (1:31-1:34, 30-33)
        │    └─0 TextNode "bij" (1:31-1:34, 30-33)
        ├─13 WhiteSpaceNode " " (1:34-1:35, 33-34)
        ├─14 WordNode[1] (1:35-1:38, 34-37)
        │    └─0 TextNode "het" (1:35-1:38, 34-37)
        ├─15 WhiteSpaceNode " " (1:38-1:39, 37-38)
        ├─16 WordNode[2] (1:39-1:43, 38-42)
        │    ├─0 TextNode "afd" (1:39-1:42, 38-41)
        │    └─1 PunctuationNode "." (1:42-1:43, 41-42)
        ├─17 WhiteSpaceNode " " (1:43-1:44, 42-43)
        ├─18 WordNode[1] (1:44-1:49, 43-48)
        │    └─0 TextNode "hoofd" (1:44-1:49, 43-48)
        ├─19 WhiteSpaceNode " " (1:49-1:50, 48-49)
        ├─20 WordNode[2] (1:50-1:54, 49-53)
        │    ├─0 TextNode "dhr" (1:50-1:53, 49-52)
        │    └─1 PunctuationNode "." (1:53-1:54, 52-53)
        ├─21 WhiteSpaceNode " " (1:54-1:55, 53-54)
        ├─22 WordNode[1] (1:55-1:61, 54-60)
        │    └─0 TextNode "Venema" (1:55-1:61, 54-60)
        └─23 PunctuationNode "?" (1:61-1:62, 60-61)
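
The tree above is made of plain objects, so it can be walked with ordinary JavaScript. The following is a minimal sketch of that, not part of this package: `fragment`, `textOf`, and `collectWords` are hypothetical names, and the fragment is a hand-written copy of the nodes shown above for "’s morgens" with positions left out. It assumes parent nodes carry `children` and leaf nodes carry `value`, as in the output above.

```javascript
// Hand-written fragment in the shape shown above for "’s morgens"
// (hypothetical test data; positional info omitted for brevity).
const fragment = {
  type: 'SentenceNode',
  children: [
    {
      type: 'WordNode',
      children: [
        {type: 'PunctuationNode', value: '’'},
        {type: 'TextNode', value: 's'}
      ]
    },
    {type: 'WhiteSpaceNode', value: ' '},
    {type: 'WordNode', children: [{type: 'TextNode', value: 'morgens'}]}
  ]
}

/** Concatenate the `value` of every leaf at or below `node`. */
function textOf(node) {
  return 'children' in node ? node.children.map(textOf).join('') : node.value
}

/** Collect the text of every `WordNode` in the tree, in document order. */
function collectWords(node, words = []) {
  if (node.type === 'WordNode') {
    // A word can hold several children, such as the elision mark plus "s".
    words.push(textOf(node))
  } else if ('children' in node) {
    for (const child of node.children) collectWords(child, words)
  }
  return words
}

// Logs the two words: ’s and morgens
console.log(collectWords(fragment))
```

The same functions should work on the `tree` returned by `new ParseDutch().parse(…)` in the example above. For real projects, utilities such as nlcst-to-string and unist-util-visit cover this ground more thoroughly.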

API

This package exports the identifier ParseDutch. There is no default export.

`ParseDutch()`

Create a new parser.

ParseDutch extends ParseLatin. See parse-latin for API docs.

Algorithm

Everything parse-latin does is included, plus the following support for the Dutch natural language:

- unit and time abbreviations (gr., sec., min., ma., vr., vrij., febr., mrt., and more)
- lots of other abbreviations (Mr., Mv., Sr., Em., bijv., zgn., amb., and more)
- common elision (omission of letters) (d’, ’n, ’ns, ’t, ’s, ’er, ’em, ’ie, and more)
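
To see why abbreviation support matters, consider what happens when a dot is always treated as the end of a sentence. The sketch below is a deliberately naive illustration and is not how parse-dutch or parse-latin are implemented: `splitSentences` and the `abbreviations` set are hypothetical, and the set holds only a handful of the abbreviations listed above.

```javascript
// Hypothetical, heavily simplified illustration; NOT the package's algorithm.
// A few of the abbreviations named above, lowercased and without the final dot.
const abbreviations = new Set(['afd', 'dhr', 'bijv', 'zgn', 'min', 'sec'])

/**
 * Split `text` at `.`, `?`, or `!` when followed by whitespace or the end
 * of the input, unless the dot closes a word listed in `known`.
 */
function splitSentences(text, known = abbreviations) {
  const sentences = []
  const boundary = /[.?!](?=\s+|$)/g
  let start = 0
  let match

  while ((match = boundary.exec(text))) {
    if (match[0] === '.') {
      // The word directly before the dot.
      const before = text
        .slice(start, match.index)
        .split(/\s+/)
        .pop()
        .toLowerCase()
      // A known abbreviation does not end the sentence.
      if (known.has(before)) continue
    }

    const end = match.index + 1
    const sentence = text.slice(start, end).trim()
    if (sentence) sentences.push(sentence)
    start = end
  }

  // Keep any trailing text that has no terminal marker.
  const rest = text.slice(start).trim()
  if (rest) sentences.push(rest)
  return sentences
}

const sample = 'Kunt U zich melden bij het afd. hoofd dhr. Venema? Dank u.'

// With the abbreviation list: 2 sentences.
console.log(splitSentences(sample))
// Without it: 4 fragments, broken after "afd." and "dhr."
console.log(splitSentences(sample, new Set()))
```

The real parser works on a syntax tree rather than on strings, but the effect is comparable: in the example under Use, "afd." and "dhr." stay inside a single SentenceNode.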

Types

This package is fully typed with TypeScript. It exports no additional types.

Compatibility

Projects maintained by me are compatible with maintained versions of Node.js.

When I cut a new major release, I drop support for unmaintained versions of Node.js. This means I try to keep the current release line, parse-dutch@^7, compatible with Node.js 16.

Security

This package is safe.

Related

- parse-latin: Latin-script natural language parser
- parse-english: English natural language parser

Contribute

Yes please! See How to Contribute to Open Source.
