wink-ner
Language agnostic named entity recognizer
Recognize named entities in a sentence using wink-ner. It is a smart Gazetteer-based Named Entity Recognizer (NER) that can be easily trained to suit specific needs. For example, wink-ner can differentiate between Manchester United and Manchester in a single sentence and tag them as a club and a city respectively.
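To make the gazetteer idea concrete, here is a minimal sketch of one way a lookup could prefer the longer entry, so that "Manchester United" is not split into a city followed by an ordinary word. It is an illustration under stated assumptions, not wink-ner's actual implementation: the `gazetteer` object, the `maxWords` limit and the `recognizeSketch()` function are all hypothetical names introduced for this example.

```javascript
// Illustrative sketch only: NOT wink-ner's implementation.
// Greedy longest-match lookup of word sequences in a small gazetteer.

// Hypothetical gazetteer, keyed by lower-cased, space-joined entity text.
var gazetteer = {
  'manchester united': { entityType: 'club', uid: 'manu' },
  'manchester': { entityType: 'city', uid: 'manchester' }
};

// Longest entry length in words; bounds the look-ahead window.
var maxWords = 2;

function recognizeSketch( words ) {
  var result = [];
  var i = 0;
  while ( i < words.length ) {
    var matched = false;
    // Try the longest candidate first so the two-word entry wins.
    for ( var n = Math.min( maxWords, words.length - i ); n >= 1; n -= 1 ) {
      var seq = words.slice( i, i + n );
      var key = seq.join( ' ' ).toLowerCase();
      if ( Object.prototype.hasOwnProperty.call( gazetteer, key ) ) {
        result.push( {
          value: key,
          entityType: gazetteer[ key ].entityType,
          uid: gazetteer[ key ].uid,
          originalSeq: seq
        } );
        i += n;
        matched = true;
        break;
      }
    }
    if ( !matched ) {
      // No gazetteer entry starts here; keep the word as a plain token.
      result.push( { value: words[ i ] } );
      i += 1;
    }
  }
  return result;
}

var sketchOutput = recognizeSketch(
  [ 'Manchester', 'United', 'is', 'based', 'in', 'Manchester' ]
);
console.log( sketchOutput );
```

In this sketch the first two words are merged into a single club entity, while the final "Manchester" is tagged as a city because no longer entry matches at that position.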
Installation
Use npm to install:
npm install wink-ner --save
Getting Started
Named Entity Recognition
// Load wink ner.
var ner = require( 'wink-ner' );
// Create your instance of wink ner & use default config.
var myNER = ner();
// Define training data.
var trainingData = [
  { text: 'manchester united', entityType: 'club', uid: 'manu' },
  { text: 'manchester', entityType: 'city' },
  { text: 'U K', entityType: 'country', uid: 'uk' }
];
// Learn from the training data.
myNER.learn( trainingData );
// Since recognize() requires tokens, use wink-tokenizer.
var winkTokenizer = require( 'wink-tokenizer' );
// Instantiate it and extract tokenize() api.
var tokenize = winkTokenizer().tokenize;
// Tokenize the sentence.
var tokens = tokenize( 'Manchester United is a football club based in Manchester, U. K.' );
// Simply Detect entities!
tokens = myNER.recognize( tokens );
console.log( tokens );
// -> [
//      { entityType: 'club', uid: 'manu', originalSeq: [ 'Manchester', 'United' ],
//        value: 'manchester united', tag: 'word' },
//      { value: 'is', tag: 'word' },
//      { value: 'a', tag: 'word' },
//      { value: 'football', tag: 'word' },
//      { value: 'club', tag: 'word' },
//      { value: 'based', tag: 'word' },
//      { value: 'in', tag: 'word' },
//      { entityType: 'city', value: 'Manchester', tag: 'word',
//        originalSeq: [ 'Manchester' ], uid: 'manchester' },
//      { value: ',', tag: 'punctuation' },
//      { entityType: 'country', uid: 'uk', originalSeq: [ 'U', '.', 'K' ],
//        value: 'u k', tag: 'word' },
//      { value: '.', tag: 'punctuation' }
//    ]
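Because recognize() returns entities and ordinary tokens in one array, it is often handy to pull out just the entities. The sketch below does this on a few tokens copied from the output shown above (abridged), so it runs without any packages installed. The `extractEntities()` helper is a hypothetical name for this example and is not part of wink-ner; it assumes only that entity tokens carry `entityType`, `uid` and `originalSeq`, as in that output.

```javascript
// Sample tokens copied from the documented recognize() output (abridged).
var recognizedTokens = [
  { entityType: 'club', uid: 'manu', originalSeq: [ 'Manchester', 'United' ],
    value: 'manchester united', tag: 'word' },
  { value: 'is', tag: 'word' },
  { value: 'based', tag: 'word' },
  { value: 'in', tag: 'word' },
  { entityType: 'city', value: 'Manchester', tag: 'word',
    originalSeq: [ 'Manchester' ], uid: 'manchester' }
];

// Hypothetical helper (not a wink-ner api): keep only tokens that were
// tagged as entities and rebuild their surface text from originalSeq.
function extractEntities( tokens ) {
  return tokens
    .filter( function ( t ) {
      return typeof t.entityType === 'string';
    } )
    .map( function ( t ) {
      return {
        uid: t.uid,
        entityType: t.entityType,
        text: t.originalSeq.join( ' ' )
      };
    } );
}

var entities = extractEntities( recognizedTokens );
console.log( entities );
// -> [ { uid: 'manu', entityType: 'club', text: 'Manchester United' },
//      { uid: 'manchester', entityType: 'city', text: 'Manchester' } ]
```

Joining `originalSeq` with a space is a simplification; for sequences that include punctuation tokens, such as [ 'U', '.', 'K' ], you may prefer a different way of rebuilding the text.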
Integration with POS Tagging
The tokens returned from recognize() may be passed on to the tag() api of wink-pos-tagger for pos tagging.
To assign a specific pos tag to an entity, include a pos property in the entity definition and set it to the desired pos tag (e.g. 'NNP'); wink-pos-tagger will then apply that tag automatically. For details, please refer to the learn() api of wink-ner.
// Load pos tagger.
var tagger = require( 'wink-pos-tagger' );
// Instantiate it and extract tag api.
var tag = tagger().tag;
tokens = tag( tokens );
console.log( tokens );
// -> [
//      { entityType: 'club', uid: 'manu', originalSeq: [ 'Manchester', 'United' ],
//        value: 'manchester united', tag: 'word', normal: 'manchester united', pos: 'NNP' },
//      { value: 'is', tag: 'word', normal: 'is', pos: 'VBZ', lemma: 'be' },
//      { value: 'a', tag: 'word', normal: 'a', pos: 'DT' },
//      { value: 'football', tag: 'word', normal: 'football', pos: 'NN', lemma: 'football' },
//      { value: 'club', tag: 'word', normal: 'club', pos: 'NN', lemma: 'club' },
//      { value: 'based', tag: 'word', normal: 'based', pos: 'VBN', lemma: 'base' },
//      { value: 'in', tag: 'word', normal: 'in', pos: 'IN' },
//      { value: 'Manchester', tag: 'word', originalSeq: [ 'Manchester' ],
//        uid: 'manchester', entityType: 'city', normal: 'manchester', pos: 'NNP' },
//      { value: ',', tag: 'punctuation', normal: ',', pos: ',' },
//      { entityType: 'country', uid: 'uk', originalSeq: [ 'U', '.', 'K' ],
//        value: 'u k', tag: 'word', normal: 'u k', pos: 'NNP' },
//      { value: '.', tag: 'punctuation', normal: '.', pos: '.' }
//    ]
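As a small illustration of the pos property described in this section, the entity definitions from Getting Started could carry an explicit tag as sketched below. Only the data structure is shown so that the snippet runs on its own; the variable name `trainingDataWithPos` is made up for this example, and the exact handling of pos should be confirmed against the learn() api documentation.

```javascript
// Entity definitions with an explicit pos tag on each entry.
// The pos property is optional; 'NNP' is the example tag used in this README.
var trainingDataWithPos = [
  { text: 'manchester united', entityType: 'club', uid: 'manu', pos: 'NNP' },
  { text: 'manchester', entityType: 'city', pos: 'NNP' },
  { text: 'U K', entityType: 'country', uid: 'uk', pos: 'NNP' }
];

// With wink-ner installed, these definitions would be learned in the same
// way as in Getting Started (left commented out to keep this self-contained):
// myNER.learn( trainingDataWithPos );

console.log( trainingDataWithPos );
```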
Documentation
Check out the named entity recognizer API documentation to learn more.
Need Help?
If you spot a bug that has not yet been reported, raise a new issue, or consider fixing it and sending a pull request.
About wink
Wink is a family of open source packages for Statistical Analysis, Natural Language Processing and Machine Learning in NodeJS. The code is thoroughly documented for easy human comprehension and has a test coverage of ~100% for reliability to build production grade solutions.
Copyright & License
wink-ner is copyright 2017-20 GRAYPE Systems Private Limited.
It is licensed under the terms of the MIT License.