@lolojs/htmlindexer

This is a library for indexing an HTML document: it extracts the unique tokens that are not stopwords and counts how often each one occurs.

To index a document, call IndexDocument and listen for the indexFinished event, which fires when indexing has completed. The extracted tokens are then available through the tokens property, a Map from each term to its frequency.


const HtmlIndexer = require('./htmlIndexer');
const indexer = new HtmlIndexer();

// Register the listener before starting, so the event cannot be missed.
indexer.on("indexFinished", () => {
    // tokens is a Map of term -> frequency
    for (const [term, freq] of indexer.tokens) {
        console.log(`Term : ${term}    Frequency : ${freq}`);
    }
});

indexer.IndexDocument("tests/test.html");
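Because tokens is a plain Map, ordinary Map and array operations are enough to rank the results. The snippet below is a minimal sketch of listing the most frequent terms; topTerms is a hypothetical helper that is not part of this package, and the sample Map is stand-in data used in place of indexer.tokens.

```javascript
// Hypothetical helper (not part of @lolojs/htmlindexer):
// return the `limit` most frequent terms from a Map of term -> frequency.
function topTerms(tokens, limit = 10) {
  return [...tokens.entries()]
    // Highest frequency first; ties are broken alphabetically by term.
    .sort(([termA, freqA], [termB, freqB]) => freqB - freqA || termA.localeCompare(termB))
    .slice(0, limit)
    .map(([term, freq]) => ({ term, freq }));
}

// Stand-in data; in practice pass indexer.tokens inside the "indexFinished" handler.
const sampleTokens = new Map([["html", 3], ["index", 5], ["token", 3]]);
console.log(topTerms(sampleTokens, 2));
```

The helper copies the entries before sorting, so the Map held by the indexer is left unchanged.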

You can also read the generated tokens as a stream by calling getOutPutStream and passing the chunk size, that is, the number of tokens per chunk. The output is JSON in the following format: { term: 'test', freq: 1, isFirstChunk: true, isLastChunk: true }


const stream = indexer.getOutPutStream(2);
stream.on('data', (data) => console.log(data));
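Records in the format shown above can be folded back into a Map as they arrive. The following is only a sketch under stated assumptions: addRecord is a hypothetical helper, fakeStream is a stand-in built with Node's Readable.from rather than the real output of getOutPutStream, and the helper accepts an object, a JSON string, or a Buffer because this README does not say which of these the stream emits.

```javascript
const { Readable } = require("node:stream");

// Hypothetical helper (not part of @lolojs/htmlindexer):
// add one { term, freq, ... } record to a Map of term -> frequency.
function addRecord(totals, record) {
  // Assumption: a record may arrive as a Buffer, a JSON string, or a parsed object.
  const raw = Buffer.isBuffer(record) ? record.toString("utf8") : record;
  const { term, freq } = typeof raw === "string" ? JSON.parse(raw) : raw;
  totals.set(term, (totals.get(term) || 0) + freq);
  return totals;
}

// Stand-in for indexer.getOutPutStream(2): an object-mode stream of records.
const fakeStream = Readable.from([
  { term: "test", freq: 1, isFirstChunk: true, isLastChunk: false },
  { term: "html", freq: 2, isFirstChunk: false, isLastChunk: true },
]);

const totals = new Map();
fakeStream.on("data", (record) => addRecord(totals, record));
fakeStream.on("end", () => console.log(totals));
```

With the real indexer, the same two handlers would be attached to the stream returned by getOutPutStream instead of fakeStream.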
