yan-crawler

0.2.1 • Public • Published

Overview

A simple module that polls websites at regular intervals and lets you extract whatever information you want from each response. Strictly speaking, it is not a crawler. If you need one, there are popular alternatives such as node-crawler.

Installation

npm install yan-crawler

Usage

var Crawler = require('yan-crawler').Crawler;
var crawler = Crawler.getInstance();

var amazonTemplate = {
    name: 'Amazon',
    url: 'https://www.amazon.com/',
    interval: 3000, // polling interval in milliseconds
    callback: function(body, $) {
        // $ is cheerio - https://github.com/cheeriojs/cheerio
        console.log('Grabbed Amazon.');
    }
};

var IMDBTemplate = {
    name: 'IMDB',
    interval: 2000,
    url: 'http://www.imdb.com',
    callback: function(body, $) {
        console.log('Grabbed IMDB.');
    }
};

crawler.addEntry(amazonTemplate);
crawler.addEntry(IMDBTemplate);
crawler.start();

The code above requests www.amazon.com every 3000 ms and www.imdb.com every 2000 ms, calling each entry's callback with the response body and cheerio when the result arrives.
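The callbacks above only log a message. As a minimal sketch of extracting something from the response, the entry below reads the page title. It assumes that `$` is a cheerio handle already loaded with the response body; if `$` turns out to be the cheerio module itself, call `$.load(body)` first. The names `extractTitle` and `titleTemplate` are hypothetical and not part of yan-crawler.

```javascript
// Hypothetical helper, not part of yan-crawler: returns the page title.
// Assumes `$` is a cheerio handle already loaded with the response body.
function extractTitle(body, $) {
    return $('title').first().text().trim();
}

// Entry template in the same shape as the examples above.
var titleTemplate = {
    name: 'IMDB title',
    url: 'http://www.imdb.com',
    interval: 2000, // polling interval in milliseconds
    callback: function(body, $) {
        console.log('Page title: ' + extractTitle(body, $));
    }
};

// Register it like any other entry where yan-crawler is installed:
// crawler.addEntry(titleTemplate);
```

Keeping the extraction in a separate function makes it possible to test it against a saved page without starting the crawler.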

License

MIT

Collaborators

  • cgavrila