super-simple-crawler

1.2.4 • Public • Published

Super Simple Crawler

A super simple crawler for crawling websites and reporting back stats.

Installation

npm install -S super-simple-crawler

Usage

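import simpleCrawler from 'super-simple-crawler';

// Create a crawler for the target site (maxDepthLimit defaults to 2).
const crawler = simpleCrawler({ url: 'http://madole.xyz' });

// Fired for each response received; the payload fields are listed under Events.
crawler.on('response', ({ status, responseTime, depthLimit, size }) => {
    console.log(status);
    console.log(responseTime);
    console.log(depthLimit);
    console.log(size);
});

// Fired when the crawl is complete.
crawler.on('done', () => {
    console.log('Finished crawling');
});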

Parameters

simpleCrawler takes an object as a parameter.

  • url - string: the URL to crawl
  • maxDepthLimit - number: the maximum depth to crawl; defaults to 2
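
As a minimal sketch of the options object described above, the following shows how the documented default of 2 would apply when maxDepthLimit is omitted. The resolveOptions helper is hypothetical and not part of the package; it only mirrors the documented behaviour so the effective options can be inspected.

```javascript
// Hypothetical helper (not part of super-simple-crawler): mirrors the
// documented default so the effective options can be inspected.
function resolveOptions(options) {
    if (typeof options.url !== 'string' || options.url.length === 0) {
        throw new TypeError('url must be a non-empty string');
    }
    // maxDepthLimit falls back to the documented default of 2.
    return { maxDepthLimit: 2, ...options };
}

const defaults = resolveOptions({ url: 'http://madole.xyz' });
const deeper = resolveOptions({ url: 'http://madole.xyz', maxDepthLimit: 3 });

console.log(defaults.maxDepthLimit); // 2
console.log(deeper.maxDepthLimit);   // 3

// Either object could then be passed to the crawler:
// const crawler = simpleCrawler(deeper);
```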

Events

response

  • status - string: the response status (HTTP Code)
  • responseTime - number: the time taken for the server to respond to the request
  • depthLimit - number: the depth at which the URL appears in the site
  • size - number: the size, in bytes, of the response
  • path - string: the path of the URL, e.g. '/glendalough-double-barrel'
  • url - string: the full URL, e.g. 'http://whiskeynerds.com/glendalough-double-barrel/'
  • response - object: the whole response object
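
The fields above lend themselves to simple aggregate stats. The sketch below is one possible way to accumulate them; createStatsCollector is a hypothetical helper, not part of the package, and the sample payloads are invented for illustration and fed in by hand so the example runs without a network.

```javascript
// Hypothetical helper: accumulates the documented 'response' fields.
function createStatsCollector() {
    const stats = {
        pages: 0,
        totalBytes: 0,
        totalResponseTime: 0,
        byStatus: {},
        maxDepth: 0,
    };

    return {
        // Pass this function as the 'response' event handler.
        onResponse({ status, responseTime, depthLimit, size }) {
            stats.pages += 1;
            stats.totalBytes += size;
            stats.totalResponseTime += responseTime;
            stats.byStatus[status] = (stats.byStatus[status] || 0) + 1;
            stats.maxDepth = Math.max(stats.maxDepth, depthLimit);
        },
        // Returns a snapshot of the totals collected so far.
        summary() {
            return {
                pages: stats.pages,
                totalBytes: stats.totalBytes,
                averageResponseTime:
                    stats.pages === 0 ? 0 : stats.totalResponseTime / stats.pages,
                byStatus: { ...stats.byStatus },
                maxDepth: stats.maxDepth,
            };
        },
    };
}

const collector = createStatsCollector();

// Invented sample payloads; real values would come from the crawler.
collector.onResponse({ status: '200', responseTime: 120, depthLimit: 0, size: 2048 });
collector.onResponse({ status: '404', responseTime: 80, depthLimit: 1, size: 512 });

console.log(collector.summary());
```

With a real crawler, the handler would be attached with crawler.on('response', collector.onResponse) and the summary read once crawling has finished.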

done

The done event is fired when there are no more URLs to crawl or the maximum depth limit has been reached.
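
Because done marks the end of the crawl, it is a natural point to hand collected results to other code. The following is a sketch under assumptions: collectUntilDone and createFakeCrawler are hypothetical names, and the fake crawler only stands in for the real one so the example can run without a network. It assumes nothing about the crawler beyond the on(event, handler) method shown in Usage.

```javascript
// Hypothetical helper: gathers every 'response' payload and passes the
// full list to onFinished once 'done' fires.
function collectUntilDone(crawler, onFinished) {
    const responses = [];
    crawler.on('response', (payload) => responses.push(payload));
    crawler.on('done', () => onFinished(responses));
}

// Minimal stand-in for a crawler, for demonstration only. It exposes on()
// plus an emit() that is used here to simulate events.
function createFakeCrawler() {
    const handlers = {};
    return {
        on(event, handler) {
            (handlers[event] = handlers[event] || []).push(handler);
        },
        emit(event, payload) {
            (handlers[event] || []).forEach((handler) => handler(payload));
        },
    };
}

const fakeCrawler = createFakeCrawler();
let finishedResponses = null;

collectUntilDone(fakeCrawler, (responses) => {
    finishedResponses = responses;
});

// Simulate one response followed by the end of the crawl.
fakeCrawler.emit('response', { status: '200', url: 'http://madole.xyz' });
fakeCrawler.emit('done');

console.log(finishedResponses.length); // 1
```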

License

ISC

Collaborators

  • madole