super-simple-crawler

1.2.4 • Public • Published

Super Simple Crawler travisCI

A super simple crawler for crawling websites and reporting back stats.

Installation

npm install -S super-simple-crawler

Usage

import simpleCrawler from 'super-simple-crawler';

const crawler = simpleCrawler({ url: 'http://madole.xyz' });

crawler.on('response', {status, responseTime, body, size} => {
    console.log(status);
    console.log(responseTime);
    console.log(depthLimit);
    console.log(size);
});

crawler.on('done', () => {
    console.log('Finished crawling');
});

Parameters

simpleCrawler takes an object as a parameter.

  • url - string: the url to crawl
  • maxDepthLimit - number: the depth which to crawl, defaults to 2

Events

response

  • status - string: the response status (HTTP Code)
  • responseTime - number: the time taken for the server to respond to the request
  • depthLimit - number: the depth which the URL features in the site
  • size - number: the size, in bytes, of the response
  • path - string: the path of the url eg. '/glendalough-double-barrel'
  • url - string: the full url of eg. 'http://whiskeynerds.com/glendalough-double-barrel/'
  • response - object: the whole response object

done

The done event is fired when there are either no more urls to crawl, or the maximum depth limit has been reached.

Dependencies (6)

Dev Dependencies (15)

Package Sidebar

Install

npm i super-simple-crawler

Weekly Downloads

0

Version

1.2.4

License

ISC

Last publish

Collaborators

  • madole