@dschnare/chai

2.0.1 • Public • Published

Chai

Chai is a simple web crawler that scrapes relevant SEO data from each page it visits.

Usage

npm install @dschnare/chai -g
chai http://mywebsite.com > crawl.json

Scraping

Chai will scrape the following data from each page it visits.

  • Page title
  • All H1 values
  • All H2 values

The scraped data is written to stdout as a JSON array of objects, one per successfully scraped page, each with the following shape:

{ title, url, headings: { h1: [], h2: [] } }

For URLs that respond with an error, the scrape object has this shape instead:

{ url, statusCode, error }

Here, error is the error object returned by Superagent.
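As a rough illustration of how the two shapes above might be consumed, the sketch below splits a parsed crawl result into scraped pages and failed URLs, then lists pages without an H1 and URLs that returned 404. The property names come from the shapes described above; the function names and the sample data are hypothetical and are not part of Chai. It also assumes that an entry is a failure exactly when it has an error property.

```javascript
// Sketch only: property names follow the shapes described in this README;
// the function names and sample entries are hypothetical, not part of Chai.

// Split parsed crawl output into scraped pages and failed requests.
// Assumption: an entry is a failure if it carries an `error` property.
function partitionCrawl(entries) {
  const pages = [];
  const failures = [];
  for (const entry of entries) {
    if (entry && "error" in entry) {
      failures.push(entry);
    } else {
      pages.push(entry);
    }
  }
  return { pages, failures };
}

// Return the scraped pages that have no H1 values.
function pagesMissingH1(pages) {
  return pages.filter(
    (page) =>
      !page.headings ||
      !Array.isArray(page.headings.h1) ||
      page.headings.h1.length === 0
  );
}

// Return the URLs of failed requests whose status code is 404.
function notFoundUrls(failures) {
  return failures
    .filter((failure) => failure.statusCode === 404)
    .map((failure) => failure.url);
}

// Made-up entries in the two documented shapes.
const sample = [
  { title: "Home", url: "http://mywebsite.com/", headings: { h1: ["Welcome"], h2: ["News"] } },
  { title: "About", url: "http://mywebsite.com/about", headings: { h1: [], h2: ["Team"] } },
  { url: "http://mywebsite.com/old", statusCode: 404, error: { message: "Not Found" } },
];

const { pages, failures } = partitionCrawl(sample);
console.log(pagesMissingH1(pages).map((page) => page.url));
console.log(notFoundUrls(failures));
```

To run this against real output, the sample array could be replaced with JSON.parse(require("fs").readFileSync("crawl.json", "utf8")), assuming crawl.json was produced as shown under Usage.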

Roadmap

  • Expose a way to filter out URLs to be crawled
  • Expose a way to customize the scraper
  • Make it easier to identify 404 URLs
  • Add option to control verbosity

License

MIT

Collaborators

  • dschnare