Chai

Chai is a simple web crawler that scrapes relevant SEO data from each page it visits.

Usage

npm install @dschnare/chai -g
chai http://mywebsite.com > crawl.json

Chai writes the scraped data to stdout as a JSON array of objects, one per page visited, with the following shape:

{ title, url, headings: { h1: [], h2: [] } }

For URLs that respond with an error, the scrape object instead has this shape:

{ url, statusCode, error }

Here, error is the error object returned from Superagent.
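
Because the output mixes both shapes in one array, a consumer usually needs to tell them apart first. The following is a minimal sketch of one way to do that in Node, not part of Chai itself. The helper names (partitionResults, pagesWithoutSingleH1) and the sample data are hypothetical, and it assumes that only error entries carry an error property and that the heading arrays hold one item per heading found.

```javascript
// Hypothetical helper: split Chai's output into scraped pages and failed URLs.
// Assumption: an entry is an error entry if and only if it has an `error` property.
function partitionResults(results) {
  const pages = [];
  const failures = [];
  for (const entry of results) {
    if (entry.error !== undefined) {
      failures.push(entry);
    } else {
      pages.push(entry);
    }
  }
  return { pages, failures };
}

// Hypothetical helper: URLs of pages that do not have exactly one h1.
function pagesWithoutSingleH1(pages) {
  return pages
    .filter((page) => page.headings.h1.length !== 1)
    .map((page) => page.url);
}

// Made-up sample data in the two documented shapes (not real crawl output).
const sample = [
  { title: "Home", url: "http://mywebsite.com/", headings: { h1: ["Welcome"], h2: ["News"] } },
  { title: "About", url: "http://mywebsite.com/about", headings: { h1: [], h2: ["Team"] } },
  { url: "http://mywebsite.com/missing", statusCode: 404, error: { message: "Not Found" } },
];

const { pages, failures } = partitionResults(sample);
console.log(pagesWithoutSingleH1(pages));
console.log(failures.map((f) => `${f.statusCode} ${f.url}`));
```

To run the same checks against a real crawl, the sample array could be replaced with JSON.parse(fs.readFileSync("crawl.json", "utf8")) after requiring Node's fs module.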