table-scraper
A simple utility that scrapes the HTML tables on a given web page into lists of JavaScript objects.
installation
npm install --save table-scraper
methods
get(url)
Returns a promise that resolves to a list of the tables found at the given URL. Each HTML table row is converted to a JavaScript object whose keys are the column headings.
For example, suppose the page at http://www.some-fake-url.com consisted of the following table:
State      Capital City  Pop.
Minnesota  Saint Paul    3
New York   Albany        Eight Million
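The following code would produce the array shown in the comment:
var scraper = require('table-scraper');
scraper
  .get('http://www.some-fake-url.com')
  .then(function (tableData) {
    /*
      tableData ===
      [
        [
          { State: 'Minnesota', 'Capital City': 'Saint Paul', 'Pop.': '3' },
          { State: 'New York', 'Capital City': 'Albany', 'Pop.': 'Eight Million' }
        ]
      ]
    */
  });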
Important to note: the tableData returned is a list of lists. So, if some-fake-url.com contained three tables, the response would be structured like this:
[
  [ /* list of data from the first table */ ],
  [ /* list of data from the second table */ ],
  [ /* list of data from the third table */ ]
]
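Because the result is nested, it can help to pull out one table or to merge all rows before processing them. The following is only a sketch: the sample data and the helper names rowsOfTable and allRows are hypothetical and not part of table-scraper, and the string cell values are an assumption.

```javascript
// Hypothetical sample shaped like the list of lists described above:
// one inner array per table, one object per row.
var tableData = [
  [
    { State: 'Minnesota', 'Capital City': 'Saint Paul', 'Pop.': '3' },
    { State: 'New York', 'Capital City': 'Albany', 'Pop.': 'Eight Million' }
  ],
  [
    { '0': 'a', '1': 'b' }
  ]
];

// Return the rows of one table, or an empty array if the index is out of range.
function rowsOfTable(data, tableIndex) {
  return data[tableIndex] || [];
}

// Merge the rows of every table into a single flat array.
function allRows(data) {
  return data.reduce(function (rows, table) {
    return rows.concat(table);
  }, []);
}

var firstTable = rowsOfTable(tableData, 0);
var everyRow = allRows(tableData);
```

In real use, the same helpers would presumably be called inside the then callback on the tableData that get(url) resolves to.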
If a table has NO headings (no <th> elements), the object keys are simply the column indexes:
{
  '0': <first column data of first row>,
  '1': <second column data of first row>,
  ...
}
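When a table has no headings, index keys can be renamed after the fact. This is a minimal sketch under stated assumptions: applyHeadings is a hypothetical helper, not a table-scraper function, and the sample rows are made up for illustration.

```javascript
// Rename index keys ('0', '1', ...) to caller-supplied headings.
// Columns without a supplied heading keep their index key.
function applyHeadings(rows, headings) {
  return rows.map(function (row) {
    var renamed = {};
    Object.keys(row).forEach(function (key) {
      var heading = headings[Number(key)];
      renamed[heading !== undefined ? heading : key] = row[key];
    });
    return renamed;
  });
}

// Hypothetical rows from a table with no <th> elements.
var rows = [
  { '0': 'Minnesota', '1': 'Saint Paul', '2': '3' },
  { '0': 'New York', '1': 'Albany', '2': 'Eight Million' }
];

// Only two headings are supplied, so the third column keeps the key '2'.
var namedRows = applyHeadings(rows, ['State', 'Capital City']);
```

The helper returns new objects instead of mutating the scraped rows, so the original index-keyed data stays available.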
Contributing
Feedback/PRs welcome! Please include tests around any new functionality, and make sure existing tests pass:
npm test
Credits
The following node libraries make this utility super easy: