WikiScraper.js
Get data off Wikipedia. Fast.
WikiScraper is both a library and a command-line utility.
$ wikiscraper Helium Hydrogen Oxygen Lithium
## or
$ echo '["Helium", "Hydrogen", "Oxygen", "Lithium"]' | wikiscraper
A JSON array of Wikipedia sites will be returned:
$ wikiscraper markdown
[ ]
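The two invocation styles shown earlier (page names as arguments, or a JSON array on standard input) can be normalized into one list of page names. The following is a minimal sketch of that idea only; `parsePageList` is a hypothetical helper and is not taken from WikiScraper's source.

```javascript
// Hypothetical helper: NOT part of WikiScraper's API.
// Returns the list of page names from CLI arguments, or, when no
// arguments were given, from a JSON array read from standard input.
function parsePageList(args, stdinText) {
  if (args.length > 0) {
    return args.slice();
  }
  var parsed = JSON.parse(stdinText);
  if (!Array.isArray(parsed)) {
    throw new TypeError("expected a JSON array of page names");
  }
  return parsed;
}
```

In a real command-line script, `args` would typically be `process.argv.slice(2)` and `stdinText` the fully buffered contents of `process.stdin`.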
Install
$ npm install -g wikiscraper
About
This tool extracts the contents of a Wikipedia page's infobox (the element with the .infobox class) as a JSON object.
We needed to scrape Wikipedia's data on chemical elements for Mendeleev.io, so this was made.
Usage in JavaScript
WikiScraper should be initialized with an array of Wikipedia page names.
new WikiScraper(["JavaScript"])
or WikiScraper.selectSites(["JavaScript"])
This translates to the URL en.wikipedia.org/wiki/JavaScript.
The language of the page can be changed.
WikiScraper.setLanguage("de")
will scrape de.wikipedia.org/wiki/JavaScript.
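To make the mapping from page name and language to URL concrete, here is a minimal sketch. `buildWikipediaUrl` is a hypothetical helper written for illustration, not a function exposed by WikiScraper; the `https://` scheme and the underscore handling for spaces are assumptions based on how Wikipedia article URLs generally look.

```javascript
// Hypothetical helper: NOT part of WikiScraper's API.
// Builds the article URL for a page name in a given language edition.
function buildWikipediaUrl(page, language) {
  var lang = language || "en"; // assumed default, matching en.wikipedia.org
  // Wikipedia article paths use underscores in place of spaces.
  var path = encodeURIComponent(page.replace(/ /g, "_"));
  return "https://" + lang + ".wikipedia.org/wiki/" + path;
}
```

For example, `buildWikipediaUrl("JavaScript")` targets the English edition, while `buildWikipediaUrl("JavaScript", "de")` targets the German one.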
For every request, the callback is called as soon as that page has been scraped. This lets you process each result as it arrives and is more robust than returning one big array at the end.
var WikiScraper = require("wikiscraper");
var wikiscraper = new WikiScraper(["Helium", "Hydrogen", "Oxygen", "Lithium"]);

/* Callback Style */
var elements = ;
wikiscraper;
wikiscraper;

/* Event style */
wikiscraper;
wikiscraper;
wikiscraper;
wikiscraper;
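The per-request callback pattern described above can be sketched independently of the library. Everything in this block is hypothetical: `scrapeEach`, `fakeLoadPage`, and the shape of the result object are illustrative stand-ins, not WikiScraper's actual method names or output.

```javascript
// Hypothetical sketch: NOT WikiScraper's API.
// Calls onPage once per page as soon as that page has loaded,
// then calls onDone after every page has reported back.
function scrapeEach(pages, loadPage, onPage, onDone) {
  var pending = pages.length;
  if (pending === 0) {
    if (onDone) onDone();
    return;
  }
  pages.forEach(function (page) {
    loadPage(page, function (err, infobox) {
      onPage(err, page, infobox);
      pending -= 1;
      if (pending === 0 && onDone) onDone();
    });
  });
}

// Stub loader standing in for a real HTTP request and infobox parse.
// It answers immediately with a placeholder object.
function fakeLoadPage(page, done) {
  done(null, { title: page });
}
```

Because each result is handed over individually, a failure on one page reaches the callback as `err` for that page only, and the remaining pages are still processed.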