pársz
- A tool for parsing the web
Usage
Install globally from npm/yarn
$ npm install -g parsz
View options from help menu
$ parsz --help
Use a "parselet" as a recipe/filter to parse a website.
The structure of the parselet is JSON.
Here is an example of a parselet for grabbing business data from a Yelp page:
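As a minimal sketch of what such a parselet might look like, the object below maps output field names to CSS selectors. It is written as a JavaScript object so the assumptions can be commented. The field names and selectors are hypothetical and are not taken from Yelp's actual markup.

```javascript
// Hypothetical parselet for a business page. Every selector below is an
// assumption for illustration and must be adapted to the real page markup.
const businessParselet = {
  name: "h1",          // assumed: business name is in the page heading
  phone: ".biz-phone", // assumed class name for the phone number
  address: "address"   // assumed <address> element
};

// Parselets are JSON, so serialize the object before saving it to a file.
const businessJson = JSON.stringify(businessParselet, null, 2);
console.log(businessJson);
```

Each key becomes a field in the output, and each value says where on the page to find it.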
As a module
You can also use parsz as a module.
Tips
This is a general-purpose, flexible tool. Here are some tips for getting started.
Grabbing a list of data
Use a reference selector in the key and an Array as the value.
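A sketch of that shape, assuming the key takes the form name(selector) and that the selectors inside the array are evaluated within each matched element. Both the key syntax and the selectors are assumptions for illustration.

```javascript
// Assumption: the key has the form "name(selector)", where the selector in
// parentheses picks each repeated element. All selectors are hypothetical.
const reviewsParselet = {
  "reviews(.review)": [
    {
      name: ".user-name a", // assumed to be looked up inside each .review
      comment: "p"
    }
  ]
};

// Serialize to JSON for use as a parselet file.
console.log(JSON.stringify(reviewsParselet, null, 2));
```

The array holds a single template object that describes one item of the list.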
Use transformation functions on data
Add a pipe (|) and the transformation name after the data selector.
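For illustration only, the sketch below shows the selector|transformation shape. The name trim is a hypothetical transformation and may not exist in parsz, and the selector is likewise assumed.

```javascript
// Assumption: "trim" is a hypothetical transformation name used only to
// show the "selector|transformation" shape; it may not exist in parsz.
const phoneParselet = {
  phone: ".biz-phone|trim"
};

// Split the value to show its two parts: data selector and transformation.
const [phoneSelector, phoneTransform] = phoneParselet.phone.split("|");
console.log(phoneSelector, phoneTransform);
```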
If you would like to see a particular transformation function added, please open an issue.
Grabbing an attribute
Use the at sign (@) to reference an attribute.
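A sketch, assuming the attribute name is appended directly to the selector after the @ symbol. The selector img.photo is hypothetical.

```javascript
// Assumption: "@attr" is appended directly to the selector.
// "img.photo" is a hypothetical selector for illustration.
const imageParselet = {
  image: "img.photo@src" // intended to read the src attribute of the image
};

console.log(JSON.stringify(imageParselet, null, 2));
```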
Grabbing remote data
Use a tilde (~) and a link selector to reference external content. The mapping (value) is evaluated relative to that new external scope.
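A sketch, assuming the key takes the form name~(link selector) and that the nested mapping is evaluated against the linked page. The key syntax, field names, and selectors are all assumptions for illustration.

```javascript
// Assumption: the key has the form "name~(link selector)". The link matched
// by the selector is followed, and the nested mapping is applied to the
// page it points to. All selectors here are hypothetical.
const userParselet = {
  "user~(.user-name a)": {
    name: "h1",                // looked up on the linked page
    location: ".user-location" // assumed class name on the linked page
  }
};

console.log(JSON.stringify(userParselet, null, 2));
```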
Have fun!