tito

0.6.1 • Public • Published

tito is a Node.js module and command-line utility for translating between tabular text streams in formats such as CSV, TSV, JSON and HTML tables. It stands for Tables In, Tables Out.

Formats

  • JSON: either structured (queried with JSONPath) or newline-delimited, which is the default for both input and output.
  • Comma-, tab-, and otherwise-delimited text, with support for custom column and row delimiters.
  • HTML tables, with support for targeted parsing with CSS selectors and formatted output.

Installation

Install it with npm:

npm install -g tito

Examples

Here are some examples of what tito can do:

Convert CSV to TSV

Use the --read and --write options to set the read and write formats:

tito --read csv data.csv --write tsv data.tsv

Or pipe data into and out of tito via stdio:

cat data.csv | tito --read csv --write tsv > data.tsv

Turn HTML tables into CSV

tito's html reader uses a streaming HTML parser and can target tables with CSS selectors:

curl -s "http://www.federalreserve.gov/releases/h15/current/" \
  | tito --read.format html --read.selector 'table.statistics' --write csv \
  > interest-rates.csv

Import structured JSON data from a URL into dat

tito can take structured JSON like this:

{
  "results": [
    { /* ... */ },
    // etc.
  ]
}

and turn it into newline-delimited JSON. Just set --read.format to json and --read.path to the JSONPath expression of your data elements. For the structure above, which is common to many REST APIs, you would use results.*. You could then use the following to import data from one such API into dat:

curl -s http://api.data.gov/some-data \
  | tito --read.format json --read.path 'results.*' \
  | dat import
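
To make the conversion concrete, here is a minimal sketch in plain Node.js of what selecting results.* and writing newline-delimited JSON amounts to. It does not use tito; the response object and the toNdjson helper are hypothetical and exist only for illustration.

```javascript
// Hypothetical API response shaped like the structure above.
const response = {
  results: [
    { id: 1, name: 'alpha' },
    { id: 2, name: 'beta' },
  ],
};

// Emulates selecting `results.*` and writing newline-delimited JSON:
// each element becomes one JSON document on its own line.
function toNdjson(items) {
  return items.map((item) => JSON.stringify(item)).join('\n') + '\n';
}

const ndjson = toNdjson(response.results);
process.stdout.write(ndjson);
```

Each output line is a complete JSON document, which is why a downstream tool can consume the stream one record at a time.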

Map and filter your data

The --map and --filter options let you perform streaming transformations on your data. Each option can be specified either as a fof-compatible expression or as a filename.

tito --filter 'd => d.Year > 2000' \
  --map 'd => {{year: d.Year, region: d.Region, revenue: +d.Revenue}}' \
  --read csv data.csv
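
As a rough illustration of what these two expressions do to each row, the following sketch applies the same logic with plain Array methods. It does not use tito or fof; the sample rows are hypothetical, and it assumes that values read from CSV arrive as strings.

```javascript
// Hypothetical rows; assumption: a CSV reader yields every value as a string.
const rows = [
  { Year: '1999', Region: 'North', Revenue: '100' },
  { Year: '2005', Region: 'South', Revenue: '250.5' },
];

// Same test as the --filter expression. Comparing a numeric string with a
// number makes JavaScript coerce the string to a number first.
const keep = (d) => d.Year > 2000;

// Same shape as the --map expression: year is copied as-is (still a string),
// while the unary plus turns revenue into a number.
const reshape = (d) => ({ year: d.Year, region: d.Region, revenue: +d.Revenue });

const result = rows.filter(keep).map(reshape);
console.log(result);
```

Only the 2005 row passes the filter, and its revenue becomes a number while its year stays a string, because the map expression copies d.Year without coercing it.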

If you specify an existing filename for either --map or --filter, it will be require()d and its value passed to fof(). This means that you can specify map and filter transformations in JSON or JavaScript, e.g.:

{
  "year": "d => +d.Year",
  "region": "Region",
  "revenue": "d => +d.Revenue"
}

Then, with this saved as transform.json, you could apply the transformation with:

tito --map ./transform.json \
  --read csv --write json input.csv > output.json
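
For readers unfamiliar with this object form, the sketch below is a hand-written function that produces the same fields. It is not how fof works internally; it assumes, as the example suggests, that a bare string such as Region means "read that property from the row". The input row is hypothetical.

```javascript
// Hand-written equivalent of the transformation object above.
// Assumption: a bare string value ('Region') names a property to copy.
const transform = (d) => ({
  year: +d.Year,       // coerce to number, as in 'd => +d.Year'
  region: d.Region,    // plain property lookup
  revenue: +d.Revenue, // coerce to number
});

// Hypothetical input row with string values, as read from CSV.
const out = transform({ Year: '2012', Region: 'West', Revenue: '1200' });
console.log(JSON.stringify(out));
```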

Usage

This is the output of tito --help formats:

tito [options] [input] [output]

Options:
  --read, -r     the input format (see below)        [default: "ndjson"]
  --write, -w    the output format (see below)       [default: "ndjson"]
  --in, -i       the input filename                                     
  --out, -o      the output filename                                    
  --filter, -f   filter input by this data expression           [string]
  --map, -m      map input to this data expression              [string]
  --help, -h     Show this help message.                                
  --version, -v  Print the version and exit                             

Formats:

The following values may be used for the input and output format
options, --read/-r or --write/-w:

  tito --read csv --write tsv
  tito -r csv -w tsv

If you wish to specify format options, you must use the dot notation:

  tito --read.format csv --read.delim=, data.csv
  tito -r.format json -r.path='results.*' data.json
  tito data.ndjson | tito -w.format html -w.indent='  '

"csv": Read and write comma-separated (or otherwise-delimited) text
  Options:
  - "delimiter", "delim", "d": The field delimiter
  - "newline", "line", "n": The row delimiter
  - "quote", "q": The quote character

"tsv": Read and write tab-separated values
  Options:
  - "headers": 
  - "newline", "line", "n": The line separator character sequence

"ndjson": Read and write newline-delimited JSON
  Options:

"json": Read and write arrays from streaming JSON
  Options:
  - "path", "p": The JSONPath selector containing the data (read-only)
  - "open", "o": Output this string before streaming items (write-only)
  - "separator", "sep", "s": Output this string between items (write-only)
  - "close", "c": Output this string after writing all items (write-only)

"html": Read and write data from HTML tables
  Options:
  - "selector", "s": the CSS selector of the table to target (read-only)
  - "indent", "i": indent HTML with this string (write-only)
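
The "open", "separator" and "close" options of the json writer listed above describe strings emitted before, between and after the items. A minimal sketch of that framing in plain Node.js follows; the frame helper and the option values are hypothetical and are not tito's defaults, which the help text does not state.

```javascript
// Sketch: frame a sequence of items with open/separator/close strings.
// The helper name and the values passed below are illustrative assumptions.
function frame(items, { open, separator, close }) {
  return open + items.map((item) => JSON.stringify(item)).join(separator) + close;
}

const text = frame(
  [{ a: 1 }, { a: 2 }],
  { open: '[\n', separator: ',\n', close: '\n]\n' }
);
process.stdout.write(text);
```

With these particular values the result is a valid JSON array, so the whole output can be parsed in one call by any JSON parser.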

    License

    CC0-1.0

    Collaborators

    • shawnbot