normalize-html-table
Normalizes DOM table rows: creates a matrix in which cells are duplicated according to their rowspan and colspan.
Handy for scraping and parsing Wikipedia tables.
- Vanilla DOM - no dependencies.
- Does one job only - rowspan and colspan.
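To make "duplicate cells" concrete, here is a minimal sketch of rowspan/colspan expansion on plain objects. It is an illustration only, not the library's actual implementation: the `expandSpans` helper and the `{ value, rowSpan, colSpan }` cell shape are hypothetical, and the real library works on DOM elements rather than values.

```javascript
// Hypothetical helper for illustration; not part of @eirikb/normalize-html-table.
// Each input cell is a plain object: { value, rowSpan?, colSpan? }.
function expandSpans(rows) {
  const matrix = rows.map(() => []);
  rows.forEach((cells, r) => {
    let c = 0;
    for (const cell of cells) {
      // Skip columns already filled by a rowspan from an earlier row.
      while (matrix[r][c] !== undefined) c++;
      const rowSpan = cell.rowSpan || 1;
      const colSpan = cell.colSpan || 1;
      // Copy the same cell into every position it covers.
      for (let dr = 0; dr < rowSpan && r + dr < matrix.length; dr++) {
        for (let dc = 0; dc < colSpan; dc++) {
          matrix[r + dr][c + dc] = cell.value;
        }
      }
      c += colSpan;
    }
  });
  return matrix;
}

const matrix = expandSpans([
  [{ value: 'A', rowSpan: 2 }, { value: 'B', colSpan: 2 }],
  [{ value: 'C' }, { value: 'D' }],
]);
console.log(matrix); // [ [ 'A', 'B', 'B' ], [ 'A', 'C', 'D' ] ]
```

Every row ends up with the same number of columns, so a cell can be addressed by row and column index regardless of how the original table used spans.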
Usage
npm i @eirikb/normalize-html-table
import normalizeHtmlTable from '@eirikb/normalize-html-table';
const table = document.querySelector('table');
const rows = normalizeHtmlTable(table);
console.log(rows);
This returns a matrix of rows and cells. Each cell holds the td element.
Each row has a row property attached to it, in case you need to reference the original tr element.
E.g.,
console.log(rows[0].row); // tr element
Notes
This library will not:
- Map your table to a JavaScript object.
- Do anything with your headers.
- Convert cells to text.
- Support older browsers (you must transpile it).
All of the above can be handled in your own code and is outside the scope of this library. For example, converting a table to JavaScript objects, with each cell turned into text, can be done like this:
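function tableToJson(table) {
  // Use the text of every th in the table as the object keys.
  const headers = [...table.querySelectorAll('th')].map(th => th.textContent.trim());
  // Build one object per normalized row, pairing each header with the cell at the same index.
  return normalizeHtmlTable(table).map(row =>
    headers.reduce((res, header, index) => {
      res[header] = row[index].textContent.trim();
      return res;
    }, {})
  );
}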
For Node.js support, use jsdom.