web crawler, parser and scraper with storage capabilities
published version 0.3.8, 7 years ago
Responsible for handling toolbar integration. Opens the admin front-end in a new tab.
published version 0.3.0, 5 years ago
Contains common TypeScript definitions and utilities used throughout the Lerna monorepo.
published version 0.4.1, 4 years ago
Extract Resources scenario is used for extracting various resources from the corresponding sites.
published version 0.4.1, 4 years ago
Handles the crawling and scraping logic.
published version 0.4.1, 4 years ago
React-based frontend communicating with the backend (extension background script) via chrome.runtime.sendMessage. Bundled as a single-page app with the help of react-router-dom.
published version 0.4.1, 4 years ago
Bundles all monorepo sub-packages into a valid extension folder.
published version 0.4.1, 4 years ago
Extract Html Headings scenario is used for extracting H1, H2, H3, H4, H5, H6 text content.
published version 0.1.0, 6 years ago
Extract HTML headings (H1-H6) content.
published version 0.2.0, 5 years ago
Extract Html Content scenario is used for extracting the text of HTML nodes based on DOM selectors.
published version 0.1.2-rc.2, 5 years ago
-
published version 0.4.1, 4 years ago
Extract article content using Mozilla Readability library.
published version 0.2.0, 5 years ago
extracts text and binary content from dynamic (JavaScript) pages based on CSS selectors
published version 0.4.1, 4 years ago
extracts text and binary content from static HTML pages based on CSS selectors
published version 0.4.1, 4 years ago
scraping test definitions, launches resources to be scraped under a configurable web server
published version 0.8.0, 3 years ago
Plugin-based Node.js web scraper. It scrapes, stores and exports data. Supports multiple storage options: SQLite, MySQL, PostgreSQL. Supports multiple browser or DOM-like clients: Puppeteer, Playwright, Cheerio, Jsdom.
published version 0.11.0, 3 years ago
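
Several of the packages listed above extract heading text (H1 to H6) or node text based on DOM/CSS selectors. The following is a minimal sketch of that idea, not the packages' actual code: `extractText`, `extractHeadings`, `QueryRoot`, `TextNode` and `ExtractedText` are hypothetical names introduced for illustration, and the only real API assumed is the standard `querySelectorAll` method found on browser `Document` and `Element` objects.

```typescript
// Minimal structural types so the sketch runs without a browser DOM.
// A real Document or Element satisfies QueryRoot through querySelectorAll.
interface TextNode {
  tagName: string;
  textContent: string | null;
}

interface QueryRoot {
  querySelectorAll(selector: string): ArrayLike<TextNode>;
}

interface ExtractedText {
  tag: string;
  text: string;
}

// Hypothetical helper: collect the trimmed, non-empty text of every node
// matching the given selector, together with its lowercased tag name.
function extractText(root: QueryRoot, selector: string): ExtractedText[] {
  return Array.from(root.querySelectorAll(selector))
    .map((node) => ({
      tag: node.tagName.toLowerCase(),
      text: (node.textContent ?? "").trim(),
    }))
    .filter((entry) => entry.text.length > 0);
}

// Heading extraction is the special case of a fixed selector list.
const HEADING_SELECTOR = "h1, h2, h3, h4, h5, h6";

function extractHeadings(root: QueryRoot): ExtractedText[] {
  return extractText(root, HEADING_SELECTOR);
}
```

In a browser context, `extractHeadings(document)` would return the headings of the current page, while `extractText(document, "article p")` would do the same for an arbitrary selector. Keeping the root as a small structural type means the same function can be pointed at any DOM-like client that exposes `querySelectorAll`.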