scrape-feed

1.1.0 • Public • Published

npm version CircleCI ISC License

Reads the contents of JSON, RSS, and Atom feeds from a URL.

Installation

npm install scrape-feed

Usage

Simple use

const { scrapeFeed } = require("scrape-feed")

// Run this inside an async function (or an ES module with top-level await).
const feed = await scrapeFeed("https://www.mattmoriarity.com/feed.json")

The resolved feed object holds the information pulled from the feed. See the ScrapedFeed type in src/index.ts for its structure.

scrape-feed supports JSON Feed directly, and Atom and RSS through feedparser. All feed types produce the same structure, which makes the result somewhat lossy: not all feed information is captured.

Using caching headers

If you poll feeds regularly and would like to avoid extra work, you can hang on to feed.cachingHeaders and provide it again when you next poll the feed. The caching headers include the ETag and Last-Modified response headers if the response included them. If they are provided when scraping, they will be used to set the If-None-Match and If-Modified-Since request headers, respectively.
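The header mapping described above can be sketched as follows. This is only an illustration, not the package's actual code: the function name toConditionalHeaders and the etag / lastModified property names are assumptions, and the real shape of feed.cachingHeaders is whatever src/index.ts defines.

```javascript
// Hypothetical helper: the property names `etag` and `lastModified` are
// assumptions for illustration, not the package's documented shape.
function toConditionalHeaders(cachingHeaders) {
  const headers = {}
  if (!cachingHeaders) return headers

  // ETag from the previous response becomes If-None-Match on the next request.
  if (cachingHeaders.etag) {
    headers["If-None-Match"] = cachingHeaders.etag
  }
  // Last-Modified from the previous response becomes If-Modified-Since.
  if (cachingHeaders.lastModified) {
    headers["If-Modified-Since"] = cachingHeaders.lastModified
  }
  return headers
}
```

Headers that the previous response did not include are simply left out, so a server that sent only one of the two validators still gets a valid conditional request.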

A well-behaved server, when given these headers, will return a 304 Not Modified response with no body as long as the content hasn't changed, in which case scrapeFeed will just return null. If you get a null, you can go along your merry way and be happy you didn't waste that bandwidth and those CPU cycles.

const { cachingHeaders } = feed
const feedAgain = await scrapeFeed(
  "https://www.mattmoriarity.com/feed.json",
  cachingHeaders
)
// => null
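For repeated polling, the pattern above can be wrapped in a small helper that remembers the caching headers per URL. This is a minimal sketch, not part of scrape-feed: createPoller and poll are hypothetical names, and it assumes that passing undefined as the second argument behaves the same as omitting it.

```javascript
// Hypothetical helper: remembers the caching headers for each URL between polls.
// `scrapeFeed` is passed in so the sketch does not depend on how it is imported.
function createPoller(scrapeFeed) {
  const headersByUrl = new Map()

  return async function poll(url) {
    // On the first poll nothing is stored yet, so this passes undefined
    // (assumed to be equivalent to leaving the argument out).
    const feed = await scrapeFeed(url, headersByUrl.get(url))

    // null means the server reported that nothing changed since the last poll.
    if (feed === null) {
      return null
    }

    // Keep the fresh headers for the next poll of this URL.
    headersByUrl.set(url, feed.cachingHeaders)
    return feed
  }
}
```

A poller would be created once with createPoller(scrapeFeed) and then called with the feed URL on each polling interval; a null result means there is nothing new to process.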

Collaborators

  • mmoriarity