(X) Feed Parser
Parse RSS, Atom, JSON Feed, and HTML into a common JSON format. Complete with XML decoding, HTML sanitization, date standardization, media and metadata extraction.
This project is based on rbren/rss-parser, upgraded to ESM with JSDoc types and extended with the features listed above.
Install
npm install xfp
Usage
import { parse } from 'xfp'
let rawFeedString // XML (RSS/Atom), JSON Feed, or HTML
const feed = parse(rawFeedString)
Running the code above with a valid rawFeedString returns an object with the following schema:
{
  type: 'rss' | 'atom' | 'json' | 'html'
  lang?: string
  title?: string
  description?: string
  feedUrl?: string
  siteUrl?: string
  imageUrl?: string
  etag?: string
  updatedAt?: string
  items?: [{
    id?: string
    url?: string
    lang?: string
    title?: string
    summary?: string
    author?: string
    content?: string
    snippet?: string
    categories?: string[]
    commentsUrl?: string
    imageUrl?: string
    media?: [{
      url: string
      length?: number
      type?: string
    }]
    createdAt?: string
    updatedAt?: string
  }]
  meta?: {
    [key: string]: any // youtube, itunes metadata
  }
}
See the test/ folder for complete usage examples.
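Because every field except type is optional, code that consumes the result has to guard against missing values. The sketch below is one way to pick the most recent items from a parsed feed. The helpers latestItems and itemTimestamp are hypothetical and not part of xfp, and the sketch assumes the date fields hold strings that Date.parse can read (for example ISO 8601), which the schema above does not guarantee.

```javascript
// Hypothetical helper, not part of xfp: turn an item's dates into a sortable number.
// Prefers updatedAt, falls back to createdAt, and treats missing or unparseable dates as 0.
function itemTimestamp(item) {
  const t = Date.parse(item.updatedAt ?? item.createdAt ?? '')
  return Number.isNaN(t) ? 0 : t
}

// Hypothetical helper, not part of xfp: return up to `limit` items, newest first.
// Copies the array so the parsed feed object is left untouched.
function latestItems(feed, limit = 10) {
  return [...(feed.items ?? [])]
    .sort((a, b) => itemTimestamp(b) - itemTimestamp(a))
    .slice(0, limit)
}

// Hand-written sample in the shape of the schema above (not real parser output).
const sampleFeed = {
  type: 'rss',
  title: 'Example feed',
  items: [
    { id: 'a', title: 'Older post', createdAt: '2024-01-01T00:00:00.000Z' },
    { id: 'b', title: 'Undated post' },
    { id: 'c', title: 'Newer post', createdAt: '2024-02-01T00:00:00.000Z', updatedAt: '2024-03-01T00:00:00.000Z' }
  ]
}

console.log(latestItems(sampleFeed, 2).map((item) => item.title))
// [ 'Newer post', 'Older post' ]
```

Undated items sort last here. Whether that is the right choice depends on the feed, so treat the fallback value as a design decision rather than a rule.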
API
This library exports the parse function, which is a thin wrapper for parseXmlFeed, parseJsonFeed, and parseHtmlFeed.
parse(str)
Identifies the file type (xml, json, or html) and dispatches to the appropriate parser.
import { parse } from 'xfp'
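To illustrate the kind of check this involves, here is a minimal sketch of format detection based on the first characters of the input. The function detectFeedType is hypothetical and is not how xfp is known to implement detection; it only shows one plausible heuristic.

```javascript
// Hypothetical helper, not part of xfp: guess the format from the start of the string.
// Returns 'json', 'html', 'xml', or undefined when nothing matches.
function detectFeedType(str) {
  // Only inspect the beginning, lowercased, with leading whitespace removed.
  const head = str.trimStart().slice(0, 512).toLowerCase()
  if (head.startsWith('{')) return 'json'
  if (head.startsWith('<!doctype html') || /<html[\s>]/.test(head)) return 'html'
  if (head.startsWith('<')) return 'xml'
  return undefined
}

console.log(detectFeedType('<?xml version="1.0"?><rss version="2.0"></rss>')) // 'xml'
console.log(detectFeedType('{"version":"https://jsonfeed.org/version/1"}')) // 'json'
console.log(detectFeedType('<!DOCTYPE html><html><head></head></html>')) // 'html'
```

A prefix check like this is cheap but can misfire, for example on an HTML fragment without a doctype or html tag, which it would report as xml.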
parseXmlFeed(str)
Handler for RSS (v0.9 - v2.0) and Atom feeds.
import { parseXmlFeed } from 'xfp'
parseJsonFeed(str)
Handler for JSON Feed (v1) documents.
import { parseJsonFeed } from 'xfp'
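As a rough picture of what normalizing a JSON Feed into the common format could look like, the sketch below maps JSON Feed v1 field names onto the schema from the Usage section. The function mapJsonFeed is hypothetical, and the field pairing is an assumption made for illustration; parseJsonFeed may choose differently and also sanitizes HTML and standardizes dates, which this sketch does not do.

```javascript
// Hypothetical mapping, not xfp's actual implementation.
// Input: an already-parsed JSON Feed v1 document. Output: an object shaped like the common schema.
function mapJsonFeed(doc) {
  return {
    type: 'json',
    title: doc.title,
    description: doc.description,
    feedUrl: doc.feed_url,
    siteUrl: doc.home_page_url,
    imageUrl: doc.icon,
    items: (doc.items ?? []).map((item) => ({
      id: item.id,
      url: item.url,
      title: item.title,
      summary: item.summary,
      author: item.author?.name,
      // Assumption: prefer the HTML body and fall back to plain text.
      content: item.content_html ?? item.content_text,
      categories: item.tags,
      imageUrl: item.image,
      media: (item.attachments ?? []).map((attachment) => ({
        url: attachment.url,
        length: attachment.size_in_bytes,
        type: attachment.mime_type
      })),
      createdAt: item.date_published,
      updatedAt: item.date_modified
    }))
  }
}

// Hand-written JSON Feed v1 sample (not taken from a real site).
const rawJsonFeed = JSON.stringify({
  version: 'https://jsonfeed.org/version/1',
  title: 'Example feed',
  home_page_url: 'https://example.org/',
  feed_url: 'https://example.org/feed.json',
  items: [{
    id: '1',
    url: 'https://example.org/posts/1',
    content_text: 'Hello, world',
    date_published: '2024-01-01T00:00:00Z',
    attachments: [{ url: 'https://example.org/ep1.mp3', mime_type: 'audio/mpeg', size_in_bytes: 1024 }]
  }]
})

const mapped = mapJsonFeed(JSON.parse(rawJsonFeed))
console.log(mapped.siteUrl) // 'https://example.org/'
console.log(mapped.items[0].media[0].type) // 'audio/mpeg'
```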
parseHtmlFeed(str)
WIP! Extracts feed data from an HTML document using rehype-extract-meta and rehype-extract-posts.
import { parseHtmlFeed } from 'xfp'