html-ast-transform
A set of helpers around parse5 for transforming HTML via an AST. It allows flexible transformations, useful for e.g. processing rich text editor output for email.
Features
- Strip tags or nodes
- Replace nodes with string(s) or a function
- Reduce child nodes for adding/removing/modifying sibling nodes
- Utility functions to simplify creating nodes and adding or checking for attributes
Install
npm install --save html-ast-transform parse5
Usage
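import { transform, h, hasClass, getAttr } from 'html-ast-transform'
// or es5:
var transform = require('html-ast-transform').transform
// umd:
const { transform, h, hasClass, getAttr } = HtmlAstTransform

const result = transform(input, options)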
Examples
Replace paragraph tags with table rows
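const input = '<div><p class="text-center">Some text</p></div>'

const output = transform(input, {
  replaceTags: {
    // rebuild each <p> as a <tr> wrapping a <td> that keeps its attributes and children
    p: (node) => h('tr', [], [h('td', node.attrs, node.childNodes)])
  }
})
// output = '<div><tr><td class="text-center">Some text</td></tr></div>'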
Generate plain text version retaining links and image alt text
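const input = '<p>Text with <a href="example.com">a link</a><img alt="and an image" /></p>'

// replace each link with its own children followed by the href in brackets
const stringifyLinks = (acc, node) => {
  if (node.nodeName !== 'a') return acc.concat(node)
  const href = getAttr(node, 'href')
  return acc.concat(node.childNodes, h('#text', ` [${href}] `))
}

// replace an image with its alt text in brackets
const getAltText = (node) => {
  const alt = getAttr(node, 'alt')
  return h('#text', `[${alt}]`)
}

const output = transform(input, {
  replaceTags: { p: '\n', img: getAltText },
  reduceAll: stringifyLinks
})
// output = '\nText with a link [example.com] [and an image]'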
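The reducer in this example has the same shape as an `Array.prototype.reduce` callback, so it can be tried out without parsing any HTML. The sketch below assumes parse5-style nodes (`nodeName`, `attrs`, `childNodes`, and `value` on text nodes). `textNode`, `linkNode` and `appendHref` are hypothetical names used only for illustration, and the plain objects stand in for real AST nodes instead of reproducing the library's internals.

```javascript
// Hypothetical stand-ins for parse5-style nodes (not created by the library).
const textNode = (value) => ({ nodeName: '#text', value })
const linkNode = (href, childNodes) => ({
  nodeName: 'a',
  tagName: 'a',
  attrs: [{ name: 'href', value: href }],
  childNodes,
})

// Reducer with the (acc, node, index, childNodes) shape described for reduceAll:
// keep every node, and add a " [href] " text node after each link.
const appendHref = (acc, node) => {
  if (node.nodeName !== 'a') return acc.concat(node)
  const href = node.attrs.find((attr) => attr.name === 'href')
  return acc.concat(node, textNode(` [${href ? href.value : ''}] `))
}

const children = [
  textNode('Text with '),
  linkNode('example.com', [textNode('a link')]),
]

const reduced = children.reduce(appendHref, [])
console.log(reduced.map((node) => node.nodeName))
// prints [ '#text', 'a', '#text' ]
```

Because the reducer returns the new list of siblings, the same pattern covers removing a node (return `acc` unchanged) or replacing it (concat something else in its place).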
API
transform
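transform(input: string, {
  replaceTags?: { [tagName: string]: string | string[] | ((node: Element) => Node) },
  reduceAll?: (acc: Node[], node: Node, index: number, childNodes: Node[]) => Node[],
  stripContent?: string[],
  stripTags?: string[],
  trimWhitespace?: boolean,
  fragment?: boolean
}): string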
replaceTags: A mapping of tag names to their replacements. A replacement can be a string that replaces the opening tag, an array of strings that replace the opening and closing tags, or a function that receives the node and returns a replacement node.
reduceAll: A reducer to run over the children of every node. It receives the accumulated childNodes, the current childNode, the current index, and the full list of childNodes.
stripContent: An array of tag names to be removed along with their contents.
stripTags: An array of tag names to be removed while retaining their contents.
trimWhitespace: Handle indentation and newlines by removing whitespace-only text nodes and trimming multiple leading or trailing whitespace characters in a text node to a single space. default = true
fragment: Parse the input as an html fragment rather than a full document. default = true
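The `trimWhitespace` rule can be read as a function over text-node values. The sketch below is one interpretation of that description, not the library's implementation: `normalizeText` is a hypothetical helper that returns `null` for a whitespace-only value (meaning the node would be dropped) and otherwise collapses a run of two or more leading or trailing whitespace characters to a single space.

```javascript
// Hypothetical helper: one reading of the trimWhitespace rule, applied to the
// value of a single text node. Returns null when the node would be removed.
const normalizeText = (value) => {
  // Whitespace-only text (indentation, blank lines) is dropped entirely.
  if (value.trim() === '') return null
  // Runs of two or more leading or trailing whitespace characters become one space.
  return value.replace(/^\s{2,}/, ' ').replace(/\s{2,}$/, ' ')
}

console.log(JSON.stringify(normalizeText('\n    ')))               // prints null
console.log(JSON.stringify(normalizeText('\n    Some text\n  ')))  // prints " Some text "
console.log(JSON.stringify(normalizeText('Some text')))            // prints "Some text"
```

Whitespace inside the text is left alone in this reading; only the edges of each text node are touched.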
Helpers
node factory
h(html: string): Node
h(type: string, value: string): TextNode | CommentNode
h(tagName: string, attrs: Attribute[], childNodes: Node[]): Element
attribute helpers
// get the value of the named attribute, if present
getAttr(node: Element, name: string): string | undefined

// add an attribute to an element, or update it if it already exists
withAttr(node: Element, name: string, value: string): Element

// check if the element has a class
hasClass(node: Element, name: string): boolean

// add a class to an element if not already present
withClass(node: Element, name: string): Element
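For a sense of what these helpers do, here is a minimal sketch over parse5-style elements, whose attributes are stored as an array of `{ name, value }` objects. The functions follow the signatures listed for the attribute helpers but are written from scratch for illustration. That each helper returns a new element instead of mutating its argument is an assumption, not something stated here.

```javascript
// Illustrative re-implementations, assuming parse5-style elements with an
// `attrs` array of { name, value } objects. Not the library's source.
const getAttr = (node, name) => {
  const attr = node.attrs.find((a) => a.name === name)
  return attr ? attr.value : undefined
}

// Assumption: return a copy with the attribute updated in place or appended.
const withAttr = (node, name, value) => {
  const exists = node.attrs.some((a) => a.name === name)
  const attrs = exists
    ? node.attrs.map((a) => (a.name === name ? { name, value } : a))
    : node.attrs.concat({ name, value })
  return { ...node, attrs }
}

// Class names are the whitespace-separated tokens of the class attribute.
const hasClass = (node, name) =>
  (getAttr(node, 'class') || '').split(/\s+/).includes(name)

const withClass = (node, name) =>
  hasClass(node, name)
    ? node
    : withAttr(node, 'class', [getAttr(node, 'class'), name].filter(Boolean).join(' '))

const td = {
  nodeName: 'td',
  tagName: 'td',
  attrs: [{ name: 'class', value: 'text-center' }],
  childNodes: [],
}

console.log(hasClass(td, 'text-center'))                      // prints true
console.log(getAttr(withClass(td, 'wide'), 'class'))          // prints text-center wide
console.log(getAttr(withAttr(td, 'align', 'left'), 'align'))  // prints left
```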