@atjson/source-html
0.37.0 • Public • Published

🧭 @atjson/source-html

The HTML source turns an HTML document into an annotated document, with the raw HTML source as the text, and all the tags (and attributes) as annotations.
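To make the "raw HTML as text, tags as annotations" idea concrete, here is a minimal sketch. It is not how @atjson/source-html is implemented, and the `TagAnnotation` shape and `annotateTags` function are hypothetical names invented for illustration. It assumes well-formed, properly nested markup with double-quoted attributes and no void elements such as `<br>`.

```typescript
// Hypothetical, simplified annotation shape (not the package's real classes).
interface TagAnnotation {
  type: string; // lowercased tag name, e.g. "a"
  start: number; // offset of "<" in the opening tag
  end: number; // offset just past ">" of the closing tag
  attributes: Record<string, string>; // attributes from the opening tag
}

// Naive scanner: the text is the untouched HTML, and every element
// becomes one annotation spanning its opening tag through its closing tag.
function annotateTags(html: string): { text: string; annotations: TagAnnotation[] } {
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)((?:\s+[^\s=>]+="[^"]*")*)\s*>/g;
  const open: TagAnnotation[] = []; // stack of elements not yet closed
  const annotations: TagAnnotation[] = [];

  let match: RegExpExecArray | null;
  while ((match = tagPattern.exec(html)) !== null) {
    const [whole, slash, name, rawAttrs] = match;
    if (slash === "") {
      // Opening tag: collect its attributes and push it on the stack.
      const attributes: Record<string, string> = {};
      const attrPattern = /([^\s=]+)="([^"]*)"/g;
      let attr: RegExpExecArray | null;
      while ((attr = attrPattern.exec(rawAttrs)) !== null) {
        attributes[attr[1]] = attr[2];
      }
      open.push({ type: name.toLowerCase(), start: match.index, end: -1, attributes });
    } else {
      // Closing tag: it must match the most recently opened element.
      const annotation = open.pop();
      if (annotation === undefined || annotation.type !== name.toLowerCase()) {
        throw new Error(`Unbalanced closing tag </${name}>`);
      }
      annotation.end = match.index + whole.length;
      annotations.push(annotation);
    }
  }
  return { text: html, annotations };
}
```

Because the text is left unchanged, slicing it with an annotation's `start` and `end` returns the full element, tags included. The real package covers the full HTML spec; this sketch only illustrates the offset-based model.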

This source can be used to parse HTML pages and convert them into another form of markup, such as Markdown. The following snippet does this:

import HTMLSource from "@atjson/source-html";
import CommonMarkRenderer from "@atjson/renderer-commonmark";
import OffsetSource from "@atjson/offset-annotations";

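function htmlToMarkdown(html: string) {
  // Parse the raw HTML, convert its annotations to OffsetSource,
  // then render the converted document as CommonMark.
  return CommonMarkRenderer.render(
    HTMLSource.fromRaw(html).convertTo(OffsetSource)
  );
}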

🔮 Insights into your HTML

The HTML source is particularly useful for taking existing HTML and modernizing it into a richer experience. As an example of how powerful this can be, at Condé Nast we used atjson to turn a complex static / JS-rendered webpage into a React application.

💁‍♂️ How Annotations are generated

We dynamically generate the HTML annotations for this package directly from the WHATWG HTML spec. To regenerate the annotations for this source, run the script in scripts/generate-annotations.js:

node ./scripts/generate-annotations.js

This will regenerate all files in the annotations directory, so beware! Any manual changes can, and probably will, be overwritten by this script in the future as the HTML spec evolves.

Install

npm i @atjson/source-html

Version

0.37.0

License

Apache-2.0

Collaborators

  • copilot-robot
  • andrealandonio
  • igostu
  • nayeemrehman
  • varun9110
  • anurag-cn
  • tce
  • bbui
  • gmedina
  • dkorenblyum