html-to-json-rs

HtmlToJson

Description

HtmlToJson converts HTML strings to JSON and back, using Rust compiled to WebAssembly. The package is still at an early stage, but it gets the job done.

This package is built on the HtmlEditor library.

Getting Started

Installation

To use HtmlToJson in your project, you can install it using npm:

npm install html-to-json-rs

Usage

After installing the package, you can use it in your project by importing it into your JavaScript or TypeScript code:

import {init, jsonToHtml, htmlToJson, NODES} from 'html-to-json-rs';

const main = async () => {
    /**
     * Call init first, before any other function.
     * init is async; the other functions are synchronous.
     */
    await init();
    
    // Constants for all node types; compare them against the obj_type field
    console.log(NODES);
    
    // Example: Convert HTML to JSON (the result is a JSON string)
    const htmlString = '<p>Hello, </p><span>World!</span>';
    const jsonResult = htmlToJson(htmlString);
    console.log(jsonResult);
    
    // Example: Convert JSON to HTML
    const jsonObject = [
        { 
            obj_type: 'Element',
            name: 'p',
            attrs: [],
            children: [{ obj_type: 'Text', name: 'Text', text: 'World!' }]
        }
    ];
    
    const htmlResult = jsonToHtml(JSON.stringify(jsonObject));
    console.log(htmlResult);
};

main();

JSON structure

htmlToJson returns a JSON string containing an array of JsonObj objects. If you want to render the result yourself, the structure is described below. JsonObj is the core structure of the generated JSON and has the following fields:

  • obj_type (String): Indicates the type of the JSON object. Possible values include:

    • "Element": Represents an HTML element.
    • "Text": Represents text content.
    • "Comment": Represents a comment in the HTML.
    • "Doctype": Represents the document type declaration.
  • text (String): The content of a Text or Comment node.

  • name (String): The name of the node. For elements, this is the tag name.

  • attrs (Array of Tuples): The attributes of the HTML element. Each attribute is a [key, value] tuple.

  • children (Array of JsonObj): The child nodes of an HTML element, reflecting the nested structure of the HTML.

  • id (String): The tag's id. It is also present in the attrs array (a future version may remove it from attrs).

  • class (String): The tag's classes. They are also present in the attrs array (a future version may remove them from attrs).

The fields text, attrs, children, id, and class are optional. For example, if an HTML tag has no id attribute, the id field is absent from the JSON (undefined in JavaScript).
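For TypeScript users, the fields above can be written down as a type. The following is a sketch based only on the field list in this section, not a type shipped by the package: the names JsonObj (as a TypeScript interface), Attr, and parseJsonObjs are made up for illustration, and the sample string stands in for what htmlToJson might return.

```typescript
// Hypothetical types mirroring the field list above; not exported by the package.
type Attr = [string, string];

interface JsonObj {
  obj_type: "Element" | "Text" | "Comment" | "Doctype";
  name: string;
  text?: string;
  attrs?: Attr[];
  children?: JsonObj[];
  id?: string;
  class?: string;
}

// htmlToJson returns a JSON string, so it has to be parsed before use.
// Only the top-level shape is checked here; the fields are trusted as-is.
function parseJsonObjs(json: string): JsonObj[] {
  const parsed: unknown = JSON.parse(json);
  if (!Array.isArray(parsed)) {
    throw new Error("expected a JSON array of nodes");
  }
  return parsed as JsonObj[];
}

// Stand-in for a string returned by htmlToJson (assumed shape).
const sample =
  '[{"obj_type":"Element","name":"p","attrs":[["id","a"]],"id":"a",' +
  '"children":[{"obj_type":"Text","name":"Text","text":"Hi","attrs":[],"children":[]}]}]';

for (const node of parseJsonObjs(sample)) {
  // Optional fields may be missing, so fall back to a default.
  console.log(node.obj_type, node.name, node.id ?? "(no id)");
}
```

Because id and class are optional, the nullish fallback keeps the code working for tags that carry neither.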

Here is an example: a simple HTML fragment, followed by its JSON representation:

<p class="ql-align-center">
    Hello <span id="fav" style="color: rgb(230, 0, 0);">World</span>
</p>

[
  {
    "obj_type": "Element",
    "name": "p",
    "attrs": [
      [
        "class",
        "ql-align-center"
      ]
    ],
    "class": "ql-align-center",
    "children": [
      {
        "obj_type": "Text",
        "name": "Text",
        "text": "Hello ",
        "attrs": [],
        "children": []
      },
      {
        "obj_type": "Element",
        "name": "span",
        "attrs": [
          ["style", "color: rgb(230, 0, 0);"],
          ["id", "fav"]
        ],
        "id": "fav",
        "children": [
          {
            "obj_type": "Text",
            "name": "Text",
            "text": "World",
            "attrs": [],
            "children": []
          }
        ]
      }
    ]
  }
]
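
If you render the JSON yourself rather than calling jsonToHtml, a recursive walk over the tree is enough for simple cases. The sketch below is an illustration under stated assumptions, not the package's own renderer: renderNode, renderAll, escapeHtml, and the JsonObj interface are hypothetical names, text and attribute values are assumed to be stored unescaped, void elements are covered by a short hard-coded list, and Doctype nodes are skipped because this section does not say which field carries their content.

```typescript
// Hypothetical type based on the field list in this section.
interface JsonObj {
  obj_type: string;
  name: string;
  text?: string;
  attrs?: [string, string][];
  children?: JsonObj[];
  id?: string;
  class?: string;
}

// A few elements that take no closing tag (not an exhaustive list).
const VOID_ELEMENTS = new Set(["br", "hr", "img", "input", "link", "meta"]);

// Assumes values are stored unescaped; drop this if your data is already escaped.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function renderNode(node: JsonObj): string {
  switch (node.obj_type) {
    case "Text":
      return escapeHtml(node.text ?? "");
    case "Comment":
      return `<!--${node.text ?? ""}-->`;
    case "Element": {
      // id and class are already in attrs, so attrs alone is enough here.
      const attrs = (node.attrs ?? [])
        .map(([key, value]) => ` ${key}="${escapeHtml(value)}"`)
        .join("");
      if (VOID_ELEMENTS.has(node.name)) {
        return `<${node.name}${attrs}>`;
      }
      const children = (node.children ?? []).map((child) => renderNode(child)).join("");
      return `<${node.name}${attrs}>${children}</${node.name}>`;
    }
    default:
      // Doctype and unknown node types are skipped in this sketch.
      return "";
  }
}

function renderAll(nodes: JsonObj[]): string {
  return nodes.map((node) => renderNode(node)).join("");
}

// The example tree from this section, with empty arrays omitted.
const tree: JsonObj[] = [
  {
    obj_type: "Element",
    name: "p",
    attrs: [["class", "ql-align-center"]],
    class: "ql-align-center",
    children: [
      { obj_type: "Text", name: "Text", text: "Hello " },
      {
        obj_type: "Element",
        name: "span",
        attrs: [["style", "color: rgb(230, 0, 0);"], ["id", "fav"]],
        id: "fav",
        children: [{ obj_type: "Text", name: "Text", text: "World" }],
      },
    ],
  },
];

console.log(renderAll(tree));
// <p class="ql-align-center">Hello <span style="color: rgb(230, 0, 0);" id="fav">World</span></p>
```

The sketch reads attributes only from attrs. If a future version stops duplicating id and class there, as the field descriptions suggest it might, the renderer would need to emit those two fields separately.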

Package details

  • Version: 0.0.6
  • License: MIT
  • Weekly downloads: 48
  • Unpacked size: 229 kB
  • Total files: 40
  • Collaborators: ildar7sins