Node.js Markdown to HTML Parser
This Markdown-to-HTML parser uses a custom, lightweight, Markdown syntax.
It supports italic, bold, strikethrough and superscript text, headers, links, images, videos, audio, inline code, multiline code, unordered lists, ordered lists, nested lists, horizontal lines, quotes and footnotes.
A browser module is also available here: @deskeen/markdown-browser
Usage
const parser = require('@deskeen/markdown')
const html = parser.parse('some markdown text').innerHTML
// html === '<p>some markdown text</p>'
Learn more
- Installation
- Parser options
- Element object
- Markdown syntax cheatsheet
- Markdown syntax
- Compatibility with other popular Markdown
- Unsupported syntaxes
- Examples
- Other resources
- FAQ
- Contact
- Licence
Installation
This package can be added to your Node.js dependencies by running:
npm install @deskeen/markdown
To import the parser to your JavaScript code, use:
const parser = require('@deskeen/markdown')
To parse a text and transform it into HTML, use:
const htmlCode = parser.parse('some markdown text').innerHTML
The parser has been tested with Node.js v10+, but it may work with older Node.js versions too.
Parser options
const element = parse(markdownText[, options])
An option object can be passed to the parser.
Available options are:
- allowHeader: Whether headers are allowed. Defaults to true.
- allowHeaderFormat: Whether formatted text is allowed in headers. Defaults to false.
- allowLink: Whether links are allowed. Defaults to true.
- allowImage: Whether images are allowed. Defaults to true.
- allowCode: Whether inline code is allowed. Defaults to true.
- allowMultilineCode: Whether multiline code is allowed. Defaults to true.
- allowUnorderedList: Whether unordered lists are allowed. Defaults to true.
- allowUnorderedNestedList: Whether unordered nested lists are allowed. Defaults to true.
- allowOrderedList: Whether ordered lists are allowed. Defaults to true.
- allowOrderedNestedList: Whether ordered nested lists are allowed. Defaults to true.
- allowHorizontalLine: Whether horizontal lines are allowed. Defaults to true.
- allowQuote: Whether quotes are allowed. Defaults to true.
- allowFootnote: Whether footnotes are allowed. Defaults to false.
- allowHTMLAttributes: Whether HTML attributes are allowed. Defaults to false (beta).
- maxHeader: Maximum header level, a number from 1 to 6 inclusive. For example, 2 means that the only authorized header tags are <h1> and <h2>. Defaults to 3.
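As an illustration of how these flags combine, here is a minimal sketch of two option presets. The preset names and the chosen values are hypothetical; only the option keys come from the list above.

```javascript
// Hypothetical preset for untrusted input such as user comments.
// Every key is an option listed above; the values are illustrative.
const commentOptions = {
  allowHeader: false,
  allowImage: false,
  allowHorizontalLine: false,
  allowFootnote: false,
  allowLink: true,
  allowCode: true,
  allowQuote: true
}

// Hypothetical preset for long-form articles.
const articleOptions = {
  allowHeaderFormat: true,
  allowFootnote: true,
  maxHeader: 4
}

// With the package installed, a preset is passed as the second argument:
// const parser = require('@deskeen/markdown')
// const html = parser.parse(markdownText, commentOptions).innerHTML
```

Options that are left out keep the defaults listed above.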
Callback functions can be passed in the options as well. They allow you to edit the output element (e.g. add custom attributes).
Available callbacks are:
- onHeader: Function called when a header is parsed.
- onLink: Function called when a link is parsed.
- onImage: Function called when an image is parsed.
- onAudio: Function called when an audio element is parsed.
- onVideo: Function called when a video is parsed.
- onCode: Function called when inline code is parsed.
- onMultilineCode: Function called when multiline code is parsed. The second argument is the (optional) language name.
- onUnorderedList: Function called when an unordered list is parsed.
- onOrderedList: Function called when an ordered list is parsed.
- onHorizontalLine: Function called when a horizontal line is parsed.
- onQuote: Function called when a quote is parsed.
- onReference: Function called when a footnote reference is parsed. The second argument contains the identifier.
The first argument of the callbacks is always the parsed element:
function onXXX(element) {
  // Your logic here
  // e.g.: element.className = 'css-class'
}
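Because every callback receives the element first, one small helper can be reused across several of them. The sketch below is only an illustration: the withClass helper and the class names are hypothetical, while the callback names come from the list above.

```javascript
// Returns a callback that sets a CSS class on whatever element it receives.
const withClass = className => element => {
  element.className = className
}

// The same helper is reused for several callbacks.
const styledOptions = {
  onQuote: withClass('quote'),
  onUnorderedList: withClass('bullet-list'),
  onOrderedList: withClass('numbered-list')
}

// With the package installed:
// const parser = require('@deskeen/markdown')
// const html = parser.parse('> Some quote', styledOptions).innerHTML
```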
Element object
The parser returns a custom Element that is similar to a DOM Element in the browser.
Available properties and methods are listed below. Each one corresponds to the DOM member of the same name documented in the MDN Docs.
- tagName: Tag name of the element.
- id: id attribute of the element.
- className: Class attribute of the element.
- attributes: Element attributes.
- children: List of child Elements.
- childNodes: List of child Elements and child Texts.
- firstChild: First child.
- lastChild: Last child.
- parentNode: Parent of the element.
- textContent: Text of the element and its descendants.
- hasAttribute(attrName): Returns whether the element has the specified attribute.
- setAttribute(attrName, attrValue): Adds an attribute to the element.
- getAttribute(attrName): Returns an element attribute.
- removeAttribute(attrName): Removes an element attribute.
- appendChild(child): Adds a node to the end of the list of children.
- prepend(...nodesToPrepend): Inserts a set of nodes or texts before the first child.
- append(...nodesToAppend): Inserts a set of nodes or texts after the last child.
- removeChild(child): Removes a child.
- remove(): Removes the element from its parent.
- innerHTML: Returns the HTML markup of the elements contained in the element.
- outerHTML: Returns the HTML markup of the element and its descendants.
New elements can be created by using the Element class and text can be created using the Text class:
const { Element, Text } = require('@deskeen/markdown')
const myDivElement = new Element('div')
const myText = new Text('Some text')
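Since the returned Element exposes DOM-like members, the parsed tree can be walked like a browser DOM. The following is a minimal sketch with a hypothetical collectLinks helper. It assumes that children is iterable and that tagName is upper case, as in the DOM.

```javascript
// Collects the href of every <a> element below `root`, depth first.
// Assumption: `children` is iterable and `tagName` is upper case ('A').
function collectLinks (root, found = []) {
  for (const child of root.children) {
    if (child.tagName === 'A' && child.hasAttribute('href')) {
      found.push(child.getAttribute('href'))
    }
    collectLinks(child, found)
  }
  return found
}

// With the package installed:
// const parser = require('@deskeen/markdown')
// const links = collectLinks(parser.parse('See [this page](https://example.com)'))
```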
Markdown syntax cheatsheet
Type | Markdown syntax |
---|---|
Italic text | *Italic text* |
Bold text | **Bold text** |
Bold-italic text | ***Bold-italic text*** |
Strikethrough text | ~~Strikethrough text~~ |
Superscript text | ^Superscript |
Header | # Header |
Link | [Link text](link_url) |
Image,Video,Audio | ![Alt text](image_url) |
Unordered list | - List item |
Unordered nested list | 2 spaces |
Ordered list | 1. Ordered list item |
Ordered nested list | 3 spaces |
Horizontal Line | \n\n---\n\n |
Inline Code | `Code text` |
Multiline Code | ```\nCode text\n``` |
Quote | > Quote |
Footnote | Reference[^1] |
Escape character | \# Header not parsed |
Markdown syntax
Italic text
Italic text is surrounded by single stars (*).
Example
This is an *italic text*
<p>This is an <em>italic text</em></p>
Bold text
Bold text is surrounded by two stars (**).
Example
This is a **bold text**
<p>This is a <strong>bold text</strong></p>
Bold-italic text
Bold and italic text is surrounded by three stars (***).
Example
This is a ***bold and italic text***
<p>This is a <strong><em>bold and italic text</em></strong></p>
Strikethrough text
Strikethrough text is surrounded by two tildes (~~).
Example
This is a ~~strikethrough text~~
<p>This is a <s>strikethrough text</s></p>
Superscript text
Superscript text starts with a circumflex (^) and ends with a space or a newline. Superscript text that contains spaces can be surrounded by parentheses (( )).
Example
This is a ^superscript text
<p>This is a <sup>superscript</sup> text</p>
Example with parentheses
This is a ^(superscript text)
<p>This is a <sup>superscript text</sup></p>
Paragraph
A single newline inserts a line break within the current paragraph. Two newlines start a new paragraph.
Example with a single newline
First line of text.
Second line of text.
<p>First line of text.<br>Second line of text.</p>
Example of a paragraph
First line of text.

Second line of text.
<p>First line of text.</p>
<p>Second line of text.</p>
Header
A header starts with one to six hashes (#) followed by a space.
Example
# Title level 1
## Title level 2
### Title level 3
#### Title level 4
##### Title level 5
###### Title level 6
<h1>Title level 1</h1>
<h2>Title level 2</h2>
<h3>Title level 3</h3>
<h4>Title level 4</h4>
<h5>Title level 5</h5>
<h6>Title level 6</h6>
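Headers can also be collected while parsing, for instance to build a table of contents. The sketch below is hypothetical: createTocOptions is not part of the package, and it assumes header elements report a tagName such as 'H2'.

```javascript
// Builds parser options whose onHeader callback records each header.
// `toc` is filled as a side effect while the document is parsed.
function createTocOptions (toc) {
  return {
    maxHeader: 6, // allow every header level
    onHeader: element => {
      toc.push({
        level: Number(element.tagName.slice(1)), // 'H2' -> 2
        text: element.textContent
      })
    }
  }
}

// With the package installed:
// const parser = require('@deskeen/markdown')
// const toc = []
// parser.parse('# Title\n\n## Section', createTocOptions(toc))
```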
Link
A link is made up of two parts: the text in square brackets ([]), followed by a URL in round brackets (( )), i.e. [Link](url). Closing brackets in the text must be escaped.
Example
This is a [link](https://example.com)
<p>This is a <a href="https://example.com">link</a></p>
Example with a closing bracket
According to [[1\]](#ref1), this package is the best!
<p>According to <a href="#ref1">[1]</a>, this package is the best!</p>
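When the Markdown comes from untrusted users, an onLink callback can restrict which URL schemes are kept. This README does not say whether the parser filters schemes itself, so the sketch below is only a precaution. The keepSafeLink helper and its allow-list are hypothetical.

```javascript
const ALLOWED_PROTOCOLS = ['http:', 'https:', 'mailto:']

// Removes the href when its scheme is not in the allow-list.
// Relative URLs and fragments such as "#ref1" resolve against the dummy base
// and are therefore kept.
function keepSafeLink (element) {
  const href = element.getAttribute('href') || ''
  let protocol
  try {
    protocol = new URL(href, 'https://base.invalid').protocol
  } catch (error) {
    protocol = null // unparsable URL
  }
  if (!ALLOWED_PROTOCOLS.includes(protocol)) {
    element.removeAttribute('href')
  }
}

// With the package installed:
// const parser = require('@deskeen/markdown')
// parser.parse(markdownText, { onLink: keepSafeLink }).innerHTML
```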
Image
An image starts with an exclamation mark (!), followed by an alt text in square brackets ([]), followed by the URL in round brackets (( )), i.e. ![alt text](image_url).
Example of an inline image
This is an ![inline image](https://example.com/some_image.png)
<p>This is an <img src="https://example.com/some_image.png" alt="inline image"></p>
Example of an image on a single line
![Image only on a line](https://example.com/some_image.png)
<figure>
<img src="https://example.com/some_image.png" alt="Image only on a line">
</figure>
Example of an image with a caption
![Image with a caption](https://example.com/some_image.png "caption")
<figure>
<img src="https://example.com/some_image.png" alt="Image with a caption">
<figcaption>caption</figcaption>
</figure>
Example of an image with inline style
![Image with inline style](https://example.com/some_image.png){style="height: 100px; width: 100px"}
<figure style="height: 100px; width: 100px">
<img src="https://example.com/some_image.png" alt="Image with inline style">
</figure>
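Attributes that have no Markdown syntax can be added from the onImage callback. As a small sketch, the hypothetical lazyImage helper below adds the standard HTML loading and decoding attributes to each image.

```javascript
// Adds native lazy-loading hints to an image element.
function lazyImage (element) {
  element.setAttribute('loading', 'lazy')
  element.setAttribute('decoding', 'async')
}

// With the package installed:
// const parser = require('@deskeen/markdown')
// parser.parse('![Alt text](https://example.com/some_image.png)', { onImage: lazyImage })
```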
Video
Videos work the same way as images, i.e. ![][video_url].
Example of a video
![][https://example.com/some_video.mp4]
<figure>
<video controls="">
<source src="https://example.com/some_video.mp4" type="video/mp4">
</video>
</figure>
Example of a video with a caption
![][https://example.com/some_video.mp4 "my caption"]
<figure>
<video controls="">
<source src="https://example.com/some_video.mp4" type="video/mp4">
</video>
<figcaption>my caption</figcaption>
</figure>
Audio
Audio elements work the same way as images, i.e. ![][audio_url].
Example of an audio
![][https://example.com/some_audio.mp3]
<figure>
<audio controls="">
<source src="https://example.com/some_audio.mp3" type="audio/mpeg">
</audio>
</figure>
Example of an audio with a caption
![][https://example.com/some_audio.mp3 "my audio caption"]
<figure>
<audio controls="">
<source src="https://example.com/some_audio.mp3" type="audio/mpeg">
</audio>
<figcaption>my audio caption</figcaption>
</figure>
Unordered list
Unordered list items start with a dash (-) followed by a space.
Newlines can be inserted within a list item by starting the line with two spaces.
Nested list items start with at least two spaces, followed by a dash and another space. Only one level of nesting is allowed.
Example
- Item 1
- Item 2
- Item 3
<ul>
<li>Item 1</li>
<li>Item 2</li>
<li>Item 3</li>
</ul>
Example with a newline within an item
- Item 1
  Following Item 1
- Item 2
<ul>
<li>Item 1<br>Following Item 1</li>
<li>Item 2</li>
</ul>
Example with nested list
- Item 1
  - Item 1.1
  - Item 1.2
- Item 2
<ul>
<li>
Item 1
<ul>
<li>Item 1.1</li>
<li>Item 1.2</li>
</ul>
</li>
<li>Item 2</li>
</ul>
Ordered list
Ordered list items start with a number, followed by a period (.) and a space.
Newlines can be inserted within a list item by starting the line with three spaces.
Nested list items start with at least three spaces, followed by a number, a period and a space. Only one level of nesting is allowed.
Example
1. Item 1
2. Item 2
3. Item 3
<ol>
<li>Item 1</li>
<li>Item 2</li>
<li>Item 3</li>
</ol>
Example with a newline within an item
1. Item 1
   Following Item 1
2. Item 2
<ol>
<li>Item 1<br>Following Item 1</li>
<li>Item 2</li>
</ol>
Example with nested list
1. Item 1
   1. Item 1.1
   2. Item 1.2
2. Item 2
<ol>
<li>
Item 1
<ol>
<li>Item 1.1</li>
<li>Item 1.2</li>
</ol>
</li>
<li>Item 2</li>
</ol>
The numbers of the ordered list items are not taken into account. The list is rendered the same way whether the numbers are in order, or not.
Example with numbers not in order
5. Item 1
1. Item 2
2. Item 3
1. Item 4
<ol>
<li>Item 1</li>
<li>Item 2</li>
<li>Item 3</li>
<li>Item 4</li>
</ol>
Horizontal Line
A horizontal line consists of an empty line, followed by three dashes (---), followed by another empty line.
Example
Above horizontal line

---

Below horizontal line
<p>Above horizontal line</p>
<hr>
<p>Below horizontal line</p>
Code
Inline code is surrounded by a single backtick (`).
Example
This is `some technical term`
<p>This is <code>some technical term</code></p>
Multiline Code
A multiline code is surrounded by three backticks (```) set on separate lines.
The language name of the code can be added to the opening backticks. It is not included in the output HTML but it is passed to the onMultilineCode callback. See the example further down.
Example
\`\`\`
Some code line 1
Some code line 2
Some code line 3
\`\`\`
<pre><code>Some code line 1
Some code line 2
Some code line 3</code></pre>
Example with language name
\`\`\`javascript
console.log('Hello World!')
\`\`\`
<pre><code>console.log('Hello World!')</code></pre>
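Since the language name is not part of the output, a callback can put it back, for example as a class that a syntax highlighter can pick up. The sketch below is an illustration: tagLanguage and the "language-" prefix are hypothetical, and it relies on the callback receiving the <pre> element with the <code> element as its first child.

```javascript
// Copies the fence language onto the <code> element as a "language-*" class.
// `element` is expected to be the <pre>, its first child the <code>.
function tagLanguage (element, language) {
  if (language) {
    element.firstChild.className = 'language-' + language
  }
}

// With the package installed:
// const parser = require('@deskeen/markdown')
// parser.parse(markdownText, { onMultilineCode: tagLanguage }).innerHTML
```

Code blocks without a language name are left untouched.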
Quote
A quote starts with a "greater than" sign (>).
Example
> Quote Line 1
> Quote Line 2
> Quote Line 3
<blockquote>
<p>
Quote Line 1
<br>
Quote Line 2
<br>
Quote Line 3
</p>
</blockquote>
Footnote
A footnote is made up of two parts: a reference and a note.
The reference starts with an opening square bracket ([), followed by a circumflex (^), an identifier (a number or a text, but no space) and a closing square bracket (]), e.g. [^1].
The note should be on its own line anywhere in the document and should match the reference. An extra colon (:) is added next to the reference, followed by the note text.
The reference identifier is only used to link the reference with the footnote. The HTML output will be numbered sequentially.
Example
This is the first reference[^1].

And the second one[^two].

[^1]: First footnote.
[^two]: Second footnote.
<p>This is the first reference<a href="#reference1"><sup>1</sup></a>.</p>
<p>And the second one<a href="#reference2"><sup>2</sup></a>.</p>
<section>
<ol>
<li id="reference1">First footnote.</li>
<li id="reference2">Second footnote.</li>
</ol>
</section>
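Footnotes are disabled by default (allowFootnote defaults to false), so they have to be switched on explicitly. The sketch below also uses the onReference callback to keep track of the identifiers found. The variable names and the title attribute are illustrative, and the exact element passed to the callback is not specified here.

```javascript
// Identifiers are collected in the order the references are parsed.
const seenReferences = []

const footnoteOptions = {
  allowFootnote: true, // off by default
  onReference: (element, identifier) => {
    seenReferences.push(identifier)
    element.setAttribute('title', 'Footnote ' + identifier)
  }
}

// With the package installed:
// const parser = require('@deskeen/markdown')
// parser.parse('Some text[^1].\n\n[^1]: Some note.', footnoteOptions).innerHTML
```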
Escape character
The escape character is a backslash (\). It can be used to tell the parser not to interpret Markdown syntax characters, i.e. *, [, `, !, #, ~, ^ and \.
Example
This \*italic text\* is not converted into HTML.
<p>This *italic text* is not converted into HTML.</p>
Example 2
This backslash \ is not removed because it is not followed by a special character.
<p>This backslash \ is not removed because it is not followed by a special character.</p>
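To insert arbitrary text into a Markdown document without it being interpreted, the characters listed above can be escaped programmatically. This is a minimal sketch with a hypothetical escapeMarkdown helper. It only covers the characters listed above, not line-level syntax such as list dashes or quote signs.

```javascript
// Prefixes each escapable character (* [ ` ! # ~ ^ \) with a backslash.
function escapeMarkdown (text) {
  return text.replace(/[*[`!#~^\\]/g, '\\$&')
}

// Example: the result can be embedded in a larger Markdown string.
const userInput = '# not a header, *not italic*'
const markdownText = 'User wrote: ' + escapeMarkdown(userInput)
```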
Compatibility with other popular Markdown
A tick (☑) means that the syntax should work on the platform.
Type | Syntax | GitHub | Reddit | GitLab | CommonMark |
---|---|---|---|---|---|
Italic | * | ☑ | ☑ | ☑ | ☑ |
Bold | ** | ☑ | ☑ | ☑ | ☑ |
Bold-italic | *** | ☑ | ☑ | ☑ | ⚠ N/A |
Strikethrough | ~~ | ☑ | ☑ | ☑ | ⚠ N/A |
Newline | \n | ☑ | ☑ | ⚠ Space | ⚠ Space |
Paragraph | \n\n | ☑ | ☑ | ☑ | ☑ |
Header | # | ☑ | ☑ | ☑ | ☑ |
Link | []() | ☑ | ☑ | ☑ | ☑ |
Image | ![]() | ☑ | ☑ | ☑ | ☑ |
Un. list | - | ☑ | ☑ | ☑ | ☑ |
Un. list \n | 2 spaces | ☑ | ☑ | ☑ | ⚠ \n |
Un. nested | 2 spaces | ☑ | ☑ | ☑ | ☑ |
Ord. list | 1. | ☑ | ☑ | ☑ | ☑ |
Ord. list \n | 3 spaces | ☑ | ☑ | ☑ | ⚠ \n |
Ord. nested | 3 spaces | ☑ | ☑ | ☑ | ☑ |
Horiz. Line | \n---\n | ☑ | ☑ | ☑ | ☑ |
Code | ` | ☑ | ☑ | ☑ | ☑ |
MultiCode | ``` | ☑ | ☑ | ☑ | ☑ |
Quote | > | ☑ | ☑ | ☑ | ☑ |
Escape char | \ | ☑ | ☑ | ☑ | ☑ |
Superscript | ^ | ⚠ HTML | ☑ | ⚠ HTML | ⚠ N/A |
Subscript | N/A | ⚠ HTML | ☑ N/A | ⚠ HTML | ⚠ N/A |
Footnote | [^1] | ☑ | ⚠ N/A | ⚠ Diff. | ⚠ N/A |
HTML | N/A | ⚠ Av. | ☑ N/A | ⚠ Av. | ⚠ Av. |
Source: GitHub Markdown, Reddit Markdown, GitLab Markdown, CommonMark
Unsupported syntaxes
The following syntaxes are NOT supported:
- Italic, bold and italic-bold texts with one, two and three underscores.
- Headers with dashes/equal signs underneath.
- Unordered Lists with a plus sign or a star.
- More than one level of nested lists.
- Horizontal lines with stars or underscores.
- Links with less-than and greater-than signs.
- HTML code.
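Text written for other Markdown flavors can sometimes be adapted before parsing. As a rough sketch, the hypothetical normalizeBullets helper below rewrites star and plus bullets as dashes. It is a plain text substitution, so it would also touch matching lines inside code blocks.

```javascript
// Rewrites "* item" and "+ item" bullets as "- item", keeping the indentation.
function normalizeBullets (markdownText) {
  return markdownText.replace(/^([ \t]*)[*+] /gm, '$1- ')
}

// With the package installed:
// const parser = require('@deskeen/markdown')
// parser.parse(normalizeBullets('* Item 1\n* Item 2')).innerHTML
```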
Examples
In the examples below, parseMarkdown stands for the parse function of the parser (parser.parse in the Usage section).
Add an identifier to headers
parseMarkdown('# Title 1', {
  onHeader: element => {
    // element.textContent === 'Title 1'
    element.id = element.textContent.replace(/ /g, '-').toLowerCase()
  }
}).innerHTML
<h1 id="title-1">Title 1</h1>
Open external links in a new tab
parseMarkdown('See [this page](https://example.com)!', {
  onLink: element => {
    // element.getAttribute('href') === 'https://example.com'
    const href = element.getAttribute('href')
    if (href.startsWith('https://MY_SITE.com') === false) {
      element.setAttribute('target', '_blank')
    }
  }
}).innerHTML
<p>See <a href="https://example.com" target="_blank">this page</a>!</p>
Add a base URL to images with a relative link
parseMarkdown('![Beautiful image](beautiful_image.png)', {
  onImage: element => {
    // element.tagName === 'IMG'
    // element.getAttribute('src') === 'beautiful_image.png'
    // element.getAttribute('alt') === 'Beautiful image'
    if (element.hasAttribute('src')) {
      const src = element.getAttribute('src')
      if (src.startsWith('http') === false) {
        element.setAttribute('src', 'https://example.com/' + src)
      }
    }
  }
}).innerHTML
<figure>
<img src="https://example.com/beautiful_image.png" alt="Beautiful image">
</figure>
Add a CSS Class to inline codes
parseMarkdown('This is the body HTML tag: `<body>`', {
  onCode: element => {
    element.className = 'some-class'
  }
}).innerHTML
<p>This is the body HTML tag: <code class="some-class">&lt;body&gt;</code></p>
Pretty print JSON objects
const markdownText = '```json\n{"some_property":"foo","some_other_property":"bar"}\n```'
parseMarkdown(markdownText, {
  onMultilineCode: (element, language) => {
    if (language === 'json') {
      // element is a <pre> tag that includes the <code> tag
      const codeElement = element.firstChild
      const codeText = codeElement.textContent
      const jsonObject = JSON.parse(codeText)
      codeElement.textContent = JSON.stringify(jsonObject, null, 2)
    }
  }
}).innerHTML
<pre><code>{
  "some_property": "foo",
  "some_other_property": "bar"
}</code></pre>
Other resources
- Original Markdown: https://daringfireball.net/projects/markdown/
- CommonMark: https://commonmark.org/
FAQ
What can I do if I have a problem?
You can raise an issue and ask for help.
What can I do to help?
You can:
- Have a look at the issues and see if you can help someone.
- Have a look at the code and see if you can improve it.
- Translate this README into your language.
- Star this repo.
Contact
You can reach me at {my_firstname}@{my_name}.fr
Licence
MIT Licence - Copyright (c) Morgan Schmiedt