unified-doc-util-text-offsets
unified-doc hast utility to add text offsets to text nodes.
Install
npm install unified-doc-util-text-offsets
Use
Given a hast tree parsed from some HTML content:
// html: '<blockquote><strong>some</strong>\ncontent</blockquote>'
const hast = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'blockquote',
      children: [
        {
          type: 'element',
          tagName: 'strong',
          children: [{ type: 'text', value: 'some' }],
        },
        { type: 'text', value: '\ncontent' },
      ],
    },
  ],
};
API
textOffsets(hast)
Interface
;
Accepts a hast tree and adds textOffset data to text nodes. Returns a new tree.
A TextOffset for a text node tracks the start and end offset of its text value relative to the textContent representation of the provided hast tree. The textContent representation of a hast tree is the concatenation of all text node values under the tree. The following pseudocode helps visualize this behavior:
const html = '<blockquote><strong>some</strong>\ncontent</blockquote>';
const textContent = 'some\ncontent';
const textNodes = ['some', '\ncontent'];
const textOffsets = [
  { start: 0, end: 4 }, // "[some]\ncontent"
  { start: 4, end: 12 }, // "some[\ncontent]"
];
// textOffset data mentioned above attached to text nodes
const withTextOffsets = ...;
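The offset bookkeeping described above can be sketched in plain JavaScript. This is a minimal illustration and not the package's implementation: the helper names textContent and addTextOffsets are hypothetical, and storing the result under data.textOffset is an assumption based on the description of "textOffset data" on text nodes.

```javascript
// Hypothetical helper: concatenates all text node values under a hast node.
function textContent(node) {
  if (node.type === 'text') return node.value;
  return (node.children || []).map(textContent).join('');
}

// Hypothetical helper: returns a new tree in which every text node carries
// data.textOffset = { start, end } (the property location is an assumption).
function addTextOffsets(tree) {
  let position = 0; // running offset into the tree's textContent

  function visit(node) {
    if (node.type === 'text') {
      const start = position;
      position += node.value.length;
      return {
        ...node,
        data: { ...node.data, textOffset: { start, end: position } },
      };
    }
    if (Array.isArray(node.children)) {
      // Children are visited in document order, so offsets accumulate correctly.
      return { ...node, children: node.children.map((child) => visit(child)) };
    }
    return { ...node };
  }

  return visit(tree);
}

const hast = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'blockquote',
      children: [
        {
          type: 'element',
          tagName: 'strong',
          children: [{ type: 'text', value: 'some' }],
        },
        { type: 'text', value: '\ncontent' },
      ],
    },
  ],
};

const withTextOffsets = addTextOffsets(hast);
const blockquote = withTextOffsets.children[0];
const someNode = blockquote.children[0].children[0];
const contentNode = blockquote.children[1];

console.log(someNode.data.textOffset); // { start: 0, end: 4 }
console.log(contentNode.data.textOffset); // { start: 4, end: 12 }
```

Under these assumptions, slicing the textContent string with a node's offsets recovers that node's value, for example textContent(hast).slice(4, 12) gives '\ncontent'. The input tree is left unmodified because each visited node is copied.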
Related interfaces