@citeproc-rs/wasm
This is a front-end to citeproc-rs, a citation processor written in Rust and compiled to WebAssembly.
It contains builds appropriate for:
- Node.js
- Browsers, using a bundler like Webpack
- Browsers directly importing an ES Module from a webserver
Installation / Release channels
There are two release channels:
Stable is each versioned release. (At the time of writing, there are no versioned releases.) Install with:
yarn add @citeproc-rs/wasm
Canary tracks the master branch on GitHub. Its version numbers follow the format 0.0.0-canary-GIT_COMMIT_SHA, so version ranges in your package.json are not meaningful. But you can install the latest one with:
yarn add @citeproc-rs/wasm@canary
# alternatively, a specific commit
yarn add @citeproc-rs/wasm@0.0.0-canary-COMMIT_SHA
If you use NPM, replace yarn add with npm install.
Including in your project
For Node.js, simply import the package as normal. TypeScript definitions are provided. Parts of the API that cannot have auto-generated type definitions are described in doc comments, each with an accompanying type you can import.
// Node.js
const { Driver } = require("@citeproc-rs/wasm");
Microsoft Edge
Note the caveats around Microsoft Edge's TextEncoder/TextDecoder support in the wasm-bindgen tutorial.
Using Webpack
When loading on the web, for technical reasons and because the compiled WebAssembly is large, you must load the package asynchronously. Webpack comes with the ability to import packages asynchronously like so:
// Webpack
import("@citeproc-rs/wasm")
  .then(go)
  .catch(console.error);
function go(wasm) {
  const { Driver } = wasm;
  // use Driver
}
When you do this, your code will trigger a download (and streaming parse) of the binary, and when that is complete, your go function will be called. The download can of course be cached if your web server is set up correctly, making the whole process very quick.
You can use the regular-import Driver as a TypeScript type anywhere; just don't use it to call new Driver().
React
If you're writing a React app, you may wish to use React.lazy like so:
// App.tsx
import React, { Suspense } from "react";
const AsyncCiteprocEnabledComponent = React.lazy(async () => {
  await import("@citeproc-rs/wasm");
  return await import("./CiteprocEnabledComponent");
});
const App = () => (
  <Suspense
    fallback={<div>Loading citation formatting engine...</div>}>
    <AsyncCiteprocEnabledComponent />
  </Suspense>
);
// CiteprocEnabledComponent
import { Driver } from "@citeproc-rs/wasm";
// ...
Importing it in a script tag (web target)
To directly import it without a bundler in a (modern) web browser with ES modules support, the procedure is different. You must:
- Make the _web subdirectory of the published NPM package available in a content directory on your webserver, or use a CDN like unpkg.
- Include a <script type="module"> tag in your page's <body>, like so:
<script type="module">
  import init, { Driver } from './path/to/_web/citeproc_rs_wasm.js';
  async function run() {
    await init();
    // use Driver
  }
  run();
</script>
Careful: This method does not ensure the package is loaded only once. If you call init again, it will invalidate any previous Drivers you created.
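One way to guard against a second initialization is to wrap the initializer so that it runs at most once. The following is a minimal sketch, not part of the package: `initOnce` and `ensureCiteproc` are hypothetical names, and `fakeInit` is a stand-in for the real `init` export so the snippet runs on its own.

```javascript
// Hypothetical helper (not part of @citeproc-rs/wasm): wrap an async
// initializer so that repeated calls share the first call's promise.
function initOnce(init) {
  let pending = null;
  return function ensureInitialized() {
    if (pending === null) {
      pending = Promise.resolve(init());
    }
    return pending;
  };
}

// Stand-in for the real `init` export, used only to make this sketch runnable.
let initCalls = 0;
const fakeInit = async () => {
  initCalls += 1;
};

// In a page you would pass the real `init` instead of `fakeInit`.
const ensureCiteproc = initOnce(fakeInit);
const first = ensureCiteproc();
const second = ensureCiteproc();
```

Every caller awaits `ensureCiteproc()` before creating a Driver, so existing Drivers are not invalidated by an accidental re-initialization.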
Importing it in a script tag (no-modules target)
This is based on the wasm-bindgen guide entry, noting the caveats. You will, similarly to the web target, need to make the contents of the _no_modules subdirectory of the published NPM package available on a webserver or via a CDN. But it has ONE ADDITIONAL FILE to import via a script tag.
Careful: This method does not ensure the package is loaded only once. If you call init again, it will invalidate any previous Drivers you created.
<html>
  <head>
    <meta content="text/html;charset=utf-8" http-equiv="Content-Type"/>
  </head>
  <body>
    <!-- Include these TWO JS files -->
    <script src='path/to/@citeproc-rs/wasm/_no_modules/citeproc_rs_wasm_include.js'></script>
    <script src='path/to/@citeproc-rs/wasm/_no_modules/citeproc_rs_wasm.js'></script>
    <script>
      // Like with the `--target web` output the exports are immediately
      // available but they won't work until we initialize the module. Unlike
      // `--target web`, however, the globals are all stored on a
      // `wasm_bindgen` global. The global itself is the initialization
      // function and then the properties of the global are all the exported
      // functions.
      //
      // Note that the name `wasm_bindgen` will at some point be configurable with the
      // `--no-modules-global` CLI flag (https://github.com/rustwasm/wasm-pack/issues/729)
      const { Driver } = wasm_bindgen;
      async function run() {
        // Note the _bg.wasm ending
        await wasm_bindgen('path/to/@citeproc-rs/wasm/_no_modules/citeproc_rs_wasm_bg.wasm');
        // Use Driver
      }
      run();
    </script>
  </body>
</html>
Usage in Zotero
There is a special build for Zotero and the legacy Firefox ESR extensions API, which wants a CommonJS module format but without the Node.js fs APIs, and no-modules' loading mechanisms but without the use of window as a global, as it doesn't exist. The files are in the _zotero directory of the NPM package.
Usage is essentially the same as no-modules; you'll need all three files:
@citeproc-rs/wasm/_zotero/citeproc_rs_wasm_include.js
@citeproc-rs/wasm/_zotero/citeproc_rs_wasm.js
@citeproc-rs/wasm/_zotero/citeproc_rs_wasm_bg.wasm
Apart from the CommonJS shims, the main difference is that the API will be loaded onto the Zotero.CiteprocRs object, in order for it all to be linked together.
Careful: This method does not ensure the package is loaded only once. If you call initWasmModule again, it will invalidate any previous Drivers you created.
require("citeproc_rs_wasm_include");
const initWasmModule = require("citeproc_rs_wasm");
const wasmBinaryPromise = Zotero.HTTP
  .request('GET',
    'resource://zotero/citeproc_rs_wasm_bg.wasm',
    { responseType: "arraybuffer" })
  .then(xhr => xhr.response);
await initWasmModule(wasmBinaryPromise);
let driver;
try {
  driver = new Zotero.CiteprocRs.Driver({...});
} catch (e) {
  if (e instanceof Zotero.CiteprocRs.CslStyleError) {
    // ...
  }
}
Usage
Overview
The basic pattern of interactive use is:
1. Create a driver instance with your style
2. Edit the references or the citation clusters as you please
3. Call driver.batchedUpdates()
4. Apply the updates to your document (e.g. GUI)
5. Go to step 2 when a user makes a change
Step three is the important one. Each time you edit a cluster or a reference, it is common for only one or two visible modifications to result. Therefore, the driver only gives you those clusters or bibliography entries that have changed, or have been caused to change by an edit elsewhere. You can submit any number of edits between each call.
The API also allows for non-interactive use. See below.
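Because any number of edits can be submitted between calls to `batchedUpdates()`, an application may want to coalesce a burst of edits into a single diff. The sketch below is one possible way to do that, under stated assumptions: `makeUpdateScheduler` is a hypothetical helper, `fakeDriver` is a stand-in that only counts calls, and the `schedule` callback is injected so the example runs without a browser.

```javascript
// Hypothetical helper: run each edit immediately, but request only one
// batchedUpdates() call per scheduled flush.
function makeUpdateScheduler(driver, applyDiff, schedule) {
  let flushPending = false;
  return function edit(mutate) {
    mutate(driver);
    if (!flushPending) {
      flushPending = true;
      schedule(() => {
        flushPending = false;
        applyDiff(driver.batchedUpdates());
      });
    }
  };
}

// Stand-in driver for illustration only; a real Driver does the actual work.
const fakeDriver = {
  edits: 0,
  batchedCalls: 0,
  insertReference() { this.edits += 1; },
  batchedUpdates() {
    this.batchedCalls += 1;
    return { clusters: [], bibliography: null };
  },
};

const queued = [];   // callbacks waiting to be flushed
const applied = [];  // diffs handed to the UI
const edit = makeUpdateScheduler(
  fakeDriver,
  (diff) => applied.push(diff),
  (callback) => queued.push(callback), // e.g. queueMicrotask in a real app
);

edit((d) => d.insertReference({ id: "a", type: "book", title: "A" }));
edit((d) => d.insertReference({ id: "b", type: "book", title: "B" }));
queued.forEach((callback) => callback()); // simulate the scheduled flush
```

Two edits result in a single `batchedUpdates()` call, which matches the pattern of steps 2 to 4 above.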
Error handling
Many Driver methods can throw errors.
You can handle the errors from this library specifically, which is mainly useful for showing style parse or validation errors. Some error types have structured data attached to them.
// Declare the driver outside the try block so that `finally` can see it.
let driver;
try {
  driver = new Driver({ ... });
  // do stuff with driver
} catch (error) {
  if (error instanceof CslStyleError) {
    console.error("Could not parse CSL, error:", error);
  } else if (error instanceof CiteprocRsDriverError) {
    console.error("Error in usage of Driver", error);
  } else if (error instanceof CiteprocRsError) {
    // CslStyleError and CiteprocRsDriverError are both subclasses of
    // CiteprocRsError, so this branch would catch them too had they not
    // been checked already.
    //
    // There may be errors that are not a subclass, but directly an
    // instance of CiteprocRsError, so for completeness one should test for
    // this too.
    console.error("Catch-all error", error);
  } else {
    throw error;
  }
} finally {
  // driver is only undefined if `new Driver` threw an error.
  if (driver) {
    driver.free();
  }
}
The error types must unfortunately be global exports, on window/global/self.
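The order of the `instanceof` checks matters because of the subclass relationship. The sketch below illustrates this with stand-in classes that mirror the hierarchy described above; they are defined locally only so the snippet runs on its own, and `classifyError` is a hypothetical helper. In a real application you would use the global error types instead of redefining them.

```javascript
// Stand-ins mirroring the documented hierarchy; NOT the real classes.
class CiteprocRsError extends Error {}
class CslStyleError extends CiteprocRsError {}
class CiteprocRsDriverError extends CiteprocRsError {}

// Hypothetical helper: check the most specific types first, because every
// CslStyleError is also a CiteprocRsError.
function classifyError(error) {
  if (error instanceof CslStyleError) return "style";
  if (error instanceof CiteprocRsDriverError) return "driver";
  if (error instanceof CiteprocRsError) return "citeproc";
  return "unknown";
}

const kinds = [
  classifyError(new CslStyleError("bad style")),
  classifyError(new CiteprocRsDriverError("bad usage")),
  classifyError(new CiteprocRsError("other")),
  classifyError(new TypeError("unrelated")),
];
```

Anything classified as "unknown" did not come from this library and is usually best rethrown.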
1. Creating a driver instance
First, create a driver. Note that for now, you must also call .free() on the Driver when you are finished with it to deallocate its memory, but there is a TC39 proposal in the implementation phase that will make this unnecessary.
A driver needs at least an XML style string, a fetcher (below), and an output format (one of "html", "rtf" or "plain").
let fetcher = ...; // see below
let driver = new Driver({
  style: "<style version=\"1.0\" class=\"note\" ... > ... </style>",
  format: "html", // optional, html is the default
  formatOptions: { // optional
    linkAnchors: true, // optional, default true
  },
  localeOverride: "de-DE", // optional, like setting default-locale on the style
  // bibliographyNoSort: true // disables sorting on the bibliography
  fetcher,
});
// Fetch the chain of locale files required to use the specified locale
await driver.fetchLocales();
// ... use the driver ...
driver.free()
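Since forgetting `.free()` leaks memory, it can help to centralize the cleanup in one place. The following is a minimal sketch of that idea; `withDriver` is a hypothetical helper, and the object passed to it here is a stand-in rather than a real Driver.

```javascript
// Hypothetical helper: create a driver, hand it to `use`, and always free it,
// even if `use` throws.
function withDriver(createDriver, use) {
  const driver = createDriver();
  try {
    return use(driver);
  } finally {
    driver.free();
  }
}

// Stand-in driver for illustration; in real code createDriver would be
// something like () => new Driver({ style, fetcher }).
const log = [];
const entryCount = withDriver(
  () => ({
    makeBibliography: () => [],
    free: () => log.push("free"),
  }),
  (driver) => {
    log.push("use");
    return driver.makeBibliography().length;
  },
);
```

This version is synchronous. If `use` is an async function (for example because it awaits `driver.fetchLocales()`), an async variant that awaits the callback before calling `free()` would be needed, otherwise the driver is freed while still in use.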
The library parses and validates the CSL style input. Any validation errors are reported, with byte offsets to find the CSL fragment responsible, a descriptive and useful message (in English) and sometimes even a hint for how to fix it. See Error Handling for how to access this.
Fetcher
There are hundreds of locales, and the locales you need depend on the style default, any overrides and any fallback locales defined, so the procedure for retrieving one is asynchronous to allow for fetching one over HTTP. There's not much more to it than this:
class Fetcher {
  async fetchLocale(lang) {
    // Note the backticks: a template literal is needed to interpolate ${lang}.
    return await fetch(`https://some-cdn-with-locales.com/locales-${lang}.xml`)
      .then(res => res.text());
    // or just
    // return "<locale> ... </locale>";
    // return LOCALES_PRELOADED[lang];
    // or if you don't support locales other than the bundled en-US!
    // return null;
  }
}
let fetcher = new Fetcher();
let driver = new Driver({ ..., fetcher });
// Make sure you actually fetch them!
await driver.fetchLocales();
If you don't have async syntax, return a Promise directly, e.g. return Promise.resolve("<locale> ... </locale>").
Declining to provide a locale fetcher in new Driver or forgetting to call await driver.fetchLocales() results in use of the bundled en-US locale. You should also never attempt to use the driver instance while it is fetching locales.
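If several drivers share the same locales, a fetcher can cache what it has already retrieved. The sketch below assumes only the `fetchLocale(lang)` contract shown above; `CachingFetcher` and the injected `loadLocaleXml` function are hypothetical, and the loader used here is a stand-in that returns a placeholder string instead of making an HTTP request.

```javascript
// Hypothetical fetcher that remembers the promise for each locale, so each
// language is loaded at most once.
class CachingFetcher {
  constructor(loadLocaleXml) {
    this.loadLocaleXml = loadLocaleXml; // (lang) => string | Promise<string>
    this.cache = new Map();
  }

  // Returns a Promise directly, which the text above says is acceptable.
  fetchLocale(lang) {
    if (!this.cache.has(lang)) {
      this.cache.set(lang, Promise.resolve(this.loadLocaleXml(lang)));
    }
    return this.cache.get(lang);
  }
}

// Stand-in loader for illustration; a real one might call fetch().
let requests = 0;
const cachingFetcher = new CachingFetcher(async (lang) => {
  requests += 1;
  return `<locale xml:lang="${lang}"> ... </locale>`;
});

const firstGerman = cachingFetcher.fetchLocale("de-DE");
const secondGerman = cachingFetcher.fetchLocale("de-DE");
```

The same `cachingFetcher` instance can then be passed as `fetcher` to more than one `new Driver` call.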
2. Edit the references or the citation clusters
References
You can insert a reference like so. This is a CSL-JSON object.
driver.insertReference({ id: "citekey", type: "book", title: "Title" });
driver.insertReferences([ ... many references ... ]);
driver.resetReferences([ ... deletes any others ... ]);
driver.removeReference("citekey");
Citation Clusters and their Cites
A document consists of a series of clusters, each with a series of cites. Each cluster has an id, which is any old string.
// initClusters is like booting up an existing document and getting up to speed
driver.initClusters([
  { id: "one", cites: [ {id: "citekey"} ] },
  { id: "two", cites: [ {id: "citekey", locator: "56", label: "page" } ] },
]);
// Update or insert any one of them like so
driver.insertCluster({ id: "one", cites: [ { id: "updated_citekey" } ] });
// (You can use `driver.randomClusterId()` to generate a new one at random.)
let three = driver.randomClusterId();
driver.insertCluster({ id: three, cites: [ { id: "new_cluster_here" } ] });
These clusters do not contain position information, so reordering is a separate procedure. Without calling setClusterOrder, the driver considers the document to be empty.
So, setClusterOrder expresses the ordering of the clusters within the document. Each one in the document should appear in this list. You can skip note numbers, which means there were non-citing footnotes in between. Omitting note means it's an in-text reference. Note numbers must be monotonic, but you can have more than one cluster in the same footnote.
driver.setClusterOrder([ { id: "one", note: 1 }, { id: "two", note: 4 } ]);
You will notice that if an interactive user cuts and pastes a paragraph containing citation clusters, the whole reordering operation can be expressed in two calls, one after the cut (with some clusters omitted) and one after the paste (with those same clusters placed somewhere else). No calls to insertCluster need be made.
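The cut-and-paste case can be sketched with plain arrays. The example below is illustrative only: `checkClusterOrder` is a hypothetical helper that reads "monotonic" as "never decreasing" (so two clusters may share a footnote) and skips in-text clusters that have no `note`. The note numbers are made up.

```javascript
// Hypothetical sanity check before calling driver.setClusterOrder(order).
// Assumption: note numbers may repeat but must never decrease.
function checkClusterOrder(order) {
  let lastNote = 0;
  for (const { id, note } of order) {
    if (note === undefined) continue; // in-text cluster, no footnote number
    if (note < lastNote) {
      throw new RangeError(`cluster "${id}" has note ${note} after note ${lastNote}`);
    }
    lastNote = note;
  }
  return order;
}

// The user cuts the paragraph containing cluster "two"...
const afterCut = checkClusterOrder([
  { id: "one", note: 1 },
  { id: "three", note: 4 },
]);

// ...and pastes it at the end of the document.
const afterPaste = checkClusterOrder([
  { id: "one", note: 1 },
  { id: "three", note: 4 },
  { id: "two", note: 5 },
]);
```

Each array would be passed to `driver.setClusterOrder()` in turn; the cluster contents themselves are untouched, so no `insertCluster` call is involved.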
Uncited items
Sometimes a user wishes to include references in the bibliography even though they are not mentioned in a citation anywhere in the document.
driver.includeUncited("None"); // Default
driver.includeUncited("All");
driver.includeUncited({ Specific: ["citekeyA", "citekeyB"] });
The "All" setting is based on which references your driver knows about. If you have this set to "All", simply calling driver.insertReference() with a new reference ID will result in an entry being added to the bibliography. Entries in Specific mode do not have to exist when they are provided here; they can be, for instance, the citekeys of a collection of references in a reference library which are subsequently provided in full to the driver, at which point they appear in the bibliography, but not items from elsewhere in the library.
3. Call driver.batchedUpdates() and apply the diff
This gets you a diff to apply to your document UI. It includes both clusters that have changed, and bibliography entries that have changed.
// Get the diff since last time batchedUpdates, fullRender or drain was called.
let diff = driver.batchedUpdates();
// apply cluster changes to the UI.
// ("myDocument" is an imaginary API.)
for (let changedCluster of diff.clusters) {
  let [id, html] = changedCluster;
  myDocument.updateCluster(id, html);
}
// Null? No change to the bibliography.
if (diff.bibliography != null) {
  let bib = diff.bibliography;
  // Save the entries that have actually changed
  for (let key of Object.keys(bib.updatedEntries)) {
    let rendered = bib.updatedEntries[key];
    myDocument.updateBibEntry(key, rendered);
  }
  // entryIds is the full list of entries in the bibliography.
  // If a citekey isn't in there, it should be removed.
  // It is non-null when it has changed.
  if (bib.entryIds != null) {
    myDocument.setBibliographyOrder(bib.entryIds);
  }
}
Note, for some intuition, if you call batchedUpdates() again immediately, the diff will be empty.
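To make the shape of the diff concrete, here is a small sketch that applies one to an in-memory document model. The model (`doc` with `clusters`, `bibEntries` and `bibOrder`) and the `applyDiff` helper are hypothetical, and the diff object is hand-written to match the fields used above rather than produced by a real driver.

```javascript
// Hypothetical in-memory document model and diff application.
function applyDiff(doc, diff) {
  for (const [id, html] of diff.clusters) {
    doc.clusters.set(id, html);
  }
  const bib = diff.bibliography;
  if (bib != null) {
    for (const [key, rendered] of Object.entries(bib.updatedEntries)) {
      doc.bibEntries.set(key, rendered);
    }
    if (bib.entryIds != null) {
      // entryIds is the full list, so anything missing from it is removed.
      doc.bibOrder = [...bib.entryIds];
      const keep = new Set(bib.entryIds);
      for (const key of [...doc.bibEntries.keys()]) {
        if (!keep.has(key)) doc.bibEntries.delete(key);
      }
    }
  }
  return doc;
}

const doc = {
  clusters: new Map(),
  bibEntries: new Map([["oldkey", "Old entry"]]),
  bibOrder: ["oldkey"],
};

// Hand-written diff for illustration only.
applyDiff(doc, {
  clusters: [["one", "<i>Title</i>, 56"]],
  bibliography: {
    updatedEntries: { citekey: "Author. <i>Title</i>." },
    entryIds: ["citekey"],
  },
});

// An empty diff, as returned by an immediate second call, changes nothing.
applyDiff(doc, { clusters: [], bibliography: null });
```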
Bibliographies
Beyond the interactive batchedUpdates method, there are two functions for producing a bibliography statically.
// returns BibliographyMeta, with information about how a library consumer should
// lay out the bibliography. There is a similar API in citeproc-js.
let meta = driver.bibliographyMeta();
// This is an array of BibEntry
let bibliography = driver.makeBibliography();
for (let entry of bibliography) {
  console.log(entry.id, entry.value);
}
Preview citation clusters
Sometimes, a user wants to see how a cluster will look while they are editing it, before confirming the change.
let cluster = { cites: [ { id: "citekey", locator: "45" }, { ... } ] };
let positions = [ ... before, { note: 34 }, ... after ];
let preview = driver.previewCluster(cluster, positions);
let plainPreview = driver.previewCluster(cluster, positions, "plain");
The cluster argument is just a cluster, without an id field, since it's ephemeral. The lack of id field is reflected in the positions argument as well.
The positions array is exactly like a call to setClusterOrder, except exactly one of the positions omits the id field. This could either:
- Replace an existing cluster's position, and preview a cluster replacement; or
- Represent the position a cluster is hypothetically inserted.
If you passed only one position, it would be like previewing an operation like "delete the entire document and replace it with this one cluster". That would mean you would never see "ibid" in a preview. So for maximum utility, assemble the positions array as you would a call to setClusterOrder with exactly the operation you're previewing applied.
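Building the positions array from the current cluster order can be sketched as follows. Both helpers are hypothetical and only manipulate plain arrays; the result is what you would pass as the second argument to `previewCluster`.

```javascript
// Hypothetical helper: preview replacing an existing cluster. The matching
// position keeps its note number but loses its id.
function positionsForReplace(order, id) {
  return order.map((position) => {
    if (position.id !== id) return position;
    const { id: _dropped, ...rest } = position;
    return rest;
  });
}

// Hypothetical helper: preview inserting a new cluster at `index`.
// Omit `note` for an in-text cluster.
function positionsForInsert(order, index, note) {
  const previewPosition = note === undefined ? {} : { note };
  return [...order.slice(0, index), previewPosition, ...order.slice(index)];
}

const currentOrder = [{ id: "one", note: 1 }, { id: "two", note: 4 }];
const replaced = positionsForReplace(currentOrder, "two");
const inserted = positionsForInsert(currentOrder, 1, 2);
```

In both results exactly one position lacks an `id`, and the surrounding clusters stay in place so that forms such as "ibid" can still appear in the preview.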
The format argument is optional, and works like the format passed to new Driver: one of "html", "rtf" or "plain". The driver will use that instead of its normal output format.
AuthorOnly, SuppressAuthor & Composite
@citeproc-rs/wasm supports these flags on clusters (all 3) and cites (except Composite), in a similar way to citeproc-js. See the citeproc-js documentation on Special Citation Forms for reference.
// only two modes for cites
let citeAO = { id: "jones2006", mode: "AuthorOnly" };
let citeSA = { id: "jones2006", mode: "SuppressAuthor" };
// additional options for clusters
let clusterAO = { id: "one", cites: [...], mode: "AuthorOnly" };
let clusterSA = { id: "one", cites: [...], mode: "SuppressAuthor" };
let clusterSA_First = { id: "one", cites: [...], mode: "SuppressAuthor", suppressFirst: 3 };
let clusterC = { id: "one", cites: [...], mode: "Composite" };
let clusterC_Infix = { id: "one", cites: [...], mode: "Composite", infix: ", whose book" };
let clusterC_Full = { id: "one", cites: [...], mode: "Composite", infix: ", whose books", suppressFirst: 0 };
It does support one extra option with SuppressAuthor and Composite on clusters: suppressFirst, which limits the effect to the first N name groups (or if cite grouping is disabled, first N names). Setting it to 0 means unlimited.
<intext> element with AuthorOnly etc.
citeproc-rs supports the <intext> element described in the citeproc-js docs linked above, but it is not enabled by default. It also supports <intext and="symbol"> or and="text", which will swap out the last intext layout delimiter (<layout delimiter="; ">) for either the ampersand or the and term.
If you want to use the <intext> element in CSL, you may either:
Option 1: Add a feature flag to the style wishing to use it
<style class="in-text">
  <features>
    <feature name="custom-intext" />
  </features>
  ...
</style>
AFAIK no other processors support this syntax yet.
Option 2: Enable the custom-intext feature for all styles via new Driver
let driver = new Driver({ ..., cslFeatures: ["custom-intext"] });
// ...
driver.free();
Non-Interactive use, or re-hydrating a previously created document
If you are working non-interactively, or re-hydrating a previously created document for interactive use, you may want to do one pass over all the clusters in the document, so that each cluster and bibliography entry reflects the correct value.
// Get the clusters from your document (example)
let allNotes = myDocument.footnotes.map(fn => {
  return { cluster: getCluster(fn), number: fn.number };
});
// Re-hydrate the entire document based on the reference library and your
// document's clusters
driver.resetReferences(myDocument.allReferences);
driver.initClusters(allNotes.map(fn => fn.cluster));
// The parentheses make the arrow function return an object literal.
driver.setClusterOrder(allNotes.map(fn => ({ id: fn.cluster.id, note: fn.number })));
// Render every cluster and bibliography item.
// It then drains the update queue, leaving the diff empty for the next edit.
// see the FullRender typescript type
let render = driver.fullRender();
// Write out the rendered clusters into the doc
for (let fn of allNotes) {
  fn.renderedHtml = render.allClusters[fn.cluster.id];
}
// Write out the bibliography entries as well
let allBibKeys = render.bibEntries.map(entry => entry.id);
for (let bibEntry of render.bibEntries) {
  myDocument.bibliographyMap[bibEntry.id] = bibEntry.value;
}
// Update your (example) UI
updateUserInterface(allNotes, myDocument, whatever);
parseStyleMetadata
Sometimes you want information about a CSL style without actually booting up a whole driver. One important use case is a dependent style, which can't be used with new Driver() because it doesn't have the ability to render citations on its own, and is essentially just a container for three pieces of information:
- A journal name
- An independent parent style
- A possible default-locale override
@citeproc-rs/wasm provides an API for finding out what's in a CSL style file.
let styleMeta = parseStyleMetadata("<style ...> ... </style>");
This function can still throw a CslStyleError, but this is less likely than with new Driver() as it will not actually attempt to parse and validate all the parts of a style. It will throw if the XML is malformed or if the <info> block is too invalid to salvage.
Here's how to use parseStyleMetadata to parse and use a dependent style.
let dependentStyle = "<style ...> ... </style>";
let meta = parseStyleMetadata(dependentStyle);
let isDependent = meta.info.parent != null;
let parentStyleId = isDependent && meta.info.parent.href;
let localeOverride = meta.defaultLocale;
// ...
let parentStyle = await downloadStyleWithId(parentStyleId);
let driver = new Driver({
  style: parentStyle,
  localeOverride,
  ...
});
await driver.fetchLocales();
// Here you might also want to know if the style can render a bibliography or not
let parentMeta = parseStyleMetadata(parentStyle);
if (parentMeta.independentMeta.hasBibliography) {
  let bib = driver.makeBibliography();
  // ...
}
// ...
driver.free();
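The branching between dependent and independent styles can be isolated in a small function. This is a sketch under stated assumptions: `planStyleLoad` is a hypothetical helper, it relies only on the `info.parent.href` and `defaultLocale` fields used in the example above, and the metadata objects are hand-written stand-ins with a made-up URL rather than real `parseStyleMetadata` output.

```javascript
// Hypothetical helper: decide which style XML the driver should receive,
// given metadata shaped like the object used in the example above.
function planStyleLoad(meta, styleXml) {
  const parent = meta.info.parent;
  if (parent != null) {
    // Dependent style: the parent must be downloaded and used instead.
    return { download: parent.href, style: null, localeOverride: meta.defaultLocale };
  }
  // Independent style: usable as-is.
  return { download: null, style: styleXml, localeOverride: meta.defaultLocale };
}

// Hand-written stand-ins for illustration only.
const dependentPlan = planStyleLoad(
  { info: { parent: { href: "https://example.org/styles/parent-style" } }, defaultLocale: "de-DE" },
  "<style ...> ... </style>",
);
const independentPlan = planStyleLoad(
  { info: { parent: null }, defaultLocale: undefined },
  "<style ...> ... </style>",
);
```

A caller would fetch `plan.download` when it is non-null and then pass the resulting XML, together with `localeOverride`, to `new Driver`.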
setOutputFormat and setStyle
If you wish to change the output format of the entire driver, you can use setOutputFormat(format, formatOptions). The format is a string, one of "html" | "rtf" | "plain", just like in new Driver. The second argument is optional and takes the same value as formatOptions in new Driver.
setStyle(xmlString) will change the CSL style used by the driver.
Both of these methods will require throwing out almost all cached computation, so use sparingly.
If you need to render a preview in a different format, there is an argument on previewCluster for doing just that. It does not throw out all the computation. citeproc-rs' disambiguation procedures do take formatting into account, so <i>Title</i> can be distinct from <b>Title</b> in HTML and RTF, but not if the whole driver's output format is "plain", since they both look identical in plain text. previewCluster will simply translate the formatting into another format, without re-computing all the disambiguation.