@jackdbd/eleventy-plugin-text-to-speech

Eleventy plugin that uses text-to-speech to generate audio assets for your website, then injects audio players in your HTML.

Installation
About
Docs
Preliminary Operations
Usage
Configuration
- Plugin options
- Rule
Troubleshooting
Dependencies
Credits
License

Installation

npm install @jackdbd/eleventy-plugin-text-to-speech

Note: this library was tested on Node.js >=18. It might work on older Node.js versions, but they are untested.

About

Eleventy plugin that uses text-to-speech to generate audio assets for your website, then injects audio players in your HTML.

To synthesize text into speech you can use:

Cloud Text-to-Speech API

To host the generated audio assets you can use:

Cloud Storage
Cloudflare R2
Filesystem (self host your audio assets)

⚠️ The Cloud Text-to-Speech API has a limit of 5000 characters per request.

See also:

this issue of the Wavenet for Chrome extension

this discussion on Google Groups
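
Because of this limit, longer texts have to be split before synthesis. The following is a minimal sketch of one way to do that, not this plugin's actual implementation: `splitIntoChunks` is a hypothetical helper that packs whole sentences into chunks of at most 5000 characters and falls back to hard cuts when a single sentence is too long. It measures length with `String.length` (UTF-16 code units), which is an assumption about how the limit is counted.

```javascript
// Hypothetical helper (not part of the plugin): split `text` into chunks
// of at most `maxLength` UTF-16 code units, breaking after sentences.
function splitIntoChunks(text, maxLength = 5000) {
  if (!Number.isInteger(maxLength) || maxLength < 1) {
    throw new RangeError('maxLength must be a positive integer')
  }

  // A "sentence" is any run of text up to and including ., ! or ? plus the
  // whitespace that follows it. The second alternative catches a trailing
  // run of text that has no closing punctuation.
  const sentences = text.match(/[^.!?]*[.!?]+\s*|[^.!?]+$/g) ?? []

  const chunks = []
  let current = ''
  for (const sentence of sentences) {
    // Keep adding sentences while the chunk stays within the limit.
    if (current.length + sentence.length <= maxLength) {
      current += sentence
      continue
    }
    if (current) chunks.push(current)

    // A single sentence longer than the limit is cut into hard slices.
    // Caveat: a hard slice can land in the middle of a word.
    let rest = sentence
    while (rest.length > maxLength) {
      chunks.push(rest.slice(0, maxLength))
      rest = rest.slice(maxLength)
    }
    current = rest
  }
  if (current) chunks.push(current)
  return chunks
}

// Each chunk could then be sent to the synthesis API as a separate request.
const chunks = splitIntoChunks('One. Two! Three? Four', 10)
console.log(chunks)
```

Concatenating the chunks in order gives back the original text, so no content is dropped by the split.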

Docs

Docs generated by TypeDoc

📖 API Docs

This project uses API Extractor and api-documenter markdown to generate a bunch of markdown files and a .d.ts rollup file containing all type definitions consolidated into a single file. I don't find this .d.ts rollup file particularly useful. On the other hand, the markdown files that api-documenter generates are quite handy when reviewing the public API of this project.

See Generating API docs if you want to know more.

Preliminary Operations

Enable the Text-to-Speech API

Before you can begin using the Text-to-Speech API, you must enable it. You can enable the API with the following command:

gcloud services enable texttospeech.googleapis.com

Set up authentication via a service account

This plugin uses the official Node.js client library for the Text-to-Speech API. In order to authenticate to any Google Cloud API you will need some kind of credentials. At the moment this plugin supports only authentication via a service account JSON key.

First, create a service account that can use the Text-to-Speech API, or reuse an existing one. If you plan to host the audio files on Cloud Storage, this service account also needs the IAM permissions to create and delete objects in the bucket. You can grant it the Storage Object Admin predefined IAM role.

gcloud iam service-accounts create sa-text-to-speech-user \
  --display-name "Text-to-Speech user SA"

Second, download the JSON key of this service account and store it somewhere safe. Do not track this file in git.
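
Before using the key, it can help to check that the file you downloaded really is a service account key. A minimal sketch, assuming you have already read the file's contents into a string: `parseServiceAccountKey` is a hypothetical helper, not part of this plugin or of the Google client library. The fields it checks (`type`, `project_id`, `private_key`, `client_email`) are ones found in service account JSON keys.

```javascript
// Hypothetical helper (not part of the plugin): parse the text of a
// service account JSON key and fail early if it looks wrong.
function parseServiceAccountKey(jsonText) {
  let key
  try {
    key = JSON.parse(jsonText)
  } catch (err) {
    throw new Error(`Service account key is not valid JSON: ${err.message}`)
  }

  if (key === null || typeof key !== 'object' || Array.isArray(key)) {
    throw new Error('Service account key must be a JSON object')
  }
  if (key.type !== 'service_account') {
    throw new Error('Expected "type" to be "service_account"')
  }

  // These fields must be non-empty strings.
  const required = ['project_id', 'private_key', 'client_email']
  const missing = required.filter(
    (field) => typeof key[field] !== 'string' || key[field] === ''
  )
  if (missing.length > 0) {
    throw new Error(`Service account key is missing: ${missing.join(', ')}`)
  }
  return key
}

// Example with a fake key. Never hardcode a real key in your source code.
const fakeKey = JSON.stringify({
  type: 'service_account',
  project_id: 'my-project',
  private_key: 'fake-private-key',
  client_email: 'sa-text-to-speech-user@my-project.iam.gserviceaccount.com'
})
console.log(parseServiceAccountKey(fakeKey).client_email)
```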

Optional: create a Cloud Storage bucket (only if you want to host audio files on Cloud Storage)

Create a Cloud Storage bucket in your desired location. Enable uniform bucket-level access and use the nearline storage class.

gsutil mb \
  -p $GCP_PROJECT_ID \
  -l $CLOUD_STORAGE_LOCATION \
  -c nearline \
  -b on \
  gs://bkt-eleventy-plugin-text-to-speech-audio-files

If you want, you can check that uniform bucket-level access is enabled using this command:

gsutil uniformbucketlevelaccess get \
  gs://bkt-eleventy-plugin-text-to-speech-audio-files

Make the bucket's objects publicly readable (otherwise people will not be able to listen to or download the audio files):

gsutil iam ch allUsers:objectViewer \
  gs://bkt-eleventy-plugin-text-to-speech-audio-files

Usage

Let's say that you are hosting your Eleventy website on Cloudflare Pages. Your current deployment is at the URL indicated by the environment variable CF_PAGES_URL.
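
CF_PAGES_URL is only set in Cloudflare Pages builds, so a local build needs a fallback. Here is a minimal sketch of turning an asset path into an absolute URL: `audioHref` and the `/assets/audio/post.mp3` path are hypothetical, and the fallback assumes the Eleventy dev server is running on its default port 8080.

```javascript
// Hypothetical helper (not part of the plugin): build the absolute URL of a
// self-hosted audio asset from the current deployment's base URL.
function audioHref(assetPath, env = process.env) {
  // Assumption: fall back to the local Eleventy dev server when the build
  // is not running on Cloudflare Pages.
  const base = env.CF_PAGES_URL ?? 'http://localhost:8080'
  return new URL(assetPath, base).href
}

console.log(audioHref('/assets/audio/post.mp3'))
```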

Self-hosting the generated audio assets

If you want to self-host the audio assets that this plugin generates and use all default options, you can register the plugin with this code:

import { textToSpeechPlugin } from '@jackdbd/eleventy-plugin-text-to-speech'

export default function (eleventyConfig) {
  // some eleventy configuration...
  
  eleventyConfig.addPlugin(textToSpeechPlugin, {
    // TODO: add config with process.env.CF_PAGES_URL here
  })

  // some more eleventy configuration...

}

Hosting the generated audio assets on Cloud Storage

If you want to host the audio assets on a Cloud Storage bucket and configure the rules that determine which texts are converted into speech, you could register the plugin using something like this:

import { textToSpeechPlugin } from '@jackdbd/eleventy-plugin-text-to-speech'

export default function (eleventyConfig) {
  // some eleventy configuration...
  
  eleventyConfig.addPlugin(textToSpeechPlugin, {
    // TODO: add config with Cloud Storage bucket here
  })

  // some more eleventy configuration...

}

Multiple hosts

If you want to host the generated audio assets on multiple hosts, register this plugin multiple times. Here are a few examples:

Self-host some audio assets, and host the others on a Cloud Storage bucket.
Host all audio assets on Cloud Storage, but split them across two different buckets.

Have a look at the Eleventy configuration of the demo-site in this monorepo.

Configuration

Plugin options

Key	Default	Description
`collectionName`	`undefined`	Name of the 11ty collection defined by this plugin
`rules`	`undefined`	Rules that determine which texts to convert into speech
`transformName`	`undefined`	Name of the 11ty transform defined by this plugin

Rule

Key	Default	Description
`audioInnerHTML`	`undefined`	Function that returns some HTML from the list of hrefs where the generated audio assets are hosted.
`cssSelectors`	`undefined`	CSS selectors to find matches in an HTML document
`hosting`	`undefined`	Client that provides hosting capabilities
`regex`	`undefined`	RegExp to find matches in the output path
`synthesis`	`undefined`	Client that provides Text-to-Speech capabilities
`xPathExpressions`	`undefined`	XPath expressions to find matches in an HTML document
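
To make the rule keys more concrete, here is a minimal sketch of an `audioInnerHTML` function together with a partial rule. It assumes the function receives an array of href strings and returns an HTML string. The helper names, the `regex` value, and the array form of `cssSelectors` are illustrative assumptions. The `synthesis` and `hosting` clients are left out because this document does not show how to construct them.

```javascript
// Assumed mapping from file extension to MIME type (illustrative only).
const MIME_TYPES = new Map([
  ['mp3', 'audio/mpeg'],
  ['ogg', 'audio/ogg'],
  ['opus', 'audio/ogg'],
  ['wav', 'audio/wav']
])

// Escape characters that would break a double-quoted HTML attribute.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
}

// Returns one <source> tag per href, followed by a fallback message.
function audioInnerHTML(hrefs) {
  const sources = hrefs.map((href) => {
    // Read the extension from the path, ignoring any query string or hash.
    const path = href.split(/[?#]/)[0]
    const ext = path.slice(path.lastIndexOf('.') + 1).toLowerCase()
    const type = MIME_TYPES.get(ext)
    const typeAttr = type ? ` type="${type}"` : ''
    return `<source src="${escapeAttribute(href)}"${typeAttr}>`
  })
  const fallback = '<p>Your browser does not support the audio element.</p>'
  return [...sources, fallback].join('\n')
}

// Partial rule (assumption): `synthesis` and `hosting` are omitted on purpose.
const partialRule = {
  regex: /\/posts\/.*\.html$/, // hypothetical output path pattern
  cssSelectors: ['article p'], // hypothetical selector list
  audioInnerHTML
}

console.log(partialRule.audioInnerHTML(['https://example.com/audio/post.mp3']))
```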

Troubleshooting

This plugin uses the debug library for logging. You can control what's logged using the DEBUG environment variable.

For example, if you set your environment variables in a .envrc file, you can do:

# print all logging statements
export DEBUG=11ty-plugin:*

Dependencies

Package	Version
@jackdbd/zod-schemas	`^2.2.0`
html-to-text	`^9.0.5`
jsdom	`^24.0.0`
specificity	`^1.0.0`
zod	`^3.23.0`
zod-validation-error	`^3.1.0`

⚠️ Peer Dependencies

This package defines 6 peer dependencies.

Peer	Version range
`@11ty/eleventy`	`>=2.0.0 or 3.0.0-alpha.6`
`@aws-sdk/client-s3`	`>=3.0.0`
`@aws-sdk/lib-storage`	`>=3.0.0`
`@google-cloud/storage`	`>=7.0.0`
`@google-cloud/text-to-speech`	`>=5.0.0`
`debug`	`>=4.0.0`

Credits

I got the idea for this plugin while reading the code of eleventy-plugin-text-to-speech, a plugin of the same name by Larry Hudson. Larry's plugin uses the Microsoft Azure Speech SDK.