A TypeScript library for generating speech using Microsoft Edge's text-to-speech API
Generate speech from text using Microsoft Edge's text-to-speech service. This library provides access to Edge's TTS capabilities with subtitle generation support and voice customization options.
npm install @echristian/edge-tts
import { synthesize, synthesizeStream } from "@echristian/edge-tts";
// Basic usage with synthesize()
const { audio, subtitle } = await synthesize({
text: "Hello, world!",
});
// Stream processing usage
const generator = synthesizeStream({ text: "Hello world" });
for await (const chunk of generator) {
// chunk is a Uint8Array of raw audio data
// Process or save each chunk as needed
}
// Collecting all streamed chunks
const chunks: Uint8Array[] = [];
for await (const chunk of synthesizeStream({ text: "Hello world" })) {
chunks.push(chunk);
}
Main function to generate speech from text.
Creates an async generator that yields chunks of processed audio data. Each chunk has metadata headers automatically removed.
Uses the same options as synthesize()
, but without subtitle support:
Option | Type | Default | Description |
---|---|---|---|
text | string | (required) | Text to convert to speech |
voice | string | "en-US-AvaNeural" | Voice ID to use |
language | string | "en-US" | Language code |
outputFormat | string | "audio-24khz-96kbitrate-mono-mp3" | Audio format |
rate | string | "default" | Speaking rate |
pitch | string | "default" | Voice pitch |
volume | string | "default" | Audio volume |
For detailed configuration options, refer to Microsoft's documentation:
Note: Some options may be limited by Microsoft Edge's service capabilities.
Option | Type | Default | Description |
---|---|---|---|
text | string | (required) | Text to convert to speech |
voice | string | "en-US-AvaNeural" | Voice ID to use |
language | string | "en-US" | Language code |
outputFormat | string | "audio-24khz-96kbitrate-mono-mp3" | Audio format |
rate | string | "default" | Speaking rate |
pitch | string | "default" | Voice pitch |
volume | string | "default" | Audio volume |
subtitle | SubtitleOptions | { splitBy: "word", wordsPerCue: 10 } | Subtitle options |
Option | Type | Default | Description |
---|---|---|---|
splitBy | "word" | "duration" | "word" | How to split subtitles |
wordsPerCue | number | 10 | Words per subtitle when using 'word' |
durationPerCue | number | 5000 | Duration (ms) when using 'duration' |
Property | Type | Description |
---|---|---|
audio | Blob | Generated audio data |
subtitle | Array | Generated subtitles |
Property | Type | Description |
---|---|---|
text | string | Subtitle text |
start | number | Start time (ms) |
end | number | End time (ms) |
duration | number | Duration (ms) |