openai-whisper-js

openai-whisper-js is a Node.js wrapper for the OpenAI Whisper library. It provides a JavaScript interface for transcribing audio files with Whisper models, so you do not have to invoke Whisper yourself.

Features

Supports the Whisper models tiny, base, small, medium, large, turbo, and large-v3.
Easy-to-use API for audio transcription.
Handles Python virtual environment initialization.
Debug mode for detailed logs during transcription.

Installation

First, install the package:

npm install openai-whisper-js

Prerequisites

Python 3 installed on your system.
Bash shell available at /bin/bash.
ffmpeg installed for audio processing, as required by Whisper.
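
The prerequisites above can be checked from Node.js before the first transcription. The following is a minimal sketch, not part of the package: the helper name isCommandAvailable is hypothetical, and it assumes python3 and ffmpeg are expected on your PATH. It only tests whether each executable can be started, not whether its version is suitable.

```typescript
import { spawnSync } from 'node:child_process';

// Hypothetical helper: returns true when the command could be started at all.
// spawnSync sets `error` (for example ENOENT) when the executable is not found.
function isCommandAvailable(command: string, args: string[]): boolean {
  const result = spawnSync(command, args, { stdio: 'ignore' });
  return result.error === undefined;
}

// Version flags per tool; note that ffmpeg uses a single dash.
const prerequisites: Array<[string, string[]]> = [
  ['python3', ['--version']],
  ['/bin/bash', ['--version']],
  ['ffmpeg', ['-version']],
];

for (const [command, args] of prerequisites) {
  const status = isCommandAvailable(command, args) ? 'found' : 'missing';
  console.log(`${command}: ${status}`);
}
```

Any tool reported as missing should be installed before calling the package.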

Getting Started

Usage Example

import path from 'path';
import { whisper } from 'openai-whisper-js';

async function transcribeAudio() {
  try {
    const result = await whisper.transcribe({
      modelName: 'tiny',
      // __dirname is defined in CommonJS output; in native ES modules, derive it from import.meta.url.
      audio: path.join(__dirname, './audio/test.mp3'),
      // debug: true, // Uncomment for detailed logs
    });

    console.log('Transcription Result:', result);
  } catch (error) {
    console.error('Error during transcription:', error);
  }
}

transcribeAudio();

Output

On success, the transcription text is printed to the console.
On failure, the error is caught and logged with console.error.

API Reference

`whisper.transcribe(options: ITranscribeOptions): Promise<string>`

Transcribes an audio file using the specified Whisper model.

Parameters

modelName (required): Whisper model to use. Supported values: 'tiny' | 'base' | 'small' | 'medium' | 'large' | 'turbo' | 'large-v3'.
audio (required): Path to the audio file for transcription.
debug (optional): Boolean flag for enabling debug mode. Default: false.
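
To catch mistakes such as a misspelled model name before Whisper starts, the options can be validated up front. This is a sketch under assumptions: the interface TranscribeOptionsSketch and the function validateOptions are hypothetical names that mirror the parameters listed above, and the package's actual ITranscribeOptions type may differ.

```typescript
// Model names as listed in this README; the package's own type may differ.
const SUPPORTED_MODELS = [
  'tiny', 'base', 'small', 'medium', 'large', 'turbo', 'large-v3',
] as const;
type ModelName = (typeof SUPPORTED_MODELS)[number];

// Hypothetical shape mirroring the documented parameters.
interface TranscribeOptionsSketch {
  modelName: ModelName;
  audio: string;
  debug?: boolean;
}

// Returns a list of human-readable problems; an empty list means the input looks valid.
function validateOptions(
  input: Partial<Record<keyof TranscribeOptionsSketch, unknown>>,
): string[] {
  const problems: string[] = [];
  const models: readonly string[] = SUPPORTED_MODELS;

  if (typeof input.modelName !== 'string' || models.indexOf(input.modelName) === -1) {
    problems.push(`modelName must be one of: ${models.join(', ')}`);
  }
  if (typeof input.audio !== 'string' || input.audio.length === 0) {
    problems.push('audio must be a non-empty file path');
  }
  if (input.debug !== undefined && typeof input.debug !== 'boolean') {
    problems.push('debug must be a boolean when provided');
  }
  return problems;
}

// Example: an unknown model name produces one problem entry.
console.log(validateOptions({ modelName: 'huge', audio: './audio/test.mp3' }));
```

The check covers only the shape of the options. It does not verify that the audio file exists or that the model has been downloaded.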

Returns

A Promise<string> resolving to the transcription text.

Advanced Options

To get detailed logs while transcribing, for example with the base model, enable debug mode:

await whisper.transcribe({
  modelName: 'base',
  audio: path.join(__dirname, './audio/test.mp3'),
  debug: true,
});
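
Because each call resolves to the text of a single file, several files can be processed one after another with a small loop. The sketch below is an illustration rather than a feature of the package: transcribeAll, BatchOptions, and BatchResult are hypothetical names, and the transcribe function is passed in as a parameter so the helper stays independent of the package. Files are handled sequentially on the assumption that running several Whisper processes at once would be too heavy for most machines.

```typescript
// Hypothetical option shape mirroring the documented parameters.
interface BatchOptions {
  modelName: string;
  audio: string;
  debug?: boolean;
}

// Any function with the documented signature can be passed in.
type TranscribeFn = (options: BatchOptions) => Promise<string>;

interface BatchResult {
  audio: string;
  text?: string;   // set when transcription succeeded
  error?: string;  // set when transcription failed
}

// Transcribes files one at a time and records failures instead of aborting the batch.
async function transcribeAll(
  transcribe: TranscribeFn,
  modelName: string,
  files: string[],
): Promise<BatchResult[]> {
  const results: BatchResult[] = [];
  for (const audio of files) {
    try {
      const text = await transcribe({ modelName, audio });
      results.push({ audio, text });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      results.push({ audio, error: message });
    }
  }
  return results;
}

// Usage with the package (not executed here):
//   const results = await transcribeAll((o) => whisper.transcribe(o), 'tiny', ['a.mp3', 'b.mp3']);
```

Recording the error message per file lets a long batch finish even when one input is unreadable.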

Contributing

Contributions are welcome! Please open an issue or submit a pull request to improve the package.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Support

For issues or feature requests, visit the GitHub repository.

Acknowledgments

OpenAI Whisper for the transcription capabilities.
Inspiration from audio transcription libraries.

Developed by Ahmed Adel.