The @sourcesync-sdk/whisper-web package enables audio fingerprinting and matching in web applications. It lets developers capture audio in real time, generate audio fingerprints, and match them against a database.
To install the package, run the following command in your project directory:
npm install sourcesync-sdk
Or install a submodule as its own dependency:
npm install @sourcesync-sdk/whisper-web @sourcesync-sdk/app
Here's a quick guide to get started with the Whisper Web SDK:
- Import and initialize the SDK:
import { createWhisperFactory } from 'sourcesync-sdk/whisper-web';
import { initializeApp } from 'sourcesync-sdk/app';
const app = await initializeApp({
  appKey: 'your-app-key',
  // app config
});
const whisperFactory = await createWhisperFactory(app, {
  // set default options for Whisper instances
  defaultOpts: {
    apiUrl: 'YOUR_API_URL',
    apiToken: 'YOUR_API_TOKEN',
    apiKey: 'YOUR_API_KEY' // optional
  }
});
- Create a Whisper instance:
const whisper = whisperFactory.create();
await whisper.init();
- Set up fingerprinting:
await whisper.setCallback((fingerprint) => {
  // Handle the generated fingerprint
  console.log('Fingerprint generated:', fingerprint);
  // Request a match
  whisper.requestMatch(fingerprint)
    .then((matchResult) => {
      console.log('Match result:', matchResult);
    })
    .catch((err) => {
      // Handle errors
      console.error('Match request failed:', err);
    });
});
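The callback above chains `.then`/`.catch` inline. As a sketch only, the same flow can be factored into a small helper written with `async`/`await`. `makeFingerprintHandler`, `MatchRequester`, `onMatch`, and `onError` are hypothetical names, not part of the SDK; the only SDK behavior assumed is that `requestMatch(fingerprint)` returns a promise, as shown above.

```typescript
// Minimal structural type: only the one method this sketch relies on.
// The fingerprint and result types stay generic because the sketch does not
// depend on their shape.
interface MatchRequester<F, R> {
  requestMatch(fingerprint: F): Promise<R>;
}

// Hypothetical helper: builds a function that can be passed as the
// fingerprint callback.
function makeFingerprintHandler<F, R>(
  whisper: MatchRequester<F, R>,
  onMatch: (result: R) => void,
  onError: (err: unknown) => void,
): (fingerprint: F) => Promise<void> {
  return async (fingerprint: F): Promise<void> => {
    try {
      const result = await whisper.requestMatch(fingerprint);
      onMatch(result);
    } catch (err) {
      // Errors from requestMatch (or thrown by onMatch) are routed to onError
      // instead of escaping as an unhandled rejection.
      onError(err);
    }
  };
}
```

Assuming `setCallback` accepts an async function, this could be wired up as `await whisper.setCallback(makeFingerprintHandler(whisper, console.log, console.error));`.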
The MatchResponse object contains the following properties:
interface MatchResponse {
  // device id from the whisper platform, if registered
  wdeviceid: number
  // session id from the whisper platform, if registered
  wsessionid: number
  // pass-through JSON string from the match request, returned unchanged in the
  // match response; use it for timestamps, custom device ids, or anything else
  // you need echoed back
  requestJson: string
  // the matches, either grouped by type or as a flat array
  matches: MatchGroups | MatchItem[]
}
type MatchGroups = Record<string, MatchItem[]>
interface MatchItem {
  // reference content id as registered on the whisper platform
  wrefid: number
  // confidence score, 0 - 100, higher is better
  confidence: number
  // time of the match within the unknown (incoming) audio, based on the fingerprint timestamp
  unknowntime: string
  // the time within the reference content that matched
  referencetime: string
  // JSON data that was registered with the content on the whisper platform;
  // this could be channel ids, custom content ids, title, description, etc.
  refJson: string
  title?: string
}
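Because `matches` can arrive either grouped or as a flat array, consuming code usually has to normalize it first. The following is a minimal sketch of one way to do that and pick the highest-confidence item. `flattenMatches`, `bestMatch`, `parseRefJson`, and the `minConfidence` parameter are hypothetical helpers, not SDK functions; the types are trimmed copies of the interfaces above.

```typescript
// Trimmed copies of the shapes documented above (only the fields used here).
interface MatchItem {
  wrefid: number;
  confidence: number; // 0 - 100, higher is better
  refJson: string;
}
type MatchGroups = Record<string, MatchItem[]>;

// Normalizes `matches` to a flat array, whichever of the two shapes arrives.
function flattenMatches(matches: MatchGroups | MatchItem[]): MatchItem[] {
  if (Array.isArray(matches)) {
    return matches;
  }
  const flat: MatchItem[] = [];
  for (const group of Object.keys(matches)) {
    flat.push(...(matches[group] ?? []));
  }
  return flat;
}

// Returns the item with the highest confidence at or above minConfidence,
// or undefined when nothing qualifies.
function bestMatch(
  matches: MatchGroups | MatchItem[],
  minConfidence = 0,
): MatchItem | undefined {
  let best: MatchItem | undefined;
  for (const item of flattenMatches(matches)) {
    if (item.confidence < minConfidence) continue;
    if (best === undefined || item.confidence > best.confidence) best = item;
  }
  return best;
}

// refJson is a JSON string; parse defensively since its content is user-defined.
function parseRefJson(item: MatchItem): unknown {
  try {
    return JSON.parse(item.refJson);
  } catch {
    return undefined;
  }
}
```

Any confidence threshold is an application-level choice; the source does not recommend a specific value.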
- Set up audio capture:
// attach the microphone
await whisper.attachMicrophone();
// or attach the video element
await whisper.attachVideoElement(videoElement);
// or attach the audio element
await whisper.attachAudioElement(audioElement);
- Start and stop audio capture:
// Start capturing
await whisper.start();
// Stop capturing
await whisper.stop();
For media resources to be accessible, the server must include the following header in its response:
Access-Control-Allow-Origin: *
Note: Using a wildcard (*) allows access from all origins. For production environments, consider specifying allowed origins explicitly for enhanced security.
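One common way to avoid the wildcard is to echo the request's `Origin` back only when it is on an allow-list. The sketch below shows that logic as a pure function so it can be dropped into any server framework. `ALLOWED_ORIGINS`, `corsHeadersFor`, and the example origins are hypothetical and not part of the SDK.

```typescript
// Hypothetical allow-list; replace with the origins that host your pages.
const ALLOWED_ORIGINS: readonly string[] = [
  "https://app.example.com",
  "https://staging.example.com",
];

// Returns the CORS headers to attach to a media response for the given
// request Origin header. Unknown or absent origins get no CORS headers.
function corsHeadersFor(
  origin: string | undefined,
  allowed: readonly string[] = ALLOWED_ORIGINS,
): Record<string, string> {
  if (origin === undefined || allowed.indexOf(origin) === -1) {
    return {};
  }
  return {
    "Access-Control-Allow-Origin": origin,
    // Tell caches that the response varies with the requesting origin.
    "Vary": "Origin",
  };
}
```

The returned object would be merged into the response headers for each media request.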
When using the WASM build, specific headers are required to enable SharedArrayBuffers, which are necessary for the WASM runtime to support pthreads via WebWorkers.
The service must return these headers in the page response:
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp
The service hosting the WASM code must return these headers in its responses to OPTIONS requests:
Access-Control-Allow-Origin: *
Cross-Origin-Resource-Policy: cross-origin
- These requirements may not cover all scenarios, especially when hosting components on different domains.
- Enabling these headers may break existing website functionality.
- Without loading the WASM module, fingerprint generation in the browser is not possible.
- Cross-Origin-Opener-Policy: 'same-origin'
  - Isolates the browsing context to same-origin documents
  - Enhances security by preventing cross-origin popup manipulation
  - Reduces the risk of side-channel attacks
- Cross-Origin-Embedder-Policy: 'require-corp'
  - Ensures cross-origin resources must explicitly grant permission to be loaded
  - Enables the use of SharedArrayBuffer, crucial for multi-threaded WebAssembly
  - Prevents unauthorized data access from other origins
These headers enable "cross-origin isolation", which is important for WebAssembly because it:
- Allows use of high-precision timers and SharedArrayBuffer, improving performance
- Creates a more isolated environment for WebAssembly code execution
- Enables advanced WebAssembly features like threads in most browsers
These headers also have potential drawbacks:
- May break integrations with some third-party services relying on cross-origin access
- Can complicate embedding your content in other sites or vice versa
For full WebAssembly functionality, ensure all resources (including .wasm files) are served from the same origin or have appropriate CORS headers.
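When debugging a deployment, it can help to check that the page response carries the two isolation headers with the values listed above. The following is a rough sketch of such a check; `missingIsolationHeaders` and `isCrossOriginIsolated` are hypothetical helpers, not SDK functions. In a browser, the standard `crossOriginIsolated` global reports whether isolation is actually in effect.

```typescript
// Header values required for cross-origin isolation, as listed in this section.
const REQUIRED_ISOLATION_HEADERS: Record<string, string> = {
  "cross-origin-opener-policy": "same-origin",
  "cross-origin-embedder-policy": "require-corp",
};

// Returns the names of required headers that are absent or carry another value.
// Header names are compared case-insensitively.
function missingIsolationHeaders(headers: Record<string, string>): string[] {
  const normalized: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    normalized[name.toLowerCase()] = (headers[name] ?? "").trim().toLowerCase();
  }
  const missing: string[] = [];
  for (const name of Object.keys(REQUIRED_ISOLATION_HEADERS)) {
    if (normalized[name] !== REQUIRED_ISOLATION_HEADERS[name]) {
      missing.push(name);
    }
  }
  return missing;
}

// Runtime check: pass `window` (or `self` in a worker) as the scope.
function isCrossOriginIsolated(scope: { crossOriginIsolated?: boolean }): boolean {
  return scope.crossOriginIsolated === true;
}
```

A page could call `isCrossOriginIsolated(window)` before loading the WASM module and surface a clear error when it returns `false`.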
- create(): Creates a new Whisper instance
- init(): Initializes the Whisper instance
- attachMicrophone(): Attaches the microphone for audio capture
- attachVideoElement(videoElement): Attaches a video element for audio capture
- attachAudioElement(audioElement): Attaches an audio element for audio capture
- setCallback(callback): Sets the callback for fingerprint generation
- start(): Starts audio capture and fingerprinting
- stop(): Stops audio capture
- requestMatch(fingerprint): Requests a match for the given fingerprint
- Ensure proper cleanup by calling stop() when audio capture is no longer needed
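One way to make sure `stop()` always runs, even when the code in between throws, is a `try`/`finally` wrapper. The sketch below assumes only that `start()` and `stop()` return promises, as in the quick-start snippets; `withCapture` and `Capturable` are hypothetical names, not part of the SDK.

```typescript
// Minimal structural type covering the two methods this sketch uses.
interface Capturable {
  start(): Promise<void>;
  stop(): Promise<void>;
}

// Starts capture, runs `work`, and always stops capture afterwards,
// whether `work` resolves or throws.
async function withCapture<T>(
  whisper: Capturable,
  work: () => Promise<T>,
): Promise<T> {
  await whisper.start();
  try {
    return await work();
  } finally {
    await whisper.stop();
  }
}
```

For example, `await withCapture(whisper, () => new Promise<void>((resolve) => setTimeout(resolve, 10000)));` would capture for roughly ten seconds and then stop.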
This SDK is distributed under the Apache License, Version 2.0. The Apache 2.0 License applies to the SDK only and not any other component of the SourceSync Platform.