Official JavaScript SDK for Sign-Speak. Unlock seamless integration of American Sign Language (ASL) sign recognition, avatars, and speech recognition in your app. Experience world-class AI technology that’s accurate, robust, and easy to implement—everything you need to elevate your app in one powerful SDK.
Explore our complete API documentation and learn about implementation and benchmarks in our Standards and Efficiency research papers.
For a preview of what you can do with the SDK, this is our avatar (I know, it looks human, right?):
And this is our Sign Language Recognition:
- Sign Language-Powered AI Chatbots (e.g., ASL GPT)
- Automatic Captioning for Deaf Content Creators
- …and much more! Think TV sign commands, for example.
Start building with the Sign-Speak SDK today and unlock the true power of a visual language. After all, why settle for basic gesture recognition when you have a rich, expressive language at your fingertips? Create immersive experiences that transcend accessibility and redefine interaction.
The Sign-Speak SDK offers four main capabilities:
- ASL Recognition: Transforms ASL video streams into accurate English transcripts.
- ASL Production: Generates realistic ASL avatar videos from English text.
- English Speech Recognition (Speech-to-Text): Efficiently converts spoken English audio into text.
- English Speech Generation (Text-to-Speech): Creates natural-sounding English speech audio files (MP3) from text.
npm install @sign-speak/react-sdk
# -or-
yarn add @sign-speak/react-sdk
Learn how to obtain your API keys at the Sign-Speak developer portal.
import { setKey } from '@sign-speak/react-sdk';

// Set the API key programmatically
setKey('YOUR_API_KEY');
or via environment variables:
SIGN_SPEAK_API_KEY="YOUR_API_KEY"
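For example, in a server or bundler environment that exposes process.env, you might wire the environment variable to setKey at startup (a minimal sketch; the variable name follows the example above):

import { setKey } from '@sign-speak/react-sdk';

// Read the key from the environment and register it with the SDK
const apiKey = process.env.SIGN_SPEAK_API_KEY;
if (!apiKey) {
  throw new Error('SIGN_SPEAK_API_KEY is not set');
}
setKey(apiKey);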
Sign-Speak delivers advanced ASL recognition and speech processing tools for seamless integration. For optimal accuracy and performance in your application, follow these best practices and be mindful of the current limitations.
- Use conversational ASL: Our models achieve the best results with natural, conversational signing at a normal pace. Avoid instructional or artificially slow signing, as this typically decreases recognition accuracy.
- Proper video positioning: Use well-lit environments and landscape video that clearly captures the signer, keeping them centered with their full upper body in frame. This positioning ensures accurate predictions. You can see an example of good video positioning in the API documentation.
- Start small, then scale: Test your integration with short, simple statements first. Once those work reliably, progress to longer and more complex interactions to confirm continued accuracy and responsiveness.
- Personal Sign Names: Recognition accuracy for personalized sign names is currently limited; standardized, widely accepted signs give the best results. With proper consent, however, we can train a custom model for individual users to recognize unique elements like their sign name.
- Video and Audio Quality: Background noise, low-quality audio, poor lighting, and unclear framing significantly reduce accuracy. For best results, provide clear audio and stable, well-lit video streams.
- API Usage & Quotas: Your account includes default rate limits (requests per minute) and monthly quotas (processed minutes). Check usage regularly on your developer dashboard, or contact management@sign-speak.com if you need higher limits.
- Real-time latency and streaming: While our WebSocket protocol is designed for low latency, network conditions or browser limitations can affect smooth real-time interaction. We recommend thorough testing across your supported browsers and networks.
- Limited Feature Availability: Certain advanced functionalities (personalized avatars, specific regional variants, custom fine-tuning) are available under limited or partnership arrangements. Please reach out to discuss specific use cases or special requirements.
For detailed performance standards and efficiency insights, see our Standards and Efficiency research papers.
Following these recommendations and understanding our current limitations will help you maximize the effectiveness and accuracy of the Sign-Speak SDK in your applications. For questions or support, contact our team at management@sign-speak.com.
Quickly integrate with React:
import { SignRecognition } from '@sign-speak/react-sdk';

const MyASLRecognition = () => (
  <div>
    <h2>ASL Recognition</h2>
    <SignRecognition />
  </div>
);
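If you also need the raw predictions, SignRecognition accepts an onResult callback (used the same way in the full demo at the end of this README); a sketch:

import { SignRecognition, RecognitionResult } from '@sign-speak/react-sdk';

const MyASLRecognitionWithResults = () => (
  <SignRecognition
    onResult={(result: RecognitionResult | null) => {
      // Confidences are log-probabilities, so Math.log(0.5) keeps predictions above 50%
      const text = result?.prediction
        .filter(x => x.confidence > Math.log(0.5))
        .map(x => x.prediction)
        .join(' ');
      console.log(text);
    }}
  />
);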
import { SignProduction } from '@sign-speak/react-sdk';

const MyASLProduction = () => (
  <div>
    <h2>Generate ASL</h2>
    <SignProduction text="Hello, how are you?" model="MALE" />
  </div>
);
import { SpeechRecognition } from '@sign-speak/react-sdk';

const MySpeechRecognition = () => (
  <div>
    <h2>English Speech Recognition</h2>
    <SpeechRecognition />
  </div>
);
import { SpeechProduction } from '@sign-speak/react-sdk';

const MyTextToSpeech = () => (
  <div>
    <h2>Text to Speech</h2>
    <SpeechProduction text="Hello from Sign-Speak." model="FEMALE" />
  </div>
);
Simplify state management and API calls via React hooks:
import React from 'react';
import { useSignLanguageRecognition } from '@sign-speak/react-sdk';

function ASLHookExample() {
  const { prediction, startRecognition, stopRecognition, recording, loading } = useSignLanguageRecognition();

  // Confidences are log-probabilities, so compare against Math.log(0.5)
  const interpretation = prediction?.prediction
    .filter(x => x.confidence > Math.log(0.5))
    .map(x => x.prediction)
    .join(" ");

  return (
    <div>
      {loading ? <p>loading...</p> : (
        recording
          ? <button onClick={stopRecognition}>Stop Watching</button>
          : <button onClick={startRecognition}>Start Watching</button>
      )}
      <p>{interpretation}</p>
    </div>
  );
}
import React from 'react';
import { useSpeechRecognition } from '@sign-speak/react-sdk';

function SpeechHookExample() {
  const { prediction, startRecognition, stopRecognition, recording, loading } = useSpeechRecognition();

  // Same log-probability filtering as the ASL hook
  const interpretation = prediction?.prediction
    .filter(x => x.confidence > Math.log(0.5))
    .map(x => x.prediction)
    .join(" ");

  return (
    <div>
      {loading ? <p>loading...</p> : (
        recording
          ? <button onClick={stopRecognition}>Stop Listening</button>
          : <button onClick={startRecognition}>Start Listening</button>
      )}
      <p>{interpretation}</p>
    </div>
  );
}
Note: analogous hooks exist for ASL production (useSignProduction) and text-to-speech (useSpeechProduction).
import { recognizeSign } from '@sign-speak/react-sdk/network/rest';
import { getKey } from '@sign-speak/react-sdk/network/key';

const videoBase64 = '...'; // Your base64-encoded video

const result = await recognizeSign(videoBase64, {
  request_class: 'BLOCKING',       // wait for the prediction in this request
  model: 'SLR.2.sm',
  hint: 'Name is Nikolas.',        // optional context for the recognizer
  single_recognition_mode: true,
  apiKey: getKey()
});

console.log(result.prediction);
import { produceSign } from '@sign-speak/react-sdk/network/rest';

const response = await produceSign({
  english: "Welcome to Sign-Speak!",
}, {
  request_class: "BATCH",   // returns immediately; query for the result later
  model: "MALE"
});

console.log("Batch ID:", response.batch_id);
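The returned batch_id is what you use to fetch the finished video once processing completes. The exact query call depends on your SDK version; the sketch below uses a hypothetical pollBatch helper standing in for it, so check the API documentation for the real call:

// Hypothetical helper for illustration only; see the API docs for the actual
// batch-query call exposed by the SDK.
declare function pollBatch(batchId: string): Promise<{ done: boolean; video?: string }>;

async function waitForVideo(batchId: string) {
  // Poll every few seconds until the batch job finishes
  let status = await pollBatch(batchId);
  while (!status.done) {
    await new Promise(resolve => setTimeout(resolve, 3000));
    status = await pollBatch(batchId);
  }
  return status.video;
}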
import { SignSpeakWebSocket } from '@sign-speak/react-sdk/network/websockets';

async function startRealtimeRecognition() {
  const wsConfig = {
    sliceLength: 500,             // send 500 ms video chunks
    singleRecognitionMode: false
  };

  const onASLPrediction = (pred: any) => {
    console.log('Realtime ASL:', pred);
  };

  const socketClient = new SignSpeakWebSocket(wsConfig, onASLPrediction);
  await socketClient.connect();
  await socketClient.streamLiveVideo();

  // Record for 5 seconds, then stop
  setTimeout(() => {
    // Passing true as the terminate signal closes the socket immediately
    socketClient.stopStreaming(false);
  }, 5000);
}
Sign-Speak provides flexible request classes:
- BLOCKING (RESTful): Immediate response; falls back to BATCH if processing takes longer than ~30 seconds (both RESTful modes are sketched below)
- BATCH (RESTful): Start-and-query later; perfect for large or time-intensive tasks
- WebSocket (streaming): Optimized for real-time, latency-sensitive user experiences
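As a sketch, the same recognizeSign call shown earlier can run in either RESTful mode simply by switching request_class:

import { recognizeSign } from '@sign-speak/react-sdk/network/rest';

const videoBase64 = '...'; // Your base64-encoded video

// Other options (model, apiKey, etc.) omitted; see the recognition example above.

// BLOCKING: waits for the prediction (falls back to BATCH past ~30 seconds)
const blocking = await recognizeSign(videoBase64, { request_class: 'BLOCKING' });

// BATCH: returns immediately with a batch ID to query later
const batch = await recognizeSign(videoBase64, { request_class: 'BATCH' });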
Pass configuration when initializing hooks, components, or WebSockets:
const config = {
  sliceLength: 500,              // video/audio chunk length in milliseconds
  singleRecognitionMode: true,
  deviceId: 'camera-device-id'
};
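This is the same shape the WebSocket client from the earlier example takes as its first argument; hooks and components are assumed to accept a similar object:

import { SignSpeakWebSocket } from '@sign-speak/react-sdk/network/websockets';

// Reuse the config when constructing the streaming client
const client = new SignSpeakWebSocket(config, pred => console.log(pred));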
- PRIVATE keys: Manage API keys, account settings (server-side only)
- PUBLIC keys: Used client-side for inference only
- ROOT key: A required administrative key that cannot be removed
Manage keys at client.sign-speak.com. Do not expose private keys publicly.
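In practice this means the browser bundle only ever embeds a PUBLIC key, while PRIVATE keys stay server-side (a minimal sketch; the private-key variable name is hypothetical):

// Client-side (browser bundle): safe to ship a PUBLIC key
import { setKey } from '@sign-speak/react-sdk';
setKey('YOUR_PUBLIC_API_KEY');

// Server-side: load PRIVATE keys from the environment; never commit or ship them
const privateKey = process.env.SIGN_SPEAK_PRIVATE_KEY; // hypothetical variable name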
You can help train and improve our models using the SDK feedback system:
import { submitFeedback } from '@sign-speak/react-sdk/network/rest';

await submitFeedback({
  feedback_id: "feedback_id_from_response",  // ID from the original recognition response
  rating: "GOOD",
  correction: ""                             // optional corrected text
});
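A sketch of the full round trip, assuming the recognition response carries the feedback_id referenced above (check the response shape in the API docs):

import { recognizeSign, submitFeedback } from '@sign-speak/react-sdk/network/rest';

const videoBase64 = '...'; // Your base64-encoded video

// Options trimmed; see the recognition example above
const result = await recognizeSign(videoBase64, { request_class: 'BLOCKING' });

// Rate the prediction; include a correction when it was wrong
await submitFeedback({
  feedback_id: result.feedback_id, // assumed field name; see the API docs
  rating: "BAD",
  correction: "Nice to meet you."
});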
Interested in contributing? We ❤️ pull requests! Your contributions are always welcome.
We’re a small startup, so we don’t have a formal code of conduct yet; we simply ask that you be respectful and collaborative!
To contribute, please follow these steps:
- Fork the repository
- Create a feature or bugfix branch (git checkout -b feat/my-feature)
- Commit your improvements and submit a PR
For assistance, reach us at management@sign-speak.com.
Below is a quick example integration that builds a translator between ASL and English (note that we use daisyUI and Tailwind here for styling):
import { useEffect, useState } from "react";
import { RecognitionResult, setKey, SignProduction, SignRecognition, SpeechProduction } from "@sign-speak/react-sdk"

export default function Demo({
  apiKey
}: {
  apiKey: string
}) {
  const [directionToASL, setDirectionToASL] = useState(false)
  const [engText, setEngText] = useState("Nice to meet you.")
  const [submittedEngText, setSubmittedEngText] = useState("Nice to meet you.")
  const [recognizedASL, setRecognizedASL] = useState("")
  const [show, setShow] = useState(false)

  useEffect(() => {
    setKey(apiKey)
    setShow(true)
  }, [apiKey])

  const parseSLRResult = (result: RecognitionResult | null) => {
    if (result) {
      // Confidences are log-probabilities; keep predictions above 50%
      setRecognizedASL(result.prediction.filter(x => x.confidence > Math.log(0.5)).map(x => x.prediction).join(" "))
    }
  }

  if (!show) {
    return null
  }

  return <div className="m-5 w-full">
    <div className="flex flex-row items-center">
      <h1 className="font-semibold text-2xl mr-4">Translation Demo</h1>
      <button onClick={() => setDirectionToASL(!directionToASL)} className="btn bg-base-100">{
        directionToASL ? <>English to ASL</> : <>ASL to English</>
      }</button>
    </div>
    <div className="flex flex-row justify-center gap-8 w-full mt-4">
      {
        directionToASL ? <>
          <div className="card flex-1">
            <div className="card bg-base-100 w-full p-6">
              <h1 className="font-semibold">English</h1>
              <textarea
                value={engText}
                onChange={e => setEngText(e.target.value)}
                placeholder="Enter Text in English"
                className="bg-base-200 rounded-lg mt-4 p-2 textarea resize-none"
              />
              <div>
                <button
                  onClick={() => setSubmittedEngText(engText)}
                  className="btn btn-primary mt-2"
                >
                  Translate
                </button>
              </div>
            </div>
          </div>
          <div className="card flex-1">
            <div className="card bg-base-100 p-6 w-full">
              <h1 className="font-semibold mb-4">ASL Translation</h1>
              <SignProduction
                key="production"
                videoContainerClassName="w-full relative aspect-video"
                loadingClassName="w-full h-full absolute left-1/2 -translate-x-1/2 top-0 skeleton"
                videoClassName="w-full h-full"
                text={submittedEngText}
              />
            </div>
          </div>
        </> : <>
          <div className="card flex-1">
            <div className="card bg-base-100 w-full p-6">
              <h1 className="font-semibold mb-2">ASL</h1>
              <SignRecognition
                interpretationClassName="hidden"
                containerClassName="relative"
                loadingClassName="absolute w-full h-full top-0 skeleton !rounded-none"
                buttonClassName={"btn btn-primary"}
                onResult={parseSLRResult}
              />
            </div>
          </div>
          <div className="card flex-1">
            <div className="card bg-base-100 p-6 w-full">
              <h1 className="font-semibold mb-4">English Translation</h1>
              <SpeechProduction text={recognizedASL} />
              <p>{recognizedASL}</p>
            </div>
          </div>
        </>
      }
    </div>
  </div>
}
At Sign-Speak, we're on a mission to make technology truly inclusive. It all started with our Deaf founder asking a simple yet game-changing question: “Why can’t I just sign to everything that has voice recognition?”
And then came the second realization: “Why are we building gesture recognition when we already have a full visual language?” 🎯
So, instead of making people wave their hands around like they’re casting spells in a video game, we built real ASL recognition—because sign language is a language, not just a bunch of gestures.
This journey has been 9 years in the making, and now we’re handing you the keys to a groundbreaking SDK that makes ASL integration seamless. Whether you’re making an app more accessible or exploring innovative ASL-driven interactions, we can’t wait to see what you build!