A powerful, fully configurable React component that provides instant voice chat functionality powered by OpenAI's Realtime API. Instead of building voice chat from scratch, simply import and configure this component to add natural, human-like voice conversations to your application.
- 🎯 Instant Integration: Add voice chat to your app in minutes, not days
- 🔧 Fully Configurable: Customize every aspect of the voice chat experience
- 🎭 Natural Conversations: Built-in emotional expressions and human-like speech patterns
- 🎨 Multiple UI Variants: Choose from 4 pre-built UI designs or create your own
- ⚡ Real-time Communication: Powered by OpenAI's latest Realtime API
- 🛡️ Production Ready: Comprehensive error handling and fallback mechanisms
npm install realtime-voice-ai
yarn add realtime-voice-ai
import React from 'react';
import { VoiceChatTrigger } from 'realtime-voice-ai';

function App() {
  const config = {
    instructions: "You are a helpful assistant that speaks naturally with emotions.",
    voice: 'verse',
    temperature: 0.8
  };

  return (
    <VoiceChatTrigger
      name="John"
      botType="rvc"
      uiVersion="v2"
      apikey="your-openai-api-key"
      config={config}
    />
  );
}
Prop | Type | Required | Default | Description |
---|---|---|---|---|
name | string | No | undefined | User's name for personalized greetings and responses |
botType | string | Yes | - | Type of voice chat: 'rvc' (RealTimeVoiceChat) or 'va' (VoiceAssistant) |
uiVersion | string | No | 'v1' | UI variant: 'v1', 'v2', 'v3', 'v4', or 'custom' |
customUI | object | No | undefined | Custom UI configuration when uiVersion='custom' |
apikey | string | Yes | - | Your OpenAI API key |
isDisabled | boolean | No | false | Whether the voice chat is disabled |
config | object | No | {} | Session configuration object (see below) |
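
The props table above can be turned into a small pre-render check. This is a minimal sketch, not part of `realtime-voice-ai`: `validateTriggerProps` is a hypothetical helper, and the rule that `customUI` must be present when `uiVersion` is `'custom'` is an assumption read from the table rather than documented behavior.

```javascript
// Hypothetical helper (not exported by realtime-voice-ai).
// Returns a list of human-readable problems; an empty list means the props look valid.
const BOT_TYPES = ['rvc', 'va'];
const UI_VERSIONS = ['v1', 'v2', 'v3', 'v4', 'custom'];

function validateTriggerProps(props) {
  const errors = [];

  // botType and apikey are the two required props in the table.
  if (!BOT_TYPES.includes(props.botType)) {
    errors.push(`botType must be one of: ${BOT_TYPES.join(', ')}`);
  }
  if (typeof props.apikey !== 'string' || props.apikey.length === 0) {
    errors.push('apikey is required');
  }

  // uiVersion is optional and defaults to 'v1'.
  const uiVersion = props.uiVersion ?? 'v1';
  if (!UI_VERSIONS.includes(uiVersion)) {
    errors.push(`uiVersion must be one of: ${UI_VERSIONS.join(', ')}`);
  }

  // Assumption: a custom UI variant needs a customUI object to render.
  if (uiVersion === 'custom' && props.customUI === undefined) {
    errors.push("customUI is required when uiVersion is 'custom'");
  }

  return errors;
}
```

Calling `validateTriggerProps({ botType: 'rvc', apikey: key })` before rendering gives a readable error list instead of a silent failure at connection time.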

Property | Type | Default | Allowed Values | Description |
---|---|---|---|---|
model | string | 'gpt-4o-realtime-preview-2024-12-17' | 'gpt-4o-realtime-preview-2024-12-17' | OpenAI Realtime model |
modalities | array | ['audio', 'text'] | ['audio'], ['text'], ['audio', 'text'] | Supported interaction modes |
instructions | string | Natural conversation prompt | Any string | AI personality and behavior instructions |

Property | Type | Default | Allowed Values | Description |
---|---|---|---|---|
voice | string | 'alloy' | 'alloy', 'ash', 'ballad', 'coral', 'echo', 'sage', 'shimmer', 'verse' | Voice personality |
temperature | number | 0.8 | 0.6 - 1.2 | Response creativity and randomness |
max_response_output_tokens | string or number | 'inf' | 'inf' or 1 - 4096 | Maximum response length |

Property | Type | Default | Allowed Values | Description |
---|---|---|---|---|
input_audio_format | string | 'pcm16' | 'pcm16', 'g711_ulaw', 'g711_alaw' | Input audio format |
output_audio_format | string | 'pcm16' | 'pcm16', 'g711_ulaw', 'g711_alaw' | Output audio format |
input_audio_transcription | object | { model: 'whisper-1' } | { model: 'whisper-1' } | Transcription settings |

Property | Type | Default | Range/Values | Description |
---|---|---|---|---|
turn_detection.type | string | 'server_vad' | 'server_vad', 'none' | Voice activity detection type |
turn_detection.threshold | number | 0.5 | 0.0 - 1.0 | Voice detection sensitivity |
turn_detection.prefix_padding_ms | number | 200 | 0 - 5000 | Audio padding before speech (ms) |
turn_detection.silence_duration_ms | number | 400 | 0 - 20000 | Silence duration to trigger response (ms) |
turn_detection.create_response | boolean | true | true, false | Auto-generate responses |
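
The defaults and ranges in the configuration tables above can be made explicit in code. The sketch below is illustrative only: `DEFAULT_CONFIG` and `buildConfig` are hypothetical names, the component presumably applies its own defaults internally, and the default `instructions` text is left out because the tables do not spell it out.

```javascript
// Hypothetical helper (not exported by realtime-voice-ai).
// Defaults and ranges are copied from the tables above.
const VOICES = ['alloy', 'ash', 'ballad', 'coral', 'echo', 'sage', 'shimmer', 'verse'];

const DEFAULT_CONFIG = {
  model: 'gpt-4o-realtime-preview-2024-12-17',
  modalities: ['audio', 'text'],
  voice: 'alloy',
  temperature: 0.8,
  max_response_output_tokens: 'inf',
  input_audio_format: 'pcm16',
  output_audio_format: 'pcm16',
  input_audio_transcription: { model: 'whisper-1' },
  turn_detection: {
    type: 'server_vad',
    threshold: 0.5,
    prefix_padding_ms: 200,
    silence_duration_ms: 400,
    create_response: true,
  },
};

function inRange(value, min, max) {
  return typeof value === 'number' && value >= min && value <= max;
}

function buildConfig(overrides = {}) {
  // Merge turn_detection separately so a partial override keeps the other defaults.
  const config = {
    ...DEFAULT_CONFIG,
    ...overrides,
    turn_detection: { ...DEFAULT_CONFIG.turn_detection, ...overrides.turn_detection },
  };

  if (!VOICES.includes(config.voice)) {
    throw new RangeError(`voice must be one of: ${VOICES.join(', ')}`);
  }
  if (!inRange(config.temperature, 0.6, 1.2)) {
    throw new RangeError('temperature must be between 0.6 and 1.2');
  }

  const maxTokens = config.max_response_output_tokens;
  if (maxTokens !== 'inf' && !(Number.isInteger(maxTokens) && inRange(maxTokens, 1, 4096))) {
    throw new RangeError("max_response_output_tokens must be 'inf' or an integer from 1 to 4096");
  }

  const td = config.turn_detection;
  if (!inRange(td.threshold, 0, 1)) {
    throw new RangeError('turn_detection.threshold must be between 0.0 and 1.0');
  }
  if (!inRange(td.prefix_padding_ms, 0, 5000)) {
    throw new RangeError('turn_detection.prefix_padding_ms must be between 0 and 5000');
  }
  if (!inRange(td.silence_duration_ms, 0, 20000)) {
    throw new RangeError('turn_detection.silence_duration_ms must be between 0 and 20000');
  }

  return config;
}
```

For example, `buildConfig({ voice: 'coral', turn_detection: { threshold: 0.6 } })` keeps every other default, while an unsupported voice name or an out-of-range temperature throws before the config reaches the component.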
Choose from multiple pre-built UI designs:
- v1: Classic animated blob with pulse effects
- v2: Modern gradient blob with smooth animations
- v3: Geometric animated shapes with color transitions
- v4: Advanced particle system with dynamic effects
- custom: Use your own UI component by passing it to the customUI prop
Voice | Characteristics | Best For |
---|---|---|
alloy | Neutral, balanced | General purpose, professional |
ash | Smooth, sophisticated | Business presentations, formal conversations |
ballad | Melodic, expressive | Storytelling, creative content |
coral | Warm, friendly | Customer service, casual conversations |
echo | Clear, crisp | Technical support, education |
sage | Wise, calm | Healthcare, therapy, guidance |
shimmer | Soft, gentle | Children's content, soothing interactions |
verse | Rhythmic, engaging | Entertainment, dynamic conversations |
import { VoiceChatTrigger } from 'realtime-voice-ai';

const BasicExample = () => {
  const basicConfig = {
    instructions: "You are a helpful customer service representative.",
    voice: "coral",
    temperature: 0.7
  };

  return (
    <VoiceChatTrigger
      name="Sarah"
      botType="rvc"
      uiVersion="v2"
      apikey={process.env.REACT_APP_OPENAI_API_KEY}
      config={basicConfig}
    />
  );
};
const AdvancedExample = () => {
  const advancedConfig = {
    model: 'gpt-4o-realtime-preview-2024-12-17',
    modalities: ['audio', 'text'],
    instructions: `You are an enthusiastic fitness coach. Use natural expressions like "Oh wow!", "That's amazing!", and "Let's go!" to motivate users. Be encouraging and energetic.`,
    voice: 'verse',
    temperature: 0.9,
    max_response_output_tokens: 2048,
    // Slightly less sensitive detection with a longer pause before responding
    turn_detection: {
      type: 'server_vad',
      threshold: 0.6,
      prefix_padding_ms: 300,
      silence_duration_ms: 500,
      create_response: true
    }
  };

  return (
    <VoiceChatTrigger
      name="Alex"
      botType="rvc"
      uiVersion="v4"
      apikey={process.env.REACT_APP_OPENAI_API_KEY}
      config={advancedConfig}
    />
  );
};
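Store your API key in an environment variable instead of hardcoding it in source. Note that Create React App inlines REACT_APP_ variables into the browser bundle at build time, so the key is still visible to anyone who inspects the shipped code: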
# .env
REACT_APP_OPENAI_API_KEY=your_openai_api_key_here
// Use in your component
apikey={process.env.REACT_APP_OPENAI_API_KEY}
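
A missing variable otherwise only shows up as a failed connection, so it can help to fail early. The following is a minimal sketch; `requireApiKey` is a hypothetical helper, not something the package provides.

```javascript
// Hypothetical helper (not exported by realtime-voice-ai).
// Takes the raw environment value so it can be tested without a build step.
function requireApiKey(value) {
  if (typeof value !== 'string' || value.trim() === '') {
    throw new Error(
      'REACT_APP_OPENAI_API_KEY is not set; add it to .env and restart the dev server'
    );
  }
  return value.trim();
}
```

In a component this would be used as `apikey={requireApiKey(process.env.REACT_APP_OPENAI_API_KEY)}`.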
Fine-tune when the AI responds by adjusting these parameters:
- High Sensitivity (threshold: 0.2-0.4): Responds to quieter speech, may pick up background noise
- Medium Sensitivity (threshold: 0.4-0.6): Balanced detection, good for most environments
- Low Sensitivity (threshold: 0.6-0.8): Requires clearer speech, filters out background noise
- Fast Response (silence_duration_ms: 200-400): Quick interactions, may interrupt user
- Normal Response (silence_duration_ms: 400-600): Standard conversation timing
- Thoughtful Response (silence_duration_ms: 600-1000): Allows for pauses, better for complex topics
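
The sensitivity and timing bands above can be captured as named presets. This is a sketch under stated assumptions: `turnDetectionPreset` is a hypothetical helper, and the concrete numbers are values picked from inside each band listed above, not settings recommended by the package.

```javascript
// Hypothetical helper (not exported by realtime-voice-ai).
// Each value sits inside the corresponding band from the list above.
const SENSITIVITY_THRESHOLDS = { high: 0.3, medium: 0.5, low: 0.7 };
const PACING_SILENCE_MS = { fast: 300, normal: 500, thoughtful: 800 };

function hasOwn(obj, key) {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

function turnDetectionPreset(sensitivity = 'medium', pacing = 'normal') {
  if (!hasOwn(SENSITIVITY_THRESHOLDS, sensitivity)) {
    throw new Error(`unknown sensitivity: ${sensitivity}`);
  }
  if (!hasOwn(PACING_SILENCE_MS, pacing)) {
    throw new Error(`unknown pacing: ${pacing}`);
  }
  return {
    type: 'server_vad',
    threshold: SENSITIVITY_THRESHOLDS[sensitivity],
    silence_duration_ms: PACING_SILENCE_MS[pacing],
  };
}
```

A noisy room with a user who pauses often might use `config.turn_detection = turnDetectionPreset('low', 'thoughtful')`, which yields a threshold of 0.7 and a silence duration of 800 ms.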
Issue | Solution |
---|---|
No audio input or playback | Check browser microphone permissions and the selected audio output device |
Voice cuts out | Adjust threshold value (try 0.6-0.7) |
AI interrupts user | Increase silence_duration_ms (try 600-800) |
Delayed responses | Decrease silence_duration_ms (try 300-400) |
Connection fails | Verify OpenAI API key and internet connection |
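
The microphone permission check from the table above can be done in code with the standard `navigator.mediaDevices.getUserMedia` browser API. This is a rough sketch; `checkMicrophone` is a hypothetical helper, and the `mediaDevices` parameter exists only so the function can be exercised outside a browser.

```javascript
// Hypothetical helper (not exported by realtime-voice-ai).
// Resolves to { ok: true } when a microphone stream can be opened,
// otherwise { ok: false, reason } where reason is 'unsupported' or the
// DOMException name reported by the browser (e.g. 'NotAllowedError').
async function checkMicrophone(mediaDevices = globalThis.navigator?.mediaDevices) {
  if (!mediaDevices || typeof mediaDevices.getUserMedia !== 'function') {
    return { ok: false, reason: 'unsupported' };
  }
  try {
    const stream = await mediaDevices.getUserMedia({ audio: true });
    // Release the microphone immediately; this was only a probe.
    stream.getTracks().forEach((track) => track.stop());
    return { ok: true };
  } catch (err) {
    return { ok: false, reason: err.name };
  }
}
```

A result of `NotAllowedError` points at a denied permission prompt, while `NotFoundError` usually means no input device is connected.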
- ✅ Chrome 88+
- ✅ Firefox 84+
- ✅ Safari 14.1+
- ✅ Edge 88+
Contributions are welcome! Please feel free to submit a Pull Request.
If you encounter any issues or have questions:
- 🐛 Open an issue on our GitHub repository
- 📧 Contact directly: huzzaifaasim@gmail.com
Made with ❤️ for developers who want to add natural voice conversations to their applications.