Voice Interfaces
ASHAI provides two voice interfaces for different interaction patterns:
Standard Voice Chat (/voice)
Browser-based voice interface using Web Speech APIs for basic voice interaction.
Features:
- Voice-to-text input using the Web Speech API
- Text-to-speech responses with natural voice synthesis
- Same comprehensive tool access as text-based agents
- Patient profile integration
- Mobile-friendly responsive design
Best for: Direct patient consultations, accessibility needs, hands-free operation
Realtime Voice Chat (/voice-realtime)
Advanced WebSocket-based interface designed for ChatGPT-style real-time voice interaction.
Features:
- Real-time voice streaming with the OpenAI Realtime API
- Continuous conversation flow
- Low-latency responses
- Advanced voice processing
Best for: Natural conversation flow. Requires OpenAI Realtime API access.
Getting Started
- Start the server: ./run.sh
- Access voice interfaces:
  - Standard: http://localhost:8000/voice
  - Realtime: http://localhost:8000/voice-realtime
Both interfaces provide access to the same underlying medical knowledge bases and AI capabilities through voice interaction.
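Once the server is running, a quick way to confirm that both pages are being served is a small smoke check. The sketch below is only an illustration and not part of ASHAI: the helper names `voice_urls` and `is_reachable` are hypothetical, and it assumes the default `http://localhost:8000` address from the steps above and that each page answers a plain GET request with a 2xx status.

```python
import urllib.error
import urllib.request

# Assumption: the default address and paths listed under Getting Started.
BASE_URL = "http://localhost:8000"
VOICE_PATHS = {"standard": "/voice", "realtime": "/voice-realtime"}


def voice_urls(base_url: str = BASE_URL) -> dict:
    """Build the full URL for each voice interface (hypothetical helper)."""
    base = base_url.rstrip("/")
    return {name: base + path for name, path in VOICE_PATHS.items()}


def is_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET request to the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False


if __name__ == "__main__":
    for name, url in voice_urls().items():
        status = "ok" if is_reachable(url) else "unreachable"
        print(f"{name}: {url} -> {status}")
```

This only checks that the pages load. It does not exercise microphone access, speech recognition, or the OpenAI Realtime connection, which still need to be tried in a browser.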