Voice Interfaces

ASHAI provides two voice interfaces for different interaction patterns:

Standard Voice Chat (`/voice`)

A browser-based voice interface that uses the Web Speech API for basic voice interaction.

Features:

- Voice-to-text input using Web Speech API
- Text-to-speech responses with natural voice synthesis
- Same comprehensive tool access as text-based agents
- Patient profile integration
- Mobile-friendly responsive design

Best for: Direct patient consultations, accessibility needs, hands-free operation

Realtime Voice Chat (`/voice-realtime`)

Advanced WebSocket-based interface designed for ChatGPT-style real-time voice interaction.

Features:

- Real-time voice streaming with OpenAI Realtime API
- Continuous conversation flow
- Low-latency responses
- Advanced voice processing

Best for: Natural conversation flow. Requires OpenAI Realtime API access.

Getting Started

1. Start the server: `./run.sh`
2. Access voice interfaces:
   - Standard: http://localhost:8000/voice
   - Realtime: http://localhost:8000/voice-realtime
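
Once the server is running, a quick script can confirm that both routes respond. The following is a minimal sketch, not part of ASHAI itself: the helper names `voice_urls` and `is_reachable` are hypothetical, and it assumes the server listens on `http://localhost:8000` as listed above and answers a plain GET on each route.

```python
import urllib.error
import urllib.request

# Assumption: default address and routes as listed in Getting Started.
BASE_URL = "http://localhost:8000"
VOICE_PATHS = {"standard": "/voice", "realtime": "/voice-realtime"}


def voice_urls(base_url: str = BASE_URL) -> dict[str, str]:
    """Build the full URL for each voice interface (hypothetical helper)."""
    return {name: base_url.rstrip("/") + path for name, path in VOICE_PATHS.items()}


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET request to the URL succeeds (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False


if __name__ == "__main__":
    for name, url in voice_urls().items():
        status = "reachable" if is_reachable(url) else "not reachable"
        print(f"{name}: {url} is {status}")
```

A "not reachable" result usually means the server has not finished starting or is bound to a different port. This check only covers the page routes; it does not exercise microphone access or the realtime streaming connection.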

Both interfaces provide access to the same underlying medical knowledge bases and AI capabilities through voice interaction.
