How to build flexible, on-prem voice agents using LiveKit & Palantir’s OSDK
In our last blog post, we talked about building agentic voice-to-voice (V2V) workflows by exposing Palantir’s OSDK to ElevenLabs’ Conversational AI product. This approach allowed us to build robust voice agents capable of securely querying enterprise data, generating customised action plans and updating systems of action based on interactions with customers.
Our customers loved this. But their requirements prompted us to go deeper and see how much of the end-to-end flow we could build ourselves.
As a reminder, a turn-based conversational AI product has five components:
- STT (Speech-to-Text): Convert the user’s spoken input into text.
- LLM (Large Language Model): Core reasoning and language engine for understanding, planning, and generating responses.
- Agents: Built on top of the LLM using:
  - System Prompt: Define role, tone, and objectives (e.g. ‘You are a travel sales agent’).
  - Tools: Domain-specific capabilities (e.g. Palantir OSDK, databases, search APIs, booking engines).
- TTS (Text-to-Speech): Convert the LLM’s textual response back into natural-sounding audio.
- Orchestration Layer: Route between STT → LLM → Tools → TTS and manage multi-turn context (memory, conversation state, session management).

Source: https://elevenlabs.io/blog/how-do-you-optimize-latency-for-conversational-ai
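To make the five components concrete, here is a minimal sketch of a single conversational turn in plain Python. Everything in it is a hypothetical stand-in: `fake_stt`, `fake_llm`, `fake_tool_lookup`, `fake_tts` and the `Orchestrator` class are illustrative names, not part of LiveKit, ElevenLabs or the OSDK, and a real LLM would decide for itself whether to call a tool.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# Hypothetical stand-ins for real STT, tool, LLM, and TTS providers.
def fake_stt(audio: bytes) -> str:
    # Pretend the "audio" is already UTF-8 text.
    return audio.decode("utf-8")


def fake_tool_lookup(query: str) -> str:
    # Stands in for a domain tool such as a data query.
    return f"[tool result for '{query}']"


def fake_llm(system_prompt: str, history: List[str],
             tools: Dict[str, Callable[[str], str]]) -> str:
    # A real LLM would choose whether to call a tool; this stub always does.
    user_text = history[-1]
    return f"Based on {tools['lookup'](user_text)}, here is my answer."


def fake_tts(text: str) -> bytes:
    # Pretend the encoded text is synthesised audio.
    return text.encode("utf-8")


@dataclass
class Orchestrator:
    """Routes one turn through STT -> LLM (+ tools) -> TTS and keeps context."""
    system_prompt: str
    tools: Dict[str, Callable[[str], str]]
    history: List[str] = field(default_factory=list)

    def handle_turn(self, audio: bytes) -> bytes:
        user_text = fake_stt(audio)            # STT
        self.history.append(user_text)         # multi-turn context
        reply = fake_llm(self.system_prompt, self.history, self.tools)  # LLM + tools
        self.history.append(reply)
        return fake_tts(reply)                 # TTS


agent = Orchestrator("You are a travel sales agent", {"lookup": fake_tool_lookup})
reply_audio = agent.handle_turn(b"flights to Rome")
```

The orchestrator is the only piece that knows the order of the stages and owns the conversation history, which is why it is listed as its own component.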
Our customers needed four features that our ElevenLabs integration couldn’t handle:
- Model flexibility - ElevenLabs does not support external STT or TTS models such as Cartesia’s Sonic or Deepgram’s Nova-3. Its in-house models support only a handful of languages, so many regional languages are not covered.
- Silent voice co-pilots - these co-pilots omit the final TTS layer from the diagram above. For example, a co-pilot might silently listen to a conversation, provide helpful suggestions, automatically search for mentioned objects, and automate follow-on actions. Most conversational AI products aren’t modular, so the TTS layer cannot be removed.
- Zero data retention - ElevenLabs only provides ZDR as part of a comprehensive enterprise licence.
- On-prem/VPC hosting - ElevenLabs does not offer this, even with the enterprise licence.
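The first two requirements both come down to modularity: each stage should be swappable, and the TTS stage should be optional. A rough sketch of that idea follows, assuming each stage can be modelled as a plain callable. The `VoicePipeline` class and the `placeholder_*` functions are hypothetical names for illustration only, not an API from any of the products mentioned.

```python
from typing import Callable, Optional, Tuple

# Hypothetical interfaces: each stage is just a callable that can be swapped.
SpeechToText = Callable[[bytes], str]
Responder = Callable[[str], str]
TextToSpeech = Callable[[str], bytes]


class VoicePipeline:
    """Pipeline with injected STT and responder stages and an optional TTS stage."""

    def __init__(self, stt: SpeechToText, respond: Responder,
                 tts: Optional[TextToSpeech] = None) -> None:
        self.stt = stt
        self.respond = respond
        self.tts = tts

    def process(self, audio: bytes) -> Tuple[str, Optional[bytes]]:
        transcript = self.stt(audio)
        suggestion = self.respond(transcript)
        # Silent co-pilot: with no TTS configured, only text is returned.
        speech = self.tts(suggestion) if self.tts is not None else None
        return suggestion, speech


# Placeholder components; any vendor or self-hosted model could be swapped in.
def placeholder_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def placeholder_responder(transcript: str) -> str:
    return f"Suggestion: look up '{transcript}'"


def placeholder_tts(text: str) -> bytes:
    return text.encode("utf-8")


silent_copilot = VoicePipeline(placeholder_stt, placeholder_responder)
speaking_agent = VoicePipeline(placeholder_stt, placeholder_responder, placeholder_tts)
```

Calling `silent_copilot.process(...)` returns a text suggestion and no audio, while `speaking_agent.process(...)` returns both, without any change to the STT or responder stages.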