Overview
lua-cli/voice re-exports the LiveKit plugin namespaces that LuaVoice accepts as class instances. Importing through lua-cli/voice means you don’t need to add the underlying plugin packages as direct dependencies in your project.
For most voice agents, prefer the string-descriptor form documented in Voice API. The plugin class forms documented here are for two cases: (1) you need provider-specific options not exposed by the descriptor route, or (2) you’re using a realtime (speech-to-speech) model in the llm slot.
What’s allowed where
The compiler enforces two separate allowlists. Knowing them up front saves time:
| Form | Allowed in llm / stt / tts |
|---|---|
| '<provider>/<model>' string descriptor | Any provider supported by Lua’s inference layer. The descriptor route handles credentials. |
| new deepgram.<Class>({...}) | Plugin route. Only deepgram and elevenlabs are allowlisted. |
| new elevenlabs.<Class>({...}) | Plugin route. Only deepgram and elevenlabs are allowlisted. |
| new inference.<Class>({ model, ... }) | Typed shortcut for the descriptor route: same semantics as a string descriptor, just with autocomplete on the options. |
| new <provider>.realtime.RealtimeModel({...}) | Realtime route. openai, google (via google.beta.realtime.*), and xai are allowlisted for realtime only (goes in the llm slot, replaces STT+TTS). |
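The rules in the table above can be restated as a small classification function. This is a minimal sketch and not the compiler's actual implementation: `Route`, `ClassForm`, `routeFor`, and the two constant names are hypothetical and exist only for illustration. The provider lists are the ones given in the table.

```typescript
// Hypothetical model of the table above. These names are illustrative only
// and are not part of lua-cli.
type Route = 'descriptor' | 'inference' | 'plugin' | 'realtime';

// Providers the table lists for each class-based route.
const PLUGIN_PROVIDERS: readonly string[] = ['deepgram', 'elevenlabs'];
const REALTIME_PROVIDERS: readonly string[] = ['openai', 'google', 'xai'];

// Simplified stand-in for a class instance: the namespace it came from and
// whether it is a realtime.RealtimeModel.
interface ClassForm {
  namespace: string;
  realtime: boolean;
}

function routeFor(value: string | ClassForm): Route {
  // Plain strings are always treated as '<provider>/<model>' descriptors.
  if (typeof value === 'string') {
    return 'descriptor';
  }
  // inference.* classes are a typed shortcut for the descriptor route.
  if (value.namespace === 'inference') {
    return 'inference';
  }
  // Every other class form is checked against the allowlist for its route.
  const allowlist = value.realtime ? REALTIME_PROVIDERS : PLUGIN_PROVIDERS;
  const kind: Route = value.realtime ? 'realtime' : 'plugin';
  if (allowlist.indexOf(value.namespace) === -1) {
    throw new Error(`${value.namespace} is not on the ${kind} allowlist`);
  }
  return kind;
}
```

Under this model, `routeFor({ namespace: 'openai', realtime: false })` throws, which mirrors the table: openai is allowlisted for the realtime route only, so a non-realtime OpenAI model has to go through a string descriptor or inference.<Class>.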
Plugin Route: Deepgram + ElevenLabs
The two allowlisted plugin providers. Use these class forms when you need provider-specific options not exposed by the string-descriptor route.
Deepgram STT (plugin form)
- new deepgram.STT({...}): Deepgram’s v1 WebSocket endpoint. Use this for nova-3, nova-2, etc.
- new deepgram.STTv2({...}): Deepgram’s v2 endpoint. Required for Flux models that use semantic endpointing (eotThreshold, eagerEotThreshold, eotTimeoutMs).
ElevenLabs TTS (plugin form)
Inference Route (typed shortcut)
inference.LLM, inference.STT, inference.TTS are typed wrappers for the string-descriptor route. The compiler normalizes both forms to the same wire shape; the class form just gives you better TypeScript autocomplete on the options.
The model option is required: it’s the same provider-prefixed string you’d pass directly. For TTS, pass voice separately.
This is the only way to use class syntax for providers that aren’t on the plugin allowlist (OpenAI, Google, xAI, Cartesia, etc.).
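To illustrate what "normalizes both forms to the same wire shape" could mean, the sketch below reduces a string descriptor and an inference-style options object to one common object. The actual wire format is not documented on this page, so `NormalizedSlot`, `InferenceOptions`, and `normalizeSlot` are hypothetical names, and the provider/model split is an assumption based on the provider-prefixed string described above.

```typescript
// Hypothetical normalized shape. The real wire format is not documented here.
interface NormalizedSlot {
  provider: string;
  model: string;
  voice?: string;
}

// Stand-in for the options an inference.* class is described as taking.
interface InferenceOptions {
  model: string;  // provider-prefixed, e.g. 'deepgram/nova-3'
  voice?: string; // TTS only, passed separately from the model
}

function normalizeSlot(input: string | InferenceOptions): NormalizedSlot {
  // A bare string is treated as shorthand for { model: <string> }.
  const options: InferenceOptions =
    typeof input === 'string' ? { model: input } : input;

  // Split on the first slash: everything before it is the provider.
  const slash = options.model.indexOf('/');
  if (slash <= 0 || slash === options.model.length - 1) {
    throw new Error(`expected '<provider>/<model>', got '${options.model}'`);
  }

  const normalized: NormalizedSlot = {
    provider: options.model.slice(0, slash),
    model: options.model.slice(slash + 1),
  };
  if (options.voice !== undefined) {
    normalized.voice = options.voice;
  }
  return normalized;
}
```

In this sketch, `normalizeSlot('deepgram/nova-3')` and `normalizeSlot({ model: 'deepgram/nova-3' })` produce the same object, which is the property the class form is described as preserving while adding autocomplete.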
Realtime Route (speech-to-speech)
The realtime route puts a speech-to-speech model in the llm slot, replacing the cascaded STT → LLM → TTS pipeline. The class-construction path differs by provider:
- OpenAI: new openai.realtime.RealtimeModel({...})
- Google (Gemini): new google.beta.realtime.RealtimeModel({...}). Note the .beta. prefix (matches Google’s Node SDK shape).
Available realtime models
| Class form | Model id | Notes |
|---|---|---|
| new openai.realtime.RealtimeModel({ model: 'gpt-realtime-1.5' }) | gpt-realtime-1.5 | OpenAI flagship realtime. GA. |
| new openai.realtime.RealtimeModel({ model: 'gpt-realtime-mini' }) | gpt-realtime-mini | Cost-efficient OpenAI realtime. GA. |
| new google.beta.realtime.RealtimeModel({ model: 'gemini-3.1-flash-live-preview' }) | gemini-3.1-flash-live-preview | Newest Gemini realtime. Preview. |
| new google.beta.realtime.RealtimeModel({ model: 'gemini-2.5-flash-live-preview' }) | gemini-2.5-flash-live-preview | Cheaper Gemini alternative. Preview. |
xai is reserved in the realtime allowlist but no xAI realtime models are currently published.
Half-cascade mode
You can keep a separate tts with a realtime LLM. The worker injects modalities: ['text'] so the realtime model emits text and tts handles synthesis. Useful when you want realtime’s low-latency reasoning but ElevenLabs’ voice quality.
Do not combine a realtime llm with a custom stt: the compiler rejects it. Realtime models handle audio input directly.
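The slot combinations described in this section can be summarized as a small decision function. This is a sketch of the documented behavior under simplifying assumptions, not the compiler or worker code: `SlotSummary`, `PipelineMode`, `resolveMode`, and `injectedModalities` are hypothetical names, and slots are reduced to booleans.

```typescript
type PipelineMode = 'cascaded' | 'realtime' | 'half-cascade';

// Hypothetical summary of which slots a voice definition fills.
interface SlotSummary {
  llmIsRealtime: boolean; // llm is a <provider>.realtime.RealtimeModel
  hasStt: boolean;        // a custom stt was supplied
  hasTts: boolean;        // a custom tts was supplied
}

function resolveMode(slots: SlotSummary): PipelineMode {
  if (!slots.llmIsRealtime) {
    // Ordinary STT -> LLM -> TTS pipeline.
    return 'cascaded';
  }
  if (slots.hasStt) {
    // Realtime models take audio input directly, so a custom stt is rejected.
    throw new Error('a realtime llm cannot be combined with a custom stt');
  }
  // A realtime llm with its own tts is the half-cascade case.
  return slots.hasTts ? 'half-cascade' : 'realtime';
}

// Modalities the worker is described as injecting for each mode.
// undefined means this sketch assumes nothing is injected.
function injectedModalities(mode: PipelineMode): string[] | undefined {
  return mode === 'half-cascade' ? ['text'] : undefined;
}
```

For example, a realtime llm paired with an ElevenLabs tts resolves to 'half-cascade' and gets modalities ['text'], while adding a custom stt to the same definition throws.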
Credentials
Plugin class instances rely on credentials provisioned by the Lua platform, so you do not need to set DEEPGRAM_API_KEY, ELEVENLABS_API_KEY, etc. in your project’s .env. The keys live in the lua-livekit worker; the compiled push payload references the class form and the worker constructs the actual plugin at runtime.
When to use which form
| Goal | Recommended form |
|---|---|
| Quick start, sensible defaults | String descriptor — stt: 'deepgram/nova-3' |
| TypeScript autocomplete on options | inference.X — stt: new inference.STT({ model: 'deepgram/nova-3' }) |
| Deepgram or ElevenLabs with provider-specific options | Plugin class — stt: new deepgram.STT({ model: 'nova-3', smartFormat: true }) |
| Speech-to-speech (OpenAI/Google/xAI realtime) | Realtime class — llm: new openai.realtime.RealtimeModel({ model: 'gpt-realtime-1.5' }) |
Related
- Voice API: LuaVoice definition and string-descriptor catalog
- Voice Command: live testing

