Voice Plugins

Overview

lua-cli/voice re-exports the LiveKit plugin namespaces that LuaVoice accepts as class instances. Importing through lua-cli/voice means you don’t need to add the underlying plugin packages as direct dependencies in your project.

import { LuaVoice } from 'lua-cli';
import { deepgram, elevenlabs, openai, google, xai, inference } from 'lua-cli/voice';

For most voice agents, prefer the string-descriptor form documented in Voice API. The plugin class forms documented here are for two cases: (1) you need provider-specific options not exposed by the descriptor route, or (2) you’re using a realtime (speech-to-speech) model in the llm slot.

What’s allowed where

The compiler enforces two separate allowlists, one for plugin classes and one for realtime models. The table below shows which form is accepted where:

Form	Allowed in `llm` / `stt` / `tts`
`'<provider>/<model>'` string descriptor	Any provider supported by Lua’s inference layer. The descriptor route handles credentials.
`new deepgram.<Class>({...})`	Plugin route. Only `deepgram` and `elevenlabs` are allowlisted.
`new elevenlabs.<Class>({...})`	Plugin route. Only `deepgram` and `elevenlabs` are allowlisted.
`new inference.<Class>({ model, ... })`	Typed shortcut for the descriptor route — same semantics as a string descriptor, just with autocomplete on the options.
`new <provider>.realtime.RealtimeModel({...})`	Realtime route. `openai`, `google` (via `google.beta.realtime`), and `xai` are allowlisted for realtime only (the model goes in the `llm` slot and replaces STT + TTS).

new openai.LLM(...), new google.LLM(...), new xai.LLM(...) and similar class forms fail compile-time validation. These providers are not on the plugin allowlist. Use string descriptors ('openai/gpt-5') or — for speech-to-speech — the realtime form (new openai.realtime.RealtimeModel({...})).
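The rules above can be sketched as a small classifier. This is a hypothetical model for illustration only, not the lua-cli compiler: the names `classifyRoute`, `ClassForm`, `PLUGIN_ALLOWLIST`, and `REALTIME_ALLOWLIST` are assumptions, and a class form is represented here as plain data rather than a real plugin instance.

```typescript
// Hypothetical model of the allowlist rules described above.
// None of these names come from lua-cli itself.
type Slot = 'llm' | 'stt' | 'tts';
type Route = 'descriptor' | 'plugin' | 'inference' | 'realtime';

// Stand-in for a class form, e.g. `new deepgram.STT(...)` becomes
// { namespace: 'deepgram', className: 'STT' }.
interface ClassForm {
  namespace: string;
  className: string;
}

const PLUGIN_ALLOWLIST = new Set(['deepgram', 'elevenlabs']);
const REALTIME_ALLOWLIST = new Set(['openai', 'google', 'xai']);

function classifyRoute(slot: Slot, value: string | ClassForm): Route {
  // Any '<provider>/<model>' string takes the descriptor route.
  if (typeof value === 'string') return 'descriptor';

  const { namespace, className } = value;

  // inference.* is a typed shortcut for the descriptor route.
  if (namespace === 'inference') return 'inference';

  // Realtime models are allowed only for some providers, and only in `llm`.
  if (className === 'RealtimeModel') {
    if (!REALTIME_ALLOWLIST.has(namespace)) {
      throw new Error(`${namespace} is not allowlisted for realtime`);
    }
    if (slot !== 'llm') {
      throw new Error('realtime models go in the llm slot');
    }
    return 'realtime';
  }

  // Every other class form must come from an allowlisted plugin provider.
  if (PLUGIN_ALLOWLIST.has(namespace)) return 'plugin';

  throw new Error(
    `new ${namespace}.${className}(...) is not on the plugin allowlist; use a string descriptor`,
  );
}

console.log(classifyRoute('stt', { namespace: 'deepgram', className: 'STT' })); // plugin
console.log(classifyRoute('llm', 'openai/gpt-5')); // descriptor
```

In this sketch, `classifyRoute('llm', { namespace: 'openai', className: 'LLM' })` throws, mirroring the compile-time rejection described above.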

Plugin Route: Deepgram + ElevenLabs

Deepgram and ElevenLabs are the two allowlisted plugin providers. Use these class forms when you need provider-specific options not exposed by the string-descriptor route.

Deepgram STT (plugin form)

import { LuaVoice } from 'lua-cli';
import { deepgram, elevenlabs } from 'lua-cli/voice';

export default new LuaVoice({
  name: 'support-line',
  llm: 'openai/gpt-5.1-chat-latest',
  stt: new deepgram.STT({
    model: 'nova-3',
    smartFormat: true,
    fillerWords: false,
  }),
  tts: new elevenlabs.TTS({
    voiceId: 'pwMBn0SsmN1220Aorv15',
    model: 'eleven_flash_v2_5',
  }),
});

Deepgram exposes two STT classes:

new deepgram.STT({...}) — Deepgram’s v1 WebSocket endpoint. Use this for nova-3, nova-2, etc.
new deepgram.STTv2({...}) — Deepgram’s v2 endpoint. Required for Flux models that use semantic endpointing (eotThreshold, eagerEotThreshold, eotTimeoutMs).

The compiler routes each to the correct underlying plugin based on which class you used.
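One way to decide between the two classes is to check whether your options include any of the semantic-endpointing fields listed above. The helper below is a hypothetical sketch: `deepgramClassFor` is not part of lua-cli, and the `'<flux-model-id>'` value is a placeholder rather than a real model name.

```typescript
// Option names are taken from the text above; the helper is illustrative only.
const FLUX_ENDPOINTING_OPTIONS = ['eotThreshold', 'eagerEotThreshold', 'eotTimeoutMs'] as const;

type DeepgramClass = 'STT' | 'STTv2';

// Returns which Deepgram class the options call for: STTv2 when any
// semantic-endpointing option is present, otherwise STT.
function deepgramClassFor(options: Record<string, unknown>): DeepgramClass {
  const usesSemanticEndpointing = FLUX_ENDPOINTING_OPTIONS.some((key) => key in options);
  return usesSemanticEndpointing ? 'STTv2' : 'STT';
}

console.log(deepgramClassFor({ model: 'nova-3', smartFormat: true })); // STT
console.log(deepgramClassFor({ model: '<flux-model-id>', eotThreshold: 0.7 })); // STTv2
```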

ElevenLabs TTS (plugin form)

tts: new elevenlabs.TTS({
  voiceId: 'pwMBn0SsmN1220Aorv15',
  model: 'eleven_v3',
  stability: 0.5,
  similarityBoost: 0.75,
}),

The plugin route lets you pass advanced ElevenLabs options (stability, similarity boost, style, speaker boost, etc.) that the descriptor route doesn’t surface.

Inference Route (typed shortcut)

inference.LLM, inference.STT, and inference.TTS are typed wrappers for the string-descriptor route. The compiler normalizes both forms to the same wire shape; the class form just gives you better TypeScript autocomplete on the options.

import { LuaVoice } from 'lua-cli';
import { inference } from 'lua-cli/voice';

export default new LuaVoice({
  name: 'support-line',
  llm: new inference.LLM({ model: 'openai/gpt-5.1-chat-latest' }),
  stt: new inference.STT({ model: 'deepgram/nova-3' }),
  tts: new inference.TTS({
    model: 'elevenlabs/eleven_turbo_v2_5',
    voice: 'pwMBn0SsmN1220Aorv15',
  }),
});

The model option is required — it’s the same provider-prefixed string you’d pass directly. For TTS, pass voice separately. This is the only way to use class syntax for providers that aren’t on the plugin allowlist (OpenAI, Google, xAI, Cartesia, etc.).
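To illustrate what "the same wire shape" could mean, here is a minimal sketch that reduces both a string descriptor and an `inference`-style options object to one structure. The `WireShape` type and both functions are assumptions made for illustration; the actual payload format is not documented here. The `provider/model:voice` syntax follows the TTS descriptor used elsewhere on this page.

```typescript
// Hypothetical normalized shape; the real wire format may differ.
interface WireShape {
  provider: string;
  model: string;
  voice?: string;
}

// Parses '<provider>/<model>' with an optional ':<voice>' suffix.
function parseDescriptor(descriptor: string): WireShape {
  const slash = descriptor.indexOf('/');
  if (slash <= 0 || slash === descriptor.length - 1) {
    throw new Error(`expected '<provider>/<model>', got '${descriptor}'`);
  }
  const provider = descriptor.slice(0, slash);
  const rest = descriptor.slice(slash + 1);
  const colon = rest.indexOf(':');
  if (colon === -1) return { provider, model: rest };
  return { provider, model: rest.slice(0, colon), voice: rest.slice(colon + 1) };
}

// Accepts either a descriptor string or { model, voice } options
// and returns the same shape for both.
function normalize(value: string | { model: string; voice?: string }): WireShape {
  if (typeof value === 'string') return parseDescriptor(value);
  const parsed = parseDescriptor(value.model);
  const voice = value.voice ?? parsed.voice;
  return voice === undefined ? parsed : { ...parsed, voice };
}

const fromString = normalize('elevenlabs/eleven_turbo_v2_5:pwMBn0SsmN1220Aorv15');
const fromOptions = normalize({
  model: 'elevenlabs/eleven_turbo_v2_5',
  voice: 'pwMBn0SsmN1220Aorv15',
});
console.log(JSON.stringify(fromString) === JSON.stringify(fromOptions)); // true
```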

Realtime Route (speech-to-speech)

The realtime route puts a speech-to-speech model in the llm slot, replacing the cascaded STT → LLM → TTS pipeline. The class-construction path differs by provider:

OpenAI: new openai.realtime.RealtimeModel({...})
Google (Gemini): new google.beta.realtime.RealtimeModel({...}) — note the .beta. prefix (matches Google’s Node SDK shape)

import { LuaVoice } from 'lua-cli';
import { openai } from 'lua-cli/voice';

export default new LuaVoice({
  name: 'realtime-line',
  llm: new openai.realtime.RealtimeModel({
    model: 'gpt-realtime-1.5',
    voice: 'alloy',
  }),
  // stt and tts are NOT specified — realtime handles audio directly.
});

import { LuaVoice } from 'lua-cli';
import { google } from 'lua-cli/voice';

export default new LuaVoice({
  name: 'realtime-gemini',
  llm: new google.beta.realtime.RealtimeModel({
    model: 'gemini-3.1-flash-live-preview',
  }),
});

Available realtime models

Class form	Model id	Notes
`new openai.realtime.RealtimeModel({ model: 'gpt-realtime-1.5' })`	`gpt-realtime-1.5`	OpenAI flagship realtime. GA.
`new openai.realtime.RealtimeModel({ model: 'gpt-realtime-mini' })`	`gpt-realtime-mini`	Cost-efficient OpenAI realtime. GA.
`new google.beta.realtime.RealtimeModel({ model: 'gemini-3.1-flash-live-preview' })`	`gemini-3.1-flash-live-preview`	Newest Gemini realtime. Preview.
`new google.beta.realtime.RealtimeModel({ model: 'gemini-2.5-flash-live-preview' })`	`gemini-2.5-flash-live-preview`	Cheaper Gemini alternative. Preview.

xai is reserved in the realtime allowlist but no xAI realtime models are currently published.

Half-cascade mode

You can keep a separate tts with a realtime LLM — the worker injects modalities: ['text'] so the realtime model emits text and tts handles synthesis. Useful when you want realtime’s low-latency reasoning but ElevenLabs’ voice quality:

import { LuaVoice } from 'lua-cli';
import { openai } from 'lua-cli/voice';

export default new LuaVoice({
  name: 'hybrid-line',
  llm: new openai.realtime.RealtimeModel({ model: 'gpt-realtime-mini' }),
  // stt omitted — realtime handles input audio.
  tts: 'elevenlabs/eleven_turbo_v2_5:pwMBn0SsmN1220Aorv15',
});

You cannot combine a realtime llm with a custom stt — the compiler rejects it. Realtime models handle audio input directly.
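The slot combinations described in this section can be summarized as a small decision function. This is a hedged sketch of the rules as stated above, not the compiler's or worker's actual code; `resolvePipeline`, `VoiceSlots`, and the mode names are hypothetical.

```typescript
// Hypothetical summary of which slot combinations are valid.
type Mode = 'cascade' | 'realtime' | 'half-cascade';

interface VoiceSlots {
  llmIsRealtime: boolean; // true when llm is a RealtimeModel instance
  hasStt: boolean;
  hasTts: boolean;
}

interface Resolved {
  mode: Mode;
  modalities?: string[]; // set only in half-cascade mode
}

function resolvePipeline(slots: VoiceSlots): Resolved {
  // A non-realtime llm uses the cascaded STT -> LLM -> TTS pipeline.
  if (!slots.llmIsRealtime) return { mode: 'cascade' };

  // Realtime models handle input audio themselves, so a custom stt is rejected.
  if (slots.hasStt) {
    throw new Error('a realtime llm cannot be combined with a custom stt');
  }

  // Realtime llm plus a separate tts: the model emits text and tts synthesizes it.
  if (slots.hasTts) return { mode: 'half-cascade', modalities: ['text'] };

  return { mode: 'realtime' };
}

console.log(resolvePipeline({ llmIsRealtime: true, hasStt: false, hasTts: true }).mode); // half-cascade
```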

Credentials

Plugin class instances rely on credentials provisioned by the Lua platform — you do not need to set DEEPGRAM_API_KEY, ELEVENLABS_API_KEY, etc. in your project’s .env. The keys live in the lua-livekit worker; the compiled push payload references the class form and the worker constructs the actual plugin at runtime.

When to use which form

Goal	Recommended form
Quick start, sensible defaults	String descriptor — `stt: 'deepgram/nova-3'`
TypeScript autocomplete on options	`inference.X` — `stt: new inference.STT({ model: 'deepgram/nova-3' })`
Deepgram or ElevenLabs with provider-specific options	Plugin class — `stt: new deepgram.STT({ model: 'nova-3', smartFormat: true })`
Speech-to-speech (OpenAI/Google/xAI realtime)	Realtime class — `llm: new openai.realtime.RealtimeModel({ model: 'gpt-realtime-1.5' })`

Related

Voice API: LuaVoice definition and string-descriptor catalog
Voice Command: live testing
