# Voice Plugins

> Construct STT, TTS, LLM, and realtime engines for LuaVoice via the lua-cli/voice subpath

## Overview

`lua-cli/voice` re-exports the LiveKit plugin namespaces that `LuaVoice` accepts as class instances. Importing through `lua-cli/voice` means you don't need to add the underlying plugin packages as direct dependencies in your project.

```typescript theme={null}
import { LuaVoice } from 'lua-cli';
import { deepgram, elevenlabs, openai, google, xai, inference } from 'lua-cli/voice';
```

<Note>
  **For most voice agents, prefer the string-descriptor form** documented in [Voice API](/api/voice). The plugin class forms documented here are for two cases: (1) you need provider-specific options not exposed by the descriptor route, or (2) you're using a realtime (speech-to-speech) model in the `llm` slot.
</Note>

***

## What's allowed where

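The compiler enforces two separate allowlists: one for plugin classes (`deepgram`, `elevenlabs`) and one for realtime models (`openai`, `google`, `xai`). The table below shows which rule applies to each form: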

| Form                                           | Allowed in `llm` / `stt` / `tts`                                                                                                                           |
| ---------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `'<provider>/<model>'` string descriptor       | **Any** provider supported by Lua's inference layer. The descriptor route handles credentials.                                                             |
| `new deepgram.<Class>({...})`                  | Plugin route. Only `deepgram` and `elevenlabs` are allowlisted.                                                                                            |
| `new elevenlabs.<Class>({...})`                | Plugin route. Only `deepgram` and `elevenlabs` are allowlisted.                                                                                            |
| `new inference.<Class>({ model, ... })`        | Typed shortcut for the descriptor route — same semantics as a string descriptor, just with autocomplete on the options.                                    |
| `new <provider>.realtime.RealtimeModel({...})` | Realtime route. `openai`, `google` (via `google.beta.realtime.*`), `xai` are allowlisted **for realtime only** (goes in the `llm` slot, replaces STT+TTS). |

<Warning>
  **`new openai.LLM(...)`, `new google.LLM(...)`, `new xai.LLM(...)` and similar class forms fail compile-time validation.** These providers are not on the plugin allowlist. Use string descriptors (`'openai/gpt-5'`) or — for speech-to-speech — the realtime form (`new openai.realtime.RealtimeModel({...})`).
</Warning>
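
The table and warning above can be summarised as a small lookup. This is an illustrative sketch only, not the compiler's actual validator: `Route`, `Slot`, `PLUGIN_ALLOWLIST`, `REALTIME_ALLOWLIST`, and `isAllowed` are hypothetical names, and the provider sets are copied from the table.

```typescript
// Illustrative model of the two allowlists described above.
// All names here are hypothetical; this is not lua-cli's real validator.
type Route = 'descriptor' | 'inference' | 'plugin' | 'realtime';
type Slot = 'llm' | 'stt' | 'tts';

const PLUGIN_ALLOWLIST = new Set(['deepgram', 'elevenlabs']);
const REALTIME_ALLOWLIST = new Set(['openai', 'google', 'xai']);

function isAllowed(route: Route, provider: string, slot: Slot): boolean {
  switch (route) {
    case 'descriptor':
    case 'inference':
      // Both forms go through the inference layer, in any slot.
      // Whether a given provider is supported there is not modelled.
      return true;
    case 'plugin':
      return PLUGIN_ALLOWLIST.has(provider);
    case 'realtime':
      // Realtime models are only valid in the `llm` slot.
      return slot === 'llm' && REALTIME_ALLOWLIST.has(provider);
  }
}

console.log(isAllowed('plugin', 'deepgram', 'stt'));  // true
console.log(isAllowed('plugin', 'openai', 'llm'));    // false, like `new openai.LLM(...)`
console.log(isAllowed('realtime', 'openai', 'llm'));  // true
```

The sketch makes the asymmetry explicit: `openai` passes the realtime check but fails the plugin check, which is why `new openai.LLM(...)` is rejected while `new openai.realtime.RealtimeModel({...})` is accepted.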

***

## Plugin Route: Deepgram + ElevenLabs

Deepgram and ElevenLabs are the two allowlisted plugin providers. Use their class forms when you need provider-specific options not exposed by the string-descriptor route.

### Deepgram STT (plugin form)

```typescript theme={null}
import { LuaVoice } from 'lua-cli';
import { deepgram, elevenlabs } from 'lua-cli/voice';

export default new LuaVoice({
  name: 'support-line',
  llm: 'openai/gpt-5.1-chat-latest',
  stt: new deepgram.STT({
    model: 'nova-3',
    smartFormat: true,
    fillerWords: false,
  }),
  tts: new elevenlabs.TTS({
    voiceId: 'pwMBn0SsmN1220Aorv15',
    model: 'eleven_flash_v2_5',
  }),
});
```

Deepgram exposes two STT classes:

* **`new deepgram.STT({...})`** — Deepgram's v1 WebSocket endpoint. Use this for `nova-3`, `nova-2`, etc.
* **`new deepgram.STTv2({...})`** — Deepgram's v2 endpoint. Required for **Flux** models that use semantic endpointing (`eotThreshold`, `eagerEotThreshold`, `eotTimeoutMs`).

The compiler routes each to the correct underlying plugin based on which class you used.
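
As a rough illustration of the split between the two classes, the helper below picks a class name from a model id. It assumes that Flux model ids start with `flux`, which is an assumption about Deepgram's naming rather than something lua-cli guarantees, and `deepgramClassFor` is a hypothetical helper, not part of `lua-cli/voice`.

```typescript
// Hypothetical helper: choose between deepgram.STT and deepgram.STTv2.
// Assumption: Flux model ids begin with "flux"; everything else
// (nova-3, nova-2, ...) uses the v1 class.
type DeepgramClass = 'STT' | 'STTv2';

function deepgramClassFor(model: string): DeepgramClass {
  return model.toLowerCase().startsWith('flux') ? 'STTv2' : 'STT';
}

console.log(deepgramClassFor('nova-3'));          // STT
console.log(deepgramClassFor('flux-general-en')); // STTv2
```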

### ElevenLabs TTS (plugin form)

```typescript theme={null}
// Fragment: the `tts` property inside `new LuaVoice({ ... })`.
tts: new elevenlabs.TTS({
  voiceId: 'pwMBn0SsmN1220Aorv15',
  model: 'eleven_v3',
  stability: 0.5,
  similarityBoost: 0.75,
}),
```

The plugin route lets you pass advanced ElevenLabs options (stability, similarity boost, style, speaker boost, etc.) that the descriptor route doesn't surface.

***

## Inference Route (typed shortcut)

`inference.LLM`, `inference.STT`, `inference.TTS` are typed wrappers for the string-descriptor route. The compiler normalizes both forms to the same wire shape; the class form just gives you better TypeScript autocomplete on the options.

```typescript theme={null}
import { LuaVoice } from 'lua-cli';
import { inference } from 'lua-cli/voice';

export default new LuaVoice({
  name: 'support-line',
  llm: new inference.LLM({ model: 'openai/gpt-5.1-chat-latest' }),
  stt: new inference.STT({ model: 'deepgram/nova-3' }),
  tts: new inference.TTS({
    model: 'elevenlabs/eleven_turbo_v2_5',
    voice: 'pwMBn0SsmN1220Aorv15',
  }),
});
```

The `model` option is required — it's the same provider-prefixed string you'd pass directly. For TTS, pass `voice` separately.

Outside the realtime route, this is the **only** way to use class syntax for providers that aren't on the plugin allowlist (OpenAI, Google, xAI, Cartesia, etc.).
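
To make the equivalence between the two forms concrete, here is a minimal sketch of normalising a string descriptor and an `inference.X`-style options object to one shape. The `EngineSpec` shape and both function names are hypothetical, and the real wire format is not documented on this page. The `:<voice>` suffix follows the TTS descriptor style used elsewhere in these docs.

```typescript
// Hypothetical sketch of normalising both forms to one shape.
// The real wire shape used by lua-cli is not shown in these docs.
interface EngineSpec {
  provider: string;
  model: string; // provider-prefixed, as passed to inference.X
  voice?: string;
}

// Parses '<provider>/<model>' with an optional ':<voice>' suffix.
function fromDescriptor(descriptor: string): EngineSpec {
  const slash = descriptor.indexOf('/');
  const colon = descriptor.indexOf(':', slash + 1);
  const model = colon === -1 ? descriptor : descriptor.slice(0, colon);
  if (slash <= 0 || model.length === slash + 1) {
    throw new Error(`Expected '<provider>/<model>', got '${descriptor}'`);
  }
  const voice = colon === -1 ? '' : descriptor.slice(colon + 1);
  return {
    provider: descriptor.slice(0, slash),
    model,
    ...(voice ? { voice } : {}),
  };
}

// Mirrors the `model` / `voice` options shown for inference.TTS above.
function fromInferenceOptions(opts: { model: string; voice?: string }): EngineSpec {
  const base = fromDescriptor(opts.model);
  return opts.voice ? { ...base, voice: opts.voice } : base;
}

const a = fromDescriptor('elevenlabs/eleven_turbo_v2_5:pwMBn0SsmN1220Aorv15');
const b = fromInferenceOptions({
  model: 'elevenlabs/eleven_turbo_v2_5',
  voice: 'pwMBn0SsmN1220Aorv15',
});
console.log(JSON.stringify(a) === JSON.stringify(b)); // true
```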

***

## Realtime Route (speech-to-speech)

The realtime route puts a speech-to-speech model in the `llm` slot, replacing the cascaded STT → LLM → TTS pipeline. The class-construction path differs by provider:

* **OpenAI**: `new openai.realtime.RealtimeModel({...})`
* **Google (Gemini)**: `new google.beta.realtime.RealtimeModel({...})` — note the `.beta.` prefix (matches Google's Node SDK shape)

```typescript theme={null}
import { LuaVoice } from 'lua-cli';
import { openai } from 'lua-cli/voice';

export default new LuaVoice({
  name: 'realtime-line',
  llm: new openai.realtime.RealtimeModel({
    model: 'gpt-realtime-1.5',
    voice: 'alloy',
  }),
  // stt and tts are NOT specified — realtime handles audio directly.
});
```

```typescript theme={null}
import { LuaVoice } from 'lua-cli';
import { google } from 'lua-cli/voice';

export default new LuaVoice({
  name: 'realtime-gemini',
  llm: new google.beta.realtime.RealtimeModel({
    model: 'gemini-3.1-flash-live-preview',
  }),
});
```

### Available realtime models

| Class form                                                                           | Model id                        | Notes                                |
| ------------------------------------------------------------------------------------ | ------------------------------- | ------------------------------------ |
| `new openai.realtime.RealtimeModel({ model: 'gpt-realtime-1.5' })`                   | `gpt-realtime-1.5`              | OpenAI flagship realtime. GA.        |
| `new openai.realtime.RealtimeModel({ model: 'gpt-realtime-mini' })`                  | `gpt-realtime-mini`             | Cost-efficient OpenAI realtime. GA.  |
| `new google.beta.realtime.RealtimeModel({ model: 'gemini-3.1-flash-live-preview' })` | `gemini-3.1-flash-live-preview` | Newest Gemini realtime. Preview.     |
| `new google.beta.realtime.RealtimeModel({ model: 'gemini-2.5-flash-live-preview' })` | `gemini-2.5-flash-live-preview` | Cheaper Gemini alternative. Preview. |

<Note>
  `xai` is reserved in the realtime allowlist but no xAI realtime models are currently published.
</Note>

### Half-cascade mode

You can keep a separate `tts` with a realtime LLM — the worker injects `modalities: ['text']` so the realtime model emits text and `tts` handles synthesis. Useful when you want realtime's low-latency reasoning but ElevenLabs' voice quality:

```typescript theme={null}
import { LuaVoice } from 'lua-cli';
import { openai } from 'lua-cli/voice';

export default new LuaVoice({
  name: 'hybrid-line',
  llm: new openai.realtime.RealtimeModel({ model: 'gpt-realtime-mini' }),
  // stt omitted — realtime handles input audio.
  tts: 'elevenlabs/eleven_turbo_v2_5:pwMBn0SsmN1220Aorv15',
});
```

You **cannot** combine a realtime `llm` with a custom `stt` — the compiler rejects it. Realtime models handle audio input directly.
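
The slot rules for a realtime `llm` can be sketched as a small decision function. This is a hypothetical model for illustration: `Slots`, `Mode`, and `resolveMode` are not part of lua-cli, and the error message is made up.

```typescript
// Hypothetical model of the slot rules for a realtime `llm`.
// Names and return values are illustrative, not lua-cli's API.
interface Slots {
  llmIsRealtime: boolean;
  hasStt: boolean;
  hasTts: boolean;
}

type Mode = 'cascade' | 'realtime' | 'half-cascade';

function resolveMode(slots: Slots): Mode {
  if (!slots.llmIsRealtime) return 'cascade';
  if (slots.hasStt) {
    // Mirrors the compiler rule: realtime handles input audio itself.
    throw new Error('A realtime llm cannot be combined with a custom stt');
  }
  // With a separate tts, the realtime model emits text only.
  return slots.hasTts ? 'half-cascade' : 'realtime';
}

console.log(resolveMode({ llmIsRealtime: true, hasStt: false, hasTts: true }));  // half-cascade
console.log(resolveMode({ llmIsRealtime: true, hasStt: false, hasTts: false })); // realtime
```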

***

## Credentials

Plugin class instances rely on credentials provisioned by the Lua platform — you do **not** need to set `DEEPGRAM_API_KEY`, `ELEVENLABS_API_KEY`, etc. in your project's `.env`. The keys live in the lua-livekit worker; the compiled push payload references the class form and the worker constructs the actual plugin at runtime.

***

## When to use which form

| Goal                                                  | Recommended form                                                                             |
| ----------------------------------------------------- | -------------------------------------------------------------------------------------------- |
| Quick start, sensible defaults                        | **String descriptor** — `stt: 'deepgram/nova-3'`                                             |
| TypeScript autocomplete on options                    | **`inference.X`** — `stt: new inference.STT({ model: 'deepgram/nova-3' })`                   |
| Deepgram or ElevenLabs with provider-specific options | **Plugin class** — `stt: new deepgram.STT({ model: 'nova-3', smartFormat: true })`           |
| Speech-to-speech (OpenAI/Google/xAI realtime)         | **Realtime class** — `llm: new openai.realtime.RealtimeModel({ model: 'gpt-realtime-1.5' })` |

## Related

* [Voice API](/api/voice) — `LuaVoice` definition and string-descriptor catalog
* [Voice Command](/cli/voice-command) — live testing
