# Model Selection

> Choose which LLM powers your agent — statically or dynamically per request

## Overview

By default, every Lua agent uses **Google Gemini 2.5 Flash**. The `model` property on `LuaAgent` lets you choose a different model or select one dynamically based on the request context.

```typescript theme={null}
export const agent = new LuaAgent({
  name: 'my-agent',
  persona: '...',
  model: 'openai/gpt-5.4',  // ← add this
  skills: [mySkill]
});
```

<Note>
  **Lua manages the API credentials.** You don't need to configure any API keys or provider accounts — Lua handles all LLM infrastructure on your behalf.

  Support for user-provided API keys (Bring Your Own Key) is coming in a future release.
</Note>

***

## Available Models

### Google (Vertex AI)

| Model string                           | Context window | Notes                                                |
| -------------------------------------- | -------------- | ---------------------------------------------------- |
| `google/gemini-2.5-flash`              | 1M tokens      | **Default** — fast reasoning, best price-performance |
| `google/gemini-2.5-pro`                | 1M tokens      | Advanced reasoning for complex analysis              |
| `google/gemini-2.5-flash-lite`         | 1M tokens      | Budget-friendly in the 2.5 family                    |
| `google/gemini-3.1-pro-preview`        | 1M tokens      | Most capable Gemini (preview)                        |
| `google/gemini-3-flash-preview`        | 1M tokens      | Next-gen frontier flash (preview)                    |
| `google/gemini-3.1-flash-lite-preview` | 1M tokens      | Next-gen lite (preview)                              |

### OpenAI

| Model string          | Context window | Notes                           |
| --------------------- | -------------- | ------------------------------- |
| `openai/gpt-5.4`      | 1.1M tokens    | Current flagship                |
| `openai/gpt-5.4-mini` | 400K tokens    | Fast flagship variant           |
| `openai/gpt-5.4-nano` | 400K tokens    | Ultra-low-latency               |
| `openai/gpt-5`        | 400K tokens    | Strong general-purpose          |
| `openai/gpt-5-mini`   | 400K tokens    | Compact GPT-5                   |
| `openai/gpt-4.1`      | 1M tokens      | Stable, previous gen            |
| `openai/gpt-4.1-mini` | 1M tokens      | Budget variant of 4.1           |
| `openai/gpt-4.1-nano` | 1M tokens      | Fastest 4.1 variant             |
| `openai/o3`           | 200K tokens    | Most powerful reasoning         |
| `openai/o3-mini`      | 200K tokens    | Fast reasoning                  |
| `openai/o3-pro`       | 200K tokens    | Premium reasoning               |
| `openai/o4-mini`      | 200K tokens    | Latest cost-efficient reasoning |

### Anthropic

| Model string                  | Context window | Notes                           |
| ----------------------------- | -------------- | ------------------------------- |
| `anthropic/claude-opus-4-6`   | 1M tokens      | Most intelligent                |
| `anthropic/claude-sonnet-4-6` | 1M tokens      | Best speed/intelligence balance |
| `anthropic/claude-haiku-4-5`  | 200K tokens    | Fastest model                   |
| `anthropic/claude-opus-4-5`   | 200K tokens    | Previous gen, still active      |
| `anthropic/claude-sonnet-4-5` | 200K tokens    | Previous gen, still active      |
| `anthropic/claude-opus-4-1`   | 200K tokens    | Older gen, still active         |

### DeepSeek

| Model string                 | Context window | Notes                        |
| ---------------------------- | -------------- | ---------------------------- |
| `deepseek/deepseek-chat`     | 131K tokens    | DeepSeek-V3, general purpose |
| `deepseek/deepseek-reasoner` | 128K tokens    | DeepSeek-R1, reasoning       |

### Groq

| Model string                                     | Context window | Notes                        |
| ------------------------------------------------ | -------------- | ---------------------------- |
| `groq/llama-3.3-70b-versatile`                   | 131K tokens    | Production, tool use capable |
| `groq/meta-llama/llama-4-scout-17b-16e-instruct` | 131K tokens    | Latest Llama 4               |
| `groq/openai/gpt-oss-120b`                       | 131K tokens    | GPT-OSS on Groq LPU hardware |
| `groq/groq/compound`                             | 131K tokens    | Groq Compound agentic system |
| `groq/llama-3.1-8b-instant`                      | 131K tokens    | Fast, budget-friendly        |

### xAI (Grok)

| Model string           | Context window | Notes                      |
| ---------------------- | -------------- | -------------------------- |
| `xai/grok-4`           | 256K tokens    | Current flagship           |
| `xai/grok-4-fast`      | 2M tokens      | Fast variant               |
| `xai/grok-3`           | 131K tokens    | Previous gen, still active |
| `xai/grok-3-mini`      | 131K tokens    | Previous gen mini          |
| `xai/grok-3-mini-fast` | 131K tokens    | Previous gen fast          |

### Alibaba (Qwen)

| Model string              | Context window | Notes                      |
| ------------------------- | -------------- | -------------------------- |
| `alibaba/qwen3.6-plus`    | 1M tokens      | Latest flagship (Apr 2026) |
| `alibaba/qwen3-max`       | 262K tokens    | Flagship                   |
| `alibaba/qwen3-235b-a22b` | 131K tokens    | Large MoE model            |
| `alibaba/qwen3-32b`       | 131K tokens    | Mid-size, fast             |
| `alibaba/qwq-plus`        | 131K tokens    | Reasoning model            |

### ZhipuAI (GLM)

| Model string            | Context window | Notes               |
| ----------------------- | -------------- | ------------------- |
| `zhipuai/glm-5.1`       | 203K tokens    | Latest              |
| `zhipuai/glm-5`         | 80K tokens     | Flagship (Feb 2026) |
| `zhipuai/glm-5-turbo`   | 203K tokens    | Fast variant        |
| `zhipuai/glm-4.7`       | 203K tokens    | Previous gen        |
| `zhipuai/glm-4.7-flash` | 203K tokens    | Previous gen flash  |
| `zhipuai/glm-4.6`       | 205K tokens    | Previous gen        |
| `zhipuai/glm-4.5`       | 131K tokens    | Previous gen        |
| `zhipuai/glm-4.5-air`   | 131K tokens    | Previous gen lite   |

<Note>
  **Fallback routing.** If a model is not in Lua's approved list, the request is automatically routed through OpenRouter as a best-effort fallback. If OpenRouter cannot serve the model either, the request falls back to the default model (`google/gemini-2.5-flash`).
</Note>
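
The routing order in the note above can be pictured as a three-step decision. The sketch below is illustrative only: `routeModel`, `APPROVED_MODELS`, and the `openRouterHas` callback are hypothetical names rather than part of the Lua SDK, and the approved set is a small excerpt of the tables above.

```typescript
// Hypothetical sketch of the fallback order described above; not Lua's actual implementation.
const DEFAULT_MODEL = 'google/gemini-2.5-flash';

// Small excerpt of the approved list, for illustration only.
const APPROVED_MODELS = new Set<string>([
  'google/gemini-2.5-flash',
  'openai/gpt-5.4',
  'anthropic/claude-sonnet-4-6',
]);

type Route =
  | { via: 'direct'; model: string }
  | { via: 'openrouter'; model: string }
  | { via: 'default'; model: string };

// `openRouterHas` stands in for whatever availability check the platform performs.
function routeModel(requested: string, openRouterHas: (model: string) => boolean): Route {
  // 1. Approved models are served directly.
  if (APPROVED_MODELS.has(requested)) return { via: 'direct', model: requested };
  // 2. Anything else is tried through OpenRouter on a best-effort basis.
  if (openRouterHas(requested)) return { via: 'openrouter', model: requested };
  // 3. Otherwise the request lands on the default model.
  return { via: 'default', model: DEFAULT_MODEL };
}
```

One consequence of this order is that a mistyped model string may quietly end up on the default model, so it is worth checking strings against the tables above.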

***

## Static Model

The simplest form — one model for all requests:

```typescript theme={null}
export const agent = new LuaAgent({
  name: 'support-agent',
  persona: '...',
  model: 'openai/gpt-5.4',
  skills: [supportSkill]
});
```

**When to use:** You want one specific model for all users and channels.

***

## Dynamic Model Resolver

Use a function to select the model per request. The resolver receives the full request and can call any platform API, such as `User`, `Baskets`, `Products`, and `Data`.

```typescript theme={null}
import { LuaAgent, User } from 'lua-cli';

export const agent = new LuaAgent({
  name: 'smart-agent',
  persona: '...',
  model: async (request) => {
    const user = await User.get();
    return user.data?.isPremium ? 'openai/gpt-5.4' : 'google/gemini-2.5-flash';
  },
  skills: [mySkill]
});
```

The resolver must return a `'provider/model'` string, either directly or as a `Promise` that resolves to one.
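
As a rough illustration of that contract, the sketch below types a resolver as returning either a string or a promise of one, and checks the `'provider/model'` shape. `ResolverRequest`, `ModelResolver`, `isModelString`, and `resolveModel` are hypothetical names for this example; the real request type is defined by the SDK.

```typescript
// Hypothetical types for illustration; the SDK defines the real request shape.
interface ResolverRequest {
  channel?: string;
}

type ModelResolver = (request: ResolverRequest) => string | Promise<string>;

// True when the value looks like 'provider/model': a non-empty provider, a slash,
// and a non-empty model. The model part may itself contain slashes,
// as in 'groq/openai/gpt-oss-120b'.
function isModelString(value: string): boolean {
  const slash = value.indexOf('/');
  return slash > 0 && slash < value.length - 1;
}

// `await` passes plain values through, so this handles sync and async resolvers alike.
async function resolveModel(resolver: ModelResolver, request: ResolverRequest): Promise<string> {
  const model = await resolver(request);
  if (!isModelString(model)) {
    throw new Error(`Expected a 'provider/model' string, got: ${model}`);
  }
  return model;
}
```

A shape check like this only catches malformed strings. It does not confirm that the model exists in the tables above.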

***

## Common Patterns

### Premium vs free users

```typescript theme={null}
model: async (request) => {
  const user = await User.get();
  const tier = user.data?.subscriptionTier;

  if (tier === 'pro') return 'openai/gpt-5.4';          // flagship
  if (tier === 'standard') return 'openai/gpt-4.1';     // 1M context, balanced
  return 'google/gemini-2.5-flash';                      // default
}
```

### Channel-based selection

```typescript theme={null}
model: (request) => {
  // Use a faster, cheaper model for voice channels
  if (request.channel === 'voice') return 'google/gemini-2.5-flash-lite';
  // Use a more capable model on every other channel
  return 'openai/gpt-5.4';
}
```

### Content-based routing

```typescript theme={null}
model: async (request) => {
  // Use a large-context model when there are many baskets to reason over
  const basketCount = await Baskets.getCount();
  if (basketCount > 50) return 'openai/gpt-4.1';  // 1M context
  return 'openai/gpt-5.4-mini';
}
```
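
Resolvers like the ones above call platform APIs, and those calls can fail. One defensive option is to wrap the resolver, sketched here on the assumption that you would rather serve a known model than let a resolver error propagate. `withFallback` is a hypothetical helper, not an SDK function, and this page does not describe how Lua itself treats a resolver that throws.

```typescript
const FALLBACK_MODEL = 'google/gemini-2.5-flash';

type Resolver<R> = (request: R) => string | Promise<string>;

// Hypothetical helper: wraps a resolver so that a thrown error or rejected
// promise yields `fallback` instead of propagating.
function withFallback<R>(
  inner: Resolver<R>,
  fallback: string = FALLBACK_MODEL,
): (request: R) => Promise<string> {
  return async (request: R) => {
    try {
      return await inner(request);
    } catch (error) {
      console.warn('Model resolver failed; using fallback model', error);
      return fallback;
    }
  };
}
```

The wrapped function would be passed as `model` in place of the bare resolver, for example `model: withFallback(myResolver)`, where `myResolver` is any of the resolver functions shown in this section.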

### Environment-based

```typescript theme={null}
import { env } from 'lua-cli';

model: env('PREFERRED_MODEL') || 'google/gemini-2.5-flash'
```
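
An environment variable can hold any string, so it may be worth checking the value before passing it on. The sketch below assumes a hand-maintained allowlist; `pickModel` and `ALLOWED_MODELS` are hypothetical names, and in a real agent the raw value would come from `env('PREFERRED_MODEL')`.

```typescript
// Hypothetical allowlist; list whichever models from the tables above you actually want to permit.
const ALLOWED_MODELS = [
  'google/gemini-2.5-flash',
  'openai/gpt-5.4',
  'anthropic/claude-sonnet-4-6',
] as const;

type AllowedModel = (typeof ALLOWED_MODELS)[number];

// Returns the configured model if it is on the allowlist, otherwise the fallback.
function pickModel(
  raw: string | null | undefined,
  fallback: AllowedModel = 'google/gemini-2.5-flash',
): AllowedModel {
  const candidate = raw?.trim(); // tolerate stray whitespace in the variable
  const match = ALLOWED_MODELS.find((m) => m === candidate);
  return match ?? fallback;
}
```

In an agent config this could be used as `model: pickModel(env('PREFERRED_MODEL'))`, assuming `env` returns the variable's value or a falsy value when unset, as the `||` in the snippet above suggests.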

***

## Default Model

If you don't set `model`, your agent uses `google/gemini-2.5-flash`. This is a fast, capable model with a 1M token context window — suitable for most use cases.

***

## Reasoning Effort

Most reasoning-capable models above — across Claude, GPT/o-series, Gemini, Groq, DeepSeek, xAI, and Qwen — support a tunable reasoning effort: how much the model "thinks" before responding. Set a default for your agent via `modelSettings.reasoning`:

```typescript theme={null}
export const agent = new LuaAgent({
  name: 'my-agent',
  persona: '...',
  modelSettings: {
    reasoning: { effort: 'low' },
  },
  skills: [mySkill],
});
```

`effort` uses one normalized scale (`'off' | 'minimal' | 'low' | 'medium' | 'high' | 'max'`) across every provider. Lua translates it into each model's native setting (Claude's `thinking` budget and adaptive modes, OpenAI's `reasoningEffort`, Gemini's `thinkingConfig`, and so on) and clamps to the nearest supported tier rather than returning an error. A few models have narrower ranges: GPT's top-tier reasoning variant has a floor of medium, DeepSeek's reasoning model has no tier below high, and Qwen's reasoning is on/off only. A non-reasoning model ignores the setting entirely.
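
Clamping to the nearest supported tier can be pictured as a nearest-neighbour lookup on the ordered scale. This is a sketch of the idea, not Lua's translation layer: `clampEffort` and the supported ranges passed to it are hypothetical, and the tie-breaking rule is an arbitrary choice made for the example.

```typescript
const EFFORT_SCALE = ['off', 'minimal', 'low', 'medium', 'high', 'max'] as const;
type Effort = (typeof EFFORT_SCALE)[number];

// Returns the supported tier closest to `requested` on the ordered scale.
// On a tie, the earlier entry in `supported` wins (an assumption of this sketch).
function clampEffort(requested: Effort, supported: readonly Effort[]): Effort {
  if (supported.length === 0) throw new Error('supported must not be empty');
  const target = EFFORT_SCALE.indexOf(requested);
  let best = supported[0];
  for (const tier of supported) {
    const distance = Math.abs(EFFORT_SCALE.indexOf(tier) - target);
    const bestDistance = Math.abs(EFFORT_SCALE.indexOf(best) - target);
    if (distance < bestDistance) best = tier;
  }
  return best;
}
```

For a model whose range starts at medium, `clampEffort('low', ['medium', 'high', 'max'])` returns `'medium'`; for one with no tier below high, `clampEffort('minimal', ['high', 'max'])` returns `'high'`.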

**Leaving `effort` unset** doesn't mean "no reasoning" — Lua's platform default is **adaptive reasoning where the model supports it** (Claude's newest generations, Gemini 2.5's dynamic thinking budget) and an **explicit low effort otherwise**, biasing toward lower cost and latency on turns that don't ask for deeper thinking.
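
The default described above amounts to a small decision: an explicit setting wins, otherwise adaptive reasoning is used where available, otherwise low effort. The sketch below restates that in code. `effectiveReasoning`, `ReasoningChoice`, and the `supportsAdaptive` flag are hypothetical and do not reflect the platform's internal types.

```typescript
type EffortLevel = 'off' | 'minimal' | 'low' | 'medium' | 'high' | 'max';

type ReasoningChoice =
  | { mode: 'explicit'; effort: EffortLevel }
  | { mode: 'adaptive' };

// Hypothetical restatement of the platform default when `effort` is unset.
function effectiveReasoning(
  configured: EffortLevel | undefined,
  supportsAdaptive: boolean,
): ReasoningChoice {
  // An explicit setting always takes precedence.
  if (configured !== undefined) return { mode: 'explicit', effort: configured };
  // Unset: adaptive where the model supports it, explicit low effort otherwise.
  return supportsAdaptive ? { mode: 'adaptive' } : { mode: 'explicit', effort: 'low' };
}
```

Note that `'off'` is an explicit value here, so it is the way to ask for no reasoning. Leaving the field unset is not.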

See [LuaAgent → modelSettings](/api/luaagent) for the full field reference, including how this interacts with a per-request override.

***

## Related

<CardGroup cols={2}>
  <Card title="LuaAgent API" href="/api/luaagent" icon="code">
    Full constructor reference including the model param
  </Card>

  <Card title="Platform APIs" href="/concepts/platform-apis" icon="plug">
    APIs available inside a model resolver function
  </Card>
</CardGroup>