Model Selection

Overview

By default, every Lua agent uses Google Gemini 2.5 Flash. The model property on LuaAgent lets you choose a different model or select one dynamically based on the request context.

export const agent = new LuaAgent({
  name: 'my-agent',
  persona: '...',
  model: 'openai/gpt-5.4',  // ← add this
  skills: [mySkill]
});

Lua manages the API credentials. You don’t need to configure any API keys or provider accounts; Lua handles all LLM infrastructure on your behalf. Support for user-provided API keys (Bring Your Own Key) is coming in a future release.

Available Models

Google (Vertex AI)

Model string	Context window	Notes
`google/gemini-2.5-flash`	1M tokens	Default — fast reasoning, best price-performance
`google/gemini-2.5-pro`	1M tokens	Advanced reasoning for complex analysis
`google/gemini-2.5-flash-lite`	1M tokens	Budget-friendly in the 2.5 family
`google/gemini-3.1-pro-preview`	1M tokens	Most capable Gemini (preview)
`google/gemini-3-flash-preview`	1M tokens	Next-gen frontier flash (preview)
`google/gemini-3.1-flash-lite-preview`	1M tokens	Next-gen lite (preview)

OpenAI

Model string	Context window	Notes
`openai/gpt-5.4`	1.1M tokens	Current flagship
`openai/gpt-5.4-mini`	400K tokens	Fast flagship variant
`openai/gpt-5.4-nano`	400K tokens	Ultra-low-latency
`openai/gpt-5`	400K tokens	Strong general-purpose
`openai/gpt-5-mini`	400K tokens	Compact GPT-5
`openai/gpt-4.1`	1M tokens	Stable, previous gen
`openai/gpt-4.1-mini`	1M tokens	Budget variant of 4.1
`openai/gpt-4.1-nano`	1M tokens	Fastest 4.1 variant
`openai/o3`	200K tokens	Most powerful reasoning
`openai/o3-mini`	200K tokens	Fast reasoning
`openai/o3-pro`	200K tokens	Premium reasoning
`openai/o4-mini`	200K tokens	Latest cost-efficient reasoning

Anthropic

Model string	Context window	Notes
`anthropic/claude-opus-4-6`	1M tokens	Most intelligent
`anthropic/claude-sonnet-4-6`	1M tokens	Best speed/intelligence balance
`anthropic/claude-haiku-4-5`	200K tokens	Fastest model
`anthropic/claude-opus-4-5`	200K tokens	Previous gen, still active
`anthropic/claude-sonnet-4-5`	200K tokens	Previous gen, still active
`anthropic/claude-opus-4-1`	200K tokens	Older gen, still active

DeepSeek

Model string	Context window	Notes
`deepseek/deepseek-chat`	131K tokens	DeepSeek-V3, general purpose
`deepseek/deepseek-reasoner`	128K tokens	DeepSeek-R1, reasoning

Groq

Model string	Context window	Notes
`groq/llama-3.3-70b-versatile`	131K tokens	Production, tool use capable
`groq/meta-llama/llama-4-scout-17b-16e-instruct`	131K tokens	Latest Llama 4
`groq/openai/gpt-oss-120b`	131K tokens	GPT-OSS on Groq LPU hardware
`groq/groq/compound`	131K tokens	Groq Compound agentic system
`groq/llama-3.1-8b-instant`	131K tokens	Fast, budget-friendly

xAI (Grok)

Model string	Context window	Notes
`xai/grok-4`	256K tokens	Current flagship
`xai/grok-4-fast`	2M tokens	Fast variant
`xai/grok-3`	131K tokens	Previous gen, still active
`xai/grok-3-mini`	131K tokens	Previous gen mini
`xai/grok-3-mini-fast`	131K tokens	Previous gen fast

Alibaba (Qwen)

Model string	Context window	Notes
`alibaba/qwen3.6-plus`	1M tokens	Latest flagship (Apr 2026)
`alibaba/qwen3-max`	262K tokens	Flagship
`alibaba/qwen3-235b-a22b`	131K tokens	Large MoE model
`alibaba/qwen3-32b`	131K tokens	Mid-size, fast
`alibaba/qwq-plus`	131K tokens	Reasoning model

ZhipuAI (GLM)

Model string	Context window	Notes
`zhipuai/glm-5.1`	203K tokens	Latest
`zhipuai/glm-5`	80K tokens	Flagship (Feb 2026)
`zhipuai/glm-5-turbo`	203K tokens	Fast variant
`zhipuai/glm-4.7`	203K tokens	Previous gen
`zhipuai/glm-4.7-flash`	203K tokens	Previous gen flash
`zhipuai/glm-4.6`	205K tokens	Previous gen
`zhipuai/glm-4.5`	131K tokens	Previous gen
`zhipuai/glm-4.5-air`	131K tokens	Previous gen lite

Fallback routing. If a model is not in Lua’s approved list, the request is automatically routed through OpenRouter as a best-effort fallback. If that also isn’t available, the request falls back to the default model (google/gemini-2.5-flash).
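The routing order described above can be pictured as a small decision function. This is only an illustrative sketch: `routeModel`, the `Route` type, and the `openRouterHas` callback are hypothetical names, and the approved list here is a three-entry sample taken from the tables, not Lua's real list or implementation.

```typescript
// Illustrative only: a hypothetical model of the fallback order, not Lua's code.
type Route =
  | { via: 'lua'; model: string }
  | { via: 'openrouter'; model: string }
  | { via: 'default'; model: string };

const DEFAULT_MODEL = 'google/gemini-2.5-flash';

// Sample subset of the approved list from the tables above.
const APPROVED_SAMPLE: readonly string[] = [
  'google/gemini-2.5-flash',
  'openai/gpt-5.4',
  'anthropic/claude-sonnet-4-6',
];

function routeModel(
  requested: string,
  openRouterHas: (model: string) => boolean, // hypothetical availability check
): Route {
  // 1. Approved models are served directly.
  if (APPROVED_SAMPLE.indexOf(requested) !== -1) {
    return { via: 'lua', model: requested };
  }
  // 2. Anything else is tried through OpenRouter on a best-effort basis.
  if (openRouterHas(requested)) {
    return { via: 'openrouter', model: requested };
  }
  // 3. If that is unavailable too, the default model answers the request.
  return { via: 'default', model: DEFAULT_MODEL };
}
```

The practical consequence is that a mistyped model string may not fail loudly; under this reading it can end up answered by the default model.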

Static Model

The simplest form — one model for all requests:

export const agent = new LuaAgent({
  name: 'support-agent',
  persona: '...',
  model: 'openai/gpt-5.4',
  skills: [supportSkill]
});

When to use: you want one specific model for all users and channels.

Dynamic Model Resolver

Use a function to select the model per request. The resolver receives the full request and can call any of the platform APIs (User, Baskets, Products, Data, and more).

export const agent = new LuaAgent({
  name: 'smart-agent',
  persona: '...',
  model: async (request) => {
    const user = await User.get();
    return user.data?.isPremium ? 'openai/gpt-5.4' : 'google/gemini-2.5-flash';
  },
  skills: [mySkill]
});

The resolver must return a 'provider/model' string, either directly or as a Promise that resolves to one.
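Some model strings in the tables above, such as the Groq-hosted ones, contain more than one slash, so a 'provider/model' string is most safely read by splitting on the first slash only. The sketch below shows one way to sanity-check a resolver's return value in your own tests. `parseModelString` and `ParsedModel` are hypothetical names, not part of lua-cli, and the first-slash rule is an assumption inferred from the tables.

```typescript
// Hypothetical helper, not part of lua-cli.
interface ParsedModel {
  provider: string;
  model: string;
}

function parseModelString(value: string): ParsedModel {
  const slash = value.indexOf('/');
  // Require a non-empty provider before the first slash and a non-empty model after it.
  if (slash <= 0 || slash === value.length - 1) {
    throw new Error(`Expected 'provider/model', got '${value}'`);
  }
  return {
    provider: value.slice(0, slash),
    model: value.slice(slash + 1), // may itself contain slashes
  };
}

const parsed = parseModelString('groq/meta-llama/llama-4-scout-17b-16e-instruct');
console.log(parsed.provider, parsed.model);
```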

Common Patterns

Premium vs free users

model: async (request) => {
  const user = await User.get();
  const tier = user.data?.subscriptionTier;

  if (tier === 'pro') return 'openai/gpt-5.4';          // flagship
  if (tier === 'standard') return 'openai/gpt-4.1';     // 1M context, balanced
  return 'google/gemini-2.5-flash';                      // default
}

Channel-based selection

model: (request) => {
  // Use a faster/cheaper model for voice channels
  if (request.channel === 'voice') return 'google/gemini-2.5-flash-lite';
  // Use a more capable model for complex web requests
  return 'openai/gpt-5.4';
}
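Because a synchronous resolver like the one above is a plain function, it can be exercised with stub requests before it is wired into an agent. In this sketch, `ResolverRequest` is a hypothetical minimal shape (only the `channel` field comes from the example above), `pickModelByChannel` is an illustrative name, and 'web' is an assumed channel value.

```typescript
// Hypothetical minimal request shape; the real request object carries more fields.
interface ResolverRequest {
  channel: string;
}

// Same logic as the channel-based resolver above, extracted into a named function.
const pickModelByChannel = (request: ResolverRequest): string => {
  if (request.channel === 'voice') return 'google/gemini-2.5-flash-lite';
  return 'openai/gpt-5.4';
};

// Stub requests; 'web' is an assumed channel name used only for illustration.
const voiceModel = pickModelByChannel({ channel: 'voice' });
const webModel = pickModelByChannel({ channel: 'web' });
console.log(voiceModel, webModel);
```

Extracting the resolver this way keeps the selection rules testable on their own; whether the named function can be passed directly as `model` depends on the request type LuaAgent expects.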

Content-based routing

model: async (request) => {
  // Use a model with large context for document-heavy workflows
  const basketCount = await Baskets.getCount();
  if (basketCount > 50) return 'openai/gpt-4.1';  // 1M context
  return 'openai/gpt-5.4-mini';
}

Environment-based

import { env } from 'lua-cli';

model: env('PREFERRED_MODEL') || 'google/gemini-2.5-flash'
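An environment value can be empty or mistyped, so it may be worth checking it against a list you maintain before using it. The following is a minimal sketch under stated assumptions: `chooseConfiguredModel`, `ALLOWED_MODELS`, and `FALLBACK_MODEL` are hypothetical names, the allow-list is a sample rather than Lua's approved list, and the configured value is passed in as a parameter so the sketch runs without lua-cli. In an agent it would presumably come from `env('PREFERRED_MODEL')`.

```typescript
// Hypothetical helper; the allow-list is a sample, not Lua's approved list.
const ALLOWED_MODELS: readonly string[] = [
  'google/gemini-2.5-flash',
  'google/gemini-2.5-pro',
  'openai/gpt-5.4',
];
const FALLBACK_MODEL = 'google/gemini-2.5-flash';

function chooseConfiguredModel(configured: string | undefined): string {
  // Tolerate stray whitespace from .env files.
  const value = configured ? configured.trim() : '';
  // Only accept values that appear in the allow-list.
  if (value !== '' && ALLOWED_MODELS.indexOf(value) !== -1) return value;
  return FALLBACK_MODEL;
}

console.log(chooseConfiguredModel(' openai/gpt-5.4 '));
console.log(chooseConfiguredModel('openai/gpt5'));
```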

Default Model

If you don’t set model, your agent uses google/gemini-2.5-flash. This is a fast, capable model with a 1M token context window — suitable for most use cases.

Reasoning Effort

Most reasoning-capable models above — across Claude, GPT/o-series, Gemini, Groq, DeepSeek, xAI, and Qwen — support a tunable reasoning effort: how much the model “thinks” before responding. Set a default for your agent via modelSettings.reasoning:

export const agent = new LuaAgent({
  name: 'my-agent',
  persona: '...',
  modelSettings: {
    reasoning: { effort: 'low' },
  },
  skills: [mySkill],
});

effort uses one normalized scale ('off' | 'minimal' | 'low' | 'medium' | 'high' | 'max') across every provider. Lua translates it into each model’s native dialect (Claude’s thinking budget/adaptive modes, OpenAI’s reasoningEffort, Gemini’s thinkingConfig, etc.), clamping to the nearest supported tier rather than erroring.

A few models have narrower ranges: GPT’s top-tier reasoning variant floors at medium, DeepSeek’s reasoning model has no tier below high, and Qwen’s reasoning is on/off only. A non-reasoning model ignores the setting entirely.

Leaving effort unset doesn’t mean “no reasoning”. Lua’s platform default is adaptive reasoning where the model supports it (Claude’s newest generations, Gemini 2.5’s dynamic thinking budget) and an explicit low effort otherwise, which biases toward lower cost and latency on turns that don’t ask for deeper thinking.

See LuaAgent → modelSettings for the full field reference, including how this interacts with a per-request override.
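To make "clamping to the nearest supported tier" concrete, here is a minimal sketch of how such a clamp could work. It is not Lua's implementation: `clampEffort` is a hypothetical name, the supported-tier lists in the usage lines are assumptions based on the ranges described above, and the tie-breaking rule is an arbitrary choice for this sketch.

```typescript
// Illustrative sketch of "clamp to the nearest supported tier"; not Lua's implementation.
const EFFORT_SCALE = ['off', 'minimal', 'low', 'medium', 'high', 'max'] as const;
type Effort = (typeof EFFORT_SCALE)[number];

function clampEffort(requested: Effort, supported: readonly Effort[]): Effort {
  if (supported.length === 0) {
    throw new Error('supported must contain at least one tier');
  }
  const target = EFFORT_SCALE.indexOf(requested);
  let best = supported[0];
  let bestDistance = Math.abs(EFFORT_SCALE.indexOf(best) - target);
  for (const tier of supported) {
    const distance = Math.abs(EFFORT_SCALE.indexOf(tier) - target);
    // Strict "<" keeps the earlier entry on ties (an arbitrary choice for this sketch).
    if (distance < bestDistance) {
      best = tier;
      bestDistance = distance;
    }
  }
  return best;
}

// Assumed range for a model that floors at medium.
console.log(clampEffort('low', ['medium', 'high', 'max']));
// Assumed range for a model with no tier below high.
console.log(clampEffort('minimal', ['high', 'max']));
```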

Related

LuaAgent API

Full constructor reference, including the model param

Platform APIs

APIs available inside a model resolver function
