
Multi-Provider AI

Talon is not tied to any single AI provider. Connect to OpenAI, Anthropic, Google, Groq, Ollama, or any service with an OpenAI-compatible API. Mix and match models across channels, switch on the fly, and run local models entirely offline.

| Provider | Models | Notes |
| --- | --- | --- |
| OpenAI | GPT-4o, GPT-4o-mini, o1, o3, and more | Official API |
| Anthropic | Claude Opus 4, Claude Sonnet 4, Haiku 3.5 | Direct API |
| Google Gemini | Gemini 2.0 Flash, Gemini 2.5 Pro | Via Google AI API |
| Groq | Llama 3, Mixtral, Gemma | Ultra-fast inference |
| Together AI | Llama 3, Qwen, DBRX | Open model hosting |
| Ollama | Any locally hosted model | Fully private, no API key needed |
| Any OpenAI-compatible API | Varies | Set a custom base URL |

Models are specified using a provider/model format:

openai/gpt-4o
anthropic/claude-opus-4-5
anthropic/claude-sonnet-4-20250514
google/gemini-2.0-flash
groq/llama-3.3-70b-versatile
ollama/llama3.2
ollama/mistral
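The identifier is simply the provider name and that provider's model ID joined by a slash. As a purely illustrative sketch (not Talon's actual parser), splitting on the first slash recovers the two parts:

model="anthropic/claude-sonnet-4-20250514"
provider="${model%%/*}"   # -> anthropic
name="${model#*/}"        # -> claude-sonnet-4-20250514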

Add your providers and API keys to the Talon config:

providers:
  openai:
    api_key: sk-...
  anthropic:
    api_key: sk-ant-...
  google:
    api_key: AIza...
  groq:
    api_key: gsk_...
  ollama:
    base_url: http://localhost:11434
default_model: anthropic/claude-sonnet-4-20250514

The default model is used for all channels unless overridden:

default_model: openai/gpt-4o

Or change it at runtime without restarting:

Set default model to groq/llama-3.3-70b-versatile

Different channels can use different models. Point one channel at a fast, cheap model for quick tasks and another at a more capable model for complex work:

channels:
  quick-tasks:
    model: openai/gpt-4o-mini
  deep-work:
    model: anthropic/claude-opus-4-5
  local-only:
    model: ollama/llama3.2

Or switch a channel’s model on the fly:

Use anthropic/claude-opus-4-5 for this channel
Switch to ollama/mistral

Run models completely locally via Ollama. No API key, no data leaving your machine, no usage costs.

# Install a model locally
ollama pull llama3.2
ollama pull mistral
ollama pull codellama
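If Talon can't reach Ollama, you can check that the local server is running and list the models it has installed (this assumes the default port shown in the config above):

curl http://localhost:11434/api/tags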

Then use it in Talon:

default_model: ollama/llama3.2

Any service that implements the OpenAI chat completions API works with Talon. Set a custom base URL to point at your own deployment, a proxy, or a third-party compatible service:

providers:
  my-custom:
    base_url: https://my-llm-proxy.internal/v1
    api_key: my-key

Then use it as:

my-custom/my-model-name
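Independently of Talon, you can sanity-check that the endpoint actually speaks the OpenAI chat completions protocol; the URL, key, and model name below are just the placeholders from the example above:

curl https://my-llm-proxy.internal/v1/chat/completions \
  -H "Authorization: Bearer my-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "my-model-name", "messages": [{"role": "user", "content": "Hello"}]}'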

Model changes take effect immediately — no restart required. Ask the agent to switch, update your config file, or use the config management tool. The change applies to the next message in that channel.

Switch to openai/gpt-4o for the rest of this conversation
Use groq/llama-3.3-70b-versatile — I need a faster response

Set up separate channels with different models to compare responses on the same tasks:

channels:
  compare-openai:
    model: openai/gpt-4o
  compare-claude:
    model: anthropic/claude-sonnet-4-20250514
  compare-local:
    model: ollama/llama3.2

Send the same prompt to each channel and see how models differ in speed, quality, and style.