
Multi-Provider AI

Talon is not tied to any single AI provider. Connect to OpenAI, Anthropic, Google, Groq, Ollama, or any service with an OpenAI-compatible API. Mix and match models across channels, switch on the fly, and run local models entirely offline.

| Provider | Models | Notes |
| --- | --- | --- |
| OpenAI | GPT-4o, GPT-4o-mini, o1, o3, and more | Official API |
| Anthropic | Claude Opus 4, Claude Sonnet 4, Haiku 3.5 | Direct API |
| Google Gemini | Gemini 2.0 Flash, Gemini 2.5 Pro | Via Google AI API |
| Groq | Llama 3, Mixtral, Gemma | Ultra-fast inference |
| Together AI | Llama 3, Qwen, DBRX | Open model hosting |
| Ollama | Any locally hosted model | Fully private, no API key needed |
| Any OpenAI-compatible API | Varies | Set a custom base URL |

Models are specified using a provider/model format:

openai/gpt-4o
anthropic/claude-opus-4-5
anthropic/claude-sonnet-4-20250514
google/gemini-2.0-flash
groq/llama-3.3-70b-versatile
ollama/llama3.2
ollama/mistral
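The identifier is simply the provider name and that provider's model ID joined by a slash. As a purely illustrative sketch (not Talon's actual parser), splitting on the first slash recovers the two parts:

model="anthropic/claude-sonnet-4-20250514"
provider="${model%%/*}"   # -> anthropic
name="${model#*/}"        # -> claude-sonnet-4-20250514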

Add your providers and API keys to the Talon config:

providers:
  openai:
    api_key: sk-...
  anthropic:
    api_key: sk-ant-...
  google:
    api_key: AIza...
  groq:
    api_key: gsk_...
  ollama:
    base_url: http://localhost:11434
default_model: anthropic/claude-sonnet-4-20250514

The default model is used for all channels unless overridden:

default_model: openai/gpt-4o

Or change it at runtime without restarting:

Set default model to groq/llama-3.3-70b-versatile

Different channels can use different models. Point one channel at a fast, cheap model for quick tasks and another at a more capable model for complex work:

channels:
  quick-tasks:
    model: openai/gpt-4o-mini
  deep-work:
    model: anthropic/claude-opus-4-5
  local-only:
    model: ollama/llama3.2

Or switch a channel’s model on the fly:

Use anthropic/claude-opus-4-5 for this channel
Switch to ollama/mistral

Run models completely locally via Ollama. No API key, no data leaving your machine, no usage costs.

# Install a model locally
ollama pull llama3.2
ollama pull mistral
ollama pull codellama
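If Talon can't reach Ollama, you can check that the local server is running and list the models it has installed (this assumes the default port shown in the config above):

curl http://localhost:11434/api/tags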

Then use it in Talon:

default_model: ollama/llama3.2

Any service that implements the OpenAI chat completions API works with Talon. Set a custom base URL to point at your own deployment, a proxy, or a third-party compatible service:

providers:
  my-custom:
    base_url: https://my-llm-proxy.internal/v1
    api_key: my-key

Then use it as:

my-custom/my-model-name
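Independently of Talon, you can sanity-check that the endpoint actually speaks the OpenAI chat completions protocol; the URL, key, and model name below are just the placeholders from the example above:

curl https://my-llm-proxy.internal/v1/chat/completions \
  -H "Authorization: Bearer my-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "my-model-name", "messages": [{"role": "user", "content": "Hello"}]}'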

Model changes take effect immediately — no restart required. Ask the agent to switch, update your config file, or use the config management tool. The change applies to the next message in that channel.

Switch to openai/gpt-4o for the rest of this conversation
Use groq/llama-3.3-70b-versatile — I need a faster response

Set up separate channels with different models to compare responses on the same tasks:

channels:
  compare-openai:
    model: openai/gpt-4o
  compare-claude:
    model: anthropic/claude-sonnet-4-20250514
  compare-local:
    model: ollama/llama3.2

Send the same prompt to each channel and see how models differ in speed, quality, and style.