Models

Models act as the brain of an AI agent. Different models have different strengths and tradeoffs related to task complexity, latency, and cost. As we’ll see in the next section on Orchestration, you may want to use a variety of models for different tasks in the workflow.

Not every task requires the smartest model—a simple retrieval or intent classification task may be
handled by a smaller, faster model, while harder tasks like deciding whether to approve a refund
may benefit from a more capable model.
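
One way to picture this is a small lookup that sends each task type to a model tier. The sketch below is illustrative only: the model names, the task categories, and the `pick_model` helper are hypothetical placeholders rather than identifiers from any real API. It assumes that unknown tasks should fall back to the more capable model.

```python
# Hypothetical model tiers. These are placeholder strings, not real model names.
SMALL_MODEL = "small-fast-model"
LARGE_MODEL = "large-capable-model"

# Hypothetical task categories, each mapped to the tier assumed to be sufficient.
MODEL_BY_TASK = {
    "retrieval": SMALL_MODEL,
    "intent_classification": SMALL_MODEL,
    "refund_decision": LARGE_MODEL,
}


def pick_model(task_type: str) -> str:
    """Return the model tier for a task, defaulting to the most capable one."""
    return MODEL_BY_TASK.get(task_type, LARGE_MODEL)


if __name__ == "__main__":
    for task in ("retrieval", "refund_decision", "unlisted_task"):
        print(task, "->", pick_model(task))
```

Defaulting to the larger model is a conservative choice for this sketch: a task that has not been evaluated yet is treated as a hard one until evals show otherwise.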

An approach that works well is to build your agent prototype with the most capable model for
every task to establish a performance baseline. From there, try swapping in smaller models to see
if they still achieve acceptable results. This way, you don’t prematurely limit the agent’s abilities,
and you can diagnose where smaller models succeed or fail.
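
The baseline-then-swap approach can be sketched as a tiny eval harness. This is a minimal illustration under stated assumptions, not a prescribed implementation: a "model" here is any Python callable from input text to output text (a stand-in for a real API call), scoring is exact match, and the `accuracy` and `compare_to_baseline` helpers and the `tolerance` parameter are hypothetical names introduced for this example.

```python
from typing import Callable, Dict, List, Tuple

# An eval case pairs an input with its expected output.
EvalCase = Tuple[str, str]
# Assumption: a model is any callable from input text to output text.
Model = Callable[[str], str]


def accuracy(model: Model, cases: List[EvalCase]) -> float:
    """Fraction of eval cases the model answers with an exact match."""
    if not cases:
        raise ValueError("need at least one eval case")
    correct = sum(1 for prompt, expected in cases if model(prompt) == expected)
    return correct / len(cases)


def compare_to_baseline(
    baseline: Model,
    candidates: Dict[str, Model],
    cases: List[EvalCase],
    tolerance: float = 0.0,
) -> Dict[str, Tuple[float, bool]]:
    """Score each smaller candidate and flag whether it stays within
    `tolerance` of the baseline accuracy set by the most capable model."""
    baseline_score = accuracy(baseline, cases)
    results = {}
    for name, model in candidates.items():
        score = accuracy(model, cases)
        results[name] = (score, score >= baseline_score - tolerance)
    return results


if __name__ == "__main__":
    # Toy stand-ins for real models, used only to exercise the harness.
    answers = {"q1": "a", "q2": "b", "q3": "c"}
    cases = list(answers.items())
    large = lambda prompt: answers[prompt]                      # always right
    small = lambda prompt: "a" if prompt == "q1" else "wrong"   # right once
    print(compare_to_baseline(large, {"small": small}, cases, tolerance=0.05))
```

Running the same cases per task, rather than across the whole workflow at once, is what lets you see where a smaller model succeeds or fails.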

In summary, the principles for choosing a model are simple:

Principles
01 Set up evals to establish a performance baseline
02 Focus on meeting your accuracy target with the best models available
03 Optimize for cost and latency by replacing larger models with smaller ones where possible