How do Gemini models balance cost and capability?

The Gemini 2.X model generation spans the full Pareto frontier of model capability vs cost and shifts that frontier forward across a large variety of core capabilities, applications, and use-cases[1]. This range lets users pick the tradeoff that suits their workload and explore the boundaries of what is possible with complex agentic problem solving[1].
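To make the term "Pareto frontier" concrete, here is a minimal sketch of how one might compute it for a set of (cost, capability) options. The function name `pareto_frontier`, the option names, and all numbers are hypothetical placeholders chosen for illustration; they are not measurements of any Gemini model.

```python
def pareto_frontier(options):
    """Return names of options that no other option dominates.

    Each option is (name, cost, capability). Lower cost is better,
    higher capability is better. Option X dominates option Y when X is
    at least as good on both axes and strictly better on at least one.
    """
    frontier = []
    for name, cost, capability in options:
        dominated = any(
            (c <= cost and k >= capability) and (c < cost or k > capability)
            for _, c, k in options
        )
        if not dominated:
            frontier.append(name)
    return frontier


# Hypothetical placeholder values, NOT real model prices or benchmark scores.
options = [
    ("option_a", 1.0, 40.0),
    ("option_b", 2.0, 55.0),
    ("option_c", 3.0, 50.0),  # costs more than option_b yet is less capable
    ("option_d", 8.0, 80.0),
]

print(pareto_frontier(options))  # ['option_a', 'option_b', 'option_d']
```

Every option on the frontier is a defensible choice for some budget. A model family that covers the frontier therefore offers a sensible pick at each price point rather than a single best model.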

Each model in the series has different strengths[1]. Gemini 2.5 Pro is the most intelligent thinking model. Gemini 2.5 Flash is a hybrid reasoning model with a controllable thinking budget, which makes it useful for complex tasks where the tradeoff between quality, cost, and latency needs to be managed[1]. Gemini 2.0 Flash is a fast and cost-efficient non-thinking model for everyday tasks, and Gemini 2.0 Flash-Lite is the fastest and most cost-efficient model, built for at-scale usage[1].
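The descriptions above can be read as a rough decision rule. The sketch below encodes one possible reading of it. The `Task` fields, the `pick_model` function, and the ordering of the checks are assumptions made for illustration, not an official selection policy. The returned strings are display names rather than API identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    # Hypothetical flags describing a workload; not part of any Gemini API.
    needs_deep_reasoning: bool
    latency_sensitive: bool
    high_volume: bool


def pick_model(task: Task) -> str:
    """Map a workload to a model tier using the qualitative descriptions above."""
    if task.needs_deep_reasoning and not task.latency_sensitive:
        # Most intelligent thinking model.
        return "Gemini 2.5 Pro"
    if task.needs_deep_reasoning:
        # Hybrid reasoning with a controllable thinking budget.
        return "Gemini 2.5 Flash"
    if task.high_volume:
        # Fastest and most cost-efficient, built for at-scale usage.
        return "Gemini 2.0 Flash-Lite"
    # Fast, cost-efficient non-thinking model for everyday tasks.
    return "Gemini 2.0 Flash"


print(pick_model(Task(needs_deep_reasoning=True, latency_sensitive=True, high_volume=False)))
```

In practice the boundaries are softer than three booleans suggest. For example, the thinking budget of Gemini 2.5 Flash lets the same model trade quality against cost and latency per request.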