What makes Gemini's 'Thinking' unique?

Figure 2 | Number of output tokens per second while generating (i.e. after the first chunk has been received from the API), for different models. Source: ArtificialAnalysis.ai, imported on 2025-06-15.

Gemini's 'Thinking' models are trained with Reinforcement Learning to use additional compute at inference time to produce more accurate answers[1]. These models can perform tens of thousands of forward passes during a 'thinking' stage before responding to a query[1]. Thinking is integrated with other Gemini capabilities, such as multimodal inputs and long context, and the model decides for itself how long to think before answering[1].

Users can also set a Thinking budget, which constrains the model to respond within a desired number of thinking tokens, allowing a tradeoff between performance and cost[1]. The Gemini 2.5 Thinking models are the most well-rounded reasoning models to date[1].
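As a rough illustration of how a thinking budget bounds per-query cost, the sketch below uses hypothetical token prices and a made-up helper (`max_query_cost` is not part of any Gemini SDK): the budget caps the hidden reasoning tokens, which in turn caps the worst-case spend for a query.

```python
def max_query_cost(input_tokens: int, output_tokens: int,
                   thinking_budget: int,
                   price_per_token: float = 2e-6) -> float:
    """Worst-case cost of one query under a thinking budget.

    Assumes (hypothetically) a flat per-token price and that billed
    tokens = prompt + visible answer + up to the full thinking budget
    of hidden reasoning tokens.
    """
    worst_case_tokens = input_tokens + output_tokens + thinking_budget
    return worst_case_tokens * price_per_token

# Halving the thinking budget tightens the cost ceiling while leaving
# the prompt and answer sizes unchanged.
generous = max_query_cost(500, 300, thinking_budget=8192)
tight = max_query_cost(500, 300, thinking_budget=4096)
print(f"generous budget: ${generous:.6f}, tight budget: ${tight:.6f}")
```

The same shape of calculation lets users pick the largest budget whose worst-case cost fits their per-query spend, trading accuracy headroom for a predictable bill.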