Google continues to improve its AI models. Today, we’re releasing updated versions of Gemini 2.5 Flash and Gemini 2.5 Flash-Lite, available on Google AI Studio and Vertex AI. The updates aim to deliver better response quality and efficiency. In this post, we’ll walk you through the improvements, the new features, and what they mean for developers building with Gemini.
What’s New in Gemini 2.5 Flash and Flash-Lite?
The latest versions of Gemini 2.5 Flash and Flash-Lite come with key updates that enhance both quality and speed, alongside a reduction in costs. Let’s break it down:
Key Improvements in Gemini 2.5 Flash and Flash-Lite
- Intelligence vs. End-to-End Response Time: Both models improve in overall intelligence and in end-to-end response time, so responses are both better and faster.
- Output Token Efficiency: Gemini 2.5 Flash-Lite now produces 50% fewer output tokens and Gemini 2.5 Flash produces 24% fewer. Fewer output tokens mean lower costs, lower overhead, and better scalability.
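
As a rough illustration of how these reductions could translate into spend, here is a minimal sketch. Only the 50% and 24% figures come from the list above. The baseline token volume and the per-million-token price are hypothetical placeholders, not published rates, and the sketch assumes output tokens are billed at a flat rate.

```python
# Rough estimate of output-token spend before and after the update.
# ASSUMPTIONS: BASELINE_TOKENS and PRICE_PER_MILLION are made-up placeholders;
# only the 50% / 24% reductions are taken from the post.

def output_cost(tokens: int, price_per_million: float) -> float:
    """Cost of `tokens` output tokens at a flat per-million-token price."""
    return tokens / 1_000_000 * price_per_million


def tokens_after_reduction(baseline_tokens: int, reduction: float) -> int:
    """Output tokens remaining after a fractional reduction (0.50 = 50% fewer)."""
    return round(baseline_tokens * (1 - reduction))


BASELINE_TOKENS = 10_000_000   # hypothetical monthly output volume
PRICE_PER_MILLION = 1.00       # hypothetical price, in any currency unit

REDUCTIONS = {"Gemini 2.5 Flash-Lite": 0.50, "Gemini 2.5 Flash": 0.24}

for model_name, reduction in REDUCTIONS.items():
    cost_before = output_cost(BASELINE_TOKENS, PRICE_PER_MILLION)
    remaining = tokens_after_reduction(BASELINE_TOKENS, reduction)
    cost_after = output_cost(remaining, PRICE_PER_MILLION)
    print(f"{model_name}: {cost_before:.2f} -> {cost_after:.2f}")
```

Real savings depend on your own prompts and on actual pricing, so treat the numbers as a template for your own estimate rather than a forecast.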

Features of the Updated Gemini 2.5 Flash-Lite
The updated Gemini 2.5 Flash-Lite has been optimized for three core themes:
- Better Instruction Following: The model has become much more adept at following complex instructions and system prompts, making it easier to integrate into a variety of applications.
- Reduced Verbosity: The model now delivers more concise answers, which helps reduce token usage and lowers latency for high-throughput applications.
- Improved Multimodal & Translation Capabilities: With enhanced audio transcription, image understanding, and translation accuracy, Gemini 2.5 Flash-Lite is better equipped for tasks that require diverse media processing.
Try it now: You can start testing the Gemini 2.5 Flash-Lite preview with the model string gemini-2.5-flash-lite-preview-09-2025.
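
The model string goes wherever your client names the model. The sketch below builds, but does not send, a request using only the Python standard library. The endpoint path, header name, and JSON body reflect our understanding of the Gemini API’s REST generateContent method and should be treated as assumptions to check against the official API reference. build_request is a hypothetical helper name, and YOUR_API_KEY is a placeholder.

```python
import json
import urllib.request

# ASSUMPTION: REST endpoint shape for the Gemini API generateContent method.
API_BASE = "https://generativelanguage.googleapis.com/v1beta"

# Preview model string quoted in the post.
FLASH_LITE_PREVIEW = "gemini-2.5-flash-lite-preview-09-2025"


def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for `model`."""
    url = f"{API_BASE}/models/{model}:generateContent"
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


request = build_request(
    FLASH_LITE_PREVIEW,
    "Translate 'good morning' into French.",
    "YOUR_API_KEY",  # placeholder; supply a real key before sending
)
print(request.get_method(), request.full_url)

# To actually call the API, pass `request` to urllib.request.urlopen()
# with a valid key and parse the JSON response.
```

Switching to another model is then a one-line change to the model string passed to build_request.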
Features of the Updated Gemini 2.5 Flash
The latest Gemini 2.5 Flash version includes improvements based on direct feedback from users, with key upgrades in:
- Better Agentic Tool Use: The model now performs better in multi-step, agentic applications. Its score on the SWE-Bench Verified benchmark rose by about 5 percentage points (from 48.9% to 54%).
- Increased Efficiency: The updated model uses fewer tokens and provides higher-quality outputs, reducing latency and costs while still delivering exceptional performance.
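
For readers comparing benchmark numbers, the short calculation below separates the absolute gain (in percentage points) from the relative gain for the SWE-Bench Verified scores quoted above. It uses only the two scores from this post.

```python
# Absolute vs. relative gain for the SWE-Bench Verified scores in the post.
score_before = 48.9   # percent, previous release
score_after = 54.0    # percent, updated release

point_gain = score_after - score_before          # percentage points
relative_gain = point_gain / score_before * 100  # percent of the old score

print(f"{point_gain:.1f} points, {relative_gain:.1f}% relative")
# prints "5.1 points, 10.4% relative"
```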

One early tester, Yichao ‘Peak’ Ji, Co-Founder and Chief Scientist at Manus, shared their excitement: “The new Gemini 2.5 Flash model offers a remarkable blend of speed and intelligence. We’ve seen a 15% leap in performance for long-horizon agentic tasks. Its outstanding cost-efficiency enables Manus to scale to unprecedented levels.”
Try it now: You can begin testing the Gemini 2.5 Flash preview with the model string gemini-2.5-flash-preview-09-2025.
How to Build with Gemini Models
Over the past year, we’ve focused on providing preview versions of our models, which allows developers to test the latest improvements, give feedback, and prepare for production-ready experiences. While the Gemini models are still evolving, these releases will play a critical role in shaping the next stable versions.
New Model Aliases: Simplifying Access to the Latest Models
To make it easier to work with the most up-to-date Gemini versions, we’re introducing a -latest alias for each model family. Each alias always points to the most recent release in its family, so you can experiment with new features without updating your code for every new version.
- For Gemini Flash models: gemini-flash-latest
- For Gemini Flash-Lite models: gemini-flash-lite-latest
You can now access the latest models without worrying about keeping track of long model strings.
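
Because an alias can start pointing at a different release, it can help to record which kind of model string a request used. The sketch below classifies a string by its name. The rules are a heuristic inferred from the strings that appear in this post, not an official naming scheme, and classify_model_string is a hypothetical helper.

```python
def classify_model_string(model: str) -> str:
    """Return 'alias', 'preview', or 'stable' using naming heuristics.

    ASSUMPTION: the patterns are inferred from the model strings in the
    post ('-latest' suffix, '-preview-' infix); they are not an official rule.
    """
    if model.endswith("-latest"):
        return "alias"      # moving pointer to the most recent release
    if "-preview-" in model:
        return "preview"    # dated preview release
    return "stable"         # plain family string


MODEL_STRINGS = (
    "gemini-flash-latest",
    "gemini-flash-lite-latest",
    "gemini-2.5-flash-preview-09-2025",
    "gemini-2.5-flash-lite",
)

for model in MODEL_STRINGS:
    print(f"{model}: {classify_model_string(model)}")
```

A classification like this could feed a log field or a startup warning, so that a change in behavior can be traced back to an alias that moved.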
Stability and Updates
For applications that need more stability, we recommend using the gemini-2.5-flash and gemini-2.5-flash-lite model strings. For the -latest aliases, we’ll always give two weeks’ notice by email before we update or deprecate the specific version behind an alias, so you have time to test and adapt your applications.
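
One way to apply this guidance is to use the stable strings in production and the -latest aliases everywhere else. The following is a minimal sketch under that assumption. The APP_ENV variable name, the "production" value, and the choose_model helper are hypothetical, while the four model strings are the ones named in this post.

```python
import os

# Model strings named in the post, keyed by family.
STABLE = {"flash": "gemini-2.5-flash", "flash-lite": "gemini-2.5-flash-lite"}
LATEST = {"flash": "gemini-flash-latest", "flash-lite": "gemini-flash-lite-latest"}


def choose_model(family: str, environment: str) -> str:
    """Use the stable string in production and the moving alias elsewhere."""
    table = STABLE if environment == "production" else LATEST
    return table[family]


# ASSUMPTION: APP_ENV is a hypothetical environment variable. Defaulting to
# "production" means an unset variable falls back to the stable strings.
environment = os.environ.get("APP_ENV", "production")
print(choose_model("flash", environment))
```

Defaulting to the stable string is a deliberately conservative choice: a misconfigured deployment stays on a stable model string instead of silently following an alias.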

The Road Ahead: Pushing the Frontier of AI
The release of Gemini 2.5 Flash and Flash-Lite represents another leap forward in the evolution of AI. These models are designed to be faster, more efficient, and smarter, enabling developers to build even more powerful and scalable AI solutions.
We’re excited to see how you’ll use these updates to expand the possibilities of AI-powered applications. Stay tuned for further updates as we continue to push the boundaries of what’s possible with Gemini!