Introducing Gemma Scope 2 for Analyzing Language Model Behavior
Google DeepMind has announced Gemma Scope 2, a new toolkit for interpreting language models. The tools are meant to give researchers a deeper understanding of how language models make decisions internally; despite their impressive capabilities, these models remain largely opaque.
Gemma Scope 2 supports all Gemma 3 models, from 270 million to 27 billion parameters, and lets researchers trace potential risks inside the models' "brains". It is the AI lab's largest release of interpretability tools to date: producing it involved storing around 110 petabytes of data and training more than 1 trillion parameters in total.
With Gemma Scope 2, researchers will be able to debug unexpected model behaviors and audit AI agents, which should speed up the development of safety measures against issues such as jailbreaks, hallucinations, and bias.
The new toolkit combines sparse autoencoders and transcoders, which let researchers look inside the models and see how their internal "thoughts" form and how those thoughts relate to the model's behavior. This is crucial for studying questions such as discrepancies between the reasoning a model states and its internal state.
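To make the idea of a sparse autoencoder more concrete, here is a minimal sketch in plain Python. It is not the Gemma Scope 2 API: the class name, the dimensions, and the random untrained weights are all assumptions made for illustration, and the plain ReLU is a simplification. It only shows the general shape of the technique: an activation vector is encoded into a wider vector of mostly-zero features, and then decoded back into an approximation of the original activation.

```python
import random


def relu(x):
    """Keep positive values, zero out the rest."""
    return x if x > 0.0 else 0.0


class TinySparseAutoencoder:
    """Toy sparse autoencoder: activation -> sparse features -> reconstruction.

    All weights are random and untrained (an assumption for illustration);
    a real sparse autoencoder learns them from a model's activations.
    """

    def __init__(self, d_model, d_features, seed=0):
        rng = random.Random(seed)
        self.w_enc = [[rng.gauss(0.0, 0.5) for _ in range(d_model)]
                      for _ in range(d_features)]
        # A negative encoder bias pushes weakly matching features to zero.
        self.b_enc = [-0.5] * d_features
        self.w_dec = [[rng.gauss(0.0, 0.5) for _ in range(d_features)]
                      for _ in range(d_model)]
        self.b_dec = [0.0] * d_model

    def encode(self, activation):
        """Map an activation vector to non-negative feature strengths."""
        return [relu(sum(w * a for w, a in zip(row, activation)) + b)
                for row, b in zip(self.w_enc, self.b_enc)]

    def decode(self, features):
        """Map feature strengths back to an approximate activation vector."""
        return [sum(w * f for w, f in zip(row, features)) + b
                for row, b in zip(self.w_dec, self.b_dec)]


def active_features(features):
    """Return (index, strength) pairs for the features that fired."""
    return [(i, v) for i, v in enumerate(features) if v > 0.0]


sae = TinySparseAutoencoder(d_model=4, d_features=16)
activation = [0.3, -1.2, 0.8, 0.1]  # stand-in for one internal activation
features = sae.encode(activation)
reconstruction = sae.decode(features)
print(active_features(features))
```

In a trained sparse autoencoder, each feature that fires would ideally correspond to a concept a human can interpret. A transcoder has a similar shape but, rather than reconstructing its own input, it is trained to reproduce the output of a model component from that component's input.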
Gemma Scope 2 also offers improved tools for analyzing complex internal processes, including new training techniques that surface more useful concepts and address shortcomings of the previous version. Tools for analyzing chatbot behavior will help researchers study complex multi-step behaviors, such as refusal mechanisms and the faithfulness of chains of reasoning.