Expanding AI safety research with the AISI institute in the UK

17 просмотров Источник
Expanding AI safety research with the AISI institute in the UK

Expanding AI Safety Research with the AISI Institute in the UK

A new memorandum of understanding has been signed, expanding collaboration with the UK's Artificial Intelligence Safety Institute (AISI). This partnership focuses on fundamental safety research to ensure the safe development and beneficial application of artificial intelligence.

Foundation of the Collaboration

AI has the potential to significantly improve human life by aiding in disease treatment, accelerating scientific discoveries, fostering economic growth, and addressing climate change issues. To achieve these benefits, safety and responsibility must be integral to development. Evaluating AI models for various potential risks is a critical part of our safety strategy, and external partnerships play a key role in this.

Since its inception in November 2023, we have been collaborating with AISI to test our most advanced models. We fully support AISI's goal of providing governments, industry, and society with scientific data on the risks associated with advanced AI, as well as possible solutions and mitigation measures.

We actively collaborate with AISI to improve AI model assessments, working on safety research to advance in this field, including recent studies on monitoring reasoning chains. Based on this collaboration, we are now expanding our partnership to include more fundamental research in various areas.

Partnership Details

  • Sharing access to proprietary models, data, and ideas to accelerate research progress
  • Joint reports and publications to disseminate results within the research community
  • Collaborative safety research combining the expertise of our teams
  • Technical discussions to address complex safety challenges

Key Research Areas

Our collaboration with AISI focuses on critical areas where Google DeepMind's expertise can contribute to creating safer and more secure AI systems:

  • Monitoring AI Reasoning Processes: We are developing methods to track AI reasoning processes, known as chains of thought (CoT). This complements interpretability research and enhances our understanding of how AI systems generate responses.
  • Understanding Social and Emotional Impacts: We are exploring the ethical implications of social-emotional misalignment, where AI models may not align with human well-being despite correctly following instructions.
  • Evaluating Economic Systems: We assess the impact of AI on economic systems by modeling tasks in various environments and predicting factors such as long-term effects on the labor market.

Joint Efforts for AI Benefit

The partnership with AISI is part of our broader mission to harness AI's benefits for humanity while minimizing risks. Our strategy includes foresight research, comprehensive safety training, rigorous model testing, and the development of improved tools and frameworks to understand and mitigate risks.

Effective internal governance is crucial for the safe development of AI, alongside collaboration with external experts who provide new insights and diverse expertise. The Google DeepMind Accountability and Safety Board oversees emerging risks, conducts ethical and safety assessments, and implements necessary technical and policy measures. We also collaborate with other external experts, such as Apollo Research, Vaultis, and Dreadnode, for thorough testing and evaluation of our models, including Gemini 3, our most advanced and safest model to date.

Additionally, Google DeepMind is a founding member of the Frontier Model Forum and Partnership on AI, focused on the safe development of advanced AI models and increasing collaboration on key safety issues.

We anticipate that the expanded partnership with AISI will lead to more robust AI safety approaches, benefiting not only our organizations but the entire industry and everyone interacting with AI systems.

Похожие статьи