Leading researchers at 59 organisations across the globe, including Leverhulme CFI, have authored a ground-breaking report proposing ten detailed, concrete steps AI companies should take to move towards more verifiably trustworthy AI development.
Focusing on tools and mechanisms that link ethical principles to practical implementation and oversight, the report complements and builds on previous work from Leverhulme CFI and others on the challenges of moving from principles to practice in AI ethics and governance.
The report summary that follows was first published on the website of our partner organisation, the Centre for the Study of Existential Risk.
Assessing the limits of ethics principles and codes of conduct – as well as the substantial impact AI development is having on communities around the globe – the report is a clarion call for AI developers worldwide to address the clear lack of trust in how AI is currently developed.
The report – Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims – recommends ten interventions to move toward trustworthy AI development:
- A coalition of stakeholders should create a task force to research options for conducting and resourcing third-party auditing of AI systems.
- Organizations developing AI should run red-teaming exercises to explore risks associated with systems they develop and share best practices and tools.
- AI developers should pilot bias and safety bounties for AI systems.
- AI developers should share more information about AI incidents including through collaborative channels such as the Partnership on AI.
- Standard setting bodies should work with academia and industry to develop audit trail requirements for safety-critical applications of AI systems.
- Organizations developing AI and funding bodies should support research into the interpretability of AI systems, with a focus on supporting risk assessment and auditing.
- AI developers should develop, share, and use suites of tools for privacy-preserving machine learning that include measures of performance against agreed standards.
- Industry and academia should work together to develop hardware security features for AI accelerators or otherwise establish best practices for the use of secure hardware (including secure enclaves on commodity hardware) in machine learning contexts.
- One or more AI labs should attempt to comprehensively account for the computing power used in the context of a single project, and report on lessons learned regarding the potential for standardizing such reporting.
- Government funding bodies should substantially increase funding for computing power resources for researchers in academia and civil society, in order to improve the ability of those researchers to verify claims made by industry.
The co-authors come from a wide range of organisations and disciplines, including the Alan Turing Institute and the Partnership on AI; Cambridge University’s Centre for the Future of Intelligence and Oxford University’s Future of Humanity Institute; Google Brain and OpenAI, leading AI research companies; the Center for Security and Emerging Technology, a US-based bipartisan think-tank; and other organisations.
The 72-page report identifies three areas (institutional, software and hardware) in which progress can be made on specific mechanisms.
It suggests that institutional mechanisms can shape incentives or constrain the behavior of the people involved in AI development. They can help clarify an organization’s goals and values, increase transparency regarding an organization’s AI development, create incentives for organizations to act responsibly, and foster the exchange of information between developers. The authors call for AI developers to explore third-party auditing, red teams, safety and bias bounties and incident sharing.
Likewise, software mechanisms allow researchers, auditors, and others to understand the internal workings of an AI system. They can also help characterize how an AI system can be expected to behave when used in a particular setting. The proposed mechanisms are audit trails, interpretability, and privacy-preserving machine learning.
Hardware mechanisms address who has what physical computing resources, and how those resources are accessed and monitored. They also concern how the resources are designed, manufactured, and tested. Hardware mechanisms aim to condition or constrain the behavior of actors who use these resources. The report emphasises the importance of secure hardware for machine learning, high-precision compute measurement, and computing power support for academia.
While the trustworthy development of AI has been highlighted in high-profile settings (e.g. the European Commission’s High-Level Expert Group on AI), a set of concrete, voluntary mechanisms that AI developers can adopt to make more verifiable claims has not yet been analysed comprehensively – until now.
Report co-author and Leverhulme CFI Associate Fellow Haydn Belfield explained: “People understand the opportunities and challenges AI and machine learning bring. Almost all AI developers want to act responsibly, safely and ethically – but it’s been unclear what they can concretely do. No longer. It’s now time for AI developers to move beyond well-meaning ethical principles, and introduce concrete mechanisms to move towards trustworthy AI development.”
Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims is available to download and read at: