The UK’s AI Security Institute is teaming up with international partners to lead a £15m project focused on researching AI alignment.
The Alignment Project will also feature the Canadian AI Safety Institute, Canadian Institute for Advanced Research (CIFAR), Schmidt Sciences, Amazon Web Services, Anthropic, Halcyon Futures, the Safe AI Fund, UK Research and Innovation, and the Advanced Research and Invention Agency (ARIA).
It will pioneer new work designed to ensure AI systems always work as intended – a field that is becoming increasingly important as AI systems become more advanced and autonomous.
Misalignment broadly means AI systems that act against the goals, policies and requirements of their developers. It can be intentional – i.e., a threat actor subverting an AI system to attack a target – or unintentional, which is when misalignment happens because appropriate AI guardrails haven’t been put in place.
Read more on AI safety: OWASP Launches Agentic AI Security Guidance
According to Trend Micro, examples of misalignment could include:
- Model poisoning: Attackers inject or manipulate LLM training data, leading to biased outputs, incorrect decisions and sometimes injected backdoors
- Prompt injection: Threat actors craft a malicious prompt that overcomes the built-in guardrails of an LLM, effecting a type of system jailbreak
- Accidental disclosure: Poorly designed AI systems may inadvertently access and share privileged information to users
- Runaway resource consumption: If resource consumption is not properly bounded, AI components could work on sub-problems in a self-replicating manner, potentially DOSing the system
Science, technology and innovation secretary, Peter Kyle, said advanced AI systems are already exceeding humans in some areas, making the project more urgent than ever.
“AI alignment is all geared towards making systems behave as we want them to, so they are always acting in our best interests. This is at the heart of the work the institute has been leading since day one – safeguarding our national security and ensuring the British public are protected from the most serious risks AI could pose as the technology becomes more and more advanced,” he added.
“The responsible development of AI needs a co-ordinated global approach, and this fund will help us make AI more reliable, more trustworthy, and more capable of delivering the growth, better public services, and high-skilled jobs.”