Yoshua Bengio, regarded as one of the “godfathers” of artificial intelligence, has likened the now-ubiquitous technology to a bear. When we teach the bear to become smart enough to escape its cage, we no longer control it. All we can do after that is try to build a better cage.
This should be our goal with the generative AI tools rapidly coming to market today, both as standalone services and in countless integrations with existing products. While this lightspeed adoption seems inevitable, we are not too late to mitigate the growing risks associated with it – but we need to act fast.
Understanding Rogue AI
While most of the AI-related cyber threats grabbing headlines today are carried out by fraudsters and organised criminals, Rogue AI is where security experts are focusing their long-term attention.
The term “Rogue AI” refers to artificial intelligence systems that act against the interests of their creators, users, or humanity in general. While present-day attacks like fraud and deepfakes are concerning, they are not the only type of AI threat we should prepare for. They will remain in a cat-and-mouse game of detection and evasion. Rogue AI is a new risk, using resources that are misaligned to one’s goal.
Rogue AI falls into three categories: malicious, accidental, or subverted. Each has different causes and potential outcomes; understanding the distinctions helps mitigate threats from Rogue AI.
Malicious Rogues are deployed by attackers to use others’ computing resources. An attacker instals the AI in another system to accomplish their own goals. The AI is doing what it was designed to do, intended for malicious purposes.
Accidental Rogues are created by human error or inherent technology limitations. Misconfigurations, failure to test models properly, and poor permission control can result in an AI programme returning bad responses (like hallucinations), having greater system privileges than intended, and mishandling sensitive data.
Subverted Rogues make use of existing AI deployments and resources. An attacker subverts an existing AI system to misuse it and accomplish their own goals. Prompt injections and jailbreaks are nascent techniques subverting LLMs. The AI system is made to operate differently than it was designed to.
Building the Cage
The threats posed by rogue AI are complex and require a security philosophy that considers all factors involved: identity, application, workload, data, device, network and more. Trend is early to market with a systemic view of this issue. Building a new cage for this AI bear is not just about finding out when things have gone wrong—it’s about leveraging security to ensure that every layer of data and computing used by AI models is safe. This is a core tenet of Zero Trust security, which is critical with this new technology.
By approaching AI security holistically, we can prepare for the next generation of threats and vulnerabilities brought about by rogues. Security measures should include encrypted, authenticated and monitored data, infrastructure and communications used by AI services.
Defence in depth is key to protecting against Rogue AI. Strict policy and controls prevent runaway resource use. Examining AI systems in use detects misalignment of AI data or resource use. Detecting anomalies from AI use remains a last line of defence when dealing with the wholly unexpected.
The promise of the AI era is only powerful if it is secure. Rogue AI is already here, but it is not as prolific yet as it will be, as we move to prevalent AI agents. By adopting a comprehensive and proactive approach to security, we can reduce instances of rogue AI.
To read more about Rouge AI: