How AI bias can harm cybersecurity efforts

Biases exist everywhere. But they're not easy to detect in technology, a domain that boils down to ones and zeroes. In my work as a VP at IBM Security, business leaders often ask me how to deal with biases that may not be obvious in an algorithm's outcome, especially as AI gains more and more ground.

We’ve seen inappropriate and unintended bias emerge from various industries’ use of AI, including recruiting and mortgage lending. In those cases, the flawed outcomes were evident because the bias tracked distinct features of our identity: gender, race, age. But I’ve spent a lot of time thinking about areas in which we don’t even realize AI bias is present. In a complex field like cybersecurity, how do we recognize biased outcomes?

AI has become a prime security tool, with research indicating that 69% of IT executives say they can’t respond to threats without it. However, whether AI is used to improve defenses or offload security tasks, it’s essential that we can trust that the outcome it gives us is not biased. In security, AI bias is a form of risk: the more information, context, and expertise you feed your AI, the better you’re able to manage security risks and blind spots. Otherwise, various types of bias, from racial and cultural prejudices to contextual, industry-related forms of bias, can skew the AI’s judgments. To be effective, AI models must be diverse. So how do we ensure this breadth, and what can go wrong if we don’t?

Here are the three areas I believe are integral to preventing AI bias from harming security efforts.

The problem-solving algorithm

When AI models are based on false security assumptions or unconscious biases, they do more than threaten a company’s security posture. They can also cause significant business impact. AI that is tuned to classify network traffic as benign or malicious based on non-security factors can miss threats, allowing them to waltz into an organization’s network. It can also overblock network traffic, barring what might be business-critical communications.

As an example, imagine that an AI developer views one region of the world as safe, because it’s an ally nation, and another as malicious, because it’s an authoritarian regime. The developer therefore allows all the network traffic from the former to enter, while blocking all traffic from the latter. This type of aggregate bias can cause AI to overlook other security contexts that might be more important.
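To make the risk concrete, here is a minimal sketch, in Python, of the kind of geography-based rule described above. The function names, country codes, and packet fields are hypothetical assumptions for illustration, not any real product’s API.

```python
# Hypothetical sketch of a geography-biased traffic filter. Country codes,
# field names ("source_country", "risk_score"), and thresholds are assumptions
# made up for illustration.

ALLOWED_COUNTRIES = {"AA"}   # regions the developer assumes are "safe"
BLOCKED_COUNTRIES = {"ZZ"}   # regions the developer assumes are "malicious"

def biased_filter(packet: dict) -> bool:
    """Allow or block traffic on geography alone, ignoring security signals."""
    country = packet["source_country"]
    if country in ALLOWED_COUNTRIES:
        return True   # a threat from an "ally" region waltzes in
    if country in BLOCKED_COUNTRIES:
        return False  # business-critical traffic is overblocked
    return True

def contextual_filter(packet: dict, threshold: float = 0.7) -> bool:
    """Weigh an actual security signal (e.g., a threat-intel risk score)."""
    return packet["risk_score"] < threshold

# A malicious payload originating from an "allied" region slips past the
# biased rule but is caught when security context drives the decision.
packet = {"source_country": "AA", "risk_score": 0.95}
print(biased_filter(packet))      # True  -> threat admitted
print(contextual_filter(packet))  # False -> threat blocked
```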

If computer scientists design AI algorithms without influence and input from security experts, the outcomes will be flawed. If AI scientists aren’t working in lockstep with security teams to cull data, threat intelligence, and context, and then codify those insights, they may tune AI tools with some level of bias. As a result, mistrained AI-powered security systems may fail to flag what should be identified as a fraud element, a vulnerability, or a breach. Biased rules within algorithms inevitably generate biased outcomes.

The source data

Data itself can create bias when the source materials aren’t diverse. AI that’s fed biased data is going to understand only a partial view of the world and make decisions based on that narrow understanding. In cybersecurity, that means threats will be overlooked. For instance, if a spam classifier wasn’t trained on a representative set of benign emails, such as emails in various languages or with linguistic idiosyncrasies like slang, it will inevitably produce false positives. Even common, intentional misuse of grammar, spelling, or syntax can prompt a spam classifier to block benign text.
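As a rough illustration, here is a minimal sketch using scikit-learn with made-up toy data (not any real corpus) of how an unrepresentative benign training set can push a spam classifier toward false positives on informal text.

```python
# Toy illustration of biased source data: the benign examples below are all
# formal business English, so casual slang ends up looking more like the
# spam examples than like anything the model saw labeled benign.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

benign = [
    "Please find the quarterly report attached for your review.",
    "The meeting has been rescheduled to Thursday at 10 a.m.",
]
spam = [
    "WINNER!! Claim your free prize now, click here!!!",
    "Cheap meds, no prescription needed, buy now!!!",
]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(benign + spam, ["benign"] * len(benign) + ["spam"] * len(spam))

# A harmless, slangy message shares vocabulary ("free", "now") with the spam
# examples and almost none with the narrow benign corpus, so it is likely to
# be misclassified -- a false positive born of unrepresentative data.
print(model.predict(["yo!! free for lunch now? lemme know!!"]))
```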

The security influencers

AI models can suffer from tunnel vision, too. Because a cyber threat’s behavioral pattern varies based on factors like geography or business size, it’s important to train AI on the various environments a threat operates in and the various forms it takes. For instance, in a financial services environment, if you build AI to detect only identity-based issues, it won’t recognize malicious elements outside that setting. Lacking broad coverage, this AI model would be unable to identify threats outside the niche threat pattern it was taught.
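One hypothetical way to surface this kind of tunnel vision is to break detection results down by threat category and environment; the sketch below uses invented telemetry purely for illustration.

```python
# Hypothetical coverage check: recall per (threat category, environment).
# The records below are invented for illustration, not real telemetry.
from collections import defaultdict

# (threat_category, environment, was_detected)
results = [
    ("identity_fraud", "financial_services", True),
    ("identity_fraud", "financial_services", True),
    ("ransomware", "financial_services", False),
    ("ransomware", "healthcare", False),
    ("phishing", "retail", True),
    ("phishing", "healthcare", False),
]

coverage = defaultdict(lambda: [0, 0])  # key -> [detected, total]
for category, environment, detected in results:
    coverage[(category, environment)][0] += int(detected)
    coverage[(category, environment)][1] += 1

# Categories or environments with low recall mark the blind spots that more
# diverse training data and problem statements should close.
for (category, environment), (found, total) in sorted(coverage.items()):
    print(f"{category:15s} {environment:20s} recall = {found}/{total}")
```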

But when a security team consists of professionals from various backgrounds, cultures, and geographies, with varying expertise, it can give AI developers a 360-degree perspective on the behavioral patterns of security threats to feed into the AI. We must train systems against a diversity of problem statements so that a range of scenarios is represented in the AI model, which in turn helps prevent gaps in its threat detection process.

If businesses are going to make AI an integral asset in their security arsenal, it’s essential they understand that AI that is not fair and accurate cannot be effective. One way to help prevent bias within AI is to make it cognitively diverse: The computer scientists developing it, the data feeding it, and the security teams influencing it should bring multiple, diverse perspectives. Through cognitive diversity, the blind spot of one expert, one data point, or one approach can be covered by another, getting as close to no blind spots, and no bias, as possible.

So, to answer the questions I get from business leaders, you can only address biased outcomes that aren’t obvious if you know where to look. And in security, you have to look at the elements producing the outcome. That is where you monitor for bias—and that is where you correct it.


Aarti Borkar is a VP at IBM Security, where she is responsible for the vision, strategy, and execution for the business and builds ethical AI and bias-mitigation tools.
