This is an edited excerpt from Trustworthy AI: A Business Guide for Navigating Trust and Ethics in AI by Beena Ammanath (Wiley, March 2022). Ammanath is executive director of the Global Deloitte AI Institute and leads Trustworthy & Ethical Technology at Deloitte. She has held leadership positions in artificial intelligence and data science at multiple companies, and is the founder of Humans For AI, an organization dedicated to increasing diversity in AI.


With AI model training, datasets are a proxy for the real world. Models are trained on one dataset and tested against another, and if the results are similar, the expectation is that the model’s performance will translate to the operational environment. What works in the lab should work consistently in the real world, but for how long? Perfect operating scenarios are rare in AI, and real-world data is messy and complex. This has led to what leading AI researcher Andrew Ng called a “proof-of-concept-to-production gap,” where models train as desired but fail once they are deployed. It is partly a problem of robustness and reliability.

When outputs are inconsistently accurate and become worse over time, the result is uncertainty. Data scientists are challenged to build provably robust, consistently accurate AI models in the face of changing real-world data. Amid that flux, an algorithm can stray from its intended behavior, with small changes in input cascading into large shifts in function.

To be sure, not all tools operate in environments prone to dramatic change, and not all AI models present the same levels of risk and consequence if they become inaccurate or undependable. The task for enterprises as they grow their AI footprint is to weigh robustness and reliability as a component of their AI strategy and align the processes, people, and technologies that can manage and correct for errors in a dynamic environment.

To that end, we start with some of the primary concepts in the area of robust and reliable AI.

Robust vs. brittle AI

The International Organization for Standardization defines AI robustness as the “ability of an AI system to maintain its level of performance under any circumstances.” In a robust model, the training error rate, testing error rate, and operational error rate are all nearly the same. And when unexpected data is encountered in operation or when the model is operating in less-than-ideal conditions, the robust AI tool continues to deliver accurate outputs.

For example, if a model can identify every image of an airplane in a training dataset and is proven to perform at a high level on testing data, then the model should be able to identify airplane pictures in any dataset, even if it has not encountered them previously. But how does the airplane-identifying model perform if a plane is pink, photographed at dusk, missing a wing, or viewed at an angle? Does its performance degrade, and if so, at what point is the model no longer viable?
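
One way to put numbers on those questions is to compare a model’s error rate on its training data, its testing data, and a labeled sample drawn from the operational environment; a robust model shows roughly the same error in all three. The sketch below is a minimal illustration, assuming a fitted scikit-learn-style classifier; the variable names and the margin are invented for the example, not prescribed by the book.

```python
# Illustrative robustness check: compare error rates across training, testing,
# and operational data. Assumes `model` is a fitted scikit-learn-style
# classifier and that labeled (X, y) samples exist for each environment.
from sklearn.metrics import accuracy_score

def error_rate(model, X, y):
    """Fraction of examples the model gets wrong."""
    return 1.0 - accuracy_score(y, model.predict(X))

train_err = error_rate(model, X_train, y_train)
test_err = error_rate(model, X_test, y_test)
ops_err = error_rate(model, X_ops, y_ops)  # labeled sample from production

print(f"train: {train_err:.3f}  test: {test_err:.3f}  operational: {ops_err:.3f}")

# Flag the model if operational error drifts well beyond the test error.
if ops_err > test_err + 0.05:
    print("Warning: operational error exceeds test error; investigate robustness.")
```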

When small changes in the environment lead to large changes in functionality and accuracy, a model is considered inelastic or “brittle.” Brittleness is a known concept in software engineering, and it is apt for AI as well. Ultimately, all AI models are brittle to some degree. The different kinds of AI tools we use are specific to their function and their application. AI does only what we train it to do.

There is another component to this. Those deploying and managing AI must weigh how changing real-world data leads to degrading model accuracy over time. In the phenomenon of “model drift,” the predictive accuracy of an AI tool decreases as the underlying variables that inform the model change. Signals and data sources that were once trusted can become unreliable. Unexpected malfunctions in a network can lead to changes in data flows.

An AI that plays chess is likely to remain robust over time, as the rules of chess and the moves the AI will encounter are predictable and static. Conversely, a natural language processing (NLP) chatbot operates in the fluid landscape of speech patterns, colloquial language, incorrect grammar and syntax, and a variety of changing factors. With machine learning, unexpected data or incorrect computations can lead a model astray, and what begins as a robust tool deteriorates to brittleness, unless corrective tactics are employed.
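
Detecting that kind of deterioration often starts with watching the input data itself. As a minimal sketch, and only for a single assumed numeric feature with an illustrative significance threshold, a two-sample Kolmogorov-Smirnov test can compare the distribution a model was trained on against a recent window of production data and flag a statistically significant shift:

```python
# Minimal drift check for one input feature: compare its training-time
# distribution against recent production data with a two-sample KS test.
import numpy as np
from scipy.stats import ks_2samp

def feature_drifted(train_values, recent_values, alpha=0.01):
    """Return True if the recent distribution differs significantly from training."""
    statistic, p_value = ks_2samp(train_values, recent_values)
    return p_value < alpha

rng = np.random.default_rng(0)
train_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)   # training-time data
recent_feature = rng.normal(loc=0.4, scale=1.0, size=1_000)  # shifted production data

if feature_drifted(train_feature, recent_feature):
    print("Drift detected: retrain or recalibrate the model.")
```

In practice, teams monitor many features and model outputs this way, and treat a drift alarm as a trigger for review or retraining rather than an automatic fix.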

Developing reliable AI

The European Commission’s Joint Research Centre notes that assessing reliability requires consideration of performance and vulnerability. Reliable AI performs as expected even when given inputs that were not included in the training data, known as out-of-distribution (OOD) inputs. These are data points that differ from the training set, and reliable AI must be able to detect when data is OOD. One challenge is that some models classify OOD inputs with high confidence, meaning the AI tool is ostensibly reliable when in fact it is not.

Take an autonomous delivery robot. Its navigation AI is optimized to find the most direct path to its destination. The training dataset has all the example data the AI needs to recognize sidewalks, roads, crosswalks, curbs, pedestrians, and every other variable—except railroad tracks intersecting a pathway. In operation, the robot identifies rail tracks in its path, and while they are OOD, the AI computes high confidence that the tracks are just a new kind of footpath, which it follows to expedite its delivery. Clearly, the AI has gone astray due to an OOD input. If the robot is not hit by a train, the outcome validates the choice (“this is a viable path”), and it may look for other rail tracks to use. The operators may be none the wiser until a train comes along.
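
A common, though imperfect, safeguard is to treat low prediction confidence as a proxy for OOD data: if a model’s top-class probability falls below a threshold, the input is flagged for review. The minimal sketch below uses an assumed probability array and an illustrative threshold, and it also shows the weakness the delivery-robot example exposes: a confidently misclassified OOD input sails straight through.

```python
# Simple OOD heuristic: flag inputs whose top-class softmax probability is low.
import numpy as np

def flag_out_of_distribution(probabilities, threshold=0.9):
    """Flag inputs whose maximum class probability falls below the threshold."""
    confidence = probabilities.max(axis=1)
    return confidence < threshold

# Three inputs, three classes (e.g., sidewalk, road, crosswalk).
probs = np.array([
    [0.95, 0.03, 0.02],  # confident, in-distribution
    [0.40, 0.35, 0.25],  # uncertain: correctly flagged for review
    [0.97, 0.02, 0.01],  # OOD input (rail tracks) misread with high confidence
])
print(flag_out_of_distribution(probs))  # [False  True False]
```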

Reliable AI is accurate in the face of any novel input. This is different from average performance. A model that offers good average performance may still yield occasional outputs with significant consequences, hampering reliability. If an AI tool is accurate 80% of the time, is it a trustworthy model? A related matter is resilience to vulnerabilities, be they natural outcomes from operation or the result of adversarial exploits.

Lessons in data reliability

The quality of a model is only as good as the training and testing data used to develop it. Without confidence in the data quality vis-à-vis its representation of the real world, the model may not reliably deliver accurate outputs in the operational environment. For the U.S. Government Accountability Office, data reliability hinges on:

  • Applicability – Does the data provide valid measures of relevant qualities?
  • Completeness – To what degree is the dataset populated across all attributes?
  • Accuracy – Does the data reflect the real world from which the dataset was gathered?

These are cross-cutting components of trustworthy data, as well as AI. Datasets need to be sufficiently curated and in some cases labeled or even supplemented with synthetic data, which can compensate for missing data points or fill in for protected information that cannot (or should not) be used in training. Data must also be scrubbed for latent bias, which skews model training and leads to undesirable outputs or predictions.
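
Completeness, in particular, lends itself to a simple automated check before training begins. The sketch below is illustrative only, with invented column names and an arbitrary 90% bar; it measures how fully each attribute of a dataset is populated and flags those that need curation or supplementation.

```python
# Illustrative completeness check: fraction of non-missing values per attribute.
import pandas as pd

df = pd.DataFrame({
    "age":      [34, 29, None, 51, 42],
    "income":   [72000, None, None, 91000, 58000],
    "zip_code": ["30301", "94105", "10001", "60601", "73301"],
})

completeness = df.notna().mean()  # share of populated values in each column
print(completeness)

incomplete = completeness[completeness < 0.90]
if not incomplete.empty:
    print("Attributes needing curation or supplementation:", list(incomplete.index))
```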
