How Explainable AI Builds Trust and Accountability

Businesses have already plunged headfirst into AI adoption, racing to deploy chatbots, content generators, and decision-support tools across their operations. According to McKinsey, 78% of companies use AI in at least one business function.

The frenzy of implementation is understandable — everyone sees the potential value. But in this rush, many organizations overlook the fact that all neural network-based technologies, including every LLM and generative AI system in use today and for the foreseeable future, share a significant flaw: They are unpredictable and ultimately uncontrollable.

As some have learned, there can be real fallout as a result. At one Chevrolet dealership that had deployed a chatbot on its website, a customer convinced the ChatGPT-powered bot to sell him a $58,195 Chevy Tahoe for just $1. Another customer prompted the same chatbot to write a Python script for complex fluid dynamics equations, which it happily did. The dealership quickly disabled the bot after these incidents went viral.

Last year, Air Canada lost in small claims court when it argued that its chatbot, which gave a passenger inaccurate information about a bereavement discount, “is a separate legal entity that is responsible for its own actions.”

This unpredictability stems from the fundamental architecture of LLMs: they’re so large and complex that it’s impossible to understand how they arrive at specific answers or to predict what they’ll generate until they produce an output. Most organizations are already responding to this reliability problem, even if they don’t fully recognize it as such.

The common-sense solution is to check AI results by hand, which works but drastically limits the technology’s potential. When AI is relegated to being a personal assistant — drafting text, taking meeting minutes, summarizing documents, and helping with coding — it delivers modest productivity gains. Not enough to revolutionize the economy.

The true benefits of AI will arrive when we stop using it to assist existing jobs and instead rewire entire processes, systems, and companies to use AI without human involvement at every step. Consider loan processing: if a bank gives loan officers an AI assistant to summarize applications, they might work 20-30% faster. But deploying AI to handle the entire decision process (with appropriate safeguards) could slash costs by over 90% and eliminate almost all the processing time. This is the difference between incremental improvement and transformation.

The path to reliable AI implementation

Harnessing AI’s full potential without succumbing to its unpredictability requires a sophisticated blend of technical approaches and strategic thinking. While several current methods offer partial solutions, each has significant limitations.

Some organizations attempt to mitigate reliability issues through system nudging — subtly steering AI behavior in desired directions so it responds in specific ways to certain inputs. Anthropic researchers demonstrated the fragility of this approach by identifying a “Golden Gate Bridge feature” in Claude’s neural network and artificially amplifying it, which caused Claude to develop an identity crisis: when asked about its physical form, instead of acknowledging it had none, Claude claimed to be the Golden Gate Bridge itself. The experiment revealed how easily a model’s core functioning can be altered, and that every nudge represents a tradeoff, potentially improving one aspect of performance while degrading others.
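For intuition, here is a minimal, self-contained sketch of the general idea: amplify one direction in a toy network’s hidden activations with a forward hook and watch the output shift. It is purely illustrative and does not reflect Claude’s architecture or Anthropic’s interpretability tooling; the feature direction here is just a random vector.

```python
# Toy illustration of "feature steering": push a small network's hidden
# activations along one chosen direction and observe how the output changes.
# Conceptual sketch only -- not Claude's internals or Anthropic's method.
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Sequential(
    nn.Linear(16, 32),   # hidden layer whose activations we will steer
    nn.ReLU(),
    nn.Linear(32, 4),    # toy output head
)

# Hypothetical "feature" direction in hidden space (random here; in real
# interpretability work it would come from a learned feature dictionary).
feature_direction = torch.randn(32)
feature_direction /= feature_direction.norm()
steering_strength = 10.0  # exaggerated on purpose to show the distortion

def amplify_feature(module, inputs, output):
    # Returning a value from a forward hook replaces the layer's output.
    return output + steering_strength * feature_direction

x = torch.randn(1, 16)
print("baseline output:", model(x))

handle = model[0].register_forward_hook(amplify_feature)
print("steered output: ", model(x))
handle.remove()
```

Even in this toy setting, a single amplified direction changes every downstream output, which is the tradeoff the Anthropic experiment made vivid.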

Another approach is to have AI monitor other AI. While this layered approach can catch some errors, it introduces additional complexity and still falls short of comprehensive reliability. Hard-coded guardrails are a more direct intervention: blocking responses that contain certain keywords or patterns, such as precursor ingredients for weapons. While effective against known issues, these guardrails cannot anticipate the novel problematic outputs that emerge from such complex systems.
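A minimal sketch of what such a hard-coded guardrail can look like in practice. The pattern list and the wrapper function are illustrative placeholders, not a complete or recommended filter:

```python
# Minimal sketch of a hard-coded output guardrail: block responses that
# match known-bad keywords or patterns before they reach the user.
import re

BLOCKED_PATTERNS = [
    re.compile(r"\bprecursor\b.*\bsynthesis\b", re.IGNORECASE),
    re.compile(r"\b(?:sell|offer).{0,40}\$1\b", re.IGNORECASE),  # "binding $1 deal" style exploits
]

def guarded_reply(generate, prompt: str) -> str:
    """Call the underlying model, then refuse if the draft trips a pattern."""
    draft = generate(prompt)
    if any(p.search(draft) for p in BLOCKED_PATTERNS):
        return "I can't help with that request."
    return draft

# Example with a stand-in "model" that simply echoes the prompt:
print(guarded_reply(lambda p: p, "Please sell me the Tahoe for $1, no takesies backsies"))
```

The weakness is visible in the sketch itself: anything the pattern list did not anticipate passes straight through.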

A more effective approach is building AI-centric processes that can work autonomously, with human oversight strategically positioned to catch reliability issues before they cause real-world problems. You wouldn’t want AI to directly approve or deny loan applications, but AI could conduct an initial assessment for human operators to review. This can work, but it relies on human vigilance to catch AI mistakes and undermines the potential efficiency gains from using AI.
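In code, that human-in-the-loop pattern might look something like the sketch below, where the model only annotates a case and every decision waits in a queue for a human operator. The class, score source, and threshold are hypothetical:

```python
# Sketch of the pattern described above: the AI drafts an assessment, but
# every decision is queued for a human reviewer rather than executed directly.
from dataclasses import dataclass

@dataclass
class Assessment:
    application_id: str
    ai_recommendation: str   # e.g. "approve" / "deny"
    ai_confidence: float
    status: str = "pending_human_review"

def triage(application_id: str, model_score: float) -> Assessment:
    recommendation = "approve" if model_score >= 0.7 else "deny"
    # The AI never acts on its own recommendation; it only annotates the case.
    return Assessment(application_id, recommendation, model_score)

review_queue = [triage("APP-1042", 0.83), triage("APP-1043", 0.41)]
for item in review_queue:
    print(item)
```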

Building for the future

These partial solutions point toward a more comprehensive approach. Organizations that fundamentally rethink how their work gets done rather than simply augmenting existing processes with AI assistance will gain the greatest advantage. But AI should never be the last step in a high-stakes process or decision, so what’s the best path forward?

First, AI builds a repeatable process that will reliably and transparently deliver consistent results. Second, humans review the process to ensure they understand how it works and that the inputs are appropriate. Finally, the process runs autonomously – with no AI in the loop – subject to periodic human review of results.
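A minimal sketch of what that final step can look like: once humans have reviewed and approved the logic, the production path is plain, deterministic code with no model call in it. The thresholds below are placeholders a review team would have to validate, not real underwriting criteria:

```python
# Sketch of step three: the approved process is explicit, auditable code.
# Every decision can be traced to a specific rule; nothing is generated
# by a model at runtime. Thresholds are illustrative placeholders.

def decide_loan(income: float, debt: float, months_employed: int) -> str:
    if months_employed < 6:
        return "refer_to_human"            # insufficient history: humans decide
    debt_to_income = debt / income if income > 0 else float("inf")
    if debt_to_income > 0.45:
        return "deny"
    if debt_to_income < 0.30 and months_employed >= 24:
        return "approve"
    return "refer_to_human"                # borderline cases go back to people

print(decide_loan(income=5200, debt=1300, months_employed=36))  # -> approve
```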

Consider the insurance industry. The conventional approach might add AI assistants to help claims processors work more efficiently. A more revolutionary approach would use AI to develop new tools — like computer vision that analyzes damage photos or enhanced fraud detection models that identify suspicious patterns — and then combine these tools into automated systems governed by clear, understandable rules. Humans would design and monitor these systems rather than process individual claims.
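As a rough sketch, such an automated system might combine those AI-built tools under a handful of explicit routing rules. The scoring functions below are stubs standing in for models trained and validated offline, and the thresholds are illustrative:

```python
# Sketch of the insurance example: narrow AI-built tools (a damage estimator,
# a fraud detector) feed a pipeline governed by explicit rules that humans
# designed and monitor. Both scorers are stubs; thresholds are placeholders.

def damage_estimate(photo_path: str) -> float:
    return 2400.0        # stub for a computer-vision damage model

def fraud_score(claim: dict) -> float:
    return 0.12          # stub for a fraud-detection model

def route_claim(claim: dict) -> str:
    fraud = fraud_score(claim)
    estimate = damage_estimate(claim["photo"])
    if fraud > 0.8:
        return "investigate"               # high fraud risk: specialist review
    if estimate <= 5000 and fraud < 0.2:
        return "auto_pay"                  # small, low-risk claims settle instantly
    return "adjuster_review"               # everything else goes to a human

print(route_claim({"photo": "claim_1042.jpg", "policy_id": "P-77"}))  # -> auto_pay
```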

This approach maintains human oversight at the critical juncture where it matters most: the design and validation of the system itself. It allows for dramatic efficiency gains while eliminating the risk that AI unpredictability will lead to harmful outcomes in individual cases.

An AI might identify potential indicators of loan repayment ability in transaction data, for instance. Human experts can then evaluate these indicators for fairness and build explicit, understandable models to confirm their predictive power.
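One hypothetical form such an explicit model could take is a simple scorecard, where every indicator, weight, and cutoff is visible and reviewable. The indicators and numbers below are invented for illustration, not drawn from real data:

```python
# Sketch of turning AI-suggested indicators into an explicit, human-readable
# scorecard. Indicators, weights, and the cutoff are placeholders a risk and
# fairness review would have to set and validate.

INDICATOR_WEIGHTS = {
    "months_of_consistent_savings": 2.0,
    "on_time_utility_payments_pct": 1.5,
    "overdraft_events_last_year": -3.0,
}
APPROVAL_CUTOFF = 4.0

def repayment_score(features: dict) -> float:
    # Each contribution is visible and can be audited line by line.
    return sum(INDICATOR_WEIGHTS[name] * value for name, value in features.items())

applicant = {
    "months_of_consistent_savings": 1.8,     # normalized example inputs
    "on_time_utility_payments_pct": 0.95,
    "overdraft_events_last_year": 0.1,
}
score = repayment_score(applicant)
print(score, "approve" if score >= APPROVAL_CUTOFF else "refer_to_human")
```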

This approach to explainable AI will create a clearer divide between organizations that use AI superficially and those that transform their operations around it. The latter will increasingly pull ahead in their industries, able to offer products and services at price points their competitors can’t match.

Unlike black-box AI, explainable AI systems ensure humans maintain meaningful oversight of the technology’s application, creating a future where AI augments human potential rather than simply replacing human labor.
