The Era of AI Explainability

An Attempt to Understand AI

Introduction

We have reached an inflection point with AI: people and businesses are adopting it rapidly, and it is fast becoming a standard part of today's market.

While the AI revolution commands global attention, there's an overlooked yet crucial niche: the market for AI explainability solutions. Throughout my time in computer science, AI research, and startup ecosystems, I've realized that very few people, if any, truly understand how AI outputs are generated or why a model gives a particular response. Much of the space treats AI models and LLMs like black boxes: you provide input and receive output. Understanding model behavior, and why a particular model structure improves an application, is a challenge few people can take on; and those who can explain it rarely explain it well.

To an extent, however, AI models don't need to be fully explained. The human brain has never been fully explained either, yet it works, and that is what matters. So if AI models work, why should we care to understand them?

AI Explainability Explained

Explainable AI describes why a model made a prediction; interpretable AI describes how it makes that prediction.
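To make that distinction concrete, here is a minimal sketch of one post-hoc explainability technique, permutation importance, using scikit-learn. The dataset and model are illustrative assumptions on my part, not anything from the products discussed here; the point is that the technique estimates why the model leans on certain inputs without opening up how its internals compute a prediction.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Toy data and model, assumed purely for illustration.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# "Explainable": estimate why the model relies on certain inputs by shuffling
# each feature and measuring how much held-out accuracy drops.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
top_features = sorted(zip(X.columns, result.importances_mean),
                      key=lambda pair: pair[1], reverse=True)[:5]
for name, importance in top_features:
    print(f"{name}: {importance:.3f}")
```

An interpretable model, by contrast (say, a small decision tree or a linear model), lets you read the "how" directly off its structure.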

I see AI explainability as the ability to observe, evaluate, and understand foundational AI models and AI applications such that AI can improve towards Artificial General Intelligence (AGI) in a safe and regulated manner.

AI explainability goes hand in hand with AI observability and interpretability. There are differences, but the terms are commonly used interchangeably, and I do so here.

Why is AI Explainability Important?

AI explainability is important because, in contexts where we place significant trust in AI, these models and their responses must be explainable.

How can you prove a model response is unbiased? How do you reduce model hallucinations (fabricated or incorrect responses)? How do you know your AI application complies with safety and regulatory practices? How do you improve AI application processes to better suit use cases?
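These questions are hard to answer without instrumentation. As a rough illustration of what observing an AI application can look like at the code level, here is a minimal, hypothetical sketch that logs every model response alongside a simple automated evaluation (a naive check that the answer stays grounded in the retrieved context). The function names and scoring rule are my own assumptions for illustration; production observability platforms go far beyond this.

```python
# A hypothetical, minimal observability hook for an LLM application.
# The names and scoring rule are illustrative assumptions, not a real platform's API.
import json
import time


def grounding_score(response: str, context: str) -> float:
    """Naive hallucination proxy: fraction of response words that appear in the context."""
    response_words = response.lower().split()
    context_words = set(context.lower().split())
    if not response_words:
        return 0.0
    return sum(word in context_words for word in response_words) / len(response_words)


def log_interaction(prompt: str, context: str, response: str,
                    path: str = "llm_log.jsonl") -> dict:
    """Record every interaction with an evaluation score so it can be audited later."""
    record = {
        "timestamp": time.time(),
        "prompt": prompt,
        "response": response,
        "grounding_score": round(grounding_score(response, context), 3),
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record


# Example: a low score flags a response that may not be supported by the source material.
record = log_interaction(
    prompt="What is the hotel's cancellation policy?",
    context="Free cancellation up to 48 hours before check-in.",
    response="You can cancel for free up to 48 hours before check-in.",
)
print(record["grounding_score"])
```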

Take Booking.com and its AI Trip Planner, for example. The AI Trip Planner initially dealt with problems such as limited data for personalization, performance issues, gaps in response evaluation, and more. However, after implementing AI observability tools from Arize AI, the team saw a 13% increase in accuracy and a fivefold reduction in response times, enhancing user satisfaction and efficiency in the travel planning process (Source).

What happens if you don’t have practices in place to observe your AI application? Situations like Air Canada’s chatbot error lawsuit occur.

In the near future, where AI is standard in every application, how will companies distinguish themselves? Through specialization and better AI. Better AI applications that are safe and compliant are the result of models that can be easily iterated on and evaluated because they can be explained.

However, alignment with safety and regulatory practices is where I envision the biggest use case for AI explainability tools. This includes safety with regard to AI scaling toward AGI, pioneered by Anthropic's AI Safety Levels. And it also includes safety in terms of addressing and preventing misuse of current AI applications for illegal ends such as stealing personal data or building weapons.

Once we progress AI to the point where AGI is the standard, AI observability tools will likely play a role in monitoring, evaluating, and understanding AGI so that catastrophic risks are avoided. This alone demonstrates an important need for AI explainability.

Outlook

OpenAI alone currently has over 1 million paying business users across its enterprise products, including ChatGPT Enterprise, Team, and Edu (Source). From this, we can infer that a wide range of businesses are building AI wrappers for their AI applications and features.

AI observability tools are primarily built for these enterprise/SMB AI wrappers, to improve applications and features. Notable startups building these tools include Fiddler AI, Arize AI, and TruEra (acquired by Snowflake).

However, progression toward AGI will likely happen through AI scaling and research, not directly through these enterprise/SMB observability tools. These tools may aid the progression but are primarily for improving AI specialization, safety, and risk prevention in enterprises and SMBs.

Progression toward AGI will happen through AI scaling, but it is worth mentioning that scaling might be starting to hit its limits. For context, AI has generally been scaled with bigger model architectures, more data, and more compute to achieve better results. Yet it is slowly becoming evident that bigger may not always lead to better.

“OpenAI’s experience with its next-generation Orion model provides one data point. At 20% of its training process, Orion was matching GPT-4’s performance—what scaling laws would predict. But as training continued, the model’s gains proved far smaller than the dramatic leap seen between GPT-3 and GPT-4. In some areas, particularly coding, Orion showed no consistent improvement, despite consuming significantly more resources than its predecessors.”

However, many AI leaders, such as Sam Altman (OpenAI CEO) and Dario Amodei (Anthropic CEO), have pushed back on the idea that AI scaling is hitting a wall. If we are hitting a wall, new model architectures for AI scaling will likely be found in AI research settings that use their own AI observability/explainability tools. These tools focus solely on advancement toward AGI, while companies like Fiddler and Arize build theirs for better AI specialization in business use cases.

In the end, AI companies are just SaaS companies, and a SaaS business needs a large addressable market. This is why building a production-level observability tool for foundation model improvement is not a viable business opportunity: there are only a few big players building foundation models (OpenAI, Anthropic, Perplexity, etc.), and the foundation model market is already becoming an oligopoly due to technical and financial moats. The AI observability tools out there are therefore geared toward enterprises and SMBs, a scalable market. Nonetheless, this is still a market with a high barrier to entry, given the technical aptitude required to build these tools, which further demonstrates its potential.

But why don't foundation model companies just build these tools themselves? They arguably could, but their priority is advancement toward AGI, not domain specialization. Whoever gets to AGI first in the foundation model market wins.

To reiterate, AI explainability tools are important for stronger, specialized AI products among enterprises and SMBs. I envision these tools enabling companies to develop high-value AI tailored to specialized domains, and gaining even greater importance for safety and regulation in the era of AGI.

In essence, I’m writing all this to shed more light on the growing market of enterprise/SMB AI explainability tools and the impact they could have on our AI future.