
A Path to AI Explainability April 23, 2021

Posted by Peter Varhol in Machine Learning, Technology and Culture.

I mentioned a while ago that I once met Pepper, the Softbank robot that responds to tactile stimulation.  This article notes that Pepper can now introspect, that is, work through his “thought” processes aloud as he performs his given activities.  I find that fascinating, in that what Pepper is really doing is explainable AI, which I have been writing about recently.

The result is that Pepper can not only carry out verbal instructions, but also describe what he is doing in the process.  I don’t really understand how to code this, but I do appreciate the result.

Explainable AI is the ability of an AI system to “describe” how it arrived at a particular result, given the input data.  It actually consists of three separate parts: transparency, interpretability, and explainability.  Transparency means that we need to be able to look into the algorithms to clearly discern how they are processing input data.  Interpretability is how the results might be presented for human understanding.  Explainability means that we might want to support queries into our results, or to get detailed explanations into more specific phases of the processing.

Further, it appears that Pepper, through talking out his instructions (I really dislike using human pronouns here, but it’s convenient) is able to identify contradictions or inconsistencies that prevent him from completing the activity.  That frees Pepper to ask for additional instructions.

That’s an innovative and cool example of explainability, and extends to the ability of the AI to ask questions if the data are ambiguous or incomplete.  We need more applications like this.

Why Testing Needs Explainable Artificial Intelligence April 19, 2021

Posted by Peter Varhol in Algorithms, Machine Learning, Software development.

Many artificial intelligence/machine learning (AI/ML) applications produce results that are not easily understandable from their training and input data.  This is because these systems are largely black boxes that use multiple algorithms (sometimes hundreds) to process data and return a result.  Tracing by hand how the data passes through those mathematical algorithms is all but impossible for a person.

Further, these algorithms were “trained” or adjusted based on the data used as the foundation of learning.  What is really happening there is that the data is adjusting algorithms to reflect what we already know about the relationship between inputs and outputs.  In other words, we are doing a very complex type of nonlinear regression, without any inherent knowledge of a causal relationship between inputs and outputs.
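To make “the data is adjusting algorithms” concrete, here is a minimal sketch of training as curve fitting.  It is deliberately linear rather than nonlinear to keep it short, and the data points, the `train` function, and its parameters are all made up for illustration.  The point is that the fitted numbers come entirely from the data, and nothing in the procedure knows or checks whether the inputs actually cause the outputs.

```python
# Hypothetical training data: the outputs happen to follow y = 2x + 1.
XS = [0.0, 1.0, 2.0, 3.0, 4.0]
YS = [1.0, 3.0, 5.0, 7.0, 9.0]


def train(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every training example at the current w, b.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        # The data nudges the parameters; no causal knowledge is involved.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train(XS, YS)
print(f"learned w={w:.3f}, b={b:.3f}")
```

The learned `w` and `b` describe a correlation in the training set.  If that correlation is an accident of how the data was collected, the model reproduces the accident just as faithfully.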

At worst, the outputs from AI systems can sometimes seem nonsensical, based on what is known about the problem domain.  Yet because those outputs come from software, we are inclined to trust them and apply them without question.  Maybe we shouldn’t.

But it can be more subtle than that.  The results could reflect a systemic bias that makes outputs seem correct, or at least plausible, when they are not, or at least are not ethically right.  And users rarely have recourse to question the outputs, which leaves the system a black box to them.

This is where explainable AI (XAI) comes in.  In cases where the relationship between inputs and outputs is complex and not especially apparent, users need the application to explain why it delivered a certain output.  It’s a matter of trusting the software to do what we think it is doing.  Ethical AI also plays into this concept.

So how does XAI work?  There is a long way to go here, but there are a couple of techniques that show some promise.  XAI rests on the principles of transparency, interpretability, and explainability.  Transparency means that we need to be able to look into the algorithms to clearly discern how they are processing input data.  While that may not tell us how those algorithms are trained, it provides insight into the path to the results, and is intended for interpretation by the design and development team.

Interpretability is how the results might be presented for human understanding.  In other words, if you have an application and are getting a particular result, you should be able to see and understand how that result was achieved, based on the input data and processing algorithms.  There should be a logical pathway between data inputs and result outputs.
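One simple way to picture a “logical pathway” from inputs to output is a model that reports how much each input contributed to its result.  The sketch below assumes a small linear scoring model; the feature names, weights, and helper functions are hypothetical and chosen only for illustration.  Real systems are rarely this simple.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical weights for an illustrative linear scoring model.
WEIGHTS = {"income": 0.5, "debt": -0.8, "years_employed": 0.3}
BIAS = 0.1


@dataclass
class Explanation:
    score: float
    contributions: Dict[str, float]  # feature name -> weight * value


def predict_with_explanation(features: Dict[str, float]) -> Explanation:
    """Return the score together with each feature's share of it."""
    contributions = {name: WEIGHTS[name] * features[name] for name in WEIGHTS}
    score = BIAS + sum(contributions.values())
    return Explanation(score=score, contributions=contributions)


def top_reasons(explanation: Explanation, n: int = 2) -> List[str]:
    """Names of the n features with the largest absolute contribution."""
    ranked = sorted(
        explanation.contributions,
        key=lambda name: abs(explanation.contributions[name]),
        reverse=True,
    )
    return ranked[:n]


applicant = {"income": 2.0, "debt": 1.0, "years_employed": 1.0}
result = predict_with_explanation(applicant)
print(result.score, top_reasons(result))
```

A tester can assert on the contributions as well as the score, which turns “the answer looks plausible” into “the answer was reached for the expected reasons.”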

Explainability remains a vague concept while researchers try to define exactly how it might work.  We might want to support queries into our results, or to get detailed explanations into more specific phases of the processing.  But until there is better consensus, this feature remains a gray area.

The latter two characteristics are more important to testers and users.  How you do this depends on the application.  Facial recognition software can usually be built to describe facial characteristics and how they match up to values in an identification database.  It becomes possible to build at least interpretability into the software.

But interpretability and explainability are not as easy when the problem domain is more ambiguous.  How can we interpret an e-commerce recommendation that may or may not have anything to do with our product purchase?  I have received recommendations on Amazon that clearly bear little relationship to what I have purchased or examined, so we don’t always have a good path between source and destination.

So how do we implement and test XAI? 

Where Testing Gets Involved

Testing AI applications tends to be very different from testing traditional software.  Testers often don’t know what the right answer is supposed to be.  XAI can be very helpful in that regard, but it’s not the complete answer.

Here’s where XAI can help.  If the application is developed and trained in a way where algorithms show their steps in coming from problem to solution, then we have something that is testable.

Rule-based systems can make it easier, because the rules form a big part of the knowledge.  In neural networks, however, the algorithms rule, and they bear little relationship to the underlying intelligence.  But rule-based intelligence is much less common today, so we have to go back to the data and algorithms.
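For the rule-based case, “showing the steps” can be as plain as returning the rules that fired along with the decision.  This is a minimal sketch under assumed names: the order-review scenario, the rules, their thresholds, and `review_order` are all invented for illustration.

```python
# Hypothetical rules: (name, predicate, human-readable reason).
RULES = [
    ("high_value", lambda o: o["amount"] > 1000, "amount exceeds 1000"),
    ("new_account", lambda o: o["account_age_days"] < 30,
     "account is under 30 days old"),
    ("foreign_ip", lambda o: o["ip_country"] != o["billing_country"],
     "IP country differs from billing country"),
]


def review_order(order, threshold=2):
    """Return a decision plus the trace of rules that fired."""
    trace = []
    for name, predicate, reason in RULES:
        if predicate(order):
            trace.append((name, reason))
    # Flag the order when at least `threshold` rules fired.
    decision = "flag" if len(trace) >= threshold else "approve"
    return decision, trace


order = {
    "amount": 1500,
    "account_age_days": 10,
    "ip_country": "US",
    "billing_country": "US",
}
decision, trace = review_order(order)
print(decision)
for name, reason in trace:
    print(f"  {name}: {reason}")
```

Because the trace is part of the return value, a test can check which rules led to a decision, not only what the decision was.  A neural network offers no equivalent trace for free, which is why the data and algorithms have to be examined instead.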

Testers often don’t have control over how AI systems work to create results.  But they can delve deeply into both data and algorithms to come up with ways to understand and test the quality of systems.  It should not be a black box to testers or to users.  How do we make it otherwise?

Years ago, I wrote a couple of neural network AI applications that simply adjusted the algorithms in response to training, without any insight into how that happened.  While this may work in cases where the connection isn’t important, knowing how our algorithms contribute to our results has become vital.

Sometimes AI applications “cheat”, using cues that do not accurately reflect the knowledge within the problem domain.  For example, it may be possible to facially recognize people, not through their characteristics, but through their surroundings.  You may have data to indicate that I live in Boston, and use the Boston Garden in the background as your cue, rather than my own face.  That may be accurate (or may not be), but it’s not facial recognition.
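One way a tester might probe for this kind of shortcut is an occlusion check: blank out one region of the input at a time and see how far the model’s confidence falls.  The sketch below is a toy under loud assumptions.  `toy_recognizer` is a deliberately flawed stand-in rather than a real face recognition model, and the two “regions” are single numbers instead of image pixels.

```python
def toy_recognizer(image):
    """Deliberately flawed stand-in model that leans on the background."""
    return 0.2 * image["face_match"] + 0.8 * image["background_match"]


def occlusion_sensitivity(model, image, neutral=0.0):
    """Confidence drop when each region is replaced by a neutral value."""
    baseline = model(image)
    drops = {}
    for region in image:
        occluded = dict(image)      # copy so the original input is untouched
        occluded[region] = neutral  # "blank out" this one region
        drops[region] = baseline - model(occluded)
    return drops


def relies_on_shortcut(drops, target="face_match"):
    """True when the most influential region is not the one that should matter."""
    return max(drops, key=drops.get) != target


photo = {"face_match": 0.9, "background_match": 0.9}
drops = occlusion_sensitivity(toy_recognizer, photo)
print(drops, relies_on_shortcut(drops))
```

Here hiding the background costs far more confidence than hiding the face, so the check reports a shortcut.  The same idea applies to real images by masking pixel regions, though choosing the regions and the neutral value takes more care.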

A tester can use an XAI application here to help tell the difference.  That’s why developers need to build in this technology.  But testers need deep insight into both the data and the algorithms.

Overall, a human in the loop remains critical.  Unless someone is looking critically at the results, they can be wrong without anyone noticing, and quality will suffer.

There’s no one correct answer here.  Instead, testers need to be intimately involved in the development of AI applications, and insist on explanatory architecture.  Without that, there is no way of comprehending the quality that these applications need to deliver actionable results.
