The Five Pillars of Trustworthy AI
Key Points
- AI chatbots can produce hazardous misinformation, exemplified by a model that falsely recommended a toxic “aromatic water” recipe mixing ammonia and bleach.
- IBM proposes five pillars for trustworthy AI, beginning with **Explainability**, where the system’s reasoning must be clear enough for domain experts to understand and validate without needing AI expertise.
- The second pillar, **Fairness**, requires AI to avoid bias by training on diverse data sets—such as inclusive object‑ and facial‑recognition datasets—to ensure equitable performance across all groups.
- **Transparency** demands that AI systems are not opaque “black boxes”; users must be able to inspect and verify the underlying algorithms, models, and training data before trusting the outcomes.
- **Robustness** requires that AI systems withstand attack, resisting attempts to poison training data or steal the model.
- **Privacy** ensures that the information users put into an AI system stays private and is not shared or exploited as a business model.
Sections
- Trustworthy AI: Explainability Explained - The speaker highlights hazardous AI hallucinations, references IBM’s five trust pillars, and illustrates the importance of explainability through a symptom‑diagnosis scenario.
- Transparency, Robustness, and Privacy in AI - The speaker outlines key trustworthy‑AI pillars—providing a clear view into algorithms, models, and training data, ensuring systems can resist attacks and data poisoning, and protecting user information from unwanted disclosure.
Full Transcript
**Source:** [https://www.youtube.com/watch?v=nB_EjxoP-6w](https://www.youtube.com/watch?v=nB_EjxoP-6w) **Duration:** 00:05:29

- [00:00:00](https://www.youtube.com/watch?v=nB_EjxoP-6w&t=0s) Trustworthy AI: Explainability Explained
- [00:03:02](https://www.youtube.com/watch?v=nB_EjxoP-6w&t=182s) Transparency, Robustness, and Privacy in AI
About a year ago, I did a video where I suggested
that a chatbot might hallucinate or be poisoned into giving a recommendation
that you make a common household cleaning solution out of ammonia and bleach.
Well, that was a hypothetical.
It turns out it's true.
In fact, there was an AI chatbot that came out some months later
and recommended a recipe for an aromatic water mix.
Now, that sounds delicious. Who wouldn't want a tall glass of that?
Well, it turns out the ingredients were ammonia and bleach.
Those are toxic.
Don't mix those together and definitely don't drink it.
So that AI is not one you can believe in.
What we want is a trustworthy AI,
and IBM came out with five pillars, or principles, of trustworthy AI.
These are the things that we want to expect from an AI.
And let's take a look at what they are.
The first one is Explainability.
We want the AI to be able to explain itself and
be understandable by someone who is an expert in that particular domain.
So let's take an example: maybe I go to a chatbot and I give it the following symptoms.
I have red itchy eyes, I have a runny nose, I'm sneezing.
Okay, what would you think from that?
A doctor who is a domain expert in that
is probably going to say you've got an allergy or something along those lines.
What they're not going to say is you have a broken leg.
That would be an example of an unexplainable AI.
With an explainable one, the expert in that domain can look at it and say,
"Yeah, I can see how you would come up with those things and come up with that particular diagnosis."
It makes sense.
And notice that domain expert doesn't have to understand anything about the way AI works.
They don't have to be a technology expert.
They're an expert in that domain of knowledge.
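The symptom-diagnosis scenario above can be sketched in code. This is a minimal illustration, not any real diagnostic system: the conditions and symptom lists are made up, and the point is only that the output carries its evidence with it, so a domain expert can check the reasoning without knowing anything about AI.

```python
# A minimal sketch of an explainable diagnosis aid: the answer comes with
# the evidence behind it, so a doctor can validate the reasoning.
# The conditions and symptoms below are purely illustrative.

KNOWN_CONDITIONS = {
    "allergy": {"red itchy eyes", "runny nose", "sneezing"},
    "broken leg": {"leg pain", "swelling", "inability to bear weight"},
}

def diagnose(symptoms):
    """Return the best-matching condition and the symptoms supporting it."""
    best, best_evidence = None, set()
    for condition, profile in KNOWN_CONDITIONS.items():
        evidence = profile & set(symptoms)
        if len(evidence) > len(best_evidence):
            best, best_evidence = condition, evidence
    return best, sorted(best_evidence)

condition, evidence = diagnose(["red itchy eyes", "runny nose", "sneezing"])
print(condition, evidence)  # the evidence list is the "explanation"
```

A doctor reading that output can immediately say "yes, those symptoms do point to an allergy" — which is exactly the validation an unexplainable black box makes impossible.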
Okay. Let's take a look at the second pillar.
The second is about fairness.
That is, the AI should not be biased toward
or against any particular population or any particular group.
Let's take an example.
Let's say we have an object recognition system
that's based on AI, and it's been trained on a whole bunch of different squares.
So it recognizes those.
However, what happens when I give it some stuff like this?
It really can't recognize those very well because it hasn't seen enough of them.
There's not enough of that in its training database.
So what we need to do is make sure that it sees a diverse set of objects
so that it can make the right recognition.
Another example of this might be in facial recognition,
where, again, we need to use diverse faces in order to make sure our AI is fair.
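One simple way to surface the kind of skew described above is to break a system's accuracy out per group rather than looking only at the overall number. The sketch below assumes illustrative recognition results for the squares-versus-other-shapes example; it is a measurement idea, not a complete fairness audit.

```python
# A minimal sketch of a fairness check: compute accuracy per group so
# that a training-data skew (e.g. mostly squares) shows up as a gap.
# The prediction records below are illustrative.

def accuracy_by_group(records):
    """records: (group, predicted, actual) tuples -> {group: accuracy}."""
    totals, correct = {}, {}
    for group, predicted, actual in records:
        totals[group] = totals.get(group, 0) + 1
        correct[group] = correct.get(group, 0) + (predicted == actual)
    return {g: correct[g] / totals[g] for g in totals}

results = [
    ("square",   "square",   "square"),
    ("square",   "square",   "square"),
    ("triangle", "square",   "triangle"),  # misrecognized: underrepresented
    ("triangle", "triangle", "triangle"),
]
print(accuracy_by_group(results))  # squares do well, triangles lag behind
```

An overall accuracy of 75% would hide the problem; the per-group view shows exactly which population the system is failing, which is the signal to diversify the training data.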
Our third principle of trustworthy AI is transparency.
And in transparency, what we're trying to get here
is we don't want a black box, a system that just says, "Trust me,"
because we don't know if we can trust it or not.
We need to be able to verify. Then we can trust.
So what I need is a transparent box, a box I can see into.
And what would I see into it if it was an AI?
I want to be able to see things like the algorithms that are used.
I want to see the model that has been used.
And I want to see the data that was used to train this thing.
I want to know where the model came from.
I want to know where the data came from.
Those are the kinds of things that let me see in and give me more confidence
that in fact, this thing, from a technical standpoint,
is going to be something I can believe in.
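The "see into the box" items listed above — algorithm, model provenance, data provenance — can be captured as a simple record that ships with the system. The field names below are illustrative, not any specific model-card standard.

```python
# A minimal sketch of a provenance record for transparency: the
# algorithm, where the model came from, and where the data came from,
# written down where a reviewer can inspect them. Fields are illustrative.

from dataclasses import dataclass, asdict

@dataclass
class ModelCard:
    algorithm: str              # e.g. "gradient-boosted trees"
    model_source: str           # where the model came from
    training_data_source: str   # where the data came from
    training_data_license: str

card = ModelCard(
    algorithm="gradient-boosted trees",
    model_source="trained in-house, version 2.1",
    training_data_source="public image corpus, 2023 snapshot",
    training_data_license="CC BY 4.0",
)
print(asdict(card))  # everything a reviewer needs to "see into the box"
```

The value is less in the data structure than in the discipline: if any of these fields cannot be filled in, the system is still a black box.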
Our fourth principle of trustworthy AI is robustness.
In this sense, what we mean is we want the system to be able to withstand attack.
It should remain true to itself.
It shouldn't be able to be compromised by outsiders who have malicious intent.
So, for instance, if I have this really valuable data or model that's in the system,
these are sort of the crown jewels and this is what the system runs on.
I don't want to allow an attacker to be able to get to that.
I need to be able to repel those attacks, make sure that they can't poison the data,
make sure they can't steal the model,
make sure that this system will continue to work.
And as a cybersecurity guy, this is one of these principles that I'm most focused on.
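One concrete control behind "make sure they can't poison the data" is an integrity check: fingerprint the training data when it is vetted, and refuse to train if the fingerprint changes. This sketch covers only tamper detection; real robustness layers many defenses on top.

```python
# A minimal sketch of a data-integrity check against poisoning: hash the
# vetted training records, and detect any tampering before training runs.
# Record format is illustrative.

import hashlib

def fingerprint(records):
    """Return a SHA-256 digest over the training records, in order."""
    digest = hashlib.sha256()
    for record in records:
        digest.update(record.encode("utf-8"))
    return digest.hexdigest()

trusted = ["label:cat,img:001", "label:dog,img:002"]
baseline = fingerprint(trusted)          # stored securely at vetting time

tampered = ["label:dog,img:001", "label:dog,img:002"]  # a flipped label
assert fingerprint(tampered) != baseline  # poisoning attempt is detected
```

This protects the "crown jewels" at rest; repelling attacks on the running model (adversarial inputs, model extraction) needs further controls.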
Our fifth principle, or pillar, of trustworthy AI is privacy.
In this case, I want to make sure that what goes in the chatbot
stays in the chatbot and doesn't get shared with everyone else.
So for instance, we don't want a case where your data is our business model.
We want a case where your data is your data.
We don't want the chatbot spying on you,
or the information you put into it being shared with the rest of the world.
So I want some sort of protection that says,
"what I'm putting in, it's still my data. I don't want it shared with the whole wide world."
So now you see the five pillars or principles of trustworthy AI,
explainability, fairness, transparency, robustness, and privacy.
These are the things that we should expect from vendors who are supplying us with AI.
That way we can ensure that the AI serves us
and not the other way around.
Thanks for watching.
If you found this video interesting and would like to learn more about cybersecurity,
please remember to hit like and subscribe to this channel.