Framework for Securing Generative AI
Key Points
- Generative AI expands the attack surface, prompting 80% of executives to doubt its trustworthiness due to cybersecurity, privacy, and accuracy concerns.
- A security framework is needed that protects every stage of the AI pipeline—data collection, model training/tuning, and inference/usage.
- Threats to the data stage include poisoning (injecting malicious bias), exfiltration (stealing sensitive training data), and accidental leakage, all of which can degrade model performance or expose confidential information.
- Mitigation starts with thorough data discovery and classification to identify sensitive assets, followed by controls that safeguard data integrity, prevent unauthorized access, and ensure safe model deployment.
Sections
- Framework for Securing Generative AI - The speaker explains how generative AI widens the attack surface, cites executive trust concerns, and outlines a comprehensive framework that secures data, models, and inference to mitigate privacy, accuracy, and cybersecurity risks.
- Securing AI Training Systems & Models - The speaker outlines essential safeguards—data classification, encryption, multifactor access controls, continuous monitoring, and strict provenance checks—to protect training pipelines and ensure that imported models are trustworthy.
- LLM Threats: Prompt Injection, DoS, Theft - The speaker outlines three major attacks on large language models—prompt injection to bypass guardrails, denial‑of‑service overload to cripple performance, and model extraction theft that reconstructs the model from repeated queries.
- Governance and Security for Generative AI - The speaker outlines the need for a comprehensive governance layer that ensures fairness, bias mitigation, drift control, regulatory compliance, and ethical operation of generative AI, while also leveraging AI to strengthen cybersecurity—creating a reciprocal “AI for security and security for AI” framework.
Full Transcript
Source: https://www.youtube.com/watch?v=pR7FfNWjEe8
Duration: 00:13:12
Timestamps:
- [00:00:00](https://www.youtube.com/watch?v=pR7FfNWjEe8&t=0s) Framework for Securing Generative AI
- [00:03:12](https://www.youtube.com/watch?v=pR7FfNWjEe8&t=192s) Securing AI Training Systems & Models
- [00:07:53](https://www.youtube.com/watch?v=pR7FfNWjEe8&t=473s) LLM Threats: Prompt Injection, DoS, Theft
- [00:11:53](https://www.youtube.com/watch?v=pR7FfNWjEe8&t=713s) Governance and Security for Generative AI
Your attack surface just got a lot bigger. How? Well, generative AI. This technology
that is taking the world by storm and can do amazing stuff, also introduces some new threats
with it and new risks. In fact, 4 out of 5 executives have said that they're not sure they
can really trust generative AI because they have concerns specifically related to cybersecurity.
They have concerns related to privacy and accuracy of what the system can do. If we
don't address those issues, we don't have trust and we're not able to take full advantage of the
technology. So what do we need to do in order to secure generative AI? Well, we need a framework
that will allow us to take a look at these issues. So let's talk a little bit about what this is and
how it works. Well, what starts is we have data. We use this data from a lot of different sources,
and we use it to train and tune our models, which are the next component of this framework. We use
the data to train the model and then ultimately the model is used for inferencing. The inferencing
is where we get our output. So this is how the thing generally works. In fact, we have to look
at how we're going to secure each one of those. How do I secure the data? How do I secure the
model? How do I secure the usage? And it turns out there's a couple of other things we need to
take a look at that I'll show you before the video is over. Okay, let's take a look at securing the
data itself. This is the data that we're going to use to train and tune the model. Therefore,
a bad guy is going to look at that and see what they can do to it because that's what they do. If
it exists, they're going to try to mess it up. This is kind of why we can't have nice things,
I'll tell you. And what they're going to do in this case, one possibility is to poison the data.
If they poison the data, then basically they introduce some inaccuracies or things like that
that go then into the training and tuning of the model. And now the model produces bad results as
a consequence of that. Another thing that could happen in this space is exfiltration. An exfil
attack would be where someone breaks into this system and pulls data away from it. And this could
contain a large set of sensitive information. And another version of that is leakage. Leakage might
be more unintentional, whereas exfiltration is intentional. But the
effect is the same. We may use a lot of really sensitive data to train and tune our model
because that will make it more accurate and more useful in our use case. But as a
result, now this ends up with a big bullseye on it that somebody is going to try to hit. Therefore,
we're going to have to try to secure that. How could we do that? Some things that we should
be taking note of is we should do data discovery and classification. I need to know where all the
sensitive data is. Maybe these data sources were secured when they were in their various databases.
But now that we brought all this together as a training system, maybe we didn't apply the same
controls, and we need to. We classify the data so that we know what level of security and protections are
necessary. We also want to use cryptography so that any data that does leak
out or get exfiltrated does no harm. No one can read it unless they're an authorized user. We
also add things like access controls, which make sure that only authorized people can get into this
system using strong multifactor authentication (MFA) and things like that. And then also we need
to monitor the system. I want to know if this has occurred so that I can then take remedial
actions and do the right sort of countermeasures and limit the exposure. Okay, now let's talk
about securing the model itself. Now, how does this stuff work? It turns out very few people
develop models for their organization because it's very computationally expensive, requires a lot of
effort. So most people are leveraging existing models. In many cases, they get them from open
source. And knowing where you're getting your model from makes all the difference in the world.
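To make the supply-chain idea concrete, here is a minimal sketch of provenance checking in Python: before loading a downloaded model file, verify it against a digest published by a source you have already vetted. The file name and digest below are hypothetical placeholders, not from any real model registry.

```python
import hashlib

# Hypothetical allowlist mapping a model artifact to the SHA-256 digest
# published by the source we vetted. Real digests would come from your
# own review of each upstream provider.
TRUSTED_MODELS = {
    "sentiment-v2.bin": "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08",
}

def verify_model(path: str, expected_sha256: str) -> bool:
    """Hash the model file in chunks and compare against the vetted digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest() == expected_sha256
```

In practice you would pin a digest for every artifact you import and refuse to load anything that fails the check, which catches the "copy of a trusted model that has been modified" case described here.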
So typically I would import models from a lot of different trusted sources. But if I'm not careful,
maybe I get a model that I think is trusted and it's not. Maybe it's a copy of a trusted
one that has been modified in some way. And so that means I really have to consider basically
the models as a supply chain, and I have to do supply chain management of the models themselves
that go into whatever system I'm going to use. So in this case, I need to be able to make sure that
I can trust what my sources are, and I need to be able to know that there are not bad things,
not just bad information, but actual malware. There have been researchers who have done a proof of
concept to show that malware could be introduced through a machine learning model. So we have to
be able to look for those kinds of things. Also, in many cases, we're going to use APIs as a way
to communicate out to other services--maybe to communicate with a model if we're not hosting
it directly ourselves. That API path could also be a path where an attacker could come in and
introduce some sort of error that then affects our system. Other things--if we use plug-ins that
elevate privilege, well, I need to be concerned about the privilege escalation that can occur and
limit what it's able to do so that it's not able to also make changes here, or put bad information
out, or even modify parts of my system. I want to make sure that I have a human in the loop to
guard against some of those kinds of things. And then as well, I've been talking about IT attacks,
but how about an IP attack, intellectual property? I need to be able to make sure that I'm not using
copyrighted works in one of these sources, because then I could be sued for using information that
I'm not really permitted to use. So what kinds of things should we be doing to guard
against this? Well, I need to scan my systems just like I do for other types of malware and
look for any type of harmful code that's being introduced into the system. I need to be
able to harden my system. That basically means I'm going to remove any unnecessary services or change
all the default user ID and passwords. All of those kinds of things. The system should be able
to withstand an attack. And I do that hardening by removing and limiting the attack surface as much
as I can. Then I use things like role-based access control. Again, trying to make sure that something
like this can't do more than it's supposed to be able to do. And I can do that on a per-user,
per-account basis. And then I look at my sources, vet those sources, make sure that they
are trustworthy, make sure that they don't contain copyrighted materials that will get me in trouble
legally and things like that. So while these are more IT, this is IP. Okay, we're going to take a look at how do we secure the
usage of our generative AI? We've looked at the data, the model--now, how it's used. And it turns
out one of the main ways that this system can be manipulated by a bad guy, a malicious
actor, is through what's called prompt injection. And prompt injection is in fact #1
on OWASP's Top 10 list of vulnerabilities for large language model applications, if you're familiar with that organization.
So prompt injection. In that case, we have a user that comes in and basically tries to jailbreak the
large language model or generative model that's behind this by issuing commands, having the system
do things that it was not intended to do. Trying to get around the guardrails that are put there.
It's almost a semantic attack. It's an attack with the words on the knowledge itself in many cases.
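As a toy illustration of the kind of guardrail involved, a naive input screen might look for known jailbreak phrasing before a prompt ever reaches the model. The patterns below are hypothetical examples; because this is a semantic attack, simple pattern matching is easy to rephrase around and could only ever be one layer among several.

```python
import re

# Hypothetical examples of jailbreak phrasing. Real attacks are easily
# reworded, so pattern matching alone is a weak guardrail.
SUSPICIOUS_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"pretend (you are|to be)",
    r"disregard your (rules|guardrails)",
]

def screen_prompt(prompt: str) -> bool:
    """Return True if the prompt looks like an injection attempt."""
    lowered = prompt.lower()
    return any(re.search(p, lowered) for p in SUSPICIOUS_PATTERNS)
```

A production system would layer this with semantic classifiers and output filtering rather than relying on keywords.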
So with that, an attacker can also sometimes bias the output, and therefore the information that
comes out of the system is now not as trustworthy as it should be. Another type of attack someone
may do is a denial of service. If I send enough complex requests into the system, then
maybe it starts bogging down. Because I can ask a question much faster than a system
can answer it. So we send in a bunch of these and now the system becomes overwhelmed and it can't
keep up anymore. And then the last one that I'll mention here is model theft. In this case, someone
might not be able to just break in and steal your model if you've done a good job up here in the
storing of your data and securing all of that. But what if they just put a number of different
queries into the system and then take the output back, and do that again and again and again? They
can basically mine the model in order to get the information that they want. Then they could
potentially go build their own version of this. And now they've essentially stolen your model in
the process. So what can we do to guard against some of these threats? Well, one of the things we
should do is monitor. In this case, monitor the inputs into the system and put guardrails around
those. We'll never be able to put all the kinds of semantic guardrails that we need to make sure
that someone can never get in. But we need to at least try and do the best job we can. There also
will probably be a new class of tools that we're starting to see emerge. One example of this is
machine learning detection and response. We've done detection and response in security
for a long time, but this is something specifically built for machine learning, maybe even generative
models. This is going to be a new emerging area that we can start looking at. And then finally,
some of our fundamentals: using a security information and event management (SIEM) system, or a SOAR,
a security orchestration, automation and response system, to monitor all of this so that I can
see the abnormal inputs. I can see if the system is under duress, if it's being overloaded, and
things like that, or if someone's carrying data out. I could use some of those types of capabilities in
order to be aware. So there you have a way that you can secure the usage as well.
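One concrete control that helps with both the denial-of-service and model-theft scenarios above is per-user rate limiting with volume monitoring. A minimal sketch, assuming a sliding one-minute window and a hypothetical request threshold:

```python
import time
from collections import defaultdict, deque

class UsageMonitor:
    """Sliding-window request counter per user: throttles bursts (DoS)
    and flags sustained high volume (possible model extraction)."""

    def __init__(self, max_per_minute: int = 30, now=time.monotonic):
        self.max_per_minute = max_per_minute  # hypothetical threshold
        self.now = now                        # injectable clock for testing
        self.history = defaultdict(deque)     # user -> request timestamps

    def allow(self, user: str) -> bool:
        """Record a request; return False if the user exceeds the window limit."""
        t = self.now()
        window = self.history[user]
        window.append(t)
        # Drop timestamps older than 60 seconds.
        while window and t - window[0] > 60:
            window.popleft()
        return len(window) <= self.max_per_minute
```

A real deployment would feed refusals and per-user volume statistics into the monitoring systems described above, so abnormal query mining shows up as an alert rather than going unnoticed.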
Okay, now for the big reveal of the two elements I told you I would talk about at the end. AI, generative AI,
does not exist in a vacuum. It runs on IT systems. Therefore, we have an infrastructure underneath
all of this--traditional computers. And the way we do security for those is something that I've
talked about before. That's what supports all of these systems and makes it possible for them to do
what they do. And what I've discussed before is this thing called the CIA Triad: confidentiality,
integrity and availability. Those are the concerns that we have in securing an IT system. Those
concerns don't go away just because now we move to an AI system. We have to do everything we've
always done, plus a little bit more. And that little bit more is all of these kinds of things
that I've been talking about in the video. But we can't ignore the fundamentals of securing the
infrastructure itself and doing the basic blocking and tackling that we always have to do on every IT
system. And then finally, the last element here is one of governance. Not so much a security concern,
but it's a big concern for the functional operation and correct results of the system.
I need to be able to direct, manage, and monitor how the generative AI works. I need to ensure that
it's fair, that it's not biased, that the model doesn't drift over time because someone has
introduced some incorrect information in it. I need to be able to ensure that that's not
occurring. I need to be able to keep up with regulatory compliance and requirements issues.
And ultimately make sure that the system operates ethically. So I need a governance layer here as
well. Put all of these things together and this is a security framework for generative AI. Now,
to summarize: what I've talked about in another video is using
AI to create better security. And there are a lot of possibilities of what generative AI can
do in order to make us better. Make this a force multiplier for doing the best cybersecurity we've
ever done. And then what has been the subject of this video is how we can use security to make sure
that the AI is secure. So it's AI for security and security for AI. If we get all of that right,
then we win. Thanks for watching. If you found this video interesting and would like to learn
more about cybersecurity, please remember to hit like and subscribe to this channel.