The Ultimate Guide to Security for GenAI
The Nightfall Team
June 17, 2024


From customer service chatbots to enterprise search tools, companies are fervently seeking new ways to leverage both first-party and third-party LLMs to stay on the cutting edge of innovation. However, in this mad scramble, it’s all too easy to lose sight of data stewardship. Read on to learn about the security risks posed by building and deploying generative AI (GenAI) apps, as well as the best practices that developers need to ensure model safety.

What are the risks of deploying GenAI applications?

There are three key risks that developers need to consider concerning AI models’ inputs, training data, and outputs. Let’s delve into each of these risks with specific examples.


Accidentally inputting sensitive data

Human error plays a huge part in accidental data leaks, as well as data breaches. (In fact, Verizon found that nearly three quarters of all data breaches involved some sort of “human element.”) In line with this, AI models are often deployed in environments where employees and customers alike can accidentally input sensitive data. For this reason, it’s crucial for developers to identify any downstream risks to AI models, especially if models rely on third-party LLMs or data annotation platforms.

Consider this example: A customer service agent is using a GenAI-powered chatbot tool to respond to customers more quickly and efficiently. However, one customer “over-shares” by including their credit card number in their message. Without certain guardrails (like content inspection and filtering), the AI model could store or train on that data, leading to a slew of security and privacy issues.
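To make the “content inspection and filtering” guardrail concrete, here is a minimal sketch of how a chat pipeline might redact likely credit card numbers before a message is stored or sent to a model. This is a simplified illustration (a regex plus a Luhn checksum), not a substitute for a production-grade detection service; all function names are hypothetical.

```python
import re

def luhn_valid(number: str) -> bool:
    """Luhn checksum, used to weed out random digit strings that merely
    look like card numbers."""
    digits = [int(d) for d in number][::-1]
    total = sum(digits[0::2]) + sum(sum(divmod(d * 2, 10)) for d in digits[1::2])
    return total % 10 == 0

# 13-16 digits, optionally separated by spaces or hyphens
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def redact_card_numbers(message: str) -> str:
    """Replace likely credit card numbers with a placeholder before the
    message reaches the model or its training corpus."""
    def _redact(match: re.Match) -> str:
        digits = re.sub(r"[ -]", "", match.group())
        return "[REDACTED CARD]" if luhn_valid(digits) else match.group()
    return CARD_PATTERN.sub(_redact, message)
```

A message like “My card is 4111 1111 1111 1111” would come out with the number replaced by `[REDACTED CARD]`, while ordinary digit strings that fail the checksum pass through untouched.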

If sensitive data is indeed exposed to an AI model, a number of downstream risks could arise, most notably data reconstruction attacks or training data extraction attacks. In this particular example, a threat actor could leverage the chatbot model’s outputs to infer some of its training data, and uncover the customer’s credit card number in the process. In this worst-case scenario, not only is the customer’s data exposed, but the company is also at risk of noncompliance with leading standards like PCI-DSS.

Exposing sensitive training data

AI models require incredibly high volumes of data to perform as intended, with minimal hallucinations, inconsistencies, or biases. But what data are you inputting into your model? And can you train your model with that data? These are both vital questions for developers to consider, especially since there’s a high likelihood that a customer or employee will accidentally input sensitive data at some point in time.

For instance, if sensitive data isn’t properly filtered, and is used to train a first-party LLM, the company may be found noncompliant, leading to legal issues, costly fines, and loss of customer trust.

Failing to scale access and permissions

It’s not only a model’s training data and inputs that matter—developers also have to consider the model’s inferences and outputs. For instance, which employees have access to a given model’s outputs? And is that access scoped appropriately throughout the company?

Imagine that a third-party contractor wants to use their employer’s AI-powered enterprise search tool to conduct research for a project. However, that search tool was trained on a large corpus of company data, including confidential meeting notes, budget spreadsheets, and other sensitive information—which is far beyond what a contractor needs to know to perform their duties.

At this juncture, the least privilege principle puts developers at a difficult crossroads: Limit employee access to GenAI search tools, or curb the functionality of said search tools by removing large swathes of training data.

What are best practices for building and consuming AI models?

With the above risks in mind, it’s readily apparent that developers must build and maintain strict trust boundaries and guardrails to protect AI models’ inputs, training data, and outputs. So what are the best steps to take? First, let’s talk about an essential tool: Content filtering.

Filtering sensitive data via APIs or SDKs

Content filtering is an essential safeguard that intercepts sensitive data before it’s submitted as an input to third-party LLMs or included in a model’s training dataset. This technique is crucial for avoiding data exposure, data breaches, and noncompliance. And the best part? When done right, content filtering should have no impact on model performance, as models don’t need sensitive data in order to generate a cogent response.

Now, let’s take a look at content filtering in action through the lens of the Nightfall Developer Platform. After integrating with Nightfall’s lightweight APIs, developers can take the following steps to set up programmatic content filtering.

  1. Choose an out-of-the-box detection rule, or customize your own inline detection rule to pinpoint and remediate sensitive data by deleting, redacting, or masking it. You can also automate prompt redaction by specifying a redaction config.
  2. Send an outgoing prompt in a request payload to the Nightfall API text scan endpoint.
  3. Review the response from Nightfall’s API, which will contain any sensitive findings as well as your newly redacted prompt.
  4. Submit the redacted prompt to your AI model via the API or SDK client.

Here’s how these steps might look in a real-world example...
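The sketch below walks through those steps in Python. The endpoint URL, payload field names, and detector identifiers are based on Nightfall’s public v3 API as commonly documented, but should be verified against the current API reference before use; `NIGHTFALL_API_KEY` is assumed to be set in the environment.

```python
import json
import os
import urllib.request

NIGHTFALL_SCAN_URL = "https://api.nightfall.ai/v3/scan"  # verify against current docs

def build_scan_request(prompt: str) -> dict:
    """Step 1-2: build a request that asks Nightfall to find credit card
    numbers in the prompt and substitute a redaction phrase for them."""
    return {
        "payload": [prompt],
        "config": {
            "detectionRules": [{
                "name": "Redact card numbers",
                "logicalOp": "ANY",
                "detectors": [{
                    "detectorType": "NIGHTFALL_DETECTOR",
                    "nightfallDetector": "CREDIT_CARD_NUMBER",
                    "displayName": "Credit Card Number",
                    "minConfidence": "LIKELY",
                    "minNumFindings": 1,
                    "redactionConfig": {
                        "substitutionConfig": {"substitutionPhrase": "[REDACTED]"}
                    },
                }],
            }]
        },
    }

def redact_prompt(prompt: str) -> str:
    """Steps 2-3: send the prompt to the text scan endpoint and return
    the redacted version (or the original if nothing was found)."""
    req = urllib.request.Request(
        NIGHTFALL_SCAN_URL,
        data=json.dumps(build_scan_request(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['NIGHTFALL_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        body = json.load(resp)
    redacted = body.get("redactedPayload", [None])[0]
    return redacted or prompt
```

Step 4 is then simply submitting the return value of `redact_prompt` to your LLM provider’s API or SDK client in place of the raw user input.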


With the Nightfall Developer Platform in place, developers can ensure that sensitive data isn’t exposed to third-party LLMs or other downstream services, such as annotation platforms.

Building a firewall for AI

For developers who rely on third-party LLMs to build their AI tools, it’s essential to consider the shared responsibility model of AI. This model posits that while any given LLM provider will have some basic safeguards in place, it’s up to each individual company to secure their data and workflows involving said LLM. This is precisely where a “firewall” for AI may prove useful, both for peace of mind, as well as for ensuring continuous compliance with leading standards.

In short, a “firewall” for AI is a third-party service—like Nightfall—that moderates interactions with LLM providers. AI firewalls offer two key benefits:

  1. Monitoring inputs for sensitive data, and scrubbing that sensitive data before it can be transmitted to a third-party LLM. By doing this, services like Nightfall can prevent downstream data leaks that could result from LLM overfitting, lack of filtering, or other errors involved in model training.
  2. Identifying and stopping malicious or harmful outputs that could result from prompt injection attacks and other instances of model abuse. Nightfall’s AI-powered detection platform can accurately understand the context surrounding prompts and responses, which not only helps to identify deleterious prompts, but also to do so with fewer false positive alerts.

Now, how does one deploy an AI firewall? Solutions typically come in the form of APIs, SDKs, or reverse proxies. At Nightfall, we recommend taking the API route, as it offers the following benefits:

  • Easy installation and extensible workflows that fit seamlessly with third-party LLM providers like OpenAI
  • Faster and more accurate content inspection using AI, as opposed to rule-based approaches, which require reverse proxies and often lead to workflow latencies
  • Pain-free scalability that can grow alongside model usage, all without compromising security or performance

In short, the Nightfall Developer Platform adds an incredibly flexible yet highly effective layer of protection and compliance for apps and tools that leverage third-party LLMs.

Applying the least privilege principle

As mentioned above, the least privilege principle limits employee access to data on a strictly need-to-know basis. When it comes to training, building, and deploying AI, companies should consider implementing the following best practices to ensure that sensitive data doesn’t fall into the wrong hands.

  • Create policies that outline what data can and can’t be used to train AI models, while taking data sensitivity, relevance, and compliance requirements into account. Enterprise-grade DLP tools like Nightfall provide the option to either adopt a pre-configured detection rule or create an inline detection rule to address specific compliance needs.
  • Enforce strict role-based access controls to ensure that only essential employees have access to training data.
  • Conduct regular audits of training datasets and data sources to ensure that they don’t contain sensitive data.
  • Keep detailed records of data sources that are used in training in order to ensure audit readiness.
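The role-based access control practice above can be enforced at query time as well as at training time. Here is a simplified sketch of filtering enterprise-search results by role before a GenAI tool summarizes them; the role names and classification labels are hypothetical.

```python
# Illustrative role-to-data-class policy; real deployments would pull
# this from an identity provider or policy engine.
ROLE_POLICY = {
    "employee":   {"public", "internal"},
    "finance":    {"public", "internal", "financial"},
    "contractor": {"public"},
}

def can_surface(role: str, doc_classification: str) -> bool:
    """Return True only if the caller's role is cleared for the
    document's sensitivity class (deny by default)."""
    return doc_classification in ROLE_POLICY.get(role, set())

def filter_results(role: str, results: list[dict]) -> list[dict]:
    """Drop search hits the caller isn't cleared to see before the
    GenAI search tool summarizes them."""
    return [r for r in results if can_surface(role, r["classification"])]
```

With this check in place, a contractor querying the search tool would never receive the confidential budget spreadsheet from the earlier example, even if it sits in the tool’s index.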

These best practices serve a twofold purpose: Protecting sensitive company and customer data from exposure, while also enhancing data governance throughout the model training and deployment process.

TL;DR: How can developers build and deploy AI models safely in an enterprise setting?

AI development is becoming an essential part of technological innovation, but security teams and developers alike must ensure that they are embracing that innovation with the proper safeguards in place. In order to avoid the pitfalls of data leaks, data exposure, and noncompliance, companies should implement practices that monitor and protect AI models’ inputs, training data, and outputs.

See how Nightfall can intercept and protect sensitive data in real time, before it’s sent to AI models. Want to test drive the Nightfall Developer Platform for yourself? Schedule your personalized demo.

