Worried About Leaking Data to LLMs? Here’s How Nightfall Can Help.

Updated: 5/21/24

Since the widespread launch of GPT-3.5 in November of last year, we’ve seen a meteoric rise in generative AI (GenAI) tools, along with an onslaught of security concerns from both countries and companies around the globe. Tech leaders like Apple have warned employees against using ChatGPT and GitHub Copilot, while other major players like Samsung have even gone so far as to completely ban GenAI tools.

Why are companies taking such drastic measures to prevent data leaks to LLMs? Follow along for a bird’s-eye view of what can happen to data after it’s sent to an LLM provider like OpenAI.

Data can be stored indefinitely on LLM servers

Say you’re a medical professional who wants to use ChatGPT to summarize a patient visit, but you accidentally include the patient’s name and medical diagnosis code in your prompt. As soon as you submit that prompt, it is transmitted to OpenAI’s servers. And since OpenAI doesn’t automatically filter for sensitive data, it’s likely that your patient’s PII and PHI will be stored and used by OpenAI for an indeterminate length of time. OpenAI is deliberately vague about how long it holds on to prompts, stating that it keeps customer data “as long as we need in order to provide our service.” You might try to solve the problem by deleting your prompt from your ChatGPT history. However, you’d find that it’s difficult, if not impossible, to do so. This inability to access and redact the patient’s data poses a risk not only to your patient, but also to your practice’s HIPAA compliance.

Data can be used to train LLMs

If OpenAI has your patient’s data on its servers, it’s likely a matter of time before it uses that data to “fine-tune” its services. OpenAI has some measures in place to redact PII from training data “where feasible”; however, it’s important to note that data scrubbing techniques only work to an extent, and can still result in leaked PII down the line. According to a 2023 paper, “[Data] scrubbing is imperfect and must balance the trade-off between minimizing disclosure and preserving the utility of the dataset.” In other words, accuracy and security can be diametrically opposed when it comes to training LLMs, and it’s possible for threat actors to exploit this.

Data can be reconstructed by threat actors

Just as we memorize data to teach ourselves about complex subjects, LLMs do the same—even to the extent that they “memorize entire input samples, including seemingly irrelevant information” like PII and PHI. The team behind a 2021 paper was able to leverage this vulnerability to “extract hundreds of verbatim text sequences” from GPT-2’s training data.

In its latest iteration, GPT-4 has been trained to reject requests involving data extraction or reconstruction. With this and other measures in place, some may point out that it’s unlikely for ChatGPT to regurgitate sensitive data to future users. However, recent events involving Windows and GitHub Copilot have shown that threat actors can find workarounds as long as they provide enough reference material.

To close the loop on our example, this means that it’s entirely possible for a threat actor to access the patient name and medical diagnosis that you submitted in your original ChatGPT prompt.

So what’s the solution?

If we take LLM storage, training, and leakage into account, we can see that there are three ways for users and developers to mitigate risk:

Enhancing security teams’ visibility into GenAI tools
Remediating sensitive data before it’s transmitted and stored in LLM servers
Filtering out sensitive data so that it isn’t used to train new LLMs
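
To make the second and third items above more concrete, here is a minimal Python sketch of redacting sensitive values from text before it leaves your environment, whether that text is a prompt or a training record. It is an illustration only and not Nightfall’s implementation: the pattern names, the regular expressions, and the redact function are all hypothetical, and simple pattern matching like this will miss a great deal (patient names, for example, generally need a trained detector rather than a regex).

```python
import re

# Hypothetical, deliberately simple patterns for illustration only.
# Real detection needs far broader coverage and validation logic.
PATTERNS = {
    # Rough shape of an ICD-10 diagnosis code, e.g. "E11.9"
    "DIAGNOSIS_CODE": re.compile(r"\b[A-TV-Z][0-9][0-9A-Z](?:\.[0-9A-Z]{1,4})?\b"),
    # US Social Security number written as 123-45-6789
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # Basic email address
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def redact(text: str) -> tuple[str, list[str]]:
    """Replace each match with a [LABEL] placeholder.

    Returns the redacted text and the labels that were found,
    so a security team can log what was caught without storing the values.
    """
    findings = []
    for label, pattern in PATTERNS.items():
        text, count = pattern.subn(f"[{label}]", text)
        if count:
            findings.append(label)
    return text, findings


prompt = (
    "Summarize the visit: patient reports fatigue, "
    "diagnosis E11.9, contact jane.doe@example.com."
)
safe_prompt, findings = redact(prompt)
print(safe_prompt)
print(findings)
```

Only safe_prompt would then be sent to the LLM or written to a training set, while findings gives the security team a record of what was caught.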

Starting today, Nightfall AI is proud to offer all of these capabilities as part of our latest product offering: Nightfall for GenAI. Nightfall for GenAI consists of three products, each of which provides a different dimension of data loss prevention (DLP):

Nightfall for ChatGPT plugs into end-users’ browsers to monitor, detect, and redact sensitive data before it’s sent to OpenAI.
Nightfall Sensitive Data Protection remediates sensitive data stored in cloud-based apps (including those that have third-party AI sub-processors).
Nightfall’s Firewall for AI provides a robust API library to ensure that no PII, PCI, or PHI is included in LLM training data.

Nightfall for GenAI is our latest step towards Nightfall’s overarching mission: to equip users with the tools and security know-how they need to keep their data safe in the cloud.

Want to explore Nightfall’s latest cloud DLP offerings for yourself? Try our free 14-day trial of Nightfall for ChatGPT, or schedule a demo to learn more about Nightfall Sensitive Data Protection and Nightfall's Firewall for AI.
