
Nightfall’s new GenAI detectors are revolutionizing the cloud DLP landscape. Here’s how.

by The Nightfall Team, December 14, 2023

Intro

Nightfall AI is excited to announce a new generation of detectors powered by generative AI (GenAI). Read on to learn more about recent advancements in our PII, PHI, secrets, and images detectors—as well as how they stack up against competitors like AWS Comprehend, Google DLP, and Microsoft Purview.

Parameters

Nightfall’s data science team started by studying over 5,000 data samples for each of our top detectors, including our detectors for:

  • PII: Name, date of birth, street address, email address, phone number, social security number (SSN), driver’s license number, Vehicle Identification Number (VIN), US Individual Taxpayer Identification Number (ITIN)
  • PCI: Bank routing number, IBAN code, SWIFT code
  • PHI: ICD10 Code, ICD10 Description, National Provider Identifier (NPI), Medicare Beneficiary Identifier (MBI), US Health Insurance Claim Number, FDA Drug Name, FDA Drug Code
  • Secrets: API key, database connection string, password
  • Images: Passport, driver’s license, social security card, credit card

For a full overview of Nightfall’s detectors, visit our comprehensive detector glossary.

Results

Let's take a closer look at Nightfall’s findings overall, as well as for PII, PCI, PHI, secrets, and images use cases.

Detection precision for Nightfall AI, Google DLP, AWS Comprehend, and Microsoft Purview

Overall

Nightfall’s GenAI detectors display 2x greater precision compared to AWS Comprehend, Google DLP, and Microsoft Purview.

How does this measure of precision affect a security team’s experience? In short, higher precision translates to fewer false positives and, consequently, fewer alerts for security teams to manage. In this case, a 2x increase in precision corresponds to a 4x decrease in false positive alerts.

Nightfall also found a direct relationship between quantity of alerts and operational costs; for example, a 4x reduction in alerts leads to a 4x reduction in operational costs for security teams.
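To make the link between precision and false positives concrete, here is a minimal sketch based on the standard definition precision = TP / (TP + FP). The baseline precision of 1/3 and the count of 100 true findings are hypothetical values chosen for illustration; the post does not publish the underlying numbers. The sketch also assumes the number of true positives stays constant between the two detectors.

```python
def false_positives(true_positives: float, precision: float) -> float:
    """Return the false positive count implied by precision = TP / (TP + FP)."""
    if not 0 < precision <= 1:
        raise ValueError("precision must be in the range (0, 1]")
    return true_positives * (1 / precision - 1)


def fp_reduction(true_positives: float, old_precision: float, new_precision: float) -> float:
    """Return how many times fewer false positives the new precision produces."""
    old_fp = false_positives(true_positives, old_precision)
    new_fp = false_positives(true_positives, new_precision)
    return old_fp / new_fp


# Hypothetical numbers: 100 true findings, precision doubling from 1/3 to 2/3.
TRUE_FINDINGS = 100
ratio_a = fp_reduction(TRUE_FINDINGS, 1 / 3, 2 / 3)

# The same 2x gain from a different hypothetical baseline (0.4 to 0.8)
# yields a different reduction, so the ratio depends on where you start.
ratio_b = fp_reduction(TRUE_FINDINGS, 0.4, 0.8)

print(f"1/3 -> 2/3: about {ratio_a:.1f}x fewer false positives")
print(f"0.4 -> 0.8: about {ratio_b:.1f}x fewer false positives")
```

Under these assumptions, doubling precision from 1/3 to 2/3 reproduces a 4x drop in false positives, while the same doubling from a different baseline gives a different figure.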

Supported

Nightfall’s “supported” findings only cover the detectors that compare directly to Nightfall’s. We found discrepancies in coverage in these three use cases:

  • Secrets: AWS and Google only scan for API keys for their own respective platforms. Nightfall, on the other hand, scans for API keys for many different platforms.
  • PHI: AWS and Google don’t offer a specialized PHI detector, and thus were not included in this finding. While Microsoft does have a HIPAA compliance detector, it did not respond to Nightfall’s sample data.
  • Images: Google DLP and Microsoft Purview use Optical Character Recognition (OCR) to scrape images for text. On the other hand, Nightfall uses an image classification model to identify sensitive documents at a glance by recognizing the documents’ format.

PII and PCI

Nightfall measured its PII and PCI detection using a combination of six detectors, four of which are powered by a Convolutional Neural Network (CNN). Nightfall uses LLM-generated embeddings from this CNN to more precisely evaluate the context surrounding possible findings. As a result, Nightfall’s PII detection is 1.5x more precise, and Nightfall’s PCI detection is 2x more precise, than that of AWS, Google, and Microsoft.

PHI

HIPAA defines PHI as health data (such as a diagnosis or drug prescription) that can be traced to a uniquely identifiable individual. With this in mind, detectors need to be able to map the relationships between patients and their health data. For instance, if a detector only detects ICD10 diagnosis codes, but can’t tie them to unique individuals, then the detector will present a high volume of false positive PHI findings.

Nightfall leverages a powerful transformer model to map health information to specific individuals—ensuring that only true PHI is flagged. AWS and Google offer detectors for some of the individual entities needed to detect PHI; however, they can’t be directly compared to Nightfall’s PHI detector, which considers a combination of PII and PHI.
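The difference between flagging every diagnosis code and flagging only codes tied to a person can be illustrated with a toy rule-based sketch. This is not Nightfall’s transformer model: it swaps in simple regular expressions and a same-record co-occurrence rule, and the patterns, function name, and sample records are all hypothetical simplifications.

```python
import re

# Simplified patterns for illustration only; real ICD-10 codes and
# personal identifiers cover far more formats than these.
ICD10_RE = re.compile(r"\b[A-Z]\d{2}(?:\.\d{1,4})?\b")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_phi(records):
    """Return (record, codes) pairs where a diagnosis code co-occurs
    with a personal identifier in the same record."""
    findings = []
    for text in records:
        codes = ICD10_RE.findall(text)
        # A code alone is not PHI; require an identifier in the same record.
        if codes and SSN_RE.search(text):
            findings.append((text, codes))
    return findings


records = [
    "Billing reference: code E11.9 is type 2 diabetes.",  # code only
    "Patient SSN 123-45-6789, diagnosis E11.9.",          # code + identifier
]

code_only_hits = [r for r in records if ICD10_RE.search(r)]
linked_hits = find_phi(records)

print(len(code_only_hits), "records contain a diagnosis code")
print(len(linked_hits), "records link a code to an individual")
```

A detector that matches codes alone would alert on both records, while the linked rule alerts only on the record that ties the diagnosis to an individual.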

Secrets

Nightfall’s transformer model, which has 125 million parameters, has been fine-tuned to detect secrets in even the most complex edge cases. Nightfall’s secrets detection is 2x more precise than the competition’s. However, it’s worth noting that AWS and Google only scan for API keys for their own respective platforms, whereas Nightfall scans for API keys across many platforms.

Images

Nightfall’s image classification model is 3x more precise than Optical Character Recognition (OCR), which is used by both Google and Microsoft. Image classification models rely on a much larger corpus of data, and as a result, can contextualize images based on their overall format as opposed to just the text they contain. This means that Nightfall can recognize sensitive documents even when images are blurry, grainy, or otherwise difficult to read.

Nightfall's image classification model vs. OCR-based detection

Conclusion

At Nightfall, we’re dedicated to protecting your sensitive data, wherever it is in the cloud. Our latest advancements in detection are a major step toward keeping businesses more secure while also helping them save on both time and operational costs. And the best part? These four new detectors are just the beginning.

Want to see Nightfall’s latest detectors in action? Click here to schedule a free demo.
