Nightfall AI is excited to announce a new generation of detectors powered by generative AI (GenAI). Read on to learn more about recent advancements in our PII, PHI, secrets, and image detectors, as well as how they stack up against competitors like AWS Comprehend, Google DLP, and Microsoft Purview.
Nightfall’s data science team started by studying over 5,000 data samples for each of our top detectors, including our detectors for:
- PII: Name, date of birth, street address, email address, phone number, social security number (SSN), driver’s license number, Vehicle Identification Number (VIN), US Individual Taxpayer Identification Number (ITIN)
- PCI: Bank routing number, IBAN code, SWIFT code
- PHI: ICD10 Code, ICD10 Description, National Provider Identifier (NPI), Medicare Beneficiary Identifier (MBI), US Health Insurance Claim Number, FDA Drug Name, FDA Drug Code
- Secrets: API key, database connection string, password
- Images: Passport, driver’s license, social security card, credit card
For a full overview of Nightfall’s detectors, visit our comprehensive detector glossary.
Let's take a closer look at Nightfall’s findings overall, as well as for PII, PCI, PHI, secrets, and images use cases.
Nightfall’s GenAI detectors display 2x greater precision compared to AWS Comprehend, Google DLP, and Microsoft Purview.
How does this measure of precision affect a security team’s experience? In short, higher precision translates to fewer false positives, and consequently, fewer alerts for security teams to manage. In this case, a 2x increase in precision corresponds to a 4x decrease in false positive alerts.
Nightfall also found a direct relationship between quantity of alerts and operational costs; for example, a 4x reduction in alerts leads to a 4x reduction in operational costs for security teams.
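To make the link between precision and false positives concrete, here is a minimal arithmetic sketch. It relies only on the standard definition of precision, TP / (TP + FP). The true-positive count and the baseline precision of 1/3 are hypothetical values chosen for illustration; they are not taken from Nightfall's data, and the exact reduction factor depends on the baseline you start from.

```python
from fractions import Fraction


def false_positives(true_positives: int, precision: Fraction) -> Fraction:
    """Return the false-positive count implied by a given precision.

    precision = TP / (TP + FP)  =>  FP = TP * (1 - precision) / precision
    """
    if not 0 < precision <= 1:
        raise ValueError("precision must be in (0, 1]")
    return true_positives * (1 - precision) / precision


# Hypothetical inputs for illustration only (not Nightfall's measurements).
TRUE_POSITIVES = 100
baseline_precision = Fraction(1, 3)
improved_precision = 2 * baseline_precision  # a 2x gain in precision

fp_before = false_positives(TRUE_POSITIVES, baseline_precision)
fp_after = false_positives(TRUE_POSITIVES, improved_precision)
reduction = fp_before / fp_after

print(f"False positives before: {fp_before}")
print(f"False positives after:  {fp_after}")
print(f"Reduction factor:       {reduction}x")
```

With these assumed inputs, doubling precision from 1/3 to 2/3 cuts false positives from 200 to 50 for the same 100 true findings, which is a 4x reduction. A different baseline would give a different factor.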
Nightfall’s “supported” findings only cover the detectors that compare directly to Nightfall’s. We found discrepancies in coverage in these three use cases:
- Secrets: AWS and Google only scan for API keys for their own respective platforms. Nightfall, on the other hand, scans for API keys for many different platforms.
- PHI: AWS and Google don’t offer a specialized PHI detector, and thus were not included in this finding. While Microsoft does have a HIPAA compliance detector, it did not respond to Nightfall’s sample data.
- Images: Google DLP and Microsoft Purview use Optical Character Recognition (OCR) to scrape images for text. On the other hand, Nightfall uses an image classification model to identify sensitive documents at a glance by recognizing the documents’ format.
PII and PCI
Nightfall measured our PII and PCI detection using a combination of six detectors, four of which are powered by a Convolutional Neural Network (CNN). Nightfall uses LLM-generated embeddings from this CNN to more precisely evaluate the context surrounding possible findings. As a result, Nightfall’s PII detection is 1.5x more precise, and Nightfall’s PCI detection is 2x more precise than that of AWS, Google, and Microsoft.
PHI
HIPAA defines PHI as health data (such as a diagnosis or drug prescription) that can be traced to a uniquely identifiable individual. With this in mind, detectors need to be able to map the relationships between patients and their health data. For instance, if a detector only detects ICD10 diagnosis codes, but can’t tie them to unique individuals, then the detector will present a high volume of false positive PHI findings.
Nightfall leverages a powerful transformer model to map health information to specific individuals—ensuring that only true PHI is flagged. AWS and Google offer detectors for some of the individual entities needed to detect PHI; however, they can’t be directly compared to Nightfall’s PHI detector, which considers a combination of PII and PHI.
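The false-positive problem described above can be illustrated with a deliberately naive sketch. It is a rule-based stand-in, not Nightfall's transformer model: it flags a record as possible PHI only when a diagnosis-code-shaped string appears alongside a personal identifier. Both regular expressions are simplified assumptions (real ICD-10 codes and identifiers have more structure), and the sample records are made up.

```python
import re

# Simplified shapes, assumed for illustration only.
ICD10_PATTERN = re.compile(r"\b[A-Z][0-9]{2}(?:\.[0-9A-Z]{1,4})?\b")
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_phi_candidates(records):
    """Return (record, codes) pairs where health data co-occurs with an identifier."""
    flagged = []
    for text in records:
        codes = ICD10_PATTERN.findall(text)
        has_identifier = SSN_PATTERN.search(text) is not None
        # A diagnosis code alone is not PHI; it must be tied to a person.
        if codes and has_identifier:
            flagged.append((text, codes))
    return flagged


# Made-up sample records.
records = [
    "Patient SSN 123-45-6789, diagnosis E11.9",
    "Billing reference table: E11.9 = type 2 diabetes",
    "Contact SSN 987-65-4321 for account update",
]

flagged = find_phi_candidates(records)
for text, codes in flagged:
    print(f"Possible PHI: {text!r} (codes: {codes})")
```

Only the first record is flagged. The second contains a diagnosis code with no individual attached, and the third contains an identifier with no health data, so a detector that matched on codes alone would have raised a false positive on the reference table.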
Secrets
Nightfall’s transformer model, which has 125 million parameters, has been fine-tuned to detect secrets in even the most complex edge cases. Nightfall’s secrets detection is 2x more precise than the competition’s. However, it’s worth noting that AWS and Google only scan for API keys for their own respective platforms, whereas Nightfall scans for API keys across many platforms.
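The coverage difference between single-platform and multi-platform key scanning can be sketched with a small pattern-based example. This is not how Nightfall's model works; it is a simplified illustration in which `KEY_PATTERNS` and `scan_for_keys` are hypothetical names, the patterns cover only two commonly seen key prefixes, and the sample values are fake.

```python
import re

# Simplified key shapes based on commonly seen prefixes (illustrative only).
KEY_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_personal_access_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}


def scan_for_keys(text, platforms=None):
    """Return (platform, match) pairs for the selected platforms (all by default)."""
    selected = platforms if platforms is not None else list(KEY_PATTERNS)
    findings = []
    for platform in selected:
        for match in KEY_PATTERNS[platform].findall(text):
            findings.append((platform, match))
    return findings


# Fake, non-functional sample values.
sample = "aws=AKIAIOSFODNN7EXAMPLE github=ghp_" + "a" * 36

single_platform = scan_for_keys(sample, platforms=["aws_access_key_id"])
multi_platform = scan_for_keys(sample)

print(f"Single-platform scan found {len(single_platform)} key(s)")
print(f"Multi-platform scan found {len(multi_platform)} key(s)")
```

A scanner limited to one platform's key format misses the second credential in the sample, while the multi-platform scan reports both.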
Images
Nightfall’s image classification model is 3x more precise than Optical Character Recognition (OCR), which is used by both Google and Microsoft. Image classification models rely on a much larger corpus of data, and as a result, can contextualize images based on their overall format as opposed to just the text they contain. This means that Nightfall can recognize sensitive documents even when images are blurry, grainy, or otherwise difficult to read.
At Nightfall, we’re dedicated to protecting your sensitive data, wherever it is in the cloud. Our latest advancements in detection are a major step toward keeping businesses more secure, while also helping them save on both time and operational costs. And the best part? These four new detectors are just the beginning.
Want to see Nightfall’s latest detectors in action? Click here to schedule a free demo.