Confluence DLP (Data Loss Prevention): The Essential Guide

The Atlassian ecosystem provides thousands of companies with the ability to collaborate remotely through powerful, feature-rich SaaS applications like Confluence. Over the last year, the rise of remote work has led many companies to host their internal information hubs on Confluence. As such tools become the norm across companies big and small, the amount of sensitive information stored in these systems will increase. This means that organizations need to prioritize minimizing the risk of exposure within cloud environments.

Read this online guide, for free, to learn how knowledge wikis like Confluence can increase the exposure risk for sensitive data like PHI, PII, passwords, and other secrets, and how you can mitigate this risk. You can also download this guide here.

How is sensitive data exposed in Confluence Cloud?

Typical uses for Confluence include hosting public documentation, such as bug-tracking pages for open-source projects or release notes. Development teams publish this information on purpose, but sensitive data can sometimes be inadvertently exposed to anyone who navigates to the correct URL.

Confluence sites tend to be open to public view more often than Jira sites. In a sample of 5,000 Jira Software Cloud sites, there were 273 Jira sites with publicly viewable issues and 1,214 Confluence sites with publicly viewable spaces. The larger number of viewable Confluence sites compared to Jira is likely explained by companies keeping documentation and release notes in a publicly viewable space.

Information leaked through public Confluence spaces has included customer portal passwords, customer requests, email conversations, meeting notes, and product order data — all of which can contain data that companies must protect.

How does Confluence Cloud introduce data security risks?

SaaS applications like Confluence allow for collaboration between a multitude of users. However, this high volume of activity, combined with the always-on nature of SaaS systems, can increase the risk that data security best practices aren’t followed. This can result in PII, credentials, secrets, and other sensitive information being exposed to the wrong parties.

To better understand how these attributes of SaaS applications interact to impact data security risk, watch the following video.

[youtube:8-PRyhZMcZI]

What types of data are at risk of exposure in Confluence pages and spaces?

Credentials & secrets are most frequently exposed in Confluence, due to its common use as an internal wiki for product and engineering teams. However, other types of unexpected sensitive data have been identified in Confluence by customers using Nightfall. Here are a few types of data that may be exposed in Confluence:

API keys & access tokens for third-party services (e.g. AWS, Stripe, Twilio)
Cryptographic keys (SSH, PGP, etc.)
Certificates (SSL, TLS, etc.)
Passwords and login credentials
Database credentials
UUIDs, cookies, etc.
Credit card numbers
Customer PII
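
To make the list above concrete, here is a minimal sketch of how a few of these data types can be spotted with pattern matching. It is an illustration only, not how Nightfall or any other product works: the function names (`find_sensitive`, `luhn_valid`) and the three patterns are assumptions chosen for the example. It relies on the publicly documented `AKIA`/`ASIA` prefix of AWS access key IDs, on PEM-style private key headers, and on the Luhn checksum used by payment card numbers.

```python
import re

# Illustrative patterns only; production detectors need far more context.
AWS_ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")
PRIVATE_KEY_HEADER = re.compile(
    r"-----BEGIN (?:RSA |EC |OPENSSH |PGP )?PRIVATE KEY(?: BLOCK)?-----"
)
# 13 to 19 digits, optionally separated by single spaces or dashes.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Double every second digit, counting from the rightmost digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_sensitive(text: str) -> list[tuple[str, str]]:
    """Return (label, matched_text) pairs for each finding in `text`."""
    findings = []
    for match in AWS_ACCESS_KEY_ID.finditer(text):
        findings.append(("aws_access_key_id", match.group()))
    for match in PRIVATE_KEY_HEADER.finditer(text):
        findings.append(("private_key", match.group()))
    for match in CARD_CANDIDATE.finditer(text):
        # The checksum filters out most random digit runs.
        if luhn_valid(match.group()):
            findings.append(("credit_card", match.group()))
    return findings
```

Calling `find_sensitive` on the text of a page returns a list of labelled matches that could then be reviewed. Patterns like these miss anything without a fixed format, such as free-form passwords or customer PII.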

Can I limit data security risk by managing permissions in Confluence?

Permissions settings in Confluence are only half the battle. While proper permissions settings can help reduce data exfiltration risk in Confluence, they are just one step you must take to protect sensitive information. Permissions best practices should include the following:

Include stakeholders as Confluence Admins. Identify stakeholders who will help manage your Confluence accounts and enforce proper user hygiene. Active admins will be your first line of defense in securing your spaces and enforcing data policies.

Understand how permissions levels and restrictions work together. Confluence provides permissions controls at multiple levels, and understanding how they interact is key to making sure that no unauthorized user can view, add, modify, export, or delete data in Confluence. Security leaders must ensure that the admins who have access to their Confluence admin console understand how permissions function in Confluence.

Monitor logs to track permissions changes across your spaces. Confluence provides product-specific audit logs that administrators within those respective services can access. Reviewing these logs can provide insight into who created, edited, or deleted a space, as well as who is making changes to groups and user permissions. This step will help you moderate Confluence spaces and the Atlassian organizations they’re part of.

Properly onboard and offboard members. Confluence allows Admins to securely onboard new team members by inviting only users whose email addresses belong to a designated domain. Organization admins who manage org-wide permissions should also be sure to remove users who have left the company.

Key takeaway: All paid instances of Confluence have three levels of permissions.

Global permissions are broad and site-wide
Space permissions uniquely apply to the space specified by an administrator (usually the space creator)
Page restrictions allow admins to restrict the view or editing of specified pages by specific groups or users.
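
As a rough mental model of how the three levels interact, the sketch below treats a page as viewable only when all three checks pass. This is a deliberate simplification, not Confluence's actual evaluation logic: it ignores groups, anonymous access, and edit permissions, and every name in it (`Space`, `Page`, `can_view`, `site_users`) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    title: str
    # An empty set means no view restriction is applied to the page.
    view_restricted_to: set[str] = field(default_factory=set)


@dataclass
class Space:
    key: str
    # Users who hold the space-level view permission.
    viewers: set[str] = field(default_factory=set)


def can_view(user: str, site_users: set[str], space: Space, page: Page) -> bool:
    """Simplified model: all three levels must allow the user."""
    has_global = user in site_users          # global (site-wide) permission
    has_space = user in space.viewers        # space permission
    passes_restriction = (                   # page restriction, if any
        not page.view_restricted_to or user in page.view_restricted_to
    )
    return has_global and has_space and passes_restriction
```

The point of the model is that a restriction at one level does not grant access at another: a user named in a page restriction still needs the space permission, and a user with the space permission can still be locked out of a restricted page.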

In addition to adopting good permissions practices, you need to invest in tools like data loss prevention that provide the visibility needed to monitor activity in Confluence and identify instances of inappropriate sharing of sensitive data.

What is data loss prevention (DLP) in Confluence?

Data loss prevention is an access control that ensures confidential information is kept on a need-to-know basis. DLP does this by:

Scanning content within messages and files to determine whether an unauthorized disclosure of business-critical information has occurred.
Providing automated remediation on the basis of your established data security policies.
Providing alerts and analytics that help organizations understand risk and employee behavior over time.

Organizations need tools like DLP to put controls in place that enforce data security best practices and prevent unauthorized parties from accessing documents and folders with sensitive information.

Why is data loss prevention (DLP) essential for protecting data in Confluence?

Since Confluence is document and media heavy, it can be hard to detect when and where sensitive info could be leaked within the platform. This can lead to increased data exposure risk as well as data compliance violations. There are no mature cloud-native DLP products for Confluence. Atlassian does not have a native DLP product, and many CASBs cannot support Atlassian apps.

While some CASBs purport to connect to Confluence, they can only see data in transit and are unable to find pre-existing sensitive content.

Nightfall is the only Confluence DLP solution on the market that:

Enables customers to scan their entire existing Confluence instance.
Lets customers flexibly configure multiple different DLP policies.
Applies policies to particular locations (Spaces and Pages) within Confluence.
Supports multiple detection rules that specify whether data is deemed sensitive in any instance, or only in combination with other data.
Provides a comprehensive approach where appropriate data sharing won’t be flagged, reducing false positives and noise.

What is Nightfall DLP?

Nightfall is a platform to discover, classify and protect sensitive data across cloud SaaS & cloud infrastructure.

Nightfall supports compliance efforts with a number of industry standards and regulations like PCI DSS, GDPR, HIPAA, CCPA, and more.
Nightfall works by continuously monitoring data flowing in and out of data silos and classifying that data with machine learning. Data marked as sensitive can be automatically quarantined, deleted, or redacted with workflows.
Deploy a targeted remediation strategy with comprehensive, context-rich scan results that contain direct links to policy violations within Confluence.
Nightfall integrates with Confluence via OAuth 2.0, meaning you can get started immediately. Integrate in seconds, then tell Nightfall which Spaces and Pages should be scanned in real time for API keys, encryption keys, passwords, and more.

How does Nightfall differ from existing platforms?

Nightfall DLP is the industry’s first cloud-native data loss prevention solution designed to discover, classify, and protect sensitive data in cloud environments.

Leverage dozens of detectors, including Nightfall’s best-in-class core of machine-learning trained detectors, to detect a wide range of sensitive content types such as standard PII, names, ID numbers, financial information, addresses, credentials, secrets, custom regexes and word lists, and more.

Discover sensitive data across all Confluence Spaces (including Personal Spaces), Pages, Blog Posts, Attachments, Comments and Archived items.
Secure your Confluence with DLP scans for a wide range of file types, including plaintext, Office documents (Google Office, Open Office, Microsoft Office), PDF, HTML, XML, all popular image file types (JPEG, PNG, etc.), and compressed files (ZIP, TAR, etc.).

We designed Nightfall to address the low accuracy of tools that rely on traditional methods like regexes or entropy thresholds. Nightfall can be used to discover and protect against both PII and credential leakage across all your Confluence pages and spaces.
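
For readers unfamiliar with the entropy-threshold approach mentioned above, here is a minimal sketch of the traditional technique. It is not Nightfall's detection method. The function names and the default threshold of 4.0 bits per character and minimum length of 20 are assumptions picked for illustration.

```python
import math
from collections import Counter


def shannon_entropy(token: str) -> float:
    """Bits of entropy per character, based on character frequencies."""
    if not token:
        return 0.0
    counts = Counter(token)
    length = len(token)
    return -sum((n / length) * math.log2(n / length) for n in counts.values())


def looks_like_secret(token: str, threshold: float = 4.0, min_length: int = 20) -> bool:
    """Flag long tokens whose characters look close to random."""
    return len(token) >= min_length and shannon_entropy(token) >= threshold
```

The weakness is easy to reproduce. A harmless string such as `abcdefghijklmnopqrstuvwxyz012345` has 32 distinct characters and is flagged, while a real but short password such as `Summer2021!` falls under the length cutoff and is missed. Tuning the threshold trades one kind of error for the other.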

What are the key features of Nightfall DLP for Confluence?

Nightfall helps organizations manage DLP in Confluence with these features:

Connect Nightfall to your Confluence instance in minutes with our out-of-the-box integration.

Fully customize your scans with Nightfall’s robust detection engine, which offers dozens of detectors, including our proprietary machine-learning trained detectors, to detect a wide range of sensitive content types such as standard PII (names, ID numbers, financial information, and addresses), credentials & secrets, custom regexes & word lists, and more.

Configure granular Detection Rules and set confidence levels within the Nightfall dashboard to determine which data is considered sensitive, either standalone or in combination with other data.

Build flexible data detection policies based on custom data detectors (e.g. regexes & word lists), and use multiple policies to target your DLP scans to certain locations (e.g. Spaces or Pages) or timeframes.

Discover sensitive data across all Confluence Spaces (including Personal Spaces), Pages, Blog Posts, Attachments, Comments, and Archived items, with context-rich results.

Nightfall’s best-in-class DLP includes machine-learning-based optical character recognition (OCR) for unstructured data, enterprise-grade security, and high-accuracy detection via deep learning, all within a single-pane-of-glass, intuitive UI for configuring your DLP policies.
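
The idea behind Detection Rules with confidence levels, where data counts as sensitive either on its own or only in combination with other data, can be sketched as follows. This is a conceptual model and not Nightfall's API or configuration format: the `Finding` and `DetectionRule` types, the detector names, and the use of a 0.0 to 1.0 confidence score are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    detector: str      # hypothetical detector name, e.g. "person_name"
    confidence: float  # assumed score from 0.0 to 1.0


@dataclass(frozen=True)
class DetectionRule:
    name: str
    detectors: frozenset[str]
    min_confidence: float
    require_all: bool  # True: every detector must fire; False: any one suffices

    def matches(self, findings: list[Finding]) -> bool:
        # Keep only findings from this rule's detectors that are confident enough.
        fired = {
            f.detector
            for f in findings
            if f.detector in self.detectors and f.confidence >= self.min_confidence
        }
        if self.require_all:
            return self.detectors <= fired
        return bool(fired)
```

With a rule such as `DetectionRule("pii_combo", frozenset({"person_name", "us_ssn"}), 0.5, require_all=True)`, a page that only mentions a name is not flagged, but a page containing both a name and an ID number is. Combination rules of this kind are one way to keep appropriate data sharing from generating noise.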

Does Nightfall DLP scan files too?

Nightfall supports a broad set of file types including but not limited to xls/xlsx, doc/docx, csv, plain text, ppt/pptx, PDF, HTML, and more.

How do I get started?

To get started with Nightfall, request a free Confluence risk assessment, schedule a call with our sales team, or contact us directly at sales@nightfall.ai with any questions.