Data Security

In today’s highly connected world, cybersecurity threats and insider risks are a constant concern. Organizations need to have visibility into the types of data they have, prevent the unauthorized use of data, and identify and mitigate risks around that data. The following sections will cover why data security is essential, common data security risks, and data security best practices to help protect your organization from unauthorized access, theft, corruption, poisoning or accidental loss.

What is data security?

Data security is a set of practices and procedures designed to secure data from unauthorized access, theft, corruption, poisoning or accidental loss to preserve data privacy, confidentiality, integrity and availability. Data security should help protect sensitive data (personally identifiable information, financial information, health information and intellectual property) throughout the data lifecycle, from creation to destruction. It should encompass everything from the physical security of hardware and storage devices to administrative and access controls, security of software applications, and data governance policies.

Why is data security important?

Data is one of the most critical assets for any organization today, so the importance of data security cannot be overstated. Data protection should be a priority for every business in every industry. This is increasingly important as data security breaches continue to grow. According to Check Point Research, global cyberattacks grew 38% in 2022 compared to 2021.

Protecting data is critical because data loss or misuse can have severe consequences for an organization, including reputational damage, inaccurate ML models, loss of business and loss of brand equity. In addition, if an organization’s intellectual property is compromised, its ability to compete may be permanently affected. According to the IBM Cost of a Data Breach Report 2023, the average cost of a data security breach in 2023 was $4.45 million, 15% more than in 2020.

In addition to the costs related to reputational damage, failure to comply with regulatory requirements can result in fines. The General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) both impose fines on organizations that fail to secure their data properly. Under GDPR, the most serious infringements can lead to penalties of up to €20 million or 4% of an organization’s total worldwide annual turnover, whichever is higher.

Data security risks

There are several types of data security risks. Some of the most common are:

Malware

Malware includes worms, viruses and spyware that enable unauthorized users to access an organization’s IT environment. Once inside, those users can disrupt the IT network and endpoint devices or steal credentials.


Ransomware

Ransomware infects an organization’s devices and encrypts data to prevent access until a ransom is paid. Sometimes, the data is lost even when the ransom demand is paid.


Phishing

Phishing is the act of tricking individuals or organizations into giving up information such as credit card numbers and passwords, or access to privileged accounts. Attackers pretend to be a reputable company with which the victim is familiar in order to steal or damage sensitive data. External attackers may also pose as legitimate users to access, steal, poison or corrupt data.


Distributed denial-of-service (DDoS) attacks

A DDoS attack targets websites and servers by overwhelming an application’s resources and disrupting network services. The perpetrators flood a site with traffic from many sources to slow website functionality or cause a total outage.


Human error

Employees may accidentally expose data to unintended audiences as they access it or share it with coworkers. Or an employee may sign in to company resources over an unsecured wireless connection.


Insider threats

Disgruntled employees may intentionally expose data or seek to profit from data theft.

Data security tools and strategies 

Data security tools and strategies enhance an organization’s visibility into where its critical data resides and how it is used. When properly implemented, robust data security strategies not only protect an organization’s information assets against cybercriminal activities but also promote data loss prevention by guarding against human error and insider threats, two of the leading causes of data breaches today. Some common types of data security tools include: 

  • Data encryption: Uses an algorithm to scramble normal text characters into an unreadable format. Encryption keys then allow only authorized users to read the data.  
  • Data masking: Obscures sensitive values so that development and training can occur in compliant environments. By masking data, organizations can allow teams to develop applications or train people using realistic data without exposing the original sensitive values.
  • Data erasure: Uses software to overwrite data on any storage device completely. It then verifies that the data is unrecoverable.
  • Access management: Includes policies, audits and technologies to ensure that only the right users can access technology resources.
  • Role-based access management: Controls access to resources where permitted actions on resources are identified with roles rather than individual subject identities.
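
To make two of the tools above more concrete, here is a minimal Python sketch of data masking combined with a role-based access check. It is an illustration under stated assumptions, not a description of any particular product: the masking key, the role names, the permission names and the helper functions (mask_email, mask_card_number, is_allowed, read_customer) are all hypothetical and were chosen only for this example.

```python
import hashlib
import hmac

# Hypothetical key for this sketch only; a real system would load it
# from a secrets manager rather than hard-coding it.
MASKING_KEY = b"example-masking-key"


def mask_email(email: str) -> str:
    """Replace the local part of an email with a keyed hash, keeping the domain."""
    local, _, domain = email.partition("@")
    digest = hmac.new(MASKING_KEY, local.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"user_{digest[:12]}@{domain}"


def mask_card_number(card_number: str) -> str:
    """Hide all but the last four digits of a card number."""
    digits = [c for c in card_number if c.isdigit()]
    hidden = max(len(digits) - 4, 0)
    return "*" * hidden + "".join(digits[-4:])


# Hypothetical mapping: permitted actions are attached to roles,
# not to individual users.
ROLE_PERMISSIONS = {
    "analyst": {"read_masked"},
    "engineer": {"read_masked", "write"},
    "admin": {"read_masked", "read_raw", "write"},
}


def is_allowed(role: str, action: str) -> bool:
    """Return True if the role grants the action; unknown roles grant nothing."""
    return action in ROLE_PERMISSIONS.get(role, set())


def read_customer(record: dict, role: str) -> dict:
    """Return the raw record, a masked copy, or raise, depending on the role."""
    if is_allowed(role, "read_raw"):
        return dict(record)
    if is_allowed(role, "read_masked"):
        return {
            "email": mask_email(record["email"]),
            "card_number": mask_card_number(record["card_number"]),
        }
    raise PermissionError(f"role {role!r} may not read customer data")


if __name__ == "__main__":
    customer = {"email": "ada@example.com", "card_number": "4111 1111 1111 1111"}
    print(read_customer(customer, "analyst"))
    print(read_customer(customer, "admin"))
```

Because the email mask is a keyed hash, the same input always maps to the same masked value, so masked tables can still be joined on that column. Whether that property is desirable depends on the threat model; some environments may prefer random, non-repeatable substitutes.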

Data security best practices

Data security best practices include data protection tools such as those outlined in the previous section as well as auditing and monitoring. Data security best practices should be leveraged both on-premises and in the cloud to mitigate the threat of a data breach and to help achieve regulatory compliance. Specific recommendations can vary but typically call for a layered data security strategy architected to apply a defense-in-depth approach to mitigate different threat vectors.

Data governance is an essential security best practice. Data governance includes the policies and procedures governing how data is made available, used and secured. Governance establishes processes that are enforced across organizations to ensure compliance and data security while also enabling users to access the data they need to do their jobs.

Securing data on the Databricks Data Intelligence Platform

The Databricks Data Intelligence Platform provides end-to-end security to ensure data is accessed properly, by authorized individuals, while helping organizations meet compliance requirements. Databricks provides comprehensive security to protect your data and workloads, including encryption, network controls, data governance, and auditing. We also offer best practices whitepapers, a security analysis tool, and Terraform templates in the Databricks Security and Trust Center.

Databricks’ data governance solution, Unity Catalog, offers fine-grained data governance, centralized metadata and user management, centralized data access controls, data lineage, and data access auditing to seamlessly govern structured and unstructured data, machine learning models, notebooks, dashboards, and files on any cloud or platform.

 
