
8 data protection challenges and how to prevent them

Data protection requires administrative and operational controls; technical controls, such as access management, authorization and encryption; and physical controls that protect the computing hardware and work environments.

Many organizations focus on technical controls and protections, said Rebecca Herold, founder and CEO of consultancy Rebecca Herold & Associates, but they're not addressing the administrative protections, such as documented policies and procedures to make employees aware, or sufficiently monitoring the physical aspects. "So, it is just piling into the ongoing breaches that you read about every day in many industries around the world," she added.

Everyone rightly focuses on regulated data, including personal, cardholder and healthcare information. But there's a lot more that needs attention. "It's expanded," said Heidi Shey, analyst at Forrester Research. "It's not just the source code that people have historically been worried about or secret formulas."

Data protection challenges mount

New data protection challenges can involve IoT sensor data collection, algorithms, APIs, and machine learning and AI models that companies have developed. "Many times, consumer IoT devices such as smart cameras and smart vehicles are being used to support the business," Herold explained. "And there is data that is shared that is collected by those products that usually the business isn't even aware of."


With the potential for financial loss, reputational harm and penalties for noncompliance with data privacy protection regulations, security leaders need to track the location and ownership of the company's data at rest, in transit and in use, and understand the security risks. That includes each data set's encryption status and sensitivity level, the potential impact on the business if the data is compromised, and the dependencies between the data and other applications.
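The attributes listed above can be captured in a simple inventory record. The following is a minimal Python sketch, not a prescribed schema: the class name `DataAsset`, the `Sensitivity` levels, the field names and the `needs_review` rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Sensitivity(Enum):
    # Hypothetical classification levels; real programs define their own.
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    REGULATED = 4


@dataclass
class DataAsset:
    name: str
    owner: str                 # accountable person or team
    location: str              # where the data lives
    state: str                 # "at rest", "in transit" or "in use"
    encrypted: bool
    sensitivity: Sensitivity
    business_impact: str       # impact if compromised, e.g. "low" or "high"
    depends_on: list[str] = field(default_factory=list)  # related applications


def needs_review(asset: DataAsset) -> bool:
    """Flag assets that are confidential or regulated but not encrypted."""
    is_sensitive = asset.sensitivity.value >= Sensitivity.CONFIDENTIAL.value
    return is_sensitive and not asset.encrypted


# Example entry with made-up values.
orders = DataAsset(
    name="customer_orders",
    owner="sales-ops",
    location="EU object storage",
    state="at rest",
    encrypted=False,
    sensitivity=Sensitivity.REGULATED,
    business_impact="high",
    depends_on=["billing-app"],
)
print(needs_review(orders))
```

Even a flat list of such records gives security leaders one place to ask which sensitive data sets are unencrypted and which applications would be affected by a compromise.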

Globally, the 35 highest data privacy violation fines in 2023 totaled $2.6 billion, according to Forrester. The EU issued 19 of those fines to companies that failed to comply with the General Data Protection Regulation (GDPR); federal and state regulatory bodies in the U.S. issued 15; and South Korea issued one.

Today's businesses are facing a complex combination of data protection challenges, including data overload, risk management, cyberattacks, access rights, distributed cloud environments, AI and generative AI, human error, and new and stricter privacy regulations.

1. Too much data

In addition to sensitive corporate information such as intellectual property, trade secrets and other structured data in the form of formatted text and data sets, there's a lot of unstructured data, like audio and video, that's not necessarily considered sensitive but might need to be safeguarded.

"People are working in a remote or in a hybrid way," Shey said. "You're having Teams meetings or Zoom meetings and you are recording these meetings. The recordings and the transcripts of these meetings may have sensitive information." Companies need to establish the retention period of those files, even though those who missed a meeting might still want access to them.

"Data lifecycle management," Shey explained, "is challenging because companies sometimes struggle with making decisions about data retention, what should that policy look like, how long should we hold on to this data?" It's not uncommon to see internal struggles between security leaders who consider data a potential liability and business stakeholders who want to hold onto data to mine more information from it. While companies might have regulatory requirements outside of records retention, some regulations are not that specific. "It's based on what the business decides," Shey said, "so you do find that a lot of companies are hesitant to delete data."
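Once the business has decided on retention periods, enforcing them is largely a date comparison. The sketch below assumes hypothetical file categories and retention periods (`RETENTION_DAYS`); the actual values depend on each company's policy and any applicable regulation.

```python
from datetime import date, timedelta

# Assumed retention periods in days; illustrative values only.
RETENTION_DAYS = {
    "meeting_recording": 90,
    "meeting_transcript": 180,
}


def is_expired(kind: str, created: date, today: date) -> bool:
    """Return True when a file has outlived its retention period.

    Categories without a defined policy are never reported as expired,
    so they are kept until someone decides how long to hold them.
    """
    limit = RETENTION_DAYS.get(kind)
    if limit is None:
        return False
    return today - created > timedelta(days=limit)


# A recording made on 1 January is past a 90-day limit by 1 June.
print(is_expired("meeting_recording", date(2024, 1, 1), date(2024, 6, 1)))
```

Treating unclassified files as "keep" is a conservative choice for the sketch; a stricter program might instead flag them for review.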

2. Data privacy risks

Implementing an enterprise privacy program requires identifying data types, such as personal data (name, address, birth date, and account information) associated with employees, customers and suppliers, as well as developing security measures to protect sensitive information and ensure compliance.

It's critical to map out the business environment, including data the company collects, creates and stores, and to communicate the various regulations that apply. What data, for example, is protected by the GDPR or the California Consumer Privacy Act (CCPA) and should not be acquired? When data is stored for an extended period after its intended use, transferred across borders or belongs to a child, companies can face stiff penalties. Meta and TikTok were hit with fines totaling about $1.7 billion in 2023 for GDPR-related violations.

Currently, GDPR and federal data protection regulations don't directly address AI. Differential privacy and federated learning tools use mathematical techniques to offer protections to individuals when personal data is used in data sets. Private data, for example, might be used to train AI models, but it's not searchable.
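One widely used mathematical technique of this kind is the Laplace mechanism from differential privacy, which adds calibrated random noise to an aggregate result so that no single person's record can be inferred from it. The Python sketch below applies it to a simple counting query; the function names and the sample data are assumptions for illustration, and production systems typically rely on vetted libraries instead of hand-rolled noise.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential draws with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon, rng=None):
    """Return a noisy count of the values that match the predicate.

    Adding or removing one person changes a count by at most 1, so the
    noise scale is 1 / epsilon. A smaller epsilon means more noise and
    stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Made-up ages; the exact count of people aged 40 or over is 4.
ages = [23, 35, 41, 52, 67, 29, 48]
print(private_count(ages, lambda age: age >= 40, epsilon=1.0))
```

Each call returns a slightly different answer near the true count, which is the point: analysts still get a useful statistic, but the output does not reveal whether any one individual is in the data set.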


3. Ransomware attacks

Ransomware, which encrypts a company's files and makes them unreadable until the victim pays a ransom for the decryption key, remains a major concern for businesses, with the threat of financial loss and reputational damage from sensitive data exposure. Attackers are increasingly targeting data backups and storage, forcing tool vendors to provide functions like immutable backup and other data resiliency features to protect against attacks.

Ransomware was a factor in 24% of breaches, with system intrusion in 94% of those attacks, according to Verizon's 2023 Data Breach Investigations Report (DBIR). The top three vectors for these attacks were email, desktop sharing software and web applications.

"How do we reduce the capability for malware on our networks to perform reconnaissance, to move laterally and to do a broad-based infection across a network?" said Jason Garbis, principal and founder of consultancy Numberline Security, who co-chairs the Cloud Security Alliance's Zero Trust Working Group. Companies need to make devices and services resilient to ransomware attacks, he advised, by creating a small enough "blast radius" so only a user device or server gets infected. "If you've got hundreds of systems or thousands of systems infected," he explained, "you are in a much worse place than if you have one or two."

4. Data loss prevention

A data loss prevention (DLP) strategy consists of policies, tools and techniques to help security teams increase data visibility and protect sensitive data against unauthorized use in enterprise and cloud environments. DLP software monitors data entering and leaving the network, sending alerts when it detects suspicious activity, such as unauthorized data transfers. It can help security leaders enforce data protection regulatory compliance.

Modern DLP logs files and events and might include real-time data analysis and user behavior analytics. Some cloud providers offer DLP services that scan and classify different types of data, offering masking, tokenization and redaction for highly sensitive information, like credit card numbers. Deploying DLP software is one of the CIS Critical Security Controls for data protection best practices. But effective implementation for endpoints, on-premises networks and multi-cloud environments can present data protection challenges, especially when it comes to integration, scaling and the use of endpoint agents, which can drain resources and slow productivity.
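To make the scanning and masking step concrete, here is a minimal Python sketch of how a tool might find and mask credit card numbers in text. It is not how any particular DLP product works: the pattern `CARD_RE` and the function names are assumptions, and the Luhn checksum is used only to cut down on false positives from other long digit strings.

```python
import re

# Assumed pattern: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Check a digit string against the Luhn checksum used by payment cards."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def mask_cards(text: str) -> str:
    """Replace likely card numbers with asterisks, keeping the last four digits."""
    def _mask(match):
        raw = match.group(0)
        digits = re.sub(r"\D", "", raw)
        if not luhn_valid(digits):
            return raw      # leave digit runs that fail the checksum untouched
        return "*" * (len(digits) - 4) + digits[-4:]

    return CARD_RE.sub(_mask, text)


# 4111 1111 1111 1111 is a well-known test number that passes the Luhn check.
print(mask_cards("Card 4111 1111 1111 1111 on file"))
# prints "Card ************1111 on file"
```

Tokenization and redaction follow the same detect-then-transform shape; only the replacement step differs.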

"How do I classify my data?" said Garbis, raising questions about the challenges of implementing an effective DLP strategy. "How do I protect it, and what does that even mean to protect it, while keeping users productive?"

5. Access and authorization

Enterprises today deal with a hybrid workforce that sets its own hours and logs in to the company's network environments and resources from remote and on-site locations, using multiple access methods on computers and mobile devices. Third-party suppliers might also access the company's resources. Access control management requires enforcing processes for revoking data access, such as automatic device lockout and remote wipe capabilities if portable devices are lost or stolen.

An untethered workforce can complicate privileged access management and access rights -- who needs access to the data at what time. The DBIR found 406 incidents of privilege misuse, 288 of which resulted in data disclosure, the majority for financial gain. Credentials are even harder to control. Stolen credentials from weak passwords and brute-force attacks were the entry point for 86% of web application breaches. Of 1,404 incidents reported by the DBIR, 1,315 (94%) resulted in data disclosure, primarily for financial gain.
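The question of who needs access to which data at what time can be expressed as time-bound grants that expire on their own. The sketch below is one possible illustration in Python; the `Grant` record, its fields and the sample user and resource names are hypothetical and not tied to any specific access management product.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Grant:
    user: str
    resource: str
    not_before: datetime   # access starts at this time
    not_after: datetime    # access ends at this time


def is_allowed(grants, user, resource, now):
    """Allow access only if a matching grant is active at `now`."""
    return any(
        g.user == user
        and g.resource == resource
        and g.not_before <= now < g.not_after
        for g in grants
    )


# A contractor gets access to one resource for a single working day.
grants = [
    Grant(
        user="contractor-a",
        resource="finance-reports",
        not_before=datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc),
        not_after=datetime(2024, 3, 1, 17, 0, tzinfo=timezone.utc),
    )
]
noon = datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc)
print(is_allowed(grants, "contractor-a", "finance-reports", noon))
```

Because every grant carries an end time, access lapses by default and has to be renewed deliberately, which limits how long a stolen credential stays useful.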

6. Human error

Sensitive data exposure often comes down to human error: lost smartphones, shared credentials or an accidental email containing confidential information sent to unauthorized employees or people outside the organization. The top reasons for human error-related breaches, according to the DBIR, were misdeliveries -- sending information to the wrong person (43%); publishing errors -- showing physical documents to the wrong audience (23%); and misconfigurations of hardware, software and cloud services (21%). Developers caused the most error-related breaches, followed by system administrators and end users.

Failure to back up data can make a bad situation worse. Global management consulting firm KPMG lost its Teams chat history in August 2020 when a Microsoft 365 admin attempted to remove an individual's retention policy and inadvertently deleted employee communications across the company. Without Microsoft 365 backup, there was no way to recover the information.

7. AI and generative AI threats

Like cloud services, AI tools and algorithms are more different than alike, Herold noted. Some AI models have strong management frameworks with strict policies against using real personal data or data from live environments to train them. "Others are going out there and scraping data as much as they can off of social media sites and everywhere else," Herold said, creating a huge number of issues surrounding exposure of personal data and intellectual property when the data is used in training AI and machine learning models. "AI," she said, "is really highlighting the fact that a lot of intellectual property might be going out and being exposed."

Threat actors are also using AI capabilities. An unidentified intruder used stolen credentials and an API with AI capabilities to infiltrate T-Mobile's computer systems in November 2022, the company disclosed in an SEC filing in January 2023. The attacker managed to steal 37 million customer records, containing addresses, phone numbers and dates of birth. The attack followed a series of high-profile data breaches and settlements against the besieged carrier.

A shareholder filed a lawsuit in September 2022 claiming that in 2018 T-Mobile had started an "aggressive and reckless plan" at the behest of its largest stockholder, Deutsche Telekom, to centralize customer data and credentials for use in the training of AI models.

8. Secure cloud data

Distributed cloud environments offer greater flexibility and potential cost savings for moving data and applications. But data protection issues, such as data breaches, locality, misconfigurations, lack of patching, shadow IT, insecure APIs and vulnerabilities in cloud storage, cause concerns for many companies.

"The cloud has so many advantages," said W. Curtis Preston, technology evangelist at consultancy Sullivan|Strickler and host of the Backup Wrap-up podcast, "but because you never actually see the company and you never actually see the service, you don't get to verify that they are doing things in a normal way." It's also important to ask basic questions about data protection, he added: "Where is my data located? Is my data encrypted? Is it encrypted in my environment or is it encrypted when it leaves my environment? How are you protecting my data against ransomware attacks?"



The capability to discover and classify data using machine learning and AI is increasingly built into tools and platforms. "Especially with the cloud environments, we are seeing data security posture management," Shey said. "It's the new shiny that people are turning to to help them identify what data they have in cloud environments."
