There are two types of companies in the world today – those that run on data, and those that will run on data

The volumes of data being produced by companies everywhere are growing at an exponential rate. For InfoSec professionals, the challenge comes from trying to make sense of all this data with the time, budget and staffing resources available. Doing so can help ensure it’s properly protected from potential cyber-attacks, but it’s a challenge that’s only going to get harder in the weeks, months and years ahead.

The most effective way to address this challenge is through proper prioritisation, and in the case of growing data, prioritisation comes from data classification. This article will explain what data classification is, dispel some myths surrounding it, and explain why it’s so important to data security.

What is data classification?

Data classification is the process of consistently categorising data according to specific, pre-defined criteria so that it can be protected efficiently and effectively. For instance, confidential earnings reports or M&A plans are obviously far more sensitive than last month’s staff rota, but unless they are classified accordingly, all of these documents could be receiving the same level of data protection.

Classification can be driven by governance, company compliance, regulations and standards (such as HIPAA, PCI DSS or CCPA), protection of intellectual property (IP), or perhaps most importantly, by the need to simplify your security strategy (more about that later).

Before getting started, every company needs to answer a few key questions in order to help define classification buckets. These are as follows:

• What are the data types? (structured vs unstructured)

• What data needs to be classified?

• Where is the sensitive data?

• What are some examples of classification levels?

• How can data be protected and which controls should be used?

• Who is accessing the data?

As a side note, data discovery is closely aligned with classification. After all, before you can classify data, you first have to find it. To be as comprehensive and effective as possible, data discovery needs to cover endpoints, network shares, databases and the cloud.
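To make the discovery step more concrete, here is a minimal Python sketch of a scan over a single file share. It is an illustration only: the pattern names, the regular expressions and the discover function are assumptions made for this example, and commercial discovery tools use far richer detection than two regular expressions.

```python
import re
from pathlib import Path

# Hypothetical patterns for illustration; not taken from any product.
PATTERNS = {
    # 16 digits in groups of four, optionally separated by a space or hyphen.
    # No checksum validation is done, so false positives are possible.
    "card_number": re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b"),
    "email_address": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
}


def discover(root):
    """Walk a directory tree and report which patterns appear in each file."""
    findings = {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
        hits = sorted(name for name, rx in PATTERNS.items() if rx.search(text))
        if hits:
            findings[str(path)] = hits
    return findings
```

Calling discover on a folder returns a dictionary mapping each file path to the names of the patterns found in it. The same idea would need separate connectors for databases and cloud storage, which this sketch does not attempt.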

Dispelling some common myths about data classification

Many security professionals shy away from data classification due to deep-rooted misconceptions about it. Below are three of the most common, along with some simple truths dispelling them:

● Myth one: It has a long time to value

Automated classification drives insights from day one. Automation for both context and content brings order to all your sensitive data, quickly and easily.

Data collection and visibility can continue until the organisation is prepared to deploy and operationalise a policy. Even without a policy, insights from automated data classification can drive security improvements.

● Myth two: It’s too complicated to be useful

Many data classification projects get bogged down because of overly complex classification schemes. When it comes to classification, more is not better; more is just more complex.

PricewaterhouseCoopers, Forrester, and AWS all recommend starting with just three categories. Doing so can dramatically simplify getting your program off the ground. If, after deployment, more are needed, then decisions will be driven by data instead of speculation.
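As a rough illustration of how little is needed to get started, the Python sketch below defines a three-level scheme and a handful of keyword rules. The level names (public, internal, confidential), the keywords and the classify function are all assumptions chosen for this example rather than a scheme prescribed by any of the firms above.

```python
import re
from enum import IntEnum


class Level(IntEnum):
    """An assumed three-level scheme; higher numbers are more sensitive."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3


# Hypothetical keyword rules for illustration only.
RULES = [
    (re.compile(r"\b(earnings|merger|acquisition)\b", re.I), Level.CONFIDENTIAL),
    (re.compile(r"\bpress release\b", re.I), Level.PUBLIC),
]


def classify(text, default=Level.INTERNAL):
    """Return the most sensitive level whose rule matches the text.

    Anything that matches no rule falls back to the default level.
    """
    matched = [level for rx, level in RULES if rx.search(text)]
    return max(matched, default=default)
```

With these rules, a document mentioning earnings is treated as confidential, a press release as public, and everything else, such as a staff rota, as internal. When a document matches more than one rule, the most sensitive level wins, which errs on the side of caution.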

● Myth three: It’s just another level of bureaucracy

Data classification can be an enabler and a way to simplify data protection. Once you understand what portion of your data is sensitive, resources can be allocated appropriately.

Everyone understands what needs to be protected. Sensitive and regulated data is prioritised; public data is given lower priority, or destroyed, to eliminate any future risk of its theft.

Effective data classification helps protect against all cyber threats

The value of classification was once limited to protection from insider threats. However, with the growth in outsider threats, classification takes on a new importance. It provides the guidance for information security pros to allocate resources towards defending the crown jewels against all threats.

Internal actors cause both malicious and unintentional data loss. With a classification program in place, the mistyped email address in a message with sensitive data is flagged. Files that are intentionally being leaked are classified as sensitive and get the attention of security solutions, such as Data Loss Prevention (DLP).
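The mistyped-address scenario can be sketched in a few lines of Python. This is a simplified illustration, not how any particular DLP product works: the trusted domain list, the classification labels and the review_outbound function are hypothetical.

```python
# Hypothetical list of domains the organisation expects to send sensitive data to.
TRUSTED_DOMAINS = {"example.com", "partner.example.org"}


def review_outbound(recipients, attachment_label):
    """Return the recipients that should be flagged before a message is sent.

    Only messages carrying data labelled 'confidential' are checked;
    lower classifications pass through without review.
    """
    if attachment_label != "confidential":
        return []
    return [
        address
        for address in recipients
        if address.rsplit("@", 1)[-1].lower() not in TRUSTED_DOMAINS
    ]
```

For example, review_outbound(["cfo@example.com", "cfo@exmaple.com"], "confidential") returns only the second, mistyped address. The check depends entirely on the attachment already carrying a label, which is the point: without classification, the rule has nothing to act on.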

On the other hand, external threat actors seek data that can be monetised. Understanding which data within your organisation has the greatest value, and the greatest risk of theft, is where classification delivers value. When advanced threat detection tools know that an attack involves sensitive data, and therefore carries a greater potential impact, they can escalate alarms accordingly to allow a more immediate response.
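One way to picture this escalation is as a simple adjustment to alert severity based on the label of the data involved. The Python sketch below assumes a three-level classification scheme and a four-step severity scale; the names, weights and escalate function are illustrative assumptions, not the behaviour of a specific detection tool.

```python
from dataclasses import dataclass

# Assumed weights: how many severity steps each label adds.
SENSITIVITY_WEIGHT = {"public": 0, "internal": 1, "confidential": 2}
SEVERITIES = ["low", "medium", "high", "critical"]


@dataclass
class Alert:
    description: str
    base_severity: str  # severity assigned by the detection tool
    data_label: str     # classification label of the data involved


def escalate(alert):
    """Raise an alert's severity by the data's sensitivity weight, capped at 'critical'."""
    index = SEVERITIES.index(alert.base_severity) + SENSITIVITY_WEIGHT[alert.data_label]
    return SEVERITIES[min(index, len(SEVERITIES) - 1)]
```

Under these assumptions, a medium alert touching confidential data is raised to critical, while the same alert on public data stays at medium, so analysts see the incidents involving the crown jewels first.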

Organisations generate data every day. This comes as no surprise. What might be surprising, however, is the accelerating rate at which that data is being created. InfoSec professionals responsible for protecting digital data need a new approach to stay ahead of the data deluge. Data classification gives them the best first step on the road to effective data protection.


About the Author

Ben Cody is Senior Vice President of Product Management at Digital Guardian. Digital Guardian is no-compromise data protection. The company’s cloud-delivered data protection platform is purpose-built to stop data loss by both insiders and outsiders on Windows, Mac and Linux operating systems. The Digital Guardian Data Protection Platform performs across the corporate network, traditional endpoints, and cloud applications.

Featured image: ©SIAMRAT.CH