We’re big fans of January around here, enjoying, as we do, the prospect of starting the new year off the right way and looking at things afresh. There’s something special about the focus that comes with new beginnings and the excitement of starting new projects after the whirlwind of the holiday season. The one thing that’s hard about January is that we have to undertake all of those projects without a very critical item, one that, for most of the rest of the year, is in ready supply and helps us get through.
We’re obviously talking about bread.
It’s not our fault that we eat bread from October 31 to January 1 like the world will end if we don’t eat every carbohydrate on the planet. As a culture, we’ve developed the idea that bread is not only appropriate at every meal, it’s the best snack, the best appetizer, and, in some cases, the animating purpose behind going to a restaurant. Ten percent of American households go through more than three loaves of bread per week. In the distant future, historians and anthropologists will look at our time and assume that our economy was built on bread. For us, as for many people, October 31-January 1 is Bread Season, and we take it very seriously. But once the new year rolls around, it’s time to pare back because, according to the No Fun Brigade (like cardiologists), eating two entire baguettes at every meal is “culturally unacceptable” and “really not that good for you, Jay.”
You Are What You Eat
The Bread Model approach to living is, obviously, unsustainable over the long run: you can’t just overconsume all the time and expect to feel great. Oddly enough, the exact same is true of data: you can take in all you want, but eventually you will find that you have taken in more than you want, can use, or are legally allowed to have. We’ve talked about the Hoarders approach to data collection before, and with good reason: even smaller companies have signed onto Facebook-style data collection, assuming (without any evidence) that doing so will somehow help their business grow. But with total data generation in the United States currently running to several exabytes per day (an exabyte is a billion gigabytes), we’re absolutely drowning in data. How much of it do we really need?
Taking a Good Long Look
To answer that question, you first have to understand what data you already take in, and why. Looking at your current data assets allows you to identify how the data you bring in actually ties to your operations. Amid the noise of daily operations, it can be difficult to identify which individual datasets really make a difference in how a business runs, and, sometimes, it’s easier to simply say “just keep it all.”
That’s a perilous strategy. Consider pulling in data by scraping social media in an effort to understand customers. Social media is a fascinating topic when it comes to privacy compliance, because our nonstop tweets, snaps, pins, and bucket challenge videos gone awry generate an enormous amount of personal data that lives in the public sphere, effectively forever. Each piece of information constitutes a small part of the mosaic comprising our digital presence, and all of it can be analyzed, bundled, or sold. The question is whether this type of data even counts as personal data for purposes of privacy regulations.
Laws in the US have traditionally been narrow in their definition of what constitutes personal information. In Florida, for instance, the data breach notification law only applies when the data accessed includes a person’s first name, last name, and another crucial piece of information, such as Social Security number, credit card number, health insurance policy number, or email address and password. In other words, unless the breach involves the “crown jewels” of personal information, there’s no obligation to report, because the rest of the data is not protected.
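To make the narrowness concrete, here is a minimal sketch of that kind of rule as a simple check. It models only the conditions as summarized in the paragraph above, not the statute itself, and every field name in it is a hypothetical label chosen for illustration.

```python
# Simplified model of a narrow, "crown jewels" notification rule.
# Field names are hypothetical; a real statute has more conditions.
NAME_FIELDS = {"first_name", "last_name"}
SENSITIVE_FIELDS = {
    "social_security_number",
    "credit_card_number",
    "health_insurance_policy_number",
}
CREDENTIAL_FIELDS = {"email", "password"}


def triggers_notification(exposed_fields):
    """Return True if the exposed fields meet the narrow rule sketched above."""
    exposed = set(exposed_fields)
    has_name = NAME_FIELDS <= exposed                 # both name parts exposed
    has_sensitive = bool(SENSITIVE_FIELDS & exposed)  # at least one crown jewel
    has_credentials = CREDENTIAL_FIELDS <= exposed    # email AND password
    return has_name and (has_sensitive or has_credentials)
```

Under this sketch, a breach exposing names alongside IP addresses or photos returns False, which is exactly the gap the broader laws discussed next are meant to close.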
GDPR, CCPA, and the oncoming slate of new laws take a very different approach, one based on more than the purely transactional. These laws don’t focus on “PII,” but instead on “personal data.” The concept of personal data is far broader under GDPR and CCPA, and it is about much more than email passwords or credit card numbers. Every piece of data that does identify or could identify a person is “personal data,” which means traditional information like names, email accounts, and tax ID numbers is certainly personal data. But it also means everything from an IP address to a Facebook photo is as well.
This is a distinction with a difference, especially when it comes to companies unwittingly becoming subject to privacy and data laws. A company that markets and sells its goods in the US and Italy, for instance, cannot use the same methods for all of its customers unless the methods are GDPR compliant. And a US company that processes data on Floridians and Belgians may, after a data breach, have no duty to notify the Florida Department of Legal Affairs, but may need to notify the Belgian Data Protection Authority within 72 hours.
Of course, one method to avoid an unnecessarily complicated data security policy is to simply create a broadly compliant privacy approach for your entire company, without regard to the origin of the data. But to the extent that is neither practicable nor economical, you need to have a firm grasp of what data you have, and whether it constitutes personal data under the GDPR. At the same time, think now about how to create data security policies that emphasize security at every stage, and remember that if you don’t need the data, don’t keep it. The panoply of privacy laws out there may be complicated, but they aren’t impossible to understand if you begin by identifying your data.
Time to Pare Down
Okay, great, but is identifying what data you bring in enough? Hardly. Remember that data is not permanent unless we decide to make it so. We’re accustomed to the view that, once you secure information, you should hold onto it forever. Treating the information your company gathers in the same way you’d see on an episode of Hoarders is a dangerous approach, but a common one. Businesses amass a huge amount of information, much of it irrelevant or of marginal use, but they keep it just because keeping it seems easier than deleting it. They have also been taught for years in the U.S. that data gathered today may someday be useful. This is an understandable viewpoint, but a dangerous one, because inadvertent data hoarding puts you at odds with one of the key principles of good privacy practice: data minimization.
Minimization is the concept that a data controller should keep only the amount of data necessary to accomplish its ends, and not compile a portfolio of extraneous information on customers or clients. For the EU, this is about the citizen’s fundamental right to privacy, preventing too much of their personal information from being held by those who do not need it. Under the GDPR, the less a company needs to know, the less it should collect. Therefore, asking for personal address details prior to shipping goods to a customer makes sense, but asking for their marital status probably does not.
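One way to picture minimization at the point of collection is an allow-list keyed by purpose: anything the stated purpose does not need is dropped before it is ever stored. The sketch below is illustrative only; the purposes, the field names, and the `minimize` helper are all assumptions, not part of any law or library.

```python
# Hypothetical allow-list: the fields each purpose actually needs.
REQUIRED_FIELDS = {
    "ship_order": {"name", "street_address", "city", "postal_code"},
    "send_newsletter": {"email"},
}


def minimize(submitted, purpose):
    """Keep only the submitted fields that the given purpose requires."""
    allowed = REQUIRED_FIELDS[purpose]
    return {field: value for field, value in submitted.items() if field in allowed}


form = {
    "name": "Pat Example",
    "street_address": "1 Main St",
    "city": "Springfield",
    "postal_code": "00000",
    "marital_status": "single",  # not needed to ship a package
}
stored = minimize(form, "ship_order")  # marital_status never reaches storage
```

Using an allow-list rather than a block-list means that a new form field is discarded by default until someone can name the purpose it serves.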
If that seems like an outlier example, it shouldn’t. There are countless examples of entities accumulating huge stores of information not directly tied to the good or service they provide to customers. Consider a company that requires website users to supply confidential information to verify identity, such as a mother’s maiden name, which is a favorite target for identity thieves. If the company is hacked, even if the customer has closed their account, the risks are substantially higher for all involved than if the company had deleted the information when it was no longer necessary. Likewise, the company could have, and maybe should have, deleted the customer’s home address and phone number. Minimizing the data you possess decreases the risk of harm if it is stolen.
There are other ways to minimize data, ranging from creating pseudonyms for customers (“pseudonymization”), to encryption, to “fading,” where details of an account are gradually scrubbed over time. For instance, if a company once held a customer’s email address, home address, phone number, and credit card information, it could delete the credit card information after a week or two, the phone number after a few more weeks, the home address after a few months, and eventually even the email address and name after sufficient time has passed. Fading keeps a customer’s crucial information available only as long as it is necessary and deletes it when it is not. Email service providers, or ESPs, have long promoted the process of fading email addresses to ensure that you send emails only to those users that engage (even minimally) with prior attempts to reach them. By the same logic, if the customer has made no purchases in seven years, and never opens your marketing or newsletter emails, why keep their email address, name, and credit card number any longer? Much of that information is likely out of date anyway.
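The fading idea can be sketched as a per-field retention schedule applied to each customer record. The specific periods below are assumptions that loosely follow the example in the paragraph above (a couple of weeks, a few more weeks, a few months, and so on), and the field names and `fade` helper are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical retention periods, measured from the customer's last activity.
RETENTION = {
    "credit_card": timedelta(days=14),
    "phone": timedelta(days=42),
    "home_address": timedelta(days=180),
    "email": timedelta(days=730),
    "name": timedelta(days=730),
}


def fade(record, last_activity, today):
    """Return a copy of the record without fields past their retention period.

    Fields that have no entry in RETENTION (an internal ID, say) are kept.
    """
    age = today - last_activity
    return {
        field: value
        for field, value in record.items()
        if field not in RETENTION or age <= RETENTION[field]
    }
```

Run on a schedule, a function like this would shed the credit card number first and the name and email address last, so the most sensitive details spend the least time in storage.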
The bottom line is that data minimization makes sense for all involved. It protects the customer’s data and lowers the risk of harm to the company in the event of a breach. None of this is to say that companies should scrub all of their data too aggressively; indeed, data may be extremely valuable and a crucial component of a data sharing partnership. Instead, the point is to carefully analyze the data you possess, why you possess it, and what you’re really doing with it. No CEO wants to find out that there was a data breach and that there wasn’t any good reason for the company to have kept the sensitive data that was lost.
So look through the data you have, catalog it, and decide whether it is valuable enough to keep, especially if it was created for a now-defunct data partnership. In other words, if you don’t need the data for a partnership, and you don’t need it for your own operations, do you really need it? You may conclude that the answer is yes and keep the data, but at least you will have done so after giving the subject real thought, and not simply because of “business-as-usual” inertia. It’s a lot like the bread analogy earlier: data is only good when it’s fresh, you actually want it, and you’ve made the conscious decision that it’s worthwhile. If you take the “more is always better” approach to data (or bread), your appetite won’t be the only thing that’s spoiled.