One unconventional data partnership structure is built around controlling the data that others hold about you. Most businesses gather datasets from their operations, and each record in those datasets usually references one particular entity: a person, a business, or a product. An entire partnership structure has grown up around those referenced entities controlling, or at least influencing, the records held in these platforms. In many respects, the structure of these relationships is more complicated than the data science required to analyze the information they collect.
Take, for example, your own personal credit report. An individual's credit report is derived from many different data sources, then aggregated and scored using a different methodology at each credit reporting agency. As a result, there is an enormous amount of data about you, yet fundamentally you neither control this data nor can you prevent its effect on your life. That effect, in the case of credit reporting, is substantial.
A few decades ago, the credit agencies themselves began to open this content up to monitoring. New data companies immediately arose to alert you to the data inside these platforms and what it said about you. Personal credit monitoring is now a multi-billion-dollar industry, but what exactly is being sold? It is not really the monitoring, because that, by itself, does little to help an individual whose credit rating is being harmed by mistakes on their report. What these companies actually sell is the ability to request changes or fix errors in the records held by the major credit agencies. Of course, that process can sometimes take years, by which point the harm has already been done.
A parallel business exists that lets companies control or influence all of the data available about their company or business locations. Firms like Yext, Google, Yelp, and TripAdvisor have businesses or business segments that sell the ability to clean up bad data about a company: its locations, products, biographies, or any other attribute. Companies either pay for this access to clean and manage the data or agree to broader terms and conditions as the verified owner of the business.
Control over the content and information about a person or a business matters because errors in that information can have substantial negative effects. If a business's retail store locations are wrong on a map or in local search engines, it may lose substantial retail revenue. For an interesting demonstration of this, open Google search in an incognito browser window and type "why did my business" into the search bar. Through its massive autosuggest platform, Google will offer four or more searches you are likely to be looking for based on what you have typed, drawn from the most common and most closely correlated suggestions across trillions of searches, keywords, and results. By the time you reach the letter "i" in the word "business," you will see these top two suggestions:
“why did my business fail”
“why did my business disappear from google maps”
This means that, of all the questions you might be asking, Google considers the two most likely to be why your business failed and why your business disappeared from Google Maps. The suggestions below the search box come from Google Suggest, which is powered by a number of factors, including the volume of searches for a particular phrase and how often users select that phrase from among the other suggested phrases. These two suggestions have held the number one and number two spots for years, regardless of geographic location (another factor that influences suggested search queries). While we don't know the exact formula behind Google's calculation, the world appears to search for these two queries more than all other questions beginning with "why did my business…"
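The incognito-window experiment above can also be run programmatically. The sketch below queries Google's autocomplete endpoint at `suggestqueries.google.com`, an unofficial, undocumented interface whose URL and response shape are assumptions here; Google could change or remove it at any time, and results will vary by time and region just as they do in the browser.

```python
# Sketch: reproducing the "why did my business" autosuggest demonstration.
# NOTE: suggestqueries.google.com is an unofficial, undocumented endpoint
# (an assumption of this sketch); it may change or disappear without notice.
import json
import urllib.parse
import urllib.request

SUGGEST_URL = "https://suggestqueries.google.com/complete/search"


def build_suggest_url(prefix: str) -> str:
    """Build the request URL for a partial search query."""
    params = urllib.parse.urlencode({"client": "firefox", "q": prefix})
    return f"{SUGGEST_URL}?{params}"


def fetch_suggestions(prefix: str) -> list[str]:
    """Fetch autocomplete suggestions for a partial query.

    The response is assumed to be a JSON array shaped like
    [query, [suggestion1, suggestion2, ...], ...].
    """
    with urllib.request.urlopen(build_suggest_url(prefix)) as resp:
        payload = json.loads(resp.read().decode("utf-8"))
    return payload[1]
```

Calling `fetch_suggestions("why did my busi")` from a script would return the same ranked list the browser shows, letting you track how the suggestions shift over time or across regions.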
Sadly, the same can be said for those who have been victims of identity theft or faulty credit reporting in their personal lives. For these people, the horrifying discovery of a terrible credit rating usually comes at the most inopportune time, when they are making a life decision such as buying a home or proposing marriage. The immense personal value placed on credit ratings requires a proactive approach, in which people can pay for access to these records and for the ability to correct them.
Allowing data subjects the right to participate in cleaning up their records (either through a purchased license or, increasingly, as an apology for a breach) is a new approach in data partnership strategies. These partnerships benefit everyone involved: the data company improves its quality with first-party data direct from the person or company each record refers to, while that individual or company benefits from accurate data about them. Expect more of these types of services to evolve in the future, with more and more industries embracing this process and partnership type.
The concept of controlling data is expanding significantly with the introduction of privacy laws around the globe. GDPR, CCPA, LGPD, and the laws modeled after them have set a new tone for the application of law to data use cases. Importantly, while focused on privacy issues, each of these laws also grants individuals significant levels of control. The GDPR specifically provides that individual "data subjects" may ask any company aggregating, controlling, or processing personal data records about them to detail what types and categories of data the company holds. From there, the individual data subject can request the removal of that data (sometimes called "the right to be forgotten") or the correction of inaccurate information.
Control over data will remain a debate in the public sphere and among businesses for some time. There is the initial debate, of course, over whether individuals "own" their data or whether the data is simply an extension of their personhood in digital form. There is also the question of whether online activity constitutes labor: because it creates value for the companies that extract and use the resulting data, online users are arguably performing unpaid work for large platforms and startups alike. Issues like these will dominate legal and scholarly debates for years to come, but one fundamental fact remains: control over data is not an academic issue but a real, pragmatic question of what kind of authority data subjects have over the data they create. The sooner you develop an answer to that question, the better.