Humanity Dictates It's Time to Constrain the Wild West of Data

Data privacy

By Hessie Jones, CogWorld Reporter, Toronto  |  April 23, 2018 

More than ever before, consumers have become fully aware of the extent to which they are being surveilled, influenced and, some would argue, manipulated. This post discusses the increasingly on-demand market and how data has enabled it. It also examines the relationship between businesses and their customers, and introduces an era of more balanced control of data and the inevitable privacy and security regulations that will drive ethical data use going forward.

Remember PRISM in 2013? Edward Snowden and the NSA scandal took to the airwaves in much the same manner as the Facebook/Cambridge Analytica scandal has today. The very act of surveillance moved the discussion from violation of privacy to personal violation, where no stone was left unturned. My friend Julie and I had this discussion on Facebook at the time. The event made me question the integrity of the industry and my role in it.

Google Home

Data has always been the engine to drive insights... and revenue

The last decade of Big Data was the dawn of a new era: the explosion of data driven by technology, increased connectivity, compute power and advanced data capabilities. This has vastly increased our understanding of things... and people. Context is NOW everything, and we, as humans, are the centre of focus.

For business, information is what drives profits. This is the primary mandate of all corporations. Make no mistake, shareholder value supersedes customer value. Determining what motivates a consumer to spend, where, when and WHY are the questions marketers have tried to answer for decades. It stands to reason that the more we understand the triggers that drive behaviour, the more we can sell products. Companies have spent millions on consumer research, from A.C. Nielsen to Gartner, and on Usage and Attitude studies, not only to understand market trends but to learn how they translate to individual desires and motivations.

About 4 years ago, I wrote this post: Quid Pro Quo: The Ultimate Dance Between the Brand and Consumer. The advent of big data also cultivated a new movement in social engagement. Suddenly, this wealth of data gave corporations the information they needed to augment their transactional information with social context — influences in content, conversations, reviews, etc. — to further refine their understanding of the consumer.

Sentiment and personality measures (like the Big Five) have garnered attention in the last 5 years, delving into how, when and to what extent these indicators influence purchase behaviour. Ad networks have expanded their footprints, bringing more sites into their folds, to piece together where, and in what sequence, the customer navigates his/her path online and to infer consumer propensities.

Around 2006 while I was at Yahoo!, I reported to Hunter Madsen, a guru who spearheaded behavioural targeting before it was a “thing.” Understanding what content users consumed in our network, what they searched for, how they interacted across our platform, and leveraging adjacent factors like location and demographics helped develop precise targeting profiles. Natural Language Processing (NLP) has also come a long way to detect aspiration vs. intent to purchase within social posts. Technology has gotten so sophisticated that the ability to use past behaviour/mentions as a predictor of consumer outcomes exists today.
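The aspiration-versus-intent distinction can be illustrated with a toy sketch. This is a keyword-scoring stand-in for the far more sophisticated NLP models described above — the cue lists and sample posts are invented for illustration, not drawn from any real ad platform:

```python
# Toy illustration: scoring social posts for "aspiration" vs. "purchase
# intent", the distinction NLP systems for ad targeting try to draw.
# Real systems use trained language models, not keyword lists.

ASPIRATION_CUES = {"wish", "someday", "dream", "hope", "would love"}
INTENT_CUES = {"buy", "ordered", "shipping", "coupon", "where can i", "best price"}

def classify_post(text: str) -> str:
    """Label a post by counting which cue set matches more often."""
    t = text.lower()
    aspiration = sum(cue in t for cue in ASPIRATION_CUES)
    intent = sum(cue in t for cue in INTENT_CUES)
    if intent > aspiration:
        return "intent"
    if aspiration > intent:
        return "aspiration"
    return "unknown"

posts = [
    "Someday I would love to own a Tesla",
    "Where can I find the best price? Ready to buy today",
]
for p in posts:
    print(p, "->", classify_post(p))
```

An advertiser would treat "intent" posts as candidates for immediate offers and "aspiration" posts as longer-term nurture targets — which is precisely why this kind of inference over personal posts raises the concerns discussed below.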

As Cathy O’Neil writes in her book, “Weapons of Math Destruction”:

"Mathematicians and statisticians were studying our desires, movements and spending power. They were predicting our trustworthiness, and calculating our potential as students, workers, lovers and criminals... This was the Big Data economy and it promised spectacular gains. A computer program could speed through thousands of resumes or loan applications in a second or two and sort them into neat lists, with the most promising candidates on top. This not only saved time but also was marketed as objective and fair.”

The goal in all of this was to provide the most relevant communication to the consumer that would align with his/her propensity to purchase that item: Right message, right time. Easy, right? Not so, but we’ve gotten closer to delivering this.

Today, the increased availability of personal data from social networks and third-party sites enables contextualization at an individual level, in near real-time.

By 2010, Cathy O’Neil writes, mathematics was asserting itself as never before in “human affairs, and the public largely welcomed it”... ​until now​.

Data has been society’s savior... for the most part

There are countless examples of efficiencies and progress to make critical decisions in areas like poverty, disaster recovery, healthcare, education etc.

Remember the earthquake that left Christchurch, New Zealand largely in ruins? In the years that followed, the city used real-time information to help with the rebuilding efforts: from cameras, traffic systems, utilities and water-quality sensors. As the city rebuilds, these data points are analyzed to measure economic progress and to guide the city in minimizing the impact of future earthquakes.

Utility companies are also using data to provide more information about energy consumption to customers, generally lowering costs and putting the onus on the customer to make better decisions about their energy usage.

In healthcare, data has always been a stumbling block because of the fragmentation of the industry and the challenge of transforming largely paper-based records into digital data. Healthcare communities are transforming slowly, and aggregated data is being used to predict epidemics, cure and prevent disease and, overall, improve quality of life. The focus is on understanding as much as possible about patient history to determine potential warning signs and to identify correlations with disease. Doing this at scale across similar patient cases makes prevention all the more plausible.

For consumers, technology has made our lives simpler and more efficient. Our spam messages are automatically flagged or deleted from our inboxes. Our work days are more organized. We have more information at our disposal for logging our health or fitness habits, or finding the job that best fits our requirements. We’ve become used to relying on technology to find the best deals on the products we want to buy, and our expectations for on-demand fulfillment have heightened with the convenience of services like ​Uber​, ​Foodora, Netflix and ​KeyCafe​, among others.

Facebook has been instrumental in enabling more contextual user insights (more than any other publishing or ad platform) to drive more personalized messages. This has created a very lucrative ad platform that has been the mainstay of Facebook’s success. While it started out as a social network for hanging out and sharing ideas with like-minded people, it quickly turned into a monetization opportunity, with the user as the main actor and the draw for salivating advertisers.

Facebook Privacy

Remember when Zuckerberg said this a few years ago? “Privacy is dead.”

Back in 2010, Zuckerberg justified, 

“If people share more, the world will become more open and connected. And a world that’s more open and connected is a better world.”

Eight years later, the beloved social network would be instrumental, because of recent events with Cambridge Analytica, in changing the way our data is shared, analyzed, reused or sold. The consumer’s acquiescence to FREE social access in exchange for corporate access to, and analysis of, individual information will no longer be the standard.

Despite countless examples of data’s use in bettering society, there are an equal number of examples that call into question government and business’s use of data. The use of data, however unintentional, can ultimately harm the consumer. And while the objective may be to mitigate business risk or streamline costs, the resulting algorithms unleash biases and discrimination and perpetuate them across new populations.

Two examples: 1) Consider what Harvard Professor ​Latanya Sweeney​ discovered when she was viewing online ads for companies offering background-check services: “racially associated names” triggered ads that were linked to criminal activity.

“After learning that a Google search for her own name surfaced an ad for a background check service hinting that she’d been arrested, Harvard University professor Latanya Sweeney set out to investigate whether race shaped online ad results. She searched over 2,000 racially associated names to determine if names ‘previously identified by others as being assigned at birth to more black or white babies’ turned up ad results that indicated a criminal record...”

Her conclusion:​ “Sweeney concluded that ​so-called black-identifying names​ were significantly more likely to be accompanied by text suggesting that person had an arrest record, regardless of whether a criminal record existed or not.”

2) ​Staples​ was charged with price discrimination when it was discovered it was offering varied pricing based on customers’ “estimated income levels.” While unintended, the higher prices were targeted at people who lived in rural areas and had significantly lower incomes than those who received the discounted prices.

I encourage you to read Cathy O’Neil’s “​Weapons of Math Destruction​” as it contains a comprehensive view of the mathematical models that are in existence today, both “unregulated and uncontestable” even when there is evidence of transgression or error.

The culmination of media exposure of this data manipulation will create distinct boundaries in regulation, policy and data governance, especially in the wake of AI.

Will increased privacy impede progress in the wake of AI?

It’s no longer a question of whether consumers are willing to give up conveniences and ready access to information in order to fiercely guard what is so rightly theirs. It’s no longer a question of whether healthcare advancements that help predict the occurrence of disease will be stalled because of consumer-controlled data.

The real danger with AI is proceeding with ​today’s lawless, Wild West practices ​without considering the potential societal impacts as feedback loops begin to find patterns at unprecedented rates, from data feeds built on other data feeds and so on. Without rules, without transparency, without auditing standards, without proper disclosure, the danger is AI running amok: amplifying today’s biased models and eventually shaping decisions with detrimental societal effects.
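The feedback-loop danger can be made concrete with a tiny simulation. The numbers and the allocation rule below are purely illustrative — not a model of any real deployed system — but they show the mechanism: a system that allocates scrutiny based on its own past records amplifies an initial skew regardless of true underlying rates.

```python
# Tiny simulation of a biased feedback loop: scrutiny follows past records,
# and scrutiny produces more records. An initial one-incident skew between
# two otherwise identical groups grows round after round.

def run_feedback_loop(rounds=5):
    # Two equal populations; group A starts with one extra recorded incident.
    recorded = {"A": 11.0, "B": 10.0}
    shares = []
    for _ in range(rounds):
        # Extra scrutiny goes to the group with the larger record...
        target = "A" if recorded["A"] >= recorded["B"] else "B"
        recorded[target] += 10.0   # ...and scrutiny generates more records,
        recorded["A"] += 1.0       # on top of a small baseline for everyone,
        recorded["B"] += 1.0       # even though true rates are identical.
        shares.append(recorded["A"] / (recorded["A"] + recorded["B"]))
    return shares

shares = run_feedback_loop()
print([round(s, 3) for s in shares])  # group A's share of records climbs every round
```

Group A begins with about 52% of recorded incidents; after five rounds it holds over 80%, purely because the system fed on its own output. This is the pattern O’Neil documents at scale in policing, lending and hiring models.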

The goal of AI should be to advance efficiency of current systems, and improve decisions for the overall benefit of humanity. If this is the panacea, then humans should have the right to control what data can be used, analyzed and resold. They have the right to give consent to the “context” in which data are used, whether they are securely processed, and whether they are treated with integrity.

They have a right to question whether specific personally identifiable information (PII) is required to achieve the desired result.

Under the ​European General Data Protection Regulation (GDPR), the following basic information is protected:

  • Name, address and ID numbers

  • Web data such as location, IP address, cookie data and RFID tags

  • Health and genetic data

  • Biometric data

  • Racial or ethnic data

  • Political opinions

  • Sexual orientation

Ann Cavoukian, former Information and Privacy Commissioner of Ontario, architected the concept of ​Privacy by Design​. The notion is privacy built and programmed into every layer of a product, with the intention of creating a transparent, functional system that is authorized to process personally identifiable data using fair and ethical standards defined by legal policy. It will mitigate the occurrence of today’s black-box AI, and organizations will be legally mandated to incorporate it going forward. Privacy by Design has been embedded ​in the GDPR, which rolls out May 25, 2018​. The impacts will be global, as we are already witnessing.
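What "privacy built into each layer" might look like at the code level can be sketched minimally. Everything here is hypothetical — the consent registry, purposes and user IDs are invented for illustration — but the two principles are the real ones: data enters a pipeline only for purposes the user consented to, and direct identifiers are pseudonymized before any analysis.

```python
# Minimal privacy-by-design sketch (hypothetical registry and purposes):
# consent is checked before processing, and raw identifiers never reach
# the analytics layer.

import hashlib
from typing import Optional

CONSENT = {
    "user-42": {"analytics"},          # consented to analytics only
    "user-7": {"analytics", "ads"},    # consented to both purposes
}

def pseudonymize(user_id: str, salt: str = "rotate-this-salt") -> str:
    """Replace a direct identifier with a salted hash before analysis."""
    return hashlib.sha256((salt + user_id).encode()).hexdigest()[:12]

def process(user_id: str, purpose: str) -> Optional[str]:
    """Return a pseudonymous token only if the user consented to this purpose."""
    if purpose not in CONSENT.get(user_id, set()):
        return None  # no consent: the data never enters the pipeline
    return pseudonymize(user_id)

print(process("user-42", "analytics"))  # pseudonymous token, not the raw ID
print(process("user-42", "ads"))        # None — purpose was not consented to
```

The design choice is that the consent check and the pseudonymization sit in front of every downstream consumer, rather than being bolted on afterwards — which is exactly the shift from black-box processing that Privacy by Design, and the GDPR behind it, demand.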

Privacy shouldn’t impede progress in this new world of Artificial Intelligence if everyone complies with the same standards and guidelines for data use.

More than ever, consumers need to be informed

Consumers have to be informed, and they need to remain vigilant in questioning business intentions. Hopefully, the GDPR will be the vehicle to sustain this. ​The profit motive tempts business to cross the proverbial line in determining to what extent, and in what circumstances, consumer information may be used. Nor should we discount the partnerships that big tech and telcos have with government, including DARPA, or with contractors like Palantir, to advance data contextualization in systems for security, and also for surveillance. The most recent news about Palantir validates the need for controls as surveillance of individuals en masse becomes more commonplace: Palantir Knows Everything About You. In that article, this ominous warning speaks volumes:

The software combs through disparate data sources—financial documents, airline reservations, cellphone records, social media postings—and searches for connections that human analysts might miss. It then presents the linkages in colorful, easy-to-interpret graphics that look like spider webs. U.S. spies and special forces loved it immediately; they deployed Palantir to synthesize and sort the blizzard of battlefield intelligence. It helped planners avoid roadside bombs, track insurgents for assassination, even hunt down Osama bin Laden. The military success led to federal contracts on the civilian side... The U.S. Department of Health and Human Services: detect Medicare fraud. The FBI uses it in criminal probes. The Department of Homeland Security: to screen air travelers and keep tabs on immigrants. 

...there was no mistaking the implications of the incident: All human relations are a matter of record, ready to be revealed by a clever algorithm. Everyone is a spidergram now.

We’ve seen Alibaba’s, Tencent’s and Baidu’s relationships with the Chinese government, surfacing insights on individuals to ​create a society of compliance and trust​ through a Social Credit System. We’ve also witnessed instances south of the border where the executive branch tried to aggregate voter data and centralize it. The states claimed jurisdiction and refused, and rightly so. To do otherwise would cause more harm than good.

What I wrote 4 years ago in my ​Quid Pro Quo post​ still rings true today:

“The balance here lies in ensuring the consumer doesn’t have the business by the noose. On the flip side, business must understand, in aggregate as well as the consumer level, the information that impacts parts of the purchase cycle... then both [business and consumer] have to come to an agreement of give-and-take.

  • What do I, as a business have to do to retain you, as a customer?
  • What am I, as a customer, willing to give you, the business to keep me satisfied and coming back?”

From a historical perspective, we have to learn from the past and visualize what we want our future to be. What we enable in business needs to be viewed from a ten-thousand-foot level. I take that wisdom from a blog called Writingya:

“[Facebook CEO] Mark Zuckerberg says privacy is no longer a social norm. When he said that, I tweeted — and I never tweet, that’s how angry I was — it may not be relevant for the social network, but it is for intimacy and democracy. Technology makes people stupid. It can blind you to what your underlying values are and need to be. Are we really willing to give away our Constitutional and civil liberties that we fought so hard for? People shed blood for this, to not live in a surveillance society. We looked at the Stasi and said, ‘That’s not us.’ And now we let Apple do it and Google do it? For what?”