The Enterprise Leader's Quick-start Guide to Data Storytelling
What every leader should be thinking about right now to limit biases and communicate critical data stories more effectively
Biases exist naturally; they’re simply part of the human condition. Still, biases significantly influence how each of us analyzes, interprets, and communicates data.
What’s more, those who aren’t classically trained data practitioners (those who never set out to build a career in research, analytics, or data science but now find themselves in a data-focused role) can often feel as if they’re “flying blind,” unaware of what they simply don’t know or can’t control.
This resource gets to the crux of how to improve the quality of your data and mitigate biases, so that you can learn to trust your data and make well-informed decisions. It’s a “quick-start guide” because it outlines the most important activities every enterprise leader should be prioritizing right now to improve data efficacy and make better decisions.
How Understanding Cognitive Bias Improves Data Storytelling
If left unchecked, biases can result in less objective decision-making or, worse, potentially costly financial mistakes.
So, if you’re tasked with collecting, interpreting, or communicating data (as nearly everyone is these days), start by better understanding the types of cognitive biases most likely to affect your judgment.
It’s worth pointing out that cognitive bias in and of itself is a topic deserving of a much lengthier and more complex discussion than we can possibly cover here. So we’ll only briefly introduce a couple of the most common biases we’ve encountered in our own data work, particularly when working with large, complex organizations.
Firstly, confirmation bias pops up time and time again, particularly when collecting, analyzing, and presenting data. Confirmation bias, as the name suggests, occurs when we select data that supports our own arguments or hypotheses; it’s a shortcut that uses data to confirm what we already believe. It occurs frequently in large, highly bureaucratic organizations, where personal agendas can take priority and leaders often jockey to advance their own priorities.
Secondly (and for many of the same reasons), anchoring bias often runs rampant inside larger organizations or within larger functional teams where data analysis is siloed or compartmentalized. Anchoring bias, again as the name implies, occurs when you rely too heavily on, or "anchor" yourself to, one trait or piece of information when making data-driven decisions. As a result, your interpretation and analysis impact everything downstream from that anchor point.
We won’t pretend to be the go-to experts here, but we would offer this piece of guidance: make it a point to stay current with the latest research and opinions coming from the more academically focused voices in the analytics field.
Doing so is often an effective way to better understand where others are placing their bets. The field of visual analytics is still relatively “raw” in the sense that we’re learning more and more each day about both: (1) how humans interpret and analyze information, and (2) how to mitigate some of our natural biases as a means to more effectively analyze information and data.
For instance, a recently published doctoral dissertation (August 2020) examined the effects of detecting and mitigating human bias in visual analytics, specifically, exploring—among other things—the latest bias mitigation strategies that have proven promising or effective.
The research references two high-level categories of bias mitigation techniques — a priori and real-time. While a priori bias mitigation strategies seek to reduce bias before the analysis stage (often in the form of educational training that examines past errors to inform future decision-making), real-time bias mitigation techniques aim to identify and mitigate bias at the point of analysis, in real-time. The thought process here is clear: “If biased decision making processes can be assessed and measured in real-time, bias mitigation strategies can do more than simply educate analysts beforehand.” That is, they may be able to reduce bias at the point of interpretation and analysis.
Induction bias, selection bias, and survivorship bias are also common cognitive biases that exist when working with data in any capacity. It’s a bit much to cover each of those in detail in this guide, so if you’re looking to go a bit deeper, we’d recommend this closer look at cognitive bias in data science.
There are a variety of bias mitigation techniques, yet one category is particularly intriguing given the growing capabilities of machine learning. These techniques are often labeled “machine-initiative” because they rely on visual analytic tools to play a central role in bias mitigation: the machine operates as an unbiased collaborator that can act on behalf of the interpreter, or take initiative, to flag potentially biased analysis processes.
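To make the idea concrete, here is a minimal sketch of the kind of signal a machine-initiative tool might compute in real time. The metric and all names here (focus_bias_score, the category labels) are hypothetical illustrations, not a method from the research cited above: it simply measures how far an analyst’s inspection pattern deviates from an even spread across categories.

```python
from collections import Counter

def focus_bias_score(interactions, categories):
    """Hypothetical coverage-style metric: compare how often an analyst
    inspected each category against a perfectly even spread. Returns a
    score in [0, 1]; higher means attention is more skewed."""
    counts = Counter(interactions)
    n = len(interactions)
    expected = 1 / len(categories)
    # Total absolute deviation from a uniform spread...
    deviation = sum(abs(counts[c] / n - expected) for c in categories)
    # ...normalized so that focusing on a single category scores 1.0.
    return deviation / (2 * (1 - expected))

# An analyst who drills into "A" nine times out of ten scores high;
# a tool could surface that as a real-time nudge to look elsewhere.
skewed = focus_bias_score(["A"] * 9 + ["B"], ["A", "B", "C", "D"])
even = focus_bias_score(["A", "B", "C", "D"], ["A", "B", "C", "D"])
```

A real tool would log richer interactions (hovers, filters, selections) and weight them accordingly, but the principle is the same: quantify the analyst’s focus and flag it when it drifts far from the data’s actual distribution.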
How Tracing Data Back to the Source Improves Data Storytelling
More than 70 percent of organizational leaders have said that suboptimal data quality has negatively impacted their business decisions.
Now, there are a number of contributing factors at play here, yet it’s important to recognize that one of those factors is cognitive bias. In this case, individuals who are tasked with interpreting and analyzing data often assume that the data they’re working with is clean and accurate. In many cases, that’s correct. In many others, it’s an incorrect and consequential assumption, or a loosely held bias about the data itself.
Deep-rooted inconsistencies or glaring biases in your source data make your decision-making process fragile. Most people, especially those downstream from the source data, seldom think to question the data they’re working with until there’s a clear and compelling reason to do so. The alternative path is far more effective: entertain your own curiosities, ask questions of both your team and your data, and consider possibilities beyond the ones you seek. Thus, to reduce unnecessary risk, it’s vital that each individual make a concerted effort to thoroughly inspect and question critical source data.
This step is necessary both at an organizational and functional level.
Yet, in many cases, this step is either lacking entirely, or it’s been done rather haphazardly. Thus, it’s worth spending the time and energy necessary to map existing data flows within your specific function or team.
When doing so, identify any “dead ends” or silos you encounter along the way. Even if a governance team is working on a similar initiative, it’s still worth a second look. You want to intimately understand where critical data is housed, how it’s collected, and where it moves over time (both upstream and downstream). Even if everything checks out, at worst you’ve verified the cleanliness and accuracy of critical data sets, which is hardly a bad thing.
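As a sketch of what this mapping can look like in practice, the snippet below models data flows as (source, destination) pairs and lists datasets that receive data but never feed anything downstream. The flow names are invented for illustration, and note that intended endpoints (like a dashboard) will also surface and need a human to triage.

```python
def find_dead_ends(flows):
    """Given data flows as (source, destination) pairs, return the
    destinations that never act as a source themselves -- candidate
    "dead ends" worth a closer look."""
    sources = {src for src, _ in flows}
    destinations = {dst for _, dst in flows}
    return sorted(destinations - sources)

# Invented example flows for one team's reporting pipeline.
flows = [
    ("crm_export", "staging_db"),
    ("staging_db", "sales_mart"),
    ("staging_db", "legacy_report"),  # nothing consumes this report
    ("sales_mart", "exec_dashboard"),
]
print(find_dead_ends(flows))  # ['exec_dashboard', 'legacy_report']
```

Here the dashboard is a legitimate terminus, while the unconsumed legacy report is exactly the kind of silo worth questioning.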
There are a variety of tools that can assist you in this process, many of which will allow you to simplify and automate some of the critical tasks involved in discovering, profiling, and indexing data. In fact, if your organization already has a dedicated data governance team (or a group or business function acting in a similar capacity), it’s likely that an enterprise data intelligence platform is already in place. If that’s indeed the case, check with that team to better understand how the tool and technology stack can be used to assist in this process. Any sound data governance practice will be actively using these kinds of tools at a global level.
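Even without a dedicated platform, the basic profiling step is easy to sketch. The function below is a simplified stand-in for what data intelligence tools automate at scale (the column names are invented); it reports a missing-value rate and a distinct-value count per column:

```python
def profile_rows(rows):
    """Minimal profiling sketch for a list of dict records: per-column
    missing-value rate and count of distinct non-empty values."""
    profile = {}
    for col in rows[0].keys():
        values = [r.get(col) for r in rows]
        missing = sum(v in (None, "") for v in values)
        profile[col] = {
            "missing_rate": missing / len(rows),
            "distinct": len(set(values) - {None, ""}),
        }
    return profile

# A blank region field is exactly the kind of quiet quality issue
# that skews downstream analysis if nobody thinks to check for it.
rows = [
    {"id": 1, "region": "EMEA"},
    {"id": 2, "region": ""},
    {"id": 3, "region": "EMEA"},
]
```

Real profiling tools also check types, ranges, referential integrity, and freshness, but even this crude pass surfaces the assumptions you’d otherwise carry silently into your analysis.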
Still, despite the clear benefits of auditing, cataloging, and organizing existing data, only 20% of organizations say they publish data provenance and data lineage information internally, and most of those who don’t say they have no plans to start. Don’t make this mistake. Making this information readily accessible to all teams is one of the most important steps toward a more transparent, data-driven culture, which, maintained and governed properly, is critical to improving data quality directly at the source.
How a System of Checks and Balances Improves Data Storytelling
Bias mitigation techniques take many forms; some are more intricate and complex than others.
Still, one of the best ways to reduce bias in your analysis and reporting process is to include people who bring varied perspectives to the same set of data. While it seems like a no-brainer, many teams instead limit data interpretation and analysis to a small group of individuals, usually those in an analyst role or similar, who may lack the specific domain expertise needed to understand the full picture.
Instead, seek out individuals who have the domain expertise that you or your team might lack. You’re not gathering perspectives just for the sake of doing so; you’re looking for the specific people who, by the very nature of their work, have a vested interest in the data and thus deserve a seat at the table. These individuals can often help identify flaws or patterns, or supply context, that might otherwise go unnoticed or, worse, be misinterpreted.
Finally, for many of the same reasons, it’s worth implementing a review process to ensure that the narrative that you’re trying to communicate is clear, accurate, and actionable.
It’ll be difficult (if not impossible) to remove all biases from your reporting. That’s okay. Your goal isn’t perfection; it’s simply unattainable. Instead, you’re looking to implement the safeguards necessary to ensure that your team and organization are making the best decisions with the best data available. To do so consistently still requires a keen human eye, especially given our increased reliance on machine learning and artificial intelligence (AI) models and tools, particularly those that are used to collect, store, analyze, and visualize data.
In fact, Gartner predicts that, by 2023, 75% of large organizations will hire AI behavior forensic, privacy, and customer trust specialists to reduce reputation risk as a direct result of biased data. In addition, large organizations like Facebook, Google, Bank of America, NASA, and others have moved to appoint forensic specialists who primarily focus on uncovering undesired bias in AI models before they’re deployed. These specialists are validating models during the development phase and continue to monitor them once they’re released into production, as unexpected bias can be introduced because of the differences between training and real-world data.
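One concrete check such specialists rely on is comparing the distribution a model was trained on against what it sees in production. A common metric for this is the population stability index (PSI); the sketch below is a simplified version (bin proportions are assumed precomputed, and the ~0.25 threshold is a widely used rule of thumb rather than a hard standard):

```python
import math

def psi(expected_pcts, actual_pcts):
    """Population Stability Index between two binned distributions,
    each given as a list of bin proportions summing to 1. Larger
    values mean the live data has drifted further from training."""
    eps = 1e-6  # guard against log(0) when a bin is empty
    return sum(
        (a - e) * math.log((a + eps) / (e + eps))
        for e, a in zip(expected_pcts, actual_pcts)
    )

# Training data was evenly spread across four bins, but production
# traffic now concentrates in the first bin -- a drift a monitoring
# job could flag before the model quietly degrades.
train = [0.25, 0.25, 0.25, 0.25]
production = [0.70, 0.10, 0.10, 0.10]
drift = psi(train, production)  # well above the ~0.25 rule of thumb
```

Monitoring a handful of such statistics per feature, on a schedule, is a lightweight way to catch the training-versus-real-world gaps described above before they reach a decision-maker.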
Safeguards like a peer review process are critically important; the more thorough those safeguards, the easier it will be to identify, correct, and reduce incidences of obvious or unexpected bias. A review process remains one of the most effective methods you have to ensure you’re not relying on inaccurate assumptions or biased preconceptions as the foundation of your data storytelling. Hopefully, more trained eyes mean fewer mistakes.
Most organizations would readily recognize that they’re sometimes making ineffective decisions because of inaccurate or incomplete data. What’s more, anyone who’s charged with interpreting data is also subject to bias, which is a fundamental part of the human condition.
It’s important to recognize that biases exist naturally. In fact, it’s nearly impossible to eliminate biases altogether when working with massive volumes of data.
Still, it’s important to remember that data is rarely, if ever, 100% objective; it’s humans who give meaning to these figures, draw inferences from their presentation, and define meaning through personal interpretation. It’s up to each individual, then, to better understand the ways in which biases affect our own perceptions and to take an active role in limiting the effects of cognitive bias in our interpretation, understanding, and reporting of critical data.