The Role of Color Theory in Data Visualization
Colors affect our perception of information we take in from the world around us, so it’s important to properly use color in your data visualizations. Colors can either aid in communication or distract from it.
As a data practitioner seeking to create effective visualizations, you should familiarize yourself with the principles of color theory and the effects they can have on your visual displays, especially if you aren’t classically trained in design. So let’s dive in.
How Our Brains React to Color
After visual input hits the retina, it flows into the brain, where attributes such as shape, color, and orientation are processed in as little as 13 milliseconds, according to MIT neuroscientists. This is known as preattentive processing (PaP). It begins with the rods and cones, the two types of photoreceptors in the retina, which respond to the shapes and colors in the world around us.
Because of this, a carefully selected color palette helps to harness the preattentive processing powers of the human brain, making insights clearer and easier to find. A badly chosen color palette obscures the information your users need to understand and makes your data visualization less effective. That is all the more reason for you, as a data practitioner, to be well-versed in the principles of color theory.
Color Theory Principles
Color theory is the study of color from both scientific and subjective perspectives. It seeks to understand how color influences human perception and how it can be used to support better decision-making in communication and design.
The study of color theory provides an immense body of practical guidance to visual design — far too much detail for us to fully get into here — but there are a few elements that you should care about most when it comes to data visualization:
Color anatomy encompasses the various aspects that make up color and how we view it. These aspects are broken down into three parts: hue, saturation, and lightness.
While not all designers may agree that color is perfectly synonymous with hue, at its most basic, hue refers to the color family we perceive, such as red, blue, or green. Different hues can be selected to represent different values or categories in your visualizations.
Saturation refers to the intensity or purity of a color. High saturation produces vibrant colors, whereas low saturation produces duller, grayer colors. Be mindful of too much saturation, as it can overwhelm your chart and make it difficult to find other visual elements.
Lightness is closely related to saturation, but describes how light or dark a color is. It is adjusted with tints (adding white) and shades (adding black) rather than by changing the color’s intensity. In both cases, though, what you are left with is a scale of intensity that can be used to showcase differentiation in your interface. (See color schemes below.)
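To make the three parts concrete, here is a minimal sketch using Python’s standard colorsys module. The helper names describe and adjust, and the sample blue, are our own illustrative choices rather than anything prescribed by color theory.

```python
import colorsys

def describe(rgb):
    """Return (hue in degrees, saturation %, lightness %) for an 8-bit RGB triple."""
    r, g, b = (c / 255 for c in rgb)
    h, l, s = colorsys.rgb_to_hls(r, g, b)  # colorsys returns H, L, S in that order
    return round(h * 360), round(s * 100), round(l * 100)

def adjust(rgb, saturation=None, lightness=None):
    """Return a new 8-bit RGB triple with saturation and/or lightness replaced (0..1)."""
    r, g, b = (c / 255 for c in rgb)
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    s = s if saturation is None else saturation
    l = l if lightness is None else lightness
    return tuple(round(c * 255) for c in colorsys.hls_to_rgb(h, l, s))

blue = (31, 119, 180)                          # an arbitrary sample color
print(describe(blue))                          # the original color
print(describe(adjust(blue, saturation=0.2)))  # same hue, duller
print(describe(adjust(blue, lightness=0.8)))   # same hue, lighter tint
```

Printing the three descriptions side by side shows the hue staying roughly fixed while only saturation or lightness moves.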
Color harmony refers to the principle that certain colors, when used in combination, can create visual contrast or cohesion. As a data practitioner, you should examine how you can use the color wheel and color harmony to make smart palette decisions. Depending on what kind of story you want to tell with your data, you can use different arrangements to maximize the impact.
Understanding the above concepts will help you choose the right color scheme for your visual display. Color schemes can be repeated to emphasize similarity or contrasted to differentiate what’s important from what’s not. They can make use of saturation and lightness to represent degrees of value or intensity, all of which helps create visual hierarchy. Choose whichever scheme best tells your data story, and use it deliberately to reinforce the intended narrative.
For example, a monochromatic palette is best for scenarios where your intention is to suggest that your data is sequential or varies in degree instead of kind. This is all dependent upon how tint and shade are utilized to show varying levels of lightness in the visualization.
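As a rough sketch of that idea, the function below builds a monochromatic ramp by holding hue and saturation fixed and stepping lightness from light to dark. The function name and the default saturation and lightness bounds are assumptions we picked for illustration, so tune them to your own charts.

```python
import colorsys

def sequential_palette(hue_degrees, steps=5, saturation=0.6,
                       lightest=0.9, darkest=0.3):
    """Return `steps` hex colors of one hue, ordered from light to dark."""
    if steps < 2:
        raise ValueError("a sequential palette needs at least two steps")
    colors = []
    for i in range(steps):
        # Interpolate lightness linearly between the two bounds.
        lightness = lightest + (darkest - lightest) * i / (steps - 1)
        r, g, b = colorsys.hls_to_rgb(hue_degrees / 360, lightness, saturation)
        colors.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return colors

# Five blues, lightest first; map low values to the start of the list.
print(sequential_palette(210))
```

Linear steps in lightness are not perfectly even to the eye, so treat the output as a starting point and check it visually.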
Analogous pairings, colors that sit beside each other on the color wheel, provide a more varied alternative for sequential data visualization. While still remaining separate from one another visually, they create far less contrast than colors with dissimilar hues, which gives the perception that the items are still closely related but different in some way. This can be helpful in visualizing different, but equally important metrics or other data points.
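One way to sketch analogous colors in code is to rotate the hue of a base color by a small angle in each direction. The helper names and the 30-degree spread below are illustrative assumptions, not a fixed rule.

```python
import colorsys

def rotate_hue(hex_color, degrees):
    """Return hex_color with its hue rotated by `degrees` around the hue circle."""
    digits = hex_color.lstrip("#")
    r, g, b = (int(digits[i:i + 2], 16) / 255 for i in (0, 2, 4))
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    h = (h + degrees / 360) % 1.0  # wrap around the circle
    r, g, b = colorsys.hls_to_rgb(h, l, s)
    return "#{:02x}{:02x}{:02x}".format(*(round(c * 255) for c in (r, g, b)))

def analogous(hex_color, spread=30):
    """Return the base color flanked by its two neighbors on the hue circle."""
    return [rotate_hue(hex_color, -spread), hex_color, rotate_hue(hex_color, spread)]

print(analogous("#1f77b4"))
```

A smaller spread keeps the colors looking more closely related, while a larger one makes the metrics easier to tell apart.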
Complementary color pairings are counterparts that sit opposite each other on the color wheel and represent the strongest possible contrast for those two colors, like red and green or blue and orange on the traditional color wheel. Making use of these kinds of pairings is an easy shortcut to give the perception that the things they represent are opposing, such as "positive" and "negative" impact or gains versus losses.
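In code, a complement can be approximated by rotating the hue half a turn. Keep in mind that Python’s colorsys module works on the RGB/HSL hue circle, where the opposite of red is cyan rather than the green of the traditional painter’s wheel. The function names and the sample orange are our own assumptions for illustration.

```python
import colorsys

def complement(rgb):
    """Return the color opposite `rgb` on the HSL hue circle (8-bit RGB in and out)."""
    r, g, b = (c / 255 for c in rgb)
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    r2, g2, b2 = colorsys.hls_to_rgb((h + 0.5) % 1.0, l, s)  # half a turn
    return tuple(round(c * 255) for c in (r2, g2, b2))

def color_for(value, positive_rgb):
    """Use the base color for gains and its complement for losses."""
    return positive_rgb if value >= 0 else complement(positive_rgb)

gain_color = (230, 97, 1)  # an arbitrary orange chosen for this example
for change in (4.2, -1.7):
    print(change, color_for(change, gain_color))
```

Starting from an orange or blue base, rather than red or green, gives an opposing pair that more viewers can tell apart.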
All this isn’t to say that it's suddenly necessary for you to memorize complementary, analogous, or triadic color pairings; what does matter here is that you are familiar with these principles to the extent that you can leverage colors to provide adequate meaning in your data visualizations.
Best Practices with Color
Within our data visualization work here at RevUnit, we’ve gathered some best practices when it comes to applying color to your data visualizations.
Roughly 4 percent of the population has some sort of color blindness, with most being men. The form of color blindness most of us are familiar with causes confusion between certain shades of red and green, though there are also forms of color blindness that cause blue and yellow shades to look similar.
Because green and red are a complementary color pairing commonly used to represent positive and negative values, overlooking color accessibility in this way is more common than you might think.
It’s up to all data practitioners to be mindful of these nuances, and many already are — a host of color selection tools have been developed to assess how your visualizations might look to those with color perception deficiencies. We highly recommend you take a look at these tools to ensure your visualizations are accessible to everyone who may view them.
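Dedicated tools simulate color vision deficiencies directly, which we don’t attempt here. As a lighter-weight check, the sketch below computes the contrast ratio defined in the WCAG 2.x guidelines, which compares how light or dark two colors are regardless of hue. Treat it as a rough proxy only: two colors with nearly equal luminance lean entirely on hue to be told apart. The function names and sample colors are our own.

```python
def relative_luminance(rgb):
    """WCAG 2.x relative luminance of an 8-bit sRGB triple (0 = black, 1 = white)."""
    def linearize(channel):
        c = channel / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(rgb_a, rgb_b):
    """WCAG contrast ratio between two colors, from 1 (identical) to 21 (black/white)."""
    la, lb = relative_luminance(rgb_a), relative_luminance(rgb_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)

red, green = (255, 0, 0), (0, 128, 0)
print(round(contrast_ratio(red, green), 2))            # low: hue is doing all the work
print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 2))
```

If a pair scores close to 1, consider changing the lightness of one color so the difference survives even when the hues are confused.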
As a general rule, you should always pick the same color to represent the same thing; be consistent with your color selection and what it represents in your visualizations. Humans naturally perceive color as a pattern, so when they are presented with a color across multiple charts, they will assume it is a representation of the same object or entity.
When that color represents different things in different visualizations, that’s when you can run into trouble as a data practitioner. Consistency in your color selection is vital for increased clarity and limited confusion in your visualizations.
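One simple way to enforce this across charts is to keep a single shared lookup from category to color instead of letting each chart pick its own. The ColorRegistry class, the hex values, and the category names below are hypothetical, purely to sketch the idea.

```python
class ColorRegistry:
    """Hands out one fixed color per category and never reassigns it."""

    def __init__(self, palette):
        self._palette = list(palette)
        self._assigned = {}

    def color_for(self, category):
        if category not in self._assigned:
            if len(self._assigned) >= len(self._palette):
                raise ValueError(f"no unused color left for {category!r}")
            # Take the next unused color, in palette order.
            self._assigned[category] = self._palette[len(self._assigned)]
        return self._assigned[category]

# Share one registry across every chart in a report or dashboard.
registry = ColorRegistry(["#1b9e77", "#d95f02", "#7570b3"])
chart_one = [registry.color_for(c) for c in ("North", "South")]
chart_two = [registry.color_for(c) for c in ("South", "East", "North")]
print(chart_one, chart_two)
```

Raising an error when the palette runs out is a deliberate choice here: silently reusing a color would reintroduce the very ambiguity you are trying to avoid.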
Don’t Just Rely on Color
Don’t use color as a crutch in your visualizations. Without a set of axes or labeled values, color alone cannot portray quantities precisely. It works best as a supporting cue alongside encodings such as length or position.
Just be careful to remember that while color can aid in strengthening your data storytelling, you’re still working with raw numbers that require more than just color in order for their full impact to be understood. Never use color as a sole indicator in a visualization. It adds a lot to your visualization to be sure, but what we’re going for here is data storytelling, not color storytelling.
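A common way to act on this is redundant encoding: give each series a marker shape, a line style, and a text label in addition to its color. The sketch below only builds the style table. The marker and line style names follow matplotlib’s naming, but nothing here depends on that library, and the colors and series names are illustrative assumptions.

```python
from itertools import cycle

COLORS = ["#1b9e77", "#d95f02", "#7570b3"]
MARKERS = ["o", "s", "^"]            # circle, square, triangle
LINESTYLES = ["solid", "dashed", "dotted"]

def build_styles(categories):
    """Give every category a color plus a marker, line style, and label."""
    styles = {}
    encodings = zip(categories, cycle(COLORS), cycle(MARKERS), cycle(LINESTYLES))
    for name, color, marker, linestyle in encodings:
        styles[name] = {
            "color": color,
            "marker": marker,
            "linestyle": linestyle,
            "label": name,  # a direct label still works in grayscale
        }
    return styles

for name, style in build_styles(["Revenue", "Cost", "Profit"]).items():
    print(name, style)
```

With more than three categories the cycles repeat together, so extend the lists if you need every combination to stay unique.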
Don’t Use Too Much Color
It’s also important to not go overboard with your color selections; too many will simply be overwhelming. As a rule of thumb, the Data Visualization Society recommends limiting your palette size to 10 or fewer colors. Once you go beyond this threshold, your audience will tend to have trouble distinguishing between groups in your visualization.
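If your data has more categories than that, one option is to keep the most frequent ones and fold the rest into a single catch-all group before assigning colors. The function name, the "Other" label, and the sample data below are our own assumptions for the sake of the sketch.

```python
from collections import Counter

def limit_categories(labels, max_colors=10, other_label="Other"):
    """Relabel the rarest categories so at most `max_colors` distinct labels remain."""
    labels = list(labels)
    counts = Counter(labels)
    if len(counts) <= max_colors:
        return labels
    # Reserve one color slot for the catch-all group.
    keep = {name for name, _ in counts.most_common(max_colors - 1)}
    return [label if label in keep else other_label for label in labels]

# Twelve categories, "c0" most frequent and "c11" least.
raw = [f"c{i}" for i in range(12) for _ in range(12 - i)]
trimmed = limit_categories(raw)
print(sorted(set(trimmed)))
```

Grouping by frequency is only one choice; you might instead keep the categories that matter most to your audience.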
More broadly, you should use color as a functional tool in your data visualizations, not an aesthetic one. This is especially true when seeking consistency in your visualizations. While it’s nice to have your data look pleasing to the eye, having your data tell a clear story is far more important. Be intentional about your color selection, and be sure your choices serve the purpose, not the appearance, of your data story.
Don’t Default to Brand Colors
In many cases, data practitioners are asked to use a brand color palette in their visualizations. Although brand colors can be a starting point, there needs to be a give-and-take that balances the principles of color we outlined above.
Brand colors will not always give you the necessary range or diversity of color that you require in a particular visualization, nor will every chart you create require multiple colors. In addition, many brand color combinations aren’t ADA-accessible. To avoid this, data practitioners should work with their brand teams to incorporate brand colors only where necessary, and use them less frequently as visual indicators.
A compromise with your brand team may be including the brand colors on the navigation bar with the logo, or on informational screens, but not in the visualizations themselves. Or, if the primary brand color is blue, you could use tints and shades of that blue in your graphics.
As we said before, color theory represents an immense area of study that you’re certainly not going to get through in an afternoon. So if you’re to take anything away from this article, it should be to stay intentional and consistent with your color choices in your data visualizations.
Always consider your color choices as carefully as possible when presenting visualizations to others. Proper palette selection will always make delivering your intended message that much easier (and faster).