Effective use of color in data visualization is crucial for communicating your message clearly. However, color is inherently subjective—it varies by person, culture, and even across generations. While no single “perfect” color scheme exists, following some general guidelines can help you create more readable, appealing, and accessible graphics.
15.1 What to consider?
As mentioned in Chapter 6, you can have a lot of fun choosing color palettes (e.g., MetBrewer). However, there are some caveats to be aware of.
15.1.1 Think about the nature of the variable
First, before selecting a color palette, determine the nature of the variable you want to display. In general, there are four categories:
For variables with an inherent order, choose a palette that represents a gradual change. For example, color can encode the body mass of penguins.
Transition between hues to indicate progression. For example, use a distinct hue for each species, and then vary the color within that hue to show body mass.
Additionally, verify that your plot is color-blind friendly whenever possible. Many modern palettes are designed to be distinguishable for viewers with common forms of color vision deficiency. If you are interested, you can read more here.
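One way to run such a check is to simulate how a palette appears under different forms of color vision deficiency. The sketch below assumes the colorspace package (introduced in the next section) is installed; the "Dark 3" palette and the choice of three colors are illustrative only.

```r
library(colorspace)

# Three qualitative colors, e.g. one per penguin species (illustrative choice)
pal <- qualitative_hcl(3, palette = "Dark 3")

# Simulate the palette as seen with three common forms of color vision deficiency
pal_deutan <- deutan(pal)  # green-deficient vision
pal_protan <- protan(pal)  # red-deficient vision
pal_tritan <- tritan(pal)  # blue-deficient vision

# Show the original and the simulated palettes side by side
swatchplot(
  "Original" = pal,
  "Deutan"   = pal_deutan,
  "Protan"   = pal_protan,
  "Tritan"   = pal_tritan
)
```

If the simulated swatches are hard to tell apart, the palette is probably a poor choice for that plot.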
15.2 My Go-To Solution: The colorspace Package
Based on these considerations, my preferred tool is the colorspace package. It provides a unified and flexible interface to generate color palettes tailored to the type of data you are visualizing.
For example, consider the following code that uses colorspace to create a scatterplot of penguins data:
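A minimal sketch of such a plot, assuming the data come from the palmerpenguins package; the plotted variables and the "Dark 3" palette are illustrative choices, not necessarily the ones used originally.

```r
library(ggplot2)
library(colorspace)
library(palmerpenguins)  # assumed source of the penguins data

# Scatterplot with species mapped to a qualitative (unordered) palette
p <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  geom_point(na.rm = TRUE) +  # drop rows with missing measurements silently
  scale_color_discrete_qualitative(palette = "Dark 3") +
  theme_minimal()

p
```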
The color scale function, scale_color_discrete_qualitative(), is part of the colorspace package. Its naming convention follows the structure:
scale_<aesthetic>_<datatype>_<colorscale>()
<aesthetic>: The visual attribute (e.g., color or fill).
<datatype>: The type of the variable being plotted (discrete, continuous, or binned).
<colorscale>: The palette type (e.g., qualitative, sequential, diverging).
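To illustrate how the pieces of the name combine, here is a short sketch with two other scale functions that follow the same pattern. It again assumes the palmerpenguins data; the palette names "Viridis" and "Dark 3" are illustrative choices.

```r
library(ggplot2)
library(colorspace)
library(palmerpenguins)  # assumed source of the penguins data

# color + continuous + sequential: body mass mapped to a gradual gradient
p_seq <- ggplot(penguins, aes(flipper_length_mm, bill_length_mm, color = body_mass_g)) +
  geom_point(na.rm = TRUE) +
  scale_color_continuous_sequential(palette = "Viridis")

# fill + discrete + qualitative: species mapped to distinct hues
p_qual <- ggplot(penguins, aes(species, fill = species)) +
  geom_bar() +
  scale_fill_discrete_qualitative(palette = "Dark 3")

p_seq
p_qual
```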
DON’T use the rainbow color palette.
Although visually striking, rainbow palettes can distort data interpretation: perceived lightness does not change evenly along the rainbow, so some color transitions look like sharp boundaries that are not present in the data.
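The uneven lightness can be checked numerically. The sketch below compares the luminance of base R's rainbow() palette with a sequential palette from colorspace; the helper function luminance() is a hypothetical name introduced only for this illustration, and seven colors is an arbitrary choice.

```r
library(colorspace)

# Hypothetical helper: luminance (the L coordinate in polar LUV space) of hex colors
luminance <- function(cols) {
  coords(as(hex2RGB(cols), "polarLUV"))[, "L"]
}

lum_rainbow <- luminance(rainbow(7))
lum_seq     <- luminance(sequential_hcl(7, palette = "Viridis"))

# Rainbow luminance goes up and down; the sequential palette changes steadily
round(lum_rainbow)
round(lum_seq)

# Desaturated swatches make the difference visible
swatchplot(
  "rainbow"           = rainbow(7),
  "rainbow (gray)"    = desaturate(rainbow(7)),
  "sequential"        = sequential_hcl(7, palette = "Viridis"),
  "sequential (gray)" = desaturate(sequential_hcl(7, palette = "Viridis"))
)
```

In the grayscale rows, the rainbow palette should show light and dark bands in no consistent order, while the sequential palette should move steadily from dark to light.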
Use the viridis package for continuous sequential variables.
Another popular option is the viridis package, which offers excellent color palettes. For an example, see the package vignette: https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html
However, it is mainly optimized for continuous sequential variables and might not be as flexible as colorspace for all types of data.
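For a continuous variable such as body mass, a viridis scale can be added much like a colorspace scale. This is a minimal sketch assuming the viridis and palmerpenguins packages are installed; the plotted variables and option = "D" (the default viridis map) are illustrative choices.

```r
library(ggplot2)
library(viridis)
library(palmerpenguins)  # assumed source of the penguins data

# Continuous variable (body mass) mapped to the viridis color map
p_viridis <- ggplot(penguins, aes(flipper_length_mm, bill_length_mm, color = body_mass_g)) +
  geom_point(na.rm = TRUE) +
  scale_color_viridis(option = "D") +
  theme_minimal()

p_viridis
```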
15.3 Additional Useful Resources
color-palette-finder: An interactive tool to help you choose the right color palette for your visualization.
配色事典 (A Dictionary of Color Combinations): A visually rich resource with a wide range of palettes. I like having a physical copy when choosing colors.
Traditional colors: A collection of traditional colors used in China and Japan. Many other color palettes from different cultures are available online. Try exploring them!
ColorPick Eye Dropper: A tool in Chrome that allows you to pick colors from any image and convert them to HEX code.