On this page we discuss and give examples of several distinct ways that
color can be used to label data. We also point out some pitfalls and suggest solutions.
Color is a powerful
tool for labeling graphic elements and has been extensively used in
scientific and engineering data visualization. Labeling applications can be broken
down by the type of data involved, which determines the constraints
on the choice of labeling colors:
Nominal data only
fall into distinct classes. They have no ordinal or quantitative structure.
With nominal data the color label only indicates membership in a non-quantitative
class (e.g., labeling the lines of a graph).
The particular colors
chosen need only be discriminable from each other and identifiable
from the legend.
Line Graphics. In simple line graphics color has long been used
to distinguish among lines. If the conditions represented by the
lines differ on only one dimension, other line characteristics (e.g.,
type, stroke width) can be used as a separate, correlated coding
that is useful to users with anomalous color vision. If the conditions
vary on multiple dimensions, color and other line characteristics
can be used independently to represent overlapping groupings.
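The two coding strategies described above can be sketched in a few lines of Python. This is only an illustration: the palette values, the dash-pattern names, and both function names are hypothetical, and any set of discriminable colors and line styles could be substituted.

```python
from itertools import product

# Hypothetical palettes; any discriminable colors and dash patterns would do.
COLORS = ["#1b9e77", "#d95f02", "#7570b3"]
DASHES = ["solid", "dashed", "dotted"]


def redundant_styles(conditions):
    """One dimension: color and dash pattern change together (correlated coding)."""
    if len(conditions) > min(len(COLORS), len(DASHES)):
        raise ValueError("too many conditions for the available styles")
    return {c: (COLORS[i], DASHES[i]) for i, c in enumerate(conditions)}


def independent_styles(factor_a, factor_b):
    """Two dimensions: color encodes factor_a, dash pattern encodes factor_b."""
    if len(factor_a) > len(COLORS) or len(factor_b) > len(DASHES):
        raise ValueError("too many levels for the available styles")
    return {
        (a, b): (COLORS[i], DASHES[j])
        for (i, a), (j, b) in product(enumerate(factor_a), enumerate(factor_b))
    }


# Correlated coding: each line is identifiable by color or by dash pattern alone.
print(redundant_styles(["low", "high"]))
# Independent coding: lines sharing a color share a level of the first factor.
print(independent_styles(["drug", "placebo"], ["male", "female"]))
```

With the correlated scheme, a user who cannot tell two of the colors apart can still fall back on the dash pattern. With the independent scheme, each visual dimension carries its own grouping.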
Problems: Failure of discrimination or identification.
The usual causes are trying to label too many classes (six to ten
is usually the maximum--four or five is easier to achieve), too
small symbols or stroke widths, too similar colors, and poor luminance
contrast between symbols and backgrounds.
Solutions: Use fewer colors (i.e., code some of the distinctions
with other graphic dimensions--for example, symbol shape), increase
stroke widths on symbols and lines, use maximally separated colors,
and choose symbol and line colors that have moderate luminance-contrast
with the backgrounds.
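One way to check luminance contrast between a symbol color and its background is sketched below. It assumes 8-bit sRGB colors and borrows the relative-luminance and contrast-ratio definitions used in the WCAG 2 accessibility guidelines; the text above does not say which contrast measure it has in mind, so treat this as one plausible choice rather than the intended metric.

```python
def _linearize(channel):
    """Convert one sRGB channel (0-1) to linear light."""
    if channel <= 0.04045:
        return channel / 12.92
    return ((channel + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance (0 = black, 1 = white) of an 8-bit sRGB triple."""
    r, g, b = (_linearize(c / 255) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg, bg):
    """WCAG-style contrast ratio, from 1 (identical luminance) to 21 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


# Yellow symbols on a white background have very little luminance contrast,
# even though the two colors differ clearly in hue.
print(round(contrast_ratio((255, 255, 0), (255, 255, 255)), 2))
print(round(contrast_ratio((0, 0, 128), (255, 255, 255)), 2))
```

A low ratio flags symbol and background pairs that are likely to be hard to see at small sizes or thin stroke widths.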
Ordinal data lie in classes that can be arranged in an ordered sequence on the
basis of some ordinal relationship, such as "greater than/less than".
They can be further subdivided into monopolar (increasing or decreasing)
and bipolar (both increasing and decreasing from zero or neutral).
With ordinal data the labeling colors of the graphic elements must be not only
discriminable and identifiable, but also visually ordered. The color
assignments have to express the sequential relationships among the
graphic elements. This can be achieved with a hue sequence, a saturation
sequence, a lightness sequence, or some combination of the dimensions.
Monopolar hue sequences can be obtained by mixtures of varying amounts
of two non-opponent hues, i.e., some pair other than red/green or
yellow/blue. Saturation and lightness are naturally visually ordered.
Combinations of saturation and lightness work well (see example
below). For bipolar ordinal labeling a combination of saturation
and lightness in two hues works well.
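The saturation-plus-lightness sequences described above can be sketched with Python's standard `colorsys` module. The function names, default hues, and lightness and saturation ranges are all assumptions chosen for illustration. Note that HLS "lightness" is a simple arithmetic quantity, not perceived lightness, so a sequence built this way is only roughly visually ordered.

```python
import colorsys


def monopolar_sequence(n, hue=0.6, light=(0.9, 0.3), sat=(0.3, 0.9)):
    """n colors of a single hue, running from pale and light to saturated and dark."""
    colors = []
    for i in range(n):
        t = i / (n - 1) if n > 1 else 0.0
        lightness = light[0] + t * (light[1] - light[0])
        saturation = sat[0] + t * (sat[1] - sat[0])
        colors.append(colorsys.hls_to_rgb(hue, lightness, saturation))
    return colors


def bipolar_sequence(n_per_side, neg_hue=0.0, pos_hue=0.6):
    """Two monopolar arms in different hues, joined at a neutral light gray."""
    negative = monopolar_sequence(n_per_side, neg_hue)[::-1]  # dark to light
    positive = monopolar_sequence(n_per_side, pos_hue)        # light to dark
    neutral = (0.95, 0.95, 0.95)
    return negative + [neutral] + positive


# Five ordered classes in one hue, then a seven-class diverging scheme.
for r, g, b in monopolar_sequence(5):
    print(f"#{round(r * 255):02x}{round(g * 255):02x}{round(b * 255):02x}")
print(len(bipolar_sequence(3)))
```

In the bipolar case the neutral midpoint is the lightest color, and each arm darkens and saturates with distance from zero, so magnitude is read from lightness and sign from hue.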
Labels. Color is often used to associate one of several quantities
or attributes with an area on a graphic. Such "choropleth"
maps are widely used to display economic or environmental variables.
Problems: In addition to failure of discriminability or
identifiability, the colors may fail to form a visual sequence.
Solutions: Use saturation and lightness rather than hue.
For more about color in maps, see the ColorBrewer
color selection tool.
Continuous data have not only sequential order but metric spacing as well.
There are two kinds, interval-scale data and ratio-scale data. A
difference of one unit on an interval scale is the same size over
the whole scale. This is also true of a ratio scale, with the added
constraint that there is a true zero on the scale, so a quantity that
has twice the scale value of another is twice as large in magnitude.
Labeling of interval-scale and ratio-scale data has all of the constraints
of the above classes and a few more. The visual relationships among
the colors are intended to express the quantitative relationship
among the elements. For interval-scale data elements this means
that two pairs of data that differ by the same amount should be
labeled with two pairs of colors that have the same visual difference,
e.g., the same brightness difference. A ratio-scale data element
that is twice the magnitude of a second data element should be labeled
with a color with twice the visual magnitude of the label for the
second, e.g., twice the brightness.
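The interval-scale requirement (equal data differences shown as equal visual differences) can be sketched as follows. The sketch assumes that CIE 1976 lightness L* is an acceptable stand-in for "brightness" and that the display follows the sRGB encoding. Both are simplifications, since actual appearance depends on the viewing context. The function names and the default L* range are hypothetical.

```python
def lstar_to_luminance(lstar):
    """Invert the CIE 1976 lightness formula: L* (0-100) to relative luminance Y (0-1)."""
    f = (lstar + 16.0) / 116.0
    delta = 6.0 / 29.0
    if f > delta:
        return f ** 3
    return 3 * delta ** 2 * (f - 4.0 / 29.0)


def luminance_to_srgb(y):
    """Encode linear luminance (0-1) as an 8-bit sRGB gray level."""
    v = 12.92 * y if y <= 0.0031308 else 1.055 * y ** (1 / 2.4) - 0.055
    return round(255 * v)


def gray_for_value(value, vmin, vmax, lmin=20.0, lmax=95.0):
    """Map a data value linearly onto L*, then onto an sRGB gray level."""
    t = (value - vmin) / (vmax - vmin)
    lstar = lmin + t * (lmax - lmin)
    return luminance_to_srgb(lstar_to_luminance(lstar))


# Equal steps in the data become equal steps in L*, which are unequal
# steps in the 8-bit gray levels sent to the display.
print([gray_for_value(v, 0, 100) for v in (0, 25, 50, 75, 100)])
```

Spacing the colors evenly in L* rather than in raw RGB values is what keeps the visual differences approximately uniform across the scale.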
In the best-designed
quantitative cases the relationships among the colors express
the quantitative relationships well enough to reveal "the big
picture" (trends, groupings, and other structure) without
the user having to laboriously recover each value from the scale.
This generally requires some art and judgment. The quantitative
relationships between the visual responses to the labeling colors
are not often readily calculated due to dependence on the viewing
context. A skillful mapping will approximately preserve values and
rates of change without introducing patterns in the colors that
are not in the data.
Problems: In addition to failure of discriminability or
identifiability, the visual spacing of the coding colors may not
conform to the spacing of the coded quantities.
Examples: Visualization. Data values that vary over a 2D or
3D space are often color-coded in scientific graphics, using a technique
known as "pseudocoloring". In this plot, positive values
of a bipolar variable are plotted in various lightnesses of cyan
and negative values in magenta. Topographic maps are also frequently
labeled in this way. In the NOAA aviation map below, terrain elevations
are coded by a smooth gradient of colors. Higher elevations are
in darker browns, lower elevations are in greens.
Solutions: Under some circumstances one can approximate
the metric spacing of the data by selecting colors from CIE (nominally)
uniform color spaces, e.g., L*u*v*. Colors can be viewed in the
L*u*v* space with the color tool of
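The CIE L*u*v* coordinates mentioned above can be computed directly, and the Euclidean distance between two colors in that space gives a rough estimate of their visual difference. The sketch below assumes 8-bit sRGB input and a D65 reference white, and uses the commonly published four-digit sRGB-to-XYZ matrix. The function names are hypothetical, and the result is only as uniform as L*u*v* itself, which is nominally, not exactly, uniform.

```python
import math

# D65 reference white (an assumption; use the white of the actual display if known).
XN, YN, ZN = 0.95047, 1.0, 1.08883


def _linearize(channel):
    """Convert one sRGB channel (0-1) to linear light."""
    if channel <= 0.04045:
        return channel / 12.92
    return ((channel + 0.055) / 1.055) ** 2.4


def srgb_to_xyz(rgb):
    """8-bit sRGB triple to CIE XYZ, with Y scaled so that white has Y = 1."""
    r, g, b = (_linearize(c / 255) for c in rgb)
    x = 0.4124 * r + 0.3576 * g + 0.1805 * b
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = 0.0193 * r + 0.1192 * g + 0.9505 * b
    return x, y, z


def _uv_prime(x, y, z):
    """CIE 1976 u', v' chromaticity; (0, 0) is returned for black."""
    d = x + 15 * y + 3 * z
    return (4 * x / d, 9 * y / d) if d else (0.0, 0.0)


def xyz_to_luv(x, y, z):
    """CIE XYZ to CIE 1976 L*u*v*."""
    yr = y / YN
    if yr > (6 / 29) ** 3:
        lstar = 116 * yr ** (1 / 3) - 16
    else:
        lstar = (29 / 3) ** 3 * yr
    u, v = _uv_prime(x, y, z)
    un, vn = _uv_prime(XN, YN, ZN)
    return lstar, 13 * lstar * (u - un), 13 * lstar * (v - vn)


def luv_distance(rgb1, rgb2):
    """Euclidean distance between two sRGB colors in L*u*v*."""
    return math.dist(xyz_to_luv(*srgb_to_xyz(rgb1)), xyz_to_luv(*srgb_to_xyz(rgb2)))


# Compare the spacing of consecutive steps in a candidate three-color scale.
scale = [(255, 255, 204), (161, 218, 180), (65, 182, 196)]
print([round(luv_distance(a, b), 1) for a, b in zip(scale, scale[1:])])
```

If consecutive distances differ widely while the coded quantities are evenly spaced, the scale will misrepresent the spacing of the data.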