Graphs are an effective technique for communicating quantitative information by graphical means. But to be effective, a graph needs careful attention to five things.
We use graphs for visual communication in business, statistics, research, and other areas, where quantitative data represents the key source of the communication.
Graphs communicate quantitative information that is derived from quantitative and categorical data.
The main purpose of graphs is not to show exact values, but to represent relations and connections between quantities. If our purpose is to report exact numbers, then it is far more effective to use tables. I do not consider tables a visualization technique, since tables encode values as text and not as visual structures. This is the main difference between graphs and tables. Graphs are especially useful for making comparisons. If we want to expose changes, differences, similarities, and deviations in data, then a graph might be the best tool.
There are five things that need our attention when designing a graph:
- visual structures,
- axes and background,
- scales and tick marks,
- text,
- grid lines.
Visual structures (or data objects) represent values in a graph. Usually, those structures are bars, points, and lines. Before we decide which visual structure is most suitable for our data, it is good to ask ourselves:
- What message do I want to send to the audience with the graph?
- Do I have appropriate data for that message?
Example: If we want to show how our sales are performing for different products and point out the differences, then the most logical thing might be to use bars instead of lines or points. Once the message is defined, all we need is appropriate data.
Bars are useful for showing comparisons, deviations, or rankings. Whether we choose vertical or horizontal bars, they should always start on the axis, since their lengths represent their values. Usually, we fill bars with a single colour if we have only one category of data. We use different colours for bars only if we have two or more categories of data. In that case, a legend is needed. Colours can also help us create a contrast that makes bars stand out from the background. We should never use fill patterns (e.g. ones that produce a moiré effect) to distinguish different categories of bars, because they will annoy our viewers. We should also avoid borders around bars. Borders are useful only if we want a particular bar to stand out. Another important thing to consider is the distance between bars and the width of bars. We shouldn't allow bars of different categories to overlap each other. A good ratio to aim for is around 1 : 0.5 (width of bars : distance between bars).
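To make the width-to-distance ratio concrete, here is a minimal sketch in plain Python that turns it into bar positions. The function name `bar_layout` and its parameters are my own illustration, and it assumes the bars fill the whole axis with no margin at either end.

```python
def bar_layout(n_bars, axis_length, gap_ratio=0.5):
    """Return (bar_width, left_edges) for n_bars evenly spaced bars.

    gap_ratio is (distance between bars) / (width of bars); the default
    of 0.5 corresponds to the 1 : 0.5 ratio discussed above.
    Assumption: the first bar starts at 0 and the last bar ends exactly
    at axis_length, so there are n_bars widths and n_bars - 1 gaps.
    """
    if n_bars < 1:
        raise ValueError("n_bars must be at least 1")
    if axis_length <= 0 or gap_ratio < 0:
        raise ValueError("axis_length must be positive, gap_ratio non-negative")

    # axis_length = n_bars * width + (n_bars - 1) * gap_ratio * width
    width = axis_length / (n_bars + (n_bars - 1) * gap_ratio)
    gap = gap_ratio * width
    left_edges = [i * (width + gap) for i in range(n_bars)]
    return width, left_edges


width, lefts = bar_layout(4, 11.0)
print(width)  # 2.0
print(lefts)  # [0.0, 3.0, 6.0, 9.0]
```

With four bars on an axis 11 units long, each bar is 2 units wide and each gap is 1 unit, so the bars never touch or overlap.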
Lines represent continuity and work best when we want to emphasize quantitative values in relation to time (days, months, years, etc.). In the case of time series, we use the horizontal axis for time values and the vertical axis for quantities. Lines can also be used for representing deviations and frequency. We can combine lines with points (1) if we want to emphasize values on a line or (2) if we want to distinguish different lines in a black-and-white environment (using a different point shape for every line). If we have several lines in a graph, we should avoid hatched lines altogether and use differently shaped points or colours instead. Hatched lines create too much confusion and are “chartjunk”, as Edward Tufte would say.
With points, we highlight particular values. Points must be plotted clearly and visibly, especially where points might overlap each other (similar values). Points can be of different shapes and forms, but usually we use circles, squares, crosses, or triangles. Points can be filled with colour or drawn as contours only. I believe contours are the better choice, since they allow us to distinguish overlapping points, and people can still easily perceive where the centre of a circle is. Crosses can be effective as well.
Axes define a graph, and there is no (effective) graph without axes. The first thing we should consider is the number of axes. That number depends on how many attributes we have or want to include. Usually, we use one vertical and one horizontal axis. If we have a matrix of graphs (which is usually a better option for visualizing more variables than a “3D-like” axonometric projection), it is wise to add two additional axes to every graph so that the individual graphs are easier to tell apart. Another important thing about axes is the aspect ratio, the ratio between the lengths of the horizontal and vertical axes. That ratio should be close to the golden ratio (about 1.618 : 1), since humans have a wide-angle field of sight. A graph shouldn't look like a square, because it can mislead viewers (e.g. excessively oscillating lines can carry manipulative messages).
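As a small illustration of the aspect ratio, the sketch below derives the height of the data region from a chosen width using the golden ratio. The helper name `golden_size` is hypothetical, and the units are whatever the plotting tool expects (inches, pixels, and so on).

```python
import math

# The golden ratio, (1 + sqrt(5)) / 2, roughly 1.618.
GOLDEN_RATIO = (1 + math.sqrt(5)) / 2


def golden_size(width):
    """Return (width, height) so that width : height is the golden ratio."""
    if width <= 0:
        raise ValueError("width must be positive")
    return width, width / GOLDEN_RATIO


w, h = golden_size(8.0)
print(round(h, 3))      # 4.944
print(round(w / h, 3))  # 1.618
```

The resulting pair can be handed to a plotting tool as the figure size, for example as the `figsize` argument in matplotlib, if that is the library in use.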
For the background, it is suggested to use a light, warm colour or white. In that case, visual structures must not be too bright. Decorated backgrounds, or colours that prevent data from standing out, should be avoided. If colour is necessary for some reason, then it is suggested to use it for one thing only, either for the structures or for the background, but never for both.
Scales measure values in a graph and are incorporated into axes. With the help of tick marks, scales divide axes into equal intervals and enable the viewer to read encoded values. Axes with scales offer only visual support and nothing else; they carry no additional information. All scales should begin at zero. Distances between tick marks should always be exactly the same. We shouldn't use too many tick marks, because that saturates the graph with unnecessary non-data ink. It is also better to put them outside the data area so that the data in the graph can “breathe”. Try to avoid “strange” intervals that can confuse people (e.g. 0, 40, 90 instead of 0, 50, 100…) and don’t forget to put labels on scales. Finally, use tick marks only on the quantitative scale; on a categorical axis they are needless.
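The rules for scales (start at zero, equal distances, few tick marks, "round" steps) can be sketched as a small tick generator. This is only one possible approach, not a standard algorithm from any particular library. The function name `nice_ticks`, the `max_ticks` parameter, and the choice of steps from the 1, 2, 5 family are my own assumptions.

```python
import math


def nice_ticks(max_value, max_ticks=6):
    """Return equally spaced tick values from 0 up to at least max_value.

    Assumptions: the data is non-negative, and the step is picked from
    1, 2, 5 (or 10) times a power of ten so that labels look "round".
    """
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    if max_ticks < 2:
        raise ValueError("max_ticks must be at least 2")

    # Smallest step that would still fit into max_ticks tick marks.
    raw_step = max_value / (max_ticks - 1)
    magnitude = 10 ** math.floor(math.log10(raw_step))

    # Round the step up to the next "round" value.
    for multiplier in (1, 2, 5, 10):
        step = multiplier * magnitude
        if step >= raw_step:
            break

    n_steps = math.ceil(max_value / step)
    return [i * step for i in range(n_steps + 1)]


print(nice_ticks(100, max_ticks=3))  # [0, 50, 100]
print(nice_ticks(87))                # [0, 20, 40, 60, 80, 100]
```

Because every tick is a multiple of one step, the distances are always equal, the scale always starts at zero, and a sequence like 0, 40, 90 cannot occur.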
Text is useful in directing the viewer and supporting their understanding. Text in the graph is additional information. There are two types of text in graphs: (1) data-region text and (2) surrounding text. Data-region text appears in the area where data structures are plotted. Usually, such text consists of notes that come with the interactivity of the graph (depending on the software). A good example can be found in Google Analytics, where small clouds (tooltips) appear when the user moves the mouse pointer over the line in the graph. Surrounding text consists of the title, legends, and other notes. As Stephen Few said, it’s good to put text close to the data region so the viewer's eyes don't need to travel long distances. A title is necessary because it tells us what the graph is about. Usually, we place titles above the data region. Legends should be used only when we have two or more categories of data; in other cases they are meaningless. They can be placed on the right or between the title and the data region. We use notes only if they are necessary (e.g. a particular date that the viewer should be aware of).
There are no strict rules about typography. Tufte suggests a serif font, but I think a sans-serif font looks much cleaner, especially now that we have experienced flat design and fallen in love with light fonts.
Grid lines help us to read, compare, and detect values and patterns more accurately. Otherwise, they carry no additional information. Grid lines should be discreet and pushed to the back so they don’t attract any attention. For that reason, it is better to use light colours (light grey). Lines should always be continuous and never hatched. There’s nothing more annoying in a line graph than searching for the continuity of a line that is broken by several smaller grid lines. Horizontal grid lines are useful in vertical bar graphs and line graphs, but not in horizontal bar graphs. Vertical grid lines work best in horizontal bar graphs. Scatter plots should have both vertical and horizontal grid lines.
Example #1: Bar graph
Example #2: Line graph
Example #3: Scatter plot
Concluding thought: when designing a graph, focus on the message you want to share and clear away everything unnecessary that could spoil the message. Structures should stand out so that viewers remember them. If everything stands out, nothing stands out, and people won’t remember anything.
To read more about graphs, I would recommend Edward Tufte’s The Visual Display of Quantitative Information (2001) and Stephen Few’s Show Me the Numbers: Designing Tables and Graphs to Enlighten (2004).
You are welcome to leave any comments, suggestions, or notes. You can also find me on Twitter @dejanulcej. Thank you for reading and sharing!