Archive for category Visualization

Bar charts vs. line charts

I can’t attribute every point below to a specific source, but here are all of the guidelines I can remember off the top of my head about bar charts vs. line charts, mostly learned from Edward Tufte and Stephen Few. It’s a bit of an art, and how you represent your data depends on what exactly you intend to find in it, so it’s difficult to write hard-and-fast rules that dictate what to do. If you want to learn more, check out their books on our Reading page.

Line

  • When to use them: Line charts should be used only for time series (chronological data) or when there is some other inherent sequence to the dimensions on the x-axis, e.g. dates, months, the stages of a project, or meters along a gas pipeline. They should be used to detect trends and patterns, not to give people exact quantitative readings.
  • Scale: As line charts are not really intended to give people exact numbers, forcing the scale to start at zero is not necessary and can make it considerably more difficult to detect said trends and patterns.

Bar

  • When to use them: Bar charts should be used for comparing specific x-axis values, though they can certainly be used for time series, like line charts. They can also be used instead of pie charts to display parts of a whole, in which case the space between the bars should be reduced.
  • Orientation: Do not use vertical or diagonal text to label the axis of a bar chart. If the x-axis has longer text descriptions, use a horizontal bar chart, so the text can be read left-to-right, horizontally (the way we normally read).
  • Scale: As the area of bars implies volume, it can be deceptive to use dynamic scaling with bar charts (see: Lie Factor). If the differences between the data points are difficult to distinguish with forced-zero scaling, use symbols/points instead of bars and use dynamic scaling.
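
To put a number on the Scale point, here is a minimal sketch of Tufte's Lie Factor (size of effect shown in the graphic divided by size of effect in the data) for two bars. The function name, the `axis_min` parameter, and the sales figures are my own illustrative assumptions, not anything from Tufte's or Few's books.

```python
def lie_factor(values, axis_min=0.0):
    """Lie Factor for two bars drawn on an axis that starts at axis_min.

    Lie Factor = (size of effect shown in graphic) / (size of effect in data),
    where each effect is the relative change from the first value to the second.
    """
    first, second = values
    # Relative change in the underlying data.
    data_effect = (second - first) / first
    # Bar lengths as drawn: each bar starts at axis_min, not necessarily zero.
    drawn_first = first - axis_min
    drawn_second = second - axis_min
    graphic_effect = (drawn_second - drawn_first) / drawn_first
    return graphic_effect / data_effect


sales = (100.0, 110.0)  # hypothetical figures: a 10% increase

print(lie_factor(sales, axis_min=0.0))   # zero-based axis: about 1 (honest)
print(lie_factor(sales, axis_min=90.0))  # axis truncated at 90: about 10
```

With the axis starting at 90, the second bar is drawn twice as long as the first even though the data only grew by 10%, which is why dynamic scaling is better left to points and lines.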

Applies to both

  • Dimension order: There should be some logical order to the dimensions on the x-axis. In the case of a line chart, it should follow the chronological, process, or stage order that caused you to select a line chart in the first place. In the case of bar charts, the order should have some rhyme and reason to it: sorted by y-axis value, alphabetical, etc., depending on the content of the chart and what its intended use is, e.g. ranking, distribution.
  • Scale labels: If the numbers are already being displayed on the data points, it is redundant to label the axis with numbers, too.
  • Axis labels: If you can incorporate the metric names and dimension names into the chart title or legend, do not waste space on axis labels.

Fun visualization site: Information is Beautiful

Information is Beautiful

Ideas, issues, knowledge, data – visualized! See what you think

I think I would get along well with the creator of the site, David McCandless, who, on his About page, says, “My pet-hate is pie charts. Love pie. Hate pie-charts.”  It’s not surprising that he has written for Wired before - his visualizations look to be very much in the Wired style of infographic.  Here is one I saw featured on Lifehacker that could prove useful:

The Buzz vs The Bulge: Caffeine and calories

According to McCandless’ scatter plot, the most bang for your buck in terms of high caffeine and low calories is iced coffee (lower-right of chart), while the worst is a hot chocolate with whipped cream (upper left of chart).  I’ll also point out that iced coffee, like regular coffee, is cheaper than espresso-based drinks, and most people are actually capable of making it at home, so it seems like an easy choice.

caffeine


The book on trellis charts, AKA small multiples

small_multiples
ManyEyes

From pg. 67 of Edward Tufte’s Envisioning Information:

At the heart of quantitative reasoning is a single question: Compared to what? Small multiple designs, multivariate and data bountiful, answer directly by visually enforcing comparisons of changes, of the differences among objects, of the scope of alternatives.  For a wide range of problems in data presentation, small multiples are the best design solution.

What are small multiples?  Essentially, a small multiple is a series of displays with the same design structure repeated for all the images, arranged in a grid.   That means each graph in the series should be the same size and shape, with the same scale, differing only in the data they display.
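
The "same scale" requirement is the part that is easiest to get wrong, so here is a rough text-only sketch of it. The region names, the numbers, and the helper functions `shared_max` and `render_panel` are all hypothetical; a real charting tool would do the drawing, but the idea of computing one scale from all panels before drawing any of them carries over.

```python
def shared_max(panels):
    """Return the largest value across every panel, used as the common scale."""
    return max(value for series in panels.values() for value in series)


def render_panel(title, series, scale_max, width=20):
    """Draw one panel as zero-based text bars, scaled against the shared maximum."""
    lines = [title]
    for value in series:
        lines.append("#" * round(value * width / scale_max))
    return "\n".join(lines)


# Hypothetical data: one series per region.
panels = {"East": [10, 20, 40], "West": [4, 10, 16]}

top = shared_max(panels)
for name, series in panels.items():
    print(render_panel(name, series, top))
    print()
```

If each panel were scaled to its own maximum instead, West's largest bar would look as long as East's, and the comparison between panels would be lost.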

What are the advantages of using small multiples? On page 29 of Envisioning Information, Tufte says, “An economy of perception results; once viewers decode and comprehend the design for one slice of data, they have familiar access to data in the other slices. As our eye moves from one image to the next, this constancy of design allows viewers to focus on changes in information rather than changes in graphical composition. A steady canvas makes for a clearer picture.” (If you want to learn more about small multiples, all of chapter four is dedicated to them.)

Unfortunately, most of Tufte’s examples in Envisioning Information, e.g. the proper formation of capital letters, light signals for a train, or Saturn’s orbit, while instructive, are a bit of a stretch to apply to common BI situations.  Enter Stephen Few, who always manages to apply Tuftean principles in a way that you can use them at work.  From page 159 of Information Dashboard Design:

Concerning their efficiency, a small multiple offers another advantage over a series of individual graphs: the title, legend, and other metadata need to be printed only once to represent the series.

Here, Few uses small multiples to introduce another dimension to the standard grouped bar chart:

plain

trellis

Few also offers small multiples up as a method of resolving overplotting in graphs (PDF) (this paper is an excerpt from his excellent new book):

all-regions

by-region

The last example from Stephen Few I’ll mention is that small multiples can be used as “visual crosstabs”.  (Of course, it is helpful to have the supporting information available, too, if possible.)

crosstab

visual-crosstab
Improve Your Vision (PDF)

A white paper worth reading that covers some of the principles involved when employing small multiples is Three Blind Men and an Elephant: The Power of Faceted Analytical Displays (PDF).

Looking at other examples, the written-in-stone-reliable (cough) Wikipedia’s sample image in the entry for small multiple is not a small multiple, due to the different metrics and scale of vertical axis in each chart.

smallmult

Another small multiple fail is way back in one of the first posts on this blog, The Trilogy Meter.  The problem there is that the graphs are arbitrarily arranged, while they should be in order of magnitude from greatest overall to worst overall trilogy.

My image sample from the QlikView 9.0 Beta - QlikView 9 being the first release to support trellis charts -  may not have been a great example of when to use small multiples.  They should not be frivolously used in place of every multi-series line graph, containing the same information in merely five times the space.  That said, it may be of value if you see small multiples as an alternative to using a list box to toggle through slices of  a dimension, due to the way the human memory works, as discussed in the “single context” section of my post about facilitating comparison.


Pleasant new visualization defaults in QlikView 9.0

Word is that QlikView consulted with Stephen Few for improvements to data visualization in QlikView 9.0.  He was less than kind to BI software companies, including QlikTech, during his keynote at Qonnections 2008.  Do you think he had anything to do with the new default colors and settings?  They’re refreshingly simple and far less distracting than QlikView 8.5 and earlier’s delicious-looking captions and old-fashioned stick candy bar graphs, treading farther into Tableau territory.

9.0:
90

8.5:
85

Coincidentally, Few on small multiples (AKA trellis charts), from pages 159-160 of Information Dashboard Design:

An intelligent organizer for small multiples built into the software would allow you to reference the data, indicate which variable goes on which axis of the graph, which should be encoded as lines of separate colors, which should be arranged per graph, and finally whether you want the graphs to be arranged vertically, horizontally, or in a matrix; the organizer would then handle the rest for you.  As of this writing, I have yet to see dashboard software that makes this easy to do.  I reserve the hope, however, that this will soon change.

That pretty much nails how they work in 9.0.

stick-candy
Mmm…


Design consistency

red-means-stop
Adobe

Welcome back, as I continue my slow, alphabetical wade through Universal Principles of Design.

According to the principle of consistency, systems are more usable and learnable when similar parts are expressed in similar ways.  Consistency enables people to efficiently transfer knowledge to new contexts, learn new things quickly, and focus attention on the relevant aspects of a task. (page 46)

That’s right: creating report templates is not just busywork, because inconsistency is actually distracting and counterproductive.  When you see a new report, you shouldn’t have to search for the run date, and when you flip the pages of a dashboard or analytical interface, you shouldn’t need to regain your bearings and relearn the layout.  Consistency allows end users to jump right into consuming the data, and this principle applies equally to the likes of reports, dashboards, analytical applications, and even slide decks and your company’s web site.

So pick an approach, given your audience and medium, and stick to it.

  • Replicate your layout on all pages of a report or application, with components aligned and intuitively arranged. Selectors (dropdown boxes, radio buttons) for the same dimensions should be in the same place, the data date should be in the same corner, etc.

By the way, here’s a QlikView enhancement idea to make layout consistency a natural part of building applications: create an option on the Layout tab of an object’s properties that allows you to choose from a series of checkboxes on which sheets an object should appear, rather than just copying it once per tab.  I imagine it would even save 1) memory, because there would be fewer overall components and 2) calc time, because it wouldn’t need to update copies of those components on new tabs when you open them.

  • Have not just consistency of colors, but consistency of meaning of colors. For instance, darker colors might indicate recency or the degree to which a data point is an outlier.

alerters

That, of course, precludes you from making psychedelic bar charts, where every slice is arbitrarily assigned a color for the sake of variety.  (This example commits more offenses than just that.)
junk-bar-chart
Junk Charts

  • Pick standard fonts, colors, and sizes. Choose based on readability and the ability to create emphasis using contrast. For instance, everything data-related might be black, while everything else, like help text or metadata, might be a lighter color.
  • For charts, pick a standard alignment for titles and a standard place to display the units and granularity. Making a habit of this not only improves consistency but also provides end users with the context necessary for understanding the data correctly.

monochrome-bar-chart

  • Apply a standard alignment for tabular data, e.g. left-justified text and right-justified numbers with the same number of decimal places, to facilitate visual comparison.

table

  • Sort orders should be consistent, particularly if several charts or tables are using the same dimensions.
  • Each page should have the same footprint. In a dashboard, the components on all tabs should fit into a space with the same dimensions, e.g. 1024×768, and reports shouldn’t have objects running off of the page.
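
As a small illustration of the tabular alignment guideline above (left-justified text, right-justified numbers with the same number of decimal places), here is a sketch using Python's built-in string formatting. The `format_table` helper and the sample rows are made up for the example.

```python
def format_table(rows, decimals=2):
    """Left-justify labels and right-justify numbers with a fixed number of decimals."""
    labels = [label for label, _ in rows]
    numbers = [f"{value:,.{decimals}f}" for _, value in rows]
    label_width = max(len(text) for text in labels)
    number_width = max(len(text) for text in numbers)
    return "\n".join(
        f"{label:<{label_width}}  {number:>{number_width}}"
        for label, number in zip(labels, numbers)
    )


# Hypothetical rows: (dimension label, measure value).
rows = [("Widgets", 1234.5), ("Gadgets", 87.25), ("Misc", 3.0)]
print(format_table(rows))
```

Because every number has the same number of decimal places and shares a right edge, the decimal points line up and magnitudes can be compared at a glance.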

Some of those sound like common sense, but it’s shocking how often they are violated.  Once you have finished your thoughtful planning and are happy with your work, document your choices as the standards upon which future work will be based.


The book on sparklines

sparkline

Sparklines, a term coined by Edward Tufte, are becoming increasingly popular in Business Intelligence software. Some applications, like Excel (through various add-ins) and QlikView (starting in version 9.0), can create them directly, while elsewhere, as in Xcelsius, they can be created with a bit of creativity.

You’ve likely seen them before, but do you know when it is appropriate to use them?  They’re not to be thrown around just because all of the cool data visualization kids are using them.

The background, from Wikipedia:

The term ‘Sparkline’ was proposed by Edward Tufte for “small, high resolution graphics embedded in a context of words, numbers, images.” Tufte describes sparklines as “data-intense, design-simple, word-sized graphics”. Whereas the typical chart is designed to show as much data as possible, and is set off from the flow of text, sparklines are intended to be succinct, memorable, and located where they are discussed.

The clearest and most instructive examples, not surprisingly, can be found in one of Tufte’s books, Beautiful Evidence.

tufte-sparkline

Pictured components

  • Line representing the last n data points
  • Data point for most recent reading highlighted in red
  • Value of most recent reading in corresponding red type
  • Name of metric
  • Acceptable/normal range as gray, shaded area
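
To make those components concrete, here is a rough text approximation, assuming Unicode block characters stand in for the line and a plain "in range" note stands in for the gray band. The function names, the metric name, and the readings are hypothetical; this is not how Tufte or any BI tool draws them.

```python
BLOCKS = "▁▂▃▄▅▆▇█"  # eight heights, lowest to highest


def sparkline(values):
    """Map each reading to one of eight block heights, scaled to the data's own range."""
    low, high = min(values), max(values)
    span = (high - low) or 1  # avoid dividing by zero when all readings are equal
    top_index = len(BLOCKS) - 1
    return "".join(BLOCKS[int((v - low) * top_index // span)] for v in values)


def labeled_sparkline(name, values, normal_range):
    """Combine metric name, trend, most recent reading, and a normal-range check."""
    last = values[-1]
    range_low, range_high = normal_range
    status = "in range" if range_low <= last <= range_high else "out of range"
    return f"{name} {sparkline(values)} {last} ({status})"


# Hypothetical readings for the last n periods.
print(labeled_sparkline("margin %", [1, 8, 4], (3, 6)))
```

Note that the scale here is dynamic (it runs from the lowest reading to the highest), which suits a graphic meant to show a pattern rather than exact values.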

Another example of his incorporates lows and highs over the period represented:

high-low

(Note that, while the horizontal axis is not labeled, the 12 months header indicates the time period being displayed.)

There isn’t a single pixel wasted on meaningless or redundant data, embodying Tufte’s data-ink ratio.  Another way in which he is practicing what he preaches is that all of the data related to each metric is in close proximity, not requiring repeated references to scattered information.  Of course, those are Tufte’s specs, and different BI companies and the people who have created custom sparkline components may choose to implement them differently.

If you’re looking for guidance on the best way to apply them in your applications, I like how Stephen Few succinctly puts it: “Think of them as an enhanced, much more informative substitute for the trend arrows that often appear on dashboards.”

For only marginally more space than a trend marker, sparklines provide significantly more information and paint a more complete picture than simple up/down or green/red indicators.  The lack of context surrounding trend indicators leaves open the possibility that a positive indicator represents a minuscule uptick at the end of a significant and long-term drop.  In other words, when you look at your dashboard for the day and see a green, up arrow for margin %, that means margin % has improved in the most recent period, while it could still be down for the week, month, quarter, or year (Few explains something similar on page 140 of Information Dashboard Design).
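
A quick sketch of that trap, with made-up margin figures and hypothetical helper names: the arrow only compares the last two readings, while the change over the whole period tells a different story.

```python
def trend_arrow(values):
    """What a trend marker shows: the direction of the most recent change only."""
    if values[-1] > values[-2]:
        return "up"
    if values[-1] < values[-2]:
        return "down"
    return "flat"


def period_change(values):
    """Relative change from the first reading in the period to the last."""
    return (values[-1] - values[0]) / values[0]


margin_pct = [30, 28, 25, 22, 20, 21]  # hypothetical margin % readings

print(trend_arrow(margin_pct))             # up
print(f"{period_change(margin_pct):.0%}")  # -30%
```

The indicator is green, yet the metric has lost nearly a third of its value over the period, which is exactly the context a sparkline would make visible.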

While the line obviously represents some period of time, the horizontal, dimensional axis is not labeled.  In fact, neither axis is.  The reason is that sparklines are meant to show trends and comparisons, not detailed values, like standard line graphs.  This helps explain why they are not a substitute for the standard line graph, which can more easily compare multiple dimensions or multiple measures with greater precision.

And don’t forget that the line chart is but one type of sparkline.  This image from Juice Analytics shows a catalog of examples from one Excel add-in (some of which are at least mildly objectionable, in my opinion):

sparklinegallery

Finally, see this thread on Edward Tufte’s message board for the single longest conversation about sparklines since the dawn of time.


The quotable Stephen Few

The nice thing about Stephen Few is that, as he is not beholden to any software companies, he can be blunt in his appraisals of the programs we know and love (and hate).  Here are a few gems:

“Just for fun, I decided to go all out and take advantage of the one other visual design option that Graphwise offers: the ability to put an image in the background of the graph, which they call a watermark. From the many pictures of animals, buildings, furniture, etc., I decided to dress up the arctic cool version of my graph by appropriately pairing it with a penguin.  I particularly like how I was able to make the penguin’s beak reach for the high value of 100,000. This might look cool (arctic cool, even), but it is an example of dysfunctionality at its worst.”  [In what respect is this venture wise?]

graphwise-figure-_9

“Try to decipher the patterns and values in the following chart. Come on, give it your best shot. Even if I offered a cash prize to anyone who managed to come close, it wouldn’t be worth your effort to try, because you’d be forced to use the prize money to pay a doctor to fix the damage done to your eyes.” [Dysfunction at its finest]

step-chart

“…Here’s a radar chart that you could use to compare the performance of three products across eight years of time. Did you know that time is circular and that in the year 2007 we have returned to where we began in 1999? Despite this revelation, I’m finding it hard to relinquish my notion that time is linear and my desire to see this information in a simple line graph.” [Dysfunction at its finest]

radar-chart

“A vendor that claims to be the best, which this one unabashedly claims (just like every other major BI vendor), should be ashamed of selling such moronic products. Don’t reward them for irresponsible work—products that assume their customers are halfwits—by wasting your money on them.” [Fast track to nowhere]

“…Don’t insult the intelligence of the business intelligence community by gluing a carrot on the head of a goat and calling it a unicorn. That only works at carnivals for children and drunks.” [Newsflash: BI discovers the obvious]

This is not a knock on him; I’m a little jealous.  I think his books are fantastic (and have the new one on preorder), thoroughly enjoyed his keynote at the QlikView partner conference last year, and have no doubt of his objectivity.  He’s doing his job.  Mine is to communicate data effectively…even with some of the tools he is referencing in those quotes.  It is possible, even if it cannot be found in the vendors’ sales material or default visualization settings.


A defense of limited 3-D glossiness

I am no fan of excessively shiny, glassy, or delicious-looking dashboard components or graphs, but that effect can serve a purpose, as explained by the design principle of perceived affordance.

Affordance is the more general concept: “a quality of an object, or an environment, that allows an individual to perform an action”. Don Norman, author of The Design of Everyday Things, elaborates on perceived affordance:

…Because I can click anytime I want, it is wrong to argue whether a graphical object on the screen “affords clicking.” It does. The real question is about the perceived affordance: Does the user perceive that clicking on that location is a meaningful, useful action to perform?

More from Universal Principles of Design, pg. 20:

Images of common physical objects and environments can enhance the usability of a design.  For instance, a drawing of a three-dimensional button on a computer screen leverages our knowledge of the physical characteristics of buttons and, therefore, appears to afford pressing.  The popular “desktop” metaphor used by computer operating systems is based on this idea - images of common items like trash cans and folders leverage our knowledge of how those items function in the real world and, thus, suggest their function in the software environment.

label-or-button

(They’re both buttons.)

In other words, is it obvious from the design of your application what can be clicked and what can’t?  Giving your entire application the same flatness when certain components are intended to be clicked can cause areas of importance to be overlooked by the user.  Similarly, it is misleading to make pie and bar charts look like lollipops and old-fashioned stick candy when they are not interactive.

I don’t know how much weight I would give this - because it’s subjective - but another design principle that could be relevant is the aesthetic-usability effect.

Advances in our understanding of emotion and affect have implications for the science of design. Affect changes the operating parameters of cognition: positive affect enhances creative, breadth-first thinking whereas negative affect focuses cognition, enhancing depth-first processing and minimizing distractions. Therefore, it is essential that products designed for use under stress follow good human-centered design, for stress makes people less able to cope with difficulties and less flexible in their approach to problem solving. Positive affect makes people more tolerant of minor difficulties and more flexible and creative in finding solutions. Products designed for more relaxed, pleasant occasions can enhance their usability through pleasant, aesthetic design. Aesthetics matter: attractive things work better.

Frankly, I don’t find these kinds of things attractive.

Warning: opens SWF in new window


How to reduce “glitz” in Xcelsius

xcelsius-replica

I thought this thread was too valuable to remain buried on the Perceptual Edge message board.  “Candy-like” is the adjective I often use to describe shiny, round, three-dimensional-looking dashboard components, and Xcelsius is incredibly candy-like if you don’t make the effort to create a clean, professional dashboard.  Even the samples on the SAP website are enough to offend anyone’s Tuftean sensibilities, also frequently lacking the appropriate data context required for quick analysis.

Here are some takeaways from the thread:

  • Apply a skin like Halo or Windows Classic instead of the default, Aqua (you can permanently change your default theme under File, Preferences, Document)
  • Remove gridlines when they’re not necessary, e.g. when the purpose of a chart is to provide relative comparisons, not quantitative precision
  • Show limits on circular, horizontal, and vertical gauges; show targets when feasible, e.g. when target locations will be static, not moving with scaling axes
  • You can create sparklines with tiny line graphs with all labels, axes, etc. disabled
  • Create multiple bullet charts with a stacked bar chart (targets) overlaid with a bar chart (actual), provided that they share a scale, which works best with percentages

bulletexample

And a few things I would add that were not mentioned in the thread:

  • Left-align titles and subtitles
  • Subtitles, help text, and axes should be a lighter shade than the axis labels, dimension labels, and data itself
  • There is a free add-on for Xcelsius 2008 that allows you to create basic bullet charts and sparklines without employing workarounds (scroll to bottom of linked page)

While on his site, I also noticed that Stephen Few has a new book coming out June 1.  Preorder!


Serif versus sans-serif fonts

periodic_font_table
Periodic Table of Typefaces

To begin, according to Wikipedia:

In typography, serifs are semi-structural details on the ends of some of the strokes that make up letters and symbols.  A typeface that has serifs is called a serif typeface (or seriffed typeface).  A typeface without serifs is called sans-serif, from the French sans, meaning “without”.  Some typography sources refer to sans-serif typefaces as “grotesque” (in German “grotesk”) or “Gothic”, and serif types as “Roman.”

fonts

Tufte, on page 183 of The Visual Display of Quantitative Information, quotes Josef Albers’ Interaction of Color:

The concept that “the simpler form of a letter the simpler its reading” was an obsession of beginning constructivism.  It became something like a dogma, and is still followed by “modernistic” typographers….Ophthalmology has disclosed that the more the letters are differentiated from each other, the easier is the reading.  Without going into comparisons and details, it should be realized that words consisting of only capital letters present the most difficult reading - because of their equal height, equal volume, and, with most, their equal width.  When comparing serif letters with sans-serif, the latter provide an uneasy reading.  The fashionable preference for sans-serif in text shows neither historical nor practical competence.

Ryan Newman, of the Interactive Visualization blog, says:

In choosing typefaces for dashboards, you will always want to use San-Serif fonts, that is fonts without the serif accents.  Arial and Verdana are san-serif fonts, and enable an end user to read text on the computer screen much easier than serif fonts (example: times roman).  Serif fonts are best applied in large bodies of printed text for readability.  There is no value in using multiple fonts in a dashboard, so pick 1 san-serif font that works well for you.

typeface

If Information Dashboard Design is any indication of Stephen Few’s opinion, he agrees with Ryan Newman.  The text of the book appears to be in a serif font (except the headers and chapter titles), while every chart and dashboard example he created features a sans-serif font.  Tufte, by the way, seems to publish the text of his books in serif, while his visualizations can be either type of font.

example9solution
Stephen Few

Garr Reynolds, a design and presentation expert, on the other hand, published Presentation Zen: Simple Ideas on Presentation Design and Delivery entirely in sans-serif.  In fairness, sans-serif fonts are simpler, which is in the title of the book.

presentation-zen-book

The U.S. State Department has banned Courier New 12 in favor of Times New Roman 14 (both serif), except in the cases of telegrams, treaties, and documents drawn up for the President’s signature, because “[Times New Roman 14] takes up almost exactly the same area on the page as Courier New 12, while offering a crisper, cleaner, more modern look”.

Personally, I tend towards serif - especially Times New Roman - for most text and sans-serif - especially Arial - for visualizations in my own work, but I do like to pepper it with Comic Sans or Wingdings, just to jazz things up (kidding).  There are no definite conclusions to be drawn, really.  Draw your own based on trying to read what you created whilst squinting.
