
Friday, January 30, 2015

Viz Makeover | A Better Column Chart

This "Sports Chart of the Day" from Business Insider showed up in my data viz blogroll the other day...


My first thought is, "Fantastic! A column chart. Let's see what the story is."  My second thought is, "Maybe if I stand on my head, I'd be able to read this better."  Or maybe I can rotate my monitor 90 degrees counterclockwise to get something like this...


I guess that's a little bit better.  Now I can actually read the labels.  But I'd still rather see it in descending order and the bars should be extending to the right of the baseline.  Even if I had the ability to rotate the graphic on an imaginary vertical axis, it's still going to result in something to the left or right of upside down or backwards.  In other words, it needs a makeover.

This one is pretty simple.  After all, it's still a column chart and that's a good starting point to show a simple ranking of categorical data.  But let's first look at the data to determine what's important for the story.

According to my strict interpretation of data visualization best practices, here's where I land...


The original graphic has too much "fluff" and it fails to showcase the data.  In this case, it's the descending rank of the columns that is the most important story.  The questions I'm asking are, "Which teams have the highest rating and which teams have the lowest?"  Along with, "What is the average rating? And which teams are above and below the average?"  Ultimately, it's the length of the bars that holds these answers.  That's what we need to focus on.

Okay, maybe I'm a bit aggressive in stating that the data labels and the title are junk.  In fact, they are necessary elements of the story.  But they're not important enough to occupy 70% of the visualization.  They're too bold, too large, and unnecessarily precise.

Enough about what's wrong with the original, let's make it more effective.  Here's my makeover followed by an explanation of each of my changes...



Title - there's no reason for your title section to take up the top quarter of your visualization.  Left justify it and use an action title to frame the context of your story.  I've also rolled the original subtitle into the title.  The font size shouldn't be so large that it draws attention away from the real data.

Columns - this is your focus - the real data, your data ink.  Display this prominently and push the non-data ink to the background.  Using color encoding from my action title, I'm focusing my audience on the two data points that I want to emphasize (Seattle and Boston).  In this case, there's really no meaningful purpose in assigning different colors to these items (as we see in the original).  I also added an average rating line to aid in storytelling.  Now we can see which teams are above and below the league average.

Data labels - I chose to forgo a visible y-axis in this case and instead I've labeled each of my columns with the rating value.  I could just as well use the y-axis and omit the labels, but with 32 columns it would be difficult to gauge the values for the columns at the far right of the visual.  While including the data labels, I'm still pushing them to the background by using a smaller font size.  And, most importantly, the labels are horizontally oriented.  Avoid vertical orientation of labels - don't make your audience stand sideways to read your data.

x-axis - in the original chart, the full team names are used.  This forces the labels to be vertically aligned.  That's a violation!  It's not necessary to list the full team names here.  Keeping in mind that your audience is likely a sports fan, and even more likely an NFL fan, they're going to recognize abbreviated team names or even team helmets.  Use what your audience already knows to your advantage.  I chose a combination of helmet icons and abbreviated team names along the x-axis instead.  This allows me to horizontally align the labels and provides enough context for a reader who may not recognize the team helmet logos.  In many cases, I'd omit the logos but since this is a blog post/journal article (as opposed to business intelligence), I think the logos add a bit to the visual appeal without adding too much "chart junk".

Brand logo - I get that this is the "Sports Chart of the Day" and it's published by Business Insider.  How can I miss it?  It's the first thing I see in the title and it's in all caps in the lower left corner.  These are non-essential elements.  Mute them and push them to the background.

Source - doesn't need to be in all caps, doesn't need to be bold font, and doesn't need to be sized larger than your data labels.  Again, mute it and leave it in the lower right corner.

Notice the result we get by making these five or six small changes.  We're simply increasing the volume of the data and muting the volume of the background noise.
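For anyone who builds charts in code rather than in a drag-and-drop tool, the Columns step above (descending order, two highlighted teams, an average line) might be prepared along these lines.  This is only a rough Python sketch under my own assumptions, not how the makeover was actually built: the function name `prepare_columns`, the two color values, and the sample ratings are all made up for illustration.

```python
from statistics import mean

HIGHLIGHT_COLOR = "#1f77b4"  # assumed accent color for the emphasized teams
MUTED_COLOR = "#c7c7c7"      # assumed gray for every other column


def prepare_columns(ratings, highlight):
    """Sort teams by rating (descending) and attach a color to each column.

    ratings   -- dict mapping a team label to its rating (hypothetical values)
    highlight -- set of team labels to emphasize
    Returns (columns, average): columns is a list of (team, rating, color)
    tuples in descending rating order, average is the mean of all ratings.
    """
    ordered = sorted(ratings.items(), key=lambda item: item[1], reverse=True)
    columns = [
        (team, rating, HIGHLIGHT_COLOR if team in highlight else MUTED_COLOR)
        for team, rating in ordered
    ]
    return columns, mean(ratings.values())


# Placeholder numbers and labels, not the ratings from the original chart.
sample = {"AAA": 30.0, "SEA": 50.0, "BBB": 15.0, "BOS": 45.0}
columns, average = prepare_columns(sample, highlight={"SEA", "BOS"})

# Teams that would sit above the average reference line.
above_average = [team for team, rating, _ in columns if rating > average]
```

The point of the sketch is that only the highlighted teams get a color; everything else is muted so the ranking and the average line carry the story.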

In closing, bar charts are fantastic storytelling tools.  But like all forms of storytelling, there's an art and science required to really nail it.  It's not a complicated art form; it only takes an ounce of effort and skill.  Next time you build a bar chart, use the suggestions above and you'll see it's quite easy.  And you'll get the Bar Chart Guy stamp of approval.

Thursday, January 29, 2015

Divergent Stacked Bars & Survey Data

Last week, a colleague asked me to produce a visualization based on some recent survey data he had collected.  Prior to this project, I had only briefly worked with survey data for a small-scale project at work.  So I dug around in my Evernote archive and found some old articles that I had saved on how to effectively display survey data.

Here's the result I came up with...



As usual, I landed on a bar chart as the basis for my display.  Although I'd refer to this as a "divergent stacked bar chart" rather than a typical bar chart.

I had to do a bit of manual reshaping of the data in Excel to get it to play nicely with Tableau.  In particular, I assigned a positive/negative string to each of the responses.  I then created a calculated field in Tableau that allowed me to plot the negative responses as negative values, thus orienting them to the left of the baseline.  Plotting the "negative" responses to the left of zero and the "positive" responses to the right of zero is what makes this a divergent stacked bar chart.
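The sign-flipping idea isn't specific to Tableau.  Here's a minimal Python sketch of the same logic, assuming each row already carries a positive/negative tag like the one I added in Excel.  The function name `signed_value`, the segment and response labels, and the counts are hypothetical and are not taken from the actual survey.

```python
NEGATIVE = "negative"
POSITIVE = "positive"


def signed_value(count, sentiment):
    """Return the count unchanged for positive responses, negated for negative ones."""
    if sentiment == NEGATIVE:
        return -count
    if sentiment == POSITIVE:
        return count
    raise ValueError(f"unknown sentiment: {sentiment!r}")


# Hypothetical rows: (segment, response, sentiment tag, number of respondents).
rows = [
    ("18-29", "Not interested", NEGATIVE, 40),
    ("18-29", "Very interested", POSITIVE, 25),
    ("65+", "Not interested", NEGATIVE, 10),
    ("65+", "Very interested", POSITIVE, 60),
]

# Negative responses end up left of zero, positive responses right of zero.
plotted = [
    (segment, response, signed_value(count, sentiment))
    for segment, response, sentiment, count in rows
]
```

Stacking the signed values per segment is then what puts the two sides on either side of the baseline.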

I prefer the divergent stacked bar visual for this type of survey data because it allows the audience to easily compare the positive and negative responses.  It also clearly presents some relationships and insights.

Take, for example, the relationship between age and interest in politics.  The divergent stacked bar approach very clearly shows that interest in politics increases with age, and rises even more among respondents over the age of 65.

A typical bar chart (or column chart) as seen below doesn't communicate that message quite as effectively...


The divergent stacked bar approach also has the advantage of offering a large canvas for all of the demographic segments that are usually associated with survey data.  In this case I'm able to very succinctly communicate all of the responses from each demographic segment.

Once again, we see the power of the bar chart, or more appropriately, the effectiveness of assigning a value to the length of a bar.  It's not necessary to over-complicate a visualization that's tied to a somewhat complex data set (like survey data can often be).  Keep it simple, and make the bars work for you!

An interactive version of my survey sample can be seen here.  And a wealth of information on divergent stacked bar charts and communicating Likert survey results can be found here on Steve Wexler's fantastic Data Revelations blog.

Tuesday, January 20, 2015

Viz Makeover | Beaver County School Financials

The front page story in my local newspaper this past weekend was focused on the ongoing teachers' contract dispute in one of our local school districts.  The story itself is worthy of front page status (at least in our small community).  It was well written, well documented and thoroughly researched.  So kudos to The Times for generating a good front page story.

I was, however, less impressed with the graphic they included as a supplement to the story.  A few pages in, I discovered this....



The graphic they chose was an area chart where each school district is listed alphabetically along the x-axis and three different dollar values (2012-13 Budget, 2012-13 Local Revenue, 2012-13 State Revenue) are plotted along the y-axis.  Each dollar value is encoded with a different pattern to differentiate the values.

Many readers can probably guess where I'm headed with this critique, but let me preface the following with a bit of background.  The Times just recently started using Tableau Public to add a data visualization layer to their stories.  I noticed this development a few months ago and applauded them on Twitter at the time for taking that step.  A short time later I took a jab at them for a pie chart viz they included in the paper edition (which did look slightly better in color online, but hey it's still a pie chart).  Since then they've done a few more stories which have included some Tableau graphics.  So cheers to the Times for making an effort to use data visualization as a supplement to their journalism, but jeers to the Times for not making an effort to do their data viz the right way.  Anyway, continuing...

My beef with the Times' chart choice is that area charts should be used to show time-series relationships.  Connecting the school districts using a line suggests that there is some kind of trend over time.  In this case, there isn't.  The area chart, quite simply, is a poor choice.

Using a powerful tool like Tableau unchecked - without some understanding of data visualization best practices - results in output like we see here.  Tableau does nudge users in the right direction of choosing the correct chart type in the 'Show Me' dialog when it indicates that area charts should include a 'Date' dimension.  But that wasn't enough here.  Instead we see a graphic that was likely the result of something along these lines: "I have this great data set and I want to use it to supplement my story.  I haven't used an area chart before and it looks pretty cool, so I'll choose that one and go with it."

Now I very rarely use area charts in my own work, but I do know how and when they should be used.  And this is absolutely not a use case for an area chart.  So I took a few minutes to grab the same data that the Times used and did a simple makeover as, you guessed it, a bar chart.  With a couple slight modifications to help add some of the context of the original story.

Here's a link to my interactive redesign...


With the bar chart on the left, I'm showing the ranking of schools from those with the greatest surplus (under budget) to those with the greatest deficit (over budget).  The "bar in bar" chart on the right adds some context showing each district's 2012-13 Budget and 2012-13 Total Revenue.

I don't routinely deal with financial data, but it feels like we're making certain assumptions about this data that could be misleading.  The data that the Times used seems to indicate that school district revenue comes solely from state and local taxes - maybe that's true, maybe it's not.  But that's what was included in their chart so that's what I included in my makeover.
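To make that assumption explicit, here's a small Python sketch of the ranking logic.  It treats total revenue as local plus state revenue, and it defines the surplus or deficit as total revenue minus budget; that second definition is my reading of the chart, not something the Times spells out.  The `District` class, the `rank_by_balance` helper, the district names, and the dollar figures are placeholders rather than the real data.

```python
from dataclasses import dataclass


@dataclass
class District:
    name: str
    budget: float
    local_revenue: float
    state_revenue: float

    @property
    def total_revenue(self) -> float:
        # Assumes revenue comes solely from local and state sources.
        return self.local_revenue + self.state_revenue

    @property
    def balance(self) -> float:
        # Assumed definition: positive means surplus, negative means deficit.
        return self.total_revenue - self.budget


def rank_by_balance(districts):
    """Order districts from greatest surplus to greatest deficit."""
    return sorted(districts, key=lambda d: d.balance, reverse=True)


# Placeholder figures, not the actual 2012-13 numbers.
districts = [
    District("District A", budget=100.0, local_revenue=40.0, state_revenue=50.0),
    District("District B", budget=80.0, local_revenue=50.0, state_revenue=45.0),
    District("District C", budget=60.0, local_revenue=30.0, state_revenue=30.0),
]
ranked = rank_by_balance(districts)
```

If a district has other revenue sources, `total_revenue` is the one place that would need to change.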

I'm sure there are many other ways this data could be represented - a scatter plot immediately comes to mind.  But for this use case, choosing between an area chart and a bar chart - the bar chart wins every time.

The intent of this particular exercise is not to indicate that a bar chart is the only way to represent this data.  Instead, it's to indicate that an area chart is not the way to represent this data.  And, maybe more importantly, do some homework before choosing your next chart type.

The source data I used came from this workbook.


Monday, January 19, 2015

Viz Makeover | Pittsburgh Stock Report

I'm a big fan of the Pittsburgh Business Times website and blogs as they provide great content on Pittsburgh businesses, and the articles are very well researched and documented.  My one complaint is that they're always publishing these slideshows that force me to click through dozens of pages of content to get the full story.

A couple weeks ago, they published a new slideshow outlining the 2014 stock performance of publicly traded Pittsburgh-area companies.  Again, I had to click through over 40 slides to get the story.  Throughout my clicking I was trying to remember who the winners and losers were, and I found myself clicking back to previous slides so that I could recall the performance of the different companies that I wanted to compare.  The slides have valuable content but it's almost impossible to compare the businesses using the slideshow format.

So I did a makeover.  And, you guessed it, I did it as a bar chart.

My version is an interactive display that lists the forty-five area businesses from best to worst performers.  I used some simple annotation to provide context and I used color encoding to indicate the different sectors.

In the end, I think my makeover offers an intuitive user experience and it's quite simple to view the results and make comparisons when everything is displayed on a single page.

Here's one of the "before" slides...




Welcome

Hello and welcome to the Bar Chart Guy blog!

My name is Harley Ellenberger.  But on this blog I'll assume the name of the Bar Chart Guy.

To start, a bit about me.  First of all, I wouldn't describe myself as a very creative person.  My drawing skills are limited to stick figures and the rough, boxy sketches that I use to plan out my dataviz work.  PowerPoint is my tool of choice for lumping together the limited graphic design elements that I use in my work.  Oh, and I'm color blind.

But what I lack in the creativity area, I make up for with my left brain traits.  I've always been pretty good with numbers (but by no means am I a mathematician).  As a kid, I had an elaborate ranking system for my baseball card collection.  I had all of my best cards ranked (on graph paper) by their value, and when I'd trade or acquire new cards I'd start a new sheet.  I didn't plot the values using the length of a bar, but it was a good way of keeping track of my most valuable cards.

So, I like numbers, I can draw lines and boxes, and I have trouble seeing variations in color.  It's not a coincidence then that my go-to chart is the bar chart.  It plays exactly to all of my strengths.

I've designed hundreds of different interactive charts and dashboards both for fun and for my job.  And I'll bet that nearly every one of those included a bar chart of some type.  I'll mix it up from time to time - maybe throw in a stacked bar chart here or a divergent stacked bar chart there.  Call them what you like, the premise is the same.

Show me a pie chart and I'll turn it into a bar chart.  Give me a scatter plot and I'll reshape it into a bar chart.  Give me a poorly designed bar chart and I'll turn it into an effectively designed bar chart.  If a bar chart fits the bill, then a bar chart it will be.

On this blog, I'll be taking different visuals that I find on the web and redoing them as bar charts.  I'll list the original sources of the visuals and data that I use.  And I'll show my makeover of the original along with an explanation of why I think a bar chart is the appropriate choice.