Waterfall charts - #SWDChallenge
Updated: May 11, 2018
I enjoy participating in the Storytelling With Data Challenges because of their simplicity; there is no complex data set, simply an instruction to create a single chart type. This allows you to focus on learning the chart type rather than learning the data and, with over 50 people doing the same exercise, you are also able to learn through the experience of others.
The May 2018 Storytelling With Data Challenge was to create a waterfall chart, a type of chart that I had never previously used. Given my lack of experience with this chart, my first task was to investigate how it is traditionally used and how it is interpreted by the reader. Wikipedia states:
A waterfall chart is a form of data visualization that helps in understanding the cumulative effect of sequentially introduced positive or negative values. These intermediate values can either be time based or category based.
When I googled examples of the chart type I was inundated with charts based on financial data. Whilst I acknowledge finance is a very good use case for a waterfall chart I wanted to pick a topic that I was more familiar with and, for those of you who know me, sports is one of my passions.
This month it was actually a submission by Rodrigo Calloni that inspired my choice of theme; Rodrigo created a fantastic example of a waterfall chart that visualised the scoring progression of the 2018 Super Bowl. I really liked how Rodrigo took the concept of the cumulative effect of positive values and applied it to a sports data set (each score in the Super Bowl being the positive value and the cumulative effect being the winning team's lead after each score). You can check out Rodrigo's viz by clicking on the image of his viz.
So onto my viz; I knew from Rodrigo's viz that I wanted to visualise a sports data set using a waterfall chart. I asked myself what sports data could be used to demonstrate changes over time, whilst ensuring I focused on the cumulative effect of the change; then it hit me: athletics world records change over time. Often the timeline of world records is visualised through bar charts or line charts and, whilst these easily portray the overall trend of the world record, they make it difficult to identify the 'size' of each step change when a new world record has been set. Using this concept I settled on visualising the 100 metres world record since automated timings were introduced in 1964.
My waterfall chart - Techniques used
When creating my waterfall chart in Tableau I used a number of design principles to keep the viz 'clean':
Hiding the axis
Cole had previously commented that if the value of the data item, in a waterfall chart, was included as a data label it enabled the axis labels to be hidden. I applied this technique to label both the year the world record was set and the world record time for each new world record.
Using a dual axis gantt chart to 'position' data labels
The risk of visualising both the year of the world record and the world record time was the data label could become crowded, taking the focus of the viz away from the waterfall bars. To overcome this I created a dual axis gantt chart within Tableau. This enabled me to create a traditional waterfall style bar for each data point whilst positioning the year of the world record above the bar and the world record time below the bar.
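To make the idea concrete, here is a minimal sketch, in Python rather than Tableau, of the values a gantt-style waterfall bar needs: a start position (the previous world record) and a size (the change to the new record), plus the two labels. The function name `waterfall_bars` and the field names are hypothetical, and the three records are only a short sample I am assuming for illustration, not the full data set behind my viz.

```python
# Short illustrative sample of 100 m world records (year, time in seconds).
# Assumed for illustration only; this is not the full data set used in the viz.
records = [
    (1968, 9.95),
    (1983, 9.93),
    (1988, 9.92),
]


def waterfall_bars(records):
    """Return one dict per record describing a gantt-style waterfall bar."""
    bars = []
    previous_time = None
    for year, time in records:
        # The first record has no predecessor, so its bar has zero size.
        start = time if previous_time is None else previous_time
        size = round(time - start, 2)  # negative when the record improves
        bars.append({
            "year_label": str(year),      # label positioned above the bar
            "time_label": f"{time:.2f}",  # label positioned below the bar
            "start": start,               # where the gantt bar begins
            "size": size,                 # how far the bar extends
        })
        previous_time = time
    return bars


bars = waterfall_bars(records)
for bar in bars:
    print(bar)
```

Each bar starts where the previous record left off and extends down by the size of the improvement, which is what gives the chart its 'waterfall' shape.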
Positioning images to direct the visual flow
During a recent UK Healthcare Tableau User Group the legend that is Jonni Walker shared how he regularly uses images to direct the visual flow of a visualisation. I decided to use this technique to highlight the decreasing values of the waterfall chart. I did this by including an image of Usain Bolt's lightning strike pose, with his arm pointing towards the last value in the chart, which just so happens to be his current world record time.
My waterfall chart - Reflection
Whilst I was really happy with the waterfall chart I had created, there was one aspect of the visualisation that was bothering me. I had visualised each new world record as a discrete value as opposed to a continuous time series. The advantage of this approach was that it replicated a traditional waterfall chart, whereby every data point is an equally spaced and sized waterfall bar. The disadvantage was that the visualisation failed to portray the length of time between world records. For example, in 1968, Jim Hines was the first male sprinter to break the 10 second barrier and Hines's world record remained in place for 15 years; and yet in my waterfall chart the spacing between data points was consistent, regardless of how long each world record stood.
This irritation got me wondering if a waterfall chart could be adapted to visualise both cumulative change over time and the length of time between data points. Having asked Cole if she was aware of any examples of this being done before, and being told no, I decided to attempt to create a second version of my waterfall chart based on this alternative approach.
My revised waterfall chart - Continuous time series
Before I share my revised waterfall chart I will make a confession: I did not use any complex Tableau table calculations to visualise the time between world records; instead I adapted the data set in Excel, sorry!
Dual axis gantt to replicate a waterfall bar and 'connecting line'
I wanted the primary focus of my revised waterfall chart to still be the margin by which each world record reduced, when compared to the last. As such the core concept of using a gantt chart to visualise the reduction through a waterfall bar remained; however in my revision I adapted the secondary gantt chart to visualise a connecting line between world records.
This connecting line enabled the waterfall chart to visualise the change in world record times and the length of time the world record stood for.
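As a rough illustration of the kind of reshaping this needs (I did mine by hand in Excel, so this is not the method I actually used), the Python sketch below derives one 'connecting line' segment per record, running from the year the record was set to the year it was broken. The function name `connecting_segments`, the field names and the `end_year` parameter are hypothetical, and the three records are again just a short assumed sample.

```python
# Short illustrative sample of 100 m world records (year, time in seconds).
# Assumed for illustration only; this is not the full data set used in the viz.
records = [
    (1968, 9.95),
    (1983, 9.93),
    (1988, 9.92),
]


def connecting_segments(records, end_year):
    """Return one horizontal segment per record on a continuous time axis."""
    segments = []
    for i, (year, time) in enumerate(records):
        # A record stands until the next one is set; the last record in the
        # list simply runs to the chosen end year.
        next_year = records[i + 1][0] if i + 1 < len(records) else end_year
        segments.append({
            "time": time,                    # vertical position of the line
            "from_year": year,               # where the line starts
            "to_year": next_year,            # where the line ends
            "years_stood": next_year - year  # length of the connecting line
        })
    return segments


# The sample list is truncated, so end_year is just where this sample stops.
segments = connecting_segments(records, end_year=1991)
for segment in segments:
    print(segment)
```

With the segments on a continuous year axis, the 15 years that the 1968 record stood shows up as a long connecting line, rather than the same spacing as every other step.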
I think the revised waterfall chart is potentially useful when continuous data, such as time, needs to be visualised. What do you think?