Lorna Brown set the challenge this week to build this butterfly chart, so called because of the symmetrical display. I’ve built these before so hoped it wouldn’t be too taxing.
The data set isn’t very verbose, fields for gender & year specific values by age bracket
When connecting to this excel file, you need to tick the Use Data Interpreter checkbox which removes all the superfluous rows you can see in the excel file.
Now, I started by building a version that required no further data reshaping, which I’ve published here. This version uses two sets of dual axis, and I use the reversed axis feature to display the male figures on the left. This version required me to float the Population axis label onto the dashboard so it was central. Overall it works, the only annoying snag is that I have two 0M labels displayed on the bottom axis, since this isn’t a single axis. As a result, I came up with an alternative, which matches the solution, and which I’m going to blog about.
Reshaping the data
So once I connected to the data and applied the Use Data Interpreter option, I then selected the Males, 2021 and Females, 2021 columns, right clicked and selected Pivot
This results in a row for the Male figures and a row for the Female figures, with a column Pivot Names indicating the labels for each row. I renamed this to Gender, and the Pivot Values column to Population.
Building the Chart
On a sheet, add Age to Rows and Population to Columns and Gender to Colour. Set Stack Marks Off (Analysis > Stack Marks > Off).
You’ll see that both the values for the Male & Female data is displaying in the positive x-axis. We don’t want this. So let’s create
Population – Split
IF [Gender]= ‘Males, 2021’ THEN [Population]*-1 ELSE [Population] END
Replace the Population pill on Columns with Population Split
So this is the basic butterfly, but now we need to show the Total Population. Firstly, let’s create
Total Population per Age Bracket
{FIXED Age: SUM([Population])}
And then similarly to before, we need to display this in both directions against the Male and the Female side, so create
Total Population Split
IF [Gender]= ‘Males, 2021’ THEN [Total Population per Age Bracket]*-1 ELSE [Total Population per Age Bracket] END
Drag this field onto the Population Split axis until the two green column icon appears, then let go of the mouse. This is creating a combined axis, where multiple measures are on the same axis.
Move Measure Names from Rows to the Detail shelf and then change the symbol to the left, so the pill is also added to the Colour shelf. Adjust the colours of the 4 options accordingly.
Edit the axis and rename to Population. Then Format the axis, and set the scale to display in millions to 0dp
The tooltip needs to display % of total, so we’ll need
% of Total
SUM([Population]) / SUM([Total Population per Age Bracket])
Add this to the Tooltip shelf
We also need the population figures on the tooltip displayed differently, so explicitly add Population Split and Total Population Split to the Tooltip shelf. Use the context menu of the pills on this shelf, to set the display format in the pane to be millions to 2 decimal places. But this time, once set using the Number(Custom) dialog, switch to the Custom option and remove the – (minus sign) from the format string. This will make the values displayed against the Males which are plotted on the -ve axis, to show as a positive number.
Finally, set aliases against the Gender field, so it displays as Males or Females (right click on Gender > Aliases).
Now you have all the information to set the tooltip text.
Now add a title to the sheet to match that displayed, hide field labels for rows, and adjust any other formatting with row/column dividers etc that may be required, and add to a dashboard.
And that should be it. My published version of this build is here.
This week the whole #WOW2022 crew and myself were lucky enough to be able to attend #data22 – the Tableau Conference in Las Vegas. In the 10 years I’ve been involved in the Tableau Community, this was my very first US conference, and my first opportunity to meet some of the people I engage with on a weekly basis, in person. Meeting all those who have been involved in the WOW challenges over the years and my fellow regular participant, Rosario Gauna, was a big highlight for me.
Erica led the live #WOW2022 session at the conference on Thursday morning (not sure the organisers fully understood the title), with this challenge.
Erica walked through the challenge end-to-end in the session, but I attempted to build out my solution, just as I would at home, so let’s crack on.
Modelling the data
The viz compares actual sales vs target, and an additional data set was provided to store the target data. As a consequence this needs to be combined/joined/related with the actual sales data.
For this you need to download the two excel data sources provided in the challenge – SuperStore Sales and Superstore Category Monthly Sales Targets.
Connect to the Superstore Sales file and drag the Orders sheet into the canvas. The Add a new connection to theSuperstore Category Monthly Sales Targets file and drag in the Sales Targets sheet to the right of the Orders object in the canvas.
This will try to create a relationship between the two objects, but can’t as it needs to be defined.
In the section at the bottom, relate the Category field from the Orders object to the Category field in the Sales Target object.
This creates a valid link between the two objects, but it’s not enough. We also need to relate on a date field. In the Sales Targets object there is a field Month which stores the date on the 1st of a month. So to relate the Orders data we need to
Click + to Add more fields
From the Select a field dropdown on the Orders side, select Create a Relationship calculation
In the calculation window type in DATE(DATETRUNC(‘month’, [Order Date])). This returns the 1st of the month related to each Order Date into a Date rather than Datetime datatype.
Select Month from the Sales Targets dropdown.
Building the bar in bar chart
You can build this type of chart using a dual axis chart, where both axis are set to use the bar mark type, but the sizes of the bars are different. However, if you go down this route you’ll struggle to get the labels right (as the labelling is the trickiest part of this challenge).
Instead for this challenge you need to build a combined axis chart, which is possible since both measures are represented by the same mark type, a bar.
First up, add Order Date to the Filter shelf and select to filter by the Year 2021.
Then add Order Date to Rows and set it to the discrete (blue pill) ‘month’ level (the ‘May’ option rather than the ‘May 2015’ option on the context menu). Add Sales to Columns and change the mark type to bar.
Then drag Sum Targets from the left hand pane onto the Sales axis, releasing the mouse when you see the ‘double column’ green icon appear.
This will result in a combined axis chart being created, and the fields Measure Names and Measure Values automatically added to the viz.
Move Measure Names from the Columns and place on Size. Additionally add another copy of Measure Names onto Colour and adjust accordingly.
On the Analysis menu, select Stack Marks -> Off so the bars both start from 0, rather than being placed on top of each other.
We now need to adjust the bar sizing. Edit the sizes via the Size legend on the right hand side, so the range between the sizes is less than that set.
We now need to work on colouring the Sales bars based on whether they met target or not. For this we first need to work out whether the Target is bigger or not.
Met Target?
SUM([Sales]) > SUM([Sum Targets])
This returns a boolean true/false result. We want this on the colour shelf in addition to the Measure Names field that is already there. If we just drag it onto colour, it will replace Measure Names. Instead, drag Met Target? onto the Detail shelf. Then click on the 3 dots immediately to the left of the Met Target? pill in the marks card, and select the Colour icon.
This has the effect of adding this as an additional colour field, so four options rather than two are now presented in the colour legend. Adjust the colours once again.
We’ve now got the core bar-in-bar chart. We can just add some final formatting to this section.
Format the month axis to set the dates to be abbreviated, and then rotate the labels.
Click the Order Date label at the top of the chart and Hide field labels for columns.
Edit the Value axis and rename the title to Sales ($).
Amend the number format of both the Sales and Sum Targets fields to be a number with 0dp and $ prefix. Add both fields to the Tooltip and adjust so that both values display when you hover over either the Sales bar or the Targets bar.
Labelling the bars
Labelling the bars to match Erica’s display is the trickiest part of this challenge. Bottom line – it shouldn’t be so tricksy but that’s just the way it is until Tableau see fit to fix it in the product.
Anyway, on examining Erica’s published solution before I started, I could see, by hovering my mouse over the viz, that there was a small mark highlighted above the bars that missed target. This provided a clue that there was a dual axis involved (hence the need for a combined axis to display the bars), with a mark plotted at some distance above the existing data…but where…?
Before working that out, I need to build a couple of fields
Difference
SUM([Sum Targets])- SUM([Sales])
which returns the difference between the Sales and Sum Targets for each month. But since I only want the values when the target hasn’t been met, I need
Missed Target Diff
IF NOT([Met Target?]) THEN [Difference] END
which is formatted to $ with 0dp.
I also needed some text to only display when the target was missed
Label Text Missed Target
IF NOT([Met Target?]) THEN ‘Target missed by’ END.
Now to figure out what measure to plot….
In Erica’s solution, she worked out an ‘uplift’ of the Sum Targets value to plot, but only when the target was missed. She did this using quite a complex looking nested LOD calculation (check out her solution for this).
After much deliberation, discussions, trial and error, I finally came up with a alternative that isn’t perfect, but is ‘good enough’ in my book.
Firstly, I needed another measure to plot
Target to plot
IF NOT([Met Target?]) THEN SUM([Sum Targets]) END
This plots the Targets value but only against the months where the target wasn’t month.
I then added this to the Columns shelf and I changed the mark type to circle on the Target to Plot marks card only.
I then made the chart dual axis and synchronised the axis.
Then, remove Measure Names from the Colour and Size shelf on the Target to Plot marks card, and also remove the Met Target? field from the Colour shelf too.
Adjust the Size of this mark to the smallest possible, and change the Opacity (via the Colour shelf) to 0%, so the circle mark becomes invisible except on hover.
Add Missed Target Diff to the Text shelf of the Target To Plot marks card. Edit the text in the label as below
I added a carriage return, formatted the text to 11pts and orangey/red and added 5 spaces to the front of the 2nd line.
I then changed the label alignment so the text was rotated and it was positioned top centre.
Next I added Label TextMissed Target to the Label shelf on the Measure Values marks card.
I edited the text as below, left aligning and again adding 5 spaces in front
Then I adjusted the alignment to be top left
Adding Category to the Filter shelf and testing with the different filters, suggested this technique seemed to work (at least on Desktop).
Last step is to remove the secondary axis, remove the row & column dividers, and hide the ‘nulls indicator’. Then add to a dashboard.
This challenge certainly raised questions over these formatting specifics. As I mentioned above, it shouldn’t be that hard to add a label that looks like this – left aligned and positioned above the bar. However the product doesn’t let you achieve this easily as yet, which is disappointing and certainly confusing to new users of the product – it seems such a straightforward requirement after all.
If I was doing this for my own work, I’d have kept with whatever options a single label allowed (where the text was all right aligned), or placed on a single line. The impact on performance of a complicated calculation along with the maintainability of a dashboard using it, are important considerations when deciding what’s ‘good enough’, and are arguments that should be made if a client/user insists. After all, what real benefit does this particular format provide over