Mekko charts & Offset Labels

This week’s WOW challenge for community month, saw Sean partner with Tableau Visionary and ‘Hall of Famer’, Linsdsay Betzendahl to deliver this double challenge – creating a Mekko chart with staggered labels AND a stacked bar chart with staggered labels.

There was quite a lot going on in this challenge, and I had utilised all the concepts in the past, but did still need a refresher to remind how to get things done. So I did draw on the links provided in the challenge, as well as blogs I’ve written for other WOW challenges, specifically

Modelling the data

The link in the challenge downloaded a CSV file to connect to. Once connected, the instruction was to pivot the data, so all the ‘types of death’ columns were transposed from rows into columns. In the data source pane, multi-select all the ‘Total Dead xxxx’ columns then right-click and select pivot

I then renamed the Pivot Field Names column as Disaster Type Raw and Pivot Values as Decadal Deaths.

In the Data pane, I then created a group against the Disaster Type Raw field, called Disaster Type. Every value was grouped, even if it only contained a single entry, just so I could display the name of the disaster in the way that matched the solution

Building the calcs for the Mekko chart

A Mekko chart is based on % – in this instance, the height of the chart is split based on the % of deaths by type per decade, while the width is based on the % of total deaths per decade. For this we’ll need some table calculations, so building out a tabular view to check the values is always something I find very valuable.

Change the Year field to be discrete, then add to Columns. Add Disaster Type to Rows and Entity to Filter and restrict to World. Add Decadal Deaths to Text.

To get the % of deaths by type per year, create

% Decadal Deaths

SUM([Decadal Deaths]) / TOTAL(SUM([Decadal Deaths]))

Add this into the table and then adjust the table calculation so it is computing by Disaster Type only, which essentially means based on the layout of this table view, that each column should be adding up to 100%

This is giving us the value we need to plot on the Y-Axis of the chart (ie the height)

To determine the value to plot on the X-Axis, we need several calculations. First we need to know the total deaths per decade

Total Deaths per Decade

{FIXED [Year],[Entity]:SUM([Decadal Deaths])}

Add this into the table. The value should be the same for every row for the same year.

Then we need to know what this value is as a % of all the deaths across all the decades

% Total Deaths per Decade

SUM([Total Deaths per Decade]) / TOTAL(SUM([Total Deaths per Decade]))

Add this into the table, and adjust the table calculation so that it is computing by Year, which means the values per row should add up to 100%

To build a Marimekko chart, which is a variable width chart, we need to use the cumulative value of this %, so we plot points from the start up to 100%. So we need

Running Sum % Total Deaths per Decade

RUNNING_SUM([% Total Deaths Per Decade])

Add this into the table, verify the table calc is computing explicitly by Year as described above for both the nested calculations, and we should see the values increasing up to 100%

Now typically this would be enough for the chart and is what is mentioned in the referenced blog posts for building these charts. However, due to the staggered labelling requirement, we need to go a step further, as the value we plot on the X-Axis needs to be used for both the Mekko chart and the ‘pointers’ associated to the staggered labels. As these labels are centred, we need to plot our point midway

Mekko – Plot Point X Axis

[Running Sum % Total Deaths Per Decade] – ([% Total Deaths Per Decade]/2)

ie take the running sum value, but subtract half of the % value for the decade.

As before add to the table, setting all nested table calculations to be computing by Year.

Now we have all this, we can start to build out the chart

Building the Mekko chart

On a new sheet, add Entity to Filter and restrict to World. Add Year to Detail and Disaster Type to Colour, and adjust colours to suit.

Then add Mekko – Plot Point X-Axis to Columns. Adjust the table calc, so both next calculations are computing by Year only. Then add % Decadal Deaths to Rows and adjust the table calc to be computing by Disaster Type only. Change the mark type to bar and adjust the sort on the Disaster Type field to sort by field Decadal Deaths ascending

Then add % Total Deaths per Decade to Size. Adjust the table calc so it is computing by Year only. Then click on Size and change the option from Manual to Fixed and align Centre, so the width of the bar is stretched equally, either side of the central plot point. Add a white border via the Colour shelf.

Add Decadal Deaths and Total Deaths per Decade to Tooltip and update as required.

Add Disaster Type to Label, then adjust the Label settings so the label is only displayed based on the maximum value of the Year field, and match the mark colour.

At this point, while the labels are displayed, they aren’t in the required location. To resolve this, start by extending the X axis to be fixed to end at 1.2 (right click axis > Edit)

This extends the axis, giving some ‘space’ in the chart for the labels. Trying to adjust the alignment of the labels doesn’t work though, so move them manually. Click on the label so the cursor changes to a cross, and then drag the label to the side to reposition it.

Do this for all the labels displayed in the solution, doing your best to align them vertically. Once done, then in the label settings, uncheck the ‘allow labels to overlap other marks’ option, so the other values disappear.

Now we have the Mekko chart, we now need to handle the staggered labelling.

Mekko – Label Position YAxis

IF INDEX()%2 = 1 THEN -0.1 ELSE -0.3 END

We’ll index the year from 1 to 13, and every odd number index will have the label positioned at -0.1 and even at -0.3

Add this to Rows which will create a 2nd marks card. Remove all the fields except Year and Total Deaths per Decade from the card.

Adjust the table calc to explicitly compute by Year, and then change the Size from Fixed to Manual and reduce the slider to as small as possible. You should now have your label ‘pointers’

Move the Year and Total Deaths per Decade pills onto the Label shelf, then set the Label settings, so All labels are shown, they can overlap other marks, and adjust the alignment, so they’re not skewed (you may need to find you have to click the sideways position first before you can get the alignment working)

Adjust the formatting of the label, then edit the Mekko – Label Position Y Axis axis so that it is fixed to start at -0.5, and the title and tick marks are removed, so the axis looks to disappear.

Finally tidy up by removing all gridlines, row and column dividers and hiding the Mekko – Plot Point X Axis axis.

Add to a dashboard, and include the requested citation to the data source. I added this within a floating text box with a show/hide button.

Building the calcs for the Bar chart

The bar chart employs similar principals as the Mekko chart, we need to calculate a position to plot the point on the X-axis, but the bar has a fixed height and not split. Let’s start with getting the calcs we need again.

Build a table with Entity on Filter (filtered to World), Year on Rows and Disaster Type on Columns. Add Decadel Deaths onto Text and then add % Decadel Deaths into the table, and adjust the table calc so it is computing by Disaster Type. Due to the layout of this table, the values in each row now add up to 100%.

The % Decadal Deaths defines the ‘width’ of each segment in the bar chart, so one again we need a cumulative value to help determine the plot point.

Running Sum % Decadal Deaths

RUNNING_SUM([% Decadal Deaths])

Add to the table and ensure all the table calcs for the nested calcs are all computing by Disaster Type.

And as before, as we need a ‘central’ point to plot, create

Bar – Plot Point X Axis

[Running Sum % Decadal Deaths] – ([% Decadal Deaths]/2)

Add to the table and ensure all the table calcs for the nested calcs are all computing by Disaster Type.

There is an additional requirement for this chart, to restrict what is displayed based on the Year selected. Create a parameter based off of the Year field

pYear

integer parameter, defaulted to 2020 (formatted without thousand separators), that lists the values

Then create a field

Show Last 3 Decades

[Year]>= [pYear]-20 AND [Year]<=[pYear]

and add to the Filter shelf and set to True.

For the label pointer position, we also need a similar field as before, but only want labels to show when the % is over 4%, so create

Bar – Label Position Y Axis

IF [% Decadal Deaths] > 0.04 THEN
IF INDEX()%2 = 1 THEN -0.1 ELSE -0.3 END
ELSE NULL
END

format to show 2dp, then add to the table, and ensure all table calcs are computing by Disaster Type.

Values will only display if the % Decadal Deaths value is >4%, and the value alternates based on the index position.

Building the bar chart

The steps are very similar to the Mekko chart, except this time, we need the Year dimension on Rows and we can ‘hard code’ a ‘y-axis’ by double clicking into Rows and typing MIN(0.5). This axis is then fixed from 0 to 1 to provide some spacing. Bar – Plot Point X Axis is on Columns and % Decadal Deaths is on Size with the size once again Fixed and centred. This time to Disaster Type field is sorted by Decadal Deaths descending.

Add Bar – Label Position Y Axis to Rows to create the 2nd marks card, and again apply similar steps as above to end up with labelled pointers. I set the axis of this measure to be fixed from -1

The you just need to tidy up the formatting by removing the gridlines, axis lines, row/column dividers and hiding the various axis. Then add to a dashboard and show the parameter.

My published vizzes are here:

Happy vizzin’!

Donna

Can you build a funnel chart?

For my first #WOW2023 challenge, as an ‘official’ part of the #WOW crew, I challenged everyone to build this funnel chart. Inspiration came from a colleague who posed me the challenge of building something similar, ensuring the bars were a single entity when clicked on. In that example we’d just used Superstore data, but it wasn’t really reflective of how a funnel would typically work, so I sourced some dummy sales opportunity data and made the challenge more relevant to a real business scenario.

Building the Funnel Chart

I saw this as a table calculation exercise, though I have subsequently seen solutions that manage it all through LODs. I’ll obviously blog my solution.

Let’s start by getting the core data into a table, so we can see what we’re aiming for

The total value of all the Opportunities is $91.4M. Based on the sales pipeline process and the assumptions documented in the requirements, all of that value will have passed through the first stage – Prospecting.

There is $6.1M still sitting in Prospecting, so when we move onto the next stage, stage 2 -Qualification, the total amount that has passed through will be $91.4M – $6.1M = $85.3M

As there is $6.5M sitting in Qualification, as we move to stage 3, the total amount that has passed through will be $85.3M – $6.5M = $78.7M. A similar pattern repeats for Stage 4, but for Stage 5, Closed Won, the value that has ‘passed’ through that stage, is equivalent to itself.

If we amend the Value field to be a Running Total quick table calculation (and add Value back into the view too), we get

The Running Sum is computing down the table , with the value in each row being the sum of the preceding rows. But this is opposite to what we want. We want Stage 1 to have the biggest value, the total, and the values to get smaller as you look down the table.

To resolve this, edit the table calculation and change the Sort order to custom sort by Minimum Stage No Descending

This results in the running total to sum from the last row (Stage 6) up to the first row (Stage 1), and you can see the value against Stage 1 matches the total for the Value column.

However, this still isn’t quite what we need. While we need to consider the amount for Closed Lost in stages 1-4, we shouldn’t be including it in stage 5, Closed Won.

While excluding the Stage 6 from the view (Stage No on Filter shelf set to Exclude 6) , gives the correct value for Stage 5, we’ve now lost the amount for Closed Lost in the other rows

So, we need some additional calculations to help resolve this.

Amount Lost

{FIXED : SUM(IF [Stage No]=6 THEN [Value] END)}

This just captures the amount of Stage 6 and ‘spreads it across every row of data.

Let’s now get our running total field, baked into it’s own calculation, as this will help with other calcs

Cumulative Value Per Stage

RUNNING_SUM(SUM([Value]))

Replace the Value table calc field with this one, ensuring the table calculation settings are identical to those described above.

Now let’s add the Amount Lost onto the Cumulative Value Per Stage unless we’re at Stage 5, Closed Won

Total Amount Per Stage Inc Lost

IF MIN([Stage No]) = 5 THEN [Cumulative Value Per Stage]
ELSE [Cumulative Value Per Stage] + SUM([Amount Lost]) END

Pop this into the table (checking the table calc settings are as expected) and the total amount now stored against Stage 1, is the total value of all the Opportunities.

Ok, so now we know the values, how do we plot this in such a way that it represents a funnel. I chose to do this by first working out the proportion of the total value that the ‘reverse’ cumulative value represented.

Total Value

{FIXED:SUM([Value])}

meant I could determine

Proportion of Total

[Total Amount Per Stage Inc Lost] / SUM([Total Value])

I formatted this to percentage with 0 dp.

Adding these into the table

If we plotted this information on a bar chart, we’d get this

but we need to ‘centre’ each bar in relation to the first bar to give us the funnel effect. This means we need to shift each ‘bar’ to the right, by half the amount that is the difference between 100% and the Proportion of Total.

Position to Plot

(1 – [Proportion of Total])/2

We can’t represent this with a actual bar chart, but instead we can use a Gannt chart.

  • Stage No to Rows
  • Stage No to Filter and exclude Stage 6
  • Stage to Detail
  • Position to Plot to Columns, adjusting the table calculation as previously described
  • Change mark type to Gantt bar
  • Add Proportion of Total to Size (and verify the table calc is set properly)

Hey presto! A funnel!

To finalise

  • add Stage to Label and align centrally. Make the font bold and match mark colour.
  • add Cumulative Value per Stage to Colour and reverse the colour range (via the Edit Colours dialog)
  • Widen each row a bit.
  • Fix the Position to Plot axis to start at 0 and end at 1 to ensure the funnel is centred exactly.
  • Add Value and Total Amount Per Stage Inc Lost to the Tooltip and adjust the tooltip text accordingly (don’t forget to sense check the table calc settings!).
  • Hide the Position to Plot axis, and the Stage No column. Remove all gridlines, zero lines, axis rulers /row/column banding etc.

Building the KPIs

We need a few calculated fields to store the required numbers

Won

{FIXED:SUM(IF [Stage No]=5 THEN [Value] END)} / [Total Value]

formatted to a percentage with 0 dp.

Lost

{FIXED:SUM(IF [Stage No]=6 THEN [Value] END)} / [Total Value]

formatted to a percentage with 0 dp.

Outstanding

1-([Lost] + [Won])

formatted to a percentage with 0dp.

On a new sheet add Measure Names to Columns and Measure Values to Text. Add Measure Names to Filter and select Total Value, Won, Lost, Outstanding. Manually re-order the columns.

Format Total Value to be a number displayed in millions with 1 decimal place & prefixed with $.

Add Measure Names to Text and adjust the text as required. Align the text to be centred.

Remove the row banding, and hide the column heading.

Then arrange the two sheets on the dashboard, ensuring both are set to fit entire view.

My published viz is here.

Hope you enjoyed this

Happy vizzin’!

Donna

Can you create a Waterfall Pipeline using Salesforce data?

For the Salesforce Dreamforce (#DF22) conference, Lorna set this challenge based on using Salesforce data. You could access the data either by creating a Salesforce Developer account, or using the provided csv data set. I chose to use the latter.

I’m not overly familiar with the SF data, so it’s possible in the course of this blog, I may have missed a field in the data set that could have been used instead of whatever technique I describe. Feel free to let me know in the comments if this is the case.

Setting up the calcs

After connecting to the data, the first step was to amend the amount based on whether the Stage Name was ‘Closed Lost’ or not.

Revised Amount

IF [Stage Name]=’Closed Lost’ THEN -1 * [Amount] ELSE [Amount] END

I set the format of this field to be Currency £M with 1 dp.

Now the crux of the waterfall, is that the position of the mark that needs to be plotted it is based on the cumulative sum of the values displayed. So for this we need to create a table calculation

Revised Amount (Running Total)

RUNNING_SUM(SUM([Revised Amount]))

I set the format of this field to be Currency £M with 0 dp.

Let’s put the data into a table to see what’s going on.

The table calculation of the 2nd column (Revised Amount (Running Total)) is computing ‘down’ the table, so for this to work as we require the order of the rows is important. I just manually sorted by dragging each Stage Name into the relevant position. and then added a grand total.

With these 2 fields, we can now build the basic waterfall.

Build the Waterfall

I chose to duplicate the above sheet and then move the pills around as follows

  • Move Stage Name to Columns
  • Move Revised Amount (Running Total) to Rows. Amend the table calculation setting so it is set to explicitly compute using Stage Name.
  • Change the Mark Type to Gantt Bar
  • Move Revised Amount from Text to Size, then double click on the Revised Amount pill so it becomes editable and add * -1 to the end, to invert the value

  • Add Row Grand Total
  • Add Stage Name to Colour and manually adjust each colour accordingly

  • Add Revised Amount to Tooltip and adjust.
  • Edit Revised Amount (Running Total) axis (right click axis -> Edit Axis) and amend axis title
  • Format the Grand Total label (right click label -> Format) and amend says ‘Total’ and is bold.
  • Hide the Stage Name label at the top of the chart (right click label -> hide field names for columns)

Labelling the bars

Labelling the bars isn’t quite as simple as you think it might be, as the position of the label differs depending on whether the Revised Amount is positive (at the top) or negative (at the bottom).

So we need to create some new calculations

LABEL – +ve Revised Amount

IF SUM([Revised Amount])>=0 THEN SUM([Revised Amount])

formatted to £M to 1 decimal place

LABEL – -ve Revised Amount

IF SUM([Revised Amount])<0 THEN SUM([Revised Amount]) END

formatted to £M to 1 dp

Adding these to the tabular layout we had earlier you can see how these fields are behaving.

Switch back to the waterfall chart, and add LABEL – +ve Revised Amount to the Label shelf. Format the label to be smaller font (I chose 8pt), bold and explicitly aligned to the top.

Now duplicate the Revised Amount (Running Total) pill that is on Rows to add another instance directly next to it. I do this by holding down ctrl as I then click and drag the pill. This will create another axis.

On the second marks card, remove LABEL – +ve Revised Amount from Label and add LABEL – -ve Revised Amount instead. Adjust the label so it is aligned at the bottom instead.

Now make the chart dual axis and synchronise the axis.

Then hide the right hand axis, and remove all divider and gridlines.

Then add to the dashboard. I used a vertical layout container so I could add blanks of 1 pixel in height with a black background colour to present the this black lines separating the header and footer text.

My published viz is here.

Happy vizzin’!

Donna

Let’s build a Marrimeko Chart!

Sean set the challenge this week and went retro, revisiting a challenge from 2017 originally set by Emma Whyte to build a Marrimeko chart.

Hang on… a what chart? Marrimeko…???

A good place to understand what a Marrimeko chart actually is, is this blog post by Tableau Visionary & Ambassador, Jonathan Drummey. This leads onto this post which explains the steps to help build.

The main concept of this type of chart is to show part-to-whole relationships across two variables at once.  In the above for each job title, we have a split vertically based on proportion by gender; but the proportion of people with each job title is also being represented horizontally by the width of the ‘bars’. Both the x and the y axis are up to 100%.

I was actually around when the original challenge was set, so have my solution from then already on my Tableau Public profile. But it was a long time ago, with an older version of Tableau, so I built this from scratch (using the referenced blogs as a quick refresher). This blog will take you through the steps I took (which actually didn’t end up that different from the last time).

The data

The data set isn’t very large and contains all the information we will need, but not quite in the structure we might expect

The Sub-Type = Total field contains the % we will need to define how the width of the bars should be split (all the Total values add up to 100% or just about).

Whereas the combination of the Sub Types of Male and Female define how the height of the bars should be split (Male + Female = 100% for each Job Type).

So when we come to build, we ideally want to ‘filter out’ the Sub-Type = Total field, but we still need to retain this information to build the viz.

We need to get the % values associated to each Total row to be reflected against the Male/Female rows.

We’ll create an LoD for this

Job Type Percentage

{FIXED [Job Type]: SUM(IF [Sub-Type] = ‘Total’ THEN [Percentage]END )}

format this to % with 0 dp

If we pop the data out into a tabular format as below and add Sub-Type to the Filter shelf so it excludes the Total row, we can see we have essentially transposed the values which were stored against the Total row into it’s own column.

Building the Marrimeko chart

As stated above, the Job Type Percentage field defines the width of the bars we’ll be displaying. But before we can get to the point, we also need to define where on the x-axis each mark should be positioned. This too is based on Job Type Percentage, but is the cumulative value.

On a new sheet add Job Type to Rows, Filter by Sub-Type to exclude Total, and add Job Type Percentage to text. Sort by Job Type Percentage descending.

Now add a Running Total Quick Table Calculation to the Job Type Percentage field, and the cumulative values will display.

It is these values we need to plot on the x-axis.

Duplicate the sheet.

Edit the table calculation against the Job Type Percentage field, so that it is explicitly set to compute using Job Type. This is important to ensure the calculation is retained regardless as to where we put the pill on the canvas.

Now move this pill from the Text shelf to Columns, and change the mark type to Gantt Bar. Move Job Type from Rows to the Detail shelf. Reapply the Sort to the Job Type field so its sorting by Job Type Percentage descending

The markers for the Gantt should all be positioned at the correct position.

Now add Percentage to Rows and change the mark type to Bar.

Add Job Type Percentage to the Size shelf, then click on the shelf, and change to be Fixed and aligned Right.

Now add Sub-Type to Colour. Manually drag the values in the COlour legend to re-order (so Male is listed first) and adjust colours to suit.

Ta dah! The crux of the chart.

Now to add the ‘bells and whistles’.

The chart is labelled, but only for the Entry Job Type. We need a calculated field to manage this.

Label: Sub Type

IF [Job Type] = ‘Entry’ THEN [Sub-Type] END

Add this to the Label shelf, and set to Show Mark labels. If it doesn’t show (like it didn’t on mine), change the field to be an attribute (I need to do a dig on why this is required… I think it’s something to do with the table calcs….).

I tried to use the alignment setting of the label to set to ‘top left’, but while they would left align they wouldn’t move to the top. I manually moved them instead – simply click on the label and when the cursor changes to a cross, drag the label to the desired position.

Do this for both labels, and adjust the formatting of the label too (I set it to 12pt bold).

Remove all gridlines, axis lines, zero lines etc.

Add a 50% reference line on the y-axis. Right click on the Percentage axis ->Add Reference Line. Add a table level constant of 0.5 which is a thin dotted grey line that is almost invisible (due to the colour selected).

Hide the axes (uncheck show header).

The tooltip has a minor nuance in that it refers to ‘Women’ & Men rather than Male & Female, so we need a field for this

Gender

CASE [Sub-Type]
WHEN ‘Female’ THEN ‘Women’
WHEN ‘Male’ THEN ‘Men’
END

Add this to the Tooltip shelf and adjust the tooltip accordingly.

Labelling the axes

The final viz has labels on both the x and y axis, but these are all managed by text/image objects positioned ‘cleverly’ on the dashboard. It’s a bit of trial and error to get everything aligned as required.

To label the Job Type, I used a horizontal container, positioned beneath the chart, with several text objects, some of which had the text rotated.

The Equal Proportions and Less Equality text are both floating text objects. I saved an ‘arrow’ image from the internet and added a floating image object too.

My published viz is here.

Happy vizzin’!

Donna

How often is Sean listening to his favourite songs?

A “scatter plot with a twist” music based challenge this week was posed by Sean Miller, using data from his last.fm account. On the surface, a simple scatter plot of each song, comparing the number of days between first and last listen against total plays. But click on song, and you get to see a timeline depicting days since first play vs cumulative plays, on the same chart. Hmmm…

The data set provided consists of 1 row per song per date played since 2017. For the scatter plot, we’re looking to summarise the data at a song level. For the timeline related to the selected song, we need to have the data at the song and date level. Ultimately we’re mixing levels of granularity within a single chart.

This certainly took some thinking. Creating the 2 charts independently was pretty straightforward, but trying to put them together took some thought. I knew I was going to want to use a set action to drive the interactivity and apply different logic based on whether a song was selected or not, but it took a bit of trial and error to get a solution.

First up, the data provided contained some timestamp date fields, but these were string data types. I chose to use the Timestamp UTC field to get a ‘proper’ date. I actually did this by duplicating the field, renaming it to Date Played and changing the datatype to a date. This generated a calculation which is below… I’d have never of typed this myself 🙂

Date Played

DATE(IF NOT ISNULL( DATEPARSE ( “MM/dd/yyyy HH:mm”, [TimeStamp UTC] ) ) THEN DATEPARSE ( “MM/dd/yyyy HH:mm”, [TimeStamp UTC] ) ELSEIF NOT ISNULL ( DATEPARSE ( “MM/dd/yyyy hh:mm:ss a”, [TimeStamp UTC] ) ) THEN DATEPARSE ( “MM/dd/yyyy hh:mm:ss a”, [TimeStamp UTC] ) END)

Now we have that, we can work out the first date a song was played

Min Date Per Song

{FIXED [Song ID]:MIN([Date Played])}

and the latest dates

Max Date Per Song

{FIXED [Song ID]:MAX([Date Played])}

and then we can derive the days between

Days since first listen

DATEDIFF(‘day’, [Min Date per Song], [Max Date per Song])

We can also get the total plays per song using

Total Plays per Song

{FIXED [Song ID]: [Total Plays]}

and with these calculated fields, we can build the basic scatter, using Total Plays per Song as a range filter, and setting both the axes not to start at zero.

So far so good. Now let’s think about the timeline. We need to identify a ‘selected song’ to help build this, so let’s create a set by right clicking on Song ID > Create > Set, and selecting a single option

We need to capture the number of plays on a date

Total Plays

COUNT([2021_05_26_WW21_My Streaming Activity.csv])

This is the default count field generated (the equivalent of the Number of Records if you’re on older versions of Tableau).

And we also need to capture the number of days from first play date to the current date, as we need to plot on a consistent axis when putting the charts together (ie we can’t plot with date on the axis).

Days to Date

DATEDIFF(‘day’,[Min Date per Song],[Date Played])

Let’s put this out into a table so you can see what’s going on.

Add the Selected Song = True to the filter shelf

There are days when there were multiple plays, so the Days To Date field needs to be set to AVG rather than SUM, to get the correct figure.

For Total Plays we need to plot the cumulative value, so we can set a quick table calculation against the Total Plays field of Running Sum (right click on Total Plays field). We’re going to explicitly set the table calculation to compute by Date Played as when depicted on a viz, the default of table down, might not give the correct values.

So with these fields, we can build the timeline viz (duplicate the table sheet and move the fields around). Set the mark type to line, and change the setting on the Path to make a stepped chart.

Ok, so now we have the 2 charts and hopefully understand what we’re aiming for. But how do we now go about getting everything onto a single chart?

We’re going to need a dual axis chart, since we have different mark types in play. And we’re going to want to plot different measures depending on whether we’re working with the selected song or not.

We’re going to build the data up in a table to get the logic for the fields we need. To just test the concept, we’ll filter to just a few songs, including the one in the Selected Song set. Add Song ID to the filter shelf and filter to a few songs.

Then build out a table as below:

What we’re aiming for, is for the records where In/Out of the set is Out, we want to plot the information we’re getting from the 1st two columns, but for the records where In/Out the set is In, we want the information from the other columns.

So let’s build this out.

Days to Plot

IF ATTR([Selected Song]) THEN AVG([Days To Date]) ELSE SUM([Days since first listen]) END

Plays to Plot

IF ATTR([Selected Song]) THEN RUNNING_SUM([Total Plays]) ELSE SUM([Total Plays per Song]) END

Pop these in the table, making sure any fields which are table calculations are set to compute by Date Played.

If you scroll to find where the data changes from a song out of the set to the one in, you can see how the two new fields are working.

So lets try plotting the chart out using these fields instead.

  • Days to Plot on Columns
  • Total Plays Per Song on Rows
  • Song ID on Detail shelf
  • Plays to Plot on Columns
  • Date Played (set to exact date) on Detail of the All Marks card.
  • Set the table calculation of Plays to Plot to compute by Date Played only.
  • Change mark type of the Plays to Plot to Line and set the Path to stepped line
  • Add Total Plays Per Song to Filter shelf and set to range from 50 to 100 (just to make the chart less busy).

You should end up with the below

You’ll notice we have the line of circles in the top chart, which is plotting a mark per day for the Selected Song. If we remove the Date Played pill from the Detail shelf of the Total Plays Per Song marks card, we lose these marks including the mark for the Selected Song too, which we need.

To resolve this, we need another field.

Is Last?

LAST()=0

Last Plays to Plot

IF [Is Last?] THEN Sum([Total Plays per Song]) END

This is just saying give me the total plays for the last mark in the list. Add these to the table to see what’s going on, making sure to set the table calc to compute by Date Played only

Now if we replace Total Plays per Song with the Last Plays to Plot field, we get
Now make the chart dual axis (don’t forget to synchronise), and we can then sort the formatting.
  • Set the marks type on the Last Plays to Plot to circle
  • Remove Measure Names from the Colour shelf, and set the mark colour to #3dde3c. Reduce the opacity to about 90%. Add a dark grey border.
  • Add Selected Song to the Size shelf, and adjust so the selected song is larger than the others.
  • On the Plays to Plot marks card, again remove Measure Names from the Colour shelf, and set the colour to black.
  • Set the Path to stepped line.
  • Click on the right hand axis and select Move marks to back
  • Reduce the Size of the mark.
  • Add Selected Song to the Colour shelf, and adjust the colours so the line is black and the tiny dots for all the other marks that you can see in the circles, is set to #3dde3c

You’ve now got the core chart, which needs to be further tidied to remove grid lines, axes, add tooltip etc. Once done, you can add to a dashboard, where you can then set the interactivity.

Add a dashboard action to change set values that sets the Selected Song set on Select. Set to work on single-select only.

Finally, you’ll find that if you select a mark, while you’ll get the trend line, the other points will now ‘fade out’

Create a new field True = True, and add this to the Detail shelf of the All Marks card. Then on the dashboard, add a new dashboard highlight action, which is set to Target Highlighting to the True field only.

Now if you click on and off a mark, you should get the trendline show and disappear, and all marks remain at the same transparency throughout.

Hopefully you’ve got enough now to complete this challenge. My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Can you build a Customer Lifetime Value Matrix?

Ann Jackson challenged us this week to build this matrix depicting the average worth of customer cohorts during their lifetime.

This challenge involves a mix of LoDs (Level of Detail calculations) and table calculations.

First up , we need to define our customer cohorts (ie group the customers), which for this challenge is based on identifying the quarter they placed their first order in. This will involve an LoD calculation. For a good introduction to LoDs with some worked examples (including a similar cohort analysis example), check out this Tableau blog post.

ACQUISITION QUARTER

DATE(DATETRUNC(‘quarter’,{FIXED [Customer ID] : MIN([Order Date])}))

The 2nd part of the formula in the { … } returns the earliest Order Date associated to the Customer ID, which is then truncated to the 1st day of the quarter that date falls in ie 23 Feb 2019 is truncated to 01 Jan 2019.

For the ‘quarters since birth’ field, we need to calculate the difference in quarters, between the ACQUISITION QUARTER and the ‘quarter’ associated to the Order Date of each order in the dataset.

QUARTERS SINCE BIRTH

DATEDIFF(‘quarter’,[ACQUISITION QUARTER],DATETRUNC(‘quarter’,[Order Date]))

Drag this field into the ‘dimensions’ area of the left hand data pane (above the line if you’re using later versions of Tableau).

Lets sense check what this looks like, by adding

  • ACQUISITION QUARTER to Rows (Discrete, Exact Date)
  • ORDER DATE to Rows, set to Quarter (quarter year ie May 2015 format which will make a green pill), then set to discrete to change to blue
  • QUARTERS SINCE BIRTH to Rows

You can see that while the first row against each cohort starts with a different quarter, the QUARTERS SINCE BIRTH always starts at 0 and counts sequentially down the table.

Next we want to count the number of distinct customers in each cohort, and we’ll use another LOD for this.

CUSTOMERS

{FIXED [ACQUISITION QUARTER]: COUNTD([Customer ID])}

Once again move this field into the Dimensions section of the data pane.

Add this onto the Rows of the above data table, and you should get every row for the same cohort displaying the same number

Add Sales onto Text to get the value of sales made by the customer in each cohort in each quarter. The ‘customer lifetime value’ we need is defined as the total sales so far / number of customers in the cohort.

Remove the QUARTER(Order Date) field from the table, as we’re not going to need this for the display, and it’ll affect the next steps if it’s left.

To get the cumulative sales, we need a Running Total Quick Table Calculation. Click on the Sales pill on the Text shelf and select Quick Table Calculation -> Running Total. The click again and Compute By -> QUARTERS SINCE BIRTH. Add Sales back into the table, so you can see the quarterly Sales value and how it’s cumulating until it reaches the next cohort.

We’ve now got the building blocks we need for the CLTV value we need to plot

Avg Lifetime Value

RUNNING_SUM(SUM([Sales])) / SUM([CUSTOMERS])

Note – I purposefully haven’t called this field what you might expect, as I’m going to ‘fill in the gaps’ that Ann describes in the requirements, and I’ll use that name then.

Pop this field into the table above, again setting the table calculation to compute by QUARTERS SINCE BIRTH

You can now use the data table above to validate the calculation is what you expected.

Now let’s build the viz out.

On a new sheet

  • QUARTERS SINCE BIRTH to Columns
  • ACQUISITION QUARTER (exact date, discrete blue pill) to Rows
  • Avg Lifetime Value to Text, setting the table calculation to Compute By QUARTERS SINCE BIRTH

From this basic text table, you can see the ‘blank’ fields, Ann mentioned. In the data table view, it’s not so obvious. The blank is there because there are no sales in those quarters for those cohorts. To fix we need another table calculation

CUSTOMER LIFETIME VALUE (CLTV)

IF ISNULL([Avg Lifetime Value]) AND NOT ISNULL(LOOKUP([Avg Lifetime Value],-1)) AND NOT ISNULL(LOOKUP([Avg Lifetime Value],1))

THEN LOOKUP([Avg Lifetime Value],-1) ELSE [Avg Lifetime Value] END

This says, if the Avg Lifetime Value field is NULL but neither the previous or the subsequent values are NULL, then use the Avg Lifetime Value value from the previous column (LOOKUP).

Replace the Avg Lifetime Value with the CUSTOMER LIFETIME VALUE (CLTV) field (setting the Compute By again), and the empty spaces have disappeared.

If you hover over the cells in the lower right hand side of the view, you’ll see tooltips showing, indicating that a mark has been drawn on the viz with Null data. To fix this, add CUSTOMER LIFETIME VALUE (CLTV) to the Filter shelf and specify non-null values only to show via the Special tab.

Now if you hover over that area you don’t get any tooltips displaying, as there aren’t any marks there.

Now it’s just a case of formatting the viz a bit more

  • Add CUSTOMERS to Rows
  • Add CUSTOMER LIFETIME VALUE (CLTV) to the Colour shelf by holding down the Ctrl key then clicking on the field that’s already on the Text shelf, and dragging via the mouse onto the Colour shelf. Using Ctrl in this way has the effect of copying the field including the table calculation settings, so you don’t need to apply them again. This will change the colour of the Text.
  • Then change the mark type to Square, which will then fill out the background based on the colour.
  • Then edit the colour legend to the relevant palette (which you may need to install via Ann’s link).
  • Set the border of the mark via the Colour shelf to white
  • Remove the row & column dividers
  • Set the row Axis Ruler to a dark black/grey line
  • Format the 1st 2 columns so the font is the same and centred. Widen the columns if required.
  • Update the tooltip

And then you should be ready to add the viz to your dashboard. My published version is here.

This blog is a bit more detailed that my recent posts, but I’m also conscious I’ve skipped over some bits that if you’re brand new to Tableau, you may not be sure how to do. Feel free to add comments if you need help!

Happy vizzin’! Stay Safe!

Donna

Can you create a small multiple waterfall chart?

For this week’s #WOW2020 challenge, Lorna Brown asked us to recreate a waterfall chart – a chart style that hasn’t featured in many previous challenges (if at all), and is always a useful one to know how to build.

I’m familiar with these, and this challenge didn’t cause me too many issues, so this blog is going to be brief.

  • Building the waterfall
  • Small multiple / grid layout
  • Adding the month label

Building the waterfall

So I’d built out the basic waterfall for each month and day by plotting Year(Order Date), Month(Order Date) and Day(Order Date) on Columns. The Day(Order Date) field was set to Show Missing Values so each day without an order was still plotted, and I was trying to figure out how to get the additional ‘long’ bar at the end.

I worked out it was essentially a ‘total’ bar and when I duplicated my data as crosstab and played round with the data in a tabular form, I got the subtotals I needed displayed.

But I seemed to be having issues displaying these on the chart view. So I turned to my usual route, Google, and had a search, and came across this blog from Tim Ryan at The Data School, which gives you the complete guide to building the waterfall, so there’s no need for me to repeat it all – thanks Tim! 🙂

My issue was I had a green continuous Day(Order Date) field rather than a blue discrete one – doh!

The only couple of things you need to make note of – you need to ensure you have 0 displayed for the missing dates

Actual Profit

ZN(SUM([Profit]))

and the gantt bars should be coloured red for negative profit, blue for positive and grey for the missing days

Colour

IF SUM([Profit])<0 THEN ‘Red’ ELSEIF SUM([Profit]) > 0 THEN ‘Blue’
ELSE ‘Grey’
END

Small multiple / grid layout

For this you need fields to add to the Rows and Columns shelf that position the month in the appropriate cell.

The Quarter(Order Date) (blue discrete) on Rows, is just used to define the row a month lands in.

You then need

Cols

IF MONTH([Order Date])%3 = 0 THEN 3
ELSE MONTH([Order Date])%3
END

which assigns each month a value of 1,2 or 3. You’ll need this on the Columns shelf.

Adding the month label

The label shows the month and the total profit in the month, so I created an LOD for this

Profit for Month

{FIXED YEAR([Order Date]),MONTH([Order Date]): SUM([Profit])}

From this I wanted the maximum value of all the monthly profits

Max Monthly Profit in Year

{FIXED Year([Order Date]): MAX([Profit for Month])}

I then used a dual axis to plot this field on the Rows, set the mark type to a line and set the opacity of the line colour to 0%, so it disappears.

The month and value were then added to the Label shelf and the label set to Label start of line only and also right aligned to get the required positioning.

And that’s it. I said this would be brief 🙂 My published viz is here.

Happy vizzin’! Stay Safe!

Donna

How does the Sales Pipeline look?

Luke posted the challenge this week to build a Sales funnel using bar charts and table calculations.

The data provided was structured as follows

Each row identified a record in the pipeline along with its value. The last_stage field indicated what stage in the process the record was currently at. The value and last_stage are the only two bits of data needed for this challenge, along with the knowledge that the process moves through the stages in the following order :

  1. Prospect
  2. Lead
  3. Qualified
  4. Opportunity
  5. Negotiations
  6. Closed

which you can tell from the diagram anyway.

The challenge is that the value of any record currently at stage 2 (lead) or above, also needs to be incorporated into the total value of the previous stages.

When tackling these types of challenges involving table calculations, I start by creating a table view of all the data I need. So let’s get cracking…

Building up the data required

First up, I need a field to help drive the order of the stages, so I created

last_stage Order

CASE [last_stage]
WHEN ‘Prospect’ THEN 1
WHEN ‘Lead’ THEN 2
WHEN ‘Qualified’ THEN 3
WHEN ‘Opportunity’ THEN 4
WHEN ‘Negotiations’ THEN 5
ELSE 6
END

Add this to a view along with the value

This is essentially the data needed to build the Current Status column in the output.

Next we need to work out the Overall Funnel values. As Luke stated in the requirements, this is going to involve table calculations. What we want is a running sum but one that starts at the bottom at stage 6, and goes ‘up’ the table. But there is no Table Up option in Tableau, so we need to be creative.

We’ll still need a running sum though, so create this

Running Sum

RUNNING_SUM(SUM([value]))

Add this to the view, it will automatically be applied Table Down, which is what we need at this point – each row is the sum of the previous rows above it.

We also need the overall total captured against each row in the table.

Window Sum

WINDOW_SUM(SUM([value]))

Now we’re in a position to work out our ‘cumulative’ value or Overall Funnel, which is a calculation involving all 3 fields

Cumulative Value

[Window Sum]- ([Running Sum] – SUM([value]))

Note – for the values that will be displayed, I’ve applied formatting of $ to 0 dp.

Now onto the final Percent to Close calculation which is the value of records at stage closed, as a proportion of the cumulative value. So we need to get the closed value stored against every row

Closed Value

{FIXED : SUM(IF [last_stage order] = 6 THEN [value] END)}

so then we can determine

% To Close

SUM([Closed Value]) / [Cumulative Value]

which is formatted to percentage to 0 dp

So now we have all the data we need to build the viz. I like to save this sheet for future reference, and to double check as we start moving pills around to build the viz.

Building the viz

Start by duplicating the table sheet, and remove the calculations, so only the 3 necessary (Sales, Cumulative Value & % To Close) are listed.

Then move these 3 onto the Columns shelf, and in doing so, the viz should reformat as a bar chart automatically.

Set the colour of the bars to the dark green (#00646d) and tick the Show mark labels option on the Label shelf (you may need to expand the width of the rows to get the label to show.

The first 2 columns are displaying what we need, but the final one needs to show the bars ‘filling up’ to 100%. We need a dual axis for this.

Type into the Columns shelf Min(1) to create a fourth column. Uncheck show mark labels for this column only.

The set this field to be Dual axis, and synchronise the axis. It’s likely the marks will all change to circles, so reset the ‘All’ marks card back to bar. On the Min axis, right click and select Move marks to back, to ensure the % Closed values are sitting on top.

Now change the colour of the Min(1) field to light green (#7cadb2), untick Show Header against last_stage order & SUM([Value]) to hide the axis. Also right click on last_stage in the view, and select Hide field labels for rows.

Finally adjust the size of the labels displayed to 8pt. On the % To Close marks card, change the label alignment to left, and set the font colour to white. Uncheck Show Tooltips for all marks.

And you should now have the main display.

Oh, adjust the Size of the bars to suit.

When added to the dashboard, add a horizontal container above the viz, and use text boxes to display the column titles.

My published viz is here.

Happy vizzin’!

Donna