Can you build a ranked heatmap tile?

Erica had a guest coach, Valerija Kirjackaja setting the challenge this week, asking us to use table calculations to build this heatmap table.

I built an initial version of the heatmap only, which I’ve published here, but I couldn’t get the display to match Valerija’s when I clicked on it. (Note – I wasn’t bothered about Viz in Tooltip part at this point). I used 2 dimensions to display the yellow header section, and you can see below, when clicking, this is noticeable. But Valerija’s solution treats the header as a single entity.

I couldn’t figure out how she’d managed this, so I had to have a look at the solution, and then I built my own instance, which is what I’ll now blog about.

Defining the core fields

Connect to the datasource and add a data source filter to restrict the data by the latest year (as I used Superstore 2025, I filtered to 2025). Doing this meant I didn’t have to worry about adding worksheet level filter fields.

On a new sheet add Category and Sub-Category to Rows and add Sales to Text and then sort descending so you can see the data we need to work with. Format Sales to be $ to 0 dp.

Create a new field

Sales by Cat Rank

RANK(SUM([Sales]))

change it to be discrete and then add to Rows. Adjust the setting of the table calculation so it is computing by Sub-Category only, so the ranking restarts for each Category

We will also need to display the Category in upper case, so create

Category Upper

UPPER([Category])

and add to Rows.

Having this tabular layout just lets us clarify how the table calculation will be working.

Building the Heatmap Table

On a new sheet, add Category to Columns and Sub-Category and Sales to Text and align centrally (Note – I originally put Category Upper on the columns, and then changed later after taking all the screen shots)

Add Sales by Cat Rank to Rows. Change it to a continuous (green) pill and edit the axis to be reversed. Adjust the table calculation so it is computing by Sub-Category only.

Create a new field

One

1

Change the mark type to gantt bar and add One to the Size shelf, setting the aggregation to Avg. Increase the size of the mark via the Size shelf to as large as possible. Add Sales to Colour then add a white border.

We need the axis numbers to be central to the ‘row’, so to fix this, double click into the Sales by Cat Rank pill and change it to subtract 0.5 ( [Sales by Cat Rank] -0.5 )

Re-edit the axis to reverse it again.

So our ‘table’ is now displaying data on a axis from 0.5 up to 9.5. To add the header, we’re going to plot another mark at -0.5, so we can display a section of the same size as the existing ones. For this, double click into Rows and type MIN(-0.5).

This has the effect of creating a second marks card

Remove Sales from Colour and Sub-Category from Label. Add Category Upper to Label. Adjust the label text alignment and formatting and change the Colour to yellow.

Make the chart dual axis and synchronise the axis and we now have the required display.

Tidy up by

  • Remove row & column dividers
  • Remove gridlines, zero lines & axis ticks
  • Hide the right hand axis (right click > uncheck show header)
  • Hide the Category Upper column labels (right click pill > uncheck show header)
  • Remove the left hand axis title
  • Fix the left hand axis from -0.5 to 9.5
  • Format the axis with a custom formatting of #,##0;-#,##0;TOTAL so that 0 is displayed as the word TOTAL (this is very sneaky by the way, and took me a while to work out)
  • Then format the font to be bold
  • Adjust the Tooltip on the Sales by Cat Rank marks card to display the Sub-Category and Sales value.
  • Delete the text from the Tooltip on the MIN(-0.5) marks card
  • Name the sheet Table or similar

Building the Viz in Tooltip

On a new sheet add Order Date to Columns at the discrete month level (blue pill) and add Sales to Rows. Add Sub-Category and Category to Detail. Format the date axis so the dates are just using First Letter.

Create a new parameter

pSubCat

string parameter defaulted to Bookcases

Then create a field

Is Selected SubCat

[Sub-Category] = [pSubCat]

and add to the Colour shelf. Adjust accordingly, then make sure True is listed first in the legend, so the line is ‘on top’

Create a new field

Label Line

IF [Is Selected SubCat] THEN [Sub-Category] END

and add to the Label shelf and update to allow marks to overlap. Hide the Order Date label heading (right click > hide field labels for columns). Title the chart Line or similar

Remove gridlines and row/column dividers

Go back to the Table sheet, and update the Tooltip of the Sales by Cat Rank marks card to reference the Line chart and update the filter to pass as just <Category>

Adding the final interactivity

Create a dashboard and add the Table sheet. Then add a parameter action

Set Sub Cat Param

On hover of the Table sheet, set the pSubCat parameter with the value from the Sub-Category field, setting the value to <empty string> when the selection is cleared.

If all has been applied, then the line chart should just display the lines associated to the Sub-Categories in the same Category

My published viz is here.

Happy vizzin’!

Donna

Let’s Practice RegEx in Tableau with Generative AI

Yoshi set this week’s challenge which focused on 2 areas : the use of tools such as CHAT GPT to help us create Tableau RegEx functions (which are always tricksy), and the build of the chart itself.

Creating the REGEX calculated fields

I chose to use CHATGPT and simply entered a prompt as

“give me a regex expression to use in Tableau which extracts a string from another string in a set format. The string to interrogate is”

and then I pasted in the string I’d copied from one of the Session Html fields. I got the following response

along with many more code snippet examples. I used these to create

Session Title

REGEXP_EXTRACT([Session Html], '<div class="title-text">(.*?)</div>')

Description

REGEXP_EXTRACT([Session Html], '<div class="description"[^>]*><div>(.*?)</div>')

Location

REGEXP_EXTRACT([Session Html], '<span class="session-location"[^>]*>(.*?)</span>')

Session Level

REGEXP_EXTRACT([Session Html], 'session-level-([a-zA-Z]+)')

Alias this field to show the values in proper case (ie ‘Advanced’ rather than ‘advanced’) and display Null as ‘All Levels’

Session Date

REGEXP_EXTRACT([Session Html], '<span class="semibold session-date">(.*?)</span>')

Session Time

REGEXP_EXTRACT([Session Html], '<span class="semibold session-time">(.*?)</span>')

Start Time

REGEXP_EXTRACT([Session Html], '<span class="semibold session-time">([0-9: ]+[AP]M)')

End Time

REGEXP_EXTRACT([Session Html], '- ([0-9: ]+[AP]M)')

Creating the additional fields

All of the RegEx functions return string fields. To build the viz, we need actual date fields and additional information

AM|PM

IIF(CONTAINS([Start Time],’AM’), ‘AM’, ‘PM’)

Date

DATE(DATEPARSE(“MMMM d yyyy”, SPLIT([Session Date], “, “, 2) + ” 2023″))

returns the day of the session as a proper date field, all hardcoded to 2023, since this is when the data is from.

Start Date

DATETIME(STR([Date]) + ” ” + [Start Time])

returns the actual day & start time as a proper datetime field.

End Date

DATETIME(STR([Date]) + ” ” + [End Time])

Duration

DATEDIFF(‘second’, [Start Date], [End Date])

Add fields to a table as below

Firstly, when we build the viz, we’ll need to sort it based on the Start Date, the Session Level (where Advanced is listed first, and All Levels last) and the Duration (where longer sessions take higher precedence) . For this we need a new field which is a combination of information related to all these values.

Sort

STR([Start Date]) + ‘ – ‘ + STR(
CASE [Session Level]
WHEN ‘advanced’ THEN 4
WHEN ‘intermediate’ THEN 3
WHEN ‘beginner’ THEN 2
ELSE 1
END
) + STR([Duration]/100000)

(this just took some trial and error)

Add this to the table after the Session ID pill and then apply a Sort to the Session ID pill so it is sorting by the Minimum value of the field Sort Ascending. You should see that when the start time and session level match, the sessions are sorted with the higher duration first.

The viz also lists the AM & PM sessions as separate instances, where the 1st session of the PM session is displayed on the same ‘row’ as the 1st session of the AM session. To handle this we need

Session Index

INDEX()

Make this discrete and add to the table after the AM|PM pill. Change the table calculation so it is computing by all fields except AM|PM and Session Date. The numbering should restart at each am/pm timeslot

Finally, the last field we need to build this viz, is another date field, so we can position the gantt chart bars in the right places. As the viz already segments the chart by Session Date and AM|PM, we can’t reuse the existing Start Date field as this will display the information as if staggered across the 3 days. Instead we want to create a date field, where the day is the same for al the sessions, but just the time differs

Baseline Date

DATETIME(STR(#2023-01-01#) + ” ” + [Start Time])

the date 01 Jan 2023 is arbitrary and could be set to any day. Add this into the table, so you can see what it’s doing. You will probably need to adjust the table calculation on Session Index again to include Baseline Date in the list of checked fields.

Building the viz

On a new sheet, add Session Date to Columns and manually re-order so Tuesday listed first. Then add AM|PM to Columns . Add Baseline Date as a continuous exact date (green pill) to Columns . Add Session Id to Detail then add Session Index to Rows and adjust the table calculation so it is computing by all fields except Session Date and AM|PM.

Create a new field

Size

SUM([Duration])/86400

and add this to the Size shelf. 86400 is the number of seconds in 24hrs, so adjusts the size of the gantt bars to fit the timeframe we’re displaying.

Add Session Level to Colour and adjust accordingly. You will need to readjust the Session Index table calc to include Session Level in the checked fields. Add a border to the bars. Edit the Baseline Date axis so the axis ranges are independent

Add Description and Session Title to Detail and again adjust the Session Index table calc to include these fields too. Apply a Sort to the Session Id field so the data is sorted by the Sort field descending

Add Location, Start Time and End Time to Tooltip, and update accordingly. Then apply formatting to

  • change font of the Session Date and AM|PM header values
  • remove the Session Date/ AM|PM column heading label (right click > hide field labels for columns)
  • Hide the Session Index field
  • Hide the Baseline Date axis
  • Add column banding so the Wednesday pane is coloured differently

Add the ability to highlight the Session Title and Description by selecting Show Highlighter from the context menu of each pill

Then add the sheet to a dashboard, using containers to display the colour legend and the highlighter fields.

And that should be it. My published viz is here.

Happy vizzin’!

Donna

People Resource Planning

This week’s #WOW2024 challenge was a guest post from Dan Wade using an alternative custom data set focused on staff resource planning.

Creating the parameters

We need 3 parameters for this challenge

pMeasure

string parameter containing a list of 3 options: Budget, Demand, Supply and defaulted to Supply.

pAttrition_Actual

string parameter containing a list of 2 options: Actuals and Attrition and defaulted to Attrition

pRate

a float parameter containing a list of values from 0.05 to 0.12, defaulted to 0.06 and formatted as % with 0 dp.

Building out the calculations

On a new sheet add Forecast Date as a discrete (blue) pill at the month-year level to Rows and add Budget FTE, Demand FTE and Supply FTE and Actuals FTE to Columns vis Measure Names/Measure Values, so you have a tabular display. Show the 3 parameters.

The first 3 measures will be plotted as 3 of the lines on the chart. The 4th line – the Actuals/Attrition line is a calculation based on the parameter selections.

If the pAttrition_Actual parameter displays ‘Actuals’ then we need to show the Actuals FTE as a constant value across every row. From the above, we can see it’s only set against Jan 2024. We will make use of a FIXED LOD calculation to ‘spread’ this value across every row.

However if the pAttrition_Actual displays ‘Attrition’ then we want to calculate a value which is initially a proportion of the Actual FTE, but then is a proportion of the value calculated against the previous month. The requirements also state the rate in the pRate parameter is an annual attrition rate, so we need to apply 1/12 of this value against each month (assuming a linear decline). The calculation we end up with is

Actuals/Attrition FTE

IF [pAttrition_Actual] = ‘Actuals’ THEN SUM({MAX([Actuals FTE])})
ELSE
IF FIRST()=0 THEN SUM([Actuals FTE])
ELSE PREVIOUS_VALUE(0) * (1-([pRate]/12))
END
END

{MAX([Actuals FTE])} is the Fixed LOD which spreads the Actuals FTE value across every row. FIRST() is a table calculation which identifies the first row in the table. PREVIOUS_VALUE(0) looks at the previous value compared to the current row we’re on, and then is reducing it by 1/12 of the pRate parameter. Format this to a number with 2 decimal places.

Add this field to the table and explicitly set the table calculation to compute by Forecast Date.

Adjust the values of the pAttrition_Actual and pRate parameters to see the behaviour of the calculation.

Next we need to calculate the actual difference between this value and the value of the measure selected in the pMeasure parameter. To start we need

Selected Measure

CASE [pMeasure]
WHEN ‘Supply’ THEN SUM([Supply FTE])
WHEN ‘Demand’ THEN SUM([Demand FTE])
WHEN ‘Budget’ THEN SUM([Budget FTE])
END

and then we can create

FTE Difference

[Selected Measure] – [Actuals / Attrition]

Add this to the table. Verify the setting of the table calculation. Adjust the pMeasure value to see the changes.

In order to colour the bars on the viz, we need to know the % difference

% FTE Difference

[FTE Difference]/[Selected Measure]

format this to % with 2 dp and then create

% Diff > 20%

IF [% FTE Difference] > 0.2 THEN ‘Outside 20% range’ ELSE ‘Within range’ END

Add this to Rows and verify the output.

Building the Viz

On a new sheet add Forecast Date as a continuous (green) pill at the month/year level to Columns. Add Budget FTE, Demand FTE and Supply FTE to the same axis, using Measure Values/Measure Names. Adjust colours accordingly. Edit the y-axis and uncheck Include zero and change the axis title.

Add Actuals/Attrition FTE to the Measure Values section and ensure the table calculation is set to compute by Forecast Date. Adjust colour.

Add Measure Names to the Label shelf. Adjust to align right and central and set the font to match mark colour. Edit the date axis, adjust the axis title, then set the axis to have a fixed end of 31 Jul 2027. This gives some space for the labels to display.

Adjust the Tooltip to suit.

Then add Actuals/Attrition FTE to Rows, making sure the table calculation is set as it should. Change the mark type to Gantt Bar. Remove Measure Names from the Label and Colour shelf of this marks card.

Add FTE Difference to the Size shelf, adjusting the table calc setting. Then add % Diff > 20% to the colour shelf. Set the colours accordingly, and reduce the opacity to 25%.

Add Budget FTE, Demand FTE and Supply FTE to the Tooltip shelf then adjust the tooltip as required, making reference to the parameters to make the tooltip ‘dynamic’

Make the chart dual axis and synchronise the axis.

Show the parameters and adjust to see how the chart behaves with the different settings. Hide the right hand axis, remove row/column dividers, but make the axis lines slightly more prominent than the gridlines. Update the title and again reference the parameters so the text is dynamic.

Then add the chart to a dashboard and edit the parameter/legend titles as required.

My published viz is here.

Happy vizzin’!

Donna

Can you create a time-focused Gantt chart?

It was Kyle’s turn to set the challenge this week, and took inspiration from a work based scenario he had encountered. As a result 3 fictitious data sets were provided, and a viz was required to be built against each of them.

Building the Marketing Campaigns Gantt chart

All 3 vizzes are tied together by 2 parameters – the highlight date and number of days, so lets’ start with defining those

pHighlightDate

date parameter defaulted to 01 Dec 2023

and

pDays

integer parameter defaulted to 120

On a new sheet, show both of these parameters.

Using the Campaign data source, add Campaign to Rows and Start Date to Columns as a continuous exact date (green pill).

To define the width of each mark, we need

Duration

DATEDIFF(‘day’, [Start Date], [End Date])

Add this to the Size shelf, then add Category to the Colour shelf and adjust accordingly.

Add Start Date as a discrete exact date (blue pill) to Rows and position in front of the Campaign pill. This will order each row based on the Start Date ascending.

Hide the Start Date pill on Rows (uncheck Show Header). Add End Date as a continuous attribute to Tooltip, and amend the tooltip as required.

Add pHighlightDate to Detail, then right click on the Start Date axis and Add Reference Line. Add a dotted line for the entire table based on the pHighlightDate field.

Remove all row and column dividers; delete the axis title (right click > edit axis) and hide the Campaign column heading (right click > hide field labels for rows). Format the Campaign numbers so they are aligned right.

Update the title of the sheet, and include the details for the legend title within

Finally, we need to filter the rows shown based on whether the campaign was running within the window based on the number of days before/after the highlight date. We need

Window Start

DATE(DATEADD(‘day’, [pDays]*-1, [pHighlightDate]))

Window End

DATE(DATEADD(‘day’, [pDays], [pHighlightDate]))

and then

Campaigns to Include

[End Date] > [Window Start] AND [Start Date] < [Window End]

Add Campaigns to Include to the Filter shelf and set to True. Test the viz changes as the parameters are changed.

Building the Experiments Gantt Chart

This is built in very much the same way as the above using the Experiments data source instead. In this case, Start Time and Experiment ID will be on Rows, Start Time on Columns and Status on Colour,

A Duration field should be on Size, but the calculation needs to be slightly different to handle those records where there is a start but not an end. In this case, we assume the experiment is still ongoing, so set the end to ‘today’.

Duration

DATEDIFF(‘day’, [Start Time], IFNULL([End Time], TODAY()))

A Window Start and Window End field will need to added to this data source too, and then

Experiments to Include

IFNULL([End Time], TODAY()) > [Window Start] AND [Start Time] < [Window End]

should be added to the Filter shelf, and set to True – at this point all the experiments that didn’t have a start date either will disappear. Add a title and the reference line and format as before

Building the Emails bar chart

On a new sheet, using the CRM data source, add Date as a continuous exact date (green pill) to Columns and Sent to Rows. Change the mark type to bar chart. Click on the Size button, and change to be Fixed.

Add Email Type to Colour and adjust. Update the Tooltip.

Create fields Window Start and Window End as before, then create

Emails to Include

[Date] > [Window Start] AND [Date] < [Window End]

and add to the Filter shelf, set to True.

Add pHighlightDate to the Detail shelf, and add a reference line as before.

Remove all row/column dividers, gridlines & axis lines; delete the axis titles; add a sheet title including the legends.

The add all 3 sheets to a dashboard – I placed them all in their own vertical container, so I could then distribute contents evenly.

And that should be it! My published viz is here.

Happy vizzin’!

Donna

Can you build a funnel chart?

For my first #WOW2023 challenge, as an ‘official’ part of the #WOW crew, I challenged everyone to build this funnel chart. Inspiration came from a colleague who posed me the challenge of building something similar, ensuring the bars were a single entity when clicked on. In that example we’d just used Superstore data, but it wasn’t really reflective of how a funnel would typically work, so I sourced some dummy sales opportunity data and made the challenge more relevant to a real business scenario.

Building the Funnel Chart

I saw this as a table calculation exercise, though I have subsequently seen solutions that manage it all through LODs. I’ll obviously blog my solution.

Let’s start by getting the core data into a table, so we can see what we’re aiming for

The total value of all the Opportunities is $91.4M. Based on the sales pipeline process and the assumptions documented in the requirements, all of that value will have passed through the first stage – Prospecting.

There is $6.1M still sitting in Prospecting, so when we move onto the next stage, stage 2 -Qualification, the total amount that has passed through will be $91.4M – $6.1M = $85.3M

As there is $6.5M sitting in Qualification, as we move to stage 3, the total amount that has passed through will be $85.3M – $6.5M = $78.7M. A similar pattern repeats for Stage 4, but for Stage 5, Closed Won, the value that has ‘passed’ through that stage, is equivalent to itself.

If we amend the Value field to be a Running Total quick table calculation (and add Value back into the view too), we get

The Running Sum is computing down the table , with the value in each row being the sum of the preceding rows. But this is opposite to what we want. We want Stage 1 to have the biggest value, the total, and the values to get smaller as you look down the table.

To resolve this, edit the table calculation and change the Sort order to custom sort by Minimum Stage No Descending

This results in the running total to sum from the last row (Stage 6) up to the first row (Stage 1), and you can see the value against Stage 1 matches the total for the Value column.

However, this still isn’t quite what we need. While we need to consider the amount for Closed Lost in stages 1-4, we shouldn’t be including it in stage 5, Closed Won.

While excluding the Stage 6 from the view (Stage No on Filter shelf set to Exclude 6) , gives the correct value for Stage 5, we’ve now lost the amount for Closed Lost in the other rows

So, we need some additional calculations to help resolve this.

Amount Lost

{FIXED : SUM(IF [Stage No]=6 THEN [Value] END)}

This just captures the amount of Stage 6 and ‘spreads it across every row of data.

Let’s now get our running total field, baked into it’s own calculation, as this will help with other calcs

Cumulative Value Per Stage

RUNNING_SUM(SUM([Value]))

Replace the Value table calc field with this one, ensuring the table calculation settings are identical to those described above.

Now let’s add the Amount Lost onto the Cumulative Value Per Stage unless we’re at Stage 5, Closed Won

Total Amount Per Stage Inc Lost

IF MIN([Stage No]) = 5 THEN [Cumulative Value Per Stage]
ELSE [Cumulative Value Per Stage] + SUM([Amount Lost]) END

Pop this into the table (checking the table calc settings are as expected) and the total amount now stored against Stage 1, is the total value of all the Opportunities.

Ok, so now we know the values, how do we plot this in such a way that it represents a funnel. I chose to do this by first working out the proportion of the total value that the ‘reverse’ cumulative value represented.

Total Value

{FIXED:SUM([Value])}

meant I could determine

Proportion of Total

[Total Amount Per Stage Inc Lost] / SUM([Total Value])

I formatted this to percentage with 0 dp.

Adding these into the table

If we plotted this information on a bar chart, we’d get this

but we need to ‘centre’ each bar in relation to the first bar to give us the funnel effect. This means we need to shift each ‘bar’ to the right, by half the amount that is the difference between 100% and the Proportion of Total.

Position to Plot

(1 – [Proportion of Total])/2

We can’t represent this with a actual bar chart, but instead we can use a Gannt chart.

  • Stage No to Rows
  • Stage No to Filter and exclude Stage 6
  • Stage to Detail
  • Position to Plot to Columns, adjusting the table calculation as previously described
  • Change mark type to Gantt bar
  • Add Proportion of Total to Size (and verify the table calc is set properly)

Hey presto! A funnel!

To finalise

  • add Stage to Label and align centrally. Make the font bold and match mark colour.
  • add Cumulative Value per Stage to Colour and reverse the colour range (via the Edit Colours dialog)
  • Widen each row a bit.
  • Fix the Position to Plot axis to start at 0 and end at 1 to ensure the funnel is centred exactly.
  • Add Value and Total Amount Per Stage Inc Lost to the Tooltip and adjust the tooltip text accordingly (don’t forget to sense check the table calc settings!).
  • Hide the Position to Plot axis, and the Stage No column. Remove all gridlines, zero lines, axis rulers /row/column banding etc.

Building the KPIs

We need a few calculated fields to store the required numbers

Won

{FIXED:SUM(IF [Stage No]=5 THEN [Value] END)} / [Total Value]

formatted to a percentage with 0 dp.

Lost

{FIXED:SUM(IF [Stage No]=6 THEN [Value] END)} / [Total Value]

formatted to a percentage with 0 dp.

Outstanding

1-([Lost] + [Won])

formatted to a percentage with 0dp.

On a new sheet add Measure Names to Columns and Measure Values to Text. Add Measure Names to Filter and select Total Value, Won, Lost, Outstanding. Manually re-order the columns.

Format Total Value to be a number displayed in millions with 1 decimal place & prefixed with $.

Add Measure Names to Text and adjust the text as required. Align the text to be centred.

Remove the row banding, and hide the column heading.

Then arrange the two sheets on the dashboard, ensuring both are set to fit entire view.

My published viz is here.

Hope you enjoyed this

Happy vizzin’!

Donna

What is the lifetime value of customers?

Luke Stanke set the #WOW2022 challenge this week (see here) asking us to visualise the lifetime value of customers. I have to admit, I did struggle to really understand what was being requested, and to get the numbers to match Luke’s. I ended up referring to my own blog post on a similar challenge Ann Jackson set in week 2 of 2021 to get the calculations I needed 🙂

We first need to define the quarter that each customer made their first purchase.

Customer First Order Quarter

DATE(DATETRUNC(‘quarter’, {FIXED [Customer ID]:MIN([Order Date])}))

The {FIXED LOD} calculation returns the minimum order date per customer, then truncates this to the first day of the quarter that date falls in.

We then need to determine the difference in quarters between the Customer First Order Quarter and the quarter associated to the Order Date of each order in the data set. The requirement indicated we needed to add 1 to the result. I split this into multiple calculated fields. Firstly,

Order Date Quarter

DATE(DATETRUNC(‘quarter’, [Order Date]))

gets me the quarter of each Order Date, then

Quarters Since First Purchase

DATEDIFF(‘quarter’, [Customer First Order Quarter], [Order Date Quarter])+1

Let’s pop these fields out into a table to check things are behaving as expected.

Add Customer First Order Quarter and Order Date Quarter to Rows as discrete exact dates (blue pills), and add Quarters Since First Purchase to the Text field, but set it to be a dimension, so it is disaggregated.

Now we need to count the number of distinct customers that made a purchase in each Customer First Order Quarter (the cohort)

Count Customers Per Cohort

{FIXED [Customer First Order Quarter]: COUNTD([Customer ID])}

Add this to to the sheet, along with Sales (I moved Quarters Since First Purchase to rows too)

As expected, we can see the same customer count for each Customer First Order Quarter cohort.

We’ve now got all the building blocks to move on the the next requirement, but I’m going to rearrange the table a bit, to start to reflect the data actually needed for the output.

The x-axis of the chart is going to be based on the Quarters Since First Purchase field, so move that to be the first column of data (1st entry in Rows). Then remove Customer First Order Quarter and Order Date Quarter, as we don’t need this level of information in the final viz.

We now have the sum of all the customers and the total sales, so we can now create the division reqiurement

Sales / Customer

SUM([Sales])/SUM([Count Customers Per Cohort])

I set this to currency $ with 0 dp.

Add this to the table

Then to make this a running total, create a Running Total quick table calculation off of this field (right click on field -> Quick Table Calculation -> Running Total). Add back in Sales/ Customer.

We’ve now got all the components needed to build the viz, but do require an additional calculation to be displayed in the tooltip, which is the difference between each row.

Right click the existing running total Sales / Customer pill (the one with the triangle) and choose to Edit Table Calculation. Tick the Add secondary calculation checkbox and choose the Difference From table calc to run down the table, making the calc relative to the previous row.

Re-add a Running Total quick table calc to the other Sales / Customer pill, and then add Sales / Customer back into the view (it’s annoying that Tableau won’t let you add multiple pills of the same measure name unless they have a different calculation against them).

The snag with this is that we need a value to display for the first row. We need to create a new field that can reference the data in the second table calculation. Click and drag the ‘difference’ table calculation field into the Data pane, and name the field Difference. It should look like below, but you shouldn’t have needed to type any of that.

Difference

ZN(RUNNING_SUM([Sales / Customers])) – LOOKUP(ZN(RUNNING_SUM([Sales / Customers])), -1)

Now create a new calculated field

Tooltip:Difference

IF FIRST()=0 THEN [Sales / Customers] ELSE [Difference] END

Set this to a customer number format of 1dp, prefixed by + (the data is always cumulative so is never going to have a -ve value, so this works).

Now we can build the viz.

On a new sheet add Quarters Since First Purchase to Columns as a continuous dimension (green pill), then add Sales / Customers to Rows and set to be a Running Total quick table calc. Change the mark type to be a Gantt Bar.

Create a new field

Size

[Tooltip:Difference]*-1

and add this to the Size shelf.

Add a second instance of the Sales / Customer running total calc (press ctrl and click and drag the existing pill in rows to create another next to it). Change the mark type of this to be bar. Remove the Size pill, and then click the Size shelf button, and set the Size to be fixed, aligned left.

Now make the charts dual axis and synchronise the axis. Set the colour of each mark type to the relevant palette, and set the mark borders to None. Hide the right hand axis.

On the All marks card, add the Tooltip:Difference and Count Customers Per Cohort fields to the Tooltip shelf, and amend the tooltip to match.

On the bar marks card, click the Label shelf button and tick the show mark labels checkbox. Align these left.

Remove the left hand axis, remove all gridlines and row/column dividers. Add column axis ruler and dark tick marks

Edit the x-axis and rename the title to Quarter of Order.

If you find you have the x-axis going from 0 to 18, then change the datatype of Quarters Since First Purchase from a whole number to decimal. (right click pill in data pane -> change data type -> Number(decimal).

Add the chart to a dashboard and boom! you’re done. My public viz is here. (note, I did notice some vertical dividers displaying in the area chart on public, but think this is a Tableau Public ‘bug’ as I’ve switched off all the lines I can think of, and Desktop displays fine….

Happy vizzin’!

Donna

Can you create a Waterfall Pipeline using Salesforce data?

For the Salesforce Dreamforce (#DF22) conference, Lorna set this challenge based on using Salesforce data. You could access the data either by creating a Salesforce Developer account, or using the provided csv data set. I chose to use the latter.

I’m not overly familiar with the SF data, so it’s possible in the course of this blog, I may have missed a field in the data set that could have been used instead of whatever technique I describe. Feel free to let me know in the comments if this is the case.

Setting up the calcs

After connecting to the data, the first step was to amend the amount based on whether the Stage Name was ‘Closed Lost’ or not.

Revised Amount

IF [Stage Name]=’Closed Lost’ THEN -1 * [Amount] ELSE [Amount] END

I set the format of this field to be Currency £M with 1 dp.

Now the crux of the waterfall, is that the position of the mark that needs to be plotted it is based on the cumulative sum of the values displayed. So for this we need to create a table calculation

Revised Amount (Running Total)

RUNNING_SUM(SUM([Revised Amount]))

I set the format of this field to be Currency £M with 0 dp.

Let’s put the data into a table to see what’s going on.

The table calculation of the 2nd column (Revised Amount (Running Total)) is computing ‘down’ the table, so for this to work as we require the order of the rows is important. I just manually sorted by dragging each Stage Name into the relevant position. and then added a grand total.

With these 2 fields, we can now build the basic waterfall.

Build the Waterfall

I chose to duplicate the above sheet and then move the pills around as follows

  • Move Stage Name to Columns
  • Move Revised Amount (Running Total) to Rows. Amend the table calculation setting so it is set to explicitly compute using Stage Name.
  • Change the Mark Type to Gantt Bar
  • Move Revised Amount from Text to Size, then double click on the Revised Amount pill so it becomes editable and add * -1 to the end, to invert the value

  • Add Row Grand Total
  • Add Stage Name to Colour and manually adjust each colour accordingly

  • Add Revised Amount to Tooltip and adjust.
  • Edit Revised Amount (Running Total) axis (right click axis -> Edit Axis) and amend axis title
  • Format the Grand Total label (right click label -> Format) and amend says ‘Total’ and is bold.
  • Hide the Stage Name label at the top of the chart (right click label -> hide field names for columns)

Labelling the bars

Labelling the bars isn’t quite as simple as you think it might be, as the position of the label differs depending on whether the Revised Amount is positive (at the top) or negative (at the bottom).

So we need to create some new calculations

LABEL – +ve Revised Amount

IF SUM([Revised Amount])>=0 THEN SUM([Revised Amount])

formatted to £M to 1 decimal place

LABEL – -ve Revised Amount

IF SUM([Revised Amount])<0 THEN SUM([Revised Amount]) END

formatted to £M to 1 dp

Adding these to the tabular layout we had earlier you can see how these fields are behaving.

Switch back to the waterfall chart, and add LABEL – +ve Revised Amount to the Label shelf. Format the label to be smaller font (I chose 8pt), bold and explicitly aligned to the top.

Now duplicate the Revised Amount (Running Total) pill that is on Rows to add another instance directly next to it. I do this by holding down ctrl as I then click and drag the pill. This will create another axis.

On the second marks card, remove LABEL – +ve Revised Amount from Label and add LABEL – -ve Revised Amount instead. Adjust the label so it is aligned at the bottom instead.

Now make the chart dual axis and synchronise the axis.

Then hide the right hand axis, and remove all divider and gridlines.

Then add to the dashboard. I used a vertical layout container so I could add blanks of 1 pixel in height with a black background colour to present the this black lines separating the header and footer text.

My published viz is here.

Happy vizzin’!

Donna

Can you make a bullet chart?

Kyle returned this week to set this challenge to create a bullet chart which compared Win % for baseball teams in a year (bar) to the previous year (line), along with call out indicators (red circles) if the difference was greater than a user defined threshold. To add an extra ‘dimension’ to the challenge, a viz in tooltip showing the historical profile for the selected team was also required.

Building the basic bar chart

We’re going to use a parameter to define which year we want to analyse, as we can’t just filter by the Year as we need information about multiple years

pYear

integer field defaulted to 2018, formatted so the thousands separator does not display, and populated by Add values from the existing Year field

With this, we can create the following fields

Wins – CY

IF Year = [pYear] THEN [Wins] END

Wins – PY

IF Year = [pYear]-1 THEN [Wins] END

Games – CY

IF Year = [pYear] THEN [Games] END

Games – PY

IF Year = [pYear]-1 THEN [Games] END

and subsequently

Wins per Game – CY

SUM([Wins – CY]) / SUM([Games – CY])

and

Wins per Game – PY

SUM([Wins – PY]) / SUM([Games – PY])

Both these 2 fields are formatted to a custom format of ,##.000;-#,##.000 (this basically strips off the leading 0, so rather than 0.595 it’ll display as .595.

Now we have all these, we can build a dual axis chart.

Add League and Team to Rows and Wins per Game – CY to Columns. Sort by Wins per Game – CY descending. Display the pYear parameter.

Add Wins per Game – PY to Columns, make dual axis and synchronise the axis. Change the mark type of the CY card to a bar and the mark type of the PY card to Gantt. Remove Measure Names from the Colour shelf of both cards, and change the colour of the gantt bar to black.

Show mark labels for the bar mark and align left (expand the width of each row if need be).

Colouring the bars

The colour of the bars is based on whether the CY value is greater or less than the PY value. So we need to find the difference

CY-PY DIfference

[Wins per Game – CY]-[Wins per Game – PY ]

and then work out if the difference is positive or not

CY-PY Difference is +ve

[CY-PY Difference]>0

Add this to the Colour shelf of the bar marks card and adjust accordingly.

Adding the call out indicator

The red circle is based on whether the difference is above a threshold set by the user . A parameter is required for this

pChange

float field set to 0.1 by default; the values should range from 0.05 to 0.2 in 0.05 intervals

Show this parameter on the sheet.

We then can create

CY-PY Diff Greater than Change

IF ABS([CY-PY Difference])>[pChange] THEN ‘●’ ELSE ” END

The ABS function is used, as we want to show the indicator regardless as to whether the difference is a +ve or -ve difference. The ● image I get from copying from https://jrgraphix.net/r/Unicode/25A0-25FF. I use this page a lot, so keep it bookmarked.

Add this field to Rows, and then format the field so it is red and the font size is bigger (I used 12pt)

Now the viz just needs to be tidied up by

  • Hide field labels for rows
  • Rotate label of the League field
  • Format the font of the Team text
  • Reduce the width of the first three columns
  • Remove gridlines and zero lines
  • Adjust the row divider to be a dotted line at the 2nd level
  • Remove column dividers
  • Add an axis ruler to Rows
  • Uncheck Show Header on the axis to hide them
  • Tidy up the tooltip on the gantt mark type.

Building the Viz in Tooltip line chart

For this we need the Win % for every year, so we need

Wins per Game

SUM([Wins]) / SUM ([Games])

Add Team to Rows, Year to Columns (change to be a continuous, green pill) and Wins per Game to Rows

Add pYear to the Detail shelf, then add a Reference line to the Year axis to display the value of pYear as as dotted line

Right click on the reference line > format and adjust the alignment of the value displayed to be top centre.

Show mark labels and set to just show the min & max values for each line.

Colouring the lines

The colour of the line is based on whether the current year’s value is bigger or smaller than the previous year (ie the colour of the line matches the bar).

However, we can’t just use the fields we’ve already built, as because we have Year in the view, the data only exists against the appropriate year, so the difference can’t be computed. You can see this better if you build out the table below…

We need to ‘spread’ these values across every year for each team.

Wins per Game – CY Win Max

WINDOW_MAX(SUM([Wins – CY]) / SUM([Games – CY]))

Format this as you did above.

Add this to the view and amend the table calculation so it is computing by the Year field only. You should see that the value of this field matches the Wins per Game – CY value, but the same value is listed against every year now, rather than just the one selected.

Similarly, create

Wins per Game – PY Win Max

WINDOW_MAX(SUM([Wins – PY]) / SUM([Games – PY]))

format, and add this to the view too and adjust the table calculation as you did above.

So now we have this, we can work out whether the difference between these two fields is +ve or not

CY-PY Difference is +ve Win Max

[Wins per Game – CY Win Max]- [Wins per Game – PY Win Max]>0

Add this to the Colour shelf of the line chart. Verfify the table calculation is set to compute by Year only for both nested calculations, and then adjust the colours to match.

Finally tidy up the viz, to remove all headings, axis, gridlines, row/column dividers etc Set the background colour of the sheet to a light grey, then finally set the fit to be Entire View.

Adding the Viz in Tooltip

Go back the bar chart, and on the tooltip of the bar mark, adjust the initial text, then insert the line chart sheet via the Insert > Sheets > Sheetname option.

I adjusted the maxwidth & maxheight properties of the inserted text to be 350px each

If you hover over the bar chart now, you should get a filtered view of the line chart.

Final step is to put all this on a dashboard. My published viz is here.

Happy vizzin’!

Donna

Can you show the top and bottom profit performers for the selected time period?

With #TC21 looming next week, Candra’s set this week’s challenge, based on inspiration from past Tableau Conferences – a simple looking, but effective visualisation for understanding profit performance within some pre-established timeframes.

  • Building the BANs
  • Identifying Top 5 / Bottom 5 / Everything Else
  • Building the Chart and Labelling the Bars
  • Adding the interactivity

Building the BANs

The timeframes we need to report over need to be based on a specific date. In this case it’s the latest date in the data set. If you were using this for a business dashboard, you might be basing it on Today / 1st of the Current Month etc. Rather than hardcode the date I need, I’ve worked out the latest month I want to use by

Max Month

{FIXED:MAX(DATE(DATETRUNC(‘month’,[Order Date])))}

Set all the Order Dates in the data set to be the 1st of the month, then get the maximum of these dates. So as the last date in the data set is 30th Dec 2021, that’s been truncated to 1st Dec 2021 which is then what this field stores.

I then want to capture the profit values for each month, quarter, year into separate fields, so we have

Month

IF DATETRUNC(‘month’, [Order Date])=[Max Month] THEN [Profit] END

This only stores Profit values for rows where the Order Date is also in December

Quarter

IF DATETRUNC(‘quarter’,[Order Date])=DATETRUNC(‘quarter’,[Max Month]) THEN [Profit] END

This only stores Profit values for rows where the Order Date is in the same quarter as December (ie the 4th quarter which is months Oct-Dec).

Year

IF DATETRUNC(‘year’,[Order Date])=DATETRUNC(‘year’,[Max Month]) THEN [Profit] END

This stores Profit data for rows where the Order Date is in the same year.

All these fields are formatted to be $ with 0 dp.

A basic viz can the be built with Measure Names on Columns and Measure Names and Measure Values on Text. The Measure Names heading is then hidden, and the font and table formatting adjusted so the sheet looks as below.

Note – Naming these fields Month, Quarter, Year rather than Monthly Sales, Quarterly Sales etc, makes this display much easier and also helps with the interaction later.

Identifying the Top 5 / Bottom 5 / Everything Else

We need to be able to identify the Sub-Categories which have the best profits, those that have the worst, and the ‘rest’. We’re going to use Sets to help us with this. However the set entries could change depending on whether we’re looking by month, by quarter or by year. So first we need to create a field that is going to store the particular Profit value we need depending on what time period is being selected.

We need a parameter pDatePart to capture the time frame. This is a string field which is just defaulted to the text ‘Month’.

The interactivity later will set this parameter to the different values.

So now we know the ‘selected’ date part, we need to get the appropriate profit value

Value To Plot

CASE [pDatePart]
WHEN ‘Month’ THEN [Month]
WHEN ‘Quarter’ THEN [Quarter]
ELSE [Year]
END

This just uses the values from the 3 measures we created to start with.

So now we can create the sets we need. Right click on Sub-Category > Create > Set and create a set called Top 5 that is based on the Top 5 Value to Plot values

Then create another set in the same way called Bottom 5

With these sets, we can now determine the Sub-Category ‘label’ that will be displayed

Sub-Category Display

IF [Top 5] OR [Bottom 5] THEN [Sub-Category] ELSE ‘Everyone Else’ END

and the grouping that will be used to colour the bars

Sub Cat Group

IF [Top 5] THEN ‘Top 5’
ELSEIF [Bottom 5] THEN ‘Bottom 5’
ELSE ‘Everything Else’
END

Building the Chart & Labelling the Bars

Ok, so now we’ve got the building blocks in place, we can build the chart. You will probably be tempted to build a bar chart (I did to start with), but positioning the labels then became a bit tricksy. When we get to the labels, we’re going to need to use the left and right alignment options. However, when you build a bar chart, if you right align the label, the label will be positioned outside at the end of the bar (even though this seems a little odd with negative values, as it looks to be on the left…).

Right aligned labels

But then we set the labels to be left aligned, the labels appear inside the bar instead, and not outside on the left.

Left aligned labels

So instead, rather than using the bar mark type, we need to build this chart using the gantt mark type, and base the Size on the Value to Plot field.

However, the value being plotted is actually an average value based on the number of Sub-Categories being ‘grouped’ as otherwise the value associated to Everything Else can end up bigger than all the rest. I created the following field

Avg Value To Plot

SUM([Value to Plot])/COUNTD([Sub-Category])

formatted to $ with 0dp.

So now we start building by adding Sub-Category Display to Rows and type in MIN(0) into Columns. Change the mark type to Gantt and add Avg Value To Plot to Size. Add Sub-Cat Group to Colour and adjust accordingly. Sort the Sub-Category Display field by Avg Value To Plot descending.

Now we can’t just label by a single field of the value or the sub-category, as while the ‘automatic’ label alignment option, almost puts the labels in the right positions, there is no way to define an ‘opposite’ to the ‘automatic’ alignment. We need to define some dedicated label fields based on where we want them to display.

Label – Left – Profit

IF [Avg Value To Plot]<0 THEN [Avg Value To Plot]
END

If we’re in the bottom half of the chart, we’re going to display the Profit value on the left side.

Label – Left – Sub Cat

IF [Avg Value To Plot]>=0 THEN ATTR([Sub-Category Display])
END

If we’re in the top half of the chart, we’re going to display the Sub-Category Display on the left side.

Add both these fields to the Label shelf and then adjust the label alignment to be left.

To label the other ends, we need to create two further label fields

Label – RightProfit

IF [Avg Value To Plot]>=0 THEN [Avg Value To Plot]
END

Label – Right – Sub Cat

IF [Avg Value To Plot]<0 THEN ATTR([Sub-Category Display])
END

We then need to create another MIN(0) on Columns (easiest way is to hold down control, then click on the existing MIN(0) field and drag it next to itself to create a duplicate. Then on the 2nd marks card, remove the two Label – Left – xxx fields and add the two Label – Right -xxx fields. Change the alignment to right.

The make the chart Dual Axis and synchronise the axis.

Now you can hide the Sub-Category Display header from showing, hide the axis, remove gridlines etc.

Adding the interactivity

Once the two sheets are on the dashboard, you can add a dashboard parameter action which will on select of the KPI/BAN chart, pass the Measure Name into the pDatePart parameter. When the mark is unselected, the parameter value should stay as it is.

And hopefully, you should now have a working viz. My published version is here.

Enjoy #TC21!

Happy vizzin’! Stay Safe!

Donna

When do extracts run during the day?

When I was approached by the #WOW crew to provide a guest challenge, I was a little unsure as to what I could do. I primarily work as a Tableau Server admin, so rarely have a need to build dashboards (which is why I like to do the weekly #WOW challenges, to keep up my Desktop skills). Then the next day I was looking at a dashboard I’d built to monitor extracts on our Tableau Servers, and I thought it would be an ideal candidate for a challenge. I also thought it would provide any users of Tableau Server with the opportunity to implement this dashboard in their own organisation if they wished, by sharing with their Server Admins.

As a Tableau Server Admin, you get access to a set of ‘out of the box’ Admin views, one of which is called ‘Background Tasks for Extracts’ which gives you a view of when extracted data sources and workbooks run on the server. However while the provided view is fine if you want to quickly see what’s going on now, it’s not ideal if you want to see how things ran over a longer timeframe – it involves a lot of horizontal scrolling.

Many server admins will have ‘opened’ up access to the Tableau repository, the PostgreSQL database which stores a rich array of data, all about your Tableau Server [see here for further info], and enables admins to extend their analysis beyond the provided Admin views. This site even provides a set of pre-curated data sources to help you get started! These aren’t formally supported by Tableau, but is the brain-child of Matt Coles, a server admin at Tableau (no relation to me though!).

My dashboard doesn’t actually use one of these data sources though. For the challenge, I’ve just created some sample, anonymised data in the required structure. I’ll explain later at the end of the post how to go about converting this to use ‘real’ server data, if you do want to plug it into your own server environment.

Understanding the data

When using Tableau Server, published data sources and workbooks can connect to their underlying data source (eg a SQL Server database, an Excel file etc) directly (ie via a live connection) or via an extract. An extract means that a copy of the data is pulled from the underlying data source and stored on Tableau Server utilising Tableau’s ‘in memory’ engine when the data is then queried. An extract gets added to a schedule which dictates the time when the extract will get refreshed; this may be weekly, daily, hourly etc. Every time the extract runs, a background task is created which provides all the relevant details about that task. The data for this challenge provides 1 row for each extract task that was created between Monday 11th Jan 2021 and Friday 5th Feb 2021. The key fields of note are:

  • Id – uniquely identifies a task
  • Created At – when the task was created
  • Started At – when the task actually started running (if too many tasks are set to run at the same time, they will queue until the server resources are available to execute them).
  • Completed At – when the task finished, will be NULL if task hasn’t finished.
  • Finish Code – indicates the completion status of the job (0=success, 1=failed, 2= cancelled)
  • Progress – supposed to define the % complete, but has been observed to only ever contain 0 or 100, where 100 is complete.
  • Title – the name of the extract
  • Site – the name of the site on the server the extract is associated to

Based on the Finish Code and Progress, I have derived a calculated field to determine the state of the extract (to be honest, I think this is a definition I have inherited from closer analysis of the Background Tasks for Extracts Server Admin view, so am trusting Tableau with the logic).

Extract Status

IF [Finish Code] = 1 AND [Progress] <> 100 THEN ‘In Progress’
ELSEIF [Finish Code] = 0 AND NOT [Progress] = 1 THEN ‘Success’
ELSE ‘Failed’
END

Building the required calculated fields

The intention when being used ‘in real life’, is to have visibility of what’s going on ‘Now’ as well as how extracts over the previous few days have performed. As we’re working with static data, we need to hardcode what ‘Now’ is. I’ll use a parameter for this, so that in the event you do choose to plug this into your own server, you only have to replace any reference to the Now parameter with the function NOW().

Now

Datetime parameter defaulted to 05 Feb 2021 16:30

The chart we are going to build is a Gantt chart, with 1 bar related to the waiting time of the task, and 1 bar related to the running time of the task. We only have the dates, so need to work out the duration of both of these. These need to be calculated as a proportion of 1 day, since that is what the timeframe is displayed over.

Waiting Time

(DATEDIFF(‘second’, [Created At], IF ISNULL(Started At]) THEN [Now] ELSE [Started At ] END))/86400

Find the difference in seconds, between the create time and start time (or Now, if the task hasn’t yet started), and divide by 86400 which is the number of seconds in a day.

We repeat this for the processing/running time, but this time comparing the start time with the completed time.

Processing Time

(DATEDIFF(‘second’, [Started At], IF ISNULL([Completed At]) THEN [Now] ELSE [Completed At] END))/86400

As mentioned the timeframe we’re displaying over is a 24 hr period, and we want to display the different days over multiple rows, rather than on a single continuous time axis spanning several days.

To achieve this, we need to ‘baseline’ or ‘normalise’ the Created At field to be the exact same day for all the rows of data, but the time of day needs to reflect the actual Created At time . This technique to shift the dates to the same date, is described in this Tableau Knowledge Base article.

Created At Baseline

DATEADD(‘day’, DATEDIFF(‘day’, [Created At], #2021-01-01#), [Created At])

And again, we’re going to need to do a similar thing with the Started At field

Started At Baseline

DATEADD(‘day’, DATEDIFF(‘day’, [Started At], #2021-01-01#), [Started At])

Putting this out into a table, you can see what this data all looks like (note, I’m just choosing a arbitrary date of 01 Jan 2021, so my baseline dates are all on this date:

Building the Gantt chart

We’re going to build a dual axis Gantt chart for this.

  • Add Site to Rows
  • Add Title to Rows
  • Add Created At to Rows. Set it to the day/month/year level and set to be discrete (ie blue pill). Format the field to custom format of ddd dd mmm yyyy so it displays like Mon 11 Jan 2021 etc
  • Add Created At Baseline to Columns, set to exact date
  • Add Waiting Time (Avg) to Size and adjust to be thin

This will automatically create a Gantt chart view

Next

  • Add Started At Baseline to Columns, set to exact date, and move so the pill is now placed to the right of the Created At Baseline pill
  • On the Started At Baseline marks card, remove Waiting Time and add Processing Time (Avg) to the Size shelf instead. Adjust so the size is thicker.
  • Set the chart to be dual axis and synchronise the axes

The thicker bars based on the Started At / Processing Time need to be coloured based on Extract Status. Add this field to the Colour shelf of the Started At Baseline marks card and adjust accordingly.

The thinner bars based on the Created At / Waiting Time need to be coloured based on how long the wait time is (over 10 mins or not).

Over Wait Time Threshold

[Waiting Time] > 0.007

0.007 represents 10 mins as a proportion of the number of minutes in a day (10 / (60*24) ).

Add this field to the Colour shelf of the Created At Baseline marks card and adjust accordingly (I chose to set to the same red/grey values used to colour the other bars, but set the transparency of these bars to 50%).

Formatting the Tooltips

The tooltip for the Waiting Time bar displays

The Created At Baseline and Started At Baseline should both be added to the Tooltip shelf and then custom formatted to h:mm am/pm

The Waiting Time needs to be custom formatted to hh:mm:ss

The tooltip for the Processing Time bar is similar but there are small differences in the display,

Formatting the axes

The dates on the axes are displayed as time in am/pm format.

To set this, the Created At Baseline / Started At Baseline pills on the Columns shelf need to be formatted to h:mm am/pm

Adding the reference band

The reference band is used to highlight core working hours between 8am and 5pm. Right click on the Created At Baseline axis and Add Reference Line. Create a reference band using constants, and set the fill colour accordingly.

Apply further formatting to suit – adjust sizes of fonts, add vertical gridlines, hide column/axes titles.

Filtering the dates displayed

As discussed above, when using this chart in my day to day job, I’d be looking at the data ‘Now’. As a consequence I can simply use a relative date quick filter on the Started At field, which I default to Last 7 days.

However, as this challenge is based on static data, we need to craft this functionality slightly differently.

We’re only going to show up to 10 days worth of data, and will drive this using a parameter.

pDaysToShow

An integer parameter, ranging from 1 to 10, defaulted to 7, and formatted to display with a suffix of ‘ days’.

We then need a calculated field to use to filter the dates

Filter : Days to Show

DATETRUNC(‘day’,[Created At]) >= DATEADD(‘day’,([pDaysToShow]-1)*-1,DATETRUNC(‘day’,[Now]))

Add this to the Filter shelf and set to True.

Additionally, the chart can be filtered by Site, so add this to the Filter shelf too.

Building the Key legend

Some people may build this by adding a separate data source, but I’m just going to work with the data we have. This technique is reliant on knowing your data well and whether it will always exist.

On a new sheet, add Site to the Filter shelf and filter to sites 7 and 9.

Create a new field

Key Label

If [Site] = ‘Site 9’ THEN ‘Waiting’ ELSE ‘Processing’ END

and add this to the Columns shelf and sort the field descending, so Waiting is listed before Processing.

Alongside this field, type directly into the Columns shelf MIN(1).

Edit the axes to be fixed to from 0 to 1. Then add the Site field to the Colour shelf and also to the Size shelf and adjust accordingly (you may need to reverse the sizes). I lightened the colour by changing the opacity to 50%.

Now hide the axes, remove row & column borders, hide the column title and turn off tooltips.

The information can all now be added to a dashboard.

Using your own data

To use this chart with your own Tableau Server instance, you need to create a data source against the Tableau postgres repository that connects to the _background_tasks (bgt) table with an inner join to the _sites (s) table on bgt.site_id = s.Id. Rename the name field from the _sites table to Site. If you don’t use multiple sites on your Tableau Server instance, then the join is not required. The sole purpose of the join is to get the actual name of the site to use in the display/filter.

You should then be able to repoint the data source from the Excel sheet to the postgres connection. You may find you need to readjust some of the colours though.

When I run this, I’m using a live connection so I can see what is happening at the point of viewing, rather than using a scheduled extract. To help with this, I add a data source filter to limit the days of data to return from the query (eg Created at <=10 days), which significantly reduces the data volume returned with a live connection.

Hopefully you enjoyed this ‘real world’ challenge, and your server admins are singing your praises over the brilliance of this dashboard 🙂

My published version is here.

If you’ve got any feedback or suggestions on improvements to enhance the viz even further, please do let me know.

Happy vizzin’! Stay safe!

Donna