Can you label & sort small multiples?

A colourful #WOW2022 challenge this week set by Kyle Yetter and using his favourite data – Baseball. Let’s jump straight in.

Building the required calculations

First up we need to calculate the core measure the viz is based on – % of wins

Win %

SUM([Wins])/SUM([Games])

I formatted this to 3 decimal places, then applied a custom number format to remove the leading 0 (custom number format looks like #,##.000;-#,##.000).

We also need to know the number of losses as this is part of the tooltip.

Losses

SUM([Games]) - SUM([Wins])

Let’s pop all these out into a table (I formatted all the whole numbers to display without any decimal places).

The viz however isn’t plotting the actual Win%, it’s plotting the difference from 50% (or 0.5), so values less than 50% are negative and those above are positive.

Plot Position

[Win %] - 0.5

And we also need to know whether the Win% is above 50% or not

Above 50%

[Win %]>0.5

Pop these out onto the table too
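If it helps to see the four calculations side by side outside Tableau, here is a minimal Python sketch of the same arithmetic for a single team-season. The function name and the sample numbers are made up for illustration; only the formulas come from the calculated fields above.

```python
def season_metrics(wins: int, games: int) -> dict:
    """Mirror Win %, Losses, Plot Position and Above 50% for one team-season."""
    win_pct = wins / games  # Win %: SUM([Wins]) / SUM([Games])
    return {
        "win_pct": win_pct,
        "losses": games - wins,          # Losses: games that were not wins
        "plot_position": win_pct - 0.5,  # distance from the 50% line
        "above_50": win_pct > 0.5,       # exactly 50% counts as False
    }


# Illustrative numbers only, not taken from the challenge data
print(season_metrics(100, 160))
```

Note that a team on exactly 50% has a Plot Position of 0 and is not flagged as above 50%, because the comparison is strictly greater than.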

The viz also displays the overall Win% for each team, and also uses this to sort the data. As it is used for sorting, we need to use an LoD calculation (rather than a table calculation).

Overall Win% LOD

{FIXED [Team]:SUM([Wins])} / {FIXED [Team]: SUM([Games])}

for each team, get the total wins, and divide by the total games for the team. Format this to 3 dp with no leading 0 as before.

Pop this into the view (you’ll see it’s the same value for each row for a single team), and then apply a Sort on the Team field to sort descending by the Overall Win% LOD.
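As a rough picture of what the FIXED calculation and the sort are doing, the sketch below totals wins and games per team across all seasons, divides the two, and orders the teams by the result. It assumes the data is one row per team per season in the form (team, wins, games); the team codes and numbers are invented.

```python
from collections import defaultdict


def overall_win_pct(rows):
    """rows: iterable of (team, wins, games), one row per team-season."""
    wins = defaultdict(int)
    games = defaultdict(int)
    for team, w, g in rows:
        wins[team] += w    # {FIXED [Team]: SUM([Wins])}
        games[team] += g   # {FIXED [Team]: SUM([Games])}
    return {team: wins[team] / games[team] for team in wins}


def teams_sorted_desc(rows):
    """Team names ordered by overall win % descending, like the Sort on Team."""
    overall = overall_win_pct(rows)
    return sorted(overall, key=overall.get, reverse=True)


# Hypothetical sample rows
sample = [("AAA", 90, 162), ("AAA", 80, 162), ("BBB", 100, 162), ("BBB", 95, 162)]
print(teams_sorted_desc(sample))
```

Because the totals are summed before dividing, this is not the same as averaging each season's Win %, which matters if seasons have different numbers of games.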

Now we have the data sorted, we can create the fields needed to build the trellis chart.

I have already blogged challenges relating to trellis charts / small multiples (see here) which in turn reference other blogs in the community, so I’m not going to go into all the details. We just need to build two calculated fields to identify which row and which column each Team will sit in. The table is fixed at 6 columns wide as the data we are using is static. Some solutions work with a more dynamic layout depending on how many entities you need to display. We’re keeping things simpler.

Cols

FLOAT(INT((INDEX()-1)%6))

Rows

FLOAT(INT((INDEX()-1)/6))

Add both these fields to the table as discrete dimensions (blue pills), and as they are both table calculations, set them both to Compute Using – > Team.
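To see how the two table calculations lay the teams out, here is a small Python sketch of the same modulo and integer-division logic. The index is 1-based to match INDEX(), and the 6-column width is the fixed value used above; the function name is just for illustration.

```python
def trellis_position(index: int, n_cols: int = 6):
    """Return (row, col) for a 1-based position in the sorted list of teams."""
    col = (index - 1) % n_cols    # Cols: (INDEX()-1) % 6
    row = (index - 1) // n_cols   # Rows: INT((INDEX()-1) / 6)
    return row, col


# The first six teams fill row 0, the seventh starts row 1
for i in (1, 6, 7, 30):
    print(i, trellis_position(i))
```

Since INDEX() follows the sort on Team, the best overall Win% lands in the top-left cell and the grid fills left to right, then top to bottom.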

Building the Core Viz

On a new sheet, add Cols to Columns as discrete dimension, Rows to Rows as discrete dimension and Team to Detail. Set both Rows and Cols to Compute Using Team.

Add Year as continuous (green) pill to Columns and Plot Position to Rows and change the mark type to Bar and reduce the size. Sort the Team field based on Overall Win% LOD descending.

Add Wins, Losses, and Win% to the Tooltip shelf and adjust the tooltip to display as required. Add Above 50% to the Colour shelf (you may need to readjust the size). Leave the colours as they are for now – we’ll deal with this later.

Adding the labels

Create a new calculated field

Dummy Plot

FLOAT(IF [Year] = 2000 OR [Year] = 2020 THEN 0.35 END)

This is basically going to position a mark at height 0.35 but only if the year is either 2000 or 2020. These values were all just based on a bit of trial and error as to what worked to get the desired result.

Also create a field

LABEL:Team

IF [Year]=2000 THEN [Team] END

and

LABEL:Win%

IF [Year]=2020 THEN [Overall Win% LOD] END

format this to 3dp and exclude the leading 0.
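The three label fields all follow the same pattern: return a value for one or two specific years and nothing otherwise. A minimal Python sketch of that idea follows, with None standing in for Tableau's NULL. The function name and sample values are hypothetical; the years and the 0.35 height are the ones used above.

```python
def label_fields(year: int, team: str, overall_win_pct: float) -> dict:
    """Mirror Dummy Plot, LABEL:Team and LABEL:Win% for a single mark."""
    return {
        # A mark is only plotted at the first and last year of the axis
        "dummy_plot": 0.35 if year in (2000, 2020) else None,
        # Team name is labelled on the left-hand mark only
        "label_team": team if year == 2000 else None,
        # Overall win % is labelled on the right-hand mark only
        "label_win_pct": overall_win_pct if year == 2020 else None,
    }


print(label_fields(2000, "AAA", 0.543))
print(label_fields(2010, "AAA", 0.543))
```

Every other year returns None for all three fields, which is why Tableau shows a null indicator that later needs hiding.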

Add Dummy Plot onto Rows and change the mark type of this measure to circle. Amend the Tooltip of this marks card so it’s empty.

Add LABEL:Team and LABEL:Win% to the Label shelf, and adjust the label so both fields sit side by side (only one of the two will ever actually display a value). Adjust the table calculation of both the Rows and Cols pills so they now compute using both the Team and the LABEL:Team fields.

Adjust the alignment of the labels so they are positioned bottom centre. Set the font colour to match mark colour and bold.

Then reduce the size of the circle mark to as small as possible, reduce the opacity of the mark colour to 0.

Now make the chart dual axis and synchronise the axis. Remove the Measure Names field that has automatically been added to the All marks card.

Hide all the headers and axis (uncheck Show Header), remove all grid lines, zero line, axis rulers.

Hide the null indicator (bottom right).

Colouring by Team

Copy the colour palette text Kyle provided into your preferences.tps file (usually located in the My Tableau Repository directory). For more information on working with custom colour palettes see this Tableau help article.

You’ll need to save your workbook and re-open for the new palette to be available for use.

In order to prevent having to manually set all the colours (and believe me you don’t want to do this!), perform the following steps in order

  • Add Team to also be on the Colour shelf. Click on the 3 dots (…) that are to the left of the Team pill on the All marks card, and change it to Colour. This means there are now 2 fields on colour. Move the Team field so it is listed above the Above 50% pill. This means your colour legend should be listed as <Team>, <True|False>
  • Adjust the Sort of the Above 50% pill, so it is manually sorted to list True before False.

  • Now change the Sort on the Team field so it is sorted alphabetically ascending instead. This will cause the viz to change its sort order, but don’t worry for now. It also changes the list on the colour legend, so ARI, True is listed first then ARI, False etc.

  • Now edit the Colour Legend and select the new MLB Team Colours palette we added. Click the Assign Palette button to automatically assign the colours. As we’ve made sure the entries listed are in the right order, they should get the correct colours.

  • Change the Sort on the Team field back to be based on Overall Win% LOD descending

And that should be it. You can now add the viz to a dashboard and publish. My published version is here.

Happy vizzin’!

Donna

Selected Sub-Category Influence

Ann Jackson made a special guest appearance this week, setting this challenge to introduce the newly released 2022.3 feature of dynamic zone visibility. As a consequence, a pre-requisite to completing this challenge is to install v2022.3 🙂

I used 6 sheets to build my solution, and I’ll step through each one, and then what’s required to put it all together.

  • Building the Sub-Category Picker
  • Building the Sub-Category Counter
  • Building the Bar Chart
  • Building the Line Chart
  • Building the Viz in Tooltip
  • Building the Dot Plot
  • Hiding & Showing the Charts
  • Adding & Removing Sub-Categories

Building the Sub-Category Picker

On a new sheet, add Sub-Category to Rows and change the mark type to circle. Right click on the Sub-Category field in the Data pane, and Create -> Set. Select the entries from Accessories down to Copiers. This will create a new field in the data pane called Sub-Category Set. Add this field to the Colour shelf, and adjust colours according to whether the values are In or Out of the set.

Add a grey border around the circles (via the Colour shelf) and increase the size a bit. Format the text of the sub-categories so it’s larger and right aligned. Remove all row/column dividers and hide field labels for rows to remove the Sub-Category column title (right-click on the column title). Adjust the Tooltip. Add a title to the sheet and then name the sheet Sub Cat Picker or similar.

Building the Sub-Category Counter

Create a new calculated field

Count Sub Cats Selected

COUNTD(IF [Sub-Category Set] THEN [Sub-Category] END)

If the Sub-Category is in the set, then return the Sub-Category and count the number of distinct entries.

Then create

Count Total Sub Cats

COUNTD([Sub-Category])

This just counts them all.
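In plain terms, the two counters are a distinct count of the sub-categories that are in the set, and a distinct count of all of them. Here is a short Python sketch of that logic, assuming the data arrives as one sub-category value per row; the function name and the sample list are illustrative.

```python
def count_sub_cats(sub_categories, selected_set):
    """Return (selected distinct count, total distinct count)."""
    # COUNTD(IF [Sub-Category Set] THEN [Sub-Category] END)
    selected = {s for s in sub_categories if s in selected_set}
    # COUNTD([Sub-Category])
    total = set(sub_categories)
    return len(selected), len(total)


# Rows repeat sub-categories, so the distinct count matters
rows = ["Accessories", "Chairs", "Accessories", "Copiers", "Tables"]
print(count_sub_cats(rows, {"Accessories", "Copiers"}))
```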

On a new sheet, add both fields to the Text shelf, and adjust the text accordingly.

Remove the tooltip so it doesn’t display. Name the sheet Set Count Label or similar.

Building the Bar Chart

Create a new field

Profit Ratio

SUM([Profit])/SUM([Sales])

and format to a % with 0 dp.

On a new sheet, add Profit Ratio to Rows and Sub-Category Set to Columns and also to Colour.

Right click on the text ‘In’ either on the colour legend or at the bottom of the bar, and Edit Alias. Change the text to SELECTED. Do the same thing for the text ‘Out’ and change to OTHER.

Add a row grand total (Analysis menu -> Totals -> Show Row Grand Totals). Adjust the colour of the grand total bar. Right click on the text ‘Grand Total’ and select Format. In the pane on the left hand side, change the Grand Total Label to read ALL PRODUCTS.

Create a calculated field

Overall PR

{SUM([Profit])}/{SUM([Sales])}

The { } make this a FIXED Level of Detail (LoD) calculation, so calculates over the complete data set.

Then create

PR Difference from Overall

[Profit Ratio]-SUM([Overall PR])

and format this to a custom number format of +0.0%;-0.0%;

Adding the second semi-colon implies there is a format for +ve numbers, -ve numbers and zero. In this instance we want zero difference to be displayed as blank.
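The effect of the three-part format can be sketched in Python as below. This is only an approximation of Tableau's number formatting: it treats zero as an exact 0 and the function name is made up.

```python
def format_pr_difference(value: float) -> str:
    """Approximate the +0.0%;-0.0%; custom format (blank for zero)."""
    if value == 0:
        return ""               # third, empty section: show nothing for zero
    return f"{value:+.1%}"      # explicit sign, one decimal place, % suffix


for v in (0.123, -0.05, 0):
    print(repr(format_pr_difference(v)))
```

The blank zero case is what keeps the ALL PRODUCTS bar from showing a pointless +0.0% label, since its difference from the overall ratio is zero.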

Add both these fields to the Label shelf and adjust the font/layout accordingly, and match mark colour. Adjust the tooltip too.

Remove the profit ratio axis, remove all gridlines and row/column dividers. Add an axis ruler to the columns. Adjust the colour/size of the column labels. Hide the In/Out of Sub-Category Set column label (hide field label for columns). Add a title to the sheet and name the sheet Bar Chart or similar.

Building the Line Chart

On a new sheet, add Order Date to Columns and set to be a continuous (green) pill at the Quarter-Year level. Add Profit Ratio to Rows and Sub-Category Set to Colour.

Create a new calculated field

PR Per Quarter

{FIXED DATETRUNC('quarter', [Order Date]):SUM([Profit])} /
{FIXED DATETRUNC('quarter', [Order Date]):SUM([Sales])}

and format this to % with 0dp.
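A way to think about PR Per Quarter is that it ignores whatever is on Colour and works out the profit ratio for every order in the quarter. The Python sketch below shows the same grouping, assuming rows of (order date, profit, sales); the helper names and sample rows are hypothetical.

```python
from collections import defaultdict
from datetime import date


def quarter_start(d: date) -> date:
    """Equivalent of DATETRUNC('quarter', d): first day of the quarter."""
    return date(d.year, 3 * ((d.month - 1) // 3) + 1, 1)


def pr_per_quarter(orders):
    """orders: iterable of (order_date, profit, sales). Returns ratio per quarter."""
    profit = defaultdict(float)
    sales = defaultdict(float)
    for order_date, p, s in orders:
        q = quarter_start(order_date)
        profit[q] += p
        sales[q] += s
    return {q: profit[q] / sales[q] for q in sales}


sample = [
    (date(2021, 1, 15), 10.0, 100.0),
    (date(2021, 3, 31), 30.0, 100.0),
    (date(2021, 4, 1), 5.0, 50.0),
]
print(pr_per_quarter(sample))
```

Because the grouping key is only the quarter, the result is the same whether a row is in or out of the Sub-Category Set, which gives the single overall line on the chart.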

Add this to Rows next to Profit Ratio.

Make the chart dual axis and synchronise the axis. Remove Measure Names from the All marks card, and remove the In/Out Sub-Category Set pill from the colour shelf of the PR Per Quarter marks card. Manually adjust the colour of this line to the appropriate shade.

On the All marks card, click the Label shelf, and check the Show mark labels option and select line ends. Adjust the font of the labels to be smaller, bold and to match mark colour.

Right click on the PR Per Quarter axis on the right hand side and select move marks to back, then right click again and uncheck Show Header to hide that axis.

Right click on the bottom axis, and Edit Axis and remove the axis title. Edit the left hand axis and amend the title so it’s capitalised.

Format the font of both axes, so the text is smaller, and then remove all row/column dividers, and all gridlines and zero lines. Add axis rulers for both the rows and columns.

Then add a sheet title and subtitle and name the sheet Line Chart or similar. I use this site to get the circular symbols used in the subtitle.

Building the Viz in Tooltip

The line chart shows another chart on hover.

On a new sheet, add Order Date as a blue discrete pill set to the Quarter-Year level, to Rows then add Sub-Category Set to Rows too. On the Columns shelf, double click and manually type in MIN(1). Add Sub-Category Set to Colour, then edit the MIN(1) axis to fix it from 0 to 1.

Add subtotals (Analysis menu -> Totals -> Add all subtotals). This will add a Total row to each section. Manually adjust the colour of the Total bar if need be via the colour legend.

Right click on the text ‘Total’ in the chart, and format. Amend the Total label to read ‘ALL PRODUCTS’ instead.

Add Profit Ratio to the Label and ensure the font matches mark colour. You may need to adjust the font size and boldness, and expand the row height a bit to see the text.

Hide the Order Date column, adjust the font of the Sub-Category Set column to be darker/bolder and right aligned, and adjust the column width so all the text is displayed.

Hide the MIN(1) axis, remove all row/column dividers and hide the Sub-Category Set column label. Then set the sheet to Entire View, and name the sheet VIT or similar.

Return to the Line Chart sheet, and on the Tooltip shelf of the All marks card, adjust the tooltip to display the Order Date and insert a reference to the VIT sheet via the Insert -> Sheets -> VIT option.

Adjust the height and width to suit.

Building the Dot Plot

On a new sheet, add Sub-Category Set to Rows and Profit Ratio to Columns. Change the mark type to circle. Then add Product ID to the Detail shelf and Sub-Category Set to Colour. Add column grand totals and adjust the colour of the grand total if need be. Format the ‘Grand Total’ text so it reads ALL PRODUCTS.

To get a clearer idea of how many products there are, we are going to randomly spread the dots across a vertical y-axis. For this we create

Jitter

RANDOM()

This just returns a number between 0 and 1.

Add this field to Rows and change it to be a Dimension. Adjust the opacity of the Colour to 50%.

Hide the Jitter axis. Make the header column wider, so the text doesn’t wrap, and adjust the text to be bigger and bolder and right aligned. Hide the Sub-Category Set column heading. Adjust the size and title of the Profit Ratio axis. Remove all gridlines and column dividers.

Right click on the Profit Ratio axis and add a reference line, which is set per pane to the Total of the Profit Ratio. Use the Value as label and set the line to be a dotted black line at 100% opacity.

Add Product Name to the Tooltip and adjust accordingly. Add a title and name the sheet Dot Plot or similar.

Hiding & Showing the Charts

We’re going to control which sheet displays by use of a parameter, so I created

pChartSelector

an integer list from 1-3 which are mapped to the 3 display values

Then create a dashboard sheet and using layout containers build out the dashboard. I used a horizontal container in the centre of my dashboard. Within that I used a vertical container to house the Sub-Category Picker and the Sub-Category Counter. Then the 3 charts (bar, line and dot plot) were arranged next to that. I fixed the width of the vertical container with the picker and counter. The pChartSelector parameter is then added at the top right. I made use of both inner and outer padding and background colours of pale grey and white to get the look as required.

To make the hide/show functionality, I created the following fields

Show bar

[pChartSelector]=1

Show line

[pChartSelector]=2

Show dot plot

[pChartSelector]=3

I added Show bar to the Detail shelf of the bar chart sheet, Show line to the Detail shelf of the line chart sheet and Show dot plot to the Detail shelf of the dot plot sheet.

Then back on the dashboard, I selected the bar chart sheet (so it’s surrounded by a dark grey border), and on the Layout tab on the left hand side, I checked the Control visibility using value checkbox and selected the Show bar field

I then repeated this process, this time selecting the line chart sheet, and when I checked the Control visibility using value checkbox, I selected the Show line field instead. This made the line chart disappear, since my parameter was set to ‘Compared to the Total’, which is equivalent to the parameter = 1 and not 2. When I changed the parameter to ‘Over Time’, the line chart showed and the bar chart disappeared.

Repeat the process again for the dot plot, selecting the show dot plot field instead. Now only 1 chart should display at a time.
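The three boolean fields boil down to a simple lookup from the parameter value to which chart is visible. Here is a tiny Python sketch of that mapping; the function name and dictionary keys are only illustrative, while the 1, 2, 3 values are those of pChartSelector.

```python
def chart_visibility(p_chart_selector: int) -> dict:
    """Mirror Show bar / Show line / Show dot plot for a parameter value."""
    return {
        "Bar Chart": p_chart_selector == 1,   # Show bar
        "Line Chart": p_chart_selector == 2,  # Show line
        "Dot Plot": p_chart_selector == 3,    # Show dot plot
    }


for value in (1, 2, 3):
    print(value, chart_visibility(value))
```

As the parameter can only hold one value at a time, exactly one of the three fields is True, so only one chart takes up space in the container.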

Adding & Removing Sub-Categories

The final step is to add the interactivity to allow selection and removal of a sub-category when clicking on the circles of the Sub-Category Picker sheet.

First, you need to add a dashboard action which changes set values

Add Sub-Categories

Uses the Sub Cat Picker sheet as source and on Select targets the Sub-Category Set by Adding values to the set. The values are retained when the selection is cleared.

Then add another dashboard action to change set values. This one is called

REMOVE

Uses the Sub Cat Picker sheet as source and via the Menu targets the Sub-Category Set by Removing values from the set. The values are retained when the selection is cleared.

The title of REMOVE is what is then displayed in the text of the tooltip when a circle that has been added is then clicked again.
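Conceptually the two set actions are just a union and a difference on the current set members. The Python sketch below models that, assuming the set is a plain collection of sub-category names; it is only a model of the behaviour, not how Tableau implements set actions.

```python
def add_sub_categories(current: set, clicked: set) -> set:
    """'Add Sub-Categories' action: clicked values are added to the set."""
    return current | clicked


def remove_sub_categories(current: set, clicked: set) -> set:
    """'REMOVE' menu action: clicked values are taken out of the set."""
    return current - clicked


selected = {"Accessories", "Copiers"}
selected = add_sub_categories(selected, {"Tables"})
selected = remove_sub_categories(selected, {"Copiers"})
print(sorted(selected))
```

Neither function changes anything when the selection is cleared, which matches the option to retain the set values in both actions.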

Phew!

Quite a lengthy post this week, but there’s a lot going on. My published viz is here.

Happy vizzin’!

Donna

Can you use Tableau to estimate Aaron Judge’s home run trajectories?

It’s community month still for #WOW2022, and this week saw Samuel Epley set this challenge to visualise the home run trajectories of Aaron Judge.

I had a little mini-break to Rome this week, so was hoping I was going to be able to get this week’s challenge done and dusted on the Tuesday evening if it landed early enough, as I wasn’t going to be around.

It did land on the Tuesday for me, but wow! it was not going to be easy! I managed to build the KPIs & the scatter plots on the Tuesday evening, and knowing I didn’t have much time, just chose to use the Home Runs stats data set only. I knew these charts weren’t going to need any data densification, so found this approach simpler.

I’m afraid I’m still constrained by time at the moment, so this post isn’t going to be the detailed walkthrough you might usually expect – sorry! I’m just going to try to pull out key points from each chart.

KPIs

I built this on a single sheet, using Measure Names and Measure Values.

I used aliases on the Measure Names (right click -> Aliases) to change the label you can see displayed ie the Distance pill is aliased to ‘Average Distance’

I also custom formatted the various numbers and applied suffixes to display the unit of measure

Note – to get the degree symbol, I typed Alt+0176

Scatter Plots

I built the Exit Velocity by Distance scatter plot first, and completed all the formatting & tooltips. Then I duplicated the sheet to form the basis of the other scatter plots, and just swapped the relevant pills as needed.

For the ball shape, I loaded the provided images as custom shapes into my shapes repository. I then just created the following calculated field to use as a discrete dimension I could add to the Shape shelf

Ball Shape

[HR Number]%9

It’s not as completely randomised as perhaps it should be, but it looks random enough on the display.

The Pitcher in the data is in the format <Surname>, <Forename>, but on the tooltip it needs to display as <Forename> <Surname>, so I just used a transformation on the Pitcher field to split the field based on the comma (right click Pitcher -> Transform -> Split). This automatically created 2 fields I could use on the Tooltip.

I also noticed a very subtle wording change in the tooltip based on whether the match was Home or Away. If Home, the tooltip read ‘New York Yankees vs. <Opposition>’ otherwise it read ‘New York Yankees at <Opposition>’. I used a calculated field for this logic

TOOLTIP: vs or at

IIF([Location]='Home', 'vs.', 'at')
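For anyone who prefers to see the two tooltip tweaks as code, here is a minimal Python sketch of the name split and the vs./at logic. It assumes the Pitcher value always contains exactly one comma and that Location is either 'Home' or something else; the function names and the sample values are placeholders.

```python
def pitcher_display_name(pitcher: str) -> str:
    """Turn '<Surname>, <Forename>' into '<Forename> <Surname>'."""
    surname, forename = (part.strip() for part in pitcher.split(",", 1))
    return f"{forename} {surname}"


def matchup_text(location: str, opposition: str) -> str:
    """Mirror the 'TOOLTIP: vs or at' field inside the tooltip sentence."""
    joiner = "vs." if location == "Home" else "at"
    return f"New York Yankees {joiner} {opposition}"


# Placeholder values, not taken from the data set
print(pitcher_display_name("Surname, Forename"))
print(matchup_text("Home", "Opposition"))
print(matchup_text("Away", "Opposition"))
```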

The Trajectory Plot

OK, so this was the hardest part of this challenge, and mainly due to getting your head round the physics involved, as so many of the calculations are dependent on each other.

I’m generally pretty confident with my maths, but this was complex, especially with the force calculations for the y-axis. Samuel stated that both gravity and drag impacted the Y-axis calcs, but it wasn’t clear to me how both these forces should be applied (a bit of trial and error and I ended up adding them within the formula).

By the time I came to tackle this challenge, Samuel had already posted a video walkthrough, which can be viewed here and is another reason why I’m not going down to the nth degree in this post.

My suggestion is to watch Samuel’s video and/or feel free to download my workbook. I built my workbook independent of Samuel’s video, so there may be steps/calculations that differ.

However, I have tried to number my calculations in the order in which I created them, so you can hopefully follow the thought process. I have also left a CHK:Data sheet in the workbook, which I used to sense check what I was doing.

All the table calculations in the CHK:Data sheet are just set to the default ‘table down’ as I have filtered the sheet to a specific Home Run (HR Number = 1) only (ie I didn’t change any of the table calc settings as I added the pills to the sheet).

However, when you build the main trajectory chart, you have multiple HR Numbers in the view, so all the table calculations must be set so that calculations are only working for each HR Number. This means that any table calc (and any nested calculations) need to have all the fields except HR Number checked

When using the Pages shelf, which isn’t something I’ve ever really had to do before, you need to Show History and adjust the various settings to get the trail lines to show

To rotate the ball (the bonus option), you need another field to use on the Shape shelf. I had lost the will to live a bit by this point, so used the formula from my friend Rosario Gauna’s solution.

Rotation Shape

STR(IIF([14-Start Position Y m] <= 0, 0,
(MIN([Time Interval]) * 1000 / 25) % 9))

Note – when you add this to the Shape shelf, and select your baseball palette, just then use the Assign Palette button to automatically assign a ball to a number – this will get them into the correct order, without you having to do it one by one.

Finally, when adding the reference average lines, be sure to set the scope to per pane rather than table, otherwise you’ll end up with the wrong figures.

I think I’ve pretty much covered all the ‘little’ points that I came across that may trip you up, aside from all the tricky calcs of course!

My published workbook is here. I hope what I’ve written is enough for you to build it yourself. I think I’d still be here next year if I tried to do anything more fully! I’m off for a lie down now!

Happy vizzin’!

Donna

Can you build a weekly year-over-year line chart?

Week 2 of #WOW2022 was Kyle Yetter’s first challenge as official WOW coach. On first glance when it was posted on Twitter, I thought it didn’t look too bad… I figured there’d be some ‘baselining’ of dates that I’d need to do to get the axis to display.

However, it was a little trickier than I first anticipated, mainly in trying to ensure I got the right values to match with Kyle’s posted solution (which at one point seemed to change while I was building). I’ve also realised since clicking on the link to the challenge again, that it was tagged as an LoD challenge, although there was nothing specific in the requirements indicating this was a requirement. I don’t think I used any LoDs…

Anyway onto the build, and I’m going to start by getting the dates all sorted, as this I found was the trickiest part.

Firstly connect to the data, then verify that the date properties of the data source are set to start the week on a Monday (right click data source > Date Properties).

Build a basic view that displays Sales by the week of Order Date and the year of Order Date. Exclude 2018 since we’re only focussing on up to the last 2 years of data.

Examining this data compared to the solution, the first points of each line relate to the data shown against the 7 Jan 2019, 6 Jan 2020 and 4 Jan 2021. Ie the first point for each line is the first Monday in the year.

For simplicity, to make some of the calculations easier to read, I’m going to store the start date of the order date week in a field.

Order Date Week Start

DATE(DATETRUNC('week', [Order Date]))

This means I can easily work out the last day of the week

Order Date Week End

DATE(DATEADD('day', 6, [Order Date Week Start]))

Add these fields as discrete exact dates (blue pills) onto the table and remove the existing WEEK(Order Date) field since this is the same as Order Date Week Start
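If you want to sanity check the week boundaries outside Tableau, the sketch below does the same truncation in Python, assuming the week starts on a Monday as set in the date properties. The function names are my own.

```python
from datetime import date, timedelta


def order_date_week_start(order_date: date) -> date:
    """Monday of the week containing order_date (weekday() is 0 for Monday)."""
    return order_date - timedelta(days=order_date.weekday())


def order_date_week_end(order_date: date) -> date:
    """Sunday of the same week: week start plus 6 days."""
    return order_date_week_start(order_date) + timedelta(days=6)


d = date(2020, 1, 1)  # a Wednesday
print(order_date_week_start(d), order_date_week_end(d))
```

An order placed on 1 Jan 2020 therefore belongs to the week running 30 Dec 2019 to 5 Jan 2020, so its week start falls in the previous calendar year.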

When we examine the end points of each line in the solution, the points relate to the data shown against the week starting 30 Dec 2019 to 5 Jan 2020, 28 Dec 2020 to 3 Jan 2021 and 22 Nov to 28 Nov 2021. For the first 2 points, we can see the data is ‘spread’ across 2 years (ie 2 columns), which means when categorising the data ‘by year’, using the Year associated to the Order Date itself isn’t going to work. We need something else

Year Group

YEAR([Order Date Week Start])

Move this field into the Dimensions pane (above the line), then add to the table, replacing the existing YEAR(Order Date) field. 2018 will appear due to the very first row of data, but don’t worry about this for now.

If you scroll down to the week starting 30 Dec 2019, all the data is now aggregated in a single Year Group.

So things are starting to take shape. We can now work on how to filter the data.

The requirements indicate we should imagine ‘today’ is 1st Dec 2021, so we’ll use a parameter to hold this value.

Today

Date parameter defaulted to 1 Dec 2021

We only want to show data for the last 2 years and for complete weeks up to today

Core Data to Include

[Order Date Week End] < [Today] AND YEAR([Order Date Week Start]) >= YEAR([Today])-2

Remove the existing filter and add this field instead, set to True. You should now have just 3 columns and the data starting and ending at the right points.
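Here is a rough Python version of the filter, useful for checking which orders make the cut for a given ‘today’. It assumes Monday week starts and uses made-up function names; the two conditions are the ones in Core Data to Include.

```python
from datetime import date, timedelta


def week_start(order_date: date) -> date:
    """Monday of the week containing order_date."""
    return order_date - timedelta(days=order_date.weekday())


def core_data_to_include(order_date: date, today: date) -> bool:
    """True when the order's week is complete and within the last 2 years."""
    start = week_start(order_date)
    end = start + timedelta(days=6)
    return end < today and start.year >= today.year - 2


today = date(2021, 12, 1)
print(core_data_to_include(date(2021, 11, 24), today))  # week ends 28 Nov
print(core_data_to_include(date(2021, 11, 30), today))  # week ends 5 Dec
```

The year test uses the week start rather than the order date, so an order on 1 Jan 2019 is dropped because its week began on 31 Dec 2018.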

The next area of focus is to think about how the data is going to be presented – the lines are all plotted against a single continuous (green) date axis, so we need to ‘baseline’ the dates, that is adjust the dates so they are all on the same year.

Date To Plot

MAKEDATE(2021, MONTH([Order Date Week Start]), DAY([Order Date Week Start]))

this is basically setting the week start dates to the equivalent date in 2021.

In the table we’ve been building, add Date To Plot to Rows and set to the week level and be discrete (blue). Remove the Order Date Week Start pill and move the Order Date Week End to the Tooltip as this is where this pill will be relevant in the final viz.
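The baselining step is just swapping the year while keeping the month and day. A minimal Python sketch is below, with the caveat that a 29 Feb week start would fail because 2021 is not a leap year (the same input would also be a problem for MAKEDATE); the function name is hypothetical.

```python
from datetime import date


def date_to_plot(week_start: date, base_year: int = 2021) -> date:
    """Move a week start date onto the baseline year, keeping month and day.

    Raises ValueError for 29 Feb when base_year is not a leap year.
    """
    return date(base_year, week_start.month, week_start.day)


print(date_to_plot(date(2019, 12, 30)))
print(date_to_plot(date(2020, 1, 6)))
```

Once every week start sits in the same year, the three lines share one continuous date axis and can be compared week against week.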

We’re starting to now see how the data comes together, but we’ve still got some steps to go.

I’m going to adjust the Year Group, so we can present the Current, Previous, Last 2 Yrs labels. Change as follows

Year Group

YEAR([Order Date Week Start]) - YEAR([Today])

This returns values -2, -1, 0 which means the values will be consistent even if the ‘Today’ value changes. The values can then be aliased (right click Year Group > Aliases).
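The relative Year Group and its aliases can be pictured as a year offset plus a lookup. In this Python sketch the alias text is a placeholder for whatever labels you choose in Tableau, and the names are my own.

```python
from datetime import date

# Placeholder alias text; use whatever labels you set in the Aliases dialog
YEAR_GROUP_ALIASES = {0: "Current", -1: "Previous", -2: "2 yrs ago"}


def year_group(week_start: date, today: date) -> int:
    """Offset in years between the week start and 'today' (0, -1, -2...)."""
    return week_start.year - today.year


today = date(2021, 12, 1)
for ws in (date(2021, 11, 22), date(2020, 1, 6), date(2019, 12, 30)):
    offset = year_group(ws, today)
    print(ws, offset, YEAR_GROUP_ALIASES[offset])
```

Moving ‘today’ into 2022 would shift every offset by one without any need to redo the aliases, which is the point of storing the difference rather than the year itself.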

Next focus is on the % difference in sales. Add a Percent Difference quick table calculation to the existing Sales pill. The values will change to those we can see when hovering over the points in the solution.

Edit the table calculation and modify to explicitly compute by Year Group, which is important to understand as, when we build the viz, whether the data is going across or down may change, so ‘fixing’ like this ensures we retain the values we know are correct.

In order to manage the custom colour formatting in the tooltip, we’re going to ‘bake’ this field as a calculated field. Press CTRL, then click and drag the pill into the data pane and name the field accordingly. If you examine the field, it’ll probably look quite complex

% Sales Diff

(ZN(SUM([Sales])) - LOOKUP(ZN(SUM([Sales])), -1)) / ABS(LOOKUP(ZN(SUM([Sales])), -1))
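To unpick what that formula is doing, here is a Python sketch for a single week, given the sales for each Year Group ordered oldest to newest. None stands in for NULL, and I have assumed a zero previous value gives no result, as a division by zero would in Tableau; the function name and figures are invented.

```python
def pct_sales_diff(sales_by_year_group):
    """Percent difference from the previous Year Group for one week.

    sales_by_year_group: sales ordered oldest to newest; None is treated as 0 (ZN).
    The first entry has no previous value, so its result is None.
    """
    values = [0 if v is None else v for v in sales_by_year_group]
    if not values:
        return []
    result = [None]  # LOOKUP(..., -1) finds nothing for the first Year Group
    for prev, cur in zip(values, values[1:]):
        result.append(None if prev == 0 else (cur - prev) / abs(prev))
    return result


# Illustrative sales for one week: 2 yrs ago, previous, current
print(pct_sales_diff([100.0, 150.0, 120.0]))
```

This is why the calculation needs to compute using Year Group: the previous value must be the same week one year earlier, not the previous week on the axis.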

I’m going to start building the viz now, and then we’ll add the final calcs needed for the formatting later.

On a new sheet

  • Add Core Data to Include to Filter and set to True
  • Add Date To Plot to Columns and set to a continuous week level (green pill)
  • Add Sales to Rows
  • Add Year Group to Colour and adjust accordingly
  • Reorder the values in the Year Group colour legend, so the ‘current’ line in the chart is displayed on the top, and the ‘2 yrs ago’ line is at the bottom
  • Format the WEEK(Date To Plot) field to be a custom date of dd mmm (ie 10 Jan)
  • Format the Sales axis to be $ with 0 dp
  • Add Order Date Week End to the Tooltip shelf and format the field to custom date format of mmm dd (ie Jan 10).

Now we need a couple of additional calcs to help format the % sales difference displayed on the tooltip.

% Sales Diff +ve

IF [% Sales Diff] >= 0 THEN [% Sales Diff] END

% Sales Diff -ve

IF [% Sales Diff] < 0 THEN [% Sales Diff] END

Format both these fields with a custom number format as ▲0.0%;▼0.0%

Add both these fields onto the Tooltip and adjust the table calculation settings of each to compute using Year Group only.

The 2 yrs ago line has no % Sales Diff value, and also no label, so we need a field to help this too

TOOLTIP: Label YoY%

IF NOT(ISNULL([% Sales Diff])) THEN 'YoY%:' END

This means the text ‘YoY%’ will only display if there is a % Sales Diff value.

Add this field onto the Tooltip too, and again adjust the table calculation settings.
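Putting the three tooltip fields together, the Python sketch below shows what each one returns for a given % Sales Diff, with None for NULL. The triangle formatting only approximates the ▲0.0%;▼0.0% custom format, and the function name is hypothetical.

```python
def tooltip_parts(pct_diff):
    """Mirror TOOLTIP: Label YoY%, % Sales Diff +ve and % Sales Diff -ve."""
    if pct_diff is None:
        # No previous year to compare with: nothing displays at all
        return {"label": None, "positive": None, "negative": None}
    text = f"{abs(pct_diff):.1%}"  # the sign is replaced by the triangle
    return {
        "label": "YoY%:",
        "positive": f"▲{text}" if pct_diff >= 0 else None,
        "negative": f"▼{text}" if pct_diff < 0 else None,
    }


for v in (0.5, -0.2, None):
    print(tooltip_parts(v))
```

Only one of the positive and negative values is ever populated, so placing the two pills side by side in the tooltip and colouring each differently gives the conditional colouring.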

Now format the text and tooltip as below, setting the two % Sales Diff pills to be side by side and colouring accordingly.

Hopefully, this means you now have a completed viz

My published viz is here. And nope… no LODs used 🙂 There are a few more calculated fields than what Kyle mentioned, but I could have condensed these by not having so many ‘building blocks’, but this may have made it harder to read.

I’m now off to check out Kyle’s solution and see whether I really over complicated anything…

Happy vizzin’!

Donna

When do extracts run during the day?

When I was approached by the #WOW crew to provide a guest challenge, I was a little unsure as to what I could do. I primarily work as a Tableau Server admin, so rarely have a need to build dashboards (which is why I like to do the weekly #WOW challenges, to keep up my Desktop skills). Then the next day I was looking at a dashboard I’d built to monitor extracts on our Tableau Servers, and I thought it would be an ideal candidate for a challenge. I also thought it would provide any users of Tableau Server with the opportunity to implement this dashboard in their own organisation if they wished, by sharing with their Server Admins.

As a Tableau Server Admin, you get access to a set of ‘out of the box’ Admin views, one of which is called ‘Background Tasks for Extracts’ which gives you a view of when extracted data sources and workbooks run on the server. However while the provided view is fine if you want to quickly see what’s going on now, it’s not ideal if you want to see how things ran over a longer timeframe – it involves a lot of horizontal scrolling.

Many server admins will have ‘opened’ up access to the Tableau repository, the PostgreSQL database which stores a rich array of data, all about your Tableau Server [see here for further info], and enables admins to extend their analysis beyond the provided Admin views. This site even provides a set of pre-curated data sources to help you get started! These aren’t formally supported by Tableau, but are the brainchild of Matt Coles, a server admin at Tableau (no relation to me though!).

My dashboard doesn’t actually use one of these data sources though. For the challenge, I’ve just created some sample, anonymised data in the required structure. I’ll explain later at the end of the post how to go about converting this to use ‘real’ server data, if you do want to plug it into your own server environment.

Understanding the data

When using Tableau Server, published data sources and workbooks can connect to their underlying data source (eg a SQL Server database, an Excel file etc) directly (ie via a live connection) or via an extract. An extract means that a copy of the data is pulled from the underlying data source and stored on Tableau Server utilising Tableau’s ‘in memory’ engine when the data is then queried. An extract gets added to a schedule which dictates the time when the extract will get refreshed; this may be weekly, daily, hourly etc. Every time the extract runs, a background task is created which provides all the relevant details about that task. The data for this challenge provides 1 row for each extract task that was created between Monday 11th Jan 2021 and Friday 5th Feb 2021. The key fields of note are:

  • Id – uniquely identifies a task
  • Created At – when the task was created
  • Started At – when the task actually started running (if too many tasks are set to run at the same time, they will queue until the server resources are available to execute them).
  • Completed At – when the task finished, will be NULL if task hasn’t finished.
  • Finish Code – indicates the completion status of the job (0=success, 1=failed, 2= cancelled)
  • Progress – supposed to define the % complete, but has been observed to only ever contain 0 or 100, where 100 is complete.
  • Title – the name of the extract
  • Site – the name of the site on the server the extract is associated to

Based on the Finish Code and Progress, I have derived a calculated field to determine the state of the extract (to be honest, I think this is a definition I have inherited from closer analysis of the Background Tasks for Extracts Server Admin view, so am trusting Tableau with the logic).

Extract Status

IF [Finish Code] = 1 AND [Progress] <> 100 THEN ‘In Progress’
ELSEIF [Finish Code] = 0 AND NOT [Progress] = 1 THEN ‘Success’
ELSE ‘Failed’
END
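If you find branching logic easier to read as code, here's a minimal Python sketch of the same rules. The function name extract_status is my own invention for illustration, and it simply mirrors the calculated field above exactly as written; it isn't an official Tableau definition.

```python
def extract_status(finish_code: int, progress: int) -> str:
    """Mirror the Extract Status calculated field, branch for branch."""
    if finish_code == 1 and progress != 100:
        return "In Progress"
    elif finish_code == 0 and not progress == 1:
        return "Success"
    else:
        return "Failed"


# A few illustrative combinations of Finish Code and Progress
for code, progress in [(1, 0), (0, 100), (1, 100), (2, 100)]:
    print(code, progress, extract_status(code, progress))
```

Anything that doesn't hit one of the first two branches falls through to ‘Failed’, which is why the order of the tests matters.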

Building the required calculated fields

The intention when being used ‘in real life’, is to have visibility of what’s going on ‘Now’ as well as how extracts over the previous few days have performed. As we’re working with static data, we need to hardcode what ‘Now’ is. I’ll use a parameter for this, so that in the event you do choose to plug this into your own server, you only have to replace any reference to the Now parameter with the function NOW().

Now

Datetime parameter defaulted to 05 Feb 2021 16:30

The chart we are going to build is a Gantt chart, with 1 bar related to the waiting time of the task, and 1 bar related to the running time of the task. We only have the dates, so need to work out the duration of both of these. These need to be calculated as a proportion of 1 day, since that is what the timeframe is displayed over.

Waiting Time

(DATEDIFF(‘second’, [Created At], IF ISNULL([Started At]) THEN [Now] ELSE [Started At] END))/86400

Find the difference in seconds, between the create time and start time (or Now, if the task hasn’t yet started), and divide by 86400 which is the number of seconds in a day.

We repeat this for the processing/running time, but this time comparing the start time with the completed time.

Processing Time

(DATEDIFF(‘second’, [Started At], IF ISNULL([Completed At]) THEN [Now] ELSE [Completed At] END))/86400
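To sanity check the numbers these two fields produce, here's a rough Python equivalent. The helper name fraction_of_day and the sample timestamps are assumptions I've made up for illustration; the only values taken from above are the 86400 seconds in a day and the hardcoded Now of 05 Feb 2021 16:30.

```python
from datetime import datetime
from typing import Optional

SECONDS_PER_DAY = 86400


def fraction_of_day(start: datetime, end: Optional[datetime], now: datetime) -> float:
    """Whole seconds from start to end (or to now if end is missing), as a share of one day."""
    effective_end = now if end is None else end
    return int((effective_end - start).total_seconds()) / SECONDS_PER_DAY


now = datetime(2021, 2, 5, 16, 30)       # the hardcoded Now parameter
created = datetime(2021, 2, 5, 9, 0)     # made-up task created at 09:00
started = datetime(2021, 2, 5, 9, 10)    # ...which started running at 09:10

waiting_time = fraction_of_day(created, started, now)   # the task waited 10 minutes
processing_time = fraction_of_day(started, None, now)   # not completed, so measured up to Now
```

With these sample values the waiting time is 600/86400 of a day and the processing time is 26400/86400 of a day (7 hours 20 minutes).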

As mentioned the timeframe we’re displaying over is a 24 hr period, and we want to display the different days over multiple rows, rather than on a single continuous time axis spanning several days.

To achieve this, we need to ‘baseline’ or ‘normalise’ the Created At field to be the exact same day for all the rows of data, but the time of day needs to reflect the actual Created At time. This technique to shift the dates to the same date is described in this Tableau Knowledge Base article.

Created At Baseline

DATEADD(‘day’, DATEDIFF(‘day’, [Created At], #2021-01-01#), [Created At])

And again, we’re going to need to do a similar thing with the Started At field

Started At Baseline

DATEADD(‘day’, DATEDIFF(‘day’, [Started At], #2021-01-01#), [Started At])
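The two steps in these baseline calcs (count the whole days to the baseline date, then add that many days back on) can be sketched in Python as below. The function name to_baseline and the example timestamps are hypothetical; the baseline of 01 Jan 2021 is the same arbitrary date used above.

```python
from datetime import date, datetime, timedelta

BASELINE = date(2021, 1, 1)  # arbitrary baseline day


def to_baseline(value: datetime, baseline: date = BASELINE) -> datetime:
    """Move value onto the baseline day while keeping its time of day."""
    day_offset = (baseline - value.date()).days   # like DATEDIFF('day', value, baseline)
    return value + timedelta(days=day_offset)     # like DATEADD('day', offset, value)


shifted = to_baseline(datetime(2021, 1, 15, 14, 30))
print(shifted)  # 2021-01-01 14:30:00
```

The offset is negative for any date after the baseline, so adding it moves the date backwards while leaving the hours, minutes and seconds untouched.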

Putting this out into a table, you can see what this data all looks like (note, I’m just choosing an arbitrary date of 01 Jan 2021, so my baseline dates are all on this date):

Building the Gantt chart

We’re going to build a dual axis Gantt chart for this.

  • Add Site to Rows
  • Add Title to Rows
  • Add Created At to Rows. Set it to the day/month/year level and set to be discrete (ie blue pill). Format the field to custom format of ddd dd mmm yyyy so it displays like Mon 11 Jan 2021 etc
  • Add Created At Baseline to Columns, set to exact date
  • Add Waiting Time (Avg) to Size and adjust to be thin

This will automatically create a Gantt chart view

Next

  • Add Started At Baseline to Columns, set to exact date, and move so the pill is now placed to the right of the Created At Baseline pill
  • On the Started At Baseline marks card, remove Waiting Time and add Processing Time (Avg) to the Size shelf instead. Adjust so the size is thicker.
  • Set the chart to be dual axis and synchronise the axes

The thicker bars based on the Started At / Processing Time need to be coloured based on Extract Status. Add this field to the Colour shelf of the Started At Baseline marks card and adjust accordingly.

The thinner bars based on the Created At / Waiting Time need to be coloured based on how long the wait time is (over 10 mins or not).

Over Wait Time Threshold

[Waiting Time] > 0.007

0.007 represents 10 mins as a proportion of the number of minutes in a day (10 / (60*24) ).
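It's worth seeing how close 0.007 is to the exact figure. This little bit of Python arithmetic (the variable names are just mine) shows that 0.007 is 10/1440 rounded up, so the flag actually trips at a shade under 10 minutes 5 seconds rather than at exactly 10 minutes.

```python
MINUTES_PER_DAY = 60 * 24
SECONDS_PER_DAY = MINUTES_PER_DAY * 60

exact_threshold = 10 / MINUTES_PER_DAY                    # 10 minutes as a fraction of a day
rounded_threshold = 0.007                                 # the value used in the calculated field
rounded_in_seconds = rounded_threshold * SECONDS_PER_DAY  # what 0.007 of a day is in seconds

print(round(exact_threshold, 6))     # 0.006944
print(round(rounded_in_seconds, 1))  # 604.8
```

If you need the threshold to be precise, using 10/1440 in the calculation instead of 0.007 would be one option.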

Add this field to the Colour shelf of the Created At Baseline marks card and adjust accordingly (I chose to set to the same red/grey values used to colour the other bars, but set the transparency of these bars to 50%).

Formatting the Tooltips

The tooltip for the Waiting Time bar displays

The Created At Baseline and Started At Baseline should both be added to the Tooltip shelf and then custom formatted to h:mm am/pm

The Waiting Time needs to be custom formatted to hh:mm:ss

The tooltip for the Processing Time bar is similar but there are small differences in the display.

Formatting the axes

The dates on the axes are displayed as time in am/pm format.

To set this, the Created At Baseline / Started At Baseline pills on the Columns shelf need to be formatted to h:mm am/pm

Adding the reference band

The reference band is used to highlight core working hours between 8am and 5pm. Right click on the Created At Baseline axis and Add Reference Line. Create a reference band using constants, and set the fill colour accordingly.

Apply further formatting to suit – adjust sizes of fonts, add vertical gridlines, hide column/axes titles.

Filtering the dates displayed

As discussed above, when using this chart in my day to day job, I’d be looking at the data ‘Now’. As a consequence I can simply use a relative date quick filter on the Started At field, which I default to Last 7 days.

However, as this challenge is based on static data, we need to craft this functionality slightly differently.

We’re only going to show up to 10 days’ worth of data, and will drive this using a parameter.

pDaysToShow

An integer parameter, ranging from 1 to 10, defaulted to 7, and formatted to display with a suffix of ‘ days’.

We then need a calculated field to use to filter the dates

Filter : Days to Show

DATETRUNC(‘day’,[Created At]) >= DATEADD(‘day’,([pDaysToShow]-1)*-1,DATETRUNC(‘day’,[Now]))

Add this to the Filter shelf and set to True.
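Here's a small Python sketch of what that filter is testing, in case the double negative of ‘minus 1 times minus 1’ is hard to follow. The function name within_days_to_show and the sample dates are made up for illustration; Now is the hardcoded 05 Feb 2021 16:30.

```python
from datetime import datetime, timedelta


def within_days_to_show(created_at: datetime, now: datetime, days_to_show: int) -> bool:
    """True when created_at falls in the last days_to_show calendar days, counting today."""
    cutoff = now.date() - timedelta(days=days_to_show - 1)
    return created_at.date() >= cutoff


now = datetime(2021, 2, 5, 16, 30)
print(within_days_to_show(datetime(2021, 1, 30, 0, 5), now, 7))    # True
print(within_days_to_show(datetime(2021, 1, 29, 23, 59), now, 7))  # False
```

Subtracting 1 from the parameter is what makes ‘7 days’ mean today plus the previous 6, so with the default the earliest day shown is 30 Jan 2021.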

Additionally, the chart can be filtered by Site, so add this to the Filter shelf too.

Building the Key legend

Some people may build this by adding a separate data source, but I’m just going to work with the data we have. This technique is reliant on knowing your data well and whether it will always exist.

On a new sheet, add Site to the Filter shelf and filter to sites 7 and 9.

Create a new field

Key Label

If [Site] = ‘Site 9’ THEN ‘Waiting’ ELSE ‘Processing’ END

and add this to the Columns shelf and sort the field descending, so Waiting is listed before Processing.

Alongside this field, type directly into the Columns shelf MIN(1).

Edit the axes to be fixed from 0 to 1. Then add the Site field to the Colour shelf and also to the Size shelf and adjust accordingly (you may need to reverse the sizes). I lightened the colour by changing the opacity to 50%.

Now hide the axes, remove row & column borders, hide the column title and turn off tooltips.

The information can all now be added to a dashboard.

Using your own data

To use this chart with your own Tableau Server instance, you need to create a data source against the Tableau postgres repository that connects to the _background_tasks (bgt) table with an inner join to the _sites (s) table on bgt.site_id = s.Id. Rename the name field from the _sites table to Site. If you don’t use multiple sites on your Tableau Server instance, then the join is not required. The sole purpose of the join is to get the actual name of the site to use in the display/filter.

You should then be able to repoint the data source from the Excel sheet to the postgres connection. You may find you need to readjust some of the colours though.

When I run this, I’m using a live connection so I can see what is happening at the point of viewing, rather than using a scheduled extract. To help with this, I add a data source filter to limit the days of data to return from the query (eg Created at <=10 days), which significantly reduces the data volume returned with a live connection.

Hopefully you enjoyed this ‘real world’ challenge, and your server admins are singing your praises over the brilliance of this dashboard 🙂

My published version is here.

If you’ve got any feedback or suggestions on improvements to enhance the viz even further, please do let me know.

Happy vizzin’! Stay safe!

Donna

How much change has occurred?

A relatively straightforward challenge was set by Luke this week, to visualise the difference in Sales between 2020 and 2021 in a slightly different format than what you might usually think of.

Start by filtering the data to just the years 2020 and 2021 (add Order Date to the Filter shelf and select specific years, or add a data source filter to limit the whole data set).

Add Sub-Category to Rows, and Sales to Columns, then add Order Date to Colour which by default will display as YEAR(Order Date). Colour the years appropriately.

Now unstack the marks (Analysis menu -> Stack Marks -> Off), and re-order the colour legend, so 2021 is listed first (this makes the 2021 bars sit ‘on top’ of 2020).

Adjust the size to make the bars thinner.

Now add another instance of Sales to the Columns shelf, and make the chart dual axis (synchronising the axis). Reset the mark type of the original SUM(Sales) marks card back to bar.

We need the circle mark for the 2021 Sales to be blue. To do this, duplicate the Order Date field, then add Order Date (copy) to the Colour shelf of the SUM(Sales)(2) marks card. This will show another colour legend, and you can set the colours accordingly. Add a white border around the circle marks.

To work out the % difference to display on the label, we need the following fields

2021 Sales

{FIXED [Sub-Category]: SUM(IF YEAR([Order Date])=2021 Then [Sales] END)}

This returns the value of the 2021 sales for each Sub-Category against all the years in the data set. Similarly we need

2020 Sales

{FIXED [Sub-Category]: SUM(IF YEAR([Order Date])=2020 Then [Sales] END)}

which means we can then create

% Difference

(SUM([2021 Sales])-SUM([2020 Sales]))/SUM([2020 Sales])

format this using custom formatting to display as +0%;-0%
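As a quick check of the arithmetic and the signed formatting, here's a minimal Python sketch. The function names and the sales figures are invented for illustration, and the format string is only a rough stand-in for what the +0%;-0% custom format displays.

```python
def pct_difference(sales_2021: float, sales_2020: float) -> float:
    """Change from 2020 to 2021 as a proportion of the 2020 value."""
    return (sales_2021 - sales_2020) / sales_2020


def format_signed_pct(value: float) -> str:
    """Whole-number percentage that always shows its sign."""
    return f"{value:+.0%}"


print(format_signed_pct(pct_difference(1200, 1000)))  # +20%
print(format_signed_pct(pct_difference(750, 1000)))   # -25%
```

Dividing by the 2020 value is what makes this a change relative to the earlier year, so growth shows as positive and decline as negative.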

Now we can add % Difference to the Label field of the Sum(Sales)(2) marks card.

You’ll notice you’ll have duplicate labels displayed. To resolve this, you need to adjust the label settings as below

To sort the rows, you need to sort the Sub-Category field by 2021 Sales descending

And finally to show the value of the 2021 Sales, add this field to the Rows shelf, and change to be discrete (blue pill).

All that’s left to do now is adjust the wording of the tooltips as you see fit, and format to remove gridlines, headers etc.

My published version is here.

Happy vizzin’! Stay Safe!

Donna

Can you make Spine Charts?

Sean Miller provided the challenge for this week, resurrecting a challenge originally set by Emma Whyte in 2017. Revisiting these older challenges is great fun, as often newer product features provide a different way of solving. For me, I also like the fact I know I’ve already solved it once, and have my own work to reference if I get stuck – ha ha!

Sean hinted that this wasn’t a challenge to ‘overthink’ – no table calcs or LoDs required. You need to be able to display average responses per question per university alongside the overall average response for the question. Simply filtering by university isn’t going to cut it, as the quick filter will immediately eliminate all the data that isn’t associated to the selected university, which means you can’t compute an ‘overall average’ without using LoDs.

The key to this challenge is to use a parameter to drive the University selection. Create this by right clicking on the University field -> Create > Parameter. This will create the parameter dialog box, prepopulated with all the university values. Set the default to University of Liverpool.

pUniversity

With this, we can now create calculated fields to store the values associated to the selected university only.

Sample Size

AVG(IF [University] = [pUniversity ] THEN [Sample Size] END)

Note 1, there is already a field called Sample Size in the data set. The actual name of this field is <space>Sample Size<space> which Tableau sees as a different name. In hindsight I should have just renamed the original field, so I could then have ‘Sample Size‘. Be mindful of this when I refer to the field later; unless I call it out, I’m referring to my version.

Note 2, I chose to apply the AVG aggregation within the calc rather than changing the default aggregation on the pill when added to the view. There was a reason I did this, but I can’t recall what it was, and think it wasn’t necessary in the end….

University Avg

AVG(IF [University] = [pUniversity ] THEN [% Agree] END)

formatted to percentage, 1 dp

We can also then define the overall average for comparison

Overall Avg

AVG([% Agree])

formatted to percentage, 1 dp

and with that can calculate the variance between the two

Delta

([University Avg]) – ([Overall Avg])

This is custom formatted to 0.00%▲;0.00%▼ (I use this site to get the arrow characters)

And then we need a field to define how the mark needs to be coloured

Colour

[Delta]>=0
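To show why the parameter approach works where a quick filter wouldn't, here's a minimal Python sketch using a handful of made-up rows for a single question. The row structure, the second university name and the numbers are all assumptions for illustration only.

```python
from statistics import mean

# Made-up responses for one question
rows = [
    {"university": "University of Liverpool", "pct_agree": 0.80},
    {"university": "University of Liverpool", "pct_agree": 0.90},
    {"university": "Another University", "pct_agree": 0.60},
    {"university": "Another University", "pct_agree": 0.70},
]
p_university = "University of Liverpool"  # plays the role of the pUniversity parameter

# Only rows for the selected university contribute; the rest are ignored, not removed
university_avg = mean(r["pct_agree"] for r in rows if r["university"] == p_university)
overall_avg = mean(r["pct_agree"] for r in rows)  # every row is still available

delta = university_avg - overall_avg
colour = delta >= 0
```

Because nothing is filtered out of rows, the overall average can still be computed from the full set, which is the whole point of driving the selection from a parameter.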

We can put these all out in a view

  • Question Number (which I renamed to No) on Rows
  • Question Text on Rows
  • Sample Size on Rows (set to be a discrete blue pill)
  • University Avg on Rows (set to be discrete)
  • Overall Avg on Rows (set to be discrete)
  • Delta on Rows (set to be discrete)
  • University Avg on Columns (continuous green pill)
  • Change Mark Type to Circle
  • Add Colour to the Colour shelf and adjust

Now we need to work on adding the various lines and bands on the chart. This is all managed by adding reference lines (or bands).

Drag % Agree to the Detail shelf, and change to be AVG.

Then drag the same field % Agree onto Detail again, this time changing it to MIN. Repeat again, and change to MAX.

Right click on the University Avg axis > Add Reference Line. Create a line, per pane using the AVG(% Agree) field.

Add another reference line (by right clicking on the axis again). This time create a band that starts at MIN(% Agree) and ends at MAX(% Agree). Set the Fill colour to light grey.

We need to create some new fields for the quartile values.

Lower Quartile

PERCENTILE([% Agree],0.25)

Upper Quartile

PERCENTILE([% Agree],0.75)

Add both these fields to the Detail shelf again, then add another reference line (band) similar to that above, but referencing the quartile fields. Set the Fill colour to be a darker grey.

Adjust the formatting and set the tooltips and you’ve got the main chart…. well almost…

In the solution, the first column, the question no, is not labelled. I couldn’t figure out how to do this, which is why I relabelled to simply No. I tried various things, including using text boxes as column headings on the dashboard, but the layout just didn’t work.

BUT I’ve now found out how to do it… because I googled, and I didn’t yesterday when I was building 😦 Andy Kriebel explains it all here. He searches for a ‘zero width space’ character on this site , and then copies the resulting ‘character image’ displayed and pastes into the label of a calculated field. Watch the video to see it in action, but I’ve noted the steps here, just as much for my own benefit when I can’t remember what to do in future… I can see this type of feature cropping up often 🙂

The legend utilises a lot of the concepts above, but we don’t want the mark changing with each university selection. So let’s just hardcode

Legend Avg

AVG(IF [University] = ‘Middlesex University’ THEN [% Agree] END)

and we’ll need a dedicated field for the colour

Legend Colour

[Legend Avg] – [Overall Avg] >=0

The legend sheet can then be built just by plotting the Legend Avg pill on the Columns shelf, with a mark type of circle, the Legend Colour on the Colour shelf, and the same pills used in the reference lines above on the Detail shelf.

When adding the reference lines and bands this time, you will need to add labels and format their position.

The quartile band also has dotted lines indicating the end of the band, which you can apply as part of the band properties. However the quartile band also has one label left aligned, while the other is right aligned. For a single reference band, the labels can either both be formatted left aligned or both right aligned. To resolve this, don’t add a label to the ‘band to’ section. Create another reference line for the Upper Quartile value, and you can then format the label of this independently.

My published viz associated to this challenge is here.

My version based on the original challenge from 2017 is here. The requirement was a bit more complicated it would seem, and it looks like I utilised FIXED LODs quite heavily.

Happy vizzin’! Stay Safe!

Donna

Can you build a Fancy Text Table?

Ann Jackson provided this week’s challenge, to deliver a text table using only Measure Names & Measure Values. I thought with Ann’s introduction that “This challenge should be straightforward for users of all levels” that this would be relatively straightforward, but I have to confess there were moments that I struggled with this. I knew the fundamentals that I’d need to complete this; that all the columns except the first were going to need to be numbers (ie measures), that I’d have to use custom formatting to display the number in the required format (a shape, a date, a word), and that I’d need to use the ‘legends per measure’ functionality to colour each column independently of each other. But determining the best/worst date to display proved to be a bit tricksy! I got there in the end, but there was a fair bit of trial and error.

Custom Formatting

I’m going to step through the build of this, as I think that’s probably the easiest way to describe this challenge. But before I do, one of the core fundamentals to this is knowing about how numbers can be custom formatted. By that I mean when you right-click on a measure -> default properties -> number format -> custom

This box allows you to type in, but you need to know what format/syntax to use. If you set the formatting via one of the other options, then look at the Custom option, it’ll have an entry that will give you a starting point. The above is the format for a number set to 1 decimal place, and shows that negative numbers will be prefixed by a minus sign (-). If we wanted to always show a plus sign in front of a positive number, we can edit this custom formatting to +#,##0.0;-#,##0.0.

The first entry to the left of the semi colon (;) indicates what’s applied to positive numbers. The next entry, to the right of the semi colon, indicates what’s applied to a negative number.

With this knowledge, you can apply more ‘creative’ custom formatting to any numeric measure that contains positive and negative numbers. For example if you want to show a ☑ or a ☒ depending on a ‘yes/no’ or ‘true/false’ concept, then we can create a version of the field as a number along the lines of

Field as Number

If [Field] = ‘XXXX’ THEN 1 ELSE -1 END.

We can then custom format this field by entering ☑;☒ into the text box

The field can still be treated as a measure, since the underlying value is still a number (in this case +/- 1), it’s just displayed differently.
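The idea of ‘store a number, display a symbol’ can be sketched in Python like this. Both function names are hypothetical, and custom_format only imitates the two-part positive;negative pattern for non-zero values; it isn't how Tableau implements its formatting.

```python
def field_as_number(field: str, match: str) -> int:
    """Turn a yes/no style test into +1 or -1 so it can be treated as a measure."""
    return 1 if field == match else -1


def custom_format(value: int, positive: str, negative: str) -> str:
    """Pick the display text for a non-zero value, like a 'positive;negative' format."""
    return positive if value > 0 else negative


print(custom_format(field_as_number("Yes", "Yes"), "☑", "☒"))  # ☑
print(custom_format(field_as_number("No", "Yes"), "☑", "☒"))   # ☒
```

The underlying values are still +1 and -1, so they can be aggregated or coloured like any other number.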

Building the measures

So now we’ve covered how this ‘sneaky formatting’ is working, we’ll get on with the overall build.

The data just needs data from 2019 & 2020, so I chose to set a data source filter to restrict to just these two years.

But, I wanted the rest of the challenge to derive the current year instead of hardcoding, so I created fields

Current Year

YEAR({MAX([Order Date])})

Last Year

[Current Year] – 1

From these, I could then use LoDs to create

CY SALES

IF YEAR([Order Date]) = [Current Year] THEN [Sales] END

This is formatted to $ with 0 dp

and then

LY SALES

IF YEAR([Order Date]) = [Last Year] THEN [Sales] END

again formatted to $ with 0 dp.

We then need an indicator which is ‘true’ if CY SALES is greater than LY SALES, but as discussed above, we need this to be a ‘measure’, which we can custom format.

CY vs LY

IF SUM([CY SALES])>SUM([LY SALES]) THEN 1 ELSE -1 END

Custom format this as ✅;❌ (just copy these symbols from this page… they’ll look black and white in the dialog) – check out this page, to lift the images/other symbols from.

The actual difference identified by △ (again just copy and paste this symbol into the field name) is simply

SUM([CY SALES]) – SUM([LY SALES])

formatted to $ with 0 dp. Once you’ve done this using the Currency(Custom) option, then go to the Custom option and add + to the front of the string :

+"$"#,##0;-"$"#,##0

Next up is the percentage difference

% DIFF

[△]/SUM([LY SALES])

again format this first to a Percentage at 1 dp, then edit the Custom format to +0.0%;-0.0%

Now we’re getting to the slightly more complex part of the challenge – to identify the best and worst day in the month. We’ll start with the best day. We’re using FIXED LoDs throughout this, and while it’s probably possible to do in a single calculation, we’ll use multiple calcs to build up the components.

Order Month

DATENAME(‘month’,[Order Date])

This is the one dimension that’s going to be used in the final output, and simply outputs the month name (January, February etc).

In the data set, there can be multiple sales (ie orders) in a single day. We want to identify the total sales in 2020 (ie the current year) for each order date.

Sales Per Day

{FIXED [Order Date] : SUM([CY SALES])}

Now we’ve got the total sales per day, we want to identify the value of the maximum daily sales in each month

Max CY Sales Per Month

{FIXED [Order Month]: MAX([Sales Per Day])}

Now we need to identify the date in the month that the max daily sales occurred

BEST DAY

INT({FIXED [Order Month]: MAX(IF [Sales Per Day] = [Max CY Sales Per Month] THEN ([Order Date]) END)}) + 2

WOAH! WHAT??? Let’s try to break this down…

IF [Sales Per Day] = [Max CY Sales Per Month] THEN ([Order Date]) END

If the daily sales value is the maximum daily sales in the month, then return the associated Order Date. But we need to get a date per month, so we’ve wrapped this in a FIXED LoD, for each Order Month. LoDs require the value to be aggregated, so the IF statement gets wrapped in a MAX statement (note MIN would work just as well).

{FIXED [Order Month]: MAX(IF [Sales Per Day] = [Max CY Sales Per Month] THEN ([Order Date]) END)}

Finally, due to the nature of this challenge, that requires we only work with Measure Names & Measure Values, we will convert this date field to a number using the INT function.

The intention here is that we can then use the Custom formatting option once again, to set the number as a date format – I chose dd mmm yyyy (ie 01 Jan 2020 format, as I feel it’s less confusing than working out whether the date is in UK or US format).

However, by a very weird circumstance, converting a date to an INT then formatting as a date, will give you a date 2 days out from the one you converted. I don’t understand why, and it left me scratching my head for some time. I had to sense check with a fellow #WOWer who had the same, and checking Ann’s solution, she also was handling the oddity, which is the reason for the +2 on the calculation.

We just create similar fields for identifying the worst day

Min CY Sales Per Month

{FIXED [Order Month]: MIN([Sales Per Day])}

WORST DAY

INT({FIXED [Order Month]: MAX(IF [Sales Per Day] = [Min CY Sales Per Month] THEN ([Order Date]) END)}) + 2

format this to dd mmm yyyy
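If the chain of LoDs is hard to picture, here's a minimal Python sketch of the same three steps (daily totals, the max/min daily total per month, then the date that produced it). The order data is made up and already restricted to the current year, and the sketch stops at the date itself; it doesn't attempt the INT conversion or the +2 adjustment.

```python
from collections import defaultdict
from datetime import date

# Made-up current-year orders: (order date, sales)
orders = [
    (date(2020, 1, 3), 100.0),
    (date(2020, 1, 3), 50.0),
    (date(2020, 1, 20), 120.0),
    (date(2020, 2, 7), 30.0),
    (date(2020, 2, 9), 80.0),
]

# Sales Per Day: total sales for each order date
sales_per_day = defaultdict(float)
for order_date, sales in orders:
    sales_per_day[order_date] += sales

# Group the daily totals by month name (the Order Month dimension)
days_by_month = defaultdict(dict)
for order_date, total in sales_per_day.items():
    days_by_month[order_date.strftime("%B")][order_date] = total

best_day = {}
worst_day = {}
for month, daily in days_by_month.items():
    max_sales = max(daily.values())  # Max CY Sales Per Month
    min_sales = min(daily.values())  # Min CY Sales Per Month
    # Taking the max date mirrors the MAX(...) wrapper, so ties resolve to the latest day
    best_day[month] = max(d for d, s in daily.items() if s == max_sales)
    worst_day[month] = max(d for d, s in daily.items() if s == min_sales)
```

In this sample, 3 Jan has two orders totalling 150, which beats the single 120 order on 20 Jan, so summing per day before taking the max really does matter.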

The final measure we need is based on determining the rank of the CY SALES per month. Ie if we ordered the months based on CY SALES descending, the top 6 would be marked as ‘Top’ and the rest as ‘Bottom’.

RANK

IF RANK(SUM([CY SALES])) <= 6 THEN 1 ELSE -1 END

We can then custom format this to “Top”;”Bottom”
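Here's a rough Python sketch of the Top/Bottom flag. I'm assuming the rank is descending with ties sharing a rank, and the function name, the sample figures and the smaller top_n used in the example are all made up to keep it short.

```python
def top_or_bottom(sales_by_month: dict, top_n: int = 6) -> dict:
    """Return 1 for months ranked in the top_n by sales (descending), else -1."""
    flags = {}
    for month, sales in sales_by_month.items():
        # Rank 1 is the largest value; a month's rank is 1 plus the number of bigger months
        rank = 1 + sum(1 for other in sales_by_month.values() if other > sales)
        flags[month] = 1 if rank <= top_n else -1
    return flags


sample = {"January": 400, "February": 150, "March": 300, "April": 100}
flags = top_or_bottom(sample, top_n=2)
labels = {month: ("Top" if flag > 0 else "Bottom") for month, flag in flags.items()}
```

With 12 months in the view and the default of 6, half the months come out as Top and half as Bottom, provided there are no ties around the cut-off.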

Formatting the Table

Create a text table by

  • Order Month on Rows
  • Measure Names on Columns
  • Measure Values on Text
  • Measure Names on Filter, filtered to just the relevant measures

Add Measure Values to the Colour shelf, and select the Use Separate Legends option to display multiple diverging colour legend controls.

It’s now a case of going through each measure and editing the colour palette, and other settings. Some of this was again a bit of trial and error for me – I chose options that worked.

For the black text fields (CY SALES, LY SALES, %DIFF), choose a diverging colour palette, then click on the coloured squares at each end and select black from the colour picker. Select Stepped Colour and reduce the steps to 2.

Apply the same concept to the BEST DAY and WORST DAY legends, but select the appropriate green or red colour instead.

For the remaining fields, select a diverging colour palette, select the appropriate red at one end, and green at the other, reduce the steps to 2

And subject to some other formatting tweaks (increase font sizes, centre text), this is enough to complete the challenge. My final published viz is here. Note, the published viz does have slight differences to what I’ve blogged… as with many things, you sometimes realise things can be simpler when you try again.

Happy vizzin’! Stay Safe!

Donna

What happens if? Can you update sales forecast and targets using only parameters?

So after Ann’s gentle workout for week 6, newly crowned Tableau Zen Master Lorna hit us with this challenge, and I confess, I struggled. The thought of then having to write this blog about it even brought a little tear to my eye 😦

But here I am, and I will do my best, but I can’t promise I understood everything that went on in this. I truly am amazed at times how some people manage to be so creative and bend Tableau to their will. It really is like #TableauBlackMagic at times!

So I read the challenge through multiple times, played around with Lorna’s published viz, stared at the screen blankly for some time…. I found the University Planning Dashboard viz by Ryan Lowers that Lorna had referenced in the challenge as her inspiration (she’d linked to it from her published viz). I played around with that a bit, although that took a while for me to get my head round too.

I also did a google search and came across Jonathan Drummey‘s blog post : Parameter Actions: Using a parameter as a data source. This provided a workbook and some step by step instructions, so I used this as my starting point. I downloaded the workbook, copied across the fields he suggested and tried to apply his instructions to Lorna’s challenge. But after a couple of hours, it felt as if I was making little progress. I couldn’t figure out whether I needed 2 or 4 parameters to store the ‘list’ data source variables (one each to store the list of selected categories for forecast, the list of selected categories for target, the list of selected forecast values, and the list of selected target values, or one each to store the list of selected categories and forecast values combined, and selected categories and target values combined). Suffice to say I tried all combos, using a dashboard to show me what was being populated on click into all the various fields/parameters I’d built. But it just wasn’t giving me exactly what I needed.

I downloaded the University Planning Dashboard and tried to understand what that was doing. And finally I shrugged my shoulders, and admitted defeat and cracked open Lorna’s solution. When I finally get to this point in a challenge, I try just to ‘have a peek’, and not simply follow verbatim what’s in the solution. I gleaned that I did need only 2 parameters, and that what I had been doing with my attempts with Jonathan’s example was pretty close. It made me feel a bit better with myself.

How things then transpired after that I can’t really recall – it was still a lot of trial and error but I finally got something that gave me the Sales Forecast data and associated select & reset functionality (by this time I’d probably spent 4 hours or so on this over a couple of evenings). Once I’d cracked that, the target was relatively straight forward, so by the time I’d finished on the 2nd day, I had a dashboard that allowed the selections/resets and simply presented the data in a table on screen. I chose to keep that version as part of my published solution, just for future reference (see here). I then finished off the next day, building the main viz.

What follows now, is just an account of the fields etc I used to build my solution. So let’s get going….

Building the Sales Forecast Selector

I’m going to start by focusing on building the left hand side of the viz, setting and resetting the Sales Forecast values for each Category.

We need 2 main parameters to start with:

Forecast Param

An integer parameter defaulted to 70,000. This is the parameter that stores the value of the forecast to set.

Forecast List

A string parameter defaulted to empty. This is the parameter which will ‘build up’ on selection of a category, to store a delimited list of category + forecast values – ie the data source parameter.

Oh, and I also used a 3rd parameter, Delimiter, which is just a string parameter storing a :

The delimiter needs to be a distinct character that mustn’t exist in the fields being used. Neither the Category field nor the Forecast Param field will contain a ‘:’, so that’s fine. But any other unused character would work just as well. Having this field as a parameter isn’t ultimately necessary, but it makes it easy to change the delimiter to use, if the chosen value doesn’t end up being suitable. It was also a field used in Jonathan Drummey’s solution I’d based my initial attempts on.

Now we need to build the viz to work as the category selector.

I simply put Category on the Rows shelf, sorting the pill by SUM(Sales) descending and set the Mark Type to circle. Oh – and I set a Data Source Filter to set the Order Date just to the year 2019.

I also needed the following

  • something to colour the circles based on whether the Category was selected or not
  • something to use to help ‘build up’ the List parameter ‘data source’
  • something to return the forecast value that had been selected against the specific Category

Category Exists in Forecast List

CONTAINS([Forecast List], [Category])

If the Category exists within the Forecast List string of text, this field will return true, and indicates the Category has been ‘selected’. This field is added to the Colour shelf, and the colour needs to be adjusted once parameter action has been applied to distinguish between true & false.

Append To Forecast List

if [Forecast Param]<>0 THEN
[Forecast List] +
[Category] + ‘_’ + STR([Forecast Param]) + [Delimiter]
ELSE ”
END

If the entered Forecast value isn’t 0, then append <Category>_<Forecast Value>: to the Forecast List parameter. Eg if the Sales Forecast value is $50,000 and Technology is selected, then Technology_50000: is added to the existing Forecast List parameter, which has started as blank.

If the Sales Forecast value is then changed to $10,000 say, and Office Supplies is selected, then the Forecast List parameter will become

Technology_50000:Office Supplies_10000:

This Add to Forecast List calculated field is used in conjunction with the Forecast List parameter within a Parameter Action on the dashboard to make all the ‘magic’ happen. The Add to Forecast List field must be in the view to be available to the parameter action, so it is added to the Detail shelf.

When a circle is selected, the Add to Forecast List field is used to ‘set’ the Forecast List parameter, subsequently building up a string of Category_Value pairs.
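To make the string building easier to follow, here is a minimal Python sketch of the same logic. It is only an illustration of what the Tableau calculations do, not something you need for the workbook; the function names are my own, and the delimiter defaults to ‘:’ as in the example above.

```python
def add_to_forecast_list(forecast_list: str, category: str,
                         forecast_param: int, delimiter: str = ":") -> str:
    """Mimics the Add to Forecast List calculated field (hypothetical helper)."""
    if forecast_param != 0:
        # Append <Category>_<Forecast Value><Delimiter> to the existing list
        return forecast_list + category + "_" + str(forecast_param) + delimiter
    # Mirrors the ELSE '' branch: a forecast of 0 returns an empty string
    return ""


def category_exists_in_forecast_list(forecast_list: str, category: str) -> bool:
    """Mimics CONTAINS([Forecast List], [Category])."""
    return category in forecast_list


# Each 'click' on a circle sets the parameter to the calculated field's value
forecast_list = ""  # the parameter starts as blank
forecast_list = add_to_forecast_list(forecast_list, "Technology", 50000)
forecast_list = add_to_forecast_list(forecast_list, "Office Supplies", 10000)
print(forecast_list)
```

Each call stands in for one click on a circle, with the parameter action writing the result back into the Forecast List parameter.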

Finally, on hover, the Category and the value of the selected sales forecast at the time must be visible on the Tooltip. To get the value at the point of selection, which isn’t necessarily the latest value visible in the Sales Forecast parameter displayed on screen, the following field is required:

Current FC Value

INT(IF CONTAINS([Forecast List],[Category]) THEN
REGEXP_EXTRACT([Forecast List],[Category]+"_(-?\d+)")
END)

This manages to pull out the number associated with the Category, so in the above example, would return 50000 for Technology and 10000 for Office Supplies.

This field has custom formatting applied : ▲"$"#,##0;▼"$"#,##0 and is added to the Tooltip shelf.

RegEx is a concept I have yet to really crack, so there is no way I’d have come up with the above on my own. I think it’s looking for the named Category followed by Underscore (_) followed by either 1 or no negative sign (-) followed by some numbers, and returns just the numeric part.
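If, like me, you find it easier to poke at a regex outside of Tableau, the same pattern can be tried out in Python. This is just a sketch to show what the pattern matches; the function name is made up, and I’ve added re.escape around the category name as a precaution, which the Tableau calculation doesn’t do.

```python
import re
from typing import Optional


def current_fc_value(forecast_list: str, category: str) -> Optional[int]:
    """Rough equivalent of the Current FC Value field (hypothetical helper)."""
    if category not in forecast_list:
        return None  # the Tableau IF has no ELSE, so it returns NULL here
    # Category name, then an underscore, then an optional minus sign and
    # one or more digits; the brackets capture just the numeric part
    match = re.search(re.escape(category) + r"_(-?\d+)", forecast_list)
    return int(match.group(1)) if match else None


example = "Technology_50000:Office Supplies_10000:"
print(current_fc_value(example, "Technology"))
print(current_fc_value(example, "Office Supplies"))
print(current_fc_value(example, "Furniture"))
```

Note that re.search stops at the first match, so in this Python version a Category that appears more than once in the string returns the value from its first entry.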

Finally, the circles shouldn’t be ‘highlighted’ when selected on the dashboard. To stop this from happening, two calculated fields are required: one named True containing the value True, and one named False containing the value False. These are both added to the Detail shelf, and a Filter Action is then required on the dashboard setting True = False. This is a technique that is now becoming a familiar one to use, having been used in earlier #WOW2020 challenges.

So my ‘selection’ sheet looks like

and when added to the dashboard, the parameter action looks like :

with the filter action looking like :

At this point, I’d suggest building a ‘test’ dashboard which contains the selection sheet, displays the Forecast List and Forecast Param parameters, and has the dashboard actions described above applied. This gives you an idea of what’s going on when a circle is selected and the value of the Forecast Param is changed.

The final part to this set up, is the ‘reset’ button, which when clicked on, empties the Forecast List parameter.

Create a new sheet, change the Mark Type to Text, and on the Text shelf add the string ‘↺’. I simply typed this ‘into’ a pill, but you could create a calculated field to store the ‘image’, which isn’t actually an image, but a special string character, that I got off my favourite ‘go to’ unicode characters website.

You then need a calculated field

Forecast List Reset

that just contains an empty string. This is added to the Detail shelf.

Put this sheet on the ‘test’ dashboard, and create another parameter action

This takes the value out of the Forecast List Reset field and sets the Forecast List parameter, subsequently resetting the list to an empty string on click.

Verify this is all working as expected.

Building the Sales Target Selector

Once the Sales Forecast selector is working as expected, apply exactly the same principles to create the Target selection sheet and associated parameters.

The only slight difference with the fields used in the Target selection is:

Add to Target List

IF [Target Param]>0 THEN
[Target List] +
[Category] + '_' + STR([Target Param]) + [Delimiter]
ELSE ''
END

This just applies the addition to the list if the entered target is a +ve number (ie > 0), rather than not 0 as in the forecast selection.

The Target also needs to be displayed on the Tooltip, and this time there is a default target value that should be displayed, even when no selection has been made. For this I created

Target

IF ZN(MAX([Current Target Value])) = 0 THEN
MIN(IF [Category]= 'Furniture' THEN 270000
ELSEIF [Category]= 'Office Supplies' THEN 260000
ELSEIF [Category]= 'Technology' THEN 250000
END)
ELSE MAX([Current Target Value]) END

which was formatted to a currency of 0 decimal places, prefixed by $. This was added to the Tooltip shelf.
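The fallback logic in the Target field can be sketched in Python as below. Again this is only an illustration with names I’ve made up; None stands in for a NULL Current Target Value, and the default values are the ones from the calculation above.

```python
from typing import Optional

# Default targets taken from the Target calculated field
DEFAULT_TARGETS = {
    "Furniture": 270000,
    "Office Supplies": 260000,
    "Technology": 250000,
}


def target(category: str, current_target_value: Optional[int] = None) -> Optional[int]:
    """Rough equivalent of the Target field (hypothetical helper)."""
    # ZN() turns NULL into 0, so 'nothing selected' and 0 are treated the same
    value = 0 if current_target_value is None else current_target_value
    if value == 0:
        return DEFAULT_TARGETS.get(category)
    return value


print(target("Furniture"))            # no selection made, so the default is used
print(target("Technology", 300000))   # a selected target overrides the default
```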

At this point, you should now have both the ‘selection sheets’ working on the dashboard, so we can now focus on building the main viz.

Building the Bar Chart

Rather than building the bar chart, I first decided to build a tabular view that simply presented on screen all the bits of data I needed for the bar chart, this being

  • Sales value per Category (simply SUM(Sales))
  • Sales Forecast value per Category (ie Sales + selected Forecast value)
  • Selected Sales Target value per Category (this is the Target field described above)
  • % Difference between Sales & Target
  • % Difference between Sales Forecast & Target

So I created the following additional calculated fields:

Forecast

SUM([Sales]) + MAX([Current FC Value])

formatted to currency prefixed with $ set to 0 dp.

Forecast vs Target Diff

([Forecast]-[Target])/[Target]

custom formatted to ▲0%; ▼0%

Sales vs Target Diff

(SUM([Sales])-[Target])/[Target]

also custom formatted to ▲0%; ▼0%
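As a quick sense check of the arithmetic, here is a small Python sketch of the three calculations. The sales figure of 200,000 is made up purely for illustration, and the arrow formatting only approximates the Tableau custom format.

```python
def forecast(sales: float, current_fc_value: float) -> float:
    """Sales plus the forecast value selected for the Category."""
    return sales + current_fc_value


def diff_vs_target(value: float, target: float) -> float:
    """Difference from target, as a proportion of the target."""
    return (value - target) / target


def arrow_percent(x: float) -> str:
    """Approximates the custom format: up arrow for >= 0, down arrow otherwise."""
    return ("▲" if x >= 0 else "▼") + f"{abs(x):.0%}"


sales = 200000          # hypothetical SUM(Sales) for Technology
fc_value = 50000        # the selected forecast value from the earlier example
target_value = 250000   # the default Technology target

fc = forecast(sales, fc_value)
print(arrow_percent(diff_vs_target(fc, target_value)))
print(arrow_percent(diff_vs_target(sales, target_value)))
```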

Adding the table to the ‘test’ dashboard allows you to sense check everything is behaving as expected

Now it’s just a case of shifting the various pills around to get the desired view. Ensure at least one Sales Forecast Category has been selected, to make it easier to ‘see’ what you’re building.

Lorna stated the target should be displayed as a Gantt mark type, with the sales and the forecast displayed as bars. This means a dual axis chart is required, with sales & forecast on one axis and target on the other.

To get Sales and Forecast onto the same axis, we need to add Category to the Rows (sorted by Sales desc) and Measure Values to the Columns, filtering to only the two measures we need.

Set the Mark Type to bar, and add Measure Names to both the Colour and the Size shelf.

Adjust colours and sizes to suit.

You might have something like


where the measures are ‘stacked’, so the bar is the length of the Sales then the length of the Forecast. We don’t want this, so need to set Stack Marks to Off (Analysis menu -> Stack Marks -> Off).

Add all the necessary fields to the Label shelf and format accordingly (you may need to widen the rows to make the labels show against each row).

Note – in my solution I created some fields to make the opening & closing bracket around the Forecast v Target Diff value only show when a Forecast had been selected, however in writing this blog, I realise it was simpler just to change the formatting of the Forecast v Target Diff to add the brackets around the number. The custom formatting was changed to : (▲0%); (▼0%)

Adjust the Tooltip to suit too.

Now add Target to Columns alongside Measure Values. Set to Dual Axis and Synchronise the axis. Reset the Measure Values mark type back to bar if needed, and set the Target mark type to Gantt.

Remove Measure Names from the Colour and Size shelf of the Target marks card. Untick Show Mark Labels too. Adjust the colour of the mark to suit, and you should pretty much be there now…

Tidy up the final bits of formatting, removing/hiding the various axes, labels, gridlines etc.

When this is all put together on the dashboard, you might need to fiddle about a bit with layout containers to get the bar chart lined up with the Selector views.

And with that I’m done! My published version is here, along with the ‘check’ dashboard I used to sense check what was going on, as I’m sure if I ever looked at my solution again, I’d struggle to understand immediately 🙂

Once again, I just want to acknowledge those that manage to create this magic with Tableau. I applaud you!

Happy vizzin’!

Donna

Which months have the higher number of orders?

For week 29 of #WorkoutWednesday2019, Luke Stanke set the challenge above (described here), which is comparing the overall average of orders placed per day against the average for each month.

On the face of it, this didn’t seem too bad (especially for a challenge set by Luke).  Some days when I tackle these challenges, the path I take can be long and arduous with several false starts along the way; the result being more of a ‘happy accident’ than anything of real coherence. Luckily for me, since I’d promised to start blogging on these challenges, this wasn’t one of them.

My usual approach to any of these challenges is to take some time reviewing the published viz on Tableau Public; hovering over the various marks to understand what’s on the tooltips and seeing if I can get any clues into how the various objects (marks, views, legends, titles etc) have been rendered, whether axes are being sneakily used, understanding the interactivity at play etc.

My next step is to then get the figures right based on what’s presented.  I typically like to create a ‘data’ sheet in the workbook – a tabular view of the data and associated calculations I have built, so I can easily sense check with the published viz whether my assumptions and computations are valid, and it provides a useful reference point if I’m trying to figure out what I did sometime later. 

So I started this challenge the same way.

First up, I assumed the ‘line’ was probably a reference line, and hovering over the viz confirmed this.  Everything else seemed pretty straightforward, so onto the figures.

1. Average no of orders per day per segment

This is what the line represents, and to do this I need to find the total number of orders placed per segment and the total number of days on which orders were placed for each segment. The average is then just count orders / count days.

Some people will create everything in a single calculated field, but I like to break things up to help me troubleshoot if things don’t quite work as intended, so I ended up with 3 calculated fields, and since the overall average was required at a level higher than the level of detail being displayed (which is month in this viz), I figured LoD calculations were the way to go.

Count Orders Per Segment

{FIXED [Segment]: COUNTD([Order ID])}

For each segment, count the number of distinct orders that exist.

Count Days Per Segment

{FIXED [Segment]: COUNTD([Order Date])}

For each segment, count the number of distinct days on which an order was placed.

Overall Avg Orders Per Day Per Segment

SUM([Count Orders Per Segment]) / SUM([Count Days Per Segment])

Format this to 2dp.

2. Avg no of orders per day per segment per month

This is what each coloured ‘bar’ represents, and for this I needed to find the total number of orders placed per segment per month, and the total number of days in each month on which orders were placed for each segment.   I chose to stick with LoDs again for this :

Count Orders Per Segment Per Month

{FIXED [Segment], MONTH([Order Date]): COUNTD([Order ID])}

For each segment and month, count the number of distinct orders that exist.

Count Days Per Segment Per Month

{FIXED [Segment], MONTH([Order Date]): COUNTD([Order Date])}

For each segment and month, count the number of distinct days that an order was placed.

Avg Orders Per Day Per Segment Per Month

SUM([Count Orders Per Segment Per Month]) / SUM([Count Days Per Segment Per Month])

Format this to 2dp.
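If it helps to see what the two sets of LoD calculations are doing, here is a minimal Python sketch over a handful of made-up rows (the order IDs and dates are invented for illustration). It counts distinct orders and distinct order dates, first per segment and then per segment and month number.

```python
from collections import defaultdict
from datetime import date

# Made-up rows: (segment, order id, order date); an order can span several rows
rows = [
    ("Consumer", "A1", date(2019, 1, 5)),
    ("Consumer", "A1", date(2019, 1, 5)),   # second line item of the same order
    ("Consumer", "A2", date(2019, 1, 5)),
    ("Consumer", "A3", date(2019, 1, 9)),
    ("Consumer", "A4", date(2019, 2, 2)),
    ("Corporate", "B1", date(2019, 1, 7)),
]


def avg_orders_per_day(rows, key):
    """Distinct orders / distinct order dates for each group returned by key."""
    orders, days = defaultdict(set), defaultdict(set)
    for segment, order_id, order_date in rows:
        group = key(segment, order_date)
        orders[group].add(order_id)    # like COUNTD([Order ID])
        days[group].add(order_date)    # like COUNTD([Order Date])
    return {group: len(orders[group]) / len(days[group]) for group in orders}


# Like {FIXED [Segment]: ...}
overall = avg_orders_per_day(rows, lambda segment, d: segment)
# Like {FIXED [Segment], MONTH([Order Date]): ...}
monthly = avg_orders_per_day(rows, lambda segment, d: (segment, d.month))

print(round(overall["Consumer"], 2))
print(monthly[("Consumer", 1)], monthly[("Consumer", 2)])
```

In this toy data Consumer has 4 distinct orders over 3 distinct days overall, while January has 3 orders over 2 days.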

Putting these fields out in a table, you can see the first three columns contain the same values for each segment even though the data is being displayed at the month level.  The final 3 columns are the monthly figures.  These numbers all reconcile back to the data displayed on the viz.

So I’m heading in the right direction, now onto the next bit.

The requirement is to “Label the bars with the percent difference between the monthly value and the overall value”, so I need another field…

3.  % Difference

…which is basically the difference between column 3 and column 6 above, as a proportion of column 3

([Avg Orders Per Day Per Segment Per Month]-[Overall Avg Orders Per Day Per Segment])/[Overall Avg Orders Per Day Per Segment]

By default though, this shows 0s when added to the table as the numbers calculated are actually 0.38, -0.33 etc, so it needs formatting as a percentage to 0 decimal places.  However, applying the standard ‘percentage’ number format won’t quite cut it for this challenge, as Luke has labelled the positive numbers with a + too.  To get this, I need to apply custom formatting.
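To illustrate the result I’m after (not the Tableau format string itself), here is a tiny Python sketch of the % difference calculation with a sign always shown on the label. The function names and the input values are my own, chosen just for the example.

```python
def pct_difference(monthly_avg: float, overall_avg: float) -> float:
    """Difference between monthly and overall, as a proportion of overall."""
    return (monthly_avg - overall_avg) / overall_avg


def signed_percent(x: float) -> str:
    """Percentage to 0 decimal places, with an explicit + on positive values."""
    return f"{x:+.0%}"


print(signed_percent(0.38))
print(signed_percent(-0.33))
print(signed_percent(pct_difference(3.0, 2.0)))
```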

The easiest way to get this right, I find, is to use one of the default number formatting options to set the number in whatever ‘main’ format you need, eg if you have a monetary value to display in £k, use the Currency (Custom) option to get all the settings right. In this instance though I want Percentage set to 0 dp.

Once done, press ok to close the dialog box, then go to set the number format again, but this time choose Custom.  The formatting ‘style’ applied previously will be shown

and it can then be modified to get the desired format, in this instance I change to

It’s often these little formatting tips that get thrown into the #WorkoutWednesday challenges that I love the most, although sometimes they can be tucked away and hard to find (or remember).

So at this point I now think I can start to build the viz, so adding the various pills I need, and adding a reference line I get this…

A line chart isn’t what I want though, and it’s not bars either, as they start from 0.  The mark type I need is gantt

which makes me realise I need to create another field…

4.  Difference

The gantt chart has the marks in the right position, but to make the ‘bars’ I need to alter the size, and that size is the difference between the mark position and reference line, which is

([Avg Orders Per Day Per Segment Per Month]-[Overall Avg Orders Per Day Per Segment])

Placing this field on size, gives me this but while the bars are the right height, they’re not in the right place.

Using gantt bars in this way is akin to the technique used in building waterfall charts, and is rectified simply by applying a multiple of -1 to the pill on the size shelf (in this instance [Difference]).

As a shortcut, I simply type this into the pill on the size shelf itself, which is a nifty little trick.
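The reason the -1 works can be shown with a little arithmetic. This Python sketch assumes the gantt mark starts at the monthly average and extends by whatever is on the size shelf; the numbers are made up.

```python
def gantt_bar(monthly_avg: float, overall_avg: float):
    """Return (start, end) of the gantt bar when size is -1 * Difference."""
    difference = monthly_avg - overall_avg
    start = monthly_avg          # the mark sits at the monthly value
    size = -1 * difference       # flipping the sign points the bar at the line
    return start, start + size


# A month above the line: the bar runs from 2.5 back down to 2.0
print(gantt_bar(2.5, 2.0))
# A month below the line: the bar runs from 1.5 up to 2.0
print(gantt_bar(1.5, 2.0))
```

Either way the bar ends on the overall average, which is where the reference line sits. Without the -1, the bar would extend away from the line instead.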

And voila! The viz is essentially there now.  Just need to add colour and further formatting …

The ‘bars’ are coloured based on whether they’re above or below the line, ie whether the difference is positive or negative, so another field is needed :

5. COLOUR : Difference

IF [Difference]>=0 THEN 'green' ELSE 'blue' END

And popping this on the colour shelf, and adjusting the colours to suit, gives me

The main thing left now is the little formatting bits and pieces :

Borders : remove columns

Axis : remove title, set to be independent, and set not to start at zero

Reference Line : change the label font size, align left middle, and set the shading to have a white background and 100% opacity

Reference Line Tooltip (new feature in v2019.2): set to Custom as below

Note if I’d simply called my field ‘Average’ rather than ‘Overall Avg Orders Per Day Per Segment’, I wouldn’t have needed this step, though it’s always useful to try out the new ‘little’ features if you can 🙂

Label : add [% Difference] to label shelf, and format centre middle

Tooltip : add relevant fields and adjust formatting as appropriate

The final thing I noticed was the axis scale – my scale was 2dp due to the formatting of my [Avg Orders Per Day Per Segment Per Month] field. Luke’s axis was mixed – some scales at 1dp and some with 0.

I tried a few things, like formatting the axis to be Number (Standard) which has the effect of ‘automatically’ showing a number as a decimal or a whole numbers (something I recall from a very early WorkoutWednesday challenge a couple of years back).  But this didn’t give me the desired effect.  I ended up setting the axis format to be 1 dp.  But this then meant the value on the tooltip also ended up displaying as 1dp, when I wanted it to be 2 *sigh*

To fix this I created a duplicate field of the measure being displayed (a copy of [Avg Orders Per Day Per Segment Per Month]), formatted it to 2dp, and placed this on the tooltip instead.

I’m going to have to have a peek at Luke’s solution to figure out what magic he’s done here….

Very final step was to add to a dashboard, and add the title and my own custom footer.

So phew done!  My published version is here.

If you’ve got this far, thank you for reading 🙂 I can’t guarantee all write ups will be to this level – it’ll partly depend on the challenge itself, and what path I head down to solve it.

I’m now off to have a peek at Luke’s challenge to figure out that pesky axis……. or I would if his workbook was downloadable 😦

Happy vizzin!

Donna