Can you build a Control Chart?

Lorna Brown returned this week to set another table calculation based challenge involving a line chart which ‘on click’ of a point, exposed the ability to ‘drill down’ to view a tabular view of the data ‘behind’ the point. This is classic Tableau in my opinion – show the summarised data (in the line chart) and with ‘drill down on demand’. Lorna added some additional features on the dashboard; hiding/showing filter controls to change how the data is displayed in the chart, and a back navigational button on the ‘detail’ list.

The areas I’m going to focus on in this blog are

  • Setting up the parameters
  • Defining the date to plot on the chart
  • Restricting the data to the relevant years
  • Defining the reference bands
  • Colouring the marks
  • Working out the date range for the tooltip
  • Building the table
  • Drill down from chart to table
  • Un-highlight selected marks
  • Hide/Show filter controls
  • Add navigation button

Setting up the parameters

This challenge requires 3 parameters.

Select a Date

a string parameter containing the 2 options available (Date Submitted & Date Selected) for selection, which when displayed on the dashboard will be set to a single value list control (ie radio buttons)

Latest X Years

an integer parameter, defaulted to 3, which allows a range of values from 1 to 5.

NOTE – Ensure the step size is set to 1, as this is what allows the Show buttons option to be enabled when customising the Slider parameter control type.

STD

another integer parameter, defaulted to 1, that allows a range of values from 1 to 3

Defining the date to plot on the chart

The Select a Date parameter is used to switch the view between different dates in the data set. This means you can’t plot your chart based on date field that already exists in the data set. We have to create a new field that determines which date field to select based on the parameter

Date to Plot

DATE(DATETRUNC(‘week’, IIF([Select a Date]=’Date Submitted’,[Date sent to company], [Date received])))

The nested IIF statement, is basically saying, if the parameter is ‘Date Submitted’ then use the Date sent to company field, else use the Date received field. This is all wrapped within a DATETRUNC statement to reset all the dates to the 1st day of the week (since the requirement is to report at a weekly level).

Note – there was some confusion which field the parameter option should map to. I have chosen the above, but you may see solutions with the opposite. Don’t get hung up on this, as the principal of how this all works is most important.

Restricting the data to the relevant years

The requirement is to show ‘x’ years worth of data, where 1 year’s worth of data is the data associated to the latest year in the data set (ie from 01 Jan to latest date, rather than 12 months worth of data). So to start with I calculated, rather than hardcoded, the maximum year in the data

Year of Latest Date

YEAR({MAX([Date to Plot])})

Then I could work out which dates I wanted via

Dates to Include

[Date to Plot]>= MAKEDATE([Year of Latest Date] – ([Latest X Years]-1),1,1)

In the MAKEDATE function, I’m building a date that is the 1st Jan of the relevant based on how many years we need to show.

So if Year of Latest Date is 2020 and Latest X Years =1 then Year of Latest Date – (Latest X Years -1) = 2020 – (1-1) = 2020 – 0 = 2020. So we’re looking for dates >= 01 Jan 2020.

So if Year of Latest Date is 2020 and Latest X Years =3 then Year of Latest Date – (Latest X Years -1) = 2020 – (3-1) = 2020 – 2 = 2018. So we’re looking for dates >= 01 Jan 2018.

This field is added to the Filter shelf and set to true.

So at this point, our basic chart can be built as

  • Year on Columns (where Year = YEAR([Date to Plot])), and allows the Year header to display at the top
  • Date to Plot on Columns, set to Week Number display via the pill dropdown option, and also set to be discrete (blue pill). This field is ultimately hidden on the display.
  • Number of Complaints on Rows (where Number of Complaints = COUNT([XXXX.csv], the auto generated field relating to the name of the datasource).

To get the line and the circles displayed, this needs to become a dual axis chart by duplicating the Number of Complaints measure on the Rows, synchronising the axis and setting one instance to be a line mark type, and the other a circle.

Defining the reference bands

The reference bands are based on the number of standard deviations away from the mean/ average value per year.

Avg Complaints Per Year

WINDOW_AVG([Number of Complaints])

Once we have the average, we need to define and upper and lower limit based on the standard deviations

Upper Limit

[Avg Complaints Per Year] + (WINDOW_STDEV([Number of Complaints]) * [STD])

Lower Limit

[Avg Complaints Per Year] + (WINDOW_STDEV([Number of Complaints]) * -1 * [STD])

Add both these fields to the Detail shelf of the chart viz (at the All marks card level) and set the table calculation of each field to Compute By Date to Plot

This ‘squashes’ everything up a bit, but we’ll deal with that later.

Add a Reference Band (right click on axis – > Add Reference Line) that ranges from the Lower Limit to Upper Limit.

If an Average Line also appears on the display, then remove it, by right clicking on the axis -> Remove Reference Line – > Average

Colouring the marks

I created a boolean field based on whether the Number of Complaints is within the Upper Limit and Lower Limit

Within STD Limits?

[Number of Complaints]<[Upper Limit] AND [Number of Complaints]>[Lower Limit]

Add this to the Colour shelf of the circle mark type, and set to Compute Using Date to Plot. The values will be True, False or Null. Right click on the Null option in the Colour Legend, and select Exclude. This will add the Within STD Limits? to the Filter shelf, and the chart will revert back to how it was. Adjust the colours accordingly.

The Tooltip doesn’t show true or false though, so I had another field to use on that

In or Out of Upper or Lower Limits?

If [Within STD Limits?] THEN ‘In’ ELSE ‘Out’ END

Working out the date range for the tooltip

The Tooltip shows the start and end of the dates within the week. I simply built 2 calculated fields to show this.

Date From

[Date to Plot]

This 2nd instance is required as Date to Plot is formatted to dd mmm yyyy format and also used in the Tooltip. Whereas Date From is displayed in a dd/mm/yyyy format.

Date To

DATE(DATEADD(‘day’, 6, [Date to Plot]))

Just add 6 days to the 1st day of the week.

Building the table

Create a new sheet and add all the relevant columns required to Rows in the required order. For the last column, Company response to consumer, add that to the Text shelf instead (to replace the usual ‘Abc’ text). The in the Columns shelf, double click and type in ‘Company response to consumer’ which creates a ‘fake’ column heading. Format all the text etc to make it all look the same.

Add the Dates to include = true filter.

Also add the WEEK(Date to Plot) field to the Rows shelf, as a blue discrete field (exactly the same format as on the line chart). But hide this field (uncheck Show Header). This is the key linking field from the chart to the detail.

Drill down from chart to table

Create one dashboard (Chart DB) that displays the chart viz. And another dashboard that displays the table detail (Table DB). On the Chart dashboard, add a Filter Dashboard Action (Dashboard menu -> Actions -> Add Action -> Filter), that starts from the Chart sheet, runs as a Menu option, and targets the Detail sheet on the Detail dashboard. Set the action to exclude all values when no selection has been made. Name the action Click to Show Details

On the line chart, if you now click a point on the chart, the tooltip will display, along with a link, which when clicked on, will then take you to the Detail dashboard and present you with the list of complaints. The number of rows displayed should match the number you clicked on

Un-highlight selected marks

What you might also notice, is when you click on a point on the chart, the other marks will all ‘fade’ out, leaving just the one you selected highlighted. It’s not always desirable for this to happen. To prevent this, create a new field called Dummy which just contains the text ‘Dummy’. Add this onto the Detail shelf of the All marks card on the chart viz.

Then on the chart dashboard, add another dashboard action, but this time choose a highlight action. Set the action to run on select and set the source & target sheets to be the same sheet on the same dashboard. But target highlighting to selected fields, and select the Dummy field only

Hide/Show filter controls

Check out this post by The Data School that explains very simply how to work with floating containers to show/hide buttons. When creating in Desktop, the ‘onclick’ interactivity won’t work, you’ll have to manually select to show and hide, but once published to Tableau Public, it’ll behave as desired.

You have options to customise what the button looks like when the container contents are hidden, and what it looks like when they’re shown, via the Button Appearance

Add Navigation Button

On the Detail dashboard, simply add a Navigation object to the dashboard

and edit the settings to navigate back to the chart dashboard, as well as customise the appearance

Hopefully I’ve covered all the key features of this challenge. My published viz is here.

Happy vizzin’! Stay safe!

Donna

Can you build a Customer Lifetime Value Matrix?

Ann Jackson challenged us this week to build this matrix depicting the average worth of customer cohorts during their lifetime.

This challenge involves a mix of LoDs (Level of Detail calculations) and table calculations.

First up , we need to define our customer cohorts (ie group the customers), which for this challenge is based on identifying the quarter they placed their first order in. This will involve an LoD calculation. For a good introduction to LoDs with some worked examples (including a similar cohort analysis example), check out this Tableau blog post.

ACQUISITION QUARTER

DATE(DATETRUNC(‘quarter’,{FIXED [Customer ID] : MIN([Order Date])}))

The 2nd part of the formula in the { … } returns the earliest Order Date associated to the Customer ID, which is then truncated to the 1st day of the quarter that date falls in ie 23 Feb 2019 is truncated to 01 Jan 2019.

For the ‘quarters since birth’ field, we need to calculate the difference in quarters, between the ACQUISITION QUARTER and the ‘quarter’ associated to the Order Date of each order in the dataset.

QUARTERS SINCE BIRTH

DATEDIFF(‘quarter’,[ACQUISITION QUARTER],DATETRUNC(‘quarter’,[Order Date]))

Drag this field into the ‘dimensions’ area of the left hand data pane (above the line if you’re using later versions of Tableau).

Lets sense check what this looks like, by adding

  • ACQUISITION QUARTER to Rows (Discrete, Exact Date)
  • ORDER DATE to Rows, set to Quarter (quarter year ie May 2015 format which will make a green pill), then set to discrete to change to blue
  • QUARTERS SINCE BIRTH to Rows

You can see that while the first row against each cohort starts with a different quarter, the QUARTERS SINCE BIRTH always starts at 0 and counts sequentially down the table.

Next we want to count the number of distinct customers in each cohort, and we’ll use another LOD for this.

CUSTOMERS

{FIXED [ACQUISITION QUARTER]: COUNTD([Customer ID])}

Once again move this field into the Dimensions section of the data pane.

Add this onto the Rows of the above data table, and you should get every row for the same cohort displaying the same number

Add Sales onto Text to get the value of sales made by the customer in each cohort in each quarter. The ‘customer lifetime value’ we need is defined as the total sales so far / number of customers in the cohort.

Remove the QUARTER(Order Date) field from the table, as we’re not going to need this for the display, and it’ll affect the next steps if it’s left.

To get the cumulative sales, we need a Running Total Quick Table Calculation. Click on the Sales pill on the Text shelf and select Quick Table Calculation -> Running Total. The click again and Compute By -> QUARTERS SINCE BIRTH. Add Sales back into the table, so you can see the quarterly Sales value and how it’s cumulating until it reaches the next cohort.

We’ve now got the building blocks we need for the CLTV value we need to plot

Avg Lifetime Value

RUNNING_SUM(SUM([Sales])) / SUM([CUSTOMERS])

Note – I purposefully haven’t called this field what you might expect, as I’m going to ‘fill in the gaps’ that Ann describes in the requirements, and I’ll use that name then.

Pop this field into the table above, again setting the table calculation to compute by QUARTERS SINCE BIRTH

You can now use the data table above to validate the calculation is what you expected.

Now let’s build the viz out.

On a new sheet

  • QUARTERS SINCE BIRTH to Columns
  • ACQUISITION QUARTER (exact date, discrete blue pill) to Rows
  • Avg Lifetime Value to Text, setting the table calculation to Compute By QUARTERS SINCE BIRTH

From this basic text table, you can see the ‘blank’ fields, Ann mentioned. In the data table view, it’s not so obvious. The blank is there because there are no sales in those quarters for those cohorts. To fix we need another table calculation

CUSTOMER LIFETIME VALUE (CLTV)

IF ISNULL([Avg Lifetime Value]) AND NOT ISNULL(LOOKUP([Avg Lifetime Value],-1)) AND NOT ISNULL(LOOKUP([Avg Lifetime Value],1))

THEN LOOKUP([Avg Lifetime Value],-1) ELSE [Avg Lifetime Value] END

This says, if the Avg Lifetime Value field is NULL but neither the previous or the subsequent values are NULL, then use the Avg Lifetime Value value from the previous column (LOOKUP).

Replace the Avg Lifetime Value with the CUSTOMER LIFETIME VALUE (CLTV) field (setting the Compute By again), and the empty spaces have disappeared.

If you hover over the cells in the lower right hand side of the view, you’ll see tooltips showing, indicating that a mark has been drawn on the viz with Null data. To fix this, add CUSTOMER LIFETIME VALUE (CLTV) to the Filter shelf and specify non-null values only to show via the Special tab.

Now if you hover over that area you don’t get any tooltips displaying, as there aren’t any marks there.

Now it’s just a case of formatting the viz a bit more

  • Add CUSTOMERS to Rows
  • Add CUSTOMER LIFETIME VALUE (CLTV) to the Colour shelf by holding down the Ctrl key then clicking on the field that’s already on the Text shelf, and dragging via the mouse onto the Colour shelf. Using Ctrl in this way has the effect of copying the field including the table calculation settings, so you don’t need to apply them again. This will change the colour of the Text.
  • Then change the mark type to Square, which will then fill out the background based on the colour.
  • Then edit the colour legend to the relevant palette (which you may need to install via Ann’s link).
  • Set the border of the mark via the Colour shelf to white
  • Remove the row & column dividers
  • Set the row Axis Ruler to a dark black/grey line
  • Format the 1st 2 columns so the font is the same and centred. Widen the columns if required.
  • Update the tooltip

And then you should be ready to add the viz to your dashboard. My published version is here.

This blog is a bit more detailed that my recent posts, but I’m also conscious I’ve skipped over some bits that if you’re brand new to Tableau, you may not be sure how to do. Feel free to add comments if you need help!

Happy vizzin’! Stay Safe!

Donna

Can you find the variance along a line?

So after a couple of weeks off blogging due to Christmas, I’m back providing solutions for #WOW2021! New contributor Candra Mcrae started off the year with this gentle workout focussing on table calculations. For those who have followed my previous blogged solutions, you’ll probably recall that table calcs don’t phase me as much as they do some – I started using Tableau before LODs existed, so they were the only tool at my disposal in the past. That said, I’m still not such an expert that I get it right first time – there’s often plenty of trial and error as I choose which fields I want to compute by, although I’ve got much better at this since I read Andy Kriebel’s Table Calculations Overview a couple of years ago.

As with any challenge that involves table calcs, I tend to start by building out a tabular view of all the data I’m going to need to build the viz. This ensures I can validate the data much easier and set the table calculation settings to what I need.

So let’s start….

For the line chart itself, we’re simply going to be plotting Year against Food insecurity which has been formatted to 1 decimal place and displays the % as a suffix

We will also need to capture 2 other values which will represent the coloured circles. Parameters are needed to help identify what these circles will be.

pSelected Year is an integer parameter that can be built by right clicking on Year and selecting Create -> Parameter. This will populate the parameter with all the values of the Year field. Default the value to 2018 and adjust the formatting of the displayed value so that it is 0 decimal places and include thousand separators is unchecked.

With this parameter, we can capture the value associated which is represented by the pink dot in the viz

Selected Year %

IF [Year] = [pSelected Year] THEN [Food insecurity (includes low and very low food security) Percent of households] END

It’s a bit more work to identify the black dot, as this will vary based on another parameter

pComparison

I created this an integer parameter storing the values 0, 1 & 2 (defaulted to 2), but aliased for display as First Year, Most Recent Year, Previous Year. I’ll be using this is an case statement/if clause later and comparing integers to strings is much more efficient in Tableau.

I like to take things step by step, especially when there’s table calcs involved, to ensure all the values I’m referencing are correct, so rather than identifying the selected value in a single calculated field, I’m using multiple.

Firstly I want to identify the 1st year in the dataset (without hardcoding).

First Year

WINDOW_MIN(MIN([Year]))

This will store the value of the earliest year (1995) against every row in the data that is outputted in the view (a bit like you would get with {FIXED : MIN([Year])}, but this is a no LOD challenge).

From this I can work out

First Year %

IF MIN([Year]) = [First Year] THEN MIN([Food insecurity (includes low and very low food security) Percent of households]) END

Notice the MIN() functions used in this statement, as opposed to the Selected Year % above. This is because First Year is a table calc which is an aggregation, and subsequently other fields referenced in the calculation also need to be aggregated. In this case other aggregations such as AVG, MAX, ATTR would also suffice.

Similarly, I’m going to derive the Latest Year % with

Latest Year

WINDOW_MAX(MAX([Year]))

Latest Year %

IF MIN([Year]) = [Latest Year] THEN MIN([Food insecurity (includes low and very low food security) Percent of households]) END

Finally, I’m also going to work out the value for the previous year

Previous Year

[pSelected Year]-1

Previous Year %

IF [Year] = [Previous Year] THEN [Food insecurity (includes low and very low food security) Percent of households] END

With these 3 % fields, I can now create a fourth field which stores the value of the % I want to compare with

Selected Comparison %

CASE [pComparison]
WHEN 0 THEN [First Year %]
WHEN 1 THEN [Latest Year %]
WHEN 2 THEN SUM([Previous Year %])
END

This ultimately just stores the 2nd value required for the black dot. This could all have been written within this single calculation, but I find it easier to troubleshoot if things are broken down a bit.

Putting these onto a table we can see how the values in each row change, as the parameters are changed.

Note, for all the table calculation fields (all denoted by the triangle symbol on the pill), I have explicitly set them to Compute Using the specific dimension of Year rather than the default of table down. While this will give the same result, I prefer to be explicit to ensure the values don’t change if pills get subsequently moved around the canvas (in the case of the Selected Comparison % field, all Nested Calculations within the Edit Table Calculation dialog box need to be set).

This is enough information to build the main viz itself by

  • adding Year to Columns (green continuous pill)
  • adding Food insecurity to Rows
  • adding Selected Year % to Rows next to Food insecurity
  • then drag Selected Comparison % to the Selected Year % axis which will automatically change the display to have Measure Values on Rows instead. Set the table calculation setting to compute by Year

This chart can then be set to be dual-axis and the axes synchronised. The Food insecurities should remain as as line mark, and the Measure Values should be a circle. The colours, formatting and tooltip then need to be applied.

Now we need to go back to our table of data to build out other calculations. The requirement is a single viz, so we need to provide the % value of the selected year within the title of the chart, along with the difference from the comparison value. For this to work, we need to store the relevant values against every row in the data set.

We already have the Selected Year % value identified, but this is only captured against the row of the selected year in the data output. To get it to display against every row we need

Window Max – Selected Year %

WINDOW_MAX(MIN([Selected Year %]))

This is formatted to 0 dp with the % sign as suffix, as this is the field that will be displayed in the title. Added to our data table, with the table calc set to compute by Year, you can see the value replicated across every row

Similarly, we already have the Selected Comparison % captured, but in order to work out the difference, we also need to get this value against every row too

Window Max – Selected Comparison %

CASE [pComparison]
WHEN 0 THEN WINDOW_MAX([First Year %])
WHEN 1 THEN WINDOW_MAX([Latest Year %])
WHEN 2 THEN WINDOW_MAX(MAX([Previous Year %]))
END

Adding this in to the table (and remembering to set the table calc settings), you can also see the relevant value perpetuated against every row. Change the pComparison value and you’ll see the values change accordingly, but still the same on every row.

So for the difference…. at this point I chose to deviate from the solution published. We’re already dealing with a measure that is quantified as a % (as opposed to a measure that is quantified as £ say). So to get the % difference between two percentage measures, I simply chose to show the difference between the two values (ie selected – comparison). Some would refer to this a the point difference between the values. This makes most sense to me for this particular scenario. The alternative is to calculate the % difference in the more traditional way of (selected – comparison) / comparison as you may do if you were presenting the % difference between the value of sales in different years. But I personally found the result could be confusing to the reader .

Difference

[Win Max – Selected Year %]-[Win Max – Selected Comparison %]

BUT, the value displayed within the title is the absolute value (ie no -ve sign) and actually doesn’t display if the value is 0 (which you get if you select the year to be 2019 and comparison to most recent year). So I resolved this with

DISPLAY : Difference

IF [Difference] <> 0 THEN ABS([Difference]) END

which is formatted to 1 dp with % suffix

Finally, we have a symbol that is used to indicate if the change is +ve, -ve or remains the same.

Difference Indicator

IF [Difference] > 0 THEN ‘▲’
ELSEIF [Difference] < 0 THEN ‘▼’
ELSE ‘N/C’
END

Add this into the table too, once again remembering to set the table calc properties for all nested calculations.

So now you have these fields, you can add Window Max – Selected Year % and DISPLAY : Difference and Difference Indicator to the Detail shelf of the All marks card on the chart viz, and these will then be available in title dialog to add. Once again, ensure you set the table calcs to compute by Year.

You’ll also need to add some spacing in the title, to allow the parameter controls to be ‘floated’ into place on the dashboard. Getting the position right is really tricky. I positioned it very carefully on Desktop, but when I published, the controls were in completely different places. The easiest way I found to resolve this, was to use the Edit feature in Tableau public to edit the dashboard online and move the objects that way.

My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Can you show the Year-on-Year Trend?

It’s Community Month over at #WOW HQ this month, which means guest posters, and Kyle Yetter kicked it all off with this challenge. Having completed numerous YoY related workbooks both through work and previous #WOW challenges, this looked like it might be relatively straight forward on the surface. But Kyle threw in some curve balls, which I’ll try to explain within this blog. The points I’ll be focussing on

  • YoY % calculation for colouring the map
  • Displaying the circles on the map
  • Restricting the Date parameter to 1st Jan – 14th July only
  • Showing Daily or Weekly dates on the viz in tooltip
  • Restricting to full weeks only (in weekly view)

YoY % calculation

The data provided includes dates from 1st Jan 2019 to 21st July 2020. We need to be able to show Current Year (CY) values alongside Previous Year (PY) values and the YoY% difference. I built up the following calculations for all this

Today

#2020-07-15#

This is just hardcoded based on the requirement. In a business scenario where the data changes, you may use the TODAY() function to get the current date.

Current Year

YEAR([Today])

simply returns 2020, which I could have hardcoded as well, but I prefer to build solutions as if the data were more dynamic.

CY

IF YEAR([Subscription Date]) = [Current Year] THEN [Subscription] END

stores the value of the Subscription field but only for records associated to 2020

PY

IF YEAR([Subscription Date]) = [Current Year]-1 THEN [Subscription] END

stores the value of the Subscription field but only for records associated to 2019 (ie 2020-1)

YoY%

(SUM([CY])- SUM([PY]))/SUM([PY])

format this to a percentage with 0 decimal places. This ultimately is the measure used to colour the map. CY, PY & YoY% are also referenced on the Tooltip.

Displaying circles on the map

This is achieved using a dual axis map (via a second instance of the Latitude pill on Rows). One ‘axis’ is a map mark type coloured by the YoY% and the other is a circle mark type, sized by CY, explicitly coloured black.

The Tooltip for the circle mark type also shows the % of Total subscriptions for the current year, which is a Percent of Total Quick Table Calculation

Restricting the Date parameter to 1st Jan – 14th July only

As mentioned the Subscription Date contains dates from 01 Jan 2019 to 21 July 2020, but we can’t simply add a filter restricting this date to 01 Jan 20 to 14 Jul 20 as that would remove all the rows associated to the 2019 data which we need available to provide the PY and YoY% values.

So to solve this we need a new date field, and we need to baseline / normalise the dates in the data set to all align to the same year.

Baseline Date

//set all dates to be based on current year
MAKEDATE([Current Year], MONTH([Subscription Date]), DAY([Subscription Date]))

So if the Subscription Date is 01 Jan 2019, the equivalent Baseline Date associated will be 01 Jan 2020. The Subscription Date of 01 Jan 2020 will also have a Baseline Date of 01 Jan 2020.

We also want to ensure we don’t have dates beyond ‘today’

Include Dates < Today

[Baseline Date]< [Today]

Add Include Dates < Today to the Filter shelf, and set to True.

Add Baseline Date to the Filter shelf, choose Range of Dates , and by default the dates 01 Jan 2020 to 14 Jul 2020 should be displayed

Select to Show Filter, and when the filter displays, select the drop down arrow (top right) and change to Only Relevant Values

Whilst you can edit the start and end dates in the filter to be before/after the specific dates, this won’t actually use those dates, and the filter control slider can only be moved between the range we want.

The Baseline Date field should then be custom formatted to mmmm dd to display the dates in the January 01 format.

Showing Daily or Weekly dates on the viz in tooltip

The requirements state that if the date range selected is <=30 days, the trend chart shown on the Viz in Tooltip should display daily data, otherwise it should be weekly figures, where the week ‘starts’ on the minimum date selected in the range.

There’s a lot going on to meet this requirement.

First up we need to be able to identify the min & max dates selected by the user via the Baseline Date filter.

This did cause me some trouble. I knew what I wanted, but struggled. A FIXED LOD always gave me the 1st Jan 2020 for the Min Date, regardless of where I moved the slider, whereas a WINDOW_MIN() table calculation function caused issues as it required the data displayed to be at a level of detail that I didn’t want.

A peak at Kyle’s solution and I found he’d added the date filters to context. This means a FIXED LOD would then return the min & max dates I was after.

Min Date

{MIN([Baseline Date])}

Note this is a shortened notation for {FIXED : MIN([Baseline Date])}

Max Date

{MAX([Baseline Date])}

With these, we can work out

Days between Min & Max

DATEDIFF(‘day’,[Min Date], [Max Date])

which in turn we can categorise

Daily | Weekly

IF [Days between Min & Max]<=30 THEN ‘Daily’ ELSE ‘Weekly’ END

We also need to understand the day the weeks will start on.

Day of Week Min Date

DATEPART(‘weekday’,[Min Date])

This returns a number from 1 (Sunday) to 7 (Saturday) based on the Min Date selected.

Using this we can essentially ‘categorise’ and therefore ‘group’ the Baseline Date into the appropriate week.

Baseline Date Week

CASE [Day of Week Min Date]
WHEN 1 THEN DATETRUNC(‘week’,([Baseline Date]),’Sunday’)
WHEN 2 THEN DATETRUNC(‘week’,([Baseline Date]),’Monday’)
WHEN 3 THEN DATETRUNC(‘week’,([Baseline Date]),’Tuesday’)
WHEN 4 THEN DATETRUNC(‘week’,([Baseline Date]),’Wednesday’)
WHEN 5 THEN DATETRUNC(‘week’,([Baseline Date]),’Thursday’)
WHEN 6 THEN DATETRUNC(‘week’,([Baseline Date]),’Friday’)
WHEN 7 THEN DATETRUNC(‘week’,([Baseline Date]),’Saturday’)
END

Ideally we want to simplify this using something like DATETRUNC(‘week’, [Baseline Date], DATEPART(‘weekday’, [Min Date])), but unfortunately, at this point, Tableau won’t accept a function as the 3rd parameter of the DATETRUNC function.

Let’s just have a look at what we’ve got so far

Rows for California only showing the Subscription Dates from 01 Jan 2019 – 10 Jan 2019 and 01 Jan 2020 to 10 Jan 2020. Min & Max date for all rows are identical and matches the values in the filter. The Baseline Date field for both 01 Jan 2019 and 01 Jan 2020 is January 01. The Baseline Date Week for 01 Jan 2019 – 07 Jan 2019 AND 01 Jan 2020 – 07 Jan 2020 is 01 Jan 2020. The other dates are associated with the week starting 08 Jan 20202.

So now we have all this information, we need yet another date field that will be plotted on the date axis of the Viz in Tooltip.

Date to Plot

IF [Days between Min & Max] <=30 THEN ([Baseline Date]) ELSE [Baseline Date Week] END

If you add this field to the tabular display I built out above, you can see how the value changes as you move the filter dates to be within 30 days of each other and out again.

When added to the actual viz, this field is formatted to dd mmm ie 01 Jan, and then is plotted as a continuous, exact date (green pill) field on the Columns alongside the Daily | Weekly field, with State & Subscription on Rows. The YEAR(Subscription Date) provides the separation of the data into 2 lines.

Restricting to full weeks only (in weekly view)

The requirements state only full weeks (ie 7 days of data) should be included when the data is plotted at a weekly level. For this we need to ascertain the ‘week’ the maximum date falls in

Max Date Week

CASE [Day of Week Min Date]
WHEN 1 THEN DATETRUNC(‘week’,([Max Date]),’Sunday’)
WHEN 2 THEN DATETRUNC(‘week’,([Max Date]),’Monday’)
WHEN 3 THEN DATETRUNC(‘week’,([Max Date]),’Tuesday’)
WHEN 4 THEN DATETRUNC(‘week’,([Max Date]),’Wednesday’)
WHEN 5 THEN DATETRUNC(‘week’,([Max Date]),’Thursday’)
WHEN 6 THEN DATETRUNC(‘week’,([Max Date]),’Friday’)
WHEN 7 THEN DATETRUNC(‘week’,([Max Date]),’Saturday’)
END

so if the maximum date selected is a Thursday (eg Thurs 11th June 2020) but the minimum date happens to be a Tuesday, then the week starts on a Tuesday, and this field will return the previous Tuesday date (eg Tues 9th June 2020).

And then to restrict to complete weeks only…

Full Weeks Only

IF [Daily | Weekly]=’Weekly’ THEN
[Date To Plot]< [Max Date Week]
ELSE TRUE
END

If we’re in the ‘weekly’ mode, the Date To Plot field will be storing dates related to the start of the week, so will return true for all records where the field is less than the week of the max date. Otherwise if we’re in ‘daily’ mode we just want all records.

This field is added to the Filter shelf and set to true.

Hopefully that covers off all the complicated bits you need to know to complete this challenge. My published solution is here.

Happy vizzin’! Stay Safe!

Donna

Can you find all counties within n miles of a selected county?

I’ve been on annual leave for a few weeks. I’ve managed to catch up on all the challenges but haven’t blogged a solution for a while. It’s been a real struggle to get back into things to be honest – back to work, back to school, back to football clubs for my kids, and I’m wondering where I found the time before 😦

Trying to think about when/if I was going to get this blog out has caused me a bit of stress, which I don’t need, and so I need to change my mindset a bit…try to relax a bit … if I don’t manage to post a blog, just accept it and move on. I’ve also got to try to reduce the time it takes me to blog. I’m a ‘detail’ person, so often end up documenting to such a low level, but for my own sanity, I’m going to have to make an effort to be a bit briefer. I can’t guarantee I’ll stick to this though.. will just have to see how things pan out.

Going forward, I’m going to try to focus on the points that I think are key to the challenge, or those I found a bit tricksy.

So onto this week’s challenge. Sean Miller returned as guest poster with a COVID-19 related distance challenge.

Note – I connected to the provided hyper extract file rather than the csv file.

The core points I’m going to discuss are

  • Identify counties within n miles
  • How to select a county without set actions
  • How to handle ‘All’ counties being selected
  • Colouring based on ‘percentile’ of hospital beds
  • How to only show county borders of counties within selected range

Identify counties within n miles

The number of miles is stored within an integer parameter, n miles, that is defaulted to 100.

A county to act as the ‘start point’ is stored in a set, Selected County, based on the County, State field.

The long/lat coordinates of this selected county need to be captured.

Selected County Lat

{FIXED : AVG(IF [Selected County] THEN [Latitude] END)}

Selected County Long

{FIXED : AVG(IF [Selected County] THEN [Longitude] END)}

By using a FIXED LoD calculation, the values are stored against every ‘row’ in the data set.

With these, the starting point can be determined

Selected County Start Point

MAKEPOINT([Selected County Lat],[Selected County Long])

The position of every other county – the ‘destination’ / end point – also needs to be determined

End Point

MAKEPOINT([Latitude],[Longitude])

With these, the distance between can be computed

Distance

DISTANCE([Selected County Start Point],[End Point],’miles’)

And the counties can then be restricted by

Within n miles

[Distance]<= [n miles]

which can be used as a filter and set to True.

How to select a county without set actions

This is managed via the set control feature; right click on the Selected County set and choose Show Set to display the list of counties with the option to select which ones are in or out of the set. Change the display of the control via the dropdown arrow (top right) to be Single value dropdown which automatically provides as ‘All’ option or the ability to select a single set only.

How to handle ‘All’ counties being selected

When ‘All’ is chosen via the Set Control selector, this has the effect of adding all the State, County values into the set, which means we don’t really have a starting state. So the n miles parameter is essentially redundant. But we need to make the Within n miles filter understand this.

We can manage this by first identifying how many values we have in the set

Count Selected States

{FIXED : COUNTD(IF [Selected County] THEN [County, State] END)}

There will either be 1 or if All is selected, then they’ll be 3000+, and we use the FIXED LoD, so the total is stored against all rows.

We can then update our Within n miles filter to consider this value

Within n miles

([Count Selected States]=1 AND [Distance]<= [n miles]) OR [Count Selected States] > 1

This now returns true if either one State, County is selected and the other records are within n miles OR all the states are selected (the count is > 1).

Colouring based on ‘percentile’ of hospital beds

This caused me a little bit of headscratching. I assumed ‘percentile’ would be based on a percentage of the total num_staffed_beds (note I simply renamed this field to Hospital Beds), associated to the state,counties being displayed (ie within n miles). But after building the calculation I thought, and adding it to colour and choosing the Green-Gold colour range, I didn’t get the same spread as shown in the solution.

I messaged Sean to question whether I’d made the right assumption, but while waiting for a response, I did an online search for “Tableau Percentile Rank”, and quickly spotted that a Table Calculation exists. As you can guess, I don’t use this particular calculation very much at all 🙂

So to colour, add Hospital Beds , and simply add a Percentile table calculation

How to only show county borders of counties within selected range

When working with Maps, you can use the options within the Map Layers menu to select which features of the map should be displayed. One of these options is State/Province Borders, which you might think is what is needed.

However, this will show all the state borders, which is evident if you zoom out a bit, including those for states that don’t have counties within n miles.

This isn’t the requirement – we only want to show the borders of the states which have counties within the desired range. So instead, we don’t want to show any State/Province Borders via the Map Layers. And we’ll utilise a dual axis instead, by first identifying the states that are within n miles

States within n miles

If [Within n miles] THEN [State] END

Duplicate the Latitude (generated) field on the Rows, remove all the fields on that mark, and then add the States within n miles to the Detail shelf. Set the colour of this mark to white, and swap the Latitude (generated) fields if needed, so the States are on the top, as below.

Then make dual axis 🙂

And that’s the key points that I hope will help you solve this challenge. My published version is here, which is available to be downloaded. If there’s anything I haven’t covered that you’re not sure about, feel free to contact me.

Happy vizzin’! Stay safe!

Donna

Are Sales on track with Goal?

This week’s #WOW challenge was set by Ann primarily to demonstrate the recent 2020.2 release of relationships in Tableau. This is a change to how the data modelling works when combining data together, in this case actual sales data and sales goal (target) data. Ann provided a ‘tweaked’ version of superstore sales data for the actual sales, alongside a custom ‘goals’ data set which provided a sales goal value for each month per Segment and Category (ie the goals data was not as granular as the actual sales data).

Combining the Data

Obviously, to make use of the new relationships feature, you’ll need at least v2020.2.1 installed.

When you add the 2 data sources to the Data Source pane and add the Orders sheet from the Superstore data and then add the Sheet 1 from the Sales Goals data, you’re prompted to add a relationship if matches can’t automatically be found (ie the field names aren’t identical).

I added relationships for the Segment, Category and Month of Order Date fields, which then provides a ‘noodle’ link between the two sets of data

Note – to add multiple relationships, you have to close the field listing to then see the Add more fields option.

With relationships, Tableau is now smarter at how it manages the queries between the data. I’m not going to go into it at length (as there’s plenty of good Tableau documentation on the subject), but I’ll quickly show the difference between this and a traditional ‘join’.

In the Relationships model, if I show Sales and Goal by Month & Segment, the Goal is based on the sum of the rows associated to that Month and Segment that you can find if you looked at the source spreadsheet directly

But if I was using a traditional join with these data sets :

and then presented the same info as above

The Goal data is way off, as it’s being influenced by all the additional rows of data that have been constructed by the join clause.

If we expand to the Category level in each example above (the level at which the Goal data is stored), the Relationship version still shows the correct value

while the Join version is still incorrect, and only changing to an Average aggregation do we get the figure needed

Joining the data requires more work (calculations / knowledge) to ensure the Goal data is reporting correctly at the various levels, whereas the Relationships model ‘just works’.

Building the Table

When I started building the solution, I actually started with the YoY trend chart, but found I had to revisit it (swap a couple of pills about) as I progressed through, so I’m going to start with the table of data itself.

This turned out to be really quite tricksy… it’s just a table right… looks harmful enough…. but Ann had lobbed a few ‘gotcha’ grenades into the mix, that took a bit of head-scratching!

First up, Ann likes to use capitalisation, so I simply renamed the following fields

  • Category (from Sheet 1 – the Goals data) -> CATEGORY
  • Segment (from Sheet 1 – the Goals data) -> SEGMENT
  • Month of Order Date (from Sheet 1 – the Goals data) -> MONTH
  • Sub-Category (from Orders – the Superstore Sales data) -> SUB-CATEGORY

And I then created a hierarchy called SEGMENT which stored SEGMENT – > CATEGORY -> SUB-CATEGORY.

We also needed to ‘fake’ today so we can show sales to date. The Orders Superstore data set contains data from the start of 2016 to the end of 2019. We needed to pretend ‘today’ was 1st July 2019 and not show any sales beyond the end of the month. So I created the field

Today

#2019-07-01#

and then

ACTUAL SALES

IF DATETRUNC(‘month’,[Order Date])<= [Today] THEN [Sales] END

If we build out the basics of the table

  • SEGMENT hierarchy on Rows
  • MONTH (discrete, exact date, formatted to mmmm yyyy) on Rows
  • Measure Names on Columns and on Filter, filtered to ACTUAL SALES and Goal
  • Measure Values on Text
  • MONTH on Filter, set to exclude 2016

we get

and as we expand the hierarchy to the next level, the ACTUAL SALES and Goal adjusts as expected

but when we reach the 3rd level we see a couple of issues

  1. The Goal is showing the values that were displayed at the CATEGORY level, since it isn’t set at any further level. The requirement we have is to show the value ‘No Goal’ instead once we reach this point.
  2. We also have a set of values with a NULL SUB-CATEGORY. This is because there are dates with goal values in the Goals data set, but we have no Sales data for those dates, so a match at the SUB-CATEGORY level can’t be found. We need to remove these records when at this level, and actually only show records where we have ACTUAL SALES, so records from July 2019 onwards should also be excluded.

At this point, I did end up having a sneak look at Ann’s solution. I had an idea of what I wanted to do, BUT I was worried that I may be missing something related to the new data model… that perhaps there was a feature of relationships that I needed to be exploiting as part of this, rather than go down the table calculation route I was heading towards. After all, part of doing these challenges for me is to make use of new features & functions if I can, so I have a reference workbook for future use. I had a read up on the blog posts related to relationships, but couldn’t spot anything obvious, hence a quick check at Ann’s solution…. nope nothing special required in respect of the data model….so instead, we need some additional calculations to help resolve all this.

What we need is some way to identify when we’re at the lowest SUB-CATEGORY level. We’ll start by counting the number of distinct sub-categories we have

# Sub Cats

ZN(COUNTD([SUB-CATEGORY]))

Wrapped in a ZN will return the value 0 when none exist.

Dropping this into the table of data, we can see the 0s for all the future Goal records, and 1 for the rest of the rows, since we’ve expanded to the SUB-CATEGORY level.

If you collapse up to the CATEGORY level, the #Sub Cats count changes, as each row typically contains more than 1 SUB-CATEGORY.

The term ‘typically’ in the above statement though is the reason we can’t assume we’re at the lowest level if the count is 1 though. It is possible for the sales against a particular CATEGORY in a MONTH to only span a single SUB-CATEGORY (Check out Home Office -> Furniture in Jan 2018). So we also need

Max Sub-Cats in Window

WINDOW_MAX([# Sub-Cats])

Which will count the maximum number of SUB-CATEGORY records for each SEGMENT and CATEGORY. In the case of Furniture Sales in the Home Office in Jan 2018, you can see below the count of sub-categories is 1 but the maximum count in the window is 4.

This shows we’re not at the SUB-CATGEORY level, as when we are, the #Sub Cats = Max Sub Cats in Window (ie 1=1)

So with that knowledge we can now derive an alternative field for the goal

SALES GOAL

IF ([# Sub-Cats] = 1 AND [# Sub-Cats]=[Max Sub-Cats in Window]) THEN 0 ELSE SUM([Goal]) END

Remove the original Goal field from the table and add this instead, setting the table calculation to compute by MONTH and you’ll see the SALES GOAL value change from a value to 0 as you expand down to the SUB-CATEGORY level.

So how do we show ‘NO GOAL’ instead of 0? Well this is a bit of a sneaky formatting trick, and one I couldn’t figure out (this step was actually the last thing I changed before publishing).

In my original SALES GOAL calculation I had set the value to be NULL rather than 0 expecting there to be an option to show the NULL values as something else (ie NO GOAL). But the feature I was looking for wasn’t available (maybe I was mistaken about when it appears.. maybe it’s disappeared in 2020.2… I still need to investigate). Anyhow, my alternative was to create a string field to store the value in, but this would make formatting the number quite cumbersome, and I was sure that wasn’t the case.

So I had to check Ann’s solution. Instead of NULL, she had set the value to be 0, and consequently used the following custom number formatting :

“$”#,##0;-“$”#,##0;”NO GOAL”

where the last bit is how to format 0. Very very sneaky huh!

So we’ve worked out how to identify when we’re at the SUB-CATEGORY level, and managed to display ‘NO GOAL’ when at that level, now we need to filter out some of the rows (but only when at this level).

Records to Show

IF [Max Sub-Cats in Window] = 0 THEN FALSE
ELSEIF ([# Sub-Cats] = 1 AND [# Sub-Cats]=[Max Sub-Cats in Window])
THEN //at lowest level
IF ATTR([MONTH]) < [Today] THEN TRUE
ELSE FALSE
END
ELSEIF [# Sub-Cats] = 0 AND [Max Sub-Cats in Window] = 1 THEN FALSE //still at lowest level but no sales so exclude
ELSE TRUE
END

This took a bit of trial & error to work out, and there may well be something much more simplified.

Adding this to the table, and again setting to compute by MONTH you can see which rows are reporting True and False. When collapsed to the SEGMENT or CATEGORY level, all rows are True, but the values start to vary when at the SUB-CATEGORY level.

Now, we need to work on the fields to help show us the actual & % difference.

RESULT

IF ([# Sub-Cats] = 1 AND [# Sub-Cats]=[Max Sub-Cats in Window]) THEN NULL ELSE SUM([ACTUAL SALES])-([SALES GOAL]) END

The difference should only show if we’re not at the lowest level. This needs to be custom formatted as

↑#,##0;↓#,##0

I use this site to find the arrows needed.

ACTUAL vs GOAL

[RESULT]/SUM([ACTUAL SALES])

will give me the % difference (format to percentage with 0 dp).

And the final piece of data we need (I think) is to compute a RAG status for each row. This is dependent on some parameters

GREEN (WITHIN X%)

Type Float, set to 0.1 but displayed as a percentage

and similarly

RED (ABOVE Y%)

With these we can work out

RAG

IF ABS([ACTUAL vs GOAL]) > [RED (ABOVE Y%)] THEN ‘WAY OFF TRACK’
ELSEIF ABS([ACTUAL vs GOAL]) < [GREEN (WITHIN X%)] THEN ‘ON TRACK’ ELSEIF ATTR([MONTH])>=[Today] THEN ‘NO COMPARISON’
ELSEIF [SALES GOAL]=0 THEN ‘NO COMPARISON’
ELSE ‘OFF TRACK’
END

If you add all these fields to your table then you can check how they all compare/change as you expand/collapse the hierarchy (make sure you set the compute by MONTH on all fields).

So now we’ve got all the building blocks you can create the actual table.

I’d recommend duplicating the above (and keeping as a ‘check data’ sheet). Then remove the fields you don’t need, and add the Records To Keep = True to the Filter shelf. Format the table appropriately, and add the title so you’re left with

Building the Line Chart

As mentioned above, I started with this chart, but ended up having to revisit it as I found I needed to use different pills that got created as I built the table. There is actually 1 additional field needed, which isn’t obvious until you start playing with the dashboard actions later:

GOAL

IF [SALES GOAL] > 0 THEN [SALES GOAL] END

This is needed so that when we filter by SUB-CATEGORY later and there is no goal, we only get a single line displaying the sales values, rather than also having a line plotted at 0.

  • MONTH on Filter, excluding Year 2016
  • MONTH (exact date, continuous) on Columns
  • Measure Values on Rows and Filtered to ACTUAL SALES and GOAL
  • Coloured accordingly by Measure Names

Then add TODAY to Detail and add a Reference Line to the MONTH axis, setting it to TODAY, adjusting the format of the line, and setting the fill above colour to grey.

Format the gridlines, tooltips & axis accordingly to match

Barcode / Strip Chart

For this we’re creating a bar chart using MIN(1), and fixing the axis to range from 0-1 so it fills up the entire space as follows

  • MONTH on Filter excluding Year 2016
  • MONTH as exact date, discrete (blue pill) on Columns
  • MIN(1) on Rows
  • RAG on Colour adjusted accordingly
  • ACTUAL vs GOAL on Tooltip

Then use the ‘type in’ functionality to create a pill labelled ‘vs Goal’ on the Rows

Creating the Dashboard

I used a vertical container to add the components onto the dashboard, as I used blank objects that are formatted with a black background, 0 padding and a height of 4 to create the horizontal line separators.

The interactivity between the table and the other charts is managed via Dashboard Filter actions.

To get these to work properly though, and in such a way that the selected SEGMENT and/or CATEGORY and/or SUB-CATEGORY can be referenced in the title of the line chart, 3 separate dashboard actions are required, with each one specifying the field to filter by, as follows :

Once these are set up, the associated pills will appear on the Filter shelf of the Line and Barcode charts, and are then available for selection when building the title of the line chart

You should now have your completed chart.

My published version is here.

Happy vizzin’! Stay Safe!

Donna

Can you compare a 3-day vs 14-day moving average and describe the latest trend?

This week for #WOW2020, Ann provided a table calculation feast of a challenge! This certainly is not for the faint-hearted! As well as cracking all the table calcs, the challenge features multiple views, measure swapping, parameters, BANs, filtering, sorting …. it’s got it all going on!

Ann hinted you’d probably want to start with the table, and even if there hadn’t been a table output in the display, this is what I would have done. If you’ve read enough of my blogs, you’ll know I often like to build up a ‘check data’ sheet, which just contains the data I need in tabular form as a quick reference. When working with table calculations this is an absolute must have!

So let’s build out that Check Data table to start with. I have a feeling this is going to be a lengthy blog 🙂

Initial Set up

First up, the requirements stated that the latest date would be 7 June, but I found records with a 8 June date. All the associated info for this date was null though, so I set a data source filter to exclude this. This means I wouldn’t get any issues if I needed to store the max date in a FIXED LoD calculation at any point.

I also found it easier to rename a couple of the measures provided to match the output, so rename PEOPLE_POSITIVE_NEW_CASES_COUNT to New Cases and PEOPLE_POSITIVE_CASES_COUNT to Reported Cases. I’ll refer to these renamed fields going forward.

Building all the Calculated Fields

To build out the table, we’re just going to focus on one State & County, as there’s a lot of data. So add Province State Name = Tennessee and County = Davidson to the Filter shelf.

Add Report Date (discrete exact date – blue pill) and New Cases & Reported to Rows. As you scroll down, you’ll see data starting to come in on 8 March.

We want to create our moving average calculations

3 Day Moving Avg

WINDOW_AVG(SUM([New Cases]), -2, 0)

14 Day Moving Avg

WINDOW_AVG(SUM([New Cases]), -13, 0)

Notice the number of rows to average over is 1 less than you might expect, as the current row is included, so the calculation is saying ‘current row’ and 2 | 13 previous rows.

Add these to the table, and adjust the table calculation so it is explicitly calculating by Report Date. This would have happened automatically, as the calculation would have been computing ‘down’ the table, but it’s best to fix the computation, so it doesn’t matter where the pill gets moved to in the view.

We now need to work out whether there is an increase or not between the 3-day and 14-day average.

Is Increase?

IF [3 Day Moving Avg] > [14 Day Moving Avg] THEN 1 ELSE 0 END

Is Decrease?

IF [3 Day Moving Avg] <= [14 Day Moving Avg] THEN 1 ELSE 0 END

I’m using 1s and 0s as it’s going to help with a later calculation.

NOTE – I’m assuming that if there is ‘no change’ it’ll be recorded as a decrease. This is how I interpreted the requirement, “ …whether it is an increase or a decrease (or no change)” and it wasn’t easy to find any matches anyway.

I also need some text to indicate the increase or decrease

Increase | Decrease

UPPER(IF [Is Increase?]=1 THEN ‘Increase’ ELSE ‘Decrease’ END)

The UPPER is used as that’s part of the tooltip formatting.

Let’s get these onto the view, always making sure the table calculations are set to Report Date.

We need to calculate the number of days that has been reported INCREASE in succession, and the number of days where successive DECREASE has been reported.

So first, let’s identify which rows match the previous row.

Match Prev Value?

LOOKUP([Is Increase?],-1) = [Is Increase?]

If the value of the Is Increase? field in the previous (-1) row is the same as the Is Increase? field in the current row, then this is true, else false.

Add to the view, and verify the table calculation for itself and all nested calculations being referenced, is set to Report Date.

We now have all the information we need to help us work out the number of days in the increase/decrease ‘trend’.

Days in Trend

IF (FIRST()=0) OR(NOT([Match Prev Value?])) THEN 1
ELSEIF [Increase | Decrease] = ‘INCREASE’ THEN ([Is Increase?]+PREVIOUS_VALUE([Is Increase?]))
ELSEIF [Increase | Decrease] = ‘DECREASE’ THEN ([Is Decrease? ]+PREVIOUS_VALUE([Is Decrease? ]))
END

If the row in the table is the very first entry (so there’s nothing previous to compare against), or the row in the table didn’t match it’s predecessor (ie there was a change), then we’re starting a new ‘trend run’, which obviously starts at 1.

Otherwise, if the current row we’re on indicates an increase, then we’ll add the value of the Is Increase? field (which is 1) to the previous value (which is also 1). PREVIOUS_VALUE works recursively though, so it essentially builds up a running sum, which gives our trend.

We ultimately do the same thing using the Is Decrease? column. This is why using 1 & 0s in the earlier calculation help.

Adding into the view, and setting the table calculation correctly, you should get something similar to this…

Finally, there’s one key field we need to add; something to help identify the latest row as we will need it for filtering in the table that’s displayed on the dashboard. Simply applying a standard ‘quick filter’ won’t work, as the table requires we show the 3-day & 14-day moving averages. A ‘quick filter’ to limit the data to the latest date (7th June), will show the wrong values, as the data related to the other days will be filtered out, so the table calc won’t have the information to correctly compute over.

We need to create another table calculation that we can use as a filter, and that due to Tableau’s ‘order of operations’ will apply later in the filtering process than a traditional quick filter.

Max Date

{FIXED : MAX([Report Date])}

The latest date in the whole data set.

Show Data for Latest Date

LOOKUP(MIN([Report Date]),0) = MIN([Max Date])

If the Report Date of the current row is the same as the maximum date in the whole data set, then return true.

We’ve now got all the core data components we need to create the various charts.

In the interest of time (my time in writing this out), I’m going to attempt not to describe the building of all the charts in too much detail, but just call out the useful bits you might need. If you’re attempting this challenge with the table calcs above, I’m assuming you know Tableau enough to not need everything defined to the lowest level.

The whole report is driven off a parameter which the user must enter a State – County combo.

You’ll need a calculated field to store the combo

State – County

[Province State Name] + ‘ – ‘ + [County]

and then create a parameter (State – County Parameter) off of this (right click, Create -> Parameter) which will create a string parameter with all the permutations.

When displaying on the dashboard, set this to be of type Type In

BAN

The BAN is a basic summary of the latest trend for the entered state county.

We need to filter the sheet to the value entered in the parameter

Is Selected State County?

[State – County Parameter] = [State – County]

Add this to the Filter shelf as true, along with the Show Data for Latest Date.

Add the relevant fields to the Text shelf to display the required text. The Report Date needs to be custom formatted to ddd, mmm d to get the Sun, Jun 7 display

Map

For the map, as well as filtering the latest date, we’re also going to need to filter just to the state only (not state & county) as above. So I created

Is Selected State?

LEFT([State – County Parameter], FIND([State – County Parameter],’-‘)-2) = [Province State Name]

This is unpicking the State – County combined string stored in the parameter, to just find the State part and compare to the Province State Name.

Build a filled map based on County and filter to the latest date and the selected state. I set the Map Layers to that below, which seems to match up

You’ll need to set both the Is Selected State County? and Increase|Decrease fields to the Colour shelf.

Bar & Line Chart

You’re going to need a few more calculated fields for this.

Moving Avg Selector

for the user to choose what the line should display. I’ve set it to an intger parameter that displays text

We then need a field to show on the display depending on what’s been selected in the parameter

Moving Avg to Display

If [Moving Avg Selector] = 3 THEN [3 Day Moving Avg] ELSE [14 Day Moving Avg] END

You’ll need a Dual Axis chart plotting New Cases and Moving Avg to Display against Report Date (continuous exact date)

The data only starts from 8th March, so I added Report Date to filter to start from 8th March. 8th March is also added as a constant reference line.

Table

Based on the State / County entered, the table is filtered to show the data for the latest date for all the counties in the state entered. Although not stated in the requirements, the first row is the county selected, with the rest ordered by Reported Cases.

You can get the selected county to the top, by adding Is Selected State County as a hidden field to the Rows, and moving ‘True’ to the top.

And that should be everything you need to build the dashboard, which is pretty much just stacking all the sheets one on top of each other in a single column.

My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Can you build a dynamic date drilling chart?

Week 19 of #WOW2020 saw Ann provide this challenge, which she declared would be easier than Luke’s from the previous week.

The aim was to start with a chart by week, but on selection drill in to show a by day view, with the slight twist that all days between the first and last selected dates should display, even if the specific week in between wasn’t explicitly selected.

I knew this was going to involve set actions (although there was a clue on the ‘latest challenges‘ page – I tend to pick up the challenges through Twitter, so often bypass this page, but it’s worth a look for extra clues :-))

So on the face of it, the challenge seemed as if it should be ok, but it’s usually only when I start building that things can start to unravel.

One area that Ann hadn’t been explicit about in the requirements, was the behaviour in certain scenarios. I had to revise some of my initial attempts/calculations in order to match Ann’s workbook. This week I’m not going to go into all the wrong turns, but this is the behaviour I observed in Ann’s workbook, and so ultimately tried to replicate.

  1. When at the week level, if multiple points are selected, the days then displayed should start from the first day of the first week selected (which will be a Sunday as we’re working with how the US standardises a week start), and finish on the last day of the last week selected (which will be a Saturday).
  2. When at the week level, if a single point is selected then the 7 days in that week only should be displayed.
  3. When at the day level, if multiple points are selected, the days then displayed start from the first day of the week the first day is in and end on the last day of the week the last selected day is in (so you may get more days than actually selected). eg if the 1st day selected is a Tuesday and the last day selected is a Friday, the ‘drill in’, will start on the previous Sunday, and end on the following Saturday. This isn’t necessarily what you may expect to happen.
  4. Based on the above, when at the day level, a single day is selected, the ‘drill in’ will show the 7 days in the same week. Once you’re down to displaying the 7 days in the same week, clicking on any single day or selecting a couple of days, won’t appear to do anything as the same results are displayed.

Being aware of the above, will explain why some the calculations I end up with look the way they do.

Ok, let’s get on with the build…..

Ann stated that she’d hardcoded to the ‘last 52 weeks’, but essentially it looked like she was displaying data for all of 2019. This wasn’t really something the challenge was testing, so I simply started by adding a data source filter for the year 2019 (right click on data source -> Edit data source filters). I also had to set the date properties of my data source to set a week to start on a Sunday as I’m UK based, so my week’s are defaulted to start on Mondays (again right click on data source -> Date Properties). As a result of my year filter, I didn’t seem to have the same starting/ending dates as Ann, but as already stated, this wasn’t the main aim of the challenge, so I didn’t stress about it.

I decided quite early on that I was going to also use a Parameter Action to decide the level I was at (ie had I drilled in or not). I wasn’t entirely sure when I started how many ‘levels’ I might need, so chose to use an integer parameter for this. I ended up only needing 2 levels, so a boolean could have worked equally as well, or, as I type, I think I could have used this parameter to store the ‘date level’ (day or week) I want to display my dates at, which would have made some of my calculations easier to read. I’m not going to do this though.

Drill Down

Integer parameter set to 0 by default.

Now I want to define a date field that I’m going to use on the axis, that will vary depending on the ‘level’ we’re at.

Date to Plot

IF [Drill Down]>0 THEN DATETRUNC(‘day’, [Order Date]) ELSE
DATETRUNC(‘week’, [Order Date])
END

Note – I used >0 as I wasn’t sure if I’d have levels 0, 1 & 2… as it turns out I just used 0 & 1 in the end.

Add Date to Plot to Columns as a Continuous, Exact Date and Sales to Rows and we’ve got our starting point

Change the Drill Down parameter to 1 and the chart will change to display at the day level

Drilling Down on selection

On selection, we want to add the selected dates into a set, so first up, we need to define that set.

Right click on Date to Plot and Create -> Set

Selected Dates

Select some random dates so we can test with. These will get set properly later based on the Set Action we define on the dashboard.

Based on the dates in the set, we need to determine a min and a max date we can then use to restrict the dates being plotted on the chart.

Min Date

{FIXED:MIN(IF [Selected Dates] OR [Drill Down]=0 THEN DATETRUNC(‘week’,[Date to Plot]) END) }

This either gets the 1st day of the week based on the earliest date in the set, or the 1st day of the week of the whole data set (when we’re at the ‘starting’ level with Drill Down =0).

We also need

Max Date

IF [Drill Down]=0 THEN
{FIXED:MAX(DATEADD(‘day’,-1,[Date to Plot]))}
ELSE
DATEADD(‘week’,1,{FIXED:MAX(IF [Selected Dates] THEN DATETRUNC(‘week’,[Date to Plot]) END)})-1
END

If we’re at the starting level (Drill Down = 0) then we want the last day within the latest week in the data set (this field is used in the title display, so necessary to get this to display right), otherwise we need to get the last day of the week associated to the maximum date in the set.

Now we need to be able to restrict the dates displayed in the chart based on these

Dates to Include

[Order Date]>= [Min Date] AND [Order Date]<= [Max Date]

Add this to the Filter shelf and set to True.

Your display shouldn’t change, as we’re still at ‘level 0’, even though our set has random dates selected. If you now manually change Drill Down to 1, you should see a change

Setting up the dashboard actions

Ok, now we’ve got the basic idea, let’s get it all working properly with dashboard actions before we sort out all the other bits n bobs.

First up, let’s manually reset everything by setting the Drill Down parameter to 0 and emptying all the values selected in the set.

Add the sheet onto the dashboard, and create the set action, which is set to target the Selected Dates set and to Keep set values if you click on a blank area of the chart after selection.

However, this on it’s own won’t change the display. We need to set the Drill Down parameter to 1 too.

For this we need another field

Set Drill Down Level

1

Add this to the Detail shelf of the chart.

Then back on the dashboard, add a Parameter Action that targets the Drill Down parameter using the value stored in the Set Drill Down Level field

Now if you select the dates in the dashboard, you should get the desired behaviour, and if you select again, you should filter the days selected further (as per the behaviour described at the top of the page).

Resetting the display

The Reset button is actually another sheet.

I created a ‘fake’ bar chart by adding MIN(1) to Rows, and double clicking in the space below the Detail and Tooltip shelves on the Marks card, and typing the text ‘CLEAR SELECTION’. This creates a ‘pill’ without having it defined explicitly as a calculated field, and I added this to the Text shelf, and centred/formatted appropriately. I then set the axis to be fixed from 0 to 1 and hid it.

I set the colour to #a26dc2 and set the text to ‘match mark colour.

We only want this sheet to display, if we’ve ‘drilled down’, so I need

Show Reset

[Drill Down]=1

which is added to the Filter shelf and set to True.

I also need a parameter action off this sheet, to reset the Drill Down to 0 on selection. For this I need another field

Reset

0

which is added to the Detail shelf.

Depending on what you’ve already been playing around with, there’s a chance this sheet may already be empty. Get the dashboard into a state where you’ve drilled down to the day level, then add this sheet, and add another Parameter Action.

Reset is set to run off the ‘button’ sheet only, to target the Drill Down parameter by using the value in the Reset field.

You should now be able to test all this out and get the desired behaviour.

Adding Animations

To get the chart to transition between selections, we need to use the Animations functionality.

On the Format menu, select Animations and adjust the settings as you choose.

Play around and you should have the main features of this challenge now working

Dynamic Title

The title needs to change based on whether you’re at the weekly or daily level. It also needs to show total and average sales. So for all this I need

LABEL: Level

IF [Drill Down]>0 THEN ‘Day’ ELSE ‘Week’ END

LABEL: Subtitle Level

IF [Drill Down]>0 THEN ‘Daily’ ELSE ‘Weekly’ END

LABEL: Instruction

IF [Drill Down] = 0 THEN ‘SELECT WEEKS TO DRILL DOWN TO DAILY VIEW’
ELSE ‘CLEAR SELECTION USING BUTTON’
END

Total Sales

WINDOW_SUM(SUM([Sales]))

Avg Sales

WINDOW_AVG(SUM([Sales]))

All these fields are added to the Detail shelf of the main chart along with Min Date and Max Date, and then the title is edited and formatted accordingly to reference these.

Tooltips

In a similar manner, the tooltips also need adjusting, they just need to refer to the LABEL: Level field

Colour of the Line

Although not explicitly mentioned, the line colour seemed to change from a lighter shade at the weekly level to more intense at the daily level. To achieve this I created

COLOUR

[Drill Down]

and added to the Colour shelf. This gives a colour legend displaying either 1 or 0 depending at what state you are in the interaction. Set colour values accordingly.

Average Line

Add this to the chart simply by selecting Average Line from the Analytics pane and dragging then dropping onto the Table shelf that appears. Format appropriately.

And apart from other basic formatting to remove gridlines/ axis titles etc, that should be it.

My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Which products are most profitable?

I’m starting to write this blog with a bit of uncertainty today as I’m not ultimately sure where I’m going to end up….

I wasn’t even sure I was going to pen an entry this week…. I found Luke’s challenge tough, and the path I took full of multiple wrong turns, that meant trying to write out a comprehensible ‘how I did it’ quite tricky, as reviewing now I’m questioning ‘what did I do that for…’.

Whilst with perseverance and a bit of inspiration from Rob Saunders, I did manage to post a working solution, I knew I wasn’t overly happy with it, particularly because my expand and collapse functions didn’t behave as I saw others do… I had to click twice to collapse.

Before starting to write, I decided to check out Luke’s workbook which he’d finally published to see if I could understand where I was going wrong.

I’d used both a set action and parameter action in my attempt; Luke had just a parameter action. In trying to understand why I ended up with a set action and see if I could do without it, my solution gradually started unravelling, as various calculated fields needed changing.

So, I think the best way to approach this blog is to rebuild my solution from scratch, using only my existing workbook as a reference, and I will attempt to write and screen shot as I build. This could take some time, and I have no idea how successful I’ll be…. I may well get to a point where I’ve taken a wrong turn again, and everything I’ve written needs to be scrapped… at which point I may just have to say ‘sorry, I tried!’……

Ok, let’s get cracking.

Top N Sub-Categories

At the ‘first’ level of the table, we need to display the top n sub-categories ordered by Profit. Those not in the top n should be displayed under an ‘All Others’ grouping, and always displayed at the bottom. The top n can vary based on the user input.

We need a parameter to define the value of the top n.

SUB-CATEGORIES TO SHOW

This is an integer parameter ranging from 1-10, defaulted to 3. I’ve titled it exactly as displayed on the output, so it’s one less change to make later (hence the capitals).

The easiest way to group the sub-categories into those in the top n, is to use a set. Right click Sub-Category and Create -> Set.

Top N SubCats by Profit

Use the Top tab to define the rules for which Sub-Category to include, referencing the SUB-CATEGORIES TO SHOW parameter and the Profit field as shown below.

SubCat Group

IF [Top N SubCats by Profit] THEN [Sub-Category] ELSE ‘All Others’ END

If the Sub-Category is in the set then the name will display, otherwise it will be grouped under the ‘All Others’.

Add SubCat Group to Rows and change the sort on the pill to sort by Field = Profit desc

This will result in All Others being listed at the top, but to resolve that, drag the Top N SubCats by Profit set onto the Rows an place in front of the SubCat Group.

Let’s get some of the measures we need out onto the canvas.

Margin

SUM([Profit])/SUM([Sales])

format to percentage with 0dp.

Add Margin, Profit & Sales onto the sheet.

Top N Products

The next level in the table displays the Top N products per SubCat Group based on their margin. For those not in the Top N, the products should be grouped under ‘All Others’, and listed at the bottom. The Top N is once again defined by a user input via a parameter.

PRODUCTS TO SHOW

Once again an integer parameter ranging from 1-10 but defaulted at 5 this time.

Unfortunately, this time we can’t use a set to define our Top N grouping. This is because the set will only consider the Top N across all Products and will not consider the fact the Products are nested per Sub-Category grouping. So we need to come up with an alternative. I did do a bit of research to find ideas, and found this article by Emma Whyte at The Information Lab to get me started : Showing a Nested Top N with Other in Tableau.

The Top N needs to be based on the order of the Margin, or the rank.

Margin Rank

RANK_UNIQUE([Margin])

Change this to be a Discrete field.

This will give us a unique ‘number’ per row displayed based on the value of Margin. If records have the same Margin value, using RANK_UNIQUE will mean they get a different rank number (as opposed to how other ranking functions work). The table calculation of INDEX() could work just as well. NOTE – there is potential though that when the Margin values are the same for different products, what makes the Top N and what doesn’t may differ, so you might find you get a slightly different list from some of the solutions you see.

Add Margin Rank and Product Name onto the sheet, and edit the table calculation on Margin Rank, to compute by Product Name only

We now want to group the Product Name based on the Top N products.

Product Name Group

IF [Margin Rank] <= MIN([PRODUCTS TO SHOW])
THEN ATTR([Product Name])
ELSE ‘All Others’ END

We also only want to show (PRODUCTS TO SHOW + 1) rows per SubCat Group ie if PRODUCTS TO SHOW = 5, we want to display 6 rows per SubCat Group, where the 6th row displays ‘All Others’. The 6th row also needs to show the Profit, Sales and Margin values associated to all the Products in the ‘All Others’ Product Name Group.

So we need to calculate some new fields that will store a revised value for Proft, Sales and Margin, depending what row we’re working with.

Sales For Others

WINDOW_SUM(IF [Margin Rank] > [PRODUCTS TO SHOW] THEN SUM([Sales]) END)

If the Product Name isn’t in the Top N, then get it’s Sales value, and then sum all of those rows that meet the same condition.

If you put this onto the sheet, and set the table calculation for each of the nested calculations (Sales for Others & Margin Rank) to be by Product Name only, you’ll see that the value displayed in every row for each SubCat Group is the sum of the values associated to the rows in the ‘All Others’ group.

So now we need a field that’s either going to display the Sales for the 1 product or the sales for the group of products, depending on what row we’re on.

Grouped Sales

IF [Margin Rank]<= MIN([PRODUCTS TO SHOW]) THEN SUM([Sales]) ELSE [Sales For Others] END

Again the table calculation settings need to be set to compute by Product Name only. This is the field we ultimately want to display, so it needs to be formatted accordingly. The Sales and Sales For Others fields can be removed.

Along similar lines, we need

Profit For Others

WINDOW_SUM(IF [Margin Rank] > [PRODUCTS TO SHOW] THEN SUM([Profit]) END)

Grouped Profit

IF [Margin Rank]<= MIN([PRODUCTS TO SHOW]) THEN SUM([Profit]) ELSE [Profit For Others] END

Margin For Others

IF [Margin Rank] > [PRODUCTS TO SHOW] THEN [Profit For Others]/[Sales For Others] END

Grouped Margin

IF [Margin Rank] <= MIN([PRODUCTS TO SHOW]) THEN [Margin] ELSE [Margin For Others] END

When added to the sheet, once again make sure all the table calculation properties for all the nested calculations are set to compute by Product Name only.

As mentioned above, we only want to show PRODUCTS TO SHOW + 1 rows, so let’s create a field we can filter by :

Show?

[Margin Rank] <= MIN([PRODUCTS TO SHOW]) + 1

Add this to the Filter shelf and set to True. Once again the table calculation needs to be set to compute by Product Name only.

Add Totals

We want a grand total and subtotals only at the level of SubCat Group. Add the totals by Analysis -> Totals -> Show Column Grand Totals to get the overall total. Then on the SubCat Group pill, click and add SubTotals

We’ve now got the main components of the table. We now need to get the interactivity working to allow the expand / contract on arrow selection.

Expand / Contract All Others Product Group

First up, we’ll just tidy up our table display

  • Hide the In/Out Top N SubCats By Profit field
  • Hide the Margin Rank field
  • Hide the Product Name field.
  • Remove the Margin field
  • Alias the Grouped Margin, Grouped Profit, Grouped Sales fields
    • Right click and Edit Alias. Name the field ‘ Margin ‘ (note the trailing and leading spaces. You can’t alias just as Margin as a field already exists with that name, but the spaces make it think it’s a new name.

We need an additional field that will store our ‘arrow’ icon. We’re going to revisit this field. For starters

Product Group Header

IF [Margin Rank] <= [PRODUCTS TO SHOW] THEN ”
ELSE ‘►’
END

I use this site to get my icon characters from. Add this field to the sheet after the SubCate Group pill, once again setting the table calculation to compute by Product Name.

In the dashboard, the aim is to click on an arrow associated to a single SubCat Group, which will expand the Product Group Name field to display the actual Product Name (rather than ‘All Others’) with their associated Margin, Sales & Profit values, and also show a ▼ icon.

This will be achieved using Parameter Actions, for which we need a parameter :

Selected Sub Category Group

String parameter defaulted to ”

Display this parameter on the sheet, as we can start to test the interactivity ‘manually’ without the need for the dashboard. What the dashboard action will do is on ‘click’, it will be set to populate the value of this parameter with the associated SubCat Group value. We can then do some checks based off of this and set various fields accordingly. It means we need to revisit some of the fields.

First up let’s set the arrow….

Edit Product Group Header to be

IF [Selected SubCat Group] = MIN([SubCat Group]) THEN
IF [Margin Rank] > [PRODUCTS TO SHOW] THEN ‘▼’ ELSE ” END
ELSEIF [Margin Rank] <= [PRODUCTS TO SHOW] THEN ”
ELSE ‘►’
END

Test this by entering the value of ‘Copiers’ into the Selected SubCat Group parameter. The arrow against ‘All Others’ should change.

We also need to change the value of the Product Name Group to show the actual Product Name on selection, so

edit the Product Name Group

IF [Margin Rank] <= MIN([PRODUCTS TO SHOW])
THEN ATTR([Product Name])
ELSE
IF [Selected SubCat Group] = MIN([SubCat Group])
THEN ATTR([Product Name])
ELSE ‘All Others’ END
END

Again test this out by changing the value in the parameter.

But we need to make more rows show too, so

edit Show?

([Margin Rank] <= MIN([PRODUCTS TO SHOW]) + 1) OR (MIN([SubCat Group]) =[Selected SubCat Group])

The values of our measures are still the totals though, so we need to edit these fields to

Grouped Sales

IF ([Margin Rank]<= MIN([PRODUCTS TO SHOW]) OR ([Selected SubCat Group] = MIN([SubCat Group])) ) THEN SUM([Sales]) ELSE [Sales For Others] END

Grouped Profit

IF ([Margin Rank]<= MIN([PRODUCTS TO SHOW]) OR ([Selected SubCat Group] = MIN([SubCat Group])) ) THEN SUM([Profit]) ELSE [Profit For Others] END

Grouped Margin

IF ([Margin Rank] <= MIN([PRODUCTS TO SHOW]) OR ([Selected SubCat Group]=MIN([SubCat Group]))) THEN [Margin] ELSE [Margin For Others] END

However while this works if you play with setting and clearing the parameter on the sheet, it won’t quite fully work if added as a dashboard action, as while the action can set the parameter we can’t ‘clear it’.

We need to ‘tie’ the parameter action to another field

SubCat Group for Reset

IF [Selected SubCat Group] = MIN([SubCat Group]) THEN ”
ELSE MIN([SubCat Group])
END

Add this to the Detail shelf (if you want to see how it changes based on the parameter value, add it to the Rows and test changing the parameter).

It needs to be on the sheet so it can be referenced from the dashboard action.

Adding the Action

Add the sheet to a dashboard, then add Parameter Action as below where the Target Parameter is Selected SubCat Group and the field it references is SubCat Group for Reset

And after all that, you should have a working solution. Phew!

Pretty pleased I got there without taking a detour 🙂 The table just now needs various formatting applied, which I’m going to leave to you to do 🙂 Just tweet me if you’re having problems!

The parameter action may be confusing you a bit – it took a while to really get my head round it, so I’ve tried to explain this a bit more below…

How the parameter action works

On initial load of the sheet, the Selected SubCat Group parameter is blank. So for the SubCat Group = Copiers, the SubCat Group for Reset will also be Copiers as SubCat Group is not the same as Selected SubCat Group parameter. Other fields are also set based on the fact these two fields aren’t the same (like the arrow pointing to the right etc).

When the right arrow is clicked on the dashboard, the value of SubCat Group for Reset is used to populate the Selected SubCat Group parameter. So in this example, Selected SubCat Group will now contain the value Copiers. As the Selected SubCat Group parameter is now the same as the SubCat Group, various fields change their behaviour (like the arrow now points down, and more rows are displayed). But also, the value of the SubCat Group for Reset is also changed; as the SubCat Group is the same as the Selected Sub Cat Group parameter, SubCat Group for Reset now contains a blank string.

So at the point the down arrow is now clicked again on the dashboard, the value of the SubCat Group for Reset is again used to populate the Selected SubCat Group parameter. As SubCat Group for Reset is blank, then the parameter will now be populated with a blank value, and so all the login in the fields will be based on the fact that SubCat Group is not the same as the Selected SubCat Group parameter, and the table will display just as it did on first load.

Hope that helps to demystify what’s going on… it’s certainly helped me!

Thanks for sticking with me if you got this far 🙂

The version of the challenge I built while writing this (my 3rd published version) is here.

Happy vizzin’! Stay Safe!

Donna

Can you show the adjusted target and missing pipeline?

For #WOW2020 Week 16, Lorna set a slightly different challenge that involved data blending. Blending is a technique in Tableau used to combine data from different data sources. You can read more about it here.

Lorna’s scenario is quite a common one – you have a data source which stores some ‘actual’ data (that in a typical scenario is likely to change as you move through the year), along with a more static data source, storing plan/budget/target data for each month. This is typically created at the start of the year and rarely changes. Comparing actuals to target is a very common business requirement.

Once again, I’m going to tackle this challenge but working out all the numbers I need for each month in a tabular format, before I go onto build the viz.

Building out the data

For this challenge we have 2 data sources, the pipeline data containing multiple years and the target data just containing data for 2020. So the first this we need to do is add a filter for Closed Date from the Pipeline Data source to be the Year 2020.

The data has been specially crafted as if it’s at a particular point in time in April, in my case at the point of building it was 15 April 2020. If this was being built for a real life scenario, we’d want to be reporting based off the Today() function. To simulate this, I created a calculated field to hardcode my ‘today’ date, but if I was doing this ‘for real’, I’d have set it to TODAY().

Today

#2020-04-15#

I need to be able to report the Pipeline Data that is at Stage=Closed Won separately from data that is still in the pipeline (hasn’t been closed as won or lost). I’ll use some calculated fields for this

Closed Won

ZN(IF Stage = ‘Closed Won’ THEN [Sales] END)

Note – the ZN will display as 0 if there is no Sales.

Pipeline

IF [Stage]= ‘Negotiating’ OR [Stage] = ‘Proposing’ THEN [Sales] END

Let’s start to build the table out:

  • Month of Closed Date on Rows
  • Closed Won and Pipeline on Cols (as Measure Values)
  • Year of Closed Date = 2020 on Filter Shelf

Let’s now add in the target from the Target Data source. This will be a blend. When we blend we need to define how to ‘join’ the data sources together. I prefer to make it obvious what fields I am blending on, so although I can use existing fields and define a rule, I prefer created explicit calculated fields so it’s clear.

The Target Data contains a record for each month, dated as per the 1st of each month. In the Target Data, create a new field

BLEND – Month

[Date]

The in the Pipeline Data, create a field named exactly the same

BLEND – Month

DATE(DATETRUNC(‘month’,[Closed Date]))

but in this case we’re truncating the Closed Date to the 1st of each month, and ensuring it too is a Date rather than Datetime data type, so the fields can match.

Now add Target from the Target Data onto the table. If you get a warning message, click ok, then click the ‘link’ symbol that is currently greyed out against the Blend – Month field in the Target Data.

The ‘link’ symbol will go red and indicate that the data is being ‘joined’ on this field. The Target values in the table will now match the values if you check the data source excel file directly, and the Target pill in the Measure Values will show a ‘database’ symbol with an ‘orange tick’ which indicates it’s from a secondary data source. The data sources listed in the Data pane (top left) will also be coloured blue (primary) and orange (secondary), which indicates data blending is being used.

We now need to start working out how much off the YTD target we are so far, so we then work out how much pipeline is potentially missing from each future month.

So first up, how much has been closed won so far this year (only considering complete months). Ie how much has been won in Jan, Feb & March?

YTD Closed

WINDOW_SUM(SUM(IF [Closed Date] < DATETRUNC(‘month’, [Today]) THEN [Closed Won] END))

If the Closed Date is before the 1st of the current month (ie April in this example), get the Closed Won value already computed, but SUM all the values we have for all the months.

Add this onto the table, and you can see the total of the Closed Won values for Jan, Feb & Mar is listed against every month.

The table calc has automatically computed ‘table down’, but I’m, going to explicitly set it, as I know I’m going to move the fields around later, and I don’t want that value to change based on where it gets moved to,

Right click on the YTD Closed pill -> Edit Table Calculation and check Month of Closed Date

We need to work out how much we should have closed in the first 3 months of the year too. So in the Target Data, create a similar calculated field

YTD Target

WINDOW_SUM(SUM(IF [Date] < DATETRUNC(‘month’, [Today]) THEN Target END))

(Note – a Today calculated field also hardcoded to 15th April 2020 needs to be added to this data source too).

Add this field into the table too, and again, set the table calculation to be explicitly set against Month of Closed Date.

In order to work out how much missing pipeline to add to each month, we need to figure out how far we’re currently ‘off’, and then distribute this value across the remaining months.

I’m doing all this in steps, so I can sense check the calcs as I go. We can work out how much we’re off by creating a new field in the Pipeline Data

Missed Sales Value

[Monthly Target (2020_04_15_WW16_Sales Pipeline)].[YTD Target] – [YTD Closed]

Basically this is YTD Target YTD Closed, but when you refer to a field from the secondary data source, the field will be prefixed by the data source name.

Add this to the table, and again verify the table calc is set explicitly.

As this is a field based on other table calcs, you will see them listed as Nested Calculations, and you need to verify each one listed is set appropriately.

To work out how many months in the year are remaining that we need to distribute the above value over, we need

Remaining Months

12 – (DATEPART(‘month’, [Today]) -1)

As Today is in April, which is month 4, then the remaining months is 12 – (4-1) = 9.

An now we can work out how much needs to be added per month

Distributed Missed Sales Value

[Missed Sales Value] / [Remaining Months]

Pop this onto the table, and verify the table calc again.

Now we have this, we can work out what the Target needs to be adjusted to for each of the remaining months to make up the shortfall, which is basically adding the monthly shortfall above to the existing Target for the month (but only for the current and future months).

Adjusted Target

IF MIN([Closed Date]) >= DATETRUNC(‘month’, [Today]) THEN SUM([Monthly Target (2020_04_15_WW16_Sales Pipeline)].[Target]) + [Distributed Missed Sales Value] END

Note – we wrap Closed Date in a MIN function as we’re working with aggregated fields, so the Date needs to be aggregated too. MAX would work just the same.

Finally we need to work out what the shortfall is in the existing Pipeline to meet the Adjusted Target (if there is any).

For the months beyond the current month, this is simply the difference between the Pipeline value and the Adjusted Target (but only if the Pipeline is less than the Adjusted Target). For the current month though, it’s the difference between the Pipeline + Closed Won values and the Adjusted Target.

Missing Pipeline

IF ZN(SUM([Pipeline])) = 0 THEN NULL

ELSEIF DATETRUNC(‘month’, MIN([Closed Date])) = DATETRUNC(‘month’,[Today]) THEN
//it’s current month, so need to consider what’s closed & what’s remaining
IF (SUM([Closed Won]) + SUM([Pipeline])) < [Adjusted Target]
THEN ZN([Adjusted Target] – (SUM([Closed Won]) + SUM([Pipeline])))
END

ELSEIF SUM([Pipeline]) < [Adjusted Target] THEN ZN([Adjusted Target] – SUM([Pipeline]))

ELSE 0
END

Add this onto the table

And we’ve now got all the pieces we need to start to build the viz. Name this sheet Check Data or similar. We want this as our reference sheet to make sure our figures remain correct.

Building the Bar Chart

Firstly, duplicate the table viz, and remove the fields we don’t need in the final display (YTD Closed, YTD Target, Missed Sales Value, Distributed Missed Sales Value).

Now move the pills as follows :

  • Closed Date from Rows to Columns
  • Measure Values from Text to Rows
  • Measure Names from Rows to Colour shelf
  • Change Mark Type to Bar

Now move Adjusted Target and Target to the Detail shelf.

Adjust the colours of the remaining measures to suit, and reorder, so that the bars a stacked with Closed Won on the bottom and Missing Pipeline on the top.

Before we deal with the target lines, we’re going to sort the Tooltip out. It’s quite tricky… it might be there’s a better way, but I had to create a few custom calculated fields to get the display required.

Creating the Tooltip

For the first 3 months, the tooltip just needs to display the Closed Won value, but from April onwards, we need to display values for Closed Won, Pipeline & Missing Pipeline, even if the values are 0. Also the first 3 months just show the Target, but the remaining months need the Adjusted Target too. These values are displayed with | symbols in between along with labels, which should only show if relevant.

Firstly, we need to make sure all the measure values displayed, are accessible regardless as to which bar we hover over. So all of the 3 measures (Closed Won, Pipeline & Missing Pipeline) need to be added to the Tooltip. This is done by holding down Ctrl as you drag each pill from the Measure Values area onto the Tooltip shelf. This has the effect of duplicating the pill, and retaining any table calc settings that have been applied.

We only want the text ‘| Adjusted Target :’ to display if there is an Adjusted Target value :

Tooltip : Adjusted Target

IF [Adjusted Target] > 0 THEN ‘ | Adjusted Target : ‘
END

Add this to the Tooltip shelf.

We only want the text ‘| Pipeline :’ to display if there is a Pipeline value

Tooltip : Pipeline

IF [Pipeline] > 0 THEN ‘ | Pipeline : ‘
END

Add this to the Tooltip shelf.

And we only want the text ‘| Missing Pipeline:’ to display if we’re in the current or future months.

Tooltip : Missing Pipeline

IF DATETRUNC(‘month’, [Closed Date]) >= DATETRUNC(‘month’,[Today]) THEN ‘ | Missing Pipeline : ‘ END

Add this to the Tooltip shelf.

Now modify the Tooltip so the various pills are referenced and formatted as required

Finally adjust the Month axis, to set the months to be displayed as abbreviated values.

Adding the Target lines

At first glance, you might think the two target lines are both reference lines. However, if you hover over the tooltip of the Target (the solid line), you’ll see you have the same tooltip as the bars. Whilst there is some ability to control the tooltip of a reference line now, you can’t reference all the pills this tooltip requires.

So the Target is actually a dual axis mark. The Adjust Target however, is a reference line.

To get the Target to display, hold ctrl & drag the Target pill from the Detail shelf to the Rows shelf (to duplicate the pill), next to Measure Values.

On the Target marks card,

  • Remove Measure Names from the Colour shelf
  • Change the Mark Type to Gantt
  • Change the Colour to black, and add a black border too (to make the mark thicker)
  • Make the chart Dual Axis and Synchronise Axis
  • Uncheck Show Header on the Target axis

If you hover over the Gantt mark/Target line, you should have the same tooltip as when you hover over the bar.

The Adjusted Target is a reference line. To add this, right click on the left hand axis and Add Reference Line. Adjust settings as follows :

  • Scope – per cell
  • Value – Adjusted Target
  • Label – None
  • Tooltip – Custom, set to ‘Adjusted Target (Dashed) :’ then add Value from the selector
  • Change the Line to be black and dashed

Both target lines should now be displayed. It’s just now a case of applying some formatting to remove gridlines, row & column lines, adjust font sizes and remove axis title and column titles.

Building the legend

The dashboard displays a custom colour legend. As always there are multiple ways to do this. I chose to ‘fake it’ using aliases and some values associated to a completely different and unused dimension in the data.

Duplicate the Opportunity Name dimension. I just left it as Opportunity Name (copy). On a new sheet, add Oppotunity Name (copy) to the Filter shelf, and select 5 values only.

Then right click on Opportunity Name (copy) and select Aliases. For each of the values you selected in the filter, set an alias based on the legend names to display

Then build the legend as follows

  • Add Opportunity Name (copy) to Columns
  • Type in MIN(1) to the Columns shelf to create a fake axis
  • Add Opportunity Name (copy) to the Text shelf
  • Add Opportunity Name (copy) to the Colour shelf
  • Fix the axis of Min(1) to start at 0 and end at 1
  • Reorder the displayed values to suit.
  • Format to remove all rows/column lines and hide the headers.
  • Format the Label to be centred and size font
  • Clear the tooltip.

Note – I chose to copy the Opportunity Name pill just to make sure I didn’t inadvertently break anything, and to easily revert if things didn’t go to plan :-).

Now the 2 sheets can be placed on the dashboard along with a suitable title.

One final tip – to prevent the user from inadvertently clicking on the legend viz when on the dashboard, add a floating blank image and position over the top of the legend.

My published viz is here.

Happy vizzin’ & stay safe!

Donna