Erica kicked off the 1st year of #WOW2025 with a table calculation based challenge asking us not to just use quick tableau calculations, to avoid the use of the RUNNING_SUM table calculation and to just create 2 calculated fields.
Defining the calculations
On a new sheet, add Order Date to Rows as a discrete (blue) pill at the Month-Year format and add Sales to Text.
For the running total, we want
Running Total
SUM([Sales]) + PREVIOUS_VALUE(0)
which takes the Sales from the current row, and adds it to the value of the cumulative Sales (ie this calculation) in the previous row
For the moving average, we want
3 Month Moving Average
WINDOW_AVG(SUM([Sales]), -2, 0)
which averages the Sales for the current row plus the previous 2 rows (ie 3 rows in total)
Building the Viz
ON a new sheet add Order Date as a continuous (green) pill at the Month-Year format to Columns and Running Total to Rows.
Change the marktype to Area, and set the Colour to #a16eaf with a 25% Opacity.
Add Order Date to Tooltip, and set to the MIN aggregation. Format the pill on the Tooltip shelf to have the <month year> format when displayed on the pane.
Add another instance of Running Total to Rows. Set the Mark type to be Line and reset the opacity on the colour shelf to be 100%. Make the chart dual axis and synchronise the axis.
Add 3 Month Moving Average to Rows. Set the Colour to be #67c79c at 25% opacity.
The add another instance of 3 Month Moving Average to Rows. Set the colour to #67c79c and the mark type to line and increase opacity to 100%. Make dual axis and synchronise the axis.
Update the Tooltip on the All Marks card, referencing the MIN(Order Date) pill and adding $ before the values
Edit the Order Date axis, to start from 01 Jan 2021 and end on 31 Dec 2024. Remove the axis title.
Format the Order Date axis, and set the Dates on the scale to be formatted with the MMMMM notation to just show the first letter of the month
Add a reference line to the Order Date axis, set to be a constant value of 01 Jan 2021, with a custom label of 2021 and the line is formatted to be a white dashed line of 100%
Then format the reference line so the label is aligned top right
Repeat this step 3 more times, adding Reference Lines for 01 Jan 2022 (labelled 2022), 01 Jan 2023 (labelled 2023) and 01 Jan 2024 (labelled 2024).
Finally, hide the right hand axis (uncheck show header), remove column dividers, but show the axis rules on the rows.
And that should complete the challenge – just pop the viz onto a dashboard and publish. My published version is here.
It was Luke’s turn to set the #WOW2023 challenge this week and he chose to focus on remaking a visualisation relating to the change in the Antarctic Sea Ice, inspired by charts created by Zach Labe.
The challenge involved the use of extensive calculations, which at times I found hard to validate due to the steps involved in reaching the final number, and only having visibility of the final number on hover on a point on the chart. If it didn’t match, it became a bit of a puzzle to figure out where in the process I’d gone wrong.
Getting the data for the faded yearly line charts was ok, but I ended up overthinking how the decade level darker line chart was calculating and couldn’t get matches. Anyway, after sleeping on it, I realised my error, and found I didn’t need half the calculations I’d been playing with.
So let’s step through this. As we’re working with moving averages, we’re looking at using table calculations, so the starting point is to build out the data and the calculations required into a tabular form first.
Setting up the calculations
I used the data stored in the Google sheet that was linked to in the challenge, which I saved down as a csv file. After connecting to the file, I had separate fields for Day, Month and Year which I changed to be discrete fields (right click on field and Convert to discrete).
We need to create two date fields from these fields. Firstly
Actual Date
MAKEDATE([Year],[Month],[Day])
basically combines the 3 separate fields into a proper date field. I formatted this to “14 March 2001” format.
Secondly, we’ll be plotting the data on an axis to span a single year. We can’t use the Actual Date field for that as it will generate an axis that runs from the earliest date to the latest. Instead we need a date field that is ‘normalised’ across a single year
Date Normalise
MAKEDATE({max([Year])}, [Month], [Day])
the {max([Year])} notation is a short cut for {FIXED: MAX([Year])} which is a level of detail (LoD) expression which returns the greatest value of the Year field in the data set. In this case it returns 2023. So the Date Normalise field only contains date for the year 2023. Ie if the Actual Date is 01 Jan 2018 or the Actual Date is 01 Jan 2020, the equivalent Date Normalise for both records will be 01 Jan 2023.
Let’s start to put some of this out into a table.
Put Year on Columns, and Date Normalise as a blue (discrete) exact date field on Rows. Add Area(10E6M2) to Text and change to be Average rather than Sum (in leap years, the 29 Feb seems to have been mapped to 01 March, so there are multiple entries for 01 March). This gives us the Area of the Ice for each date in each year.
We need to calculate the 7 day moving average of this value. The easiest was to do this is add a Moving Average Quick Table Calculation to the pill on the Text shelf.
Once done, edit the table calculation, and set so that is average across the previous 6 entries (including itself means 7 entries in total) and it computes down the table (or explicitly set to compute by Date Normalise).
It is best to create an explicit instance of this field, so if you click on the field and press ctrl while you drag and drop it into the data pane on the left hand side, you can then rename the field. I named mine
Moving Avg: Area
WINDOW_AVG(AVG([Area (10E6M2)]), -6, 0)
It should contain the above syntax as that’s what the table calculation automatically generates. If you’re struggling, just create manually and then add this into the table instead.
Add Area (10E6M2) back into the table too. You should have the below, and you should be able to validate the moving average is behaving as expected
Now we need to work out the data related to the ‘global’ average which is the average for all years across a single date.
Average for Date
{FIXED [Date Normalise]: AVG([Area (10E6M2)])}
for each Date Normalise value. return the average area.
Pop this into the table, and you should see that you have the same value for every year across each row.
We can then create a moving average off of this value, by repeating similar steps above. In this instance you should end up with
Moving Avg Date
WINDOW_AVG(SUM([Average For Date]), -6, 0)
Add into the table, and ensure the table calculation is computing by Date Normalise and again you should be able to validate the moving average is behaving as expected
Note – you can also filter out Years 1978 & 1979 as they’re not displayed in the charts
So now we have the moving average per date, and the global moving average, we can compute the delta
Ice Extent vs Normal
[Moving Avg: Area] -[Moving Avg Date]
Format this to 3 dp and add to the table. You should be able to do some spot check validation against the solution by hovering over some of the points on the faded lines and comparing to the equivalent date for the year in the table.
This is the data that will be used to plot the faded lines. For the bolder lines, we need
Decade
IF [Year] = {max([Year])} THEN STR([Year]) ELSE STR((FLOOR([Year]/10))*10) + ‘s’ END
and we don’t need any further calculations. To verify, simply duplicate the above sheet, and then replace the Year field on Columns with the Decade field. You should have the same values in the 2023 section as on the previous sheet, and you should be able to reconcile some of the values for each decade against marks on the thicker lines.
Basically, the ‘global’ values to compare the decade averages against are based on the average across each individual year, and not some aggregation of aggregated data (this is where I was overthinking things too much).
Building the viz
On a new sheet add Date Normalise as a green continuous exact date field to Columns, and Ice Extent vs Normal to Rows. Add Year to Detail and Decade to Colour. Adjust colours to suit and reduce to 30% opacity. Reduce the size to as small as possible. Add Decade to Filter and exclude 1970s. Ensure both the table calculations referenced within the Ice Extent vs Normal field are computing by Date Normalise only.
Add Actual Date to the Tooltip and and adjust the tooltip to display the date and the Ice Extent vs Normal field in MSM.
Now add a second instance of Ice Extent vs Normal to Rows. On the 2nd marks card that is created, remove Year from Detail and Actual Date from Tooltip. Increase the opacity back up to 100% and increase the Size of the line. Sort the colour legend to be data source order descending to ensure the lines for the more recent decades sit ‘on top’ of the earlier ones.
Modify the format of the Date Normalise field to be dd mmmm (ie no year). Adjust the Tooltip as below
Make the chart dual axis and synchronise the axis. Remove the right hand axis.
Edit the axis titles, remove row and column dividers and add row & column gridlines.
Adding the labels
We want the final point for date 18 June 2023 to be labelled with the actual Area of ice on that date and the difference compared to the average of that date (not the moving average). I create multiple calculated fields for this label, using conditional logic to ensure the value only returns for the maximum date in the data
Max Date
{max([Actual Date])}
Label:Date
IF MIN([Decade])=’2023′ AND MIN([Actual Date])=MIN([Max Date]) THEN MIN([Max Date]) END
Label: Area
IF MIN([Decade])=’2023′ AND MIN([Actual Date])=MIN([Max Date]) THEN AVG([Area (10E6M2)]) END
Label:Ice Extent v Avg for Date
IF MIN([Decade])=’2023′ AND MIN([Actual Date])=MIN([Max Date]) THEN AVG([Area (10E6M2)]) – SUM([Average For Date]) END
Label:unit of measure
IF MIN([Decade])=’2023′ AND MIN([Actual Date])=MIN([Max Date]) THEN ‘MSM’ END
Label: unit of measure v avg
IF MIN([Decade])=’2023′ AND MIN([Actual Date])=MIN([Max Date]) THEN ‘MSM vs. avg’ END
All these fields were then added to the Text shelf of the 2nd marks card and arranged as below, formattign each field accordingly
And this sheet can then be added to the dashboard. The legend needs be adjusted to arrange the items in a single row.
This week’s challenge, set by Lorna, made use of the newly released ‘dotted line’ format in v2023.2, so you’ll need that version in order to achieve the dotted line that spans the 2 marks being compared. If you don’t have this version, you can still recreate the challenge, but you’ll just have a continuous line.
Building the calculations
This challenge involves table calculations, so my preferred startig point is to build out all the fields I need in a tabular format.
So start by adding Category to Columns and Order Date at the discrete (blue) week level to Rows. Add Sales to Text (I formatted Sales to $ with 0 dp, just for completeness).
Apply a Moving Average Quick Table Calculation to the Sales pill, then edit the table calculation to it is averaging over the previous 5 values (including the current value – ie 6 weeks in total), and it is computing by Order Date only.
Add another instance of Sales back into the table, so you can check the values.
The ‘moving average’ Sales pill is what will be used to plot the main line chart.
But we need to identify 2 points on the chart to compare – the values associated to a start date determined by the user clicking on the chart, and the values associated to an end date determined by the user hovering on the chart. We’ll make use of parameters to store these dates
pDateClick
date parameter defaulted to 27th Dec 2020
pDateHover
date parameter defaulted to 28 Nov 2011
We can then determine what the moving average Sales values were at these two dates
Sales to Compare
IF DATETRUNC(‘week’, MIN([Order Date])) = [pDateClick] OR DATETRUNC(‘week’, MIN([Order Date])) = [pDateHover] THEN WINDOW_AVG(SUM([Sales]), -5, 0) END
Add this to the table, and the values for the moving average will only be populated for the two dates stored in the parameters
This is the field we will use to plot the points to draw the lines with.
But we also need to work out the difference between these values so we can display the labels.
If the date matches the pDateHover (end) date then return the moving average and then spread that value over every row.
Add these into the table, and you can see how the table calculations are working
The moving avg value for the start date is displayed in every row against the Sales to Compare Start column, while the moving avg for the end date is displayed in every row for the Sales to Compare End column.
With these values now displayed on the same row, we can calculate
Difference
[Sales to Compare End]-[Sales to Compare Start]
formatted to $ with 0 dp
and
% Difference
[Difference]/[Sales to Compare Start]
formatted to a custom number format of ▲0.0%;▼0.0%;0.0%
Popping these into the tables, the values populate over every row, but when it comes to the label, we only want to show data against the comparison date, so I created
Label Difference
IF DATETRUNC(‘week’, MIN([Order Date])) = [pDateHover] THEN [Difference] END
formatted to $ with 0 dp, and
Label Difference %
IF DATETRUNC(‘week’, MIN([Order Date])) = [pDateHover] THEN [% Difference] END
formatted to a custom number format of ▲0.0%;▼0.0%;0.0%
With all these fields, we can now build the chart
Building the viz
On a new sheet, add Order Date set to the continuous (green) week level to Columns and Category to Rows. Add Sales to Rows and apply the moving average quick tablecalculation discussed above, adjusting so it is computing over 5 previous weeks and by Order Date only
Add Category to Colour and adjust accordingly, then reduce the opacity to around 40%
Add pDateClick to Detail then add a reference line to the Order Date axis the references this field. Adjust the Label and Tooltip fields, and set the line format to be a thick grey line.
Format the reference line text so it is aligned top right.
Then add pDateHover to Detail and add another reference line, but this time set the line so it is formatted as a dashed line.
Next, add Sales to Compare to Rows. Adjust the colour of the associated marks cards so it is 100% opacity.
Right click on the Sales to Compare pill and Format. On the Pane tab on the left hand side, set the Marks property of the Special Values section to Hide (Connect Lines). Click on the Path shelf and choose the dashed display option.
Set the chart to be Dual axis and synchronise the axis. Remove Measure Names from the All marks card. Hide the right hand axis.
Add Label Difference and Label Difference % to the Label shelf. Set the font to match mark colour and adjust font size and positioning. I left aligned the font but added a couple of spaces so it wasn’t as close to the line. I also added a couple of extra carriage returns so the text shifted up too.
Finally adjust tooltips, remove column dividers and hide the Category column title. Adjust the axis titles.
Adding the interactivity
Add the sheet onto a dashboard, then add 2 parameter actions
Set Start on Click
On select of the Viz sheet, pass the Order Date (week) into the pDateClick parameter. Leave the current value when selection is cleared (ie when user clicks elsewhere in viz, but not on another mark).
and
Set Comparison on Hover
On select of the Viz sheet, pass the Order Date (week) into the pDateHover parameter. Leave the current value when selection is cleared (ie when user clicks elsewhere in viz, but not on another mark).
Lorna set this challenge this week to test some LoD fundamentals. My solution has a mix of LoDs and table calculations – there weren’t any ‘no table calcs’ allowed instructions, so I assumed they weren’t off limits and in my opinion, were the quickest method to achieve the line graph.
Note – For this challenge, I downloaded and installed the latest version of Tableau Desktop v2022.2.0 since Lorna was using the version of Superstore that came with it. The Region field in that dataset was set to a geographic role type. I built everything I describe below using the field fine, but when I extracted the data source at the end and ‘hid all unused fields’ before publishing, the Region field reported an error (pill went red, viz wouldn’t display). To resolve, I ended up removing the geographic role from the field and setting it just to be a string datatype. At this point I’m not sure if this an ‘unexpected feature’ in the new release…
Ok, let’s get on with the build.
Building the basic viz
I started by building out the basic bar & line chart. Add Region to Rows, Order Date as continuous month (green pill) to Columns and Sales to Rows. Change the mark type to bar, change the Size to manual and adjust.
Drag another copy of the Sales pill to Rows, so its next the other one. Click on the context menu of that 2nd pill, and select Quick Table Calculation -> Moving Average
Change the mark type of the 2nd Sales pill to line.
Now click on the context menu of the 2nd Sales pill again and Edit Table Calculation. Select the arrow next to the ‘Average, prev 2, next 0’ statement, and in the resulting dialog box, change the Previous values to 3 and uncheck the Current Value box
At this point you can verify whether the values match Lorna’s solution when set to 3 previous months.
But, we need to be able to alter the number of months the moving average is computing over. For that we need a parameter
pPriorMonths
integer parameter, default to 3, that ranges from 2 to 12 in steps of 1.
Then click on the 2nd Sales pill and hold down Shift, and drag the pill into the field pane on the left hand side. This will create a new field for you, based on the moving average table calculation. Edit the field. Rename it and amend it so it references the pPriorMonths parameter as below
Moving Avg Sales
WINDOW_AVG(SUM([Sales]), -1*[pPriorMonths], -1)
Adjust the tooltip for both the line and the bar (they do differ). Ignore the additional statement on the final bar for now.
Colour the line black and adjust size. Then make the chart dual axis and synchronise axis. Hide the right hand axis. Remove Measure Names from the Colour shelf of both marks cards.
Colouring the last bar
In order to colour the last bar in each row, we need 3 pieces of information – the value of Sales for the last month, the moving average value for the last month, and an indicator of whether one is bigger than the other. This is where the LoDs come in.
First up, lets work out the latest month in the dataset.
finds the latest date in the whole dataset and truncates to the 1st of the month. Note, this works as there’s sales in the last month for all Regions, if there hadn’t been, the calculation would have needed to be amended to be FIXED by Region.
From this, we can get the Sales for that month for each Region
Latest Sales Per Region
{FIXED [Region] :SUM( IF DATETRUNC(‘month’, [Order Date]) = [Latest Month] THEN [Sales] END)}
To work out the value of the moving average sales in that last month, we want to sum the Sales for the relevant number of months prior to the last month, and divide by the number of months, so we have an average.
First let’s work out the month we’re going to be averaging from
This subtracts the relevant number of months from our Latest Month, so if the Latest Month is 01 Dec 2022 and we want to go back 3 months, we get 01 Sept 2022.
Avg Sales Last n Months
{FIXED [Region]:SUM( IF DATETRUNC(‘month’, [Order Date]) >= [Prior n Month] AND [Order Date] < [Latest Month] THEN [Sales] END)} / [pPriorMonths]
So assuming we’re looking at prior 3 months, for each Region, if the Order Date is greater than or equal to 01 Sept 2022 and the Order Date is less than 1st Dec 2022, get me the Sales value, then Sum it all up and divide by 3.
And now we determine whether the sales is above or below the average
Latest Sales Above Avg
SUM([Latest Sales Per Region]) > SUM([Avg Sales Last n Months])
If you want to sense check the figures, and play with the previous months, then pop the data into a table as below
So now we’re happy with the core calculations, we just need a couple more to finalise the visualisation.
If we just dropped the Latest Sales Above Avg pill onto the Colour shelf of the bar chart, all the bars for every month would be coloured, since the calculation is FIXED at the Region level, and so the value is the same for all rows associated to the the Region. We don’t want that, so we need
Colour Bar
IF DATETRUNC(‘month’, MIN([Order Date])) = MIN([Latest Month]) THEN [Latest Sales Above Avg] END
If it’s latest month, then determine if we’re above or below. Note the MIN() is required as the Latest Sales Above Avg is an aggregated field so the other values need to be aggregated. MAX() or ATTR() would have worked just as well.
Add this field to the Colour shelf of the bar marks card and adjust accordingly.
Sorting the Tooltip for the last bar
The final bar has an additional piece of text on the tooltip indicating whether it was above or below the average. This is managed within it’s own calculated field.
Tooltip: above|below
IF DATETRUNC(‘month’ ,MIN([Order Date])) = MIN([Latest Month]) THEN IF [Latest Sales Above Avg] THEN ‘The latest month was above the prior ‘ + STR([pPriorMonths]) + ‘ month sales average’ ELSE ‘The latest month was below the prior ‘ + STR([pPriorMonths]) + ‘ month sales average’ END END
If it’s the latest month, then if the sales is above average, then output the string “The latest month was above the prior x month sales average” otherwise output the string “The latest month was below the prior x month sales average”.
Add this field onto the Tooltip shelf of the bar marks card, and amend the tooltip text to reference the field.
Finalise the chart by removing column banding, hiding field labels for rows, and hiding the ‘4 nulls’ indicator displayed bottom right.
Creating an Info icon
On a new sheet, double click into the space within the marks card that is beneath the Detail, Tooltip, Shape shelves, and type in any random string (”, or ‘info’ or ‘dummy’). Change the mark type to shape and select an appropriate shape. I happened to have some ? custom shapes, so used that rather than create a new one. For information on how to create custom shapes, see here. Amend the tooltip to the relevant text. When adding to the dashboard, this sheet was just ‘floated’ into the position I was afte. I removed the title and fit to entire view.
Following the #WOW survey where practice in table calculations was the most requested feature, Lorna continues with the theme in this challenge, where the focus is on the moving average table calculation, plus a couple of extra features thrown in.
Moving average
This is based on the values of data points before and after the ‘current’ point, as defied by the parameters which will need to be created.
pPrior
Integer parameter ranging from 1 to 6 and defaulted to 3. You need to explicitly set the Step size to 1 to ensure the step control slider appears when you add the parameter to the dashboard. This will be used to define the number of data points prior to the current to use in the calculation.
Create an identical parameter pPost to define the number of data points to use after the current one.
With these parameters, we can now create the core calculation
As the requirement states that the ‘prior’ parameter needs to include the ‘current’ value, then we need to adjust the calculation – ie if the parameter is 3, we actually only want to include 2 prior data points, as the 3rd will be the current point itself. This is what the +1 is doing in the 2nd argument of the function.
Lorna has stated that 3 Sub-Categories are grouped to form a Misc category, so we need to create a group off of Sub-Category (right click Sub-Category -> Create -> Group).
Multi-select the 4 options that need to be grouped (hold down Ctrl as you select), and then group, and rename the group Misc.
Now we can check what the calculation is doing. If you add the fields onto the view as below, and set the Moving Avg table calculation to compute using Month of Order Date only (see further below), you should be able to see that each month’s moving avg value is calculated based on the sales value of the set of previous & post months as defined by your parameters. In the image below the Moving Avg for Accessories in June 2018, is the average of the Sales values from April 2018 – Sept 2018.
With this you can start the beginnings of the viz – don’t forget to set the table calc as above.
Colouring the lines
This will be managed by using a set.
Right click on the Sub-Category (group) field -> Create -> Set. Initially select all values. Add this field to the Colour shelf. Additionally, click the Detail symbol (…) to the left of the Sub-Category (group), and select the Colour symbol, so this field is also added to the Colour shelf.
The resulting colour legend will look something like thisEdit the colour legend, then choose Hue Circle and select Assign Palette to randomly assign colours to all the options
To show the set values, click on the context menu of the Sub-Category (Group) field on the Colour shelf, and Show Set.
This will add the list of options for selection
Uncheck All so none are selected, which will change the colour legend to read ‘Out, xxx’. Edit the colour legend again, and control-click to multi select all options, then set to a single grey
Now if you select a few options, the ones selected will be coloured, while the others remain grey
Additionally add the set field onto the Size shelf and make the In option bigger than the Out.
Shading the background
For this we need to create an unstacked area chart with one measure representing the maximum moving average value for the month, and the other representing the minimum moving average value for the month. We’ll need new calculated fields for this:
Window Max Avg
WINDOW_MAX([Moving Avg])
Window Min Avg
WINDOW_MIN([Moving Avg])
If you’ve still got your data sheet available, then move Sub-Category (Group) onto Rows, then add the two newly created fields.
In this case there are ‘nested’ table calcs. You need to ensure the setting related to the Moving Avg is computing by Month Order Date only, but the setting related to the Window Max Avg (or Window Min Avg)is computing by Sub-Category (Group).
If set properly, you should see that for each month the max / min values are displayed against every row.
Back to your chart viz sheet, and add Window Max Avg to Rows. Set the table calc settings as described above, then remove the Sub-Category (group) Set field from the Colour shelf of this measure, and change the Sub-Category (group) to be on the Detail rather than Colour shelf.
Change the mark type to Area, set the Opacity of the colour to 100% and set stack Marks to be Off (Analysis Menu -> Stack Marks -> Off).
Now drag Window Min Avg onto the Window Max Avg axis and drop it when the ‘2 columns’ image appears.
This will change the view so Measure Values is now on the Rows shelf and Window Min Avg is now displayed in the Measure Values section on the left hand side.
Adjust the table calc setting of Window Min Avg to be similar to how we set the Max field. And now drag the fields so Window Min Avg is listed before Window Max Avg. Measure Names will now be on the Colour shelf of this marks card, so adjust so Window Min Avg is white and Window Max Avg is pale grey.
Now make the chart dual axes, synchronise the axes, and set the Measure Values axis to the ‘back’.
Everything else is now just formatting and adding onto a dashboard. My published viz is here.
This viz is essentially equivalent to a small multiple display where the charts for a specific dimension (in this case State) get displayed across numerous rows and columns. The difference here, is the row and column for the State to be displayed in, is specifically defined, rather than just sequentially based on how the data is being sorted.
Luke very kindly provided the logic to determine the rows and columns.
Building out the data
Once again, I’m going to start by putting all my data into a table so I can check my calculations, especially since this challenge does involve table calculations (there’s a hint on the Latest Challenges page)
Data Source Filter
Although not explicitly mentioned in the requirements, the information we need to present is based on the dates from 1st March 2020 to 31st July 2020. I messaged Luke to check this, before I then realised it was stated in the title of the viz – doh!
To make things easier, I therefore added a data source filter to remove all the other dates, setting the Report Date to range from 01 March 2020 to 31 July 2020 (right click on the data source -> add data source filter
I then added the basic fields I needed to a table
Province State Name to Rows
Report Date discrete, exact date (blue pill) to Rows, custom formatted to mmmm, dd
People Positive New Cases to Text
I excluded the fields where Province State Name = Null
We need to calculate the 7 day rolling average per Province State Name which we can do by using the UI to create a Moving Average quick table calculation against the People PositiveNew Cases pill, and then editing to compute over the previous 6 records (+ the current record makes 7 days). But I want to be able to reference this field, so I’m going to ‘bake it’ into the data model by creating a specific calculated field
7 day moving Avg
WINDOW_AVG(SUM([People Positive New Cases Count]), -6, 0)
Add this into the table, and edit the table calculation to compute by Report Date only and set the Null if not enough values checkbox
Do a basic sense check that the averages are correct by summing up 7 sequential rows and working out the average by dividing by 7.
So it looks like we’ve got the core measure to be plotted, but we’re going to need some additional fields in the presentation.
Again it’s not explicitly stated in the requirements, but it is in the title, that the values plotted need to be normalised. This is to ensure the data for each state is visible; if a state with a relatively low number of cases is positioned in the same row as one with a very high number of cases, it will be hard to see the data for the state with the low cases, because while the axis can be set to be independent, this will only work against a row and not an individual instance of a chart.
To normalise, we need to understand the maximum 7 day rolling average for each state.
Max Avg Per State
WINDOW_MAX([7 day moving Avg])
Add this to the table, setting both the nested table calcs to compute by Report Date.
In normalising, what we’re essentially going to do is determine the 7 day moving Avg as a proportion of the Max Avg Per State, so every value to plot will be on a range of 0-1.
Normalised Value
[7 day moving Avg]/[Max Avg Per State]
Format this to 2 dp, and add to chart, remembering to check the table calcs continue to compute by Report Date only.
Above you can see the max value for Alabama occurred on 19th July, so the Normalised Value to plot is 1
In the chart displayed, each State is titled by the name of the State. In a small multiple grid of rows & columns, we can’t use a dimension field for this, as it won’t appear where we want it. Instead we’re going to achieve this using dual axis, and plotting a mark at the centre point. For this we need to determine the centre date
This looks a bit complex, so I’ll break it down. What we’re doing is finding the number of days between the minimum date in the data set and the maximum date in the data set. This is
We’re then halving this value ( /2) and rounding it down to the nearest whole number (FLOOR).
We then add this number of whole days to the minimum date in the data set (DATEADD), to get our central date – 16 May 2020.
Now we need a point where we can plot a mark against that date. It needs to be above the maximum value in the main chart (which is 1). After a bit of trial and error, I decided 1.75 worked
Plot State
IF [Report Date] = [Centre Date] THEN 1.75 END
Finally we need to create our Rows and Columns fields which provides the co-ordinates to plot each state. The calculations for these were just lifted straight out of the requirements – thanks Luke!
Building the Viz
Start by adding the Rows and Columns fields to their respective shelf. Set them to be discrete dimensions (blue pills). You should immediately see a ‘map’ type layout of the US States.
Exclude the Null Rows value.
Now add Report Date as a continuous exact date (green pill) to Columns and Normalised Value to Rows, remembering to set the table calc to compute by Report Date only for all nested calculations. Change the mark type to Area.
Add 7 day moving Avg to Label and set the label to display the max value only and adjust the font size – I ended up at 7pt. Then add Province State Name & People Positive New Cases Count to Tooltip. Format the tooltip to match.
Remove all column/row lines and grid lines, zero lines etc.
There is a requirement to ‘add a line underneath each of the area trends’.
For this I added a 0 constant reference line formatted to be a solid black line.
But you’ll notice that for the charts that sit directly side by side, the line seems to be continuous, but I want to break it up. I re-added the column divider line to be a thick white line to get the desired effect.
Right, now lets get the State label added.
Add Plot State to Rows before Normalised Value and change the aggregation from SUM to MIN.
Change the Mark Type to Text and move the Province State Name field from Tooltip to the Text shelf. Adjust the text label to remove any other fields that are displaying, and resize the font – again I used 7. Clear the Tooltip for the this mark, so nothing displays on hover.
Make the chart dual axis and synchronise axis. Remove the Measure Names pill from the Colour shelf on both marks cards which will have automatically been added.
And now all you need to do is remove all the headers (uncheck Show Header) against Rows, Columns, Report Date & Plot Value, then right click on the >8k nulls label at the bottom right and select Hide Indicator.
You’re all done – you just need to add to a dashboard now. My published version is here.
I really enjoyed this challenge – a nice mix of calculations & format complexity but not overly cumbersome, which meant this blog didn’t take so many hours to write this week 🙂
This week for #WOW2020, Ann provided a table calculation feast of a challenge! This certainly is not for the faint-hearted! As well as cracking all the table calcs, the challenge features multiple views, measure swapping, parameters, BANs, filtering, sorting …. it’s got it all going on!
Ann hinted you’d probably want to start with the table, and even if there hadn’t been a table output in the display, this is what I would have done. If you’ve read enough of my blogs, you’ll know I often like to build up a ‘check data’ sheet, which just contains the data I need in tabular form as a quick reference. When working with table calculations this is an absolute must have!
So let’s build out that Check Data table to start with. I have a feeling this is going to be a lengthy blog 🙂
Initial Set up
First up, the requirements stated that the latest date would be 7 June, but I found records with a 8 June date. All the associated info for this date was null though, so I set a data source filter to exclude this. This means I wouldn’t get any issues if I needed to store the max date in a FIXED LoD calculation at any point.
I also found it easier to rename a couple of the measures provided to match the output, so rename PEOPLE_POSITIVE_NEW_CASES_COUNT to New Cases and PEOPLE_POSITIVE_CASES_COUNT to Reported Cases. I’ll refer to these renamed fields going forward.
Building all the Calculated Fields
To build out the table, we’re just going to focus on one State & County, as there’s a lot of data. So add Province State Name = Tennessee and County = Davidson to the Filter shelf.
Add Report Date (discrete exact date – blue pill) and New Cases & Reported to Rows. As you scroll down, you’ll see data starting to come in on 8 March.
We want to create our moving average calculations
3 Day Moving Avg
WINDOW_AVG(SUM([New Cases]), -2, 0)
14 Day Moving Avg
WINDOW_AVG(SUM([New Cases]), -13, 0)
Notice the number of rows to average over is 1 less than you might expect, as the current row is included, so the calculation is saying ‘current row’ and 2 | 13 previous rows.
Add these to the table, and adjust the table calculation so it is explicitly calculating by Report Date. This would have happened automatically, as the calculation would have been computing ‘down’ the table, but it’s best to fix the computation, so it doesn’t matter where the pill gets moved to in the view.
We now need to work out whether there is an increase or not between the 3-day and 14-day average.
Is Increase?
IF [3 Day Moving Avg] > [14 Day Moving Avg] THEN 1 ELSE 0 END
Is Decrease?
IF [3 Day Moving Avg] <= [14 Day Moving Avg] THEN 1 ELSE 0 END
I’m using 1s and 0s as it’s going to help with a later calculation.
NOTE – I’m assuming that if there is ‘no change’ it’ll be recorded as a decrease. This is how I interpreted the requirement, “ …whether it is an increase or a decrease (or no change)” and it wasn’t easy to find any matches anyway.
I also need some text to indicate the increase or decrease
Increase | Decrease
UPPER(IF [Is Increase?]=1 THEN ‘Increase’ ELSE ‘Decrease’ END)
The UPPER is used as that’s part of the tooltip formatting.
Let’s get these onto the view, always making sure the table calculations are set to Report Date.
We need to calculate the number of days that has been reported INCREASE in succession, and the number of days where successive DECREASE has been reported.
So first, let’s identify which rows match the previous row.
Match Prev Value?
LOOKUP([Is Increase?],-1) = [Is Increase?]
If the value of the Is Increase? field in the previous (-1) row is the same as the Is Increase? field in the current row, then this is true, else false.
Add to the view, and verify the table calculation for itself and all nested calculations being referenced, is set to Report Date.
We now have all the information we need to help us work out the number of days in the increase/decrease ‘trend’.
Days in Trend
IF (FIRST()=0) OR(NOT([Match Prev Value?])) THEN 1 ELSEIF [Increase | Decrease] = ‘INCREASE’ THEN ([Is Increase?]+PREVIOUS_VALUE([Is Increase?])) ELSEIF [Increase | Decrease] = ‘DECREASE’ THEN ([Is Decrease? ]+PREVIOUS_VALUE([Is Decrease? ])) END
If the row in the table is the very first entry (so there’s nothing previous to compare against), or the row in the table didn’t match it’s predecessor (ie there was a change), then we’re starting a new ‘trend run’, which obviously starts at 1.
Otherwise, if the current row we’re on indicates an increase, then we’ll add the value of the Is Increase? field (which is 1) to the previous value (which is also 1). PREVIOUS_VALUE works recursively though, so it essentially builds up a running sum, which gives our trend.
We ultimately do the same thing using the Is Decrease? column. This is why using 1 & 0s in the earlier calculation help.
Adding into the view, and setting the table calculation correctly, you should get something similar to this…
Finally, there’s one key field we need to add; something to help identify the latest row as we will need it for filtering in the table that’s displayed on the dashboard. Simply applying a standard ‘quick filter’ won’t work, as the table requires we show the 3-day & 14-day moving averages. A ‘quick filter’ to limit the data to the latest date (7th June), will show the wrong values, as the data related to the other days will be filtered out, so the table calc won’t have the information to correctly compute over.
We need to create another table calculation that we can use as a filter, and that due to Tableau’s ‘order of operations’ will apply later in the filtering process than a traditional quick filter.
Max Date
{FIXED : MAX([Report Date])}
The latest date in the whole data set.
Show Data for Latest Date
LOOKUP(MIN([Report Date]),0) = MIN([Max Date])
If the Report Date of the current row is the same as the maximum date in the whole data set, then return true.
We’ve now got all the core data components we need to create the various charts.
In the interest of time (my time in writing this out), I’m going to attempt not to describe the building of all the charts in too much detail, but just call out the useful bits you might need. If you’re attempting this challenge with the table calcs above, I’m assuming you know Tableau enough to not need everything defined to the lowest level.
The whole report is driven off a parameter which the user must enter a State – County combo.
You’ll need a calculated field to store the combo
State – County
[Province State Name] + ‘ – ‘ + [County]
and then create a parameter (State – County Parameter) off of this (right click, Create -> Parameter) which will create a string parameter with all the permutations.
When displaying on the dashboard, set this to be of type Type In
BAN
The BAN is a basic summary of the latest trend for the entered state county.
We need to filter the sheet to the value entered in the parameter
Is Selected State County?
[State – County Parameter] = [State – County]
Add this to the Filter shelf as true, along with the Show Data for Latest Date.
Add the relevant fields to the Text shelf to display the required text. The Report Date needs to be custom formatted to ddd, mmm d to get the Sun, Jun 7 display
Map
For the map, as well as filtering the latest date, we’re also going to need to filter just to the state only (not state & county) as above. So I created
Is Selected State?
LEFT([State – County Parameter], FIND([State – County Parameter],’-‘)-2) = [Province State Name]
This is unpicking the State – County combined string stored in the parameter, to just find the State part and compare to the Province State Name.
Build a filled map based on County and filter to the latest date and the selected state. I set the Map Layers to that below, which seems to match up
You’ll need to set both the Is Selected State County? and Increase|Decrease fields to the Colour shelf.
Bar & Line Chart
You’re going to need a few more calculated fields for this.
Moving Avg Selector
for the user to choose what the line should display. I’ve set it to an intger parameter that displays text
We then need a field to show on the display depending on what’s been selected in the parameter
Moving Avg to Display
If [Moving Avg Selector] = 3 THEN [3 Day Moving Avg] ELSE [14 Day Moving Avg] END
You’ll need a Dual Axis chart plotting New Cases and Moving Avg to Display against Report Date (continuous exact date)
The data only starts from 8th March, so I added Report Date to filter to start from 8th March. 8th March is also added as a constant reference line.
Table
Based on the State / County entered, the table is filtered to show the data for the latest date for all the counties in the state entered. Although not stated in the requirements, the first row is the county selected, with the rest ordered by Reported Cases.
You can get the selected county to the top, by adding Is Selected State County as a hidden field to the Rows, and moving ‘True’ to the top.
And that should be everything you need to build the dashboard, which is pretty much just stacking all the sheets one on top of each other in a single column.