For 2020 Week 42, the #WoW founder, Andy Kriebel, returned with a challenge to reproduce the Strava training calendar. Compared to some challenges recently, this looked to be quite straight forward; Andy threw in some specific requirements to test certain features – ie no data modelling and no LoDs.
I’ve been doing #WorkoutWednesday challenges since they first started, so I know that Andy is a stickler for formatting and layout – points not necessarily listed as a requirement, just expected as part of the challenge to reproduce. I kept my fingers tightly crossed when I published that I’d got all the finer details, but alas, Andy still found fault – my month summaries weren’t right aligned (my bad – missed that little nuance completely), and my bars had borders on them… Andy must have eyes like Superman to have seen that, as it wasn’t obvious. It also wasn’t a setting I’d intentionally added. I later found out that adding a particular type of pill to the Detail shelf caused borders to automagically be added… There’s always something to learn when Andy’s about!
So onto the challenge – as with previous weeks, I’m going to try to focus on the areas that may be a bit trickier / newer to some rather than detail the complete build step by step.
- Using the data sets – blending
- Building the calendar grid
- Ensuring a 0 measure value is displayed for missing days
- Adding the monthly hours summary
- Building the BANs
- Year Filter control
- Remove highlighting
- Setting the colour of the Calendar chart background
Using the data sets – blending
Andy was very specific that the 2 data sets provided should be used separately and not joined in any via the data pane.
This meant the data sources would need to be blended (further detail on this is here). Blending used to be one of the only ways within Tableau you could combine data together.
When blending, the number of rows in your output will never be more than the number of rows in your primary* data source. If there are multiple matching rows in the secondary data source, then the results will be aggregated in the display.
* whatever data source the first pill you add to your canvas comes from, will be the primary, and is denoted by a small blue icon by your data source. Secondary data sources are denoted by a small orange icon.
In the case of this challenge, we had a data set containing a list of dates (1 row per day from 01 Jan 2014 up to 31 Dec 2021), along with Andy’s Strava activity, containing a row for each activity recorded, which included the date time the activity occurred. This data could vary in that there could be multiple activities on the same day, and equally days when no activity occurred at all.
So the Calendar data set is our primary data source, as we need to show a bar on the calendar chart for every day of the year, regardless if there’s any activity. The Activity data set is our secondary data source. The number of hours, number of activities etc can all be aggregated from this data set.
When blending data sources, especially on dates, I prefer to create explicit calculated fields that define the fields I want to blend on. So in the Calendar data source I created
essentially just a duplicate of the existing Date field, and in the Activity data source, I also created
Note the fields are spelled exactly the same, so Tableau automatically uses them as the linking fields when the view is built.
If you now do the following
- Add the Calendar.BLEND: Date field to the Filter shelf, and select the Year = 2020,
- Add Calendar.BLEND: Date as an exact date to Rows
- Add Activity.BLEND: Date as an exact date to Rows
- Add Activity.Seconds to Text
You can see that fields from the secondary data source have an orange icon by them; and that there are Null/missing values for the records from the secondary data source as these were the days when there was no activity recorded. You can also see a red link icon against the BLEND: Date field in the left hand data source pane, as this identifies how the two data sets are being matched.
Building the calendar grid
The calendar is essentially a ‘small multiple’ layout with each month being positioned in a particular row or column. To build out this layout we need to define the row number and the column number. There are many ways to build a dynamic small multiple grid which can flex based on the number of items you might be trying to organise, but for the purpose of this exercise, we can keep it simple. We’re working with 12 months that are to be displayed in a 4 x 3 grid layout. Create the following calculated fields in the Calendar data source.
IF MONTH([Date])<=4 THEN 0
ELSEIF MONTH([Date]) <=8 THEN 1
I make both of these to be dimensions rather than measures by dragging them above the line on the left hand data source pane. If you build out the view as below, you can see how these calcs are working
As we want to show a mark for every day in the month, we need to add the day of the month from the Calendar data source to Columns. Drag Calendar.BLEND: Date to Columns, then select the drop down to change to the Day date part
We need to show the amount of time in hours rather than seconds. In the Activity data source, create the field
and drag this onto the Rows, and change the mark type to bar. If need be re-add the YEAR(BLEND: Date) = 2020 to the Filter shelf. Now add Calendar.BLEND: Date as an exact date to the Detail shelf. You should now have
where you can see the gaps in the days where no activities took place, and if you hover vertically, you should find that the days of the month are vertically aligned – ie 30th Jan aligns with 30th May etc.
Ensuring a 0 measure value is displayed for missing days
With the above we displayed the Activity.Hours field, but if you hover over the day when there is no activity, nothing displays on the tooltip rather then 0.
To fix this, create a calculated field in the primary Calendar data source
ZN(SUM([Sheet1 (Activities Summary)].[Hours]))
This is basically just referencing the field in the secondary blended data source, but wrapping in a ZN() function means it will display 0 when no match can be found
Use this field from the primary data source instead on the calendar viz.
Adding the monthly hours summary
The requirements meant Andy expected the summary to be displayed within the same sheet as the daily calendar viz.
For this I used an old friend MIN(0) to create another axis, which is placed on the Rows in front of the Hours measure.
What I now plan to do is set this axis to be Text and plot the month, monthly hours, and the word ‘hours’ at a specific point to the right of the each cell – I’m choosing day 28 – you might want to experiment and choose a different day.
First up though, I need to build some fields to plot.
Month Name Abbrev
IF DAY([Date]) = 28 THEN
Hours in Month
IF MIN(DAY([Date])) = 28 THEN
IF DAY([Date]) = 28 THEN ‘HOURS’ END
Add Month as an exact discrete date to Detail and the other 3 fields to the Text shelf of the Min(0) marks card (change the mark type to Text if you haven’t already done so). Alter the table calculation setting of the Hours in Month field to compute by all fields except Month
Building the BANs
These use a similar concept as above, by using 3 instances of MIN(0) placed side by side on the Columns shelf and set to the Text mark type. This creates 3 marks cards which you can then add the relevant measures and text on.
The measures are all coming from fields in the primary data source that reference measures in the seconday data source ie
# Activities (in Activity data source)
# Activities (in Calendar data source)
ZN([Sheet1 (Activities Summary)].[# Activities])
#Miles (in Calendar data source)
ZN(SUM([Sheet1 (Activities Summary)].[Miles]))
Year Filter Control
All the sheets you are building need to be filtered by the same BLEND: Date field from the Calendar data source (set the filter to Apply to all worksheets).
When this field is added to the dashboard, you can customise it so the All values does not show and the slider control also doesn’t display
To stop items on the dashboard from highlighting when they are clicked on, I use a trick that has been probably been the ‘most used trick of #WOW2020’ 🙂
In the primary data source, create a field called True which contains the value TRUE and a field False containing the value FALSE. Add both these fields to the Detail shelf of each sheet you don’t want highlighting on.
On the dashboard, create a Filter URL action for the each sheet that goes from the sheet on the dashboard to the sheet itself, and passes selected fields setting true = false. As this condition will never be true, then there is nothing to ‘filter’ so the marks don’t highlight. This needs to be repeated for each sheet on the dashboard, so I had 3 filter dashboard actions.
NOTE – a consequence of adding the True and False fields to the Detail shelf on the bar sheets, was that it caused a border to be added around the bars.
This wasn’t something I noticed, as it isn’t at all obvious, but Andy called it out!
Setting the colour of the Calendar chart background
You need to format the sheet and set the fill colour of the Pane rather than the whole sheet to grey.
There’s obviously a lot of other formatting settings to apply to get rid of all the row/column borders and gridlines etc, but this was a slight difference that I wanted to call out, as ended up with a ‘border’ on my dashboard that wasn’t required when I set the whole worksheet background.
Right, I think that’s about it for this week! Thanks for the fun challenge Andy – great to have you back!
My published viz is here.
Happy vizzin’! Stay Safe!