After connecting to the data, add Team to Rows and Timestamp as a continuous (green) pill at the Day level to Columns. Create a new field
Call Count
COUNT([synthetic_call_center_data.csv])
and then add this to Rows.
Add Timestamp to Filter, and select relative date. Set the options to Last 3 weeks, and then additionally check the anchor relative to check box and enter 23 Dec 2024. This is because the data set only goes up to the end of December 2024. Not setting this field will apply the date filter based on ‘today’ so it’s unlikely anything will appear.
(Note – you may also need to update the date properties of the data set to ensure a week starts on a Sunday to get matching numbers: right click the data source and select the date properties option).
To add a marker to the last point, create a field
Call Count – Most Recent
IF LAST()=0 THEN [Call Count] END
Add this to Rows and adjust the table calc setting so it is computing specifically by Day of Timestamp only. By default it was doing this via the Table (across) option, but I tend to always prefer to always explicitly fix what the calculation is computing over, as it won’t then matter where I then move that field too if I choose to change the layout of the viz.
Set the mark type on the Call Count marks card to line, and then adjust the colour to grey and reduce the size. Set the mark type of the Call Count – Most Recent marks card to circle, set the colour to blue and increase the size. Hide the null indicator (right click > hide).
Set the chart to dual axis, synchronise the axis and then remove the Measure Names field from the All marks card.
remove both the axis titles (right click axis > edit axis), hide the right hand axis (right click, untick show header), and format to remove the column divider from the header section only.
Now we’ve got the core display, we need to create the following fields
No. of Calls
WINDOW_SUM([Call Count])
Highest Call Vol
WINDOW_MAX([Call Count])
Lowest Call Vol
WINDOW_MIN([Call Count])
Avg Call Vol
WINDOW_AVG([Call Count])
format this to a number with 0 dp
Calls this period
WINDOW_MAX([Call Count – Most Recent])
the window_max is required here, as the data set we’re displaying at the day level, has 2 values – the latest value and null. We only want to return 1 value, which is the maximum of these.
Previous Period
WINDOW_MAX(IF LAST()=1 THEN [Call Count] END)
LAST()=1 returns the value of the next to last record, and the window_max is again applied, as the nested IF clause will return null for all others records.
Period Var
[Calls this period] – [Previous Period]
Add each of these fields, one by one, to Rows following the steps below
Add to rows (it will automatically display as a green continuous pill).
change to discrete (right click on the pill and select discrete – the pill will turn blue and move to before the green pills)
Explicitly set the table calc to be computing by Timestamp (as above)
Once, you should have something that looks like this
but I noticed, that the display in the solution is sorted based on the total number of calls and not by Team, so add a Sort to the Team pill to sort by Call Count descending
Update the Tooltip if you wish, and then add the viz to a dashboard, floating the Timestamp filter.
For this week’s challenge we’re going to look at building an alternative to a stacked bar chart. This challenge was inspired by this post by the Flerlage twins, which in turn was inspired by other members of the #datafam community. Once again, we’re making use of some Premier League data.
Building the viz
Add Team to filter and select Manchester United, Chelsea, Arsenal and Manchester City then add Season to filter and select the last five seasons. Then add Team and Season to Columns and Points to Rows.
Sort team by the field Points descending. Add Team to colour and adjust accordingly. Show Mark labels to display the number of points for each bar.
Add subtotals via the analysis menu > totals > show all subtotals
Hide the subtotal bar by clicking on one of the total bars and selecting hide from the drop down which is defaulted to automatic
Right click on the Total label on the axes and select format – in the left hand side, delete the label field
Create a new calculated field
Total Points
WINDOW_SUM(SUM([Points]))
And add this to rows. Remove Team from the colour shelf on the marks card, and change the colour to pale grey. Increase the size of the bars to be as wide as possible. Uncheck show mark labels. Hide the total bar like we did above.
Make the chart dual axis and synchronise axis. If need be, right click the right axis and move marks to back. Adjust the table calculation on Total Points so it is computing by Season only. Hide the total bar again if it reappears.
We want to label the total bar with the Team and the Total Points so we need to create some more calculated fields
Label: Team
IF [Season] = ‘2021-22’ THEN [Team] ELSE NULL END
We’re being sneaky here, and just labelling the central bar.
Label: Total Points
IF MIN([Season]) = ‘2021-22’ THEN [Total Points] END
Add Label: Team to the label shelf of the Total Points marks card and adjust so it is an attribute – this means it’s not referenced in any table calculations. Add Label: Total Points to the label shelf too.
Adjust the label as required and then change the alignment so the direction of the text is swapped. Format the font as desired.
Delete all the text from the Tooltip on the Total Points marks card and adjust the text on the Points marks card to match the required format.
Hide the right hand axes (right click axis > uncheck show header). Hide the Team column heading. Remove all gridlines/ axis lines/ zero lines and row and column dividers.
Hide the Season label heading (right click on the label > hide field labels for columns).
Format the axes font to be smaller and rotate the axes labels on the Season axis.
Name the sheet Bar chart or similar and add to dashboard.
Ann Jackson set this week’s #WOW2021 challenge, based on a recent ‘real world’ situation she had encountered.
Analysing Ann’s solution (by interacting with her published solution), I deduced we’d need to use set actions to add and remove the selected Sub-Categories into and out of the set (I hadn’t noticed Ann had tagged the challenge on the main page with Set Actions 🙂 ). I also realised the initial visual in the first column, wasn’t using reference lines to depict the target, as the tooltip displayed on hover, was much more detailed than what you can add to a reference line tooltip.
So armed with this knowledge, I set about building what was required.
Defining the required calculations
Building the BANs
Building the viz
Adding the interactivity
Defining the required calculations
This is one of those challenges where I want to get all the calcs sorted up front in a tabular view, before even attempting the viz. I’ll go through what I ended up with, but be assured this did take a bit of time and change of direction to get what I needed.
Let’s start with parameters. Firstly I created a parameter to simulate ‘today’, that is hardcoded to 9th June 2021
I then created a parameter to store the % uplift we want to use for one of the goal options. I default this to 0.1 (ie 10%)
We need to work out the sales so far this year (ie the sales in 2021 up to and including 9th June 2021), and the equivalent sales for the previous year (ie the sales in 2020 up to and including 9th June 2020).
SALES YTD by Sub Cat
IF YEAR([Order Date]) = YEAR([pToday]) AND [Order Date] <= [pToday] THEN [Sales] END
SALES LY YTD by Sub Cat
IF YEAR([Order Date]) = YEAR([pToday])-1 AND [Order Date] <= DATEADD(‘year’,-1,[pToday]) THEN [Sales] END
Let’s pop these into a table with the Category and Sub-Category fields, and add grand totals, so we can see what figures we need to be aiming for the BANs at the top.
Now let’s work out the values for the goals.
One goal is based on finding the average sales per day in 2021, then extrapolating this across the whole year.
So firstly, we need to know how many days in the year up to ‘today’.
This finds the number of days between 01 Jan 2021 and 01 Jan 2022.
These methods ensure the right number of days is recorded if the ‘current’ year happens to be a leap year.
We can now work out one of the goals
Same Pace Goal
(SUM([SALES YTD by Sub Cat])/[# Days So Far This Year]) * [# Days In Year]
Add up the Sales so far this year and divide by the number of days so far to get the average sales value per day, then multiply by the total number of days in the year.
For the other goal, we need to determine the sales for the whole of the previous year, and multiple by the uplift %.
SALES LY
IF YEAR([Order Date]) = YEAR([pToday])-1 THEN [Sales] END
This gives the total sales for 2020.
LY + Percent Increase
SUM([SALES LY])*(1+[pPercentIncrease])
This applies the % increase parameter to the total sales for 2020.
Now we want to decide which of the goal values to use based on ‘selection’. To define the selected Sub-Categories, we’re going to use a set. Right-click on Sub-Category > Create > Set, name accordingly and select ‘Phones’ as the initial value in the set.
Use Last Year With Increase
Now we can use whether the record is in or out of the set in the logic to determine the goal to use
YEAR_END GOAL per Sub-Cat
IF ATTR([Use Last Year With Increase]) then [LY + Percent Increase Goal] ELSE [Same Pace Goal] END
Add the set and the goal field to the table, and you’ll see the value in the final column is matching the relevant previous columns, depending on whether the Sub-Category is IN or OUT of the set.
NOTE – when you do this, you’ll get In or Out displayed against each row. To get the description as I’ve got displayed, right click on the text In or Out and Edit Alias.
These data fields are going to be used to build out the central viz. There’s a couple of other fields we need to finish off the data requirements for the viz
Value to Goal
[YEAR-END GOAL Per Sub Cat]-SUM([SALES YTD by Sub Cat])
and
% of Goal
SUM([SALES YTD by Sub Cat])/[YEAR-END GOAL Per Sub Cat]
This field needs to be formatted to percentage with 0 dp (all other monetary fields need to be formatted to $ with 0 dp).
So now we have the core fields stored against each row that will help us build out the main viz. You can edit the set and add further sub-categories, so you can validate how the values change.
Now we want to work on the data we need for the BANs. You may think the grand totals sum up all the rows above, however while this works fine for the SALES YTD by Sub Cat and SALES LY YTD by Sub Cat columns, it isn’t the case for the YEAR-END GOAL by Sub Cat column.
If we select all the rows in that column, and examine the tooltip that appears on hover, the total of the selected columns differs from the total at the bottom of the column.
The value in the hover text is what we need. The discrepancy with the column grand total is because it’s working across the whole data and not row by row. Because at least 1 record is in the set, its taking the logic to use the same pace goal. You’ll notice the total of this column matches the total of the Same Pace Goal column. In fact as you add more values to the set, the total will match the Same Pace Goal total up until all values are in the set. At that point the total will match the LY + Percent Increase column.
So back to the summary measures. We’re going to use table calcs to solve this.
SALES YTD
WINDOW_SUM(SUM([SALES YTD by Sub Cat]))
SALES LY YTD
WINDOW_SUM(SUM([SALES LY YTD by Sub Cat]))
YEAR-END GOAL
WINDOW_SUM([YEAR-END GOAL Per Sub Cat])
These are all basically summing up the values displayed in the rows on the screen. Add them to the table, and you can see the same values are displayed on each row which tally to other data in the table
Right, we have all the data fields we need, let’s start building.
Building the BANs
On a new sheet, add the fields as below
We want to turn this into 1 row.
Create a field called
Index
INDEX()
and add this to the Rows shelf and change to be discrete (blue pill). Each row should be numbered from 1 to 17.
Now drag the Index field from the Rows shelf onto the Filter shelf and when prompted select 1.
You’ve now got 1 row displayed, but the data associated to all the rows is being included in the calcs (if you’d filtered just to Sub Category = Accessories for example, the date would be just related to the rows with the Accessories value).
Now we can tidy this to look how we want
Add Measure Names to Text shelf (change display to entire view if need be).
Hide Sub-Category from displaying (uncheck Show Header)
Right Click on each column title and Edit Alias to remove the ‘along…’ text
Format the text appropriately and align centrally
Adjust the tooltip to remove the Sub-Category info
Hide the Measure Names from displaying (uncheck Show Header).
Format the Row Dividers to be thick
Building the Viz
Add Category and Sub-Category to the Rows shelf on a new sheet. Change the sort of the Category field to sort by data source order, descending. Change the sort of the Sub-Category field to sort as below
Add SALES YTD by Sub Cat to Columns, then drag SALES LY YTD by Sub Cat onto the canvas and drop onto the SALES YTD by Sub Cat axis when you see the 2 columns appear
Drag Measure Names from the Rows shelf to the Size shelf. Add Measure Names to the Colour shelf. Adjust sizes and colours accordingly. Add a white border to the bars on the colour shelf.
Turn stack marks off (Analysis > Stack Marks > Off) to allow the bars to overlay each other.
Add both SALES YTD by Sub Cat and SALES LY YTD by Sub Cat to the Tooltip shelf, so you can reference both values regardless which bar you’re hovering over.
Add YEAR-END GOAL by Sub Cat to the Columns shelf and set to dual axis and synchronise axis. Reset the mark type of the Measure Values card to bar, and of the YEAR-END GOAL by Sub Cat card to Gantt.
Remove the Measure Names from the Size and Colour on the YEAR-END GOAL by Sub Cat card. Set the colour of this mark to black.
On the All marks card, add YEAR-END GOAL by Sub Cat to the Tooltip shelf so once again it’s value can be referenced by all marks.
Adjust the tooltip on the All marks card to match.
‘Type in’ MIN(0) to the Columns shelf (double click in the space next to the pills to enable the text edit feature). On this MIN(0) marks card, change mark type to Text and add Value to Goal to the Text shelf. Adjust the text to include the additional wording and adjust size too if need be. Adjust tooltip too.
Repeat this process for the % of Goal field too.
For the circles, create another MIN(0) column in the same way, but this time change mark type to circle, and add Use Last Year with Increase set to the Colour shelf and adjust. Adjust tooltips.
Format axes/gridlines/text/row banding accordingly and rotate text of the Category field.
Adding the interactivity
Add the sheets to a dashboard. Add a set action to add Sub-Categories to the set on click (Dashboard > Action > Add Action > Change Set Values). Set the action to run on select (on click) of the chart viz, to add values to the Use Last Year with Increase set.
We then need another set action to remove the Sub-Categories, but this needs to work by clicking a link in the tooltip which displays. The name of this action will display on the tooltip.
And fingers crossed.. that should be it! My published viz is here.
As soon as I saw that Candra’s challenge for this week was going to involve Regular Expressions (RegEx), I gave a little groan. RegEx just isn’t my thing 😦 I only ever seem to use them for these challenges, and not in my working life, so have minimal experience. I always think I should focus some time on learning them properly, but other things just end up taking priority. Ho Hum…
So most of my time was spent trying to wrangle the info I needed to identify ‘how many bedrooms’ each property had. I did a bit of googling to try to find the right expressions I think I needed, used the regex101 site to test my expression to find certain patterns of text against some of the data in the Description field, and then tried to plug that into a calculated field in Tableau to extract the data I needed.
But I couldn’t get it to work 😦 I could find matching text using the REGEXP_MATCH function, but when I then tried to use the REXP_EXTRACT functions I couldn’t get anything out…
So I ended up having to look at the solutions that had already been published by the time I started, Candra’s, Lorna Brown’s and Sam Epley’s. I just needed to get my head round what I was obviously doing wrong and give me some pointers. All 3 had slightly different approaches. I absorbed, then closed their workbooks and attempted again from memory. With a lot more trial and error I got somewhere… it isn’t perfect and has some mismatches from the others (but they don’t all match each other either…).
Once I’d got a grouping for each property, the actual Tableau stuff was quite straightforward…
Identifying the ‘Number of Bedrooms’
Building the Histogram
Adding the Average Price
Building the Map
Adding the Interactivity
Identifying the Number of Bedrooms
So the way I approached this, was to try to identify all the various permutations that represented the word ‘bedroom’ and replace it with the word ‘Bedroom’. But one of the options was BR or br, and the Description field contained html markup with the term <br />. I didn’t want all these to become ‘bedroom’, so I got rid of them all first,
Firstly, replace any occurence of <br /> with a space, then replace any occurrence of the text bedroom or br<space> or bdrm or bed or bd or br<comma> or br<forward slash> or rooms with the word Bedroom.
I basically added more options to the or statement (identified by the | separator), as I went on examining the descriptions that were left. Using the LOWER function meant that bedroom or Bedroom or BedRoom etc would all be covered with one option.
Then I attempted to extract the number of bedrooms or identify as a studio
Studio | Beds
IF CONTAINS(LOWER([Desc with Bedroom]), ‘studio’) THEN ‘Studio’ ELSEIF REGEXP_MATCH(LOWER([Desc with Bedroom]),’\d bedroom’) THEN REGEXP_EXTRACT(LOWER([Desc with Bedroom]),'(\d+) bedroom’) ELSEIF REGEXP_MATCH(LOWER([Desc with Bedroom]),’\d bedroom’) THEN REGEXP_EXTRACT(LOWER([Desc with Bedroom]),'(\d+) bedroom’) ELSEIF CONTAINS(LOWER([Desc with Bedroom]), ‘six bedroom’) THEN ‘6’ END
If the revised description contains the word ‘studio’ then assume its a Studio.
Else if the revised description contains a number (\d) followed by 2 spaces then the word ‘bedroom’ then extract the numbers (\d+) that occur before the word bedroom. The brackets around the \d+ is what is used to identify what bit of the matching pattern to extract… this is the bit that I didn’t really know about and why I couldn’t get things to work.
Else if the revised description contains a number (\d) followed by 3 spaces then the word ‘bedroom’ then extract the numbers (\d+) that occur before the word bedroom. This just happened to be another pattern that occurred and meant some records didn’t get picked up by the prior statement. There’s probably a better way of doing this in one statement…
Finally, if the revised description contains the text ‘six bedroom’ then assume the property has 6 rooms.
This logic seemed to get a match against every record although it’s not 100% accurate, but it was close enough given my struggles.
I then wanted to get the rooms grouped
Room Grouping
CASE [Studio | Beds] WHEN ‘Studio’ THEN ‘Studio’ WHEN ‘1’ THEN ‘1 Bedroom’ WHEN ‘2’ THEN ‘2 Bedrooms’ WHEN ‘3’ THEN ‘3 Bedrooms’ WHEN ‘4’ THEN ‘4 Bedrooms’ ELSE ‘5 or more Bedrooms’ END
I planned to use this field as my filter, but in doing so the value listed alphabetically, so Studio ended up at the bottom of the list.
To resolve this I created a parameter which meant I could define the order I wanted :
pBedroomSelector
And then I created a new field to use for the filter
Filter Room
[pBedroomSelector] = ‘All’ OR [pBedroomSelector] = [Room Grouping]
I could then add this onto the filter shelf of the sheets I needed to build, setting the value to True.
Building the Histogram
For this chart, we need to ‘bin’ the Price of each property into groups of $100 ranges. However if we use the built in ‘bin’ function, the field created can’t be referenced in other calculations, and I needed to do this. So instead I determined the ‘lower’ value of the range by
Price per Night Min
FLOOR([Price]/100) *100
Divide the price by 100, round down to the nearest whole integer (so 1.9 will round down to 1), then multiply the result by 100.
And given that, I can then calculate
Price per Night Max
[Price per Night Min]+100
I also created a ‘friendlier’ field to store the number of properties
# of Listings
COUNT([listings copy_listings copy])
which is just a reference to the auto generated field created when you connect to the data source.
With these I can plot the histogram
Price per Night Min on Columns (set to discrete, continuous)
# of Listings on Rows
Mark type of Bar
Size set to be Fixed with a width of 100
Filter Room on the Filter shelf, set to True.
Adjust the colour via the Colour shelf and set a white border
Show the pBedroomSelector parameter
Add Price per Night Max to the Tooltip shelf and set to be an attribute.
Set the Tooltip accordingly and format gridlines, axes labels etc
Adding the Average Price
I wasn’t entirely sure what the average price on Candra’s solution represented, so I chose to go for the average price of the properties in the filtered selection; that is of all the 2-bedroom properties for example, find the average price per night, based on the total price per night of all the properties divided by the number of properties. ie I was looking for these values in the 3rd column.
But I couldn’t simply add the Price field aggregated to Avg to the bar chart. Doing so gave me different values per Price per Night Min grouping.
I just want the value on the grand total line spread across the all the data in the chart. So I created
Window Avg Price
WINDOW_SUM(SUM([Price])) / WINDOW_SUM([# of Listings])
This table calculation, set to compute by Price per Night Min gives the value I want across all rows of data
Add Window Avg Price to the Detail shelf of the histogram, set the calc to compute as above. Then you can add a reference line to the Price per Night Min axis.
Building the Map
To build maps you need fields that are geographic data types. For me, the Longitude field was already set, but I had to manually set the Latitude field (right click -> Geographic Role -> Latitude).
Once done, the map could be quickly built by double-clicking the Longitude field, then double clicking the Latitude field, then adding Name and Listing URL to the Detail shelf, and Price to the Tooltip shelf. Finally set Filter Room = True to the Filter shelf.
I then adjusted the colour of the circles, reduced the opacity to 50% and added a border (all via the Colour shelf).
I also added Area Code Boundaries via the Map -> Map Layers menu to get the map style Candra had used.
Adding the Interactivity
Add the 2 sheets to a dashboard. Each chart can be used to filter each other. This functionality can easily be added by clicking on the context menu of the dashboard object, and selecting Use as Filter. A filter dashboard action will automatically be added. Do this for both charts.
The final requirement, is for a link to the actual listing to be available from the map tooltip. This is a dashboard URL Action (Dashboard -> Actions -> Add Action -> Go to URL). Set as below
The words in the Name field will what is displayed on the tooltip.
The layout requires use of containers, background colours and a bit of padding. This is typically a bit of trial and error to get this right. You can check out my published version here.
For 2020 Week 42, the #WoW founder, Andy Kriebel, returned with a challenge to reproduce the Strava training calendar. Compared to some challenges recently, this looked to be quite straight forward; Andy threw in some specific requirements to test certain features – ie no data modelling and no LoDs.
I’ve been doing #WorkoutWednesday challenges since they first started, so I know that Andy is a stickler for formatting and layout – points not necessarily listed as a requirement, just expected as part of the challenge to reproduce. I kept my fingers tightly crossed when I published that I’d got all the finer details, but alas, Andy still found fault – my month summaries weren’t right aligned (my bad – missed that little nuance completely), and my bars had borders on them… Andy must have eyes like Superman to have seen that, as it wasn’t obvious. It also wasn’t a setting I’d intentionally added. I later found out that adding a particular type of pill to the Detail shelf caused borders to automagically be added… There’s always something to learn when Andy’s about!
So onto the challenge – as with previous weeks, I’m going to try to focus on the areas that may be a bit trickier / newer to some rather than detail the complete build step by step.
Using the data sets – blending
Building the calendar grid
Ensuring a 0 measure value is displayed for missing days
Adding the monthly hours summary
Building the BANs
Year Filter control
Remove highlighting
Setting the colour of the Calendar chart background
Using the data sets – blending
Andy was very specific that the 2 data sets provided should be used separately and not joined in any via the data pane.
This meant the data sources would need to be blended (further detail on this is here). Blending used to be one of the only ways within Tableau you could combine data together.
When blending, the number of rows in your output will never be more than the number of rows in your primary* data source. If there are multiple matching rows in the secondary data source, then the results will be aggregated in the display.
* whatever data source the first pill you add to your canvas comes from, will be the primary, and is denoted by a small blue icon by your data source. Secondary data sources are denoted by a small orange icon.
In the case of this challenge, we had a data set containing a list of dates (1 row per day from 01 Jan 2014 up to 31 Dec 2021), along with Andy’s Strava activity, containing a row for each activity recorded, which included the date time the activity occurred. This data could vary in that there could be multiple activities on the same day, and equally days when no activity occurred at all.
So the Calendar data set is our primary data source, as we need to show a bar on the calendar chart for every day of the year, regardless if there’s any activity. The Activity data set is our secondary data source. The number of hours, number of activities etc can all be aggregated from this data set.
When blending data sources, especially on dates, I prefer to create explicit calculated fields that define the fields I want to blend on. So in the Calendar data source I created
BLEND: Date
[Date]
essentially just a duplicate of the existing Date field, and in the Activity data source, I also created
BLEND: Date
DATE([Date Time])
Note the fields are spelled exactly the same, so Tableau automatically uses them as the linking fields when the view is built.
If you now do the following
Add the Calendar.BLEND: Date field to the Filter shelf, and select the Year = 2020,
Add Calendar.BLEND: Date as an exact date to Rows
Add Activity.BLEND: Date as an exact date to Rows
Add Activity.Seconds to Text
You can see that fields from the secondary data source have an orange icon by them; and that there are Null/missing values for the records from the secondary data source as these were the days when there was no activity recorded. You can also see a red link icon against the BLEND: Date field in the left hand data source pane, as this identifies how the two data sets are being matched.
Building the calendar grid
The calendar is essentially a ‘small multiple’ layout with each month being positioned in a particular row or column. To build out this layout we need to define the row number and the column number. There are many ways to build a dynamic small multiple grid which can flex based on the number of items you might be trying to organise, but for the purpose of this exercise, we can keep it simple. We’re working with 12 months that are to be displayed in a 4 x 3 grid layout. Create the following calculated fields in the Calendar data source.
Rows
IF MONTH([Date])<=4 THEN 0 ELSEIF MONTH([Date]) <=8 THEN 1 ELSE 2 END
Cols
(MONTH([Date])-1)%4
I make both of these to be dimensions rather than measures by dragging them above the line on the left hand data source pane. If you build out the view as below, you can see how these calcs are working
As we want to show a mark for every day in the month, we need to add the day of the month from the Calendar data source to Columns. Drag Calendar.BLEND: Date to Columns, then select the drop down to change to the Day date part
We need to show the amount of time in hours rather than seconds. In the Activity data source, create the field
Hours
([Seconds]/60)/60
and drag this onto the Rows, and change the mark type to bar. If need be re-add the YEAR(BLEND: Date) = 2020 to the Filter shelf. Now add Calendar.BLEND: Date as an exact date to the Detail shelf. You should now have
where you can see the gaps in the days where no activities took place, and if you hover vertically, you should find that the days of the month are vertically aligned – ie 30th Jan aligns with 30th May etc.
Ensuring a 0 measure value is displayed for missing days
With the above we displayed the Activity.Hours field, but if you hover over the day when there is no activity, nothing displays on the tooltip rather then 0.
To fix this, create a calculated field in the primary Calendar data source
Hours
ZN(SUM([Sheet1 (Activities Summary)].[Hours]))
This is basically just referencing the field in the secondary blended data source, but wrapping in a ZN() function means it will display 0 when no match can be found
Use this field from the primary data source instead on the calendar viz.
Adding the monthly hours summary
The requirements meant Andy expected the summary to be displayed within the same sheet as the daily calendar viz.
For this I used an old friend MIN(0) to create another axis, which is placed on the Rows in front of the Hours measure.
What I now plan to do is set this axis to be Text and plot the month, monthly hours, and the word ‘hours’ at a specific point to the right of the each cell – I’m choosing day 28 – you might want to experiment and choose a different day.
First up though, I need to build some fields to plot.
Month Name Abbrev
IF DAY([Date]) = 28 THEN UPPER(LEFT(DATENAME(‘month’,[Date]),3)) END
Hours in Month
IF MIN(DAY([Date])) = 28 THEN WINDOW_SUM([Hours]) END
LABEL: Hours
IF DAY([Date]) = 28 THEN ‘HOURS’ END
Month
DATE(DATETRUNC(‘month’,[Date]))
Add Month as an exact discrete date to Detail and the other 3 fields to the Text shelf of the Min(0) marks card (change the mark type to Text if you haven’t already done so). Alter the table calculation setting of the Hours in Month field to compute by all fields except Month
Building the BANs
These use a similar concept as above, by using 3 instances of MIN(0) placed side by side on the Columns shelf and set to the Text mark type. This creates 3 marks cards which you can then add the relevant measures and text on.
The measures are all coming from fields in the primary data source that reference measures in the seconday data source ie
# Activities (in Activity data source)
COUNT([Activity ID])
# Activities (in Calendar data source)
ZN([Sheet1 (Activities Summary)].[# Activities])
#Miles (in Calendar data source)
ZN(SUM([Sheet1 (Activities Summary)].[Miles]))
Year Filter Control
All the sheets you are building need to be filtered by the same BLEND: Date field from the Calendar data source (set the filter to Apply to all worksheets).
When this field is added to the dashboard, you can customise it so the All values does not show and the slider control also doesn’t display
Remove highlighting
To stop items on the dashboard from highlighting when they are clicked on, I use a trick that has been probably been the ‘most used trick of #WOW2020’ 🙂
In the primary data source, create a field called True which contains the value TRUE and a field False containing the value FALSE. Add both these fields to the Detail shelf of each sheet you don’t want highlighting on.
On the dashboard, create a Filter URL action for the each sheet that goes from the sheet on the dashboard to the sheet itself, and passes selected fields setting true = false. As this condition will never be true, then there is nothing to ‘filter’ so the marks don’t highlight. This needs to be repeated for each sheet on the dashboard, so I had 3 filter dashboard actions.
NOTE – a consequence of adding the True and False fields to the Detail shelf on the bar sheets, was that it caused a border to be added around the bars.
This wasn’t something I noticed, as it isn’t at all obvious, but Andy called it out!
Setting the colour of the Calendar chart background
You need to format the sheet and set the fill colour of the Pane rather than the whole sheet to grey.
There’s obviously a lot of other formatting settings to apply to get rid of all the row/column borders and gridlines etc, but this was a slight difference that I wanted to call out, as ended up with a ‘border’ on my dashboard that wasn’t required when I set the whole worksheet background.
Right, I think that’s about it for this week! Thanks for the fun challenge Andy – great to have you back!
Luke set the #WOW challenge this week, stating it had a difficulty rating of 10/10, so it was with some trepidation when I sat down to tackle it. Was it going to be as difficult as Luke’s radial bar challenge from 2018 week 10….?
Looking at the viz, and reading through the requirements, it didn’t seem as bad to me – it’s a table calculations challenge, and I’m OK with those. I started using Tableau before LoDs were invented, so table calculations don’t scare too much. I’m no expert though, and it still often takes a bit of trial & error to get the settings right.
Building out the Calcs
With a lot of challenges, I often start by just trying to build a tabular view of the data to sense check I’m getting the right numbers, which I then publish onto Tableau Public along with my viz. With a table calculation challenge, building the table of data is crucial.
The 3 key pieces of existing data needed for this challenge are Customer Name, Order Date & Order ID.
To start we want to put these on the rows, but given this will initially generate a lot of rows of data, I chose to arbitrarily filter to a random set of 20 or so customers (include Noel Staavos, as he’s a slight exception, which may catch you out later).
You can see from the above, Noel has multiple orders on the same day, which we need to handle.
The first requirement states we need to order the data by total number of orders per customer, so for this we need the 1st of many calculated fields (this one isn’t a table calculation, though it could be….):
Total Orders
{FIXED [Customer Name]: COUNTD([Order ID])}
Add this as a discrete pill on the Rows (since the final viz needs to show it this way too). I’ve placed it after the Customer Name but it doesn’t have to be there at this point.
We need to sort the customers by this field, so click on Customer Name and select Sort to bring up the Sort dialog. Choose to Sort By Field, Sort Order Descending, Field Name Total Orders. Aggregation Sum
We’ve now got our data ordered in the way we need. The next step is to work out the 90-day reorder rate per customer, which we will tackle in several steps. In much of what follows, some of the calculations could well be done in one calculation, but I like to see the ‘building blocks’ to help verify the calcs are correct.
First up we need to work out how many days between each order, and to figure this out we need to compare the Order Date on each line to the Order Date of the previous line.
I want to display the date of the previous order on the same line as the current order, so create
Previous Order Date
LOOKUP(ATTR([Order Date]),-1)
This ‘looks up’ the Order Date on the previous row. If I wanted to look at the next row, the second attribute would be 1 rather than -1. Or if I wanted to look at the data in the row 2 rows before the current one, I’d use -2.
Add Previous Order Date to the Rows as a discrete field. By default it’ll probably show as ‘Null’, but this is because the table calculation is computing across the table (so is looking for a previous column containing Order Date which doesn’t exist), whereas we want to look down it (ie by Row).
Click on the triangle against the Previous Order Date pill (the triangle indicates its a table calculation), and select Edit Table Calculation
We need to change the settings so they calculate for each Customer Name. Set to Specific Dimensions, and uncheck Customer Name
You can see from the above, that for Noel Staavos’ multiple order on the same day, one of the orders is comparing against an order on a different date, while the other is comparing to the other order made on the same date.
Ok, so now we have the current date & previous date on the same row, we now need to work out the number of days between the dates.
Days Since Previous Order
DATEDIFF(‘day’, [Previous Order Date],ATTR([Order Date]))
Add this to the Text shelf. Hopefully it should automatically use the same table calculation settings defined for Previous Order Date, but if the numbers look off, double check the settings. They should match.
Now we can work out if the order is within 90 days of the previous order
Reorder Within 90 Days
IF ISNULL([Previous Order Date]) THEN NULL ELSEIF [Days Since Previous Order]<=90 THEN 1 ELSE 0 END
Note, I am purposely choosing to output 1 or 0 (ie a number) rather than true or false (a boolean), as it will make the next calculation easier.
Again add Reorder Within 90 Days to the table, and again sense check the table calculation settings if things don’t look right.
Now we have the information we need to work out the reorder rate per customer, which is the number of records where Reorder Within 90 Days is 1 as a proportion of the number of records with a Reorder Within 90 Days value of either 0 or 1, since the requirements state the first order for each customer shouldn’t be included in the metric.
90-Day Reorder Rate
ZN(WINDOW_SUM([Reorder Within 90 Days])/WINDOW_COUNT([Reorder Within 90 Days]))
Set this to be a percentage with 0 dp.
By choosing to set Reorder Within 90 Days to an integer, the WINDOW_SUM() is simply summing up the 1s & 0s in the column. WINDOW_COUNT() is just counting the number of 1s & 0s there are in the column. Wrapping with a ZN means any Customers without any reorders will report as 0% rather than NULL.
Let’s add that to the table now too. You’ll see it reports the same value down the whole table across all customers, whereas we need it to show a different value per customer (although still show the same value for all the rows of the same customer).
We need to once again adjust the table calculation settings for this field.
What you’ll notice this time though, is there are Nested Calculations within this field, so the settings need to be checked for both the Previous Order Date calculation and the 90-Day Reorder Rate calculation
Both need to be identical with Order Date and Order ID being checked.
At this point, we have enough that we could start building the main viz, but I’m going to continue building out the ‘check sheet’ with the data I need for the KPIs too.
For the Overall Reorder Rate < 90 days we need to get a count of all the reorders within 90 days across all customers, as a proportion of all reorders (ie not the 1st order for each customer).
Count Reorders < 90 days per customer
IF FIRST()=0 THEN WINDOW_SUM([Reorder Within 90 Days]) END
I’m choosing to output the value only against the 1st row for each customer. This is because I’m going to be adding up this column shortly, and if I outputted the value against each row, I’d be double-counting.
Add this field to the table, and once again verify the table calculation settings are applied to each nested calculation. If done right, you should get the correct number against the first row for each customer
Now I want a sum of this column
Total Reorders < 90 Days
IF FIRST() =0 THEN WINDOW_SUM([Count Reorders <90 days per Customer]) END
Again, I’m choosing just to show this value against the 1st row, but this time it will be the first row in the whole table, as this time the table calculation needs to considering all the customers.
Add this field to the view. It’ll probably automatically give you the right number, but it’s worth having a look at the table calculation settings.
It contains 3 nested calculations this time : Total Reorders < 90 days, Count Reorders < 90 days per Customer, Previous Order Date.
The settings for Count Reorders < 90 days per Customer and Previous Order Date should be as before, with Order Date & Order ID both checked.
The setting for Total Reorders < 90 days is different though. By default it like shows Table(down), which gives the right result, but to ensure things don’t go awry if the pill gets moved around for some reason, I like to explicitly set the computation, by changing to Specific Dimensions and selecting all 3 fields
So this gives me one of the values I need to help me work out the overall rate; I now need the other
Count All Reorders
IF FIRST()=0 THEN WINDOW_COUNT([Reorder Within 90 Days]) END
This again is a nested calculation. Previous Order Date should be set with Order Date & Order ID as usual, and Count All Reorders should be set across all 3 fields.
Now we have the numbers to work out
Overall Reorder Rate
[Total Reorders <90 days]/[Count All Reorders]
which is set to a Percentage to 0 dp.
Add this onto the table and check the calculation settings again. This time there are 4 nested calculations
Total Reorders < 90 days : set to all 3 fields
Count Reorders < 90 days per Customer : set to Order Date & Order ID
Previous Order Date : set to Order Date & Order ID
Count All Reorders : set to all 3 fields
Remember, you may have filtered the customers in your view, so you may get a different value from the solution at this point!
Right onto the final KPI – average number of reorders per customer. For this we need the total number of reorders, which we’ve already got- Count All Reorders, and number of customers.
There’s probably a simpler way of doing this, but I’m continuing down the same route I’ve been working on.
First I want a count of the customer for each customer (which is 1), then I want to sum that up.
Count Customers
IF FIRST()=0 THEN COUNTD([Customer Name]) END
then
Total Customers
IF FIRST()=0 THEN WINDOW_SUM([Count Customers]) END
When added to the table, this is again a nested calculation with Count Customers set to Order Date & Order ID, and Total Customers set to all 3.
Now we can work out
Average Reorders Per Customer
[Count All Reorders]/[Total Customers]
which is set to be a number to 1 decimal place.
Adding this to the table, and verify the nested calculations are all set properly
Count All Reorders : set to all 3 fields
Previous Order Date : set to Order Date & Order ID
Total Customer : set to all 3 fields
Count Customers : set to Order Date & Order ID
Yay! We’ve now got all the values we need to build the KPI table and the main Viz.
I would recommend naming this sheet as Check Data or similar.
Building the KPI Viz
First step, duplicate the Check Data sheet.
For the KPIs, we only need 2 of the measures we’ve created, so remove all the pills from the Measure Values section except Overall Reorder Rate and Avg Reorders per Customer.
Then remove Total Orders and Previous Order Date from the Rows.
Due to the way table calculations work, we need to keep Customer Name, Order Date & Order ID in the view, but we only need the 1st row in the whole table, as that is where the data we want has been stored.
Create a new field
First Row
FIRST()=0
Add this to the Filter shelf and select True
Hide Customer Name, Order Date & Order ID by unchecking Show Header from each pill
We need to create a fake axis to get the display we need. In the Columns type in Min(0) twice, then remove Measure Names
Expand one of the Agg(Min(0)) cards on the left, and then drag the Overall Reorder Rate pill from the Measure Values section at the bottom, so it sits directly on top of the Measure Values pill
The Avg Reorders per Customer will now be on the Label shelf of the other AGG(Min(0)) card.
Change the mark type of both cards to Text.
Uncheck Show Header against the MIN(0) pills, and format to remove all columns/row/grid & zero lines.
Now you can format the text for each mark to be the appropriate size and with the relevant wording.
Then finally turn all tooltips off, and remove the Customer Name filter if it’s still there.
Hopefully you should get the numbers that now match the solution! Rename the sheet KPIs.
Building the main Viz
Once again start by duplicating the Check Data sheet we built.
Move Order ID and Previous Order Date onto the Detail shelf.
Move 90-Day Reorder Rate onto the Rows, and set to discrete
Move Order Date to Columns and change to continuous. Remove Measure Names from Columns.
Change the mark type to shape and move the Reorder Within 90 Days pill from the Measure Values section onto the Shape shelf. Change the shapes so NULL is the triangle and 0 and 1 are both diamonds.
Add Reorder Within 90 Days to the Colour shelf too, by clicking on the one on the Shape shelf, holding down Ctrl and dragging onto Colour. This will have the affect of duplicating the pill and will retain the table calculation settings. If you can’t get this to work, then drag from the Measures pane on the right, but make sure you set the table calc settings to match.
Change the colours so Null & 0 are grey (#CAC4BE) and 1 is purple (#5557EB).
The Tooltip for each shape also needs to be coloured accordingly, based on whether the mark represents the First order, a reorder over 90 days, or a reorder under 90 days (I chose to add the logic for first). I created 3 calculated fields for this:
Tooltip: First
IF FIRST()=0 THEN ‘First Order’ END
Tooltip:Over 90
IF [Reorder Within 90 Days]=0 THEN ‘Over 90 days’ END
Tooltip:Witin 90
IF [Reorder Within 90 Days]=0 THEN ‘Over 90 days’ END
I placed all 3 of these fields on the Tooltip shelf which was then formatted as:
while all 3 fields are in the tooltip, only 1 will ever actually display a value,
Make sure the table calculation settings for all 3 tooltip fields are set with Order Date & Order ID checked.
Now we need to create the line to join the marks up.
Duplicate the Order Date pill on Columns to create 2 instances of the pill sitting side by side. Set to Dual axis, synchronise and s’move marks to the back’ so the lines are behind the shapes.
Change the mark type on the Order Date (2) card to line. The requirements state
“Make sure the “line” color matches with the following mark. It must be a single color“.
Hmmmm… I don’t think that last sentence was there when I tackled the challenge… I just read the line colour ‘matches’ with the following mark, so to do this I added Previous Order Date to the Path shelf. The line graduates in colour, but from the colour of preceding mark to the colour of the following mark.
Looking at Luke’s solution now that I’ve seen this statement, he chose to use a Gantt mark type to represent the line, which meant each gantt block was coloured fully. Oh well… I didn’t quite get this then 🙂
Stop the tooltips from showing on the line, by deleting the text in the tooltip, then its just a case of re-sizing the marks, and formatting to suit. Set the background colour of the worksheet to grey.
Adding the red header on the dashboard
To get the red header displayed above the table, add, a container onto the dashboard and add the viz to sit inside it. With the container selected (identified by the blue border), set the background of the container object to red (#D81159).
Then select the viz itself (identified by the grey border), and adjust the outer padding settings, to have 0 left, bottom & right, and 4 at the top.
Hopefully this should now be pretty much everything needed to get this challenge sorted. If I’ve missed anything, please comment. My published viz is here.
It’s a pretty lengthy read, so if you’ve got this far – well done & thank you!