For #WOW2022 week 18, Lorna set this challenge to allow us to familiarise ourselves with the new Tableau Workbook Optimizer feature released in version 2022.1 (so you’ll need this version of Tableau Desktop to complete this challenge).
You need to download the workbook Lorna provides from Tableau Public, and then the only stipulation is to get the workbook to fully pass all 12 checks made on the workbook – no changes to the dashboard presented is required.
So, open the workbook and start by running the optimizer via the Server > Run Optimizer menu
You’re presented with the following – 3 important Take Action suggestions and 1 warning Needs Review suggestion.
Each of the suggestions can then be expanded to see more detail
Take Action – unused data sources
You’re told exactly which data sources aren’t used. Close the Optimizer dialog then navigate to a worksheet so you can access the listed data sources in the top section of the left hand Data tab. Right click on the relevant data sources and Close. If you happen to click one that is in use, you will be warned.
Take Action – Unused Fields
On each of the remaining data sources listed, right click and select Hide all unused fields to hide all the fields mentioned in the optimizer.
The optimizer also identifies unused parameters, which are listed at the bottom of the left hand pane. Right click on these and Hide.
If you now run the optimiser again, you should find you’re now left with just 1 Take Action notification, since the removal of the unused data sources, also removed the Needs Review item which was advising about the number of data sources being used.
I made an attempt to build a solution without referencing Lindsey’s viz, but ended up building multiple sheets and creating multiple calculations, which just ‘didn’t feel right’ – it would have worked, but it just seemed ‘too clunky’ of a solution, so I ended up checking out Lindsey’s viz as a guide. The key to the ‘simpler’ solution involves utilisation of a field in the data that allows the ‘measures’ to be arranged horizontally in a single sheet. I’ll explain further as we get to that section. I also don’t make any use of set actions in my solution. Instead I use parameter actions. So let’s crack on…
To build this solution I need 7 sheets
1 sheet to display the 5 measure boxes with the summary stats per player for each measure
1 sheet per sparkline that is displayed in each box (5 sheets in total)
1 sheet to display the larger line chart at the bottom which changes the measure on selection of a card
Creating the measure calculations
Firstly, we need to set up the measures that are being used throughout this viz. Some are new calculations, some are just renamed fields. I renamed the following fields :
G -> GAMES (formatted to 0 dp)
H – > HITS (formatted to 0 dp)
HR -> HOME RUNS (formatted to 0 dp)
then I created
HITS/GAME
ROUND(SUM([HITS])/SUM([GAMES]),2)
AB/HR
ROUND(SUM([AB])/SUM([HOME RUNS]),2)
Building the measure swap line chart
I’m going to start by building out the line chart at the bottom of the dashboard, as once built and formatted, we can duplicate it to form the basis of the other 5 line charts.
As with any measure swap chart, we need to create a parameter that is going to store the value of the measure that has ben selected to show. In this instance we just need
pMeasure
a string parameter which is defaulted to the value ‘GAMES’
Then we need a field that is going to determine which measure to display based on the parameter value
Value to Display
CASE [pMeasure] WHEN ‘GAMES’ THEN SUM([GAMES]) WHEN ‘HITS’ THEN SUM([HITS]) WHEN ‘HITS/GAME’ THEN [HITS/GAME] WHEN ‘HOME RUNS’ THEN SUM([HOME RUNS]) WHEN ‘AB/HR’ THEN [AB/HR] END
Show the pMeasure parameter, then add Year (as green, continuous pill) to Columns and Value to Display to Rows and Player to Colour. Adjust colours accordingly. Add Year (as blue discrete pill) to the Filter shelf, and exclude 2022, as Kyle chooses not to show this data – probably as its an incomplete year).
Manually change the values in the parameter to HITS, or HITS/GAME etc and check the display changes as expected.
On the Label shelf, select to Show Mark Labels and only label Max/Min values. Modify the colour of the label to Match Mark Colour.
Finally, format the Value to Display field to Number(Standard). This is a format that will automatically display the values as whole numbers or decimals, which is really neat (Note – it’s only while rebuilding the solution to blog that I remembered this and tried it out, so you may see that my published solution has a bit of a convoluted calculation to get the display as I needed).
Edit the Year axis to start at 2000 and end in 2022, and remove the title. Edit the Value to Display axis to not include zero and remove the title. Remove all gridlines.
Finally amend the tooltip, and then edit the title of the sheet to reference the pMeasure parameter.
Building the individual sparkline charts
We need to create a sheet per measure to be used in the sparkline display in the card. To create the first one, duplicate the Selected Measure Trend sheet you just created, then
Uncheck Show mark labels
Uncheck Show Tooltips
Hide the Year axis
Replace the Value to Display pill with the GAMES field (drag the GAMES field from the Dimensions pane and drop it directly on the Value to Display pill, so it is just replaced).
Hide the Games axis
Format to remove all zero lines/axis rulers
Set the worksheet background colour to None (which will make it transparent when we add it to the dashboard later, although you won’t notice anything at this point).
Name the sheet Games Trend or similar.
Now duplicate this sheet, and replace the GAMES pill with the HITS pill. Name this new sheet Hits Trend or similar.
Repeat this process for the HITS/GAMES, HOME RUNS, and AB/HR measures, so you have 5 trend charts.
Building the Measure Selector card
We’re going to use some Tableau trickery to ‘fake’ these cards. We’re looking to build a table of 1 row and 5 columns where each column contains 3 lines of text.
We can’t just use Measure Names & Measure Values split by Player, as we need to create a ‘box’ for each measure with the Players text displayed ‘together’ top left.
So we use some trickery.
Firstly, we’re going to ‘map’ the name of each measure to a Year in the data set which exists for each Player (ie we use the last years and not the first ones).
MetricName
CASE [Year] WHEN 2017 THEN ‘GAMES’ WHEN 2018 THEN ‘HITS’ WHEN 2019 THEN ‘HITS/GAME’ WHEN 2020 THEN ‘HOME RUNS’ WHEN 2021 THEN ‘AB/HR’ END
Then we need to capture the get the set of measures for each player into a dedicated field :
C Measures
IF [Player] = ‘Cabrera’ THEN CASE [Metric Name] WHEN ‘GAMES’ THEN {FIXED [Player]: SUM([GAMES])} WHEN ‘HITS’ THEN {FIXED [Player]: SUM([HITS])} WHEN ‘HITS/GAME’ THEN {FIXED [Player]: [HITS/GAME]} WHEN ‘HOME RUNS’ THEN {FIXED [Player]: SUM([HOME RUNS])} WHEN ‘AB/HR’ THEN {FIXED [Player]: [AB/HR]} END END
P Measures
IF [Player] = ‘Pujols’ THEN CASE [Metric Name] WHEN ‘GAMES’ THEN {FIXED [Player]: SUM([GAMES])} WHEN ‘HITS’ THEN {FIXED [Player]: SUM([HITS])} WHEN ‘HITS/GAME’ THEN {FIXED [Player]: [HITS/GAME]} WHEN ‘HOME RUNS’ THEN {FIXED [Player]: SUM([HOME RUNS])} WHEN ‘AB/HR’ THEN {FIXED [Player]: [AB/HR]} END END
We’re using FIXED LoDs to get the totals of each measure across all years for each Player
You can see below that the numbers highlighted match the numbers per player from the image above
With these calculations, we can now build the ‘card’.
Add Year (blue discrete pill) to Columns, and filter to only show the years 2017-2021. Type in MIN(1) on the Rows shelf. Change mark type to Bar and edit the axis to be fixed from 0-1. Set the Fit to Fit Width.
Add Metric Name, P Measures and C Measures to the Label shelf. Adjust the label so it is formatted with the right colours, and aligned left. I used some extra spaces at the start of each line so that there was a bit of padding.
To colour the bars, we need a further calculated field
Measure is Selected
[pMeasure]=[Metric Name]
Add this to the Colour shelf and adjust accordingly and reduce the opacity to 50%.
Hide all headers/axes, remove all gridlines/axis rulers and format the background colour of the worksheet to None as you did before. Don’t show the tooltip.
Finally we need to create & add a couple more fields to prevent the selected box from remaining highlighted when we click on it. Create fields
True
TRUE
False
FALSE
and add both to the Detail shelf.
Building the dashboard
Now we have the components to build the dashboard. The Measure Selector sheet and the 5 sparkline charts will all be floating objects carefully arranged. It’s up to you whether the remaining objects will also be floating or captured in tiled containers.
The key with the Measure Selector and Sparkline sheet is that all the sparkline sheets should be sent to back, so they are behind the Measure Selector card. This is where the transparency settings help as you can then ‘see through’ the sheet that is on the top.
You should also make sure each of your sparkline sheets are the same height & width and start at the same y position
Adding the interactivity
Once all your objects are placed, add a parameter dashboard action that on select of the Measure Selector sheet will set the pMeasure parameter, passing in the Metric Name field.
Then create a filter dashboard action that will on select of the Measure Selector object on the dashboard, will filter the Measure Selector sheet itself, setting the fields True = False. As this can never happen, the filter doesn’t apply, and the sheet removes the auto highlight that Tableau applies.
Hopefully, you now have a fully functional dashboard with a chart that changes on click of a KPI card in the top section. My published viz is here.
Sean Miller was back this week to set this challenge to recreate a ‘rat sighting’ map using hexbins. I’ve only used hexbins in other #WOW challenges, so needed a bit of a refresher (the previous challenges pre-dated my own blog, so I couldn’t use myself as a reference). A quick google for ‘tableau hexbins’ and I found a variety of articles that provided the refresher needed.
Sean also used the opportunity to apply some other crucial skills – adding custom shapes and custom colour palettes, which I recommend is the first step you do in completing this challenge.
Adding custom shapes
Download the hexagon shape provided by Sean, and then save it into a folder in your …My Tableau Repository\Shapes directory. I have a folder called ‘Custom’ where I place random shapes I need. This post will help you out if you’re having difficulty with any of this.
Adding custom colour palette
Open a text editor such as Notepad, then open the preferences.tps file that is located in your .. My Tableau Repository directory. Copy & paste the block of code provided by Sean between the opening and closing <preferences> tag. Save the file and close the text editor. This post will help if you’re having trouble.
Building the map
The provided rat sighting data set contains a Longitude and Latitude value for every rat sighting since 2010.
Hexbins provide a way to group (bin) these Lat & Long values together. The size of the bin is typically determined by a parameter, so lets first set this up
pRatio
integer or float parameter which I set to a default value of 250 (Sean didn’t specify the value he’d used, but trial and error suggested to me this looked ‘about right’).
Edit the geographic role of this field to be mapped to Latitude.
Add HexbinX to Columns and HexbinY to Rows. Modify each field so that it is a continuous dimension.
Change the mark type to Shape and select the hexagon shape you saved earlier.
Add the auto generated ‘count of dataset’ (Count of Rows) field to Colour, and adjust the colour to use the OrRd-5 palette you added earlier. Ensure to set the palette to Reversed.
Set the Tooltip so it doesn’t display anything on hover, and add Borough to the Detail shelf (this is needed for the interactivity later). Hide the nulls indicator that displays in the bottom left (right click -> hide).
On the Map menu, select Map Layers, and set the background style to dark.
Using the map controls, zoom in and pan to the left slightly, so the coloured area is mainly central and there’s less ‘sea’ at the bottom. It’s likely that you may need to adjust further once you’ve placed the map on the dashboard. But before publishing, we want to turn the map controls off, to prevent a user from shifting the display (Map -> Map Options -> uncheck all options).
Building the Bar Chart
On a new sheet, add Borough to Rows and Count of Rows to Columns. Add Borough to Filter and exclude Null and Unspecified. Sort the rows descending. Adjust the colour to suit.
Show mark labels, but only display the max & min values. Format the Count of Rows pill so the labels are displayed in K to 2 dp.
Hide the axes, right align the row label headings, adjust the tooltip. Hide the column heading. Remove zero lines and axis rulers. Set the background colour of the worksheet to ‘None’ (ie transparent).
Further formatting is required, but this is best done after the sheet is added to the dashboard, as you’ll lose visibility of the text at this point.
Creating the dashboard
Add the map to a dashboard, and remove all the additional containers/legends etc that are added, and the sheet title. Then add the bar as a floating object and position bottom left. Fit to entire view. Edit the title so it contains the ‘on hover’ instruction and format in light grey font.
Now you can format the row labels, the row headings and the gridlines to be appropriate colours – light grey rather than white.
Add a dashboard highlight action that on hover of the bar chart, highlights data in the Map
Then add floating text boxes and add the title and description. I ended up adding a text box just containing the OH, and then another position just below containing RATS! and then the description, as otherwise the carriage return between OH & RATS! made the spacing too wide. I used the controls on the layout tab to ensure both text boxes were positioned at the same x-coordinate.
Hopefully, you should now have a beautiful looking viz. My published version is here.
Side note – When I first started building this I tried from memory, and didn’t quite get things right. I then adjusted various fields as described above, but when I then tried to add the Count of Rows to Colour, I was only ever getting a value of 1 against each bin. I double & triple checked all the calcs and couldn’t see any issues. It was very weird. I simply ended up closing down my workbook and starting again from scratch and all was fine. I’m just letting you know this in case you too come across any oddities during your build, and things don’t behave as expected. It meant, what was ultimately quite a straight forward build (once I got the calcs right), ended up involving more time and head-scratching than really required 😦
It was Erica Hughes’ turn to set the #WOW challenge this week. This ended up being a bit of a journey for me, because of one requirement – to use the max date of the dataset.
I typically use a FIXED LOD to determine the max date, which I did initially in this case, and after building a solution, tripped up when it came to adding some of the filters ‘to context’. The context filter meant my Max Date was changing depending on the filter and subsequently I wasn’t getting the right results for the ‘days since last purchase’.
So I ended up having to put my thinking cap back on, and after a lot of trial and error and reading on the internet (including this article by the Flerlage Twins) , I finally managed to get a solution that involved both LOD calculations (including a rarely used INCLUDE LOD) and Table calculations, and no context filters at all. This was a pretty complex solution. The table of data had to include the Order Date on the Detail shelf, resulting in multiple rows per customer showing, which meant I had to then use an INDEX()=1 filter to only show 1 row for each customer. I then used an additional INDEX() against each customer, so I could filter to show only the customers in position 1-10. My solution to this is published here.
After publishing, I checked Erica’s solution and found how she’d handled the max date requirement. It used a concept I haven’t had to use so far, so I decided to rebuild a less complex solution, which is what I will now blog about.
Capturing the max date of data set which isn’t affected by context filters
The usual way to define the maximum date in the data set is to create a FIXED LOD calculation, and this is still required
Max Date
{FIXED :MAX([Order Date])}
This returns 30 Dec 2021.
The problem is, once a context filter is applied to a sheet, the value of this field can change. For example, if Region is applied as a context filter and set to ‘South’, then what Tableau will do, based on the ‘order of operations’, is first filter all the data to just those records where Region=South. It will then work out the max date of these filtered records, and if there are no orders on 30 Dec 2021 for the South Region, then the value of the Max Date field will differ. This is because context filters get applied before FIXED LOD calculations. We need a way to ensure we can retain the 30 Dec 2021 without hardcoding it.
The solution is to use a parameter.
pMaxDate
A date parameter which is set to use the value of the Max Date field when the workbook is opened.
This is quite a sneaky but clever way to retain this, which is why I’m reworking my solution to use it, so I have something for future reference myself. When the workbook opens, the pMaxDate parameter will get set to 30 Dec 2021 and won’t ever change, whereas the Max Date field will change if context filters are used.
Building the Table
It may seem a bit odd, but I’m going to start with the table of data that’s displayed, as this requires the most calculations, and the most validation.
On a new sheet, add Customer Name, Customer ID to Rows and Sales onto Text. Order by Sales desc. Format Sales to $ with 0dp.
We need to determine the latest order date for each customer in order to work out other information.
Latest Order Date
{FIXED [Customer ID] : MAX([Order Date])}
This finds the maximum Order Date for each Customer.
And with this, we can now work out
Days Since Last Purchase
DATEDIFF(‘day’, [Latest Order Date], [pMaxDate]
Note this is comparing the Latest Order Date to the pMaxDate parameter instead of Max Date.
Format this to custom number, with 0 dp and a suffix of ‘ days’.
Add Latest Order Date to Rows (exact date, discrete) and Days Since Last Purchase into the table.
Now we need to work out
Latest Order Amount
{FIXED [Customer ID]: SUM(IF [Order Date]=[Latest Order Date] THEN [Sales] END)}
For each customer, if the Order Date is the same as the Latest Order Date, then retrieve the Sales value.
Format this to $ with 0 dp and add to the view.
Now add Customer ID to the Filter shelf and set to filter by the Top 10 based on Sales.
To validate the behaviour of the calculations add Region to the Filter shelf and select ‘South’. You’ll notice only 8 rows are shown, and not 10.
This is because the data has been filtered based on the top 10 customers with the most sales first, and then restricted those top 10, to those with sales in the South. Only 8 of the top 10 customers have sales in the South, hence the 8 rows.
To resolve this, we need to add Region to Context (right click on the filter and Add to Context . This has the effect of filtering all the data by Region = South first, and then displaying the top 10 customers by Sales. We now have 10 rows displayed.
Add State to the Filters shelf, and add that to context too. Test the table of results by selecting different combinations of regions and/or states and comparing to the results on Erica’s solution to verify all the calculations are behaving as expected.
Now we’re happy the table is displaying the data as expected, we can format it and make it ready for the dashboard :
Remove Latest Order Date & Customer ID
Set the Row Banding
Set the Row Dividers to Level 0, so only row lines appear at top and bottom
Remove Column Dividers
Format all text (row/column headings & text data) to be 8pt.
Alias Sales to be Total Sales (right click on the Sales title in the columns and Edit Alias)
Remove Region & State from the Filter shelf. These will get re-added later.
Building the Map
On a new sheet, double click State to load a map (you may need to set your location to United States : Map menu -> Edit Locations).
Change mark type to a Filled Map, add Region to Colour and adjust colours.
Remove all map layers (Map menu -> Map Layers and uncheck all options listed on left hand side). Change border on Colour shelf to white.
Remove row & column dividers, and remove all map options (Map menu -> Map Options and uncheck all options).
Drag a new instance of State onto the canvas and Add a Marks Layer. This will create a 2nd marks card on the left. Change the colour of the circles to black. Add Sales to the Size shelf and adjust. Adjust the tooltip of both marks card (you’ll need to add Sales to the Tooltip on the map marks card).
Building the Legend
On a new sheet add Region to Columns and type in MIN(1) into the Columns shelf too. Add Region to the Label shelf, and the Colour shelf. Adjust the height of the rows, so you can see the text.
Edit the axis so it is fixed from 0 to 1. Then format the Label so the font is larger and centred.
Hide the axis and the Region (uncheck show header). Remove row & column dividers and any gridlines. Adjust the border on the Colour shelf to be white. Set the tooltip not to show.
Building the Bar Chart
On a new sheet, add Sub-Category to Rows and Sales to Columns. Order by Sales desc. Add Sub-Category to the Filter shelf and set to show Top 10 by Sales. Reduce the Size of the bars, and align the row labels to be left aligned, and the font to be 8.
Format the axis, so the values are displayed in $K.
Hide the row label, and adjust Tooltip. You may need to create an additionalcalculated field based on Sales to add to the Tooltip which you can then format to $ with 0dp.
Adding the interactivity & context filters
Add the sheets onto a dashboard. You’ll need to use a mixture of vertical & horizontal layout containers, inner and outer padding and background colours to get the required layout.
Add a dashboard filter action which runs on Select when the Legend sheet is selected, which affects all other sheets and shows all values when unselected.
Add another filter dashboard action, which this time runs on Select when the Map sheet is selected, which affects all sheets except the Legend sheet, and also shows all values when unselected.
Now click on one of the regions in the legend, then click on a state in the map. This has the effect of adding filters to the shelves on the other sheets. Navigate to the Table sheet, and add the 2 ‘Action’ filters to context
Do the same for the Bar char sheet.
And you should now have a completed viz. My published version based on this solution is here.
The challenge focussed on reporting monthly headcount, using a dataset that contained 1 row per employee with a column relating to the start date (Hire Date) and a column relating to the date the employee left (Termdate), which could have been a future date or NULL/blank.
I’ve worked on headcount reporting before in my day job, but have used a snapshotted data source, which captures the active employees as at midnight on the 1st day of every month. This method makes counting the headcount each month very simple, as it’s just counting the number of rows per snapshot date.
Obviously, this wasn’t an option in this case. Luke gave very little clues as to what approach would be required, apart from saying ‘you will also need to manipulate the data’….
A word about the data…
Just a little side note here…. I ended up in discussions with my fellow #WOW participant Rosario Gauna as after attempting one method, I wasn’t getting as high numbers as the solution originally suggested. There were rows in the data set that had NULL values for most of the data except job title, but I think this was only around 225 rows, so even including them I didn’t get close to the figures. Rosario however, was somehow managing to see approx 15,000 rows of null data. I ended up building and publishing my initial solutions based on a version of the data sent to me directly by Rosario.
Since then, it seems others were also confused, and so the solution was adjusted to ignore all the null rows. As a result I reworked one of my solutions to use the original data source I downloaded from the site, and it’s this I will focus on in the blog.
A word about my solutions…
With little instruction, it took a bit of head-scratching to figure out how to work through the problem. I initially decided to use a scaffold dataset to help. I built a simple file in Excel which contained 1 column listing 1 date per month which went from 01 Jan 2017 to 01 Jan 2021. I then joined this to the HR data in Tableau Desktop using the physical data layer (ie I did not use relations), using join calculations as below (note, this is using the amended copy of data I received where Termdate was already defined as a date field.
However, I didn’t feel completely comfortable with this approach. Luke had suggested the difficulty of this was 9/10 and referred to nested tableau calculations… the above solution just used quick table calcs. So I decided to see if I could come up with an alternative, just using the provided data, and that’s what I’ll document below.
Setting up the data
I downloaded the Human Resources.csv file from the link provided, and connected to it in Tableau Desktop. I found the Termdate field presented as a string field, so I converted it to a date, just by clicking on the ABC symbol of the data type against the field. I also added a data source filter to exclude all records where the Id field was NULL.
I then added a new data source, and connected to the same file again. This meant I had 2 instances of the Human Resources data source listed in my Data pane. I renamed the 2nd instance to be Leavers Only, changed the data type of the Termdate field again and added a data source filter so that only records with non-Null Termdate values were retained.
Note – What I will be doing here is using the Human Resources data source to manage the months we’re reporting headcount over. This method only works if there is at least 1 new starter every month, which in this case there is.
Building the calculations
The requirements state that a new starter doesn’t count in the monthly headcount figures until the following month, so the date field we need to use in the output, needs to be created as
This basically shifts the date the new starter joined to the 1st day of the following month.
To make the calculations easier to read, I also created
Starters
COUNT([2022_04_06_WW14_Human Resources.csv])
which is literally a reference to the field that automatically gets added as part of the connection – it’s just a bit of a mouthful as its named according the data connection.
Let’s pop these out in a table as below – I’m deliberately using the dateparts of the Report Date field as they’ll be used in the viz like this.
So this is just showing the new starters against the month at which they will count for headcount. We need to get details of leavers now, and we’ll do this by blending.
When I blend data sources, I tend to create specific fields so it’s clear to me that they’re being used for blending, rather than defining blend relationships between differently named fields. So in the Human Resourcesdata source I’ve created
which is just referencing the automatically generated ‘count’ field from the Leavers data source. I’ve wrapped it in ZN so 0 is reported in the event no match is found.
Add Leavers Per Month into the table, and ensure the linking field from the secondary Leaver data source is connected on the BLEND – Date field
So now we know how many starters and leavers per month, we now need to total these up by generating a cumulative running sum of the starters, but subtracting any leavers along the way.
Official Headcount
([Starters] + PREVIOUS_VALUE(0)) – [Leavers Per Month]
Takes the value of the Starters in the current row, adds it to the value of Official Headcount from the previous row, then subtracts any Leavers recorded against the current row. Add this onto the table, and as its a table calculation, edit it so it is explicitly computing my the month & year of Report Date
For the annual change, we’re looking to compare the difference of the Official Headcount value for the current month eg Jan 2021, with the Official Headcount value for the same month 1 year ago eg Jan 2020.
Take the Official Headcount from the current row, subtract the Official Headcount from 12 rows before, then divide the result by the Official Headcount from 12 rows before.
Format this to a percentage at 1 dp. (For some reason my numbers here don’t seem to match Luke’s revised solution…).
Again add to the sheet and set the table calcs of both the nested calcs to compute by month and year
The final measure was a bit of a strangely named one IMO. I felt I should be doing some rolling calculation, but ultimately, it just seemed to be reporting the difference between the headcount now and that 12 months ago ie the numerator of the calculation above.
Pop this into the view, adjust the table calc settings again
Building the viz
On a new sheet, add Report Date at the month level, but then set to be continuous, to Columns and Report Date at the year level to Colour. Add Official Headcount to Rows (adding the linking field for the blend when prompted), and adjust the table calc settings to compute over both month & year, ensuring the Year of Report Date is listed first.
Then in the legend, select all the years from 2000-2017 and ‘hide’ (this is probably a cheat way, but the quickest – we need to retain the data from the previous years, so can’t just ‘filter’).
Edit the axis to not start from zero. Adjust colours to suit.
Add Annual Change and Net Rolling 12 Change to the Rows shelf, adjust both of the table calc settings, so all nested calcs are computing by Year & then Month.
Edit the axis of both these additional measures to also exclude 0. On the All marks card, click the Label button and tick the Show mark labels option.
Final steps ….
…on the All marks card, click the Tooltip button and uncheck Show tooltips so no tooltips display on hover.
Right click on the Month axis and format the axis so the dates in the Scale section are displayed as abbreviated dates
Edit the same axis and remove the title, then format the whole chart to
remove all row and column dividers
to set the row banding with a band size of 1
adjust the colour of the row and column gridlines to be a slightly darker shade of grey
Lorna Brown provided a refresher on map layers and spatial calculations for the #WOW2022 challenge this week. You’ll need Tableau Desktop v2020.4 or later to complete this due to the functionality incorporated.
The data set provided contains rows of origin & destination airports including the latitude and longitude values for each.
I found there were some fields in the provided data set which I don’t think should have been there. It meant when I came to naming some of my fields, I had to be more creative due to the existing ones.
We’ll start off by setting up the calculations required.
Firstly we need to create a spatial object out of our Origin and Destination airports.
Next we’re going to need to have a line to connect these
Origin-Dest Line
MAKELINE([Origin],[Destination])
We’re going to need ‘buffer’ to define the circle displayed. The size of this is to be defined by the user, so we’ll need a parameter
pBufferSize
integer parameter defaulted to 1000
and with this, we can then define the buffer
Buffer from Origin
BUFFER([Origin],[pBufferSize], ‘miles’)
Finally we need to determine whether the destination airport is within the buffer ‘zone’.
Within Buffer
[Distance (Miles)]<=[pBufferSize]
Note: Distance (Miles) already existed within the downloaded data set. I was expecting to have to calculate the distance myself, given the nature of the challenge. If I had had to create it, I would have used the calculation DISTANCE([Origin], [Destination], ‘miles’)
This is all we need to build the viz. I’m going to start from the top – down ie Origin airport -> Destination airport -> Lines – >Buffer.
On a new sheet, add Origin Airport to Filter and set to LAS. Also add Within Buffer to the Filter shelf and set to True.
Then drag Origin onto the main canvas area and drop it when you see ‘Show Me’ displayed on the cursor. This will automatically add all the required fields into the relevant locations
Change the mark to a circle, increase the size and set the colour to black. Add Origin Airport to Label and Origin Name to Tooltip, and adjust the tooltip accordingly.
This is the first map layer.
Now drag Destination onto the canvas and drop it on the ‘stacked layer’ icon when you get the Add a Marks Layer option appear
This will create a new marks card on the left, and is the 2nd marks layer.
Add Destination Name to the Detail shelf, then change the mark type to square. Add Destination Airport, Origin Airport and Distance (Miles) to Tooltip and adjust tooltip to match.
Finally move the Destination marks card so it is below the Origin marks card. Click on the Destination card and drag to below the Origincard and drop when you see the orange line appear
Now drag Origin-Dest Line onto the canvas and drop it on the ‘stacked layer’ icon when you get the Add a Marks Layer option appear to create the 3rd marks layer. This will add all the ‘spokes’.
Change the colour to grey. Add Destination City to the Detail shelf. Add Origin Name, Origin Airport, Distance (Miles) and Destination Airport to the Tooltip and adjust accordingly. Finally, drag this marks card so that it is now below the Destination marks card.
Final layer now – the buffer zone.
Drag Buffer from Origin onto the canvas and drop it on the ‘stacked layer’ icon when you get the Add a Marks Layer option appear to create the 4th marks layer. Adjust the colour to light grey, and reduce the opacity to suit. Then move this marks card to the bottom, so its beneath the Origin-Dest Line marks card.
And that’s it. Add the sheet onto a dashboard, and show the Origin Airport filter as a Single Value dropdown so only 1 origin airport can be selected at a time. I also customised the control so the All option did not display either. Add the pBufferSize parameter to the display too, and test the viz by changing the size and the airport.
It was Kyle’s turn to set this jittered box plot challenge this week. While it may sound complicated, this is quite a straightforward challenge this week, made more so as Kyle very kindly provides references to other blogs which help you.
So let’s build…
Firstly you need to download the 2 excel files Kyle provides, then relate them via the Team field (this should happen automatically).
On a new sheet add Team to Columns, Age to Rows and Player Code to Detail. Change the mark type to circle, and reduce the Size slightly.
From the Analytics pane, drag box plot onto the sheet and drop onto the Cell image that displays.
Create a new field which is the key field for the jittering functionality
Jitter
RANDOM()
This just generates a random number between 0 and 1.
Add this to Columns and change the view to Entire View so its not all squashed up.
We’ve got our jittered box plot. Now we just have to add in the additional functionality required.
Drag the Playoffs field so that it’s above the line in the data pane on the right hand side (ie change it from a measure to a dimension). Then right click > Aliases to alias to 0 and 1 values to the labels required
Add this field to the Columns in front of the Team field. Re-order so ‘Playoff Teams’ is listed first (I just click on the field name and drag it to the left).
We also need to sort the data based on the median value per team. We need a new field for this.
Median Age per Team
{FIXED [Team]:MEDIAN([Age])}
Add a sort to the Team field, so it sorts by the Median Age per Team descending
Add League to the Filter shelf and set to AL.
Format the worksheet and set the Column Banding to be level 0 and band size 1 to shade the sections as required.
Then adjust the format of the gridlines to remove all row and column gridlines.
Finally, add Name to the Tooltip and adjust accordingly. Then hide the Jitter axes (uncheck Show Header) and adjust the Age axis so it is fixed to start at 18.
You can now add this to a dashboard and you’re done! My published viz is here.
Sean Miller set this week’s #WOW2022 challenge based on a common requirement – how to allow users to navigate a hierarchy of data while capitalising on the the real estate available to display the data.
The charts required for this challenge are very simple, so I’m not going to spell out how to build these. I created 4 charts
Trend – Sales by Month line chart
by Category – Sales by Category horizontal bar chart
by Sub-Category – Sales by Sub-Category horizontal bar chart
by Product – Sales by Product Name horizontal bar chart
Now all the remaining functionality to drive the navigation through the hierarchy, how the charts are filtered at each level and whether the chart should display or not, will be driven by parameter actions. So for this we will need 3 parameters
pCategorySelected
string parameter defaulted to <nothing> (empty string)
We need similar parameters for pSubCategorySelected and also pProductSelected.
Controlling the sheet swap & filtering the charts
On a dashboard, add the Trend sheet, then below it add a vertical container.
Within the vertical container add the by Category sheet, the by Sub-Category sheet and the by Product sheet. Remove the title from all these sheets. Show the 3 parameters.
We’ll now set up some calculated fields to determine when each of the bar chart sheets should display or not.
Filter: Show Category
[pCategorySelected]=”
If this parameter is empty, we want to show the category bar chart. Add this field to the Filter shelf of the by Category sheet and set to True.
Type the word ‘Furniture’ into the pCategorySelected parameter box and press return. The by Category sheet should disappear from the dashboard.
We now do a similar calculation for the by Sub-Category sheet
Filter: Show Sub-Category
[Category]=[pCategorySelected] AND [pSubCategorySelected]=”
This field is filtering the bar chart based on the selected category
Add this to the Filter shelf of the by Sub-Category and set to True. The bars should now just display the sub-categories associated to the Furniture category.
Now type the word ‘Chairs’ into the pSubCategorySelected parameter box. The by Sub-Category sheet should also now disappear from the dashboard.
Finally we also need to now ensure the by Product sheet is filtered to the relevant Sub-Category.
Filter: Show Product
[Sub-Category]=[pSubCategorySelected]
Add this to the Filter shelf of the by Product sheet and set to True. Only products associated to Chairs should now be listed.
Now we’ve set all this up, we also need to ensure the Trend sheet is getting filtered based on all the selections being made.
Filter : Trend
([Category]=[pCategorySelected] OR [pCategorySelected]=”) AND ([Sub-Category]=[pSubCategorySelected] OR [pSubCategorySelected]=”) AND ([Product Name]=[pProductSelected] OR [pProductSelected]=”)
Add this to the Filter shelf of the Trend sheet and set to True.
Type in an appropriate value into the pProductSelected parameter box (eg Global Task Chair, Black) and see how the trend changes.
Setting the parameters
This will all be done with parameter actions – there’s a few 🙂
+ Drill down to show Subcategories within <Category>
Use the Insert link to add the <Category> field to the action title – this will then be set dynamically based on the bar being selected.
Set the action to apply to the by Category sheet only, and via the Menu option. It should impact the pCategorySelected parameter and retain it’s value when unselected. The Category field should be passed into the parameter.
Delete all the values from the parameter boxes, so they’re all empty. This should reset the dashboard so only the by Category sheet is displayed under the trend. Hover/click on a bar to show the tooltip and click on the link. The pCategorySelected parameter should be populated and the bar chart displayed now changes.
We’re going to create a similar parameter action for the drill down from by Sub-Category to the by Product sheet
+ Drill down to show Products within <Sub-Category>
This time the action applies to the by Sub-Category sheet on the Menu action, and sets the pSubCategorySelected parameter with the Sub-Category value, again retaining the value when cleared.
On this sheet, we also need an action to allow us to ‘drill up’. We need to set the pCategorySelected parameter back to nothing. For this we need an additional calculated field
Level Up : Category
”
Add this field to the Detail shelf on the by Sub-Category sheet.
The back on the dashboard, add a further parameter action
– Drill Up to show all Categories
The action runs on the Menu of the by Sub-Category sheet only, setting the pCategorySelected parameter with the value from the Level Up: Category field. Again the value should be retained when deselected.
Test the functions. The drill down should display the by Product sheet. Then manually delete the value in the pSubCategorySelected parameter, and test the drill up action.
We now need to deal with the actions from the by Product sheet
+ Filter dashboard to <Product Name>
This action runs on the Menu of the by Product sheet only and passes the Product Name field into the pProductSelected parameter. This time though, when the bae is unselected, the parameter should be cleared to ‘blank’.
Next we’ll add the drill up function back to the sub-categories.
Similarly we need to set the pSubCategorySelected parameter back to empty string, so we need
Level Up: Sub-Category
”
Add this to the Detail shelf of the by Product sheet. Also add the Category field to the Detail shelf.
+ Drill Up to show Subcategories within <Category>
The action applies to the Menu of the by Product sheet only, passing the Level Up : SubCategory field into the pSubCategorySelected parameter. The value should be retained when cleared. Note the Category field was required so it could be added to the menu action title.
Test the actions, and verify the behaviour of the parameter boxes as each selection is made.
The dynamic title
The title of the trend line keeps track of the options selected during the navigation. For some reason, I used a separate sheet, but that’s not needed and actually goes against the requirements on 4 sheets only. So I’ll describe how to dynamically set the title on the Trend sheet instead. We’ll need some additional fields
Title – Category
IF [pCategorySelected] <> ” THEN ‘for ‘ + [pCategorySelected] ELSE ” END
Title – Sub-Category
IF [pSubCategorySelected] <> ” THEN ‘-> ‘ + [pSubCategorySelected] ELSE ” END
Title – Product
IF [pProductSelected] <> ” THEN ‘-> ‘ + [pProductSelected] ELSE ” END
Add all these fields to the Detail shelf of the Trend sheet, then update the title
All of this should now mean the core requirements of the challenge have been met.
Bonus – Extending the tooltip width
The bonus step was to extend the width of the tooltip so no word-wrapping existed. I did this by creating a Viz in Tooltip.
On a separate sheet I added Category and Sales to the Text shelf and formatted so they were aligned as required.
This sheet was then referenced from the by Category tooltip where I then adjusted the width to 350 and the height to 75
I repeated this creating similar sheets for Sub-Category and Product Name.
You just now need to tidy up the dashboard – add a text box to act as a title for the bar charts section, format the titles to be grey and remove the parameters from the display. My published viz is here.
Erica Hughes had a table-calc-tastic challenge for us this week! I was using Tableau before LoDs were invented, so became very familiar with the ‘dark art’ of table calculations, and subconsciously often turn to these first when solving problems. Since the advent of LoDs, this functionality can get forgotten about, so this is a great challenge to help flex those table calc muscles and become a good reference point.
With any table calc challenge, I work out what I need in a tabular form before even attempting any visual, so that’s where we’ll start today.
I also tend to ‘build up’ the calculated fields I need, as this helps me validate the data I’m working with. It means I end up with more calculated fields than absolutely necessary, but it’s a good practice to get in as it eases troubleshooting in the long run.
In this example, we need to treat Sales as if it’s a budget that is then being ‘spent’ over the course of the following months. The first thing we need to define is the total budget
Total Sales
TOTAL(SUM([Sales]))
We then need to understand the cumulative (running sum) of Sales per month.
Running Sum Sales
RUNNING_SUM(SUM([Sales]))
Note – this calculation can be achieved by applying a Quick Table Calculation to the Sales pill once added to a view, but I will need to reference the values in other calculated fields, so created the field directly.
Let’s start building out our table of data to see what’s going on with these calculations.
Add Order Date to Rows and set so its displaying the date in the Month Year format
Then add Total Sales, Sales and Running SumSales
The Total Sales column will be the same for every row and match the same value in the last row of the Running Sum Sales column. The value in each Running Sum Sales row is the sum of the Sales value in the same and all preceding rows.
By default the table calculations are working ‘down’ the table which gives the desired result, but I tend to ‘fix’ the field the calculation is computing over, as when we build the viz we won’t be going ‘down’, but ‘across’, so fixing helps ensure we get all our settings just right.
So edit the table calculation of both the Total Sales and the Running Sum Sales fields to compute using Month of Order Date.
With these fields, we can calculate the ‘sales budget’ value being displayed as
Total Less Running Sum Sales
[Total Sales]- [Running Sum Sales]
Pop this on the view and verify the table calculation settings are set as above (this field contains nested calcs, so check each setting). You should be able to verify the value of this column is the result of the 1st column – 3rd column values
But there’s a twist.
The requirements state that “for dates in the future, the sales budget should remain constant”.
For this we need to work out what month ‘today’ is in. For this task, I have hard coded ‘today’ into a parameter called Today and set to 8th March 2022. If this was a ‘real’ production business dashboard, I’d just refer to the function TODAY() directly.
I can then work out
Current Month
DATETRUNC(‘month’, [Today])
which will return 01 March 2022 in this instance.
I then want to identify the record that matches the current month
And with this, I can then determine the value from the Total Less Running SumSales column that is associated to March 2022, and ‘spread’ that value across every row in the view.
Curr Month: Total Less Running Sum Sales
WINDOW_MAX(IF ATTR([Is Current Month]) THEN [Total Less Running Sum Sales] END)
If the month in the row is the current month, get the required value and ‘spread’ over every other row using WINDOW_MAX. Add this to the view, checking your table calc settings again.
Now we can work out the values needed for the Sales Budget line
Sales Budget
IF MIN(DATETRUNC(‘month’, [Order Date])) > MIN(DATE([Current Month])) THEN [Curr Month: Total Sales Less Running Sum Sales] ELSE [Total Less Running Sum Sales] END
Format this to $, Millions (M), 1dp.
If the month is later than the current month, use the value associated to the current month, otherwise use the Total Less Running Sum Sales value. Note here, the date functions are wrapped within a MIN function as the other fields are table calculations which means they’ve been aggregated, so all other fields referenced need to be aggregated to. The function MAX will have worked just as well.
Phew! we’ve finally got the data we need for the first line :-). As mentioned earlier, this can be achieved by combining some of the logic in the calcs, but I like to be methodical and verify my numbers give me what I expect at each stage.
In order to display the Estimated Budget line, we need to first work out how much of the total sales would be spent each month, if the same amount was spent each month – ie Total Sales divided by number of months.
Size (Count of Months)
SIZE()
I just chose to use the SIZE() table calculation to essentially count the number of rows in my view.
Estimated Budget Constant
//average the sales over the total months to get a constant budget [Total Sales] / [Size (Count of Months)]
Add these into the view and adjust the table calc settings as before
essentially Total Sales (2,297,201) / Count of Months (48) = Estimated Budget Constant (47,858).
Like before, we need to compute running sum of this estimated budget
Running Sum Est Budget
RUNNING_SUM([Estimated Budget Constant])
and then we can calculate the Estimated Budget
Total Less Running Sum Est Budget
[Total Sales] – [Running Sum Est Budget]
This now gives us the data to plot the 2nd line.
The final calculation we need for the tooltip is the difference between the sales budget and estimated budget
Difference to Estimated
[Sales Budget]- [Total Less Estimated Running Sum]
format this using a custom format of +”$”#,##0,K;-“$”#,##0,K
Now we can build the viz! I tend to keep sheets like above in any workbook as a ‘check sheet’ if I need to do any troubleshooting later on.
So on a new sheet, add Order Date to Rows and set to be a continuous (green) month/year format this time. Add Sales Budget to Columns. Adjust the table calculation so all nested calculations are computing by Order Date.
Add Total Less Estimated Running Sum to Columns (set the table calc settings), then change to dual axis and synchronise axis.
Remove Measure Names from the All Marks card to remove the colours that have been set. Change the Colour of the Sales Budget line to purple, and set the markers to have circles.
From the Analytics tab, drag a Trend Line to the canvas and drop it as a linear trend on the Total Less Estimated Running Sum measure
This will add the ‘dotted’ line.
On the Total Less Estimated Running Sum marks card, set the Opacity of the Colour to 0%, so only the grey dotted trend line is visible.
Edit the trendline and uncheck Show recalculated line for highlighted or selected data points
Add Current Month to the Detail shelf of the All Marks card, and set to the month/year format. Then right click on the Order Date axis and Add Reference Line, setting the values as below
Right click on the reference line, and Format; adjust the alignment and font size & colour, so ‘Future’ is listed at the top.
Add Difference to Estimated onto the Tooltip of the Sales Budget card (adjust those table calc settings). The format the tooltip accordingly.
Add Curr Month: Total Sales Less Running Sum Sales to the Detail shelf of the All Marks Card (adjust those table calc settings). Edit the title of the viz
Finally tidy up the display by removing axis, row and column banding etc, and adjust the Sales Budget axis so it displays every 500,000.
This week, Luke set the challenge which is focussed on manipulating time. Medical admissions over many days is represented in a bar chart which spans a 24 hour period. All admissions needs to be ‘bucketed’ into 15 minute intervals over the 24 hours ie admissions between midnight and 12:14am is counted within the same 12:00am ‘bucket’.
The data for this challenge is embedded within a workbook which you need to download via the challenge page. I did as Luke instructed; downloaded the workbook, deleted all the existing sheets, then re-saved as my own file.
The first step that is required is to ‘baseline’ / normalise the admission dates so they all look to be on the same day.
There’s different ways to do this; on this occasion I used the 3rd method from this Tableau KB, although I simply hardcoded a date of 1 Jan 2023 rather than use TODAY(). It doesn’t matter what this particular date is, its just an arbitrary date.
Once we’ve got this, we then need to manipulate this date again to ‘group’ into the 15 min interval. This isn’t something I know ‘just to do’, but I know I’ve done it before. So a quick google was needed and I used this blog for the required calculation.
Pop these fields out into a table to see how these calculated fields are working
The Baseline Admission Date 15 mins is what we’ll use for the x-axis. The next step is work out the value being plotted Stays per Day.
Now when the challenge was first placed, a couple of the requirements were missing, so there was a bit of head-scratching trying to figure out what numbers were being used to get the values presented.
The following fields need to be added to the Filter shelf:
Stay Type = Outpatient
Admission Date starting from 30 Aug 2017 00:00
When on the filters shelf, both these should then be added to Context, as the data needs to be filtered before the LOD calc we need to use gets calculated (defined below).
Count Days with 15 min interval
{FIXED [Baseline Admission Date 15 mins]: COUNTD(DATETRUNC(‘day’, [Admission Date]))}
This is counting the distinct days when there was admission within each 15 minute period ie if there were 2 admissions on the same day within the same 15 minute window, the day would only count as 1.
From this we can then compute
Stays per Day
SUM([Number of Records])/SUM([Count Days with 15 min interval])
Now we’ve got the data we need, so we can build the viz.
Add Baseline Admission Date 15 mins to Columns as a green continuous pill, and Stays per Day to Rows. Don’t forget to add Admissions Date and Stay Type to the Filter shelf as mentioned above.
The bars look ‘blocky’. You can manually adjust the size, but you might notice that the widths between isn’t exact – the whitespace looks larger between some bars than others. To resolve this, I created a field to control the size, which is based on the number of 15 minute intervals there are in a 24 hour period – 96.
Size
1/96
Add this to the Size shelf, change the aggregation to MIN, and adjust the size to be Fixed and aligned Centre.
Add another instance of Stays per Day to the Rows shelf. Then make it dual axis and synchronise axis. Change the mark type of the 2nd Stays per Day instance to Gantt and change the colour. Then change the colour of the 1st Stays per Day.
Show mark labels, and set the Label just to show the max value.
Right click on the time axis, and format, and custom format to h:nn am/pm
And essentially, that’s it. There’s formatting to do to remove the secondary axis, column & row banding etc and add tooltips, but the core of the viz is complete.
My published instance is here. Note the time formatting seems to be an issue on Tableau Public. Someone did comment that this was an issue with the MAKETIME function, but I didn’t actually use this function.