Can you track headcount?

April 7, 2022 Donna Coles

Luke decided to set us a challenge this week based on Human Resources (HR) data, using a dataset provided as part of the #RWFD Real World Fake Data project managed by Mark Bradbourne.

The challenge focussed on reporting monthly headcount, using a dataset that contained 1 row per employee with a column relating to the start date (Hire Date) and a column relating to the date the employee left (Termdate), which could have been a future date or NULL/blank.

I’ve worked on headcount reporting before in my day job, but have used a snapshotted data source, which captures the active employees as at midnight on the 1st day of every month. This method makes counting the headcount each month very simple, as it’s just counting the number of rows per snapshot date.

Obviously, this wasn’t an option in this case. Luke gave very few clues as to what approach would be required, apart from saying ‘you will also need to manipulate the data’….

A word about the data…

Just a little side note here…. I ended up in discussions with my fellow #WOW participant Rosario Gauna as, after attempting one method, I wasn’t getting numbers as high as the solution originally suggested. There were rows in the data set that had NULL values for most of the data except job title, but I think this was only around 225 rows, so even including them I didn’t get close to the figures. Rosario, however, was somehow managing to see approx 15,000 rows of null data. I ended up building and publishing my initial solutions based on a version of the data sent to me directly by Rosario.

Since then, it seems others were also confused, and so the solution was adjusted to ignore all the null rows. As a result I reworked one of my solutions to use the original data source I downloaded from the site, and it’s this I will focus on in the blog.

A word about my solutions…

With little instruction, it took a bit of head-scratching to figure out how to work through the problem. I initially decided to use a scaffold dataset to help. I built a simple file in Excel which contained 1 column listing 1 date per month, running from 01 Jan 2017 to 01 Jan 2021. I then joined this to the HR data in Tableau Desktop using the physical data layer (ie I did not use relationships), using join calculations as below (note, this is using the amended copy of the data I received, where Termdate was already defined as a date field).

The solution using this method is published here.

However, I didn’t feel completely comfortable with this approach. Luke had suggested the difficulty of this was 9/10 and referred to nested tableau calculations… the above solution just used quick table calcs. So I decided to see if I could come up with an alternative, just using the provided data, and that’s what I’ll document below.

Setting up the data

I downloaded the Human Resources.csv file from the link provided, and connected to it in Tableau Desktop. I found the Termdate field presented as a string field, so I converted it to a date, just by clicking on the ABC symbol of the data type against the field. I also added a data source filter to exclude all records where the Id field was NULL.

I then added a new data source, and connected to the same file again. This meant I had 2 instances of the Human Resources data source listed in my Data pane. I renamed the 2nd instance to be Leavers Only, changed the data type of the Termdate field again and added a data source filter so that only records with non-Null Termdate values were retained.

Note – What I will be doing here is using the Human Resources data source to manage the months we’re reporting headcount over. This method only works if there is at least 1 new starter every month, which in this case there is.

Building the calculations

The requirements state that a new starter doesn’t count in the monthly headcount figures until the following month, so the date field we need to use in the output needs to be created as

Report Date

DATE(DATETRUNC('month',DATEADD('month',1, [Hire Date])))

This basically shifts the date the new starter joined to the 1st day of the following month.
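If you want to sanity-check the shift outside Tableau, the same logic can be sketched in a few lines of Python. This is only an illustration; the function name report_date is my own choice and isn’t part of the challenge.

```python
from datetime import date

def report_date(hire_date: date) -> date:
    """Return the 1st day of the month following hire_date."""
    if hire_date.month == 12:
        # December rolls over into January of the next year
        return date(hire_date.year + 1, 1, 1)
    return date(hire_date.year, hire_date.month + 1, 1)

print(report_date(date(2020, 1, 15)))   # 2020-02-01
print(report_date(date(2020, 12, 31)))  # 2021-01-01
```

Someone hired on any day in January counts from 1st February, and a December hire counts from 1st January of the next year.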

To make the calculations easier to read, I also created

Starters

COUNT([2022_04_06_WW14_Human Resources.csv])

which is literally a reference to the field that automatically gets added as part of the connection – it’s just a bit of a mouthful as it’s named according to the data connection.

Let’s pop these out in a table as below – I’m deliberately using the dateparts of the Report Date field as they’ll be used in the viz like this.

So this is just showing the new starters against the month at which they will count for headcount. We need to get details of leavers now, and we’ll do this by blending.

When I blend data sources, I tend to create specific fields so it’s clear to me that they’re being used for blending, rather than defining blend relationships between differently named fields. So in the Human Resources data source I’ve created

BLEND – Date

[Report Date]

and then in the Leavers data source, I’ve created

BLEND – Date

DATE(DATETRUNC('month',DATEADD('month',1,[Termdate])))

which is shifting the Termdate forward to the 1st of the next month, since leavers should also only be recorded in the following month.

In the Human Resources data source I then created

Leavers Per Month

ZN(COUNT([2022_04_06_WW14_Human Resources – Leavers Only].[2022_04_06_WW14_Human Resources.csv]))

which is just referencing the automatically generated ‘count’ field from the Leavers data source. I’ve wrapped it in ZN so 0 is reported in the event no match is found.

Add Leavers Per Month into the table, and ensure the linking field from the secondary Leaver data source is connected on the BLEND – Date field

So now we know how many starters and leavers per month, we now need to total these up by generating a cumulative running sum of the starters, but subtracting any leavers along the way.

Official Headcount

([Starters] + PREVIOUS_VALUE(0)) - [Leavers Per Month]

This takes the value of Starters in the current row, adds it to the value of Official Headcount from the previous row, then subtracts any Leavers recorded against the current row. Add this onto the table and, as it’s a table calculation, edit it so it is explicitly computing by the month & year of Report Date.
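To see what the PREVIOUS_VALUE logic is doing row by row, here is a rough Python equivalent. It is a sketch with made-up monthly figures, not the real HR numbers, and the function name is just for illustration.

```python
def official_headcount(starters, leavers):
    """Running headcount per month: previous value + starters - leavers."""
    result = []
    previous = 0  # plays the role of PREVIOUS_VALUE(0) on the first row
    for started, left in zip(starters, leavers):
        previous = previous + started - left
        result.append(previous)
    return result

# Made-up monthly figures, purely for illustration
print(official_headcount([5, 3, 4], [0, 1, 2]))  # [5, 7, 9]
```

Each month carries forward the previous month’s result, which is why the order the table calculation computes over matters so much.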

For the annual change, we’re looking to compare the difference of the Official Headcount value for the current month eg Jan 2021, with the Official Headcount value for the same month 1 year ago eg Jan 2020.

Annual Change

(ZN([Official Headcount]) - LOOKUP(ZN([Official Headcount]), -12)) / ABS(LOOKUP(ZN([Official Headcount]), -12))

Take the Official Headcount from the current row, subtract the Official Headcount from 12 rows before, then divide the result by the Official Headcount from 12 rows before.
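The ‘12 rows before’ lookup can be sketched in Python as below. This is an illustration using a made-up headcount list; returning None for the first 12 rows (and when the prior value is 0) is my way of mimicking the NULL that Tableau shows when there is nothing to compare against.

```python
def annual_change(headcount, lag=12):
    """Fractional change versus the value `lag` rows earlier (None if unavailable)."""
    changes = []
    for i, current in enumerate(headcount):
        if i < lag or headcount[i - lag] == 0:
            changes.append(None)  # no row 12 months back, or division by zero
        else:
            prior = headcount[i - lag]
            changes.append((current - prior) / abs(prior))
    return changes

# Made-up data: flat at 100 for a year, then 110 in month 13
headcount = [100] * 12 + [110]
print(annual_change(headcount)[12])  # 0.1
```

The first year of data is only there to feed the comparison, which is why earlier years have to be hidden in the viz rather than filtered out.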

Format this to a percentage at 1 dp. (For some reason my numbers here don’t seem to match Luke’s revised solution…).

Again add to the sheet and set the table calcs of both the nested calcs to compute by month and year

The final measure was a bit of a strangely named one IMO. I felt I should be doing some rolling calculation, but ultimately, it just seemed to be reporting the difference between the headcount now and that 12 months ago ie the numerator of the calculation above.

Net Rolling 12 Change

(ZN([Official Headcount]) - LOOKUP(ZN([Official Headcount]), -12))

Pop this into the view, adjust the table calc settings again

Building the viz

On a new sheet, add Report Date at the month level, but then set to be continuous, to Columns and Report Date at the year level to Colour. Add Official Headcount to Rows (adding the linking field for the blend when prompted), and adjust the table calc settings to compute over both month & year, ensuring the Year of Report Date is listed first.

Then in the legend, select all the years from 2000-2017 and ‘hide’ (this is probably a cheat way, but the quickest – we need to retain the data from the previous years, so can’t just ‘filter’).

Edit the axis to not start from zero. Adjust colours to suit.

Add Annual Change and Net Rolling 12 Change to the Rows shelf, adjust both of the table calc settings, so all nested calcs are computing by Year & then Month.

Edit the axis of both these additional measures to also exclude 0. On the All marks card, click the Label button and tick the Show mark labels option.

Final steps ….

…on the All marks card, click the Tooltip button and uncheck Show tooltips so no tooltips display on hover.

Right click on the Month axis and format the axis so the dates in the Scale section are displayed as abbreviated dates

Edit the same axis and remove the title, then format the whole chart to

remove all row and column dividers
to set the row banding with a band size of 1
adjust the colour of the row and column gridlines to be a slightly darker shade of grey

And you’re done 🙂 My published viz is here

Had a bit of all sorts this week… I’ll be intrigued to see how Luke solved it!

Happy vizzin’!

Donna

Can you use multiple mark layers?

March 31, 2022 Donna Coles

Lorna Brown provided a refresher on map layers and spatial calculations for the #WOW2022 challenge this week. You’ll need Tableau Desktop v2020.4 or later to complete this due to the functionality incorporated.

The data set provided contains rows of origin & destination airports including the latitude and longitude values for each.

I found there were some fields in the provided data set which I don’t think should have been there. It meant when I came to naming some of my fields, I had to be more creative due to the existing ones.

We’ll start off by setting up the calculations required.

Firstly we need to create a spatial object out of our Origin and Destination airports.

Origin

MAKEPOINT([Origin Latitude],[Origin Longitude])

Destination

MAKEPOINT([Destination Latitude],[Destination Longitude])

Next we’re going to need to have a line to connect these

Origin-Dest Line

MAKELINE([Origin],[Destination])

We’re going to need a ‘buffer’ to define the circle displayed. The size of this is to be defined by the user, so we’ll need a parameter

pBufferSize

integer parameter defaulted to 1000

and with this, we can then define the buffer

Buffer from Origin

BUFFER([Origin],[pBufferSize], 'miles')

Finally we need to determine whether the destination airport is within the buffer ‘zone’.

Within Buffer

[Distance (Miles)]<=[pBufferSize]

Note: Distance (Miles) already existed within the downloaded data set. I was expecting to have to calculate the distance myself, given the nature of the challenge. If I had had to create it, I would have used the calculation DISTANCE([Origin], [Destination], 'miles')
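For a rough idea of what a distance check like this involves, here is a Python sketch using the haversine (great-circle) formula. This is an approximation I’ve chosen for illustration and won’t necessarily match Tableau’s DISTANCE function exactly. The airport coordinates are approximate values typed in by hand, not taken from the challenge data.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation

def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two lat/long points, in miles."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))

def within_buffer(distance, buffer_size):
    """Mirror of the Within Buffer test: is the destination inside the zone?"""
    return distance <= buffer_size

# Approximate coordinates for LAS and LAX (assumed for illustration)
las_to_lax = distance_miles(36.08, -115.15, 33.94, -118.41)
print(within_buffer(las_to_lax, 1000))  # True
```

With a buffer of 1000 miles, a short hop like this is comfortably inside the zone, so the destination would be kept by the filter.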

This is all we need to build the viz. I’m going to start from the top down, ie Origin airport -> Destination airport -> Lines -> Buffer.

On a new sheet, add Origin Airport to Filter and set to LAS. Also add Within Buffer to the Filter shelf and set to True.

Then drag Origin onto the main canvas area and drop it when you see ‘Show Me’ displayed on the cursor. This will automatically add all the required fields into the relevant locations

Change the mark to a circle, increase the size and set the colour to black. Add Origin Airport to Label and Origin Name to Tooltip, and adjust the tooltip accordingly.

This is the first map layer.

Now drag Destination onto the canvas and drop it on the ‘stacked layer’ icon when the Add a Marks Layer option appears

This will create a new marks card on the left, and is the 2nd marks layer.

Add Destination Name to the Detail shelf, then change the mark type to square. Add Destination Airport, Origin Airport and Distance (Miles) to Tooltip and adjust tooltip to match.

Finally move the Destination marks card so it is below the Origin marks card. Click on the Destination card and drag to below the Origin card and drop when you see the orange line appear

Now drag Origin-Dest Line onto the canvas and drop it on the ‘stacked layer’ icon when the Add a Marks Layer option appears, to create the 3rd marks layer. This will add all the ‘spokes’.

Change the colour to grey. Add Destination City to the Detail shelf. Add Origin Name, Origin Airport, Distance (Miles) and Destination Airport to the Tooltip and adjust accordingly. Finally, drag this marks card so that it is now below the Destination marks card.

Final layer now – the buffer zone.

Drag Buffer from Origin onto the canvas and drop it on the ‘stacked layer’ icon when the Add a Marks Layer option appears, to create the 4th marks layer. Adjust the colour to light grey, and reduce the opacity to suit. Then move this marks card to the bottom, so it’s beneath the Origin-Dest Line marks card.

And that’s it. Add the sheet onto a dashboard, and show the Origin Airport filter as a Single Value dropdown so only 1 origin airport can be selected at a time. I also customised the control so the All option did not display either. Add the pBufferSize parameter to the display too, and test the viz by changing the size and the airport.

My published version of the viz is here.

Happy vizzin’!

Donna

Can you create a jittered box plot?

March 24, 2022 Donna Coles

It was Kyle’s turn to set the challenge this week, with a jittered box plot. While it may sound complicated, this is quite a straightforward challenge, made more so as Kyle very kindly provides references to other blogs which help you.

So let’s build…

Firstly you need to download the 2 excel files Kyle provides, then relate them via the Team field (this should happen automatically).

On a new sheet add Team to Columns, Age to Rows and Player Code to Detail. Change the mark type to circle, and reduce the Size slightly.

From the Analytics pane, drag box plot onto the sheet and drop onto the Cell image that displays.

Create a new field which is the key field for the jittering functionality

Jitter

RANDOM()

This just generates a random number between 0 and 1.
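The idea behind jittering is simply to give every mark a random horizontal position so that points with the same value don’t sit on top of each other. A tiny Python sketch, using made-up ages rather than the challenge data:

```python
import random

ages = [24, 31, 27, 24, 35]  # made-up ages, for illustration only

# Pair each age with a random x position in the range [0, 1)
points = [(random.random(), age) for age in ages]

for x, age in points:
    print(f"x={x:.3f}, age={age}")
```

The two players aged 24 will almost certainly get different x values, so both marks stay visible.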

Add this to Columns and change the view to Entire View so it’s not all squashed up.

We’ve got our jittered box plot. Now we just have to add in the additional functionality required.

Drag the Playoffs field so that it’s above the line in the data pane on the right hand side (ie change it from a measure to a dimension). Then right click > Aliases to alias the 0 and 1 values to the labels required

Add this field to the Columns in front of the Team field. Re-order so ‘Playoff Teams’ is listed first (I just click on the field name and drag it to the left).

We also need to sort the data based on the median value per team. We need a new field for this.

Median Age per Team

{FIXED [Team]:MEDIAN([Age])}

Add a sort to the Team field, so it sorts by the Median Age per Team descending
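In plain terms, the FIXED calculation works out one median per team, and the sort then orders the teams by that number. A Python sketch of the same idea, with made-up players and a function name of my own choosing:

```python
from collections import defaultdict
from statistics import median

def teams_by_median_age(players):
    """players: iterable of (team, age). Returns teams sorted by median age, descending."""
    ages = defaultdict(list)
    for team, age in players:
        ages[team].append(age)
    medians = {team: median(values) for team, values in ages.items()}
    return sorted(medians, key=medians.get, reverse=True)

# Made-up players: team A has a median of 30, team B a median of 34
players = [("A", 25), ("A", 30), ("A", 35), ("B", 28), ("B", 40)]
print(teams_by_median_age(players))  # ['B', 'A']
```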

Add League to the Filter shelf and set to AL.

Format the worksheet and set the Column Banding to be level 0 and band size 1 to shade the sections as required.

Then adjust the format of the gridlines to remove all row and column gridlines.

Finally, add Name to the Tooltip and adjust accordingly. Then hide the Jitter axes (uncheck Show Header) and adjust the Age axis so it is fixed to start at 18.

You can now add this to a dashboard and you’re done! My published viz is here.

Happy vizzin!

Donna

Can you navigate a hierarchy?

March 17, 2022 Donna Coles

Sean Miller set this week’s #WOW2022 challenge based on a common requirement – how to allow users to navigate a hierarchy of data while capitalising on the real estate available to display the data.

The charts required for this challenge are very simple, so I’m not going to spell out how to build these. I created 4 charts

Trend – Sales by Month line chart
by Category – Sales by Category horizontal bar chart
by Sub-Category – Sales by Sub-Category horizontal bar chart
by Product – Sales by Product Name horizontal bar chart

Now all the remaining functionality to drive the navigation through the hierarchy, how the charts are filtered at each level and whether the chart should display or not, will be driven by parameter actions. So for this we will need 3 parameters

pCategorySelected

string parameter defaulted to <nothing> (empty string)

We need similar parameters for pSubCategorySelected and also pProductSelected.

Controlling the sheet swap & filtering the charts

On a dashboard, add the Trend sheet, then below it add a vertical container.

Within the vertical container add the by Category sheet, the by Sub-Category sheet and the by Product sheet. Remove the title from all these sheets. Show the 3 parameters.

We’ll now set up some calculated fields to determine when each of the bar chart sheets should display or not.

Filter: Show Category

[pCategorySelected]=''

If this parameter is empty, we want to show the category bar chart. Add this field to the Filter shelf of the by Category sheet and set to True.

Type the word ‘Furniture’ into the pCategorySelected parameter box and press return. The by Category sheet should disappear from the dashboard.

We now do a similar calculation for the by Sub-Category sheet

Filter: Show Sub-Category

[Category]=[pCategorySelected] AND [pSubCategorySelected]=''

This field is filtering the bar chart based on the selected category

Add this to the Filter shelf of the by Sub-Category and set to True. The bars should now just display the sub-categories associated to the Furniture category.

Now type the word ‘Chairs’ into the pSubCategorySelected parameter box. The by Sub-Category sheet should also now disappear from the dashboard.

Finally we also need to now ensure the by Product sheet is filtered to the relevant Sub-Category.

Filter: Show Product

[Sub-Category]=[pSubCategorySelected]

Add this to the Filter shelf of the by Product sheet and set to True. Only products associated to Chairs should now be listed.

Now we’ve set all this up, we also need to ensure the Trend sheet is getting filtered based on all the selections being made.

Filter : Trend

([Category]=[pCategorySelected] OR [pCategorySelected]='')
AND
([Sub-Category]=[pSubCategorySelected] OR [pSubCategorySelected]='')
AND
([Product Name]=[pProductSelected] OR [pProductSelected]='')

Add this to the Filter shelf of the Trend sheet and set to True.
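The pattern in this filter is that an empty parameter acts as a wildcard, while a populated one has to match exactly. A small Python sketch of that logic, where the dictionary keys mirror the field names and the sample row is made up:

```python
def matches(value, selected):
    """An empty selection matches everything; otherwise the value must be equal."""
    return selected == "" or value == selected

def trend_filter(row, category="", sub_category="", product=""):
    """True if the row should be kept in the Trend chart for the current selections."""
    return (matches(row["Category"], category)
            and matches(row["Sub-Category"], sub_category)
            and matches(row["Product Name"], product))

# A made-up row, for illustration only
row = {"Category": "Furniture", "Sub-Category": "Chairs", "Product Name": "Some Chair"}
print(trend_filter(row))                          # True (nothing selected)
print(trend_filter(row, category="Furniture"))    # True
print(trend_filter(row, category="Technology"))   # False
```

With all three parameters empty every row passes, which is why the trend shows total sales before any drill down has happened.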

Type in an appropriate value into the pProductSelected parameter box (eg Global Task Chair, Black) and see how the trend changes.

Setting the parameters

This will all be done with parameter actions – there’s a few 🙂

+ Drill down to show Subcategories within <Category>

Use the Insert link to add the <Category> field to the action title – this will then be set dynamically based on the bar being selected.

Set the action to apply to the by Category sheet only, and via the Menu option. It should impact the pCategorySelected parameter and retain its value when unselected. The Category field should be passed into the parameter.

Delete all the values from the parameter boxes, so they’re all empty. This should reset the dashboard so only the by Category sheet is displayed under the trend. Hover/click on a bar to show the tooltip and click on the link. The pCategorySelected parameter should be populated and the bar chart displayed now changes.

We’re going to create a similar parameter action for the drill down from by Sub-Category to the by Product sheet

+ Drill down to show Products within <Sub-Category>

This time the action applies to the by Sub-Category sheet on the Menu action, and sets the pSubCategorySelected parameter with the Sub-Category value, again retaining the value when cleared.

On this sheet, we also need an action to allow us to ‘drill up’. We need to set the pCategorySelected parameter back to nothing. For this we need an additional calculated field

Level Up : Category

''

Add this field to the Detail shelf on the by Sub-Category sheet.

Then back on the dashboard, add a further parameter action

– Drill Up to show all Categories

The action runs on the Menu of the by Sub-Category sheet only, setting the pCategorySelected parameter with the value from the Level Up: Category field. Again the value should be retained when deselected.

Test the functions. The drill down should display the by Product sheet. Then manually delete the value in the pSubCategorySelected parameter, and test the drill up action.

We now need to deal with the actions from the by Product sheet

+ Filter dashboard to <Product Name>

This action runs on the Menu of the by Product sheet only and passes the Product Name field into the pProductSelected parameter. This time though, when the bar is unselected, the parameter should be cleared to ‘blank’.

Next we’ll add the drill up function back to the sub-categories.

Similarly we need to set the pSubCategorySelected parameter back to empty string, so we need

Level Up: Sub-Category

''

Add this to the Detail shelf of the by Product sheet. Also add the Category field to the Detail shelf.

+ Drill Up to show Subcategories within <Category>

The action applies to the Menu of the by Product sheet only, passing the Level Up: Sub-Category field into the pSubCategorySelected parameter. The value should be retained when cleared. Note the Category field was required so it could be added to the menu action title.

Test the actions, and verify the behaviour of the parameter boxes as each selection is made.

The dynamic title

The title of the trend line keeps track of the options selected during the navigation. For some reason, I used a separate sheet, but that’s not needed and actually goes against the requirement of 4 sheets only. So I’ll describe how to dynamically set the title on the Trend sheet instead. We’ll need some additional fields

Title – Category

IF [pCategorySelected] <> '' THEN 'for ' + [pCategorySelected] ELSE '' END

Title – Sub-Category

IF [pSubCategorySelected] <> '' THEN '-> ' + [pSubCategorySelected] ELSE '' END

Title – Product

IF [pProductSelected] <> '' THEN '-> ' + [pProductSelected] ELSE '' END

Add all these fields to the Detail shelf of the Trend sheet, then update the title
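To illustrate how the three title fields combine, here is a Python sketch that builds the same breadcrumb-style text. Joining the pieces with a single space is my assumption; the exact spacing depends on how you lay out the fields in the title.

```python
def title_suffix(category="", sub_category="", product=""):
    """Build the breadcrumb part of the title from whichever selections are set."""
    parts = []
    if category:
        parts.append("for " + category)
    if sub_category:
        parts.append("-> " + sub_category)
    if product:
        parts.append("-> " + product)
    return " ".join(parts)

print(title_suffix())                       # (empty string)
print(title_suffix("Furniture"))            # for Furniture
print(title_suffix("Furniture", "Chairs"))  # for Furniture -> Chairs
```

Each field contributes nothing when its parameter is empty, so the title grows as you drill down and shrinks as you drill back up.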

All of this should now mean the core requirements of the challenge have been met.

Bonus – Extending the tooltip width

The bonus step was to extend the width of the tooltip so no word-wrapping existed. I did this by creating a Viz in Tooltip.

On a separate sheet I added Category and Sales to the Text shelf and formatted so they were aligned as required.

This sheet was then referenced from the by Category tooltip where I then adjusted the width to 350 and the height to 75

I repeated this creating similar sheets for Sub-Category and Product Name.

You just now need to tidy up the dashboard – add a text box to act as a title for the bar charts section, format the titles to be grey and remove the parameters from the display. My published viz is here.

Happy vizzin’!

Donna

Sales Budget “Burndown” Chart

March 10, 2022 Donna Coles

Erica Hughes had a table-calc-tastic challenge for us this week! I was using Tableau before LoDs were invented, so became very familiar with the ‘dark art’ of table calculations, and subconsciously often turn to these first when solving problems. Since the advent of LoDs, this functionality can get forgotten about, so this is a great challenge to help flex those table calc muscles and become a good reference point.

With any table calc challenge, I work out what I need in a tabular form before even attempting any visual, so that’s where we’ll start today.

I also tend to ‘build up’ the calculated fields I need, as this helps me validate the data I’m working with. It means I end up with more calculated fields than absolutely necessary, but it’s a good practice to get in as it eases troubleshooting in the long run.

In this example, we need to treat Sales as if it’s a budget that is then being ‘spent’ over the course of the following months. The first thing we need to define is the total budget

Total Sales

TOTAL(SUM([Sales]))

We then need to understand the cumulative (running sum) of Sales per month.

Running Sum Sales

RUNNING_SUM(SUM([Sales]))

Note – this calculation can be achieved by applying a Quick Table Calculation to the Sales pill once added to a view, but I will need to reference the values in other calculated fields, so created the field directly.

Let’s start building out our table of data to see what’s going on with these calculations.

Add Order Date to Rows and set so it’s displaying the date in the Month Year format

Then add Total Sales, Sales and Running Sum Sales

The Total Sales column will be the same for every row and match the same value in the last row of the Running Sum Sales column. The value in each Running Sum Sales row is the sum of the Sales value in the same and all preceding rows.
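The relationship between the two columns is easy to check with a few made-up numbers. In this Python sketch, accumulate plays the role of RUNNING_SUM and sum plays the role of TOTAL:

```python
from itertools import accumulate

monthly_sales = [100, 250, 150, 500]  # made-up monthly values

total_sales = sum(monthly_sales)                # same value for every row
running_sum = list(accumulate(monthly_sales))   # cumulative total per row

print(total_sales)   # 1000
print(running_sum)   # [100, 350, 500, 1000]
```

The last running sum value equals the total, which is a handy check that the table calculation is computing over the right field.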

By default the table calculations are working ‘down’ the table which gives the desired result, but I tend to ‘fix’ the field the calculation is computing over, as when we build the viz we won’t be going ‘down’, but ‘across’, so fixing helps ensure we get all our settings just right.

So edit the table calculation of both the Total Sales and the Running Sum Sales fields to compute using Month of Order Date.

With these fields, we can calculate the ‘sales budget’ value being displayed as

Total Less Running Sum Sales

[Total Sales]- [Running Sum Sales]

Pop this on the view and verify the table calculation settings are set as above (this field contains nested calcs, so check each setting). You should be able to verify the value of this column is the result of the 1st column – 3rd column values

But there’s a twist.

The requirements state that “for dates in the future, the sales budget should remain constant”.

For this we need to work out what month ‘today’ is in. For this task, I have hard coded ‘today’ into a parameter called Today and set to 8th March 2022. If this was a ‘real’ production business dashboard, I’d just refer to the function TODAY() directly.

I can then work out

Current Month

DATETRUNC('month', [Today])

which will return 01 March 2022 in this instance.

I then want to identify the record that matches the current month

Is Current Month?

DATETRUNC('month', [Order Date]) = DATE([Current Month])

which just returns a boolean True/False.

And with this, I can then determine the value from the Total Less Running Sum Sales column that is associated to March 2022, and ‘spread’ that value across every row in the view.

Curr Month: Total Sales Less Running Sum Sales

WINDOW_MAX(IF ATTR([Is Current Month?]) THEN [Total Less Running Sum Sales] END)

If the month in the row is the current month, get the required value and ‘spread’ over every other row using WINDOW_MAX. Add this to the view, checking your table calc settings again.

Now we can work out the values needed for the Sales Budget line

Sales Budget

IF MIN(DATETRUNC('month', [Order Date])) > MIN(DATE([Current Month])) THEN [Curr Month: Total Sales Less Running Sum Sales] ELSE [Total Less Running Sum Sales] END

Format this to $, Millions (M), 1dp.

If the month is later than the current month, use the value associated to the current month, otherwise use the Total Less Running Sum Sales value. Note here, the date functions are wrapped within a MIN function as the other fields are table calculations which means they’ve been aggregated, so all other fields referenced need to be aggregated too. The function MAX would have worked just as well.
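Putting the pieces together, the ‘hold the value flat for future months’ logic can be sketched in Python like this. The figures are made up, and current_index (the position of the current month in the list) is a stand-in for the Current Month comparison.

```python
from itertools import accumulate

def sales_budget(monthly_sales, current_index):
    """Remaining budget per month, held constant after the current month."""
    total = sum(monthly_sales)
    remaining = [total - running for running in accumulate(monthly_sales)]
    held = remaining[current_index]  # value at the current month, spread to later rows
    return [held if i > current_index else value
            for i, value in enumerate(remaining)]

# Made-up data: 4 months, with the 2nd month (index 1) as the current month
print(sales_budget([100, 250, 150, 500], 1))  # [900, 650, 650, 650]
```

Months up to and including the current one show the true remaining budget, and everything after simply repeats the current month’s value.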

Phew! we’ve finally got the data we need for the first line :-). As mentioned earlier, this can be achieved by combining some of the logic in the calcs, but I like to be methodical and verify my numbers give me what I expect at each stage.

In order to display the Estimated Budget line, we need to first work out how much of the total sales would be spent each month, if the same amount was spent each month – ie Total Sales divided by number of months.

Size (Count of Months)

SIZE()

I just chose to use the SIZE() table calculation to essentially count the number of rows in my view.

Estimated Budget Constant

//average the sales over the total months to get a constant budget
[Total Sales] / [Size (Count of Months)]

Add these into the view and adjust the table calc settings as before

essentially Total Sales (2,297,201) / Count of Months (48) = Estimated Budget Constant (47,858).

Like before, we need to compute running sum of this estimated budget

Running Sum Est Budget

RUNNING_SUM([Estimated Budget Constant])

and then we can calculate the Estimated Budget

Total Less Estimated Running Sum

[Total Sales] - [Running Sum Est Budget]

This now gives us the data to plot the 2nd line.
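The estimated line is just the total being spent in equal monthly chunks. A short Python sketch of that arithmetic, using a made-up total for the example and the real figures quoted above for the per-month constant:

```python
def estimated_budget(total_sales, n_months):
    """Remaining budget per month if the same amount is spent every month."""
    constant = total_sales / n_months  # the Estimated Budget Constant
    return [total_sales - constant * (i + 1) for i in range(n_months)]

# Made-up example: 1000 spread evenly over 4 months
print(estimated_budget(1000, 4))  # [750.0, 500.0, 250.0, 0.0]

# The per-month constant for the figures quoted in this post
print(round(2297201 / 48))  # 47858
```

Because the same amount comes off each month, the result is a straight line that reaches zero in the final month.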

The final calculation we need for the tooltip is the difference between the sales budget and estimated budget

Difference to Estimated

[Sales Budget]- [Total Less Estimated Running Sum]

format this using a custom format of +"$"#,##0,K;-"$"#,##0,K

Now we can build the viz! I tend to keep sheets like above in any workbook as a ‘check sheet’ if I need to do any troubleshooting later on.

So on a new sheet, add Order Date to Rows and set to be a continuous (green) month/year format this time. Add Sales Budget to Columns. Adjust the table calculation so all nested calculations are computing by Order Date.

Add Total Less Estimated Running Sum to Columns (set the table calc settings), then change to dual axis and synchronise axis.

Remove Measure Names from the All Marks card to remove the colours that have been set. Change the Colour of the Sales Budget line to purple, and set the markers to have circles.

From the Analytics tab, drag a Trend Line to the canvas and drop it as a linear trend on the Total Less Estimated Running Sum measure

This will add the ‘dotted’ line.

On the Total Less Estimated Running Sum marks card, set the Opacity of the Colour to 0%, so only the grey dotted trend line is visible.

Edit the trendline and uncheck Show recalculated line for highlighted or selected data points

Add Current Month to the Detail shelf of the All Marks card, and set to the month/year format. Then right click on the Order Date axis and Add Reference Line, setting the values as below

Right click on the reference line, and Format; adjust the alignment and font size & colour, so ‘Future’ is listed at the top.

Add Difference to Estimated onto the Tooltip of the Sales Budget card (adjust those table calc settings). Then format the tooltip accordingly.

Add Curr Month: Total Sales Less Running Sum Sales to the Detail shelf of the All Marks Card (adjust those table calc settings). Edit the title of the viz

Finally tidy up the display by removing axis, row and column banding etc, and adjust the Sales Budget axis so it displays every 500,000.

And that should be it… my published viz is here.

Happy vizzin’!

Donna

How many patients were admitted every 15 minutes?

March 3, 2022 Donna Coles

This week, Luke set the challenge, which is focussed on manipulating time. Medical admissions over many days are represented in a bar chart which spans a 24 hour period. All admissions need to be ‘bucketed’ into 15 minute intervals over the 24 hours, ie admissions between midnight and 12:14am are counted within the same 12:00am ‘bucket’.

The data for this challenge is embedded within a workbook which you need to download via the challenge page. I did as Luke instructed; downloaded the workbook, deleted all the existing sheets, then re-saved as my own file.

The first step that is required is to ‘baseline’ / normalise the admission dates so they all look to be on the same day.

There’s different ways to do this; on this occasion I used the 3rd method from this Tableau KB, although I simply hardcoded a date of 1 Jan 2023 rather than use TODAY(). It doesn’t matter what this particular date is, it’s just an arbitrary date.

Baseline Admission Date

DATEADD('day', DATEDIFF('day', [Admission Date], #2023-01-01#), [Admission Date])
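The effect of this calculation is to keep the time of day but swap the date for the arbitrary baseline. A Python sketch of the same idea, where the function name is my own:

```python
from datetime import date, datetime

BASELINE = date(2023, 1, 1)  # arbitrary date, same as the one hardcoded above

def baseline_admission(admitted: datetime) -> datetime:
    """Keep the time of day, but move the admission onto the baseline date."""
    return datetime.combine(BASELINE, admitted.time())

print(baseline_admission(datetime(2017, 8, 30, 14, 37)))  # 2023-01-01 14:37:00
```

Every admission now appears to happen on the same day, so they can all share a single 24 hour axis.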

Once we’ve got this, we then need to manipulate this date again to ‘group’ into the 15 min interval. This isn’t something I know ‘just to do’, but I know I’ve done it before. So a quick google was needed and I used this blog for the required calculation.

Baseline Admission Date 15 mins

DATETIME((INT(FLOAT([Baseline Admission Date])*96))/96)

Pop these fields out into a table to see how these calculated fields are working
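If it helps to see the arithmetic outside Tableau, here's a minimal Python sketch of the same two steps. It's only an illustration: the function names are my own, and it floors the minutes with integer arithmetic rather than using the float-based calculation above.

```python
from datetime import datetime

BASELINE_DAY = datetime(2023, 1, 1)  # arbitrary date, as in the post


def baseline(admission: datetime) -> datetime:
    """Keep the time of day but move the date to the baseline day."""
    return datetime.combine(BASELINE_DAY.date(), admission.time())


def bucket_15_min(moment: datetime) -> datetime:
    """Floor a datetime to the start of its 15 minute interval."""
    minutes = (moment.minute // 15) * 15
    return moment.replace(minute=minutes, second=0, microsecond=0)


# Made-up admission just before 12:15am on some other day
admission = datetime(2017, 9, 4, 0, 14, 59)
print(bucket_15_min(baseline(admission)))  # 2023-01-01 00:00:00
```

Every admission ends up on the same day, in one of 96 possible buckets.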

The Baseline Admission Date 15 mins is what we’ll use for the x-axis. The next step is to work out the value being plotted, Stays per Day.

Now when the challenge was first posted, a couple of the requirements were missing, so there was a bit of head-scratching trying to figure out what numbers were being used to get the values presented.

The following fields need to be added to the Filter shelf:

Stay Type = Outpatient
Admission Date starting from 30 Aug 2017 00:00

When on the filters shelf, both these should then be added to Context, as the data needs to be filtered before the LOD calc we need to use gets calculated (defined below).

Count Days with 15 min interval

{FIXED [Baseline Admission Date 15 mins]: COUNTD(DATETRUNC('day', [Admission Date]))}

This is counting the distinct days when there was an admission within each 15 minute period, ie if there were 2 admissions on the same day within the same 15 minute window, the day would only count as 1.

From this we can then compute

Stays per Day

SUM([Number of Records])/SUM([Count Days with 15 min interval])
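To sanity check the logic, here's a rough Python sketch of the ratio as I understand it: admissions in each 15 minute bucket divided by the distinct days that had an admission in that bucket. The sample admissions are made up, the function name is my own, and the Stay Type and date filters are not applied here.

```python
from collections import defaultdict
from datetime import datetime


def stays_per_day(admissions):
    """Map each 15 minute time-of-day bucket to admissions / distinct days."""
    counts = defaultdict(int)
    days = defaultdict(set)
    for adm in admissions:
        bucket = (adm.hour, (adm.minute // 15) * 15)
        counts[bucket] += 1
        days[bucket].add(adm.date())  # a day only counts once per bucket
    return {b: counts[b] / len(days[b]) for b in counts}


sample = [
    datetime(2017, 9, 1, 0, 5),
    datetime(2017, 9, 1, 0, 10),  # same day, same bucket as the row above
    datetime(2017, 9, 2, 0, 2),
    datetime(2017, 9, 2, 8, 31),
]
result = stays_per_day(sample)
print(result[(0, 0)])   # 1.5
print(result[(8, 30)])  # 1.0
```

The midnight bucket has 3 admissions across 2 distinct days, hence 1.5.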

Now we’ve got the data we need, so we can build the viz.

Add Baseline Admission Date 15 mins to Columns as a green continuous pill, and Stays per Day to Rows. Don’t forget to add Admission Date and Stay Type to the Filter shelf as mentioned above.

The bars look ‘blocky’. You can manually adjust the size, but you might notice that the gap between the bars isn’t exact: the whitespace looks larger between some bars than others. To resolve this, I created a field to control the size, which is based on the number of 15 minute intervals there are in a 24 hour period, ie 96.

Size

1/96

Add this to the Size shelf, change the aggregation to MIN, and adjust the size to be Fixed and aligned Centre.

Add another instance of Stays per Day to the Rows shelf. Then make it dual axis and synchronise axis. Change the mark type of the 2nd Stays per Day instance to Gantt and change the colour. Then change the colour of the 1st Stays per Day.

Show mark labels, and set the Label just to show the max value.

Right click on the time axis, and format, and custom format to h:nn am/pm

And essentially, that’s it. There’s formatting to do to remove the secondary axis, column & row banding etc and add tooltips, but the core of the viz is complete.

My published instance is here. Note the time formatting seems to be an issue on Tableau Public. Someone did comment that this was an issue with the MAKETIME function, but I didn’t actually use this function.

Happy vizzin’! Stay Safe!

Donna

Let’s switch those measures!

February 24, 2022 Donna Coles

It was time for Lorna to add her iteration to the dashboard challenge that’s been set over the last few weeks. The challenge this week was to incorporate a measure swap control which allows the user to select which measures they want to compare.

Lorna gave the option to build out either your own or an already published solution, or to build a simpler brand new viz that just demonstrates the technique.

By incorporating this measure swap control into an existing solution, existing calculated fields and other objects will need to be adjusted/renamed. I started to build against my own solution, but that already had multiple instances of calculated fields where I’d chosen to change from a table calculation version in week 5 to an LoD based solution in week 6. I decided instead to use Kyle’s solution from week 7, which is published here.

However, for the purposes of this blog, I will build out the simple solution, so the focus is on what you need to do to build the control.

Building the Measure Selector

So to start, I simply copied the text for the secondary data source off the challenge page

I then opened Tableau Desktop, connected to the Orders sheet of the relevant Superstore Sales excel spreadsheet, then on the Data menu, selected Paste, which added a ‘clipboard’ data source and presented the data on a sheet.

To start with I chose to Alias the Number field (right click > Aliases) to map the numbers in the data source to the names of the measures we want, as below

The values selected are going to be captured within a parameter, so I then set these up

pMeasureX

Integer parameter defaulted to 1 (ie Sales)

I created a similar pMeasureY too, this one defaulted to 3 (ie Profit Margin).

For the selector control, we need to indicate which measure has been selected, so we need

X Selected?

[pMeasureX]=[Number]

and

Y Selected?

[pMeasureY]=[Number]

I then added Number to Rows and created 3 MIN(1) fields on the Columns shelf

On the first MIN(1) marks card, I changed the mark type to Shape, then added X Selected? to the Shape shelf. I adjusted the shape using options from the Ratings shape palette, and adjusted the size to suit.

On the 2nd MIN(1) marks card, I changed the mark type to Circle, reduced the size to as small as possible, and changed the colour to be completely transparent (Opacity = 0%). I then added Number to the Label shelf, changed the alignment to be Middle Centre, and adjusted the height of the rows, so all the text labels are visible.

On the 3rd MIN(1) marks card, I repeated the steps taken for the 1st marks card, but added Y Selected? to the Shape shelf instead.

I then hid the axes and the Number pill (uncheck Show Header), and removed all row/column borders and gridlines/zero lines.

When this sheet is added to the dashboard, we’re going to need to add parameter actions to pass information from this sheet into the pMeasureX and pMeasureY parameters. For this I found I needed to create copies of the Number field, as using the same field to control both selectors caused me some issues. So I created

Number (X)

[Number]

and added this to the Detail shelf of the 1st MIN(1) marks card.

And then

Number (Y)

[Number]

which was added to the Detail shelf of the 3rd MIN(1) marks card.

Another feature of this sheet when added to the dashboard, is we don’t want the circle selected to be highlighted/remain selected, so we’ll use a filter action to solve that. But again we need some additional fields on the sheet to help us with that. Create calculated fields

True

TRUE

False

FALSE

And add both of these to the Detail shelf of the All Marks card.

Finally adjust the title of the sheet to contain the text ‘X Select a Measure Y’. You’ll need to play around with the number of spaces to use. It’s best to wait until the sheet is on the dashboard to get it right.

Building the Scatter Plot

On a new sheet, the first thing needed for this challenge is to create

Profit Ratio

SUM([Profit])/SUM([Sales])

Now we need to define fields that will determine what measure to display based on the value stored in the relevant parameter

X Measure

CASE [pMeasureX]
WHEN 1 THEN SUM([Sales])
WHEN 2 THEN SUM([Profit])
WHEN 3 THEN [Profit Ratio]
WHEN 4 THEN SUM([Quantity])
END

Y Measure

CASE [pMeasureY]
WHEN 1 THEN SUM([Sales])
WHEN 2 THEN SUM([Profit])
WHEN 3 THEN [Profit Ratio]
WHEN 4 THEN SUM([Quantity])
END

Add X Measure to Columns and Y Measure to Rows, and add State to Detail.

We need to label the X & Y axis based on the name of the selected measure, so we need

X Dimension

CASE [pMeasureX]
WHEN 1 THEN 'Sales'
WHEN 2 THEN 'Profit'
WHEN 3 THEN 'Profit Ratio'
WHEN 4 THEN 'Quantity'
END

Add this to the Columns shelf

Y Dimension

CASE [pMeasureY]
WHEN 1 THEN 'Sales'
WHEN 2 THEN 'Profit'
WHEN 3 THEN 'Profit Ratio'
WHEN 4 THEN 'Quantity'
END

Add this to the Rows shelf
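The measure and dimension fields above are really just a lookup from the parameter’s integer to a label and an aggregation. Here’s a minimal Python sketch of that idea; the row structure and names are made up for illustration and aren’t part of the workbook.

```python
def profit_ratio(rows):
    """SUM(Profit) / SUM(Sales) over the given rows."""
    return sum(r["profit"] for r in rows) / sum(r["sales"] for r in rows)


# Parameter value -> (axis label, aggregation), mirroring the CASE statements
MEASURES = {
    1: ("Sales", lambda rows: sum(r["sales"] for r in rows)),
    2: ("Profit", lambda rows: sum(r["profit"] for r in rows)),
    3: ("Profit Ratio", profit_ratio),
    4: ("Quantity", lambda rows: sum(r["quantity"] for r in rows)),
}


def measure(param, rows):
    """Return (label, value) for the measure picked by an integer parameter."""
    label, aggregate = MEASURES[param]
    return label, aggregate(rows)


rows = [
    {"sales": 100.0, "profit": 20.0, "quantity": 3},
    {"sales": 300.0, "profit": 20.0, "quantity": 5},
]
print(measure(1, rows))  # ('Sales', 400.0)
print(measure(3, rows))  # ('Profit Ratio', 0.1)
```

Changing the parameter swaps both the value plotted and the label shown against the axis, which is what the X and Y fields do between them.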

Edit both axes and remove the title from both. Right click on the word ‘Profit Ratio’ and Rotate Label. Right click on ‘X Dimension’ at the top and select hide field labels for columns. Right click on ‘Y Dimension’ on the left and select hide field labels for rows. Format to remove the row and column gridlines.

The State a user clicks on/selects needs to be captured within a parameter

pSelectedState

String parameter defaulted to New Jersey

We then need to determine if the State matches that selected

Is Selected State?

[State]= [pSelectedState]

Change the mark type to Circle, then add Is Selected State? to the Colour shelf and adjust to suit. Add a border to the mark too. Drag the ‘True’ value on the colour legend so it’s listed above ‘False’.

Add Is Selected State? to the Size shelf too, and adjust so the sizes are reversed, and alter the range.

Use the slider on the Size shelf to make all the circles smaller too, if need be.

Adjust the Tooltip so the text will be dynamic based on the measure selections too.

Then, modify the title of the sheet, so it too is dynamic.

Finally, this sheet also needs to ensure the State circle clicked on doesn’t remain ‘selected’, so as above, create the True = TRUE and False = FALSE calculated fields, and add these to the Detail shelf.

Adding the interactivity

Add both the sheets to a dashboard.

Create the following dashboard actions:

Set X Measure

A parameter action, that runs on Select of the Measure Selector sheet and passes Number (X) into the pMeasureX parameter.

Create a similar parameter action called Set Y Measure that passes Number (Y) into the pMeasureY parameter.

Create a dashboard filter action Selector Sheet – Unhighlight which on select of the Measure Selector sheet on the dashboard, targets the Measure Selector sheet directly, passing True = False into the filter. All values should show when the selection is cleared.

With these 3 actions, you should be able to test what happens when you click the different combinations.

Now we need 2 further dashboard actions

Set Selected State

A parameter action that on select of the Scatterplot passes the State field into the pSelectedState parameter.

And finally, add another dashboard filter action, Scatter – Unhighlight, that on select of the Scatter sheet on the dashboard, targets the scatter sheet directly, passing the fields True = False as the filter.

Hiding the Measure Selector

On the dashboard, select the Measure Selector sheet and select the Add Show/Hide Button from the context menu.

This will create an X button that can then be moved and positioned where required.

On the context menu of the X button, select Hide and the measure selector sheet will disappear, and the X will be replaced by a different image

Select Show from the context menu of this button, and the sheet will reappear. Test the functionality in presentation mode where simply clicking the button will invoke the show/hide behaviour.

And this should be the core functionality to demonstrate this measure swapping feature. My published viz based on this more basic solution is here.

If you’re aiming to build on the viz you’ve built over the last 3 weeks, then you need to replace any references you had to SUM([Sales]) and [Profit Ratio] in the various calculations with the X Measure and Y Measure fields accordingly. You’ll also need to adjust tooltips and titles of the various sheets using the X Dimension and Y Dimension fields as appropriate. My adapted solution built on from Kyle’s week 7 solution is here.

Happy vizzin’! Stay Safe!

Donna

Adding more detail & context

February 17, 2022 Donna Coles

This week it was Kyle’s turn to add a 3rd piece to the dashboard (see challenge here).

I chose to build on my previous week’s solution, which you can download from here, so all fields referenced etc will be based on that (ie I may reference fields that don’t exist in the solution published by the #WOW crew, and that have not been created as part of this stage of the solution).

For this part of the challenge, we’re looking to add a trend line, which displays a line for each row in the bar chart above, which means we need to display the median values of the Sales and Profit Ratio measures, since the ‘Other’ bar represents a group of multiple states.

We also need to colour and size the lines according to whether the bar chart has been clicked on or not.

First up, to build the line chart, we need to create some new LoD fields, as the ones we have so far have been FIXED at a State level, and now we need to consider the month of each order. I’m essentially going to repeat the steps I used when building the bar chart, but this time add the month of the Order Date into the calculation.

Sales by State & Date

{FIXED [State], DATETRUNC('month', [Order Date]): SUM([Sales])}

PR by State & Date

{FIXED [State], DATETRUNC('month', [Order Date]): SUM([Profit])/SUM([Sales])}

From these, I can then create the measures I need for the display

Sales by Display State & Date

{FIXED [State To Display], DATETRUNC('month', [Order Date]): MEDIAN([Sales by State & Date])}

format this to $ with 0 dp

PR by Display State & Date

{FIXED [State To Display], DATETRUNC('month', [Order Date]): MEDIAN([PR by State & Date])}

format this to % with 0 dp
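The two layers of LoD can be a bit abstract, so here’s a small Python sketch of what I understand them to be doing for Sales: first total the sales per state and month, then take the median of those totals per display group and month. The order rows, the function name and the grouping rule are made up for illustration.

```python
from collections import defaultdict
from statistics import median


def median_sales_by_display_state(orders, display_state):
    """orders: (state, 'YYYY-MM', sales) tuples; display_state: state -> label."""
    # Inner step: total sales per state and month
    per_state = defaultdict(float)
    for state, month, sales in orders:
        per_state[(state, month)] += sales
    # Outer step: median of those totals per display group and month
    grouped = defaultdict(list)
    for (state, month), total in per_state.items():
        grouped[(display_state(state), month)].append(total)
    return {key: median(values) for key, values in grouped.items()}


orders = [
    ("Virginia", "2021-05", 100.0),
    ("Virginia", "2021-05", 50.0),
    ("Ohio", "2021-05", 40.0),
    ("Texas", "2021-05", 80.0),
    ("Utah", "2021-05", 10.0),
]


def label(state):
    # Hypothetical grouping: only Virginia keeps its own name
    return state if state == "Virginia" else "Other (median)"


result = median_sales_by_display_state(orders, label)
print(result[("Virginia", "2021-05")])        # 150.0
print(result[("Other (median)", "2021-05")])  # 40.0
```

A state shown on its own line has only one value per month, so the median is just that state’s total; only the ‘Other’ group really needs it.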

I can then build the initial line chart, with Order Date on Columns, set to the continuous month level (eg May 2020) and Sales by Display State & Date and PR by Display State & Date on Rows. Add State To Display to Detail.

Now we need to colour and size the lines based on a combination of whether the State is the one selected in the scatter plot, or whether the user has clicked on one of the bars.

I decided that I would capture the ‘state’ the user selected on the bar into a new parameter. So I created

pSelectedStateFromBar

A string parameter defaulted to ‘blank’

With this, I can then create

Size – Line

[State To Display] = [pSelectedStateFromBar]
OR
[Selected State]

This returns true if the State To Display field matches the value captured in the parameter, or the (existing) Selected State field is true.

Add this to the Size shelf on the All Marks card and adjust so the True option is slightly thicker than the False option. You may also need to adjust the slider against the Size shelf to get the thickness just right.

We also use these 2 parameters to determine what the colour of the line needs to be

Colour – Line

IF [Selected State] AND ([pSelectedStateFromBar]='' OR [pSelectedStateFromBar]=[State]) THEN 'dark'
ELSEIF [State To Display]=[pSelectedStateFromBar] THEN 'mid'
ELSE 'light'
END

Add this onto the Colour shelf. You’ll never have all 3 options displayed at the same time, so you’ll need to set the pSelectedStateFromBar to empty and to a value in order to adjust the colours. You’ll also need to ensure ‘dark’ is listed above ‘light’ and then ‘mid’ is listed above ‘light’ so these lines appear ‘on top’.
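To check the different combinations without clicking around, here’s a small Python sketch of the Size – Line and Colour – Line logic. The function and argument names are my own, and I’m assuming the Selected State field is simply State = pSelectedState.

```python
def line_is_thick(state, state_to_display, selected_state, selected_from_bar):
    """Size – Line: the clicked bar's line or the selected state's line."""
    return state_to_display == selected_from_bar or state == selected_state


def line_colour(state, state_to_display, selected_state, selected_from_bar):
    """Colour – Line: 'dark', 'mid' or 'light'."""
    is_selected = state == selected_state
    # Selected state stays dark unless a different bar has been clicked
    if is_selected and selected_from_bar in ("", state):
        return "dark"
    if state_to_display == selected_from_bar:
        return "mid"
    return "light"


# Nothing clicked on the bar chart: the selected state is dark
print(line_colour("Virginia", "Virginia", "Virginia", ""))      # dark
# The Ohio bar is clicked: Ohio becomes mid, Virginia drops to light
print(line_colour("Ohio", "Ohio", "Virginia", "Ohio"))          # mid
print(line_colour("Virginia", "Virginia", "Virginia", "Ohio"))  # light
```

This also shows why you never see all 3 colours at once: ‘dark’ only appears while the bar parameter is empty or matches the selected state.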

The chart just needs tidying up now – edit/remove the axis titles, reduce the size of the text on the axis, remove gridlines and set the tooltips.

Once done, you’re ready to add to the dashboard.

I rearranged my dashboard to have a horizontal layout container with the scatter plot in the left hand column, and a vertical container in the right hand column. The vertical container then had the bar chart above the line chart.

The final step was to then add a parameter action that, on ‘select’ of the bar chart, passes the State to Display value into the pSelectedStateFromBar parameter.

And that should be it. My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Adding Detail & Context

February 11, 2022 Donna Coles

It was Sean Miller’s turn to expand on last week’s #WOW2022 challenge, by adding an additional viz to the scatter plot (challenge details here).

The assumption is you should be able to build on the challenge solution you have built in the previous week. My solution to the Reference Box challenge is here. I adopted (and blogged about) a table calc solution. However, the published solution used LoDs. Both methods achieved the desired result, but I decided when starting this challenge, that I would build this ‘extension’ using LoDs too, so anyone who uses the published solution as a starting point, gets help via this blog.

So if you used my previous blog to build the challenge, you’ll need to first create the LoD equivalent of the calculated fields we used (note I did not change the scatter plot viz to use these fields) :

PR- 25th Percentile LoD

{PERCENTILE({FIXED [State]: SUM([Profit])/SUM([Sales]) },0.25)}

PR-75th Percentile LoD

{PERCENTILE({FIXED [State]: SUM([Profit])/SUM([Sales]) },0.75)}

Sales- 25th Percentile LoD

{PERCENTILE({ FIXED [State]: SUM([Sales]) },0.25)}

Sales- 75th Percentile LoD

{PERCENTILE({ FIXED [State]: SUM([Sales]) },0.75)}

Note – It’s also worth reiterating at this point, that I will be referencing calculated fields/parameters/objects in this blog created as part of my initial challenge.

In order to build the bar chart, we need to categorise each State into a grouping; the selected/highlighted state, the states in the reference box, all other states.

To identify the states in the reference box, we need to use FIXED LoDs to get the value of the Sales and Profit Ratio at the State level.

Sales by State

{FIXED [State]: SUM([Sales])}

PR by State

{FIXED [State]: SUM([Profit])/SUM([Sales])}

We can then use these fields along with the LoDs further above to determine whether a State is in the reference box

In Reference Box?

[PR by State]>=[PR- 25th Percentile LoD] AND
[PR by State]<=[PR-75th Percentile LoD] AND
[Sales by State]>=[Sales- 25th Percentile LoD] AND
[Sales by State]<=[Sales- 75th Percentile LoD]

This simply returns a boolean.
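As a rough cross-check, here’s a Python sketch of the same test on made-up state totals. One assumption to flag loudly: I’m using the ‘inclusive’ quartile method from Python’s statistics module, which interpolates linearly, and I haven’t confirmed this matches how Tableau computes PERCENTILE in every case.

```python
from statistics import quantiles


def in_reference_box(states):
    """states: {name: (sales, profit)} state totals; returns {name: bool}."""
    sales = {s: v[0] for s, v in states.items()}
    ratio = {s: v[1] / v[0] for s, v in states.items()}
    # Assumed percentile method: linear interpolation ('inclusive')
    s25, _, s75 = quantiles(list(sales.values()), n=4, method="inclusive")
    r25, _, r75 = quantiles(list(ratio.values()), n=4, method="inclusive")
    return {
        s: s25 <= sales[s] <= s75 and r25 <= ratio[s] <= r75
        for s in states
    }


# Hypothetical totals: (sales, profit) per state
totals = {
    "A": (100.0, 30.0),
    "B": (200.0, 40.0),
    "C": (300.0, 30.0),
    "D": (400.0, 160.0),
    "E": (500.0, 250.0),
}
print(in_reference_box(totals))
# {'A': False, 'B': True, 'C': False, 'D': True, 'E': False}
```

Here the sales box runs from 200 to 400 and the profit ratio box from 20% to 40%, so only B and D sit inside both.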

Now we can categorise each State

State to Display

IF [Selected State] THEN [State]
ELSEIF [In Reference Box?] THEN [State]
ELSE 'Other (median)'
END

And with the above, we can then define the measures we want to show

Sales to Display

{FIXED [State To Display]: MEDIAN([Sales by State])}

PR to Display

{FIXED [State To Display]: MEDIAN([PR by State])}
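Here’s a small Python sketch of how I think of the State to Display grouping and the median that follows, using made-up sales totals. The function name and inputs are hypothetical; the same pattern applies to the profit ratio.

```python
from statistics import median


def sales_to_display(sales_by_state, selected_state, in_box):
    """Return {display label: median of the per-state sales in that group}."""
    groups = {}
    for state, sales in sales_by_state.items():
        # Selected state and reference box states keep their own name
        keep_name = state == selected_state or in_box[state]
        label = state if keep_name else "Other (median)"
        groups.setdefault(label, []).append(sales)
    return {label: median(values) for label, values in groups.items()}


sales = {"A": 100.0, "B": 200.0, "C": 300.0, "D": 400.0, "E": 500.0}
in_box = {"A": False, "B": True, "C": False, "D": True, "E": False}
print(sales_to_display(sales, "E", in_box))
# {'Other (median)': 200.0, 'B': 200.0, 'D': 400.0, 'E': 500.0}
```

States A and C fall into ‘Other (median)’, so that bar shows the median of 100 and 300.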

And with these fields we can now build the bar chart.

Add Selected State to Rows and drag the dimension value, so True is listed before False.
Add In Reference Box? to Rows, and again drag so the True is listed before False.
Add State to Display to Rows
Add Sales to Display to Columns and Sort descending
Add PR to Display to Columns
On the All Marks Card add Selected State to Colour
Then add In Reference Box? to the Detail shelf. Then click on the … icon to the left of the In Reference Box? pill on the marks card, and change to the Colour icon. This should result in 2 fields on the Colour shelf.
Adjust the colours accordingly.

Add Sales to Display and PR to Display to the Tooltip shelf and adjust.
Change the titles of the axis
Remove row banding
Uncheck Show Header against Selected State and In Reference Box?
Hide field labels for Rows against the State to Display column heading

Add this sheet onto the dashboard and you’re done 🙂

My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Can you build a reference box?

February 3, 2022 Donna Coles

Erica Hughes set her first #WOW challenge this week, which is all about reference lines. The intention by the #WOW crew is to use this initial challenge as the basis for the next few weeks, so it’s going to be interesting to see how that works out.

But back to this week. We’re building a basic Sales vs Profit Ratio scatter plot by State, but then need various calculated fields to plot the reference lines and the reference box.

Building the basic chart
Adding the dotted reference lines
Adding the reference box
Making the y-axis symmetrical
Highlighting the selected state

Building the basic chart

First up we need to create our favourite calculated field

Profit Ratio

SUM([Profit]) / SUM( [Sales])

format to percentage with 0 dp.

Create the basic scatter plot as below

Remove all gridlines, zero lines and axis rulers and tick marks.

Adding the dotted reference lines

These lines are the median values for the Profit Ratio and Sales.

Sales – Median

WINDOW_MEDIAN(SUM([Sales]))

format to Custom Currency with 1 dp, $ prefix and display units to Thousands (K).

PR- Median

WINDOW_MEDIAN([Profit Ratio])

format to percentage with 0 dp.

Add both these fields to the Detail shelf, and adjust the table calculation settings of both fields so they are computing by State

Add a reference line to the Sales axis (I do this by right clicking on the axis > Add Reference Line, but you can also drag from the Analytics pane).

Set the reference line to be against the Entire Table, using the Sales – Median field, no label, and the tooltip set to custom as below. Set the line to be dotted.

Apply a similar reference line to the Profit Ratio axis, using the PR – Median value instead.

Adding the reference box

I have to admit, this bit took a bit of trial and error. I knew it was going to involve a combination of bands and lines and colouring above and below, but things didn’t always go to plan. It was this post by Jonathan Allenby that helped.

The block is basically bounded by the 25th & 75th percentile of the Sales and Profit Ratio values. So let’s create them

Sales – 25th Percentile

WINDOW_PERCENTILE(SUM([Sales]),0.25)

Sales – 75th Percentile

WINDOW_PERCENTILE(SUM([Sales]),0.75)

Add these fields to the Detail shelf and set the table calculation on both to compute by State.

Add a reference line again to the Sales axis, and this time select Band that goes from the Sales – 25th Percentile to the Sales – 75th Percentile and fill based on the colour stated in the requirements.

Now add a reference line to the Profit Ratio axis. This time select Distribution.

Select the computation to be Percentiles and enter 25 as the value. Set Label, Tooltip and Line to None. Now this is the odd bit… set the Fill to the colour white, then check the Fill Below box, which then seems to set the Fill option to ‘No Fill’. Doing this seems to have the desired effect. Trying this in any other order, or manually selecting the Fill option to be No Fill, just doesn’t seem to work…

Now repeat the process adding another reference line on the Profit Ratio axis, and this time create a distribution for the 75th percentile. This time, after setting the Fill to white, check the Fill Above checkbox. Weirdly this time, the fill on my laptop got set to Grey Light, although it was correctly displaying as white on the screen. I manually changed it to ‘No Fill’.

It feels like there’s something a bit ‘buggy’ with all this, which might explain why my initial attempts were failing. I’ll be interested in knowing if you see this behaviour too (I created on v 2021.4.2).

Making the y-axis symmetrical

This was a ‘bonus’ feature, but is again achieved with a reference line. What we’re looking for here is to understand what is the maximum absolute value so we can determine where to place a hidden reference line – ie if the maximum profit ratio is 20% and the minimum is -30%, the maximum absolute value is 30%, and we’d need a reference line at +30%, so the axis extends from -30% to +30%. Conversely if the maximum profit ratio is 40% and the minimum is -25%, the maximum absolute value is 40% and we need a reference line at -40% for symmetry.

I’ve encapsulated all this logic within one field below

Ref Line – Profit Ratio Symmetry

IF MAX(WINDOW_MAX([Profit Ratio]), ABS(WINDOW_MIN([Profit Ratio]))) = WINDOW_MAX([Profit Ratio]) THEN -1*WINDOW_MAX([Profit Ratio]) ELSE -1*WINDOW_MIN([Profit Ratio])
END

The statement MAX(WINDOW_MAX([Profit Ratio]), ABS(WINDOW_MIN([Profit Ratio]))) returns the larger of the maximum value and the absolute minimum value, ie MAX(20, ABS(-30)) = MAX(20,30) = 30, and MAX(40, ABS(-25)) = MAX(40,25) = 40.
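The two worked examples can be checked with a few lines of Python. This is just a sketch of the same IF logic with a function name of my own, taking the window max and min as plain numbers.

```python
def symmetry_reference(max_ratio, min_ratio):
    """Where to place the hidden reference line so the axis is symmetrical."""
    if max(max_ratio, abs(min_ratio)) == max_ratio:
        # The positive side is larger, so mirror it below zero
        return -1 * max_ratio
    # The negative side is larger, so mirror it above zero
    return -1 * min_ratio


print(symmetry_reference(0.20, -0.30))  # 0.3
print(symmetry_reference(0.40, -0.25))  # -0.4
```

In both cases the line lands on the opposite side of zero from the largest absolute value, which is what forces the axis to extend equally in both directions.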

Add this onto the Detail shelf and add another reference line to the Profit Ratio axis, setting it to use the Minimum of the Ref Line – Profit Ratio Symmetry field

Test how the field works, by selecting a few of the State marks towards the top and exclude; the axis should change, but still be symmetrical.

Highlighting the selected state

For this we need a parameter

pSelectedState

string parameter defaulted to Virginia

Then we need a field to identify which state has been captured

Selected State

[State]=[pSelectedState]

which returns a boolean. Change the mark type to Circle and add Selected State to the Colour shelf, and adjust accordingly (add a border to the circle too via the Colour shelf). Also add Selected State to the Size shelf and again tweak.

Once the sheet has been added to the dashboard, create a dashboard parameter action that passes the State field into the pSelectedState parameter on Select of the chart.

The final step to add is to stop all the other marks from ‘fading’ out when a State mark is selected. This is achieved by creating the following calculated fields

True

True

False

False

Add both of these to the Detail shelf of the scatter chart.

On the dashboard, create a dashboard filter action as below, which passes selected fields setting True = False, which can never be true and prevents the mark from being highlighted.

And with that, you should have the desired dashboard. I’m interested to see if this matches Erica’s solution (it’s likely I’ll start with the provided workbook next week, rather than mine, just in case there are some discrepancies – eg I’ve used a lot of table calcs… LoDs may have been possible…)

My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Donna + DataViz

ramblings in all things Tableau (and occasionally some other Stuff….)
