
Luke decided to set us a challenge this week based on Human Resources (HR) data, using a dataset provided as part of the #RWFD Real World Fake Data project managed by Mark Bradbourne.
The challenge focussed on reporting monthly headcount, using a dataset that contained 1 row per employee with a column relating to the start date (Hire Date) and a column relating to the date the employee left (Termdate), which could have been a future date or NULL/blank.
I’ve worked on headcount reporting before in my day job, but have used a snapshotted data source, which captures the active employees as at midnight on the 1st day of every month. This method makes counting the headcount each month very simple, as it’s just counting the number of rows per snapshot date.
Obviously, this wasn’t an option in this case. Luke gave very little clues as to what approach would be required, apart from saying ‘you will also need to manipulate the data’….
A word about the data…
Just a little side note here…. I ended up in discussions with my fellow #WOW participant Rosario Gauna as after attempting one method, I wasn’t getting as high numbers as the solution originally suggested. There were rows in the data set that had NULL values for most of the data except job title, but I think this was only around 225 rows, so even including them I didn’t get close to the figures. Rosario however, was somehow managing to see approx 15,000 rows of null data. I ended up building and publishing my initial solutions based on a version of the data sent to me directly by Rosario.
Since then, it seems others were also confused, and so the solution was adjusted to ignore all the null rows. As a result I reworked one of my solutions to use the original data source I downloaded from the site, and it’s this I will focus on in the blog.
A word about my solutions…
With little instruction, it took a bit of head-scratching to figure out how to work through the problem. I initially decided to use a scaffold dataset to help. I built a simple file in Excel which contained 1 column listing 1 date per month which went from 01 Jan 2017 to 01 Jan 2021. I then joined this to the HR data in Tableau Desktop using the physical data layer (ie I did not use relations), using join calculations as below (note, this is using the amended copy of data I received where Termdate was already defined as a date field.

The solution using this method is published here.
However, I didn’t feel completely comfortable with this approach. Luke had suggested the difficulty of this was 9/10 and referred to nested tableau calculations… the above solution just used quick table calcs. So I decided to see if I could come up with an alternative, just using the provided data, and that’s what I’ll document below.
Setting up the data
I downloaded the Human Resources.csv file from the link provided, and connected to it in Tableau Desktop. I found the Termdate field presented as a string field, so I converted it to a date, just by clicking on the ABC symbol of the data type against the field. I also added a data source filter to exclude all records where the Id field was NULL.
I then added a new data source, and connected to the same file again. This meant I had 2 instances of the Human Resources data source listed in my Data pane. I renamed the 2nd instance to be Leavers Only, changed the data type of the Termdate field again and added a data source filter so that only records with non-Null Termdate values were retained.

Note – What I will be doing here is using the Human Resources data source to manage the months we’re reporting headcount over. This method only works if there is at least 1 new starter every month, which in this case there is.
Building the calculations
The requirements state that a new starter doesn’t count in the monthly headcount figures until the following month, so the date field we need to use in the output, needs to be created as
Report Date
DATE(DATETRUNC(‘month’,DATEADD(‘month’,1, [Hire Date])))
This basically shifts the date the new starter joined to the 1st day of the following month.
To make the calculations easier to read, I also created
Starters
COUNT([2022_04_06_WW14_Human Resources.csv])
which is literally a reference to the field that automatically gets added as part of the connection – it’s just a bit of a mouthful as its named according the data connection.
Let’s pop these out in a table as below – I’m deliberately using the dateparts of the Report Date field as they’ll be used in the viz like this.

So this is just showing the new starters against the month at which they will count for headcount. We need to get details of leavers now, and we’ll do this by blending.
When I blend data sources, I tend to create specific fields so it’s clear to me that they’re being used for blending, rather than defining blend relationships between differently named fields. So in the Human Resources data source I’ve created
BLEND – Date
[Report Date]
and then in the Leavers data source, I’ve created
BLEND – Date
DATE(DATETRUNC(‘month’,DATEADD(‘month’,1,[Termdate])))
which is shifting the Termdate forward to the the 1st of the next month, since leavers should also only be recorded in the following month.
In the Human Resources data source I then created
Leavers Per Month
ZN(COUNT([2022_04_06_WW14_Human Resources – Leavers Only].[2022_04_06_WW14_Human Resources.csv]))
which is just referencing the automatically generated ‘count’ field from the Leavers data source. I’ve wrapped it in ZN so 0 is reported in the event no match is found.
Add Leavers Per Month into the table, and ensure the linking field from the secondary Leaver data source is connected on the BLEND – Date field

So now we know how many starters and leavers per month, we now need to total these up by generating a cumulative running sum of the starters, but subtracting any leavers along the way.
Official Headcount
([Starters] + PREVIOUS_VALUE(0)) – [Leavers Per Month]
Takes the value of the Starters in the current row, adds it to the value of Official Headcount from the previous row, then subtracts any Leavers recorded against the current row. Add this onto the table, and as its a table calculation, edit it so it is explicitly computing my the month & year of Report Date

For the annual change, we’re looking to compare the difference of the Official Headcount value for the current month eg Jan 2021, with the Official Headcount value for the same month 1 year ago eg Jan 2020.
Annual Change
(ZN([Official Headcount]) – LOOKUP(ZN([Official Headcount]), -12)) / ABS(LOOKUP(ZN([Official Headcount]), -12))
Take the Official Headcount from the current row, subtract the Official Headcount from 12 rows before, then divide the result by the Official Headcount from 12 rows before.
Format this to a percentage at 1 dp. (For some reason my numbers here don’t seem to match Luke’s revised solution…).
Again add to the sheet and set the table calcs of both the nested calcs to compute by month and year

The final measure was a bit of a strangely named one IMO. I felt I should be doing some rolling calculation, but ultimately, it just seemed to be reporting the difference between the headcount now and that 12 months ago ie the numerator of the calculation above.
Net Rolling 12 Change
(ZN([Official Headcount]) – LOOKUP(ZN([Official Headcount]), -12))
Pop this into the view, adjust the table calc settings again

Building the viz
On a new sheet, add Report Date at the month level, but then set to be continuous, to Columns and Report Date at the year level to Colour. Add Official Headcount to Rows (adding the linking field for the blend when prompted), and adjust the table calc settings to compute over both month & year, ensuring the Year of Report Date is listed first.

Then in the legend, select all the years from 2000-2017 and ‘hide’ (this is probably a cheat way, but the quickest – we need to retain the data from the previous years, so can’t just ‘filter’).
Edit the axis to not start from zero. Adjust colours to suit.

Add Annual Change and Net Rolling 12 Change to the Rows shelf, adjust both of the table calc settings, so all nested calcs are computing by Year & then Month.
Edit the axis of both these additional measures to also exclude 0. On the All marks card, click the Label button and tick the Show mark labels option.

Final steps ….
…on the All marks card, click the Tooltip button and uncheck Show tooltips so no tooltips display on hover.
Right click on the Month axis and format the axis so the dates in the Scale section are displayed as abbreviated dates

Edit the same axis and remove the title, then format the whole chart to
- remove all row and column dividers
- to set the row banding with a band size of 1
- adjust the colour of the row and column gridlines to be a slightly darker shade of grey

And you’re done 🙂 My published viz is here
Had a bit of all sorts this week… I’ll be intrigued to see how Luke solved it!
Happy vizzin’!
Donna






















































