Are Sales on track with Goal?

This week’s #WOW challenge was set by Ann primarily to demonstrate the recent 2020.2 release of relationships in Tableau. This is a change to how the data modelling works when combining data together, in this case actual sales data and sales goal (target) data. Ann provided a ‘tweaked’ version of superstore sales data for the actual sales, alongside a custom ‘goals’ data set which provided a sales goal value for each month per Segment and Category (ie the goals data was not as granular as the actual sales data).

Combining the Data

Obviously, to make use of the new relationships feature, you’ll need at least v2020.2.1 installed.

When you add the 2 data sources to the Data Source pane and add the Orders sheet from the Superstore data and then add the Sheet 1 from the Sales Goals data, you’re prompted to add a relationship if matches can’t automatically be found (ie the field names aren’t identical).

I added relationships for the Segment, Category and Month of Order Date fields, which then provides a ‘noodle’ link between the two sets of data

Note – to add multiple relationships, you have to close the field listing to then see the Add more fields option.

With relationships, Tableau is now smarter at how it manages the queries between the data. I’m not going to go into it at length (as there’s plenty of good Tableau documentation on the subject), but I’ll quickly show the difference between this and a traditional ‘join’.

In the Relationships model, if I show Sales and Goal by Month & Segment, the Goal is based on the sum of the rows associated to that Month and Segment that you can find if you looked at the source spreadsheet directly

But if I was using a traditional join with these data sets :

and then presented the same info as above

The Goal data is way off, as it’s being influenced by all the additional rows of data that have been constructed by the join clause.

If we expand to the Category level in each example above (the level at which the Goal data is stored), the Relationship version still shows the correct value

while the Join version is still incorrect, and only changing to an Average aggregation do we get the figure needed

Joining the data requires more work (calculations / knowledge) to ensure the Goal data is reporting correctly at the various levels, whereas the Relationships model ‘just works’.

Building the Table

When I started building the solution, I actually started with the YoY trend chart, but found I had to revisit it (swap a couple of pills about) as I progressed through, so I’m going to start with the table of data itself.

This turned out to be really quite tricksy… it’s just a table right… looks harmful enough…. but Ann had lobbed a few ‘gotcha’ grenades into the mix, that took a bit of head-scratching!

First up, Ann likes to use capitalisation, so I simply renamed the following fields

  • Category (from Sheet 1 – the Goals data) -> CATEGORY
  • Segment (from Sheet 1 – the Goals data) -> SEGMENT
  • Month of Order Date (from Sheet 1 – the Goals data) -> MONTH
  • Sub-Category (from Orders – the Superstore Sales data) -> SUB-CATEGORY

And I then created a hierarchy called SEGMENT which stored SEGMENT – > CATEGORY -> SUB-CATEGORY.

We also needed to ‘fake’ today so we can show sales to date. The Orders Superstore data set contains data from the start of 2016 to the end of 2019. We needed to pretend ‘today’ was 1st July 2019 and not show any sales beyond the end of the month. So I created the field

Today

#2019-07-01#

and then

ACTUAL SALES

IF DATETRUNC(‘month’,[Order Date])<= [Today] THEN [Sales] END

If we build out the basics of the table

  • SEGMENT hierarchy on Rows
  • MONTH (discrete, exact date, formatted to mmmm yyyy) on Rows
  • Measure Names on Columns and on Filter, filtered to ACTUAL SALES and Goal
  • Measure Values on Text
  • MONTH on Filter, set to exclude 2016

we get

and as we expand the hierarchy to the next level, the ACTUAL SALES and Goal adjusts as expected

but when we reach the 3rd level we see a couple of issues

  1. The Goal is showing the values that were displayed at the CATEGORY level, since it isn’t set at any further level. The requirement we have is to show the value ‘No Goal’ instead once we reach this point.
  2. We also have a set of values with a NULL SUB-CATEGORY. This is because there are dates with goal values in the Goals data set, but we have no Sales data for those dates, so a match at the SUB-CATEGORY level can’t be found. We need to remove these records when at this level, and actually only show records where we have ACTUAL SALES, so records from July 2019 onwards should also be excluded.

At this point, I did end up having a sneak look at Ann’s solution. I had an idea of what I wanted to do, BUT I was worried that I may be missing something related to the new data model… that perhaps there was a feature of relationships that I needed to be exploiting as part of this, rather than go down the table calculation route I was heading towards. After all, part of doing these challenges for me is to make use of new features & functions if I can, so I have a reference workbook for future use. I had a read up on the blog posts related to relationships, but couldn’t spot anything obvious, hence a quick check at Ann’s solution…. nope nothing special required in respect of the data model….so instead, we need some additional calculations to help resolve all this.

What we need is some way to identify when we’re at the lowest SUB-CATEGORY level. We’ll start by counting the number of distinct sub-categories we have

# Sub Cats

ZN(COUNTD([SUB-CATEGORY]))

Wrapped in a ZN will return the value 0 when none exist.

Dropping this into the table of data, we can see the 0s for all the future Goal records, and 1 for the rest of the rows, since we’ve expanded to the SUB-CATEGORY level.

If you collapse up to the CATEGORY level, the #Sub Cats count changes, as each row typically contains more than 1 SUB-CATEGORY.

The term ‘typically’ in the above statement though is the reason we can’t assume we’re at the lowest level if the count is 1 though. It is possible for the sales against a particular CATEGORY in a MONTH to only span a single SUB-CATEGORY (Check out Home Office -> Furniture in Jan 2018). So we also need

Max Sub-Cats in Window

WINDOW_MAX([# Sub-Cats])

Which will count the maximum number of SUB-CATEGORY records for each SEGMENT and CATEGORY. In the case of Furniture Sales in the Home Office in Jan 2018, you can see below the count of sub-categories is 1 but the maximum count in the window is 4.

This shows we’re not at the SUB-CATGEORY level, as when we are, the #Sub Cats = Max Sub Cats in Window (ie 1=1)

So with that knowledge we can now derive an alternative field for the goal

SALES GOAL

IF ([# Sub-Cats] = 1 AND [# Sub-Cats]=[Max Sub-Cats in Window]) THEN 0 ELSE SUM([Goal]) END

Remove the original Goal field from the table and add this instead, setting the table calculation to compute by MONTH and you’ll see the SALES GOAL value change from a value to 0 as you expand down to the SUB-CATEGORY level.

So how do we show ‘NO GOAL’ instead of 0? Well this is a bit of a sneaky formatting trick, and one I couldn’t figure out (this step was actually the last thing I changed before publishing).

In my original SALES GOAL calculation I had set the value to be NULL rather than 0 expecting there to be an option to show the NULL values as something else (ie NO GOAL). But the feature I was looking for wasn’t available (maybe I was mistaken about when it appears.. maybe it’s disappeared in 2020.2… I still need to investigate). Anyhow, my alternative was to create a string field to store the value in, but this would make formatting the number quite cumbersome, and I was sure that wasn’t the case.

So I had to check Ann’s solution. Instead of NULL, she had set the value to be 0, and consequently used the following custom number formatting :

“$”#,##0;-“$”#,##0;”NO GOAL”

where the last bit is how to format 0. Very very sneaky huh!

So we’ve worked out how to identify when we’re at the SUB-CATEGORY level, and managed to display ‘NO GOAL’ when at that level, now we need to filter out some of the rows (but only when at this level).

Records to Show

IF [Max Sub-Cats in Window] = 0 THEN FALSE
ELSEIF ([# Sub-Cats] = 1 AND [# Sub-Cats]=[Max Sub-Cats in Window])
THEN //at lowest level
IF ATTR([MONTH]) < [Today] THEN TRUE
ELSE FALSE
END
ELSEIF [# Sub-Cats] = 0 AND [Max Sub-Cats in Window] = 1 THEN FALSE //still at lowest level but no sales so exclude
ELSE TRUE
END

This took a bit of trial & error to work out, and there may well be something much more simplified.

Adding this to the table, and again setting to compute by MONTH you can see which rows are reporting True and False. When collapsed to the SEGMENT or CATEGORY level, all rows are True, but the values start to vary when at the SUB-CATEGORY level.

Now, we need to work on the fields to help show us the actual & % difference.

RESULT

IF ([# Sub-Cats] = 1 AND [# Sub-Cats]=[Max Sub-Cats in Window]) THEN NULL ELSE SUM([ACTUAL SALES])-([SALES GOAL]) END

The difference should only show if we’re not at the lowest level. This needs to be custom formatted as

↑#,##0;↓#,##0

I use this site to find the arrows needed.

ACTUAL vs GOAL

[RESULT]/SUM([ACTUAL SALES])

will give me the % difference (format to percentage with 0 dp).

And the final piece of data we need (I think) is to compute a RAG status for each row. This is dependent on some parameters

GREEN (WITHIN X%)

Type Float, set to 0.1 but displayed as a percentage

and similarly

RED (ABOVE Y%)

With these we can work out

RAG

IF ABS([ACTUAL vs GOAL]) > [RED (ABOVE Y%)] THEN ‘WAY OFF TRACK’
ELSEIF ABS([ACTUAL vs GOAL]) < [GREEN (WITHIN X%)] THEN ‘ON TRACK’ ELSEIF ATTR([MONTH])>=[Today] THEN ‘NO COMPARISON’
ELSEIF [SALES GOAL]=0 THEN ‘NO COMPARISON’
ELSE ‘OFF TRACK’
END

If you add all these fields to your table then you can check how they all compare/change as you expand/collapse the hierarchy (make sure you set the compute by MONTH on all fields).

So now we’ve got all the building blocks you can create the actual table.

I’d recommend duplicating the above (and keeping as a ‘check data’ sheet). Then remove the fields you don’t need, and add the Records To Keep = True to the Filter shelf. Format the table appropriately, and add the title so you’re left with

Building the Line Chart

As mentioned above, I started with this chart, but ended up having to revisit it as I found I needed to use different pills that got created as I built the table. There is actually 1 additional field needed, which isn’t obvious until you start playing with the dashboard actions later:

GOAL

IF [SALES GOAL] > 0 THEN [SALES GOAL] END

This is needed so that when we filter by SUB-CATEGORY later and there is no goal, we only get a single line displaying the sales values, rather than also having a line plotted at 0.

  • MONTH on Filter, excluding Year 2016
  • MONTH (exact date, continuous) on Columns
  • Measure Values on Rows and Filtered to ACTUAL SALES and GOAL
  • Coloured accordingly by Measure Names

Then add TODAY to Detail and add a Reference Line to the MONTH axis, setting it to TODAY, adjusting the format of the line, and setting the fill above colour to grey.

Format the gridlines, tooltips & axis accordingly to match

Barcode / Strip Chart

For this we’re creating a bar chart using MIN(1), and fixing the axis to range from 0-1 so it fills up the entire space as follows

  • MONTH on Filter excluding Year 2016
  • MONTH as exact date, discrete (blue pill) on Columns
  • MIN(1) on Rows
  • RAG on Colour adjusted accordingly
  • ACTUAL vs GOAL on Tooltip

Then use the ‘type in’ functionality to create a pill labelled ‘vs Goal’ on the Rows

Creating the Dashboard

I used a vertical container to add the components onto the dashboard, as I used blank objects that are formatted with a black background, 0 padding and a height of 4 to create the horizontal line separators.

The interactivity between the table and the other charts is managed via Dashboard Filter actions.

To get these to work properly though, and in such a way that the selected SEGMENT and/or CATEGORY and/or SUB-CATEGORY can be referenced in the title of the line chart, 3 separate dashboard actions are required, with each one specifying the field to filter by, as follows :

Once these are set up, the associated pills will appear on the Filter shelf of the Line and Barcode charts, and are then available for selection when building the title of the line chart

You should now have your completed chart.

My published version is here.

Happy vizzin’! Stay Safe!

Donna

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s