Can you find the top and bottom performers?

The challenge this week came from Candra McRae, where the focus was to use statistics to identify the top & bottom performers, rather than the more common ‘top n’ and ‘bottom n’. By statistics, we’re specifically looking records within the top 25th percentile and the bottom 25th percentile.

So let’s dive in.

• Identify Current date
• Defining the calculations
• Building the chart
• Month Selector and interaction

Identify Current date

The data we’re using is the Superstore Sales data from 2021.1 which includes data up to 31st Dec 2021. The requirement talks about the current rolling x months worth of data compared to the previous x months worth of data.

This means we need a way to determine what ‘current’ is. Typically, in a real sales environment, you’d probably only have data up to ‘today’, and I did consider working up a solution based on ‘today’, but equally I like to deliver a solution that I know matches the challenger, as it helps to validate my workings, and also I like to have a solution that I can look back on in the future and know there’s data.

So I took Candra’s hint and based ‘current’ off of the maximum date in the data set, derived by

Max Date

DATE({FIXED:MAX([Order Date])})

Defining the calculations

If you’re a regular reader of my blogs, you’ll know when I can, I like to build the data out into a tabular form, so I can verify the calculations, and then I’ll build out the viz.

First up, we want to get a value for the ‘current rolling n months’.

To define ‘n’ we need a parameter.

pRollingMonth

An integer defaulted to 12. It doesn’t need to be a list, as this will be populated via a parameter action from another sheet – more on that later.

IF [Order Date]>= DATEADD(‘month’, -1 * [pRollingMonth] ,DATETRUNC(‘month’,[Max Date])) AND [Order Date] <= [Max Date] THEN [Sales] END

let’s break this down… DATETRUNC(‘month’,[Max Date]) truncates the Max Date which is 31st Dec 2021 to the 1st of the month ie it returns 01 Dec 2021.

DATEADD(‘month’, -1 * [pRollingMonth] ,DATETRUNC(‘month’,[Max Date])) , is then going back to the 12 months prior (-1×12=-12) , so is 01 Dec 2020.

So we’re only going to get a sales value if the Order Date is >= 01 Dec 2020 and <= 31 Dec 2021 (essentially 13 months of sales data).

For the previous sales, we first need

Prev Month

so in our current example, this will be 30 Nov 2021.

and then to get the previous rolling 12 month sales, we can apply similar logic using Prev Month instead of Max Date

Previous Rolling Sales

IF [Order Date]>= DATEADD(‘month’, -1 * [pRollingMonth] ,DATETRUNC(‘month’,[Prev Month])) AND [Order Date] <= [Prev Month] THEN [Sales] END

Both these fields can be formatted to 1 decimal place, \$ prefix and format in thousands (k).

And then we also need a difference to display on the tooltip

Difference

SUM([Current Rolling Sales]) – SUM([Previous Rolling Sales])

This needs to be additionally formatted so that negatives are displayed in brackets ().

Add all these into a table and sort by the Current Rolling Sales descending

So we’ve got the data needed for the bar, the line and the tooltip. We now need to work on the crux of the challenge – the calculations needed to identify the top & bottom.

We’re looking to identify the 25th percentile value based on Current Rolling Sales values displayed on screen

25th Percentile

WINDOW_PERCENTILE(SUM([Current Rolling Sales]),0.25)

and also we need the 75th percentile

75th Percentile

WINDOW_PERCENTILE(SUM([Current Rolling Sales]),0.25)

If you pop these table calcs into the table, you’ll see the values for each field are the same for each row

and with these we can now identify where each row falls

Colour

IF SUM([Current Rolling Sales]) >= [75th Percentile] THEN ‘Top’
ELSEIF SUM([Current Rolling Sales]) <= [25th Percentile] THEN ‘Bottom’
ELSE ‘Middle’
END

Finally we need to identify the rows with a negative difference and flag with a circle.

Sales Contraction Indicator

IF [Difference]<0 THEN ‘●’ ELSE ” END

I use this site to get the symbols for these types of requirements,

Pop these two fields in to the table, and you’ve got all the data needed to build the chart:

Building the chart

Candra states that we can’t use a reference line to display the previous sales data, so for the core chart we need to build a dual axis chart plotting Sub-Category against Current Rolling Sales (bar chart) and Previous Rolling Sales (gantt chart).

Current Rolling Sales is coloured by Colour. I created a Label:Current Rolling Sales field just based on Current Rolling Sales but formatted to 0dp to add to the Label shelf.

Then add the Sales Contraction Indicator field between these 2 fields, and format the font of that field so it is in red text (getting a coloured circle, was the part of this challenge I struggled most over, yet it really was very simple once the penny dropped!).

Then hide the first Sub-Category field (uncheck Show Header) so it no longer displays.

Apply various formatting to remove the row & column lines, gridlines etc, and adjust the tooltip and you should be done.

Month Selector and Interaction

A separate sheet is needed for this. We need to build a basic viz that has 12 data points with values 1-12. And we can get this from the Order Date field

Month Order Date

MONTH([Order Date])

will return the month number

Add this field as a discrete (blue) pill to the Columns shelf, set the mark type to square, and add Month Order Date to the Label shelf too. Colour the mark pale grey, and remove all borders etc, and hide the headers

When you add the 2 sheets onto the dashboard, you need to set a parameter action from the Month Selector sheet that sets the pRollingMonth parameter, using the value from the Month Order Date field. When unselecting, the value should default back to 12.

Hopefully there’s enough here to get you to the end! My published viz is here.

Happy vizzin’! Stay Safe!

Donna