So back from my holibobs and catching up on the #WorkoutWednesday challenges I’ve missed while I’ve been sunning myself on a Greek beach.
First up is another guest challenge posted by Jeffrey Shaffer who asked us to create a Top N bar chart of the quantity per manufacturer per region on a single viz. The main twist with this is that a manufacturer could be ranked in different locations for each region. The full challenge is here.
In the 2019.1 Superstore Sales data set I had on my machine, there is no Manufacturer field. I couldn’t see one in the link to the dataset included on the blog page either. Perhaps I was missing something, but given I was on catch up, I chose not to find out the real reason. Instead I decided to derive my own Manufacturer field by just taking the first word of the Product Name. It meant my quantity figures wouldn’t match up, but the principles behind the challenge wouldn’t be affected.
Manufacturer
MID([Product Name], 1, FINDNTH([Product Name], ' ', 1) - 1)
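As a rough cross-check of what that calculation does, here is a minimal Python sketch (not Tableau) of the same first-word logic. The function name first_word and the sample product names are made up for illustration. One assumption to flag: a name with no space is returned whole here, which is a case the calculation above doesn't explicitly cater for.

```python
def first_word(product_name: str) -> str:
    """Return the text before the first space, or the whole name if there is no space."""
    head, _sep, _tail = product_name.partition(" ")
    return head


# Hypothetical product names, invented for illustration only
samples = ["Avery Binder Labels", "Eldon Wave Desk Accessories", "Xerox"]
manufacturers = [first_word(s) for s in samples]
```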
Note – I’ve watched Ann’s solution (published on the challenge page), since I published my version, and Ann just lifts a pre-defined Manufacturer group onto her viz, which also already has an ‘Other’ manufacturer. My calculated field above has no such value. Ann’s solution is therefore a bit more straightforward than the work I had to put in to determine ‘Other’.
I interpreted ‘Other’ to be the total quantity of all the manufacturers not in the top n. But of course the top n had to take into account the Region, so it was possible that a manufacturer listed explicitly in the top n for the Central region say, might be included within the ‘other’ pot for another region.
To work this out, I created a quantity field per region, i.e.
Central – Qty
IF [Region] = 'Central' THEN [Quantity] END
East – Qty
IF [Region] = 'East' THEN [Quantity] END
South – Qty
IF [Region] = 'South' THEN [Quantity] END
West – Qty
IF [Region] = 'West' THEN [Quantity] END
I then created a set per region to store the top n manufacturers based on the Quantity per region
The set was based on the Manufacturer field, and limited to the Top n Manufacturers parameter I set up, associated to the relevant [Region] – Qty field created above.
I had 4 sets in the end : Top Central, Top East, Top South, Top West
Note – It is entirely possible with this set up, that each set could contain a completely different set of manufacturers, so up to top n X 4 manufacturers in total
So having got a set which stored my top n per region, I needed to get the ‘Other’ grouping. To do this I created a new field
Manufacturer Category
IF [Region] = 'Central' AND [Top Central] THEN [Manufacturer]
ELSEIF [Region] = 'East' AND [Top East] THEN [Manufacturer]
ELSEIF [Region] = 'South' AND [Top South] THEN [Manufacturer]
ELSEIF [Region] = 'West' AND [Top West] THEN [Manufacturer]
ELSE 'Other' END
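If it helps to see the idea outside Tableau, the sets plus the category field can be sketched in plain Python. This is only an illustration under assumptions: the rows are invented, the function names top_n_sets and manufacturer_category are hypothetical, and ties are broken arbitrarily by the sort, which may not match how a Tableau Top N set treats ties.

```python
from collections import defaultdict

# Hypothetical (region, manufacturer, quantity) rows, invented for illustration
rows = [
    ("Central", "Avery", 50), ("Central", "Hon", 40), ("Central", "Wilson", 10),
    ("East", "Wilson", 60), ("East", "Avery", 30), ("East", "Hon", 20),
]


def top_n_sets(rows, n):
    """One 'set' per region: the n manufacturers with the highest total quantity there."""
    totals = defaultdict(lambda: defaultdict(int))
    for region, manufacturer, qty in rows:
        totals[region][manufacturer] += qty
    return {
        region: set(sorted(by_mfr, key=by_mfr.get, reverse=True)[:n])
        for region, by_mfr in totals.items()
    }


def manufacturer_category(region, manufacturer, sets):
    """Keep the name if it is in that region's top n, otherwise bucket it as 'Other'."""
    return manufacturer if manufacturer in sets[region] else "Other"


sets = top_n_sets(rows, n=2)
```

With these made-up numbers, Wilson keeps its name in East but falls into ‘Other’ in Central, which is the same effect described for the real data.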
With the Top n parameter set to 10, this gives me
There are 12 rows listed, but only 11 values per region (the top 10 + other). You can see here that Wilson is within the top 10 for East & West, but not Central and South, whilst Hon is in top 10 for Central, East & South but not West. This is why there are 12 rows. If I changed the Top n parameter to 15, I’d get 20 rows listed, as there are 19 manufacturers across the top 15 for all the regions + other.
To get the sorting, we need to introduce an old friend:
Index
INDEX()
Adding this as a discrete (blue) pill onto the rows, moving the Manufacturer Category onto Text, and setting the Table Calculation Properties of the Index field to …
- Compute over both Region and Manufacturer Category (with Region listed first)
- To restart every Region
- To sort by SUM of Quantity descending
… I start to get what I need
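Those table calculation settings amount to "number the rows from 1 within each region, biggest quantity first". A small Python sketch of that behaviour, using invented totals and a hypothetical helper name index_within_region, might look like this:

```python
from itertools import groupby

# Hypothetical aggregated rows: (region, manufacturer category, summed quantity)
totals = [
    ("Central", "Other", 900), ("Central", "Avery", 50), ("Central", "Hon", 40),
    ("East", "Other", 800), ("East", "Wilson", 60), ("East", "Avery", 30),
]


def index_within_region(totals):
    """Sort by quantity descending inside each region and number rows from 1."""
    ordered = sorted(totals, key=lambda r: (r[0], -r[2]))
    indexed = []
    # groupby needs its input sorted by region, which the sort above guarantees
    for _region, group in groupby(ordered, key=lambda r: r[0]):
        for i, (region, category, qty) in enumerate(group, start=1):
            indexed.append((region, i, category, qty))
    return indexed


indexed = index_within_region(totals)
```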
But I’ve still got 1 too many rows, I need to deal with the ‘Other’ pot. If my top n parameter is set to 10, then I essentially want to be able to show rows 1-10 above if the Include Other parameter is true, or show rows 2-11 if the Include Other parameter is false (Include Other is a boolean parameter I created).
FILTER – Other
NOT([Include Other]) AND [Manufacturer Category] = 'Other'
Adding this to the Filters shelf and setting it to False hides the ‘Other’ row when Include Other = No, and shows it when Include Other = Yes
So when Include Other = No, I have 10 rows indexed 1-10, and when Include Other = Yes, I have 11 rows indexed 1-11.
I only want the number of rows based on my Top N parameter, so I created an additional field
FILTER – Index
[Index] <= [Top n Manufacturers]
which is added to the Filters shelf and set to True, so only the rows whose index is less than or equal to the parameter value are displayed.
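The way the two filters combine can be sketched in Python for a single region. This assumes the ‘Other’ filter takes effect before the index is computed, which is what the 1-10 numbering described above implies; the function name visible_rows and the quantities are made up.

```python
def visible_rows(region_totals, top_n, include_other):
    """region_totals: list of (category, quantity) pairs for one region."""
    # Step 1 (FILTER - Other): drop 'Other' unless the parameter asks for it
    kept = [r for r in region_totals if include_other or r[0] != "Other"]
    # Step 2 (Index): rank what is left by quantity, descending, starting at 1
    ranked = sorted(kept, key=lambda r: -r[1])
    indexed = [(i, cat, qty) for i, (cat, qty) in enumerate(ranked, start=1)]
    # Step 3 (FILTER - Index): keep only the first top_n rows
    return [row for row in indexed if row[0] <= top_n]


# Hypothetical totals for one region with top n = 3 (three named rows + Other)
central = [("Other", 900), ("Avery", 50), ("Hon", 40), ("Eldon", 30)]
with_other = visible_rows(central, top_n=3, include_other=True)
without_other = visible_rows(central, top_n=3, include_other=False)
```

Either way the region shows exactly top n rows: with ‘Other’ included it takes one of the slots, and without it the lowest-ranked named manufacturer moves back into view.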
So we’re nearly there, except Jeff added an additional curveball…
Display Manufacturer & Rank on hover
To tackle this, I needed 3 more calculated fields and a set, as we’re going to need Set Actions to handle this.
Index Rank
'(#' + STR([Index]) + ')'
this simply formats the index value into the format (#1)
Manufacturer + Rank
ATTR([Manufacturer Category]) + ' ' + [Index Rank]
this concatenates the two fields together – note the ATTR() is needed as Index Rank references the aggregated table calculation of INDEX().
Highlighted Manufacturer
This is a set created by simply right-clicking on Manufacturer Category and selecting Create -> Set, then choosing an arbitrary option (eg Avery). This will be changed via Set Actions later…
LABEL:Manufacturer
IF ATTR([Highlighted Manufacturer]) THEN ([Manufacturer + Rank]) ELSE ATTR([Manufacturer Category]) END
This displays the manufacturer with its rank if the manufacturer is in the set, and just the manufacturer name otherwise.
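As a plain Python illustration of that label logic (the function name label and the example values are hypothetical, and a Python set stands in for the Tableau set):

```python
def label(category, index, highlighted):
    """Show 'Name (#rank)' for highlighted manufacturers, the plain name otherwise."""
    index_rank = "(#" + str(index) + ")"
    if category in highlighted:
        return category + " " + index_rank
    return category


# The set action would swap the contents of this set on hover
highlighted = {"Avery"}
avery_label = label("Avery", 3, highlighted)
hon_label = label("Hon", 2, highlighted)
```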
Adding LABEL:Manufacturer into our data table above, you can see that the Avery manufacturer is displayed with the rank, but the others aren’t
To make the value of the set change on hover, the view needs to be added onto a dashboard, and a Set Action created.
Create a dashboard sheet, and add the view. Then select Dashboard -> Actions -> Add Action -> Change Set Values
Set the properties in the dialog as below – the action to run on Hover, the Target Set to be the Highlighted Manufacturer set created above, and remove all set values when you ‘hover off’
Hovering on Eldon, for example, you can see the manufacturer label change wherever Eldon sits in each region
So that’s all the building blocks now in place, the viz just needs to be created, which I did by duplicating my table version above, and moving the pills around.
I basically created a synchronised dual axis with Index on Rows, and Region, MIN(0) and SUM(Quantity) on Columns. MIN(0) was set to be a Gantt bar and labelled with the LABEL:Manufacturer field, left aligned. The SUM(Quantity) field is labelled with the Quantity. Both labels were set to match the mark colour, which was based on Region using the Nuriel Stone palette specified. Headers were hidden and various formatting applied to the grid lines/rows/columns etc.
My version of the viz is here.
I’m off to ponder about where that Manufacturer field came from, as I think it would have all been a lot simpler if it already existed… hmmmm…