Can you build a Sales calendar with top 3 highlighting?

An interesting challenge set by Ann this week, which seemed initially to me that it should be quite straightforward, but as usual ended up causing all sorts of frustrations.

The trickiest part was how to show marks for days when there weren’t any sales. From my initial observations of Ann’s viz (hovering over marks etc), I’d figured out that she had made use of the ability to set a sheet’s background as transparent and then float one sheet over the other. This was because when I hovered over the circles with no days displayed, the mark didn’t ‘highlight’ in any way, indicating that it was ‘behind’ something else.

So with this thought in mind, I went about trying to build the basic viz – a calendar view with all days for the background, and a calendar view with just the days with sales to layer over the top.

Ann didn’t state any requirements about how this could be tackled, so I assumed she hadn’t used any sort of date scaffolding (ie using another data set containing all the dates). I therefore tried to do the same, and did end up with something, but it just felt too complex, and wasn’t really ‘future’ proof for a real world scenario, as it relied on each day across a year existing in the dataset being used, regardless of year (ie there was an entry for 1st Jan, 2nd Jan etc all the way through to 31st Dec, but it could be 1st Jan 2016, 2nd Jan 2018 etc). Knowing I had to blog about what I’d done, I decided to take a step back, and approach the challenge based on what I would have done if I was asked to do something similar for work.

I still wanted to incorporate the transparency/floating sheets requirement though, since this was a piece of functionality I believe Ann was trying to demonstrate the use of in this challenge.

Date Scaffold

I first needed to create a data source to use as my date scaffold. This is simply an excel sheet containing 1 column, Date, with 1 row per day ranging from 01 Jan 2017 through to 31 Dec 2019.

I created this, saved the sheet and then connected Tableau Desktop to it. This will provide the data for my background calendar.

Building the Background Calendar

The overall calendar needs to be organised into 3 columns (based on months) and 4 rows (based on quarters). Within each column, there is then a column per day of week and within each row there is a row per week of the year. Essentially we’re building a trellis type chart.

Column Number

When building trellis charts, there’s all sorts of clever techniques to flex the number of rows & columns based on how many ‘entities’ in the data you’re trying to organise. In this instance we’re dealing with months and we know the data is going to be static, so I just created a simple calculated field to determine which column each month should sit in :

CASE MONTH([Date]) WHEN 1 THEN 1
WHEN 2 THEN 2
WHEN 3 THEN 3
WHEN 4 THEN 1
WHEN 5 THEN 2
WHEN 6 THEN 3
WHEN 7 THEN 1
WHEN 8 THEN 2
WHEN 9 THEN 3
WHEN 10 THEN 1
WHEN 11 THEN 2
WHEN 12 THEN 3
END

This field was then added to the columns shelf, and next to it I added WEEKDAY([Date]) .

Top Tip – by right clicking on a Date field in your dimensions pane, and dragging onto the columns/rows shelf, you get a date selection dialog displayed when you release the mouse, so you can quickly choose the type of date you want to display

By default, as I’m based in the UK, my weeks are set to start on Mondays, but the display starts on a Sunday. To fix this, right click on the datasource and select Date Properties

Onto the rows, I added QUARTER([Date]) using same technique as above.

Now I need to get the weeks to display, but if I just add WEEK([Date]) to rows, I get too many rows for each quarter

What I want is to be able to show weeks 6-9 and 10-14 on the same rows as weeks 1-5. To achieve this I created

Week Index

DATEPART(‘week’, [Date]) – {FIXED DATEPART(‘month’,[Date]): MIN(DATEPART(‘week’, [Date]))}

The FIXED part of the calculation is finding the minimum week number for each month. So for January, the minimum week number is 1, for February it’s 5, for March it’s 9. That minimum number is then being subtracted from each week number, so for

  • week 1 (in Jan), I have 1-1 = 0
  • week 2 (in Jan), I have 2-1 = 1 etc
  • week 5 (in Feb) I have 5-5 = 0
  • week b (in Feb) I have 6-5 = 1 etc

Adding Week Index to Rows rather than WEEK([Date]) gives me

which is the layout I’m after.

Restrict the Year

I created a parameter SELECT YEAR which I listed 2017, 2018, 2019. I then created a new calculated field

FILTER : Year

YEAR([Date]) = [SELECT YEAR]

which I added to the Filter shelf and set to True

I then just had to apply some formatting :

  • Set the mark type to circle and choose a light grey colour, increase the size slightly
  • Set the row banding
  • Lighten the row/column divider lines
  • Set the Q1, Q2 etc label headings to white font (this is the background sheet, so I don’t want them to display at all)
  • Set the Sunday, Monday etc label headings to white font
  • Don’t show the Week Index header
  • Don’t show the Column Number header
  • Hide all other column/row labels
  • Name the sheet Backgound

Building the Foreground Calendar

Using the Superstore dataset, I basically repeated all the steps above, but this time referencing the Order Date field in the data set so

Column Number

CASE MONTH([Order Date]) WHEN 1 THEN 1
WHEN 2 THEN 2
WHEN 3 THEN 3
WHEN 4 THEN 1
WHEN 5 THEN 2
WHEN 6 THEN 3
WHEN 7 THEN 1
WHEN 8 THEN 2
WHEN 9 THEN 3
WHEN 10 THEN 1
WHEN 11 THEN 2
WHEN 12 THEN 3
END

Week Index

DATEPART(‘week’, [Order Date]) – {FIXED DATEPART(‘month’,[Order Date]): MIN(DATEPART(‘week’, [Order Date]))}

FILTER: Year

YEAR([Order Date]) = [SELECT YEAR]

which I again added to the Filter shelf and set to True

When plotted on the sheet in the same way, you can start to see the gaps appearing where there is no sales for that day.

I then applied the following changes

  • Set mark type to circle, colour slightly darker grey than that selected above, added border to circle, and adjusted size slightly to match the other sheet
  • Removed the row banding completely
  • Removed the row/column divider lines
  • Set the Q1, Q2 etc label headings to larger font aligned middle & centre
  • Set the Sunday, Monday etc label headings to darker font, and formatted to just show First Letter
  • Don’t show the Week Index header
  • Don’t show the Column Number header
  • Hide all other column/row labels
  • Labelled the sheet Foreground
  • Added Order Date to the Text shelf, setting it to be Discrete Exact Date, then changed the format to dd/mm. Changed font size to 7 and centre aligned.
  • Added Sales to the tooltip, setting the format to be $ with no decimal places
  • Created a copy of Order Date (right click > duplicate), an added the copy to the tooltip, setting the format to be Wednesday, 14 March 2001
  • Adjusted the tooltip to match
  • Changed the background of the whole sheet from white to ‘None’ – this sets it to be transparent, and is the main trick for this display

Highlight Top 3

To achieve this requirement, I first created a parameter HIGHLIGHT TOP 3 containing the values DAYS, WEEKS, MONTHS

When MONTHS was selected, I needed to find the top 3 months in the year, etc. So I created some calculated fields

Total Monthly Sales

{FIXED YEAR([Order Date]), MONTH([Order Date]): SUM(Sales)}

This stores the total month’s sales against every row

Total Weekly Sales

{FIXED YEAR([Order Date]), WEEK([Order Date]): SUM(Sales)}

This stores the total week’s sales against every row

Value

IF [HIGHLIGHT TOP 3] = ‘MONTHS’ THEN SUM([Total Monthly Sales])
ELSEIF [HIGHLIGHT TOP 3] = ‘WEEKS’ THEN SUM([Total Weekly Sales])
ELSE SUM([Sales])
END

This is basically storing the relevant value I need to consider for the Top 3 based on what was selected in the parameter.

I then also created an additional field

Group By Date

IF [HIGHLIGHT TOP 3] = ‘MONTHS’ THEN DATETRUNC(‘month’,[Order Date])
ELSEIF [HIGHLIGHT TOP 3] = ‘WEEKS’ THEN DATETRUNC(‘week’, [Order Date])
ELSE [Order Date]
END

This captures a date at the relevant level on each row depending on the parameter selection.

In Top 3

From the Group By Date field, I then created a set, which I set to be Top 3 by Value

To see how all these fields interact, build a basic viz with Group By Date on rows (exact date, discrete), and Value on Text. Add In top 3 to rows too. Then add FILTER- Year to the filters shelf, and add to context. This step is crucial to ensure the year filter is applied before the set computes its top 3. Use the parameters to see how the values of Group By Date and Value change

So now you can see how the set is working with the parameter, the set can now be added to the Colour shelf of the Foreground sheet, and the colours adjusted accordingly. The FILTER – Year needs to be added to context on this sheet too.

Building the dashboard

So now the two sheets have been created, the dashboard can be built.

I started by setting the size of the dashboard to 1600 x 1300, adding a text field for the title, and text fields underneath for my standard ‘footer’. I then added the Background sheet into the middle between my title & footer, moving the parameters to form part of the title row. I hid the title of the sheet and set to Fit -> Entire View.

At this point everything on the dashboard is tiled.

I then changed the option to floating, and added my Foreground sheet. As with the background, I hid the title and set to Fit -> Entire View.

I then used the position values of the Background sheet (the x & y position and the height & width), to set the position values of the Foreground sheet to be exactly the same. The intention here was the circles on the Foreground should then be positioned directly over the circles on the Background, and as the Foreground was transparent, the circles that were missing on the Foreground where there were no sales, would show through from the Background sheet, along with the row banding & the row/column lines.

However, this just wouldn’t work as I hoped. I checked the padding options, I shifted things slightly left, right, up & down but when I got some circles lining up, others wouldn’t. It really was quite frustrating and I spent some time trying to fix this, but ended up publishing as it was, which you can see here. In truth I was secretly hoping that by publishing to Tableau Public, it would miraculously work, but it didn’t :-(.

I checked out Ann’s viz and she had managed to get it all to line up beautifully, although the position values on her sheets weren’t exact either, so I’m not sure if this was just trial & error to get right too. I’ll have to watch her solution when it’s published.

So that was attempt 1, but I really wasn’t happy, so came up with an alternative…

Take Two – Join the data at source

For this version, I used Tableau’s ability to join two excel data sources together to create a single data source, with 1 row for every day from 01 Jan 2017 to 31 Dec 2019, supplemented by the relevant superstore sales data against each date if there was a sale on that date.

The data was left outer joined, using Date = Order Date as the joining key

I then created the viz on a single sheet using Dual Axis. To get an axis though, I had to create a Week Index (Date) field and a Week Index (Order Date) field (just as the two Week Index fields described above), but when added to the view, they were set to continuous with the axis reversed to ensure the 0 was at the top rather than the bottom.

All the data required to colour the circles, apply the tooltips etc was added to the 2nd Week Index (Order Date) marks card.

The only difference with this version, is that when you hover on the non-labelled circles, they do ‘highlight’. I can live with this, and felt the solution was much ‘cleaner’ and far less complex 🙂 However it does rely on the fact that the types of data sources I was working with could be joined. In my day job, a lot of the data sources I use are published Tableau Server data sources, and as yet Tableau doesn’t allow these to be joined 😦

This version is published here .

Happy vizzin’

Donna

Can you build a Top N Bar Chart on a Single Worksheet?

So back from my holibobs and catching up on the #WorkoutWednesday challenges I’ve missed while I’ve been sunning myself on a Greek beach.

First up is another guest challenge posted by Jeffrey Shaffer who asked us to create a Top N bar chart of the quantity per manufacturer per region on a single viz. The main twist with this is that a manufacturer could be ranked in different locations for each region. The full challenge is here.

Determine Manufacturer

In the 2019.1 Superstore Sales data set I had on my machine, there is no Manufacturer field. I couldn’t see one in the link to the dataset included on the blog page either. Perhaps I was missing something, but given I was on catch up, I chose not to find out the real reason. Instead I decided to derive my own Manufacturer field by just taking the first word of the Product Name. It meant my quantity figures wouldn’t match up, but the principles behind the challenge wouldn’t be affected.

MID([Product Name],1, FINDNTH([Product Name],’ ‘,1)-1)

Note – I’ve watched Ann’s solution (published on the challenge page), since I published my version, and Ann just lifts a pre-defined Manufacturer group onto her viz, which also already has an ‘Other’ manufacturer. My calculated field above has no such value. Ann’s solution is therefore a bit more straightforward than the work I had to put in to determine ‘Other’.

I interpreted ‘Other’ to be the total quantity of all the manufacturers not in the top n. But of course the top n had to take into account the Region, so it was possible that a manufacturer listed explicitly in the top n for the Central region say, might be included within the ‘other’ pot for another region.

To work this out, I created a quantity field per region ie

Central – Qty

IF [Region] = ‘Central’ THEN [Quantity] END

East – Qty

IF [Region] = ‘East’ THEN [Quantity] END

South – Qty

IF [Region] = ‘South’ THEN [Quantity] END

West – Qty

IF [Region] = ‘West’ THEN [Quantity] END

I then created a set per region to store the top n manufacturers based on the Quantity per region

The set was based on the Manufacturer field, and limited to the Top n Manufacturers parameter I set up, associated to the relevant [Region] – Qty field created above.

I had 4 sets in the end : Top Central, Top East, Top South, Top West

Note – It is entirely possible with this set up, that each set could contain a completely different set of manufacturers, so up to top n X 4 manufacturers in total

So having got a set which stored my top n per region, I needed to get the ‘Other’ grouping. To do this I created a new field

Manufacturer Category

IF [Region] = ‘Central’ AND [Top Central] THEN [Manufacturer]
ELSEIF [Region] = ‘East’ AND [Top East] THEN [Manufacturer]
ELSEIF [Region] = ‘South’ AND [Top South] THEN [Manufacturer]
ELSEIF [Region] = ‘West’ AND [Top West] THEN [Manufacturer]
ELSE ‘Other’
END

With the Top n parameter set to 10, this gives me

There are 12 rows listed, but only 11 values per region (the top 10 + other). You can see here that Wilson is within the top 10 for East & West, but not Central and South, whilst Hon is in top 10 for Central, East & South but not West. This is why there are 12 rows. If I changed the Top n parameter to 15, I’d get 20 rows listed, as there are 19 manufacturers across the top 15 for all the regions + other.

To get the sorting, we need to introduce an old friend :

Index

INDEX().

Adding this as a discrete (blue) pill onto the rows, moving the Manufacturer Category onto Text, and setting the Table Calculation Properties of the Index field to …

  • Compute over both Region and Manufacturer Category (with Region listed first)
  • To restart every Region
  • To sort by SUM of Quantity descending

… I start to get what I need

But I’ve still got 1 too many rows, I need to deal with the ‘Other’ pot. If my top n parameter is set to 10, then I essentially want to be able to show rows 1-10 above if the Include Other parameter is true, or show rows 2-11 if the Include Other parameter is false (Include Other is a boolean parameter I created).

FILTER – Other

NOT([Include Other]) AND [Manfacturer Category] = ‘Other’

Adding this to the Filter shelf and setting to False, basically hides the Manufacturer Category = Other when Include Other = No, or shows Manufacturer Category = Other when Include Other = Yes

So when Include Other = No, I have 10 rows indexed 1-10, and when Include Other = Yes, I have 11 rows indexed 1-11.

I only want the number of rows based on my Top N parameter, so I created an additional field

FILTER – Index

[Index] <= [Top n Manufacturers]

which is added to the Filters shelf and set to true, so only the rows that are less than or equal to the parameter value are displayed.

So we’re nearly there, except Jeff added an additional curveball…

Display Manufacturer & Rank on hover

To tackle this, I needed 3 more calculated field and a set, as we’re going to need Set Actions to handle this.

Index Rank

‘(#’ + STR([Index]) + ‘)’

this simply formats the index value into the format (#1)

Manufacturer + Rank

ATTR([Manufacturer Category]) + ‘ ‘ + [Index Rank]

this concatenates the two fields together – note the ATTR() is needed as Index Rank references the aggregated table calculation of INDEX().

Highlighted Manufacturer

This is a set created by simply right-clicking on Manufacturer Category and selecting Create -> Set, then choosing an arbitrary option (eg Avery). This will be changed via Set Actions later…

LABEL: Manufacturer

IF ATTR([Highlighted Manufacturer]) THEN ([Manufacturer + Rank]) ELSE ATTR([Manfacturer Category]) END

This is basically displaying the manufacturer with the rank if the manufacturer is in the set.

Adding LABEL:Manufacturer into our data table above, you can see that the Avery manufacturer is displayed with the rank, but the others aren’t

To make the value of the set change on hover, the view needs to be added onto a dashboard, and a Set Action created.

Create a dashboard sheet, and add the view. Then select Dashboard -> Actions -> Add Action -> Change Set Values

Set the properties in the dialog as below – the action to run on Hover, the Target Set to the be the Highlighted Manufacturer created above, and remove all set values when you ‘hover off’

Hovering on Eldon, for example, you can see the manufacturer label is changing in the relative positions

So that’s all the building blocks now in place, the viz just needs to be created, which I did by duplicating my table version above, and moving the pills around.

I basically created a synchronised dual axis with Index on rows, Region, MIN(0), SUM(Quantity) on columns. MIN(0) was set to be a Gantt bar and labelled with the LABEL:Manufacturer field set to be left aligned. The SUM(Quantity) field is labelled with the Quantity. Both labels were set to match the mark colour, which was based on Region using the Nuriel Stone palette specified. Headers were hidden and various formatting applied to the grid lines/rows/columns etc.

My version of the viz is here.

I’m off to ponder about where that Manufacturer field came from, as I think it would have all been a lot simpler if it already existed… hmmmm…

Happy vizzin!

Donna