Can you show the top and bottom profit performers for the selected time period?

With #TC21 looming next week, Candra’s set this week’s challenge, based on inspiration from past Tableau Conferences – a simple looking, but effective visualisation for understanding profit performance within some pre-established timeframes.

  • Building the BANs
  • Identifying Top 5 / Bottom 5 / Everything Else
  • Building the Chart and Labelling the Bars
  • Adding the interactivity

Building the BANs

The timeframes we need to report over need to be based on a specific date. In this case it’s the latest date in the data set. If you were using this for a business dashboard, you might be basing it on Today / 1st of the Current Month etc. Rather than hardcode the date I need, I’ve worked out the latest month I want to use by

Max Month

{FIXED:MAX(DATE(DATETRUNC(‘month’,[Order Date])))}

Set all the Order Dates in the data set to be the 1st of the month, then get the maximum of these dates. So as the last date in the data set is 30th Dec 2021, that’s been truncated to 1st Dec 2021 which is then what this field stores.

I then want to capture the profit values for each month, quarter, year into separate fields, so we have

Month

IF DATETRUNC(‘month’, [Order Date])=[Max Month] THEN [Profit] END

This only stores Profit values for rows where the Order Date is also in December

Quarter

IF DATETRUNC(‘quarter’,[Order Date])=DATETRUNC(‘quarter’,[Max Month]) THEN [Profit] END

This only stores Profit values for rows where the Order Date is in the same quarter as December (ie the 4th quarter which is months Oct-Dec).

Year

IF DATETRUNC(‘year’,[Order Date])=DATETRUNC(‘year’,[Max Month]) THEN [Profit] END

This stores Profit data for rows where the Order Date is in the same year.

All these fields are formatted to be $ with 0 dp.

A basic viz can the be built with Measure Names on Columns and Measure Names and Measure Values on Text. The Measure Names heading is then hidden, and the font and table formatting adjusted so the sheet looks as below.

Note – Naming these fields Month, Quarter, Year rather than Monthly Sales, Quarterly Sales etc, makes this display much easier and also helps with the interaction later.

Identifying the Top 5 / Bottom 5 / Everything Else

We need to be able to identify the Sub-Categories which have the best profits, those that have the worst, and the ‘rest’. We’re going to use Sets to help us with this. However the set entries could change depending on whether we’re looking by month, by quarter or by year. So first we need to create a field that is going to store the particular Profit value we need depending on what time period is being selected.

We need a parameter pDatePart to capture the time frame. This is a string field which is just defaulted to the text ‘Month’.

The interactivity later will set this parameter to the different values.

So now we know the ‘selected’ date part, we need to get the appropriate profit value

Value To Plot

CASE [pDatePart]
WHEN ‘Month’ THEN [Month]
WHEN ‘Quarter’ THEN [Quarter]
ELSE [Year]
END

This just uses the values from the 3 measures we created to start with.

So now we can create the sets we need. Right click on Sub-Category > Create > Set and create a set called Top 5 that is based on the Top 5 Value to Plot values

Then create another set in the same way called Bottom 5

With these sets, we can now determine the Sub-Category ‘label’ that will be displayed

Sub-Category Display

IF [Top 5] OR [Bottom 5] THEN [Sub-Category] ELSE ‘Everyone Else’ END

and the grouping that will be used to colour the bars

Sub Cat Group

IF [Top 5] THEN ‘Top 5’
ELSEIF [Bottom 5] THEN ‘Bottom 5’
ELSE ‘Everything Else’
END

Building the Chart & Labelling the Bars

Ok, so now we’ve got the building blocks in place, we can build the chart. You will probably be tempted to build a bar chart (I did to start with), but positioning the labels then became a bit tricksy. When we get to the labels, we’re going to need to use the left and right alignment options. However, when you build a bar chart, if you right align the label, the label will be positioned outside at the end of the bar (even though this seems a little odd with negative values, as it looks to be on the left…).

Right aligned labels

But then we set the labels to be left aligned, the labels appear inside the bar instead, and not outside on the left.

Left aligned labels

So instead, rather than using the bar mark type, we need to build this chart using the gantt mark type, and base the Size on the Value to Plot field.

However, the value being plotted is actually an average value based on the number of Sub-Categories being ‘grouped’ as otherwise the value associated to Everything Else can end up bigger than all the rest. I created the following field

Avg Value To Plot

SUM([Value to Plot])/COUNTD([Sub-Category])

formatted to $ with 0dp.

So now we start building by adding Sub-Category Display to Rows and type in MIN(0) into Columns. Change the mark type to Gantt and add Avg Value To Plot to Size. Add Sub-Cat Group to Colour and adjust accordingly. Sort the Sub-Category Display field by Avg Value To Plot descending.

Now we can’t just label by a single field of the value or the sub-category, as while the ‘automatic’ label alignment option, almost puts the labels in the right positions, there is no way to define an ‘opposite’ to the ‘automatic’ alignment. We need to define some dedicated label fields based on where we want them to display.

Label – Left – Profit

IF [Avg Value To Plot]<0 THEN [Avg Value To Plot]
END

If we’re in the bottom half of the chart, we’re going to display the Profit value on the left side.

Label – Left – Sub Cat

IF [Avg Value To Plot]>=0 THEN ATTR([Sub-Category Display])
END

If we’re in the top half of the chart, we’re going to display the Sub-Category Display on the left side.

Add both these fields to the Label shelf and then adjust the label alignment to be left.

To label the other ends, we need to create two further label fields

Label – RightProfit

IF [Avg Value To Plot]>=0 THEN [Avg Value To Plot]
END

Label – Right – Sub Cat

IF [Avg Value To Plot]<0 THEN ATTR([Sub-Category Display])
END

We then need to create another MIN(0) on Columns (easiest way is to hold down control, then click on the existing MIN(0) field and drag it next to itself to create a duplicate. Then on the 2nd marks card, remove the two Label – Left – xxx fields and add the two Label – Right -xxx fields. Change the alignment to right.

The make the chart Dual Axis and synchronise the axis.

Now you can hide the Sub-Category Display header from showing, hide the axis, remove gridlines etc.

Adding the interactivity

Once the two sheets are on the dashboard, you can add a dashboard parameter action which will on select of the KPI/BAN chart, pass the Measure Name into the pDatePart parameter. When the mark is unselected, the parameter value should stay as it is.

And hopefully, you should now have a working viz. My published version is here.

Enjoy #TC21!

Happy vizzin’! Stay Safe!

Donna

Can you use set actions to create a reference date range?

There’s a lot packed into the challenge this week, which was “set” by Lorna Brown and Erica Hughes to test our Set Action skills. How detailed this blog will go, I have yet to decide… I’ve got a couple of hours to get this nailed, so it could get quite brief as we get towards the end 🙂

I’ve got 6 sheets/charts making up this dashboard, so my intention is to summarise each one, and I’ll define the various calculations that are going to be needed as we go.

  • The overall summary table
  • The selected months summary table
  • The trend line
  • The donut chart
  • The top 3 states table
  • The map
  • Adding the interactivity

The overall summary table

This challenge is focused on understanding the Sales per month. Whilst its possible to use the built in aggregation features of a date field, I often prefer to create explicit date fields at the level I require, so it’s easier to reference. Therefore, the first field I created for this challenge was

Order Date To Plot

DATETRUNC(‘month’, [Order Date])

This essentially ‘groups’ every order placed in a month to be tagged with the 1st of the month. I custom formatted this field to MMM yyyy (ie Oct 21).

For the overall summary table, I need to capture the total sales of the whole data set, and I use a Fixed LoD calculation for this.

Total Sales

{FIXED: SUM([Sales])}

This field is formatted to $0.00M

NOTE – I actually named this field <space>Total Sales<space> as I want to display the name of the field (the measure name) in the summary table, but the ‘selected months’ summary table also has a Total Sales measure which is a different calculation (see later). Adding the <spaces> is a sneaky way to get two fields with what appears to be the same name. As this field when displayed will be centred, the <spaces> aren’t noticeable.

We also need to get the monthly average sales for the whole data set

Average Sales by Month

AVG({FIXED [Order Date To Plot]: SUM([Sales])})

Format this to to $0.0K

We can now build the summary table by adding Measure Names to the Filter shelf and selecting these 2 fields. The placing Measure Names on Rows and Measure Names and Measure Values on Text. Reorder the measures as required, hide Measure Names on Rows and format the Text as required.

Change the title of the sheet to Superstore Sales, ensure the tooltip doesn’t display and remove all gridlines / row banding etc.

The selected months summary table

The core requirement of this challenge is to make use of set actions, so we’re obviously going to need a set which will contain the dates (months) the user will select on the chart. This set will be based off of Order Date To Plot. Right click on the field > Create > Set. Name it Order Dates To Plot Set and by default, select all the months between Oct 20 to Jun 21 inclusive.

Later I’ll describe how the values of this set will get updated, but for now, we need to get some information relating to the sales in these selected months.

Firstly, we want the total sales for the months in this set.

Total Sales

IF [Order Date To Plot Set] THEN [Sales] END

The default format for this field is set to $ with 0 dp.

Note – this is the other ‘total sales’ field mentioned earlier. This field name has no leading/trailing spaces.

To get the average, I needed a field just to store each member of the set (ie each selected month)

Selected Dates

IF [Order Date To Plot Set] THEN [Order Date To Plot] END

and with this I can then work out

Average Sales

AVG({FIXED [Selected Dates]: SUM([Total Sales])})

The final measure required for this section is the change within the date range, which is basically comparing the value of sales at the first month in the selection with the sales in the final month selected. We need a few fields to get to this.

Firstly, we want to identify the first and last months

Min Selected Date

{FIXED:MIN(IF [Order Date To Plot Set] THEN [Order Date To Plot] END)}

If the date is in the set, then return the date and then take the minimum of all the dates, and store against all the rows in the data. Similarly we have

Max Selected Date

{FIXED:MAX(IF [Order Date To Plot Set] THEN [Order Date To Plot] END)}

Putting this info into a table, you can see how the calculations are working. The values for the Min & Max dates are the same across every row.

Next we need to get the Sales at the min & max points, and spread that value across all rows

Sales at Min Date

{FIXED: SUM(IF [Order Date To Plot]=[Min Selected Date] THEN [Sales] END)}

Sales at Max Date

{FIXED: SUM(IF [Order Date To Plot]=[Max Selected Date] THEN [Sales] END)}

Now we can work out the difference

Change within Date Range

([Sales at Max Date]-[Sales at Min Date])/[Sales at Min Date]

format this to a percentage set to 1 dp

Finally, we need to know the number of months in the set, which is displayed in the title of the monthly summary sheet.

Months in Set

{FIXED: COUNTD(IF [Order Date To Plot Set] THEN [Order Date To Plot] END)}

If the date is within the set, then capture the date, and the count the distinct set of dates captured.

Make this a discrete field (move from the measures section at the bottom of the data pane to the dimensions section at the top (above the line), and add to the tabular view

Now we can build the summary sheet.

Add Measure Names to the Filter shelf and this time filter by Total Sales, Average Sales and Change within Date Range. Add Measure Names to Rows and Measure Values to Text. Reorder the measures.

Format the Total Sales to be in $K, by selecting the Format option from the context menu of the Total Sales pill on the Measure Values shelf (hover on the pill and click the carrot/down arrow that appears – by formatting this way, we’re changing the display of this field for this sheet only).

Add Min Selected Date and Max Selected Date to the Detail shelf and set to be Exact Date. Format both these fields via the pill context menu to be the ‘March 2001’ format

Also add Months in Set to the Detail shelf.

Adjust the title of the sheet as below

Finally you need to set the background of the worksheet to the relevant purple (I used #8074a8), adjust the colours of all the fonts to white and adjust the size/style of the fonts in the table. Remove all gridlines/row banding etc, and you should have something like below

The Trend Line

By this point we’ve built all the calculated fields we need for this chart. This is a dual axis line chart, as we want the colour of the line for the selected dates to be different from the non selected ones, and we want to display a label for the highest sales in the selected timeframe.

  • Add Order Date to Plot to Columns, and set as a Continuous (green) pill set to Exact Date
  • Add Sales to Rows
  • Add Total Sales to Rows
  • Make the chart dual axis, and synchronise axis.
  • Adjust the colours of the Measure Names colour legend
  • On the Label shelf of the Total Sales marks card, set to label the maximum value only
  • On the All Marks Card, add Min Selected Date and Max Selected Date to the Detail shelf and set to Exact Date.
  • Right click on the Order Date To Plot axis and Add Reference Line
  • Create a reference band that starts at the constant Min Selected Date, ends at the constant Max Selected Date, is bounded by dotted lines and shaded between
  • Hide the Sales and Total Sales axis, format tooltips and adjust the row & column dividers.
  • Change the title and you should get to

The donut chart

Donut charts are 2 different sized pie charts on top of each other, created using a dual axis chart. On the Rows shelf type in MIN(0). Then type the same next to it. This gives you two axis and two marks cards.

We only care about information related to the selected dates for this chart, so we can add Order Date To Plot Set to the Filter shelf, which by default will just restrict the information to the data ‘in’ the set.

Change the mark type of the 1st MIN(0) marks card to Circle and add Sales to the Tooltip shelf. Adjust the size of the charts/mark.

Change the mark type of the 2nd MIN(0) marks card to Pie and add Sales to the Angle shelf. Add State to the Detail shelf. Sort the State field by Sales descending.

Note – the circles might look the same at this point, but if you hover over the bottom one, you should see that it’s segmented by State.

We need some new fields now to help us identify the top ranking states.

Sales Rank

RANK(SUM([Sales]))

This is a table calculation, so it’s best to see how this field will work in a table view – build one out as below, and set the table calculation on the Sales Rank field as shown

We’re now going to ‘group’ the ranks into the top 3 and everything else

Sales Rank Group

IF [Sales Rank]<=3 THEN [Sales Rank] ELSE 10 END

We can now use this Sales Rank Group field to colour the pie chart. On the 2nd MIN(0) marks card, add Sales Rank Group to the Colour shelf. Adjust the table calculation to compute using State as above, then change the field to be Discrete (blue). Adjust the colours to suit.

Now make the chart dual axis, and synchronise the axis. Adjust the size of the 1st MIN(0) circle to be smaller than the pie. If it’s not showing, right click on the right hand axis and move marks to back. Colour the circle white. Adjust tooltips to suit and hide axis, column/row dividers etc. Update the title. You should have

The top 3 states table

  • Add Order Date To Plot Set to Filter
  • Add State to Rows and Sales to Text and sort descending.
  • Add Sales Rank to Filter and set to At Most is 3. This will just show the top 3 states.
  • Add State to Text
  • Add a Percent of Total Quick Table Calculation to the existing Sales field that’s on the Text shelf (via the context menu of the pill)
  • Add another instance of Sales back onto the Text shelf
  • Adjust / format the font size and layout of the fields on the Text shelf
  • Add Sales Rank to the Size shelf and set to be discrete (blue) and set the mark type to be Text. Adjust the size of the marks – it’s likely it’ll need to be reversed and the range adjusted.
  • Hide the State field on Rows, adjust the font colours, remove row banding and row/column dividers. You should end up with…

The map

  • Add Order Date To Plot Set to the Filter shelf
  • Add State to Detail – this should create a map (edit locations to be US if need be – Map -> Edit Locations menu)
  • Add Sales to the Colour shelf
  • Edit the colour range to a suitable purple range ( I set the darkest colour of the range to #6c638f)
  • Adjust the map layers (Map -> Map Layers) so only the option highlighted below is selected.

Adding the interactivity

Add all the sheets to the dashboard, using vertical and horizontal containers to arrange the relevant layout. Then add a Set Action (Dashboard > Actions > Add Action > Change Set Values). The Set Action should be configured as below :

And fingers crossed, you should now be able to select marks on the trend line and see all the other charts, except the initial summary table, all update. My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Can you find the needle in the haystack?

It was Candra’s turn to ‘set’ the #WOW2021 challenge this week providing a hint in the challenge description that the solution would involve sets.

As with many challenges, I built the data out in tabular format to start with to verify I had all the components and calculations correct. The areas of focus are

  • Identify number of distinct customers per product
  • Identify overall average number of distinct customers per product
  • Identify if product above or below average distinct customers
  • Identify Top 50 products by Sales
  • Identify Unprofitable Products
  • Identify products that are both in the top 50 AND unprofitable
  • Building the viz

Identify number of distinct customers per product

To start off, add Product Name, Sub-Category, Category to the Rows shelf to begin building out a table. Add Sales (formatted to $k 0dp) and Profit (formatted to $k 0dp with negative values as () ) to Text and sort by Sales descending.

To identify the distinct customers per product, we can create

Customer Count per Product

{FIXED [Product Name] : COUNTD([Customer ID])}

Add this to the view.

Identify overall average number of distinct customers per product

What we’re looking for here is the average of all the values we’ve got listed in the Customer Count per Product column. Ie we want to sum up those values displayed and divide by the number of rows.

The number of rows is equivalent to the number of products, which we can get from

Count Products

{FIXED : COUNTD([Product Name])}

And so to get the overall average we calculate

Avg Overall Customer Count

{FIXED: SUM([Customer Count Per Product])} / [Count Products]

Add these fields to the view as well, so you can see how the values work per row. The last two calculations give you the same value across all rows.

Identify if product above or below average distinct customers

Given the above display, this is just a case of comparing values in 2 columns

Higher than Avg Customer Count

AVG([Customer Count Per Product]) > SUM([Avg Overall Customer Count])

this returns true or false – add this to the view too.

Identify Top 50 products by Sales

We can create a set for this. Right click on Product Name > Create > Set. Name the set something suitable eg Top 50 Products, and on the Top tab, state the number (50) and the field (Sales) and the aggregation (Sum)

Add this to the view, and if you’ve sorted by the sales, you should find the top 50 rows are all In the set, and the rest are Out.

Identify Unprofitable Products

We can use another set for this. Again create a set off of Product Name, call it Unprofitable Products, and on the Condition tab, set the condition so that the Sum of Profit is less than 0

Add this onto the view too.

Identify products that are both in the top 50 AND unprofitable

For this, we’re explicitly looking for the rows that are both In the Top 50 Products set and In the Unprofitable Products set.

We can use the Combined Set functionality to do this.

In the left hand data pane, select both the Top 50 Products and the Unprofitable Products sets (hold down ctrl to multi select), then right click and Create Combined Set. I called the set Products to Include, and select to combine the sets by including Shared members in both sets

If you then add this field to the Filter shelf, you will be left with just the 13 Products that match

This is the single filter field you can use as per Candra’s requirements.

Building the viz

To get the text to display to the left of the bar, you actually need to create a ‘fake’ bar chart.

  • Add Products to Include to Filter
  • Add Product Name to Rows
  • On the Columns shelf, double click and type in MIN(1)
  • Add Sales to Columns to the right of MIN(1)
  • Sort by Sales descending

Against the MIN(1) marks card

  • Change the Size to small
  • Set the Opacity of the Colour to 0% and the border to None
  • Add Product Name, Sub-Category and Category to the Label shelf and adjust accordingly, aligning left
  • Increase the height of each row to make the text visible

On the Sales marks card

  • Add Higher than Avg Customer to the Colour shelf and adjust
  • Show mark labels
  • Create a new field Profit Ratio : SUM([Profit])/SUM([Sales]) Format to % with 0dp and add to Tooltip
  • Add Profit, and Customer Count by Product to Tooltip and adjust accordingly

Finally, uncheck Show Header against Product Name and MIN(1) and Sales and format the borders/gridlines etc. Add the title, then add to the dashboard.

All done (I hope…)! My published version is here.

Happy vizzin’! Stay Safe!

Donna

Can you find the top and bottom performers?

The challenge this week came from Candra McRae, where the focus was to use statistics to identify the top & bottom performers, rather than the more common ‘top n’ and ‘bottom n’. By statistics, we’re specifically looking records within the top 25th percentile and the bottom 25th percentile.

So let’s dive in.

  • Identify Current date
  • Defining the calculations
  • Building the chart
  • Month Selector and interaction

Identify Current date

The data we’re using is the Superstore Sales data from 2021.1 which includes data up to 31st Dec 2021. The requirement talks about the current rolling x months worth of data compared to the previous x months worth of data.

This means we need a way to determine what ‘current’ is. Typically, in a real sales environment, you’d probably only have data up to ‘today’, and I did consider working up a solution based on ‘today’, but equally I like to deliver a solution that I know matches the challenger, as it helps to validate my workings, and also I like to have a solution that I can look back on in the future and know there’s data.

So I took Candra’s hint and based ‘current’ off of the maximum date in the data set, derived by

Max Date

DATE({FIXED:MAX([Order Date])})

Defining the calculations

If you’re a regular reader of my blogs, you’ll know when I can, I like to build the data out into a tabular form, so I can verify the calculations, and then I’ll build out the viz.

First up, we want to get a value for the ‘current rolling n months’.

To define ‘n’ we need a parameter.

pRollingMonth

An integer defaulted to 12. It doesn’t need to be a list, as this will be populated via a parameter action from another sheet – more on that later.

Current Rolling Sales

IF [Order Date]>= DATEADD(‘month’, -1 * [pRollingMonth] ,DATETRUNC(‘month’,[Max Date])) AND [Order Date] <= [Max Date] THEN [Sales] END

let’s break this down… DATETRUNC(‘month’,[Max Date]) truncates the Max Date which is 31st Dec 2021 to the 1st of the month ie it returns 01 Dec 2021.

DATEADD(‘month’, -1 * [pRollingMonth] ,DATETRUNC(‘month’,[Max Date])) , is then going back to the 12 months prior (-1×12=-12) , so is 01 Dec 2020.

So we’re only going to get a sales value if the Order Date is >= 01 Dec 2020 and <= 31 Dec 2021 (essentially 13 months of sales data).

For the previous sales, we first need

Prev Month

DATEADD(‘month’, -1, [Max Date])

so in our current example, this will be 30 Nov 2021.

and then to get the previous rolling 12 month sales, we can apply similar logic using Prev Month instead of Max Date

Previous Rolling Sales

IF [Order Date]>= DATEADD(‘month’, -1 * [pRollingMonth] ,DATETRUNC(‘month’,[Prev Month])) AND [Order Date] <= [Prev Month] THEN [Sales] END

Both these fields can be formatted to 1 decimal place, $ prefix and format in thousands (k).

And then we also need a difference to display on the tooltip

Difference

SUM([Current Rolling Sales]) – SUM([Previous Rolling Sales])

This needs to be additionally formatted so that negatives are displayed in brackets ().

Add all these into a table and sort by the Current Rolling Sales descending

So we’ve got the data needed for the bar, the line and the tooltip. We now need to work on the crux of the challenge – the calculations needed to identify the top & bottom.

We’re looking to identify the 25th percentile value based on Current Rolling Sales values displayed on screen

25th Percentile

WINDOW_PERCENTILE(SUM([Current Rolling Sales]),0.25)

and also we need the 75th percentile

75th Percentile

WINDOW_PERCENTILE(SUM([Current Rolling Sales]),0.25)

If you pop these table calcs into the table, you’ll see the values for each field are the same for each row

and with these we can now identify where each row falls

Colour

IF SUM([Current Rolling Sales]) >= [75th Percentile] THEN ‘Top’
ELSEIF SUM([Current Rolling Sales]) <= [25th Percentile] THEN ‘Bottom’
ELSE ‘Middle’
END

Finally we need to identify the rows with a negative difference and flag with a circle.

Sales Contraction Indicator

IF [Difference]<0 THEN ‘●’ ELSE ” END

I use this site to get the symbols for these types of requirements,

Pop these two fields in to the table, and you’ve got all the data needed to build the chart:

Building the chart

Candra states that we can’t use a reference line to display the previous sales data, so for the core chart we need to build a dual axis chart plotting Sub-Category against Current Rolling Sales (bar chart) and Previous Rolling Sales (gantt chart).

Current Rolling Sales is coloured by Colour. I created a Label:Current Rolling Sales field just based on Current Rolling Sales but formatted to 0dp to add to the Label shelf.

To get the circles displayed, and to retain the order of the display, duplicate the Sub-Category field so you have a Sub-Category (copy) field. Add this to Rows alongside the existing Sub-Category field.

Then add the Sales Contraction Indicator field between these 2 fields, and format the font of that field so it is in red text (getting a coloured circle, was the part of this challenge I struggled most over, yet it really was very simple once the penny dropped!).

Then hide the first Sub-Category field (uncheck Show Header) so it no longer displays.

Apply various formatting to remove the row & column lines, gridlines etc, and adjust the tooltip and you should be done.

Month Selector and Interaction

A separate sheet is needed for this. We need to build a basic viz that has 12 data points with values 1-12. And we can get this from the Order Date field

Month Order Date

MONTH([Order Date])

will return the month number

Add this field as a discrete (blue) pill to the Columns shelf, set the mark type to square, and add Month Order Date to the Label shelf too. Colour the mark pale grey, and remove all borders etc, and hide the headers

When you add the 2 sheets onto the dashboard, you need to set a parameter action from the Month Selector sheet that sets the pRollingMonth parameter, using the value from the Month Order Date field. When unselecting, the value should default back to 12.

Hopefully there’s enough here to get you to the end! My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Can you design for an automatic phone layout?

Designing for mobile isn’t something I inherently think about when building a viz, so, this week’s #WOW challenge from Lorna was a useful reminder on the inbuilt functionality Tableau has to make a display typically aimed at a laptop user, also render well on a mobile display with minimal effort.

Tableau allows you to choose a variety of device layouts and define specific instances of how the sheets should display on each, but this challenge was focused on the Automatic Phone Layout option, where the aim was to build a 1200×800 dashboard, which automatically rendered vertically when viewed on mobile. Lorna referenced this Tableau blog as a starting point, which is definitely worth a read.

YTD Summary by Sales

In the requirements, Lorna mentions ‘last 2 years YTD’, but given the dataset used contains years 2016-2019, and the challenge wasn’t really focused on this area, I just presented sales for the whole year as follows :

  • YEAR(Order Date) to Rows sorted by Order Date descending (so 2019 is listed first)
  • SUM(Sales) on Text with a Quick Table Calculation of Percent Difference. By default, this calculation will work ‘down’ the table, so we need to edit the table calc so it is relative to the Next record, rather than Previous.
  • Format this field using custom formatting of ▲0%;▼0% ( I use this site to copy & paste the shape images from)
  • Add another copy of SUM(Sales) to Text and then modify the text to format as required.
  • To only display Years 2018 & 2019 either
    • Filter the Order Date to exclude 2016, and additional filter the SUM(Sales) % difference table calculation to only show non-null values
  • or select year 2016 and 2017, right-click and choose Hide
  • Add YEAR(Order Date) to Colour and adjust to suit
  • Apply formatting to remove row/column/grid lines/ row banding. hide labels and rotate headings
  • Change the title of the sheet to YTD

Repeat all of this to create a version for Profit.

Sales Trend Line

This is pretty much Desktop 101 – one of the first charts you’re likely to build when training on Desktop 🙂 So here’s just the picture 🙂

Again repeat to create a version for Profit.

Top 10 Products by Sales Bar Chart

In the requirement Lorna stated (and displayed a title for) Top 10 Customers by Sales, but the chart displayed in her solution showed Products.

I chose therefore to show the same as Lorna (with a corrected title). The instruction in the requirements was also to use Sets. So to do this

Top 10 Products by Sales

Right click on Product Name, select Create -> Set and adjust as below

Then to build the chart

  • Add Sales to Rows
  • Product Name to Columns, sorted by Sales descending
  • Sales to Text
  • Order Date to Filter set to 2018 & 2019 only
  • Top 10 Products by Sales to Filter

At this point you might find you only have 9 rows displayed, and this is because the 10 largest product sales are considering all years, before then filtering by the years. We need to set the Order Date filter to be Added to Context (right-click on pill). This will force Tableau to filter the data by the relevant years. and then work out the Top 10. When added to context the pill will be grey rather than blue on the filter shelf.

Apply formatting as required

Top & Bottom 10 Products by Profit

Similar to above, I created a Top 10 Products by Profit set and also a Bottom 10 Products by Profit set ( just change the ‘Top’ drop down to select Bottom).

Top & Bottom Products by Profit

To the get a Top 10 & Bottom 10 set, you can create a combined set. I can’t recall if I’ve ever come across this before… I think it was probably part of some #WoW a long time ago… it certainly isn’t something I use regularly.

Right click on one of the Profit sets created and select Create Combined Set and choose your sets in the presented dialog. In this instance we want all members from both sets.

To build the chart

  • Add Profit to Rows
  • Product Name to Columns, sorted by Profit descending
  • Profit to Text
  • Order Date to Filter set to 2018 & 2019 only and added to context
  • Top & Bottom Products by Profit to Filter

We need a ‘header’ to label the Top & Bottom. I added Top 10 Products By Profit to Rows, and placed it in front of Product Name. This works since the sets are mutually exclusive – no product can be in both the top 10 and the bottom 10. This will display header values of In or Out, which I then aliased to display Top 10 or Bottom 10 (right clicked on ‘In’ and Edit Alias). Then just format to suit.

Building the Dashboard

We’ve got all the component parts , so now we want to add to the dashboard in such a way that the sheets will display vertically when viewed on mobile, even though the requirements stipulate a 1200 x 800 ‘desktop’ type layout.

The requirements also state to use the Automatic Phone Layout option only, which means you shouldn’t have a need to ‘fiddle’ with how the components are displayed when in mobile view.

Basically you’re building on the Default view

but when you click the Phone option, the padlock remains ‘locked’ ie the ‘automatic’ view, and all the objects display in the right order on top of each other.

The key to this, is knowing the fact that the automatic phone layout follows an A-Z approach – across the page from left to right, then down and across from left to right again.

I arranged my objects in rows as

  • Row 1 = Title
  • Row 2 = Heading
  • Row 3 = Sales YTD, Sales Trend, Top 10 Products
  • Row 4 = Heading
  • Row 5 = Profit YTD, Profit Trend, Top & Bottom 10 Products

All charts were set to Fit Entire View, and I didn’t spend any time creating specific layout containers – I just dropped the objects in the places I wanted.

This subsequently presented as below when I clicked the Phone option on the Dashboard tab

The part I found most tricky was getting the Top & Bottom 10 Products chart to display the detail I could see when viewing Lorna’s solution. I simply ended up needing to adjust the font sizes to be size 6 on the labels and the displayed product names, and moving the column sizer so I had enough of the bar to display.

My published version is here.

Happy vizzin’! Stay safe!

Donna

Which products are most profitable?

I’m starting to write this blog with a bit of uncertainty today as I’m not ultimately sure where I’m going to end up….

I wasn’t even sure I was going to pen an entry this week…. I found Luke’s challenge tough, and the path I took full of multiple wrong turns, that meant trying to write out a comprehensible ‘how I did it’ quite tricky, as reviewing now I’m questioning ‘what did I do that for…’.

Whilst with perseverance and a bit of inspiration from Rob Saunders, I did manage to post a working solution, I knew I wasn’t overly happy with it, particularly because my expand and collapse functions didn’t behave as I saw others do… I had to click twice to collapse.

Before starting to write, I decided to check out Luke’s workbook which he’d finally published to see if I could understand where I was going wrong.

I’d used both a set action and parameter action in my attempt; Luke had just a parameter action. In trying to understand why I ended up with a set action and see if I could do without it, my solution gradually started unravelling, as various calculated fields needed changing.

So, I think the best way to approach this blog is to rebuild my solution from scratch, using only my existing workbook as a reference, and I will attempt to write and screen shot as I build. This could take some time, and I have no idea how successful I’ll be…. I may well get to a point where I’ve taken a wrong turn again, and everything I’ve written needs to be scrapped… at which point I may just have to say ‘sorry, I tried!’……

Ok, let’s get cracking.

Top N Sub-Categories

At the ‘first’ level of the table, we need to display the top n sub-categories ordered by Profit. Those not in the top n should be displayed under an ‘All Others’ grouping, and always displayed at the bottom. The top n can vary based on the user input.

We need a parameter to define the value of the top n.

SUB-CATEGORIES TO SHOW

This is an integer parameter ranging from 1-10, defaulted to 3. I’ve titled it exactly as displayed on the output, so it’s one less change to make later (hence the capitals).

The easiest way to group the sub-categories into those in the top n, is to use a set. Right click Sub-Category and Create -> Set.

Top N SubCats by Profit

Use the Top tab to define the rules for which Sub-Category to include, referencing the SUB-CATEGORIES TO SHOW parameter and the Profit field as shown below.

SubCat Group

IF [Top N SubCats by Profit] THEN [Sub-Category] ELSE ‘All Others’ END

If the Sub-Category is in the set then the name will display, otherwise it will be grouped under the ‘All Others’.

Add SubCat Group to Rows and change the sort on the pill to sort by Field = Profit desc

This will result in All Others being listed at the top, but to resolve that, drag the Top N SubCats by Profit set onto the Rows an place in front of the SubCat Group.

Let’s get some of the measures we need out onto the canvas.

Margin

SUM([Profit])/SUM([Sales])

format to percentage with 0dp.

Add Margin, Profit & Sales onto the sheet.

Top N Products

The next level in the table displays the Top N products per SubCat Group based on their margin. For those not in the Top N, the products should be grouped under ‘All Others’, and listed at the bottom. The Top N is once again defined by a user input via a parameter.

PRODUCTS TO SHOW

Once again an integer parameter ranging from 1-10 but defaulted at 5 this time.

Unfortunately, this time we can’t use a set to define our Top N grouping. This is because the set will only consider the Top N across all Products and will not consider the fact the Products are nested per Sub-Category grouping. So we need to come up with an alternative. I did do a bit of research to find ideas, and found this article by Emma Whyte at The Information Lab to get me started : Showing a Nested Top N with Other in Tableau.

The Top N needs to be based on the order of the Margin, or the rank.

Margin Rank

RANK_UNIQUE([Margin])

Change this to be a Discrete field.

This will give us a unique ‘number’ per row displayed based on the value of Margin. If records have the same Margin value, using RANK_UNIQUE will mean they get a different rank number (as opposed to how other ranking functions work). The table calculation of INDEX() could work just as well. NOTE – there is potential though that when the Margin values are the same for different products, what makes the Top N and what doesn’t may differ, so you might find you get a slightly different list from some of the solutions you see.

Add Margin Rank and Product Name onto the sheet, and edit the table calculation on Margin Rank, to compute by Product Name only

We now want to group the Product Name based on the Top N products.

Product Name Group

IF [Margin Rank] <= MIN([PRODUCTS TO SHOW])
THEN ATTR([Product Name])
ELSE ‘All Others’ END

We also only want to show (PRODUCTS TO SHOW + 1) rows per SubCat Group ie if PRODUCTS TO SHOW = 5, we want to display 6 rows per SubCat Group, where the 6th row displays ‘All Others’. The 6th row also needs to show the Profit, Sales and Margin values associated to all the Products in the ‘All Others’ Product Name Group.

So we need to calculate some new fields that will store a revised value for Proft, Sales and Margin, depending what row we’re working with.

Sales For Others

WINDOW_SUM(IF [Margin Rank] > [PRODUCTS TO SHOW] THEN SUM([Sales]) END)

If the Product Name isn’t in the Top N, then get it’s Sales value, and then sum all of those rows that meet the same condition.

If you put this onto the sheet, and set the table calculation for each of the nested calculations (Sales for Others & Margin Rank) to be by Product Name only, you’ll see that the value displayed in every row for each SubCat Group is the sum of the values associated to the rows in the ‘All Others’ group.

So now we need a field that’s either going to display the Sales for the 1 product or the sales for the group of products, depending on what row we’re on.

Grouped Sales

IF [Margin Rank]<= MIN([PRODUCTS TO SHOW]) THEN SUM([Sales]) ELSE [Sales For Others] END

Again the table calculation settings need to be set to compute by Product Name only. This is the field we ultimately want to display, so it needs to be formatted accordingly. The Sales and Sales For Others fields can be removed.

Along similar lines, we need

Profit For Others

WINDOW_SUM(IF [Margin Rank] > [PRODUCTS TO SHOW] THEN SUM([Profit]) END)

Grouped Profit

IF [Margin Rank]<= MIN([PRODUCTS TO SHOW]) THEN SUM([Profit]) ELSE [Profit For Others] END

Margin For Others

IF [Margin Rank] > [PRODUCTS TO SHOW] THEN [Profit For Others]/[Sales For Others] END

Grouped Margin

IF [Margin Rank] <= MIN([PRODUCTS TO SHOW]) THEN [Margin] ELSE [Margin For Others] END

When added to the sheet, once again make sure all the table calculation properties for all the nested calculations are set to compute by Product Name only.

As mentioned above, we only want to show PRODUCTS TO SHOW + 1 rows, so let’s create a field we can filter by :

Show?

[Margin Rank] <= MIN([PRODUCTS TO SHOW]) + 1

Add this to the Filter shelf and set to True. Once again the table calculation needs to be set to compute by Product Name only.

Add Totals

We want a grand total and subtotals only at the level of SubCat Group. Add the totals by Analysis -> Totals -> Show Column Grand Totals to get the overall total. Then on the SubCat Group pill, click and add SubTotals

We’ve now got the main components of the table. We now need to get the interactivity working to allow the expand / contract on arrow selection.

Expand / Contract All Others Product Group

First up, we’ll just tidy up our table display

  • Hide the In/Out Top N SubCats By Profit field
  • Hide the Margin Rank field
  • Hide the Product Name field.
  • Remove the Margin field
  • Alias the Grouped Margin, Grouped Profit, Grouped Sales fields
    • Right click and Edit Alias. Name the field ‘ Margin ‘ (note the trailing and leading spaces. You can’t alias just as Margin as a field already exists with that name, but the spaces make it think it’s a new name.

We need an additional field that will store our ‘arrow’ icon. We’re going to revisit this field. For starters

Product Group Header

IF [Margin Rank] <= [PRODUCTS TO SHOW] THEN ”
ELSE ‘►’
END

I use this site to get my icon characters from. Add this field to the sheet after the SubCate Group pill, once again setting the table calculation to compute by Product Name.

In the dashboard, the aim is to click on an arrow associated to a single SubCat Group, which will expand the Product Group Name field to display the actual Product Name (rather than ‘All Others’) with their associated Margin, Sales & Profit values, and also show a ▼ icon.

This will be achieved using Parameter Actions, for which we need a parameter :

Selected Sub Category Group

String parameter defaulted to ”

Display this parameter on the sheet, as we can start to test the interactivity ‘manually’ without the need for the dashboard. What the dashboard action will do is on ‘click’, it will be set to populate the value of this parameter with the associated SubCat Group value. We can then do some checks based off of this and set various fields accordingly. It means we need to revisit some of the fields.

First up let’s set the arrow….

Edit Product Group Header to be

IF [Selected SubCat Group] = MIN([SubCat Group]) THEN
IF [Margin Rank] > [PRODUCTS TO SHOW] THEN ‘▼’ ELSE ” END
ELSEIF [Margin Rank] <= [PRODUCTS TO SHOW] THEN ”
ELSE ‘►’
END

Test this by entering the value of ‘Copiers’ into the Selected SubCat Group parameter. The arrow against ‘All Others’ should change.

We also need to change the value of the Product Name Group to show the actual Product Name on selection, so

edit the Product Name Group

IF [Margin Rank] <= MIN([PRODUCTS TO SHOW])
THEN ATTR([Product Name])
ELSE
IF [Selected SubCat Group] = MIN([SubCat Group])
THEN ATTR([Product Name])
ELSE ‘All Others’ END
END

Again test this out by changing the value in the parameter.

But we need to make more rows show too, so

edit Show?

([Margin Rank] <= MIN([PRODUCTS TO SHOW]) + 1) OR (MIN([SubCat Group]) =[Selected SubCat Group])

The values of our measures are still the totals though, so we need to edit these fields to

Grouped Sales

IF ([Margin Rank]<= MIN([PRODUCTS TO SHOW]) OR ([Selected SubCat Group] = MIN([SubCat Group])) ) THEN SUM([Sales]) ELSE [Sales For Others] END

Grouped Profit

IF ([Margin Rank]<= MIN([PRODUCTS TO SHOW]) OR ([Selected SubCat Group] = MIN([SubCat Group])) ) THEN SUM([Profit]) ELSE [Profit For Others] END

Grouped Margin

IF ([Margin Rank] <= MIN([PRODUCTS TO SHOW]) OR ([Selected SubCat Group]=MIN([SubCat Group]))) THEN [Margin] ELSE [Margin For Others] END

However while this works if you play with setting and clearing the parameter on the sheet, it won’t quite fully work if added as a dashboard action, as while the action can set the parameter we can’t ‘clear it’.

We need to ‘tie’ the parameter action to another field

SubCat Group for Reset

IF [Selected SubCat Group] = MIN([SubCat Group]) THEN ”
ELSE MIN([SubCat Group])
END

Add this to the Detail shelf (if you want to see how it changes based on the parameter value, add it to the Rows and test changing the parameter).

It needs to be on the sheet so it can be referenced from the dashboard action.

Adding the Action

Add the sheet to a dashboard, then add Parameter Action as below where the Target Parameter is Selected SubCat Group and the field it references is SubCat Group for Reset

And after all that, you should have a working solution. Phew!

Pretty pleased I got there without taking a detour 🙂 The table just now needs various formatting applied, which I’m going to leave to you to do 🙂 Just tweet me if you’re having problems!

The parameter action may be confusing you a bit – it took a while to really get my head round it, so I’ve tried to explain this a bit more below…

How the parameter action works

On initial load of the sheet, the Selected SubCat Group parameter is blank. So for the SubCat Group = Copiers, the SubCat Group for Reset will also be Copiers as SubCat Group is not the same as Selected SubCat Group parameter. Other fields are also set based on the fact these two fields aren’t the same (like the arrow pointing to the right etc).

When the right arrow is clicked on the dashboard, the value of SubCat Group for Reset is used to populate the Selected SubCat Group parameter. So in this example, Selected SubCat Group will now contain the value Copiers. As the Selected SubCat Group parameter is now the same as the SubCat Group, various fields change their behaviour (like the arrow now points down, and more rows are displayed). But also, the value of the SubCat Group for Reset is also changed; as the SubCat Group is the same as the Selected Sub Cat Group parameter, SubCat Group for Reset now contains a blank string.

So at the point the down arrow is now clicked again on the dashboard, the value of the SubCat Group for Reset is again used to populate the Selected SubCat Group parameter. As SubCat Group for Reset is blank, then the parameter will now be populated with a blank value, and so all the login in the fields will be based on the fact that SubCat Group is not the same as the Selected SubCat Group parameter, and the table will display just as it did on first load.

Hope that helps to demystify what’s going on… it’s certainly helped me!

Thanks for sticking with me if you got this far 🙂

The version of the challenge I built while writing this (my 3rd published version) is here.

Happy vizzin’! Stay Safe!

Donna

Can you build a Sales calendar with top 3 highlighting?

An interesting challenge set by Ann this week, which seemed initially to me that it should be quite straightforward, but as usual ended up causing all sorts of frustrations.

The trickiest part was how to show marks for days when there weren’t any sales. From my initial observations of Ann’s viz (hovering over marks etc), I’d figured out that she had made use of the ability to set a sheet’s background as transparent and then float one sheet over the other. This was because when I hovered over the circles with no days displayed, the mark didn’t ‘highlight’ in any way, indicating that it was ‘behind’ something else.

So with this thought in mind, I went about trying to build the basic viz – a calendar view with all days for the background, and a calendar view with just the days with sales to layer over the top.

Ann didn’t state any requirements about how this could be tackled, so I assumed she hadn’t used any sort of date scaffolding (ie using another data set containing all the dates). I therefore tried to do the same, and did end up with something, but it just felt too complex, and wasn’t really ‘future’ proof for a real world scenario, as it relied on each day across a year existing in the dataset being used, regardless of year (ie there was an entry for 1st Jan, 2nd Jan etc all the way through to 31st Dec, but it could be 1st Jan 2016, 2nd Jan 2018 etc). Knowing I had to blog about what I’d done, I decided to take a step back, and approach the challenge based on what I would have done if I was asked to do something similar for work.

I still wanted to incorporate the transparency/floating sheets requirement though, since this was a piece of functionality I believe Ann was trying to demonstrate the use of in this challenge.

Date Scaffold

I first needed to create a data source to use as my date scaffold. This is simply an excel sheet containing 1 column, Date, with 1 row per day ranging from 01 Jan 2017 through to 31 Dec 2019.

I created this, saved the sheet and then connected Tableau Desktop to it. This will provide the data for my background calendar.

Building the Background Calendar

The overall calendar needs to be organised into 3 columns (based on months) and 4 rows (based on quarters). Within each column, there is then a column per day of week and within each row there is a row per week of the year. Essentially we’re building a trellis type chart.

Column Number

When building trellis charts, there’s all sorts of clever techniques to flex the number of rows & columns based on how many ‘entities’ in the data you’re trying to organise. In this instance we’re dealing with months and we know the data is going to be static, so I just created a simple calculated field to determine which column each month should sit in :

CASE MONTH([Date]) WHEN 1 THEN 1
WHEN 2 THEN 2
WHEN 3 THEN 3
WHEN 4 THEN 1
WHEN 5 THEN 2
WHEN 6 THEN 3
WHEN 7 THEN 1
WHEN 8 THEN 2
WHEN 9 THEN 3
WHEN 10 THEN 1
WHEN 11 THEN 2
WHEN 12 THEN 3
END

This field was then added to the columns shelf, and next to it I added WEEKDAY([Date]) .

Top Tip – by right clicking on a Date field in your dimensions pane, and dragging onto the columns/rows shelf, you get a date selection dialog displayed when you release the mouse, so you can quickly choose the type of date you want to display

By default, as I’m based in the UK, my weeks are set to start on Mondays, but the display starts on a Sunday. To fix this, right click on the datasource and select Date Properties

Onto the rows, I added QUARTER([Date]) using same technique as above.

Now I need to get the weeks to display, but if I just add WEEK([Date]) to rows, I get too many rows for each quarter

What I want is to be able to show weeks 6-9 and 10-14 on the same rows as weeks 1-5. To achieve this I created

Week Index

DATEPART(‘week’, [Date]) – {FIXED DATEPART(‘month’,[Date]): MIN(DATEPART(‘week’, [Date]))}

The FIXED part of the calculation is finding the minimum week number for each month. So for January, the minimum week number is 1, for February it’s 5, for March it’s 9. That minimum number is then being subtracted from each week number, so for

  • week 1 (in Jan), I have 1-1 = 0
  • week 2 (in Jan), I have 2-1 = 1 etc
  • week 5 (in Feb) I have 5-5 = 0
  • week b (in Feb) I have 6-5 = 1 etc

Adding Week Index to Rows rather than WEEK([Date]) gives me

which is the layout I’m after.

Restrict the Year

I created a parameter SELECT YEAR which I listed 2017, 2018, 2019. I then created a new calculated field

FILTER : Year

YEAR([Date]) = [SELECT YEAR]

which I added to the Filter shelf and set to True

I then just had to apply some formatting :

  • Set the mark type to circle and choose a light grey colour, increase the size slightly
  • Set the row banding
  • Lighten the row/column divider lines
  • Set the Q1, Q2 etc label headings to white font (this is the background sheet, so I don’t want them to display at all)
  • Set the Sunday, Monday etc label headings to white font
  • Don’t show the Week Index header
  • Don’t show the Column Number header
  • Hide all other column/row labels
  • Name the sheet Backgound

Building the Foreground Calendar

Using the Superstore dataset, I basically repeated all the steps above, but this time referencing the Order Date field in the data set so

Column Number

CASE MONTH([Order Date]) WHEN 1 THEN 1
WHEN 2 THEN 2
WHEN 3 THEN 3
WHEN 4 THEN 1
WHEN 5 THEN 2
WHEN 6 THEN 3
WHEN 7 THEN 1
WHEN 8 THEN 2
WHEN 9 THEN 3
WHEN 10 THEN 1
WHEN 11 THEN 2
WHEN 12 THEN 3
END

Week Index

DATEPART(‘week’, [Order Date]) – {FIXED DATEPART(‘month’,[Order Date]): MIN(DATEPART(‘week’, [Order Date]))}

FILTER: Year

YEAR([Order Date]) = [SELECT YEAR]

which I again added to the Filter shelf and set to True

When plotted on the sheet in the same way, you can start to see the gaps appearing where there is no sales for that day.

I then applied the following changes

  • Set mark type to circle, colour slightly darker grey than that selected above, added border to circle, and adjusted size slightly to match the other sheet
  • Removed the row banding completely
  • Removed the row/column divider lines
  • Set the Q1, Q2 etc label headings to larger font aligned middle & centre
  • Set the Sunday, Monday etc label headings to darker font, and formatted to just show First Letter
  • Don’t show the Week Index header
  • Don’t show the Column Number header
  • Hide all other column/row labels
  • Labelled the sheet Foreground
  • Added Order Date to the Text shelf, setting it to be Discrete Exact Date, then changed the format to dd/mm. Changed font size to 7 and centre aligned.
  • Added Sales to the tooltip, setting the format to be $ with no decimal places
  • Created a copy of Order Date (right click > duplicate), an added the copy to the tooltip, setting the format to be Wednesday, 14 March 2001
  • Adjusted the tooltip to match
  • Changed the background of the whole sheet from white to ‘None’ – this sets it to be transparent, and is the main trick for this display

Highlight Top 3

To achieve this requirement, I first created a parameter HIGHLIGHT TOP 3 containing the values DAYS, WEEKS, MONTHS

When MONTHS was selected, I needed to find the top 3 months in the year, etc. So I created some calculated fields

Total Monthly Sales

{FIXED YEAR([Order Date]), MONTH([Order Date]): SUM(Sales)}

This stores the total month’s sales against every row

Total Weekly Sales

{FIXED YEAR([Order Date]), WEEK([Order Date]): SUM(Sales)}

This stores the total week’s sales against every row

Value

IF [HIGHLIGHT TOP 3] = ‘MONTHS’ THEN SUM([Total Monthly Sales])
ELSEIF [HIGHLIGHT TOP 3] = ‘WEEKS’ THEN SUM([Total Weekly Sales])
ELSE SUM([Sales])
END

This is basically storing the relevant value I need to consider for the Top 3 based on what was selected in the parameter.

I then also created an additional field

Group By Date

IF [HIGHLIGHT TOP 3] = ‘MONTHS’ THEN DATETRUNC(‘month’,[Order Date])
ELSEIF [HIGHLIGHT TOP 3] = ‘WEEKS’ THEN DATETRUNC(‘week’, [Order Date])
ELSE [Order Date]
END

This captures a date at the relevant level on each row depending on the parameter selection.

In Top 3

From the Group By Date field, I then created a set, which I set to be Top 3 by Value

To see how all these fields interact, build a basic viz with Group By Date on rows (exact date, discrete), and Value on Text. Add In top 3 to rows too. Then add FILTER- Year to the filters shelf, and add to context. This step is crucial to ensure the year filter is applied before the set computes its top 3. Use the parameters to see how the values of Group By Date and Value change

So now you can see how the set is working with the parameter, the set can now be added to the Colour shelf of the Foreground sheet, and the colours adjusted accordingly. The FILTER – Year needs to be added to context on this sheet too.

Building the dashboard

So now the two sheets have been created, the dashboard can be built.

I started by setting the size of the dashboard to 1600 x 1300, adding a text field for the title, and text fields underneath for my standard ‘footer’. I then added the Background sheet into the middle between my title & footer, moving the parameters to form part of the title row. I hid the title of the sheet and set to Fit -> Entire View.

At this point everything on the dashboard is tiled.

I then changed the option to floating, and added my Foreground sheet. As with the background, I hid the title and set to Fit -> Entire View.

I then used the position values of the Background sheet (the x & y position and the height & width), to set the position values of the Foreground sheet to be exactly the same. The intention here was the circles on the Foreground should then be positioned directly over the circles on the Background, and as the Foreground was transparent, the circles that were missing on the Foreground where there were no sales, would show through from the Background sheet, along with the row banding & the row/column lines.

However, this just wouldn’t work as I hoped. I checked the padding options, I shifted things slightly left, right, up & down but when I got some circles lining up, others wouldn’t. It really was quite frustrating and I spent some time trying to fix this, but ended up publishing as it was, which you can see here. In truth I was secretly hoping that by publishing to Tableau Public, it would miraculously work, but it didn’t :-(.

I checked out Ann’s viz and she had managed to get it all to line up beautifully, although the position values on her sheets weren’t exact either, so I’m not sure if this was just trial & error to get right too. I’ll have to watch her solution when it’s published.

So that was attempt 1, but I really wasn’t happy, so came up with an alternative…

Take Two – Join the data at source

For this version, I used Tableau’s ability to join two excel data sources together to create a single data source, with 1 row for every day from 01 Jan 2017 to 31 Dec 2019, supplemented by the relevant superstore sales data against each date if there was a sale on that date.

The data was left outer joined, using Date = Order Date as the joining key

I then created the viz on a single sheet using Dual Axis. To get an axis though, I had to create a Week Index (Date) field and a Week Index (Order Date) field (just as the two Week Index fields described above), but when added to the view, they were set to continuous with the axis reversed to ensure the 0 was at the top rather than the bottom.

All the data required to colour the circles, apply the tooltips etc was added to the 2nd Week Index (Order Date) marks card.

The only difference with this version, is that when you hover on the non-labelled circles, they do ‘highlight’. I can live with this, and felt the solution was much ‘cleaner’ and far less complex 🙂 However it does rely on the fact that the types of data sources I was working with could be joined. In my day job, a lot of the data sources I use are published Tableau Server data sources, and as yet Tableau doesn’t allow these to be joined 😦

This version is published here .

Happy vizzin’

Donna

Can you create a BCG growth share matrix with Sales and Orders?

Week 39 of #WorkoutWednesday was set by Lorna this week. Details of the challenge and a description of what a BCG Growth Share Matrix is here.

The main complexity of this challenge is the calculations, and more so with Lorna’s stipulation ‘no table calcs allowed’.

I started using Tableau before LoDs were introduced, so I tend to be much more comfortable with Table Calcs than I do with LoDs; I know there are many people who are the other way round.

So, let’s get started. In the interest of time (it’s now Sunday afternoon, and I’ve dipped into this challenge on & off since Wednesday), I’m going to focus on the calcs involved in achieving my solution, and not necessarily going into the detail of putting it all together on the Viz.

Sales Scatter

To build the Sales Scatter Plot, we need the following information to help get the %growth and %market share values :

  • The total sales across all the specified years per subcategory
  • The sales for the latest year per subcategory
  • The total sales for the previous years per subcategory
  • The overall total sales across all the specified years

The number of years to compare against is set by a parameter to be 2,3,or 4 years (No.of Years).

To limit the data just to these years, I created a Dates To Include as

YEAR([Order Date]) >= [Max Year in Dataset] – ([No of Years]-1)

Max Year in Dataset is another field that I could have simply hardcoded to 2019, but I decided to be more dynamic, and derive it using an LoD

YEAR({FIXED: MAX([Order Date])})

which gets the latest Order Date in the data and then retrieves the Year part of it, which happens to be 2019 in this instance. Dates To Include is then True if the year of the order date is greater than, or equal to, the maximum year less one less than the No of Years param.

Eg if No of Years = 2, then Dates to Include is True if the year of the order date >= 2019 – (2-1) = 2019-1 = 2018.

if No of Years = 4, then Dates to Include is True if the year of the order date >= 2019 – (4-1) = 2019-3= 2016.

To sense check the numbers I need, I built a tabular view by Sub Category adding Dates to Include = True to the filters shelf.

Total sales across all the specified years per subcategory

is simply achieved by adding SUM([Sales]) to the view.

Sales Latest Year

The sales for the latest year per subcategory is

IF YEAR([Order Date]) = [Max Year in Dataset] THEN [Sales] END

Sales Previous Years

The sales for the previous years per subcategory is then just the opposite

IF YEAR([Order Date]) <> [Max Year in Dataset] THEN [Sales] END

The value of this field ultimately changes as the No of Years parameter changes. If No of Years = 2, then this will contain the 2018 sales, if its 3, then it will contain the sum of the sales in 2017 & 2018 etc.

The Growth % is stated as being the change for the latest year in comparison to the average sales of the previous years. So next I need

Average Sales Previous Years

SUM([Sales Previous Years])/([No of Years]-1)

Note the [No of Years]-1, as when comparing across 2 years, I’m really only being asked to compare 2019 with the previous year, ie 2018.

Now I can find the Sales Growth %

Growth Sales

(SUM([Sales Latest Year])-[Average Sales Previous Years])/[Average Sales Previous Years]

The difference of the latest year compared to the average, as a proportion of the average.

Now we need the Market Share of Sales. This would usually be a pretty simple Percent of Total quick table calculation on SUM([Sales]), but I’ve been challenged to do this without table calcs. So an LoD is needed instead. I created

Sales All Years

{FIXED [Dates to Include]:SUM([Sales])}

which gives me my total sales across the specified years, at the ‘overall’ level.

From this I can calculate

Market Share Sales

SUM([Sales])/SUM([Sales All Years])

The table below gives the breakdown of all these fields at a sub-category level for 2 years

Next step is to categorise each row into Cash Cows, Dogs, Question Marks or Stars based on their % Growth & % Market Share values. Lorna stated that 7% should be considered the mid-point for % Market Share, but that the Growth mid point should be half of the maximum growth value.

From the table above, 64.8% (against appliances) is the maximum value in the set, so I need a new field against each row that outputs 32.4%. Once again, with Table Calcs I would just have used WINDOW_MAX to find the max and halve it. But instead I needed another LOD, this one a bit more complex…

Mid Sales Growth Constant (Sub Cat)

({EXCLUDE [Sub-Category]:MAX({INCLUDE [Sub-Category]: (SUM([Sales Latest Year])-[Average Sales Previous Years])/[Average Sales Previous Years]})})/2

I find INCLUDE and EXCLUDE LoDs a bit of a black art, and there may well have been another way to do this, but it was a bit of trial and error in the end, and so once it gave the value I was after, I stuck with it 🙂 I’ve tried to write some sentences to explain it better, but can’t 😦

So with the mid point defined, the categorisation is then

Sales Category (Sub Cat)

IF [Growth Sales]<=MIN([Mid Sales Growth Constant (Sub Cat)]) AND [Market Share Sales]<=MIN([Mid Market Share Constant]) THEN ‘Dogs’ ELSEIF [Market Share Sales]<=MIN([Mid Market Share Constant]) AND [Growth Sales]>MIN([Mid Sales Growth Constant (Sub Cat)]) THEN ‘Question Marks’
ELSEIF [Growth Sales]>MIN([Mid Sales Growth Constant (Sub Cat)]) AND [Market Share Sales]>MIN([Mid Market Share Constant]) THEN ‘Stars’
ELSE ‘Cash Cows’
END

This is all the information now needed to plot the Sales Scatter chart. Custom shapes were sourced from flaticon.com and I used Paint to colour and resize them appropriately, before saving them into a folder in the Shapes folder of the My Tableau Repository directory on my laptop.

Order Scatter

The steps for creating the Orders scatter are the same in principle. I need to find

  • The total orders across all the specified years per subcategory
  • The number of orders for the latest year per subcategory
  • The number of orders for the previous years per subcategory
  • The overall number of orders across all the specified years

However in doing this I got quite different numbers from Lorna, and its due to how you choose to count orders. Unlike a sales value, where the value is just attributed to a single line in the data set, an Order ID can span multiple lines, which means they can span sub-categories too. Eg imagine you have

Order A – contains 2 line of appliances, and 1 line of phones

Order B – contains 1 line of appliances.

How many orders are there? Answer = 2

How many order lines are there? Answer = 4

How many orders contain appliances? Answer = 2 (100%)

How many order lines contain appliances? Answer = 3 (75%)

How many orders contain phones? Answer 1 (50%)

How many order lines contain phones? Answer 1 (25%)

So when it comes to the ‘market share %’, should I be considering order lines (which means it’ll sum to 100%) or distinct orders which means it’ll sum to more than 100%.

I wrestled with this, and ultimately concluded, this challenge is purely for illustrative purposes, and to stick with my original assumption based on distinct orders (which I’d already calculated before finding the discrepancy). So my Market Share % won’t sum to 100%, and probably isn’t really a definition of Market Share, but I made peace with myself and moved on 🙂

# Orders

COUNTD([Order ID])

Orders Latest Year

COUNTD(IF YEAR([Order Date]) = [Max Year in Dataset] THEN [Order ID] END)

Orders Previous Years

COUNTD(IF YEAR([Order Date]) <> [Max Year in Dataset] THEN [Order ID] END)

Average Orders Previous Years

[#Orders Previous Years]/([No of Years]-1)

Growth Orders

([#Orders Latest Year]-[Average Orders Previous Years])/[Average Orders Previous Years]

Total Orders

{FIXED [Dates to Include] : COUNTD([Order ID])}

Market Share Orders

([# Orders])/MIN([Total Orders])

For 2 years this gave me

Once again the Market share mid point was defined to be a 7% constant, while the growth % mid point needed to be half the max value, which in this instance based on the above, was 1/2 of 51.2%. The same type of LOD was required which gave me

Mid Orders Growth Constant (Sub Cat)

({EXCLUDE [Sub-Category]:MAX({INCLUDE [Sub-Category]: ([#Orders Latest Year]-[Average Orders Previous Years])/[Average Orders Previous Years]})})/2

and from this I could the categorise with

Orders Category (Sub Cat)

IF [Growth Orders]<=MIN([Mid Orders Growth Constant (Sub Cat)]) AND [Market Share Orders]<=MIN([Mid Market Share Constant]) THEN ‘Dogs’ ELSEIF [Market Share Orders]<=MIN([Mid Market Share Constant]) AND [Growth Orders]>MIN([Mid Orders Growth Constant (Sub Cat)]) THEN ‘Question Marks’
ELSEIF [Growth Orders]>MIN([Mid Orders Growth Constant (Sub Cat)]) AND [Market Share Orders]>MIN([Mid Market Share Constant]) THEN ‘Stars’
ELSE ‘Cash Cows’
END

Top 20 Manufacturers Bar Chart

To create the bar chart, I created a new field Manufacturer-SubCat as

[Manufacturer] + ‘ (‘ + [Sub-Category] + ‘)’

I then set about creating the same data tables as I’ve included above, one for Sales and one for Orders.

The majority of the measures used were the same, but when it came to determining the growth mid-point, I found creating a separate field was simpler than trying to use a single field that worked across both Sub-Category and Manufacturer-SubCat. I followed the same methodology though:

Mid Sales Growth Constant (Manu Sub Cat)

({EXCLUDE [Manufacturer-SubCat]:MAX({INCLUDE [Manufacturer-SubCat]: (SUM([Sales Latest Year])-[Average Sales Previous Years])/[Average Sales Previous Years]})})/2

Mid Orders Growth Constant (Manu Sub Cat)

({EXCLUDE [Manufacturer-SubCat]:MAX({INCLUDE [Manufacturer-SubCat]: ([#Orders Latest Year]-[Average Orders Previous Years])/[Average Orders Previous Years]})})/2

I also found though that the 7% market share constant didn’t really give me the split, so I decided a 3% constant would be better to give me categorisations at this level into each bracket. As I had different comparison values, I needed new categorisation fields too.

Sales Category (Manu Sub Cat)

IF [Growth Sales]<=MIN([Mid Sales Growth Constant (Manu Sub Cat)]) AND [Market Share Sales]<=MIN([Mid Market Share Constant (Manu Sub Cat)]) THEN ‘Dogs’ ELSEIF [Market Share Sales]<=MIN([Mid Market Share Constant (Manu Sub Cat)]) AND [Growth Sales]>MIN([Mid Sales Growth Constant (Manu Sub Cat)]) THEN ‘Question Marks’
ELSEIF [Growth Sales]>MIN([Mid Sales Growth Constant (Manu Sub Cat)]) AND [Market Share Sales]>MIN([Mid Market Share Constant (Manu Sub Cat)]) THEN ‘Stars’
ELSE ‘Cash Cows’
END

Orders Category (Manu Sub Cat)

IF [Growth Orders]<=MIN([Mid Orders Growth Constant (Manu Sub)]) AND [Market Share Orders]<=MIN([Mid Market Share Constant (Manu Sub Cat)]) THEN ‘Dogs’ ELSEIF [Market Share Orders]<=MIN([Mid Market Share Constant (Manu Sub Cat)]) AND [Growth Orders]>MIN([Mid Orders Growth Constant (Manu Sub)]) THEN ‘Question Marks’
ELSEIF [Growth Orders]>MIN([Mid Orders Growth Constant (Manu Sub)]) AND [Market Share Orders]>MIN([Mid Market Share Constant (Manu Sub Cat)]) THEN ‘Stars’
ELSE ‘Cash Cows’
END


To Sort the Manufacturer-SubCategory data by Sales or Orders, I created a SORT BY parameter with values of Sales and Orders.

I then created

Sort Value

IF [Sort] = ‘Order’ THEN [# Orders] ELSE SUM([Sales]) END

On the Manufacturer-SubCat field, I then set the Sort by property to be

Manufacturer-SubCat was then also added to the Filters shelf, with the filter set to

And that covers the main calcs and complexities that went into my version/interpretation of this challenge. I hate it when I can’t get the numbers to match 😦 but ultimately now feel comfortable with what I did and I hope it might explain why some of you may have got differences too…

My version of the viz is here, and I’ve included additional detail on the tooltips too which should show the numbers that went into the % calcs for each mark.

Happy vizzin!

Donna

Can you build a Top N Bar Chart on a Single Worksheet?

So back from my holibobs and catching up on the #WorkoutWednesday challenges I’ve missed while I’ve been sunning myself on a Greek beach.

First up is another guest challenge posted by Jeffrey Shaffer who asked us to create a Top N bar chart of the quantity per manufacturer per region on a single viz. The main twist with this is that a manufacturer could be ranked in different locations for each region. The full challenge is here.

Determine Manufacturer

In the 2019.1 Superstore Sales data set I had on my machine, there is no Manufacturer field. I couldn’t see one in the link to the dataset included on the blog page either. Perhaps I was missing something, but given I was on catch up, I chose not to find out the real reason. Instead I decided to derive my own Manufacturer field by just taking the first word of the Product Name. It meant my quantity figures wouldn’t match up, but the principles behind the challenge wouldn’t be affected.

MID([Product Name],1, FINDNTH([Product Name],’ ‘,1)-1)

Note – I’ve watched Ann’s solution (published on the challenge page), since I published my version, and Ann just lifts a pre-defined Manufacturer group onto her viz, which also already has an ‘Other’ manufacturer. My calculated field above has no such value. Ann’s solution is therefore a bit more straightforward than the work I had to put in to determine ‘Other’.

I interpreted ‘Other’ to be the total quantity of all the manufacturers not in the top n. But of course the top n had to take into account the Region, so it was possible that a manufacturer listed explicitly in the top n for the Central region say, might be included within the ‘other’ pot for another region.

To work this out, I created a quantity field per region ie

Central – Qty

IF [Region] = ‘Central’ THEN [Quantity] END

East – Qty

IF [Region] = ‘East’ THEN [Quantity] END

South – Qty

IF [Region] = ‘South’ THEN [Quantity] END

West – Qty

IF [Region] = ‘West’ THEN [Quantity] END

I then created a set per region to store the top n manufacturers based on the Quantity per region

The set was based on the Manufacturer field, and limited to the Top n Manufacturers parameter I set up, associated to the relevant [Region] – Qty field created above.

I had 4 sets in the end : Top Central, Top East, Top South, Top West

Note – It is entirely possible with this set up, that each set could contain a completely different set of manufacturers, so up to top n X 4 manufacturers in total

So having got a set which stored my top n per region, I needed to get the ‘Other’ grouping. To do this I created a new field

Manufacturer Category

IF [Region] = ‘Central’ AND [Top Central] THEN [Manufacturer]
ELSEIF [Region] = ‘East’ AND [Top East] THEN [Manufacturer]
ELSEIF [Region] = ‘South’ AND [Top South] THEN [Manufacturer]
ELSEIF [Region] = ‘West’ AND [Top West] THEN [Manufacturer]
ELSE ‘Other’
END

With the Top n parameter set to 10, this gives me

There are 12 rows listed, but only 11 values per region (the top 10 + other). You can see here that Wilson is within the top 10 for East & West, but not Central and South, whilst Hon is in top 10 for Central, East & South but not West. This is why there are 12 rows. If I changed the Top n parameter to 15, I’d get 20 rows listed, as there are 19 manufacturers across the top 15 for all the regions + other.

To get the sorting, we need to introduce an old friend :

Index

INDEX().

Adding this as a discrete (blue) pill onto the rows, moving the Manufacturer Category onto Text, and setting the Table Calculation Properties of the Index field to …

  • Compute over both Region and Manufacturer Category (with Region listed first)
  • To restart every Region
  • To sort by SUM of Quantity descending

… I start to get what I need

But I’ve still got 1 too many rows, I need to deal with the ‘Other’ pot. If my top n parameter is set to 10, then I essentially want to be able to show rows 1-10 above if the Include Other parameter is true, or show rows 2-11 if the Include Other parameter is false (Include Other is a boolean parameter I created).

FILTER – Other

NOT([Include Other]) AND [Manfacturer Category] = ‘Other’

Adding this to the Filter shelf and setting to False, basically hides the Manufacturer Category = Other when Include Other = No, or shows Manufacturer Category = Other when Include Other = Yes

So when Include Other = No, I have 10 rows indexed 1-10, and when Include Other = Yes, I have 11 rows indexed 1-11.

I only want the number of rows based on my Top N parameter, so I created an additional field

FILTER – Index

[Index] <= [Top n Manufacturers]

which is added to the Filters shelf and set to true, so only the rows that are less than or equal to the parameter value are displayed.

So we’re nearly there, except Jeff added an additional curveball…

Display Manufacturer & Rank on hover

To tackle this, I needed 3 more calculated field and a set, as we’re going to need Set Actions to handle this.

Index Rank

‘(#’ + STR([Index]) + ‘)’

this simply formats the index value into the format (#1)

Manufacturer + Rank

ATTR([Manufacturer Category]) + ‘ ‘ + [Index Rank]

this concatenates the two fields together – note the ATTR() is needed as Index Rank references the aggregated table calculation of INDEX().

Highlighted Manufacturer

This is a set created by simply right-clicking on Manufacturer Category and selecting Create -> Set, then choosing an arbitrary option (eg Avery). This will be changed via Set Actions later…

LABEL: Manufacturer

IF ATTR([Highlighted Manufacturer]) THEN ([Manufacturer + Rank]) ELSE ATTR([Manfacturer Category]) END

This is basically displaying the manufacturer with the rank if the manufacturer is in the set.

Adding LABEL:Manufacturer into our data table above, you can see that the Avery manufacturer is displayed with the rank, but the others aren’t

To make the value of the set change on hover, the view needs to be added onto a dashboard, and a Set Action created.

Create a dashboard sheet, and add the view. Then select Dashboard -> Actions -> Add Action -> Change Set Values

Set the properties in the dialog as below – the action to run on Hover, the Target Set to the be the Highlighted Manufacturer created above, and remove all set values when you ‘hover off’

Hovering on Eldon, for example, you can see the manufacturer label is changing in the relative positions

So that’s all the building blocks now in place, the viz just needs to be created, which I did by duplicating my table version above, and moving the pills around.

I basically created a synchronised dual axis with Index on rows, Region, MIN(0), SUM(Quantity) on columns. MIN(0) was set to be a Gantt bar and labelled with the LABEL:Manufacturer field set to be left aligned. The SUM(Quantity) field is labelled with the Quantity. Both labels were set to match the mark colour, which was based on Region using the Nuriel Stone palette specified. Headers were hidden and various formatting applied to the grid lines/rows/columns etc.

My version of the viz is here.

I’m off to ponder about where that Manufacturer field came from, as I think it would have all been a lot simpler if it already existed… hmmmm…

Happy vizzin!

Donna