Which Sub-Categories are frequently ordered together?

Week 49 of 2021, and it was Ann Jackson’s turn to set her final #WOW challenge (sniff sniff!). Ann’s been setting challenges for the last 4 years, and we’re all going to miss her (I’m sure she may pop up as a guest challenger at some point in the future ….).

Ann decided to revisit her first ever challenge which I did complete here.

Obviously in 4 years, Tableau has moved on, and new features make building this concept a bit more straightforward. So let’s crack on.

Modelling the data

Over the years Ann has adapted her style which includes capitalisation of text. Whilst the challenge can be built with the Tableau provided Superstore Sales data source, Ann very kindly provided her own version which already contained the capitalised fields. I used this version.

To do this type of ‘basket analysis’ you need to use 2 instances of the data source which you model using relationships as follows :

Add a relation of Order ID = Order ID and Sub-Category <> Sub-Category

This results in a data pane which lists a set of fields from the Ann Jackson Superstore data source and a set of the same fields from the Ann Jackson Superstore1 data source. In this blog, I’ll reference these data sources as AJS and AJS1. If you examine a single order with multiple sub-categories you’ll see how the data is being related

The first 2 columns above are from the AJS data source – the order contains 4 different sub-categories. The last 2 columns are from the AJS1 data source, and you can see that for each AJS.Sub-Category, only 3 AJS1.Sub-Category records are listed. This allows us to see the combination of what was ordered with what.

Building the matrix

Add AJS.Sub-Category to Rows and AJS1.Sub-Category to Columns and you have the start of your matrix.

The number of orders is displayed in each cell, which is

# Orders

COUNTD([Order ID])

Add this onto Text

This adds Null column, which is the number of orders containing a single Sub-Category.

Right-click on the word Null in the column to select the column, and Edit Alias to rename the Null to NONE.

To remove the top right hand side of the matrix, I indexed each row/column.

Sub Cat Index

INDEX()

Sub Cat 2 Index

INDEX()

Add Sub Cat Index to Rows, change to discrete (blue pill) and move to be in front of AJS.Sub-Category. Change the table calculation so that it is computing by the AJS.Sub-Category field. Each row should now be numbered from 1 to 17.

Now add Sub Cat 2 Index to Columns, again change to discrete and move to be in front of the AJS1.Sub-Category pill. Edit the table calculation to verify it is computing by the AJS1.Sub-Category field. All the columns should now be numbered 1-18.

We want to remove the data for the records where the Row Index is greater then the Column Index

FILTER

[Sub Cat Index]>=[Sub Cat 2 Index]

Add this field to the Filter shelf, and set to true.

To colour the cells, add # Orders to the Colour shelf, and change the mark type to Square. Then edit the colour palette to use the Colour Brewer Blue Purple colour palette Ann advises (she’s used these palettes in the past, so I already have them installed, but you should be able to get them quickly from here).

Format the text to match mark colour and to be aligned centre. I also set the font to be Tableau Medium and bold. Set the column and row border divider lines to white.

Now uncheck show header against the two Sub Cat Index fields, and hide labels for column/ rows on the other column/row headings to remove them. Rotate the label for the columns and adjust the alignment, Adjust the font on both the column and row headers to bold and you should have the finished visual.

The Tooltips

We need to calculate the average sales contribution per order. We need 2 calculations for this, as we need to show a value for the AJS.Sub-Category on the rows and a value for the AJS1.Sub-Category displayed on the columns.

Average Sales (Rows)

SUM([Sales])/[# Orders]

Average Sales (Cols)

SUM([Sales (Ann Jackson Superstore1)])/[# Orders]

Add both of these to the Tooltip shelf. It’ll make a Null row appear – just right click and exclude.

Ann’s also been particular about some of the text displayed, so we need calculated fields to apply the required logic.

TOOLTIP – Sub Cat 2 Title

“& ” + [Sub-Category (Ann Jackson Superstore1)]

TOOLTIP- Sub Cat Text

IF [Sub Cat 2 Index]=1 THEN “SUB-CATEGORY” ELSE “SUB-CATEGORIES” END

TOOLTIP – Sub Cat 2

IF NOT(ISNULL([Sub-Category (Ann Jackson Superstore1)])) THEN [Sub-Category (Ann Jackson Superstore1)] + “:” END

Add all three fields to the Tooltip shelf, and adjust the tooltip as below

And with that, you should now have the fully completed challenge. My published version is here.

Finally, once again I want to thank Ann for all her contributions to the #WOW community over the last 4 years. I will miss her capitals and bold colour palettes – I’ll be storing away her pre-capitalised version of Superstore for sure :-)!

Happy vizzin’!

Donna

Thanksgiving Day NFL Games

Sean Miller posted this week’s challenge based on the results of the annual NFL games hosted on Thanksgiving Day. It immediately reminded me of a previous #WOW challenge that Lorna posted in 2019 when she visualised Rugby League wins (see my viz here).

This is a table calculations based challenge. I did start using FIXED LoDs to help calculate the summary measures (Total Games and Win %) displayed at the front, but found that as there are 2 years (1975 and 1977) when the Dallas Cowboys did not host a game, I ended up with some pesky NULL values displaying which affected how the running sum area chart displayed.

Defining the calculations

As its a table calc challenge, I’ll build out what I can into a table to start with, to sense check I’m getting the correct numbers.

First up add Home Team, Game Date and Visiting Team to Rows and display Home Score and Visiting Score.

We start by determining the result of the fixture, based on whether it’s a home or away win or a tie. In the lollipop chart home wins are plotted at 1 and away wins at -1, so we’re going to store the result as a numeric value rather than text.

Result

FLOAT(IF [Home Score]>[Visiting Score] THEN 1
ELSEIF [Home Score]<[Visiting Score] THEN -1
ELSE 0 END)

The output is wrapped within a FLOAT, as this will help how the axis displays. Without it, by default Tableau will define the field to be a whole number, and the axis will extend to +/-2 which is too much room. We can’t adjust (fix) the axis to a decimal if the field itself is an integer, and adjusting to +/-1 chops off the displayed marks.

If you add this to the display, it will show 1, 0 -1 as you expect. You’ll notice though that the Axis on the lollipop chart is labelled as Win/Loss. This is achieved by applying a custom format to the field – “Win”;”Loss”;”Tie”

This is a sneaky but effective trick. The information stated before the first semi-colon applies to positive numbers, the info after the first semi-colon applied to negative numbers, and the information after the optional second semi-colon applies to zero.

Unfortunately though, it would appear that, at the point of writing, Tableau Public, isn’t honoring the zero formatting, and is displaying Win rather than Tie. The display works on Desktop though.

The win/loss/tie text is just a formatting feature and affects what is displayed, but the underlying value is still a number.

The Result field will be used to plot the lollipop chart. We now want a field to plot the area chart against. This is a running total of the Result values (ie win =1, win, win = 1+1, win, win, loss = 1+1 -1) and we need a table calculation.

However, as stated above due to a couple of missing years, I had to make an adjustment to ensure the running total displayed as Sean had in his challenge. I created another field

Result Adjusted

IIFNULL(SUM([Result]),0)

If the Result field doesn’t exist, as there is no data, then use 0 instead.

To see what’s going on, we’re going to need a different view of the data where the date field is continuous (green) rather than discrete (blue).

Build the below, and filter just for the first 10 years – you’ll see the gaps where the are no marks in 1975 and 1977 for Dallas

Use the context menu of the green YEAR(Game Date) pill and select the option to Show Missing Values. Marks will now display

Add Result to Label. Each mark is labelled Win or Loss, except the ones for Dallas for 1975 & 1977 as there is no data

Now add Result Adjusted to Label. A 0 value is now displayed against those two marks.

We can now build a running total off of this measure instead

Running Total Wins

RUNNING_SUM(([Result Adjusted]))

Add this to the Label too and verify the table calculation is computing by the Game Date field only. The running total for the 2 ‘missing’ dates is displaying a value which is the same as the previous value (since we’ve added 0 onto the running total). This will give us the flat line in the area chart when we come to build it.

Now back to our table of data, we can focus on the other calculated fields we need….

Total Games

WINDOW_COUNT(COUNTD([Game Date]))

This is a table calculation and is simply counting the number of distinct dates displayed. Add this to the table display we were building to start with, and adjust the table calculation to compute by all fields except Home Team. The total should display the same value for all the rows against each Home Team.

Next we want a field to indicate if the row is a win.

Is Win?

INT([Home Score]>[Visiting Score])

This is taking a boolean of true or false and converting to an INT (1 or 0).

From this we can work out the Win rate

Win %

WINDOW_SUM(SUM([Is Win?]))/[Total Games]

Add up all the Is Win? values associated to the Home Team as a proportion of the Total Games played. Format this field to a percentage with 0 dp. Again, add to the table and adjust the table calc to compute by all fields except Home Team, and verify the same settings applied to both the calculations nested in this calculation

For the All-Time Record, we need to know the number of wins and number of losses. We have a field to help us with the wins, but need an equivalent for the losses

Is Loss?

INT([Home Score]<[Visiting Score])

And from this we can work out

All-Time Record

STR({FIXED [Home Team]: SUM([Is Win?])}) + ‘-‘ +
STR({FIXED [Home Team]: SUM([Is Loss?])})

This is the one field I kept from my LoD based attempt.

The circles on the lollipop chart are coloured based on the difference in the score, so lets’s create that

Score Difference

[Home Score]-[Visiting Score]

And finally we need some fields to help display the tooltips properly. The tooltip indicates whether the result was ‘won’ or ‘lost’ which is different text to the axis labels.

TOOLTIP-Result

IF [Result]=1 THEN ‘won’
ELSEIF [Result]=-1 THEN ‘lost’
ELSE ‘tied’
END

The tooltip also displays the scores, but the scores are always presented as highest score – lowest score and not home score – visiting score. So we need fields to store the right values

TOOLTIPHigher Score

IF [Is Win?]=1 THEN [Home Score] ELSE [Visiting Score] END

TOOLTIP – Lower Score

IF [Is Loss?]=1 THEN [Home Score] ELSE [Visiting Score] END

Pop all these fields out onto the table, so you can validate you’ve got all your calcs right before building the viz.

Building the area chart

Add Home Team to Rows, Game Date (continuous, show missing values) to Columns and Running Total Wins to Rows (ensure table calculation set as required). Change to mark type of Area. You should have 2 horizontal lines from 1974-1975 and 1976-1977 against the Dallas Cowboys row.

Adjust the tooltip, edit the label of the Running Total Wins axis , and remove the label of the Game Date axis.

Building the lollipop chart

Now add Result to Rows directly after the Home Team pill. Change the mark type to circle.

Add Score Difference to the Colour shelf of the circle mark, and adjust the starting colour range to a dark grey. Readjust the colour of the area chart to blue too. Add a border to the area chart too (via the colour shelf).

Add another instance of Result to the Rows shelf, next to the existing one. Set the mark type of this to bar. Reduce the size to the smallest possible, set the colour to grey and remove the border.

Now set this to be dual axis, synchronise the axis, and set the marks of the 2nd Result axis displayed on the right hand side to move marks to back. Uncheck Show Header to remove this axis from displaying.

Add Visiting Team, TOOLTIP-Result, TOOLTIP-Higher Score and TOOLTIP-Lower Score to the Tooltip shelf of both the Result marks cards, and adjust the tooltip on both to

Remove the Column dividers.

Now drag Total Games to Rows and drop next to the Home Team field. Change to be discrete (blue). Verify the number is what you expect and adjust the table calc if need be.

Add All-Time Record and Win % (set to discrete) to the view too. Then format these 4 fields so the text is larger and aligned centrally.

All that’s left now is to add the sheet to a dashboard. My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Filtering Table Calcs

It was Sean Miller’s turn this week to set the challenge, which had an added twist – to complete only using the web authoring feature in Tableau Public, something I’d never tried before.

Most challenges use a version of Superstore Sales which I usually already have on my laptop, due to the multiple instances of Tableau Desktop installed, but this time, I had to get the data from data.world. I saved this locally then logged into Tableau Public online and started the process to build a viz online, by clicking the Create a Viz (Beta) option.

When prompted I then chose to upload from computer, the data source file I’d downloaded, and once done I was presented with the online canvas to start building

Unsure as to whether there was any auto save facility online, I decided to save the workbook pretty much immediately, although I was first prompted to navigate to the Data Source tab and create an Extract before I could save.

Now I was ready to start building out the requirements.

I started with the trend chart. The grain of the chart was at the month level, so I first created a specific field to store the Order Date at this particular level

Order Date Month

DATE(DATETRUNC(‘month’,[Order Date]))

I then added this to the Columns shelf as a continuous (green) exact date, and added Sales to the Rows shelf. I changed the mark type to Area, set the Colour to a blue, dropping the opacity to 50% in an attempt to match Sean’s colouring and adjusted the tooltip.

I then added another instance of Sales to the Rows shelf next to the existing one, changing the mark type of this instance to Line and setting the Colour to a darker blue at 100% opacity. I then used the context menu on this pill to set a Quick Table Calculation of type Moving Average.

By default this sets the average to the 3-month rolling average as per the requirement. I adjusted the tooltip, made the chart dual axis and synchronised the axis, then removed Measure Names from the Colour shelf that had automatically been added.

My next step at this point was to work on the filtering requirement. To help with this, the first thing I did was to Duplicate as crosstab (right click on the chart tab). This created a tabular view of the data so I could work on getting the calculations I need for the filtering. I flipped the columns and rows, so my starting point was as below.

I chose to use parameters to capture the min and max dates that the user selects on the dashboard.

pMinDate

Date parameter defaulted to 01 Jan 1900

And I also created pMaxDate exactly the same way.

I then needed fields to store the relevant dates depending on whether a selection had been made or not

Min Date Selected

IF [pMinDate]=#1900-01-01# THEN {FIXED : MIN([Order Date Month])}
ELSE [pMinDate]
END

The FIXED statement sets the date to the minimum month in the data set, if the parameter is set to its default. The below doed similar to get the maximum date in the data set.

Max Date Selected

IF [pMaxDate]=#1900-01-01# THEN {FIXED : MAX([Order Date Month])}
ELSE [pMaxDate]
END

Using these dates, I then created a field to determine whether the month was within the min & max dates

Is Month Selected?

[Order Date Month]>=[Min Date Selected] AND
[Order Date Month]<= [Max Date Selected]

Pop this field onto the table view, and show the parameters and set them to some other values eg Min Date = 01 Jun 2018 and Max Date = 01 Sept 2018. You’ll also need to adjust the table calc setting of the Moving Average Sales pill to compute by Is Month Selected as well.

You’ll see True is stamped against the rows which match the date range. Now, if you make a note of the Moving Average values against these rows, and then add Is Month Selected? = True to the Filter shelf, you should get just the True rows displaying, BUT the Moving Average values are now different. This is because, with this type of field as a filter, the filter is being applied BEFORE the table calculation is being applied – the table calculation essentially has ‘no knowledge’ of the rest of the rows of data as they have been filtered out. Solving this issue is the crux of this challenge.

Remove Is Month Selected? from the filter shelf. We’re going to create a new field to filter by instead

FILTER

LOOKUP(MIN([Is Month Selected?]),0)

This field uses the LOOKUP table calculation, and is basically looking at the value of the Is Month Selected? field in it’s ‘own row’ (denoted by the 0 parameter).

Pop this onto the table view, and you can see that it’s essentially a duplicate of the Is Month Selected? field.

Now add the FILTER =True field to the Filter shelf instead, and this time the rows will once again filter as required, but the Moving Average values will be the same as those in the unfiltered list, which is what we need.

This works because the field we’re filtering by is a table calculation. When added to the filter shelf, these types of fields are applied at the ‘end’ of the processing; ie the table calculation is computed against all the rows in the data AND THEN the filter gets applied. As a result the Moving Average values are retained.

This is all part of Tableau’s Order Of Operations and is a fundamental concept when working with filters.

Armed with this knowledge, we can now apply the FILTER=True to the Filter shelf of the chart we built.

Now we can start to work on the bar charts, which will show Sales along with Average Monthly Sales. For this 2nd measure, I need to work out the number of months captured between the min & max dates.

No. Months

DATEDIFF(‘month’, [Min Date Selected],[Max Date Selected])+1

And with this I can now create

Avg Monthly Sales

SUM([Sales])/MIN([No. Months])

The bar chart viz can then be quickly created by adding Sub-Category to Rows and both Sales and Avg Monthly Sales to Columns, and sorting by Sales descending. The colours of each bar are adjusted to suit.

The bars also need to react to the dates selected, but we can simply add the Is Month Selected? = True to the Filter shelf for this, as none of the values displayed on this chart are reliant on table calculations.

To remove the gridlines from both charts, select Format > Workbook from the menu, scroll down the Format dialog displayed on the right and set Gridlines to Off

Finally, title both sheets appropriately, and then add to a dashboard. We need to add 3 actions to this dashboard – 2 parameter actions and 1 filter action.

Create a parameter action which will on Select, set the pMinDate parameter by passing the Minimum Order Date Month, and will reset back to 01 Jan 1900 when unselected.

Then create another instance, which sets the pMaxDate parameter instead, passing the Maximum Order Date Month.

Finally, create a Filter Action which on Select of the bar chart, filters the trend chart

And with that, the challenge should be complete. My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Can you use set actions to create a reference date range?

There’s a lot packed into the challenge this week, which was “set” by Lorna Brown and Erica Hughes to test our Set Action skills. How detailed this blog will go, I have yet to decide… I’ve got a couple of hours to get this nailed, so it could get quite brief as we get towards the end 🙂

I’ve got 6 sheets/charts making up this dashboard, so my intention is to summarise each one, and I’ll define the various calculations that are going to be needed as we go.

  • The overall summary table
  • The selected months summary table
  • The trend line
  • The donut chart
  • The top 3 states table
  • The map
  • Adding the interactivity

The overall summary table

This challenge is focused on understanding the Sales per month. Whilst its possible to use the built in aggregation features of a date field, I often prefer to create explicit date fields at the level I require, so it’s easier to reference. Therefore, the first field I created for this challenge was

Order Date To Plot

DATETRUNC(‘month’, [Order Date])

This essentially ‘groups’ every order placed in a month to be tagged with the 1st of the month. I custom formatted this field to MMM yyyy (ie Oct 21).

For the overall summary table, I need to capture the total sales of the whole data set, and I use a Fixed LoD calculation for this.

Total Sales

{FIXED: SUM([Sales])}

This field is formatted to $0.00M

NOTE – I actually named this field <space>Total Sales<space> as I want to display the name of the field (the measure name) in the summary table, but the ‘selected months’ summary table also has a Total Sales measure which is a different calculation (see later). Adding the <spaces> is a sneaky way to get two fields with what appears to be the same name. As this field when displayed will be centred, the <spaces> aren’t noticeable.

We also need to get the monthly average sales for the whole data set

Average Sales by Month

AVG({FIXED [Order Date To Plot]: SUM([Sales])})

Format this to to $0.0K

We can now build the summary table by adding Measure Names to the Filter shelf and selecting these 2 fields. The placing Measure Names on Rows and Measure Names and Measure Values on Text. Reorder the measures as required, hide Measure Names on Rows and format the Text as required.

Change the title of the sheet to Superstore Sales, ensure the tooltip doesn’t display and remove all gridlines / row banding etc.

The selected months summary table

The core requirement of this challenge is to make use of set actions, so we’re obviously going to need a set which will contain the dates (months) the user will select on the chart. This set will be based off of Order Date To Plot. Right click on the field > Create > Set. Name it Order Dates To Plot Set and by default, select all the months between Oct 20 to Jun 21 inclusive.

Later I’ll describe how the values of this set will get updated, but for now, we need to get some information relating to the sales in these selected months.

Firstly, we want the total sales for the months in this set.

Total Sales

IF [Order Date To Plot Set] THEN [Sales] END

The default format for this field is set to $ with 0 dp.

Note – this is the other ‘total sales’ field mentioned earlier. This field name has no leading/trailing spaces.

To get the average, I needed a field just to store each member of the set (ie each selected month)

Selected Dates

IF [Order Date To Plot Set] THEN [Order Date To Plot] END

and with this I can then work out

Average Sales

AVG({FIXED [Selected Dates]: SUM([Total Sales])})

The final measure required for this section is the change within the date range, which is basically comparing the value of sales at the first month in the selection with the sales in the final month selected. We need a few fields to get to this.

Firstly, we want to identify the first and last months

Min Selected Date

{FIXED:MIN(IF [Order Date To Plot Set] THEN [Order Date To Plot] END)}

If the date is in the set, then return the date and then take the minimum of all the dates, and store against all the rows in the data. Similarly we have

Max Selected Date

{FIXED:MAX(IF [Order Date To Plot Set] THEN [Order Date To Plot] END)}

Putting this info into a table, you can see how the calculations are working. The values for the Min & Max dates are the same across every row.

Next we need to get the Sales at the min & max points, and spread that value across all rows

Sales at Min Date

{FIXED: SUM(IF [Order Date To Plot]=[Min Selected Date] THEN [Sales] END)}

Sales at Max Date

{FIXED: SUM(IF [Order Date To Plot]=[Max Selected Date] THEN [Sales] END)}

Now we can work out the difference

Change within Date Range

([Sales at Max Date]-[Sales at Min Date])/[Sales at Min Date]

format this to a percentage set to 1 dp

Finally, we need to know the number of months in the set, which is displayed in the title of the monthly summary sheet.

Months in Set

{FIXED: COUNTD(IF [Order Date To Plot Set] THEN [Order Date To Plot] END)}

If the date is within the set, then capture the date, and the count the distinct set of dates captured.

Make this a discrete field (move from the measures section at the bottom of the data pane to the dimensions section at the top (above the line), and add to the tabular view

Now we can build the summary sheet.

Add Measure Names to the Filter shelf and this time filter by Total Sales, Average Sales and Change within Date Range. Add Measure Names to Rows and Measure Values to Text. Reorder the measures.

Format the Total Sales to be in $K, by selecting the Format option from the context menu of the Total Sales pill on the Measure Values shelf (hover on the pill and click the carrot/down arrow that appears – by formatting this way, we’re changing the display of this field for this sheet only).

Add Min Selected Date and Max Selected Date to the Detail shelf and set to be Exact Date. Format both these fields via the pill context menu to be the ‘March 2001’ format

Also add Months in Set to the Detail shelf.

Adjust the title of the sheet as below

Finally you need to set the background of the worksheet to the relevant purple (I used #8074a8), adjust the colours of all the fonts to white and adjust the size/style of the fonts in the table. Remove all gridlines/row banding etc, and you should have something like below

The Trend Line

By this point we’ve built all the calculated fields we need for this chart. This is a dual axis line chart, as we want the colour of the line for the selected dates to be different from the non selected ones, and we want to display a label for the highest sales in the selected timeframe.

  • Add Order Date to Plot to Columns, and set as a Continuous (green) pill set to Exact Date
  • Add Sales to Rows
  • Add Total Sales to Rows
  • Make the chart dual axis, and synchronise axis.
  • Adjust the colours of the Measure Names colour legend
  • On the Label shelf of the Total Sales marks card, set to label the maximum value only
  • On the All Marks Card, add Min Selected Date and Max Selected Date to the Detail shelf and set to Exact Date.
  • Right click on the Order Date To Plot axis and Add Reference Line
  • Create a reference band that starts at the constant Min Selected Date, ends at the constant Max Selected Date, is bounded by dotted lines and shaded between
  • Hide the Sales and Total Sales axis, format tooltips and adjust the row & column dividers.
  • Change the title and you should get to

The donut chart

Donut charts are 2 different sized pie charts on top of each other, created using a dual axis chart. On the Rows shelf type in MIN(0). Then type the same next to it. This gives you two axis and two marks cards.

We only care about information related to the selected dates for this chart, so we can add Order Date To Plot Set to the Filter shelf, which by default will just restrict the information to the data ‘in’ the set.

Change the mark type of the 1st MIN(0) marks card to Circle and add Sales to the Tooltip shelf. Adjust the size of the charts/mark.

Change the mark type of the 2nd MIN(0) marks card to Pie and add Sales to the Angle shelf. Add State to the Detail shelf. Sort the State field by Sales descending.

Note – the circles might look the same at this point, but if you hover over the bottom one, you should see that it’s segmented by State.

We need some new fields now to help us identify the top ranking states.

Sales Rank

RANK(SUM([Sales]))

This is a table calculation, so it’s best to see how this field will work in a table view – build one out as below, and set the table calculation on the Sales Rank field as shown

We’re now going to ‘group’ the ranks into the top 3 and everything else

Sales Rank Group

IF [Sales Rank]<=3 THEN [Sales Rank] ELSE 10 END

We can now use this Sales Rank Group field to colour the pie chart. On the 2nd MIN(0) marks card, add Sales Rank Group to the Colour shelf. Adjust the table calculation to compute using State as above, then change the field to be Discrete (blue). Adjust the colours to suit.

Now make the chart dual axis, and synchronise the axis. Adjust the size of the 1st MIN(0) circle to be smaller than the pie. If it’s not showing, right click on the right hand axis and move marks to back. Colour the circle white. Adjust tooltips to suit and hide axis, column/row dividers etc. Update the title. You should have

The top 3 states table

  • Add Order Date To Plot Set to Filter
  • Add State to Rows and Sales to Text and sort descending.
  • Add Sales Rank to Filter and set to At Most is 3. This will just show the top 3 states.
  • Add State to Text
  • Add a Percent of Total Quick Table Calculation to the existing Sales field that’s on the Text shelf (via the context menu of the pill)
  • Add another instance of Sales back onto the Text shelf
  • Adjust / format the font size and layout of the fields on the Text shelf
  • Add Sales Rank to the Size shelf and set to be discrete (blue) and set the mark type to be Text. Adjust the size of the marks – it’s likely it’ll need to be reversed and the range adjusted.
  • Hide the State field on Rows, adjust the font colours, remove row banding and row/column dividers. You should end up with…

The map

  • Add Order Date To Plot Set to the Filter shelf
  • Add State to Detail – this should create a map (edit locations to be US if need be – Map -> Edit Locations menu)
  • Add Sales to the Colour shelf
  • Edit the colour range to a suitable purple range ( I set the darkest colour of the range to #6c638f)
  • Adjust the map layers (Map -> Map Layers) so only the option highlighted below is selected.

Adding the interactivity

Add all the sheets to the dashboard, using vertical and horizontal containers to arrange the relevant layout. Then add a Set Action (Dashboard > Actions > Add Action > Change Set Values). The Set Action should be configured as below :

And fingers crossed, you should now be able to select marks on the trend line and see all the other charts, except the initial summary table, all update. My published viz is here.

Happy vizzin’! Stay Safe!

Donna

How much do top sub-categories contribute to sales?

A colourful #WorkoutWednesday challenge this week, courtesy of Ann Jackson incorporating pie charts, top N functionality and interactivity and a highlight table. Pie charts can cause much debate amongst the data viz community and if this one was just representing the multitude of sub-categories, it certainly wouldn’t be ideal. But when the core aim is to simply present 2 key measures (those in the top N against the rest), the pie is a familiar and effective visual. In this instance, the outer ring segmenting all the sub-categories provides additional context without detracting from the main purpose of the viz.

So lets build…

  • Creating the core calculations
  • Building the Pie Chart
  • Building the Highlight Table
  • Adding the Interactivity

Creating the core calculations

First up, we’re going to need a parameter to define the ‘Top N’. Create an integer parameter with a range from 1 to 17, that steps every 1 interval, and is defaulted to 5.

pTopN

Next we’re going to use a Set to capture the Sub-Categories that are in the Top N Sales. Right click on Sub-Category -> Create ->Set. Use the Top tab to define a set captures the Sum of Sales that is based on the pTopN parameter.

Now, we want to create a grouping of those in and out of the set, which will be used as part of the highlight table

Sub-Cat Group

IF [Sub-Category Set] THEN ‘IN TOP ‘ + STR([pTopN])
ELSE ‘ALL ELSE’
END

Pop all these fields out into a table so you can see what’s going on as you change the pTopN parameter. Sort the Sub-Category by Sales descending.

Now we need to identify the % value of Sales for the Sub-Categories that are in the Top N (this is the label on the darker segment of the central pie chart), so for that we need

Total Sales

{FIXED:SUM([Sales])}

Top N Sales (in hindsight, this should have been named Sales per Group or similar)

{FIXED [Sub-Category Set] : SUM([Sales])}

Top N Sales %

IF ATTR([Sub-Category Set]) THEN
SUM([Top N Sales])/SUM([Total Sales])
END

Format this to percentage with 0 dp.

Adding to the table, we can see the values

The final field we need in order to build the pie, is an additional one to store the label text

Label:SubText

IF [Sub-Category Set] THEN ‘TOP ‘ + STR([pTopN]) END

Building the Pie Chart

To achieve this we’re going to build a dual axis pie chart, where one pie is used to define the In/Out of Top N segmentation in the centre, and the other pie is used to create the outer ring.

Create an axis by typing in MIN(0) onto the Rows shelf, and then adding another instance of MIN(0) next to it. This will generate 2 marks cards, which is where the fields to build the pie charts will be placed.

In the first MIN(0) marks card, change the mark type to Pie, then add Top N Sales to the Angle shelf and Sub-Category Set to the Colour shelf. Adjust colours to suit. Then add Top N Sales % and Label:SubText to the Label shelf. Adjust size of the view and the chart to suit. Also remove all text from the Tooltip.

Positioning the text is a bit fiddly. If you click on the text so the cursor changes to a cross symbol, you can then drag it to a better location. However, when you change the Top N parameter, the text will move. You can go through each parameter value and reposition the text each time (which I did.. it wasn’t too onerous for 17 values), however I found when published to Tableau Public, the positioning wasn’t honoured. Ann’s solution was the same, so I didn’t get too hung up on this, although if anyone resolved it, I’d love to know!

Now on the 2nd MIN(0) marks card, again change the mark type to Pie, and this time add Sales to the Angle shelf and Sub-Category to Colour. Sort the Sub-Category field by Sales descending. Additionally add Sub-Category Set to the Detail shelf (this will be needed later on to make the interactivity work). Edit the colour palette to use the Hue Circle options. Adjust the size of the pie chart. Adjust the tooltip too.

Now make the chart dual axis and synchronize the axis. If the colourful chart is displayed ‘on top’, then right click on the right hand axis and select move marks to back. Adjust the sizes of both pies, so the colour wheel is slightly larger than the other one.

Now hide the axis, and remove all borders and gridlines.

Building the Highlight Table

I’ve built the highlight table as a bar chart. Start off by adding Sub-Category Set, Sub-Cat Group and Sub-Category to Rows. Sort Sub-Category by Sales descending. Then type in MIN(1) into the Columns shelf.

Now add subtotals via the Analysis > Totals > Add all Subtotals menu. This adds 2 additional rows to each section

But we don’t want the ‘grand total’, so click on the Sub-Category Set context menu, and uncheck Subtotals

To position the totals at the top, go to Analysis > Totals > Column Totals To Top

Then add Sub-Category to the Colour shelf, and adjust the colour of the Total bar to white

We now need to get some text onto those bars, but we need some additional calculations to help up with this. Firstly, we want to get the rank of the Sub-Category. We’ll use a table calculation for this

Sales Rank

RANK(SUM([Sales]))

We also need a way to identify the Total rows differently from the main Sub-Categories. I referred to this Tableau KB for help here, and subsequently created

Size

SIZE()

To see what this is doing, add Size to the Label shelf, and adjust the table calculation setting to compute by all fields except the Sub-Category Set. The size of the total rows is 1.

Based on this logic, we can then create

LABEL:Bar

IF [SIZE]=1 THEN ‘SUBTOTAL FOR GROUP’
ELSE ‘#’+STR([Sales Rank]) + ‘ ‘ + ATTR([Sub-Category])
END

Add this to the Label shelf instead of the Size field and adjust the table calc settings as above. Align left. Then add Sales to the Label shelf too and adjust so its on the same row. Adjust the tooltip too.

Now hide the Sub-Category Set and the Sub-Category fields. Right click on the ‘IN TOP x’ text and Rotate Label, then click on Sub-Cat Group text and Hide Field Labels for Rows. Format the header text to suit.

Hide the MIN(1) axis, and set columns and gridlines to None. Adjust the Row dividers to be darker

Adding the Interactivity

Add the 2 sheets onto a dashboard, and add a Highlight Dashboard Action, that on Hover of either of the charts, it highlights the other chart based on the Sub-Category Set only.

I think that’s covered everything. My published viz is here.

Happy vizzin’! Stay Safe!

Donna

How many consecutive starts?

Another table calculation related challenge this week, set by Luke, visualising cumulative starts for NFL Quarterbacks per team from 2006.

Luke provided the data within a Tableau workbook template on Tableau Public, so I started by downloading the workbook and understanding the data structure.

The challenge talks about teams playing over 17 weeks, but the data showed some data associated to weeks 28-32. So I excluded these weeks by filtering them out.

I then started to build out the data in tabular form, so I could start to build up what was required. I added Team, Season, Week and Player ID to Rows, and just to reduce the amount of data I was working with while I built up the calcs, chose to filter to the Teams ARZ, ATL & BLT.

What we’re looking to do is examine each Player ID and work out whether it is the first record for the Team or whether it differs from the previous row’s data. If so then we’re at the ‘start’ of a run, so we record a ‘counter’ value of 1. If not, the values match, so we need to increment the counter.

We’ll do this in stages.

Firstly, let’s get the previous Player ID value.

Prev Player ID

LOOKUP(MIN([Player Id]),-1)

This ‘looks up’ the record in the previous (-1) row. Change this field to be discrete and add to the Rows. Set the table calculation to compute by all fields except Team.

Each Prev Player ID value matches the Player ID from the row before, unless its the first row for a new Team in which case the value is Null.

Then we can create a field to check if the values match

Match Prev Player ID

MIN([Player Id])=[Prev Player ID]

Add this to the view and set the table calc as above, and the data shows True, False or NULL

Now we can work out the consecutive streak values

Consecutive Streak

IF (NOT([Match Prev Player ID])) OR ISNULL([Match Prev Player ID]) THEN 1
ELSE 1+PREVIOUS_VALUE(-1)
END

If we don’t match the previous value or we’re at the start of a new team (as value is NULL), then start the streak by setting the counter to 1, otherwise increment the counter. Add this to the view and set the table calc for both the nested calculations as per the settings described above.

Next we need to identify the last value of the Consecutive Streak for each Team.

Current Streak

WINDOW_MAX(IF LAST()=0 THEN [Consecutive Streak] END)

The inner IF statement, will return the value of Consecutive Streak stored against the last row for the Team. All other rows will be Null/blank. The WINDOW_MAX() statement then ‘spreads’ this value across all the rows for the Team.

Add this onto the view, and set the table calc for all the nested calcs.

Finally, we need one more bit of data. The chart essentially plots values from 2006 week 1 through to 2020 week 17. We need to ‘index’ these values, so we have a continuous week number value from the 1st week. We can use the Index table calculation for this

Index

INDEX()

Add this field to the view, set it to be discrete (blue pill) and position after the Week field on the Rows. Set the table calc as usual, so the Index restarts at each Team.

Now we’ve got all the data points we need, we can build the viz. I did this by duplicating the tabular view and then

  • Remove Prev Player ID and Match Prev Player ID
  • Move Season, Week and Player ID from Rows to Detail
  • Move Current Streak from Text to Rows and change to be discrete (blue)
  • Move Index from Rows to Columns and change to be continuous (green)
  • Move Consecutive Streak from Text to Rows
  • Change mark type to bar, and set to fit width to expand the view.
  • Change Size to be Fixed, width size 1 and aligned right
  • Set the border on the Colour shelf to be None.
  • Remove the Team filter and adjust the row height

All that’s left now is to set the tooltip (add Player to the Tooltip shelf to help this), and then apply the formatting. You can use workbook formatting (Format -> Workbook menu) to set all the font to Times New Roman.

Hopefully this is enough to get you to the end 🙂 My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Visualise Our Survey Data

This week, Ann Jackson set a table calculations based challenge, using the responses from a recent survey on #WorkoutWednesday, as the most requested topic was for table calcs!

There’s a lot of visuals going on in this challenge, and I’m shortly off on my holibobs (so will be playing catch up in a couple of weeks), so I’m going to try to pare down this write up and attempt just to focus on key points for each chart.

Donut Chart

By default when you connect, Respondent is likely to be listed in the ‘Measures’ part of the data pane – towards the bottom. This needs to be dragged into the top half to turn it into a dimension. You can then create

# of Respondents

COUNTD([Respondent])

which is the key field measures are based on throughout this dashboard.

When building donuts, we need to get a handle on the % of total respondents for each track, along with the inverse – the % of total non-respondents for each track. To do this I created fields

Total # of Respondents

TOTAL(COUNTD([Respondent]))

and then

Track – % of Total

[# of Respondents]/([Total # of Respondents])

along with the ‘inverse’ of

Non Track – % of Total

1-[Track – % of Total]

To then build the donut, we ultimately need to create a dual axis chart, with Which track do you participate in? on Columns and a MIN(1) field on Rows. Manually reorder the entries so the tracks are listed in the relevant order.

On the first MIN(1) axis/marks card, build a pie chart. Add Measure Names to the filter shelf and filter to the Track & Non Track % of Total fields. Set Mark Type to Pie Chart and add Measure Values to the angle shelf. Add both Which track do you participate in? and Measure Names to the Colour shelf. Set a white border on the Colour shelf. Reorder the entries in the colour legend, and set the colours appropriately.

The create another MIN(1) field next to the existing one on the Rows shelf

Set this marks type to circle, and remove all the fields from the colour & detail shelves. Set the colour to white. Add Which track do you participate in? and Track – % of Total to the Label shelf and format. Reduce the Size. Make dual axis, and synchronise. Further adjust sizes to suit.

Participation Bar Chart

Plot How often do you participate? against # of Respondents, and then add a Quick Table Calculation to the measure using Percent of Total. Manually re-sort the order of the entries, show mark labels and Colour the bars light grey. Apply relevant formatting.

Diverging Bar Chart

In this bar chart, the percentage of ‘agree’ responses are plotted to the right on the +ve scale and the percentage of the ‘disagree’ responses are plotted to the left on the -ve scale. The percentage of the ‘inbetweeners’ (neither agree nor disagree) is halved, and displayed on both sides. To address this, I created the following:

# of Respondents – Diverging +ve

CASE ATTR([Answer])
WHEN ‘Agree’ THEN [# of Respondents]
WHEN ‘Strongly Agree’ THEN [# of Respondents]
WHEN ‘Neither Agree nor Disagree’ THEN [# of Respondents]/2
END / [Total # of Respondents]

This is the % of total respondents for the ‘agree’ responses and half of the ‘inbetweeners’.

Similarly I then have

# of Respondents – Diverging -ve

(CASE ATTR([Answer])
WHEN ‘Disagree’ THEN -1*[# of Respondents]
WHEN ‘Strongly Disagree’ THEN -1 *[# of Respondents]
WHEN ‘Neither Agree nor Disagree’ THEN ([# of Respondents]/2) * -1
END) / [Total # of Respondents]

which is doing similar for the ‘disagree’ responses, except all results are multiple by -1 to make it negative.

The Question field is added to the Filter shelf and the relevant 5 questions are selected. Answer is also on the Filter shelf with the N/A answer excluded.

Add Question to Rows (and manually sort the entries), then add # of Respondents – Diverging -ve and # of Respondents – Diverging +ve to Columns and add Answer to the Colour shelf. Manually resort the entries in the colour legend and adjust the colours accordingly.

Make the chart dual axis, and synchronise the axis. Change the mark type back to bar and remove Measure Names from the colour shelf if it was added. Edit the bottom axis to fix the range from -0.99 to 0.99 and amend the title. Format the axis to display as percentage to 0 dp. Hide the top axis.

Additionally format both the measures to be percentage 0dp, but for the # of Respondents – Diverging -ve custom format, so the negative value is displayed as positive on the tooltip.

Adjust formatting to set row banding, remove gridlines etc and set tooltips.

Vertical bar chart

The best way to start building this chart is to duplicate the diverging one. Then remove both measures from the Columns shelf and add Answer to Columns. Manually re-sort the answers. Add # of Respondents to Rows and add a Quick Table Calculation of percent of total.

Show the marks label, and align bottom centre, and match mark colour. Hide the axis from displaying, and also hide the Question field (uncheck show header). Update the Tooltip.

Heatmap

Right click on the Question field > Aliases and set the alias for the relevant questions

Also add Question to Filter and select relevant values. Add Answer to Filter too and exclude NULL.

Add Question to Columns and add Answer to the Text shelf. Add # of Respondents to the Text shelf, and set to Percent of Total quick table calculation. Edit the table calculation to compute using the Answer field only

We need to get each of these columns ‘sorted’ from high to low – we want to rank them. To do this, add # of Respondents to Rows, then change it to be a blue discrete pill. Add a Rank quick table calculation and once again set to compute by Answer only. Also set the rank to be Unique

Now change the mark type to square, and then add the # of Respondents percent of total field onto Colour as well as Text (the easiest way to do this to retain all the table calc settings, is to hold down Ctrl then click and drag the pill from the Text shelf onto Colour. This should duplicate the pill.

Format the % of total displayed to be 0dp, and adjust the label. Change the Colour to use the purple range and set a white border too. Hide the ‘rank’ field from displaying and hide field labels for columns too.

The dashboard

I used a vertical container then added the objects as required, using nested horizontal containers to organise the side by side charts.

To make the diverging bar and vertical bar charts look like they are one chart, adjust the padding of diverging bar chart object to have 0 to the right, and similarly, adjust the padding of the vertical bar to have 0 padding to the left.

I found it a bit fiddly to get the charts to line up exactly. Both charts were set to fit entire view. The diverging bar chart displays it’s title. I also displayed a title on the vertical bar chart, but made the text white so it’s invisible.

Dashboard filter actions are set against the donut and the participation bar charts.

The filter uses selected fields, which for the donut chart references the Which track do you partcipate in? field. A similar dashboard action needs creating for the participation chart as the source and references the How often do you partcipate? field.

A highlight dashboard action is required for the diverging and vertical bar charts. They only impact each other and should be set up as below on hover.

Hopefully I’ve covered everything… my published version is here.

Happy vizzin’! Stay Safe!

Donna

Let’s go streaking!

It was Sean Miller’s turn to set the challenge this week, where the primary focus was to find the highest number of consecutive months where the monthly sales value was higher than the previous month.

This was a table calculations based challenge, and I always tackle these by building out the data required in a tabular format. The challenge was also reminiscent of a previous challenge Sean has set, which I’ve blogged about here, and admit I used as a reference myself.

So let’s get started.

To start with, we need the month date, the Sub-Category, the Sales value and the difference in Sales from the previous month. For the month date, I like to define this explicitly

Order Date Month

DATE(DATETRUNC(‘month’,[Order Date]))

This aligns all Order Dates to the 1st of the relevant month.

Add Sales Category, Order Date Month (set to discrete exact date blue pill), and Sales into a view, then set a Quick Table Calculation of Difference on the Sales pill

Edit the table calculation to compute by Order Date Month only, so the previous calculation restarts at each Sub-Category.

Then drag this pill from the marks card into the left hand data pane to ‘bake’ the calculated field into the data model. Name the field Sales Diff. The re-add Sales back into the view too, so you can double check the figures.

Identify whether there is an increase with the field

Diff is +ve

IF [Sales Diff]>0 THEN 1 ELSE 0 END

Add this into the view too, and verify the calculation is computing by Order Date Month only again.

Now we need to work out if the row matches the previous value

Match Prev Value

LOOKUP([Diff Is +ve],-1) = [Diff Is +ve]

The LOOKUP is looking at the previous row (identified by the -1) and comparing to the current. If they match then it returns True else False.

Again add into the view, and again double check the table calc settings. In this case there is nested calculations so you need to double check the settings against each calc referenced in the drop down

Now we need to work out when there are consecutive increases, and how many of them there are

Increase Streak

IF (NOT([Match Prev Value])) AND ([Diff Is +ve] = 1) THEN 1
ELSEIF [Diff Is +ve] = 1 THEN ([Diff Is +ve]+PREVIOUS_VALUE([Diff Is +ve]))

END

If the current row has a +ve difference and the previous row wasn’t +ve, then we’re at the start of an increase streak, so set to 1. Else, if the current row has a +ve difference then we must be on a consecutive increase, so add to the previous row, and this becomes a recursive calculation, so builds up the values..

Add this onto the view, set the table calc settings, and you can see how this is working…

So now we’ve identified the streaks in each Sub-Category, we just want the maximum value.

Longest Streak

WINDOW_MAX([Increase Streak])

Add this and set the table calc setting again. You’ll see the max value is spread across every row per Sub-Category.

Finally we need to identify Sales values in the months when the streak is at its highest.

Sales of Month with Longest Streak

IF [Longest Streak]=[Increase Streak] THEN SUM([Sales]) END

Add this into the view again (don’t forget those table calc settings), and you’ll notice that for some Sub-Categorys there are multiple points with the same max streak

With all this we can now build the viz, which is relatively straight forward….

Add Order Date Month (exact date, continuous green pill) to Columns, Sub-Category to Rows and Sales to Rows. Edit the Sales axis to be independent, then change the line type of the Path to stepped

Add Sales of Month with Longest Streak to Rows and set to dual axis, and synchronise. Make sure the mark type of the 2nd axis is set to circle, and remove Measure Names from the colour shelf of both marks.

Manually set the colour of the line chart to grey. Add Longest Streak to the Colour shelf of the circle marks card. Adjust the colour to use the green palette, set to stepped of 5 value and ensure the range starts at 0 and ends at 5 (don’t forget to edit the table calc settings!).

Now add Longest Streak as a discrete blue pill to the view too.

This is all the core components. The last thing we need to do is sort the list. I wasn’t entirely sure how it had been sorted, apart from the largest Longest Streak at the top. I created a new field for this

Sort

[Longest Streak]*-1

and added this as a blue discrete pill in front of Sub-Category….

…, then hid the column.

Then just apply the tooltip and relevant formatting on the chart.

For the legend, I created a new field

Legend

CASE [Sub-Category]
WHEN ‘Art’ THEN 0
WHEN ‘Chairs’ THEN 1
WHEN ‘Labels’ THEN 2
WHEN ‘Paper’ THEN 3
WHEN ‘Phones’ THEN 4
ELSE 5 END

and added this into a new sheet as below

The components then just need to be added to the dashboard. My published version is here.

Happy vizzin’! Stay Safe!

Donna

Profitability with Dual Axis Charts

Luke Stanke returned for this week’s challenge, to build a pareto chart & bar chart on an unsynchronised dual axis. The crux of this challenge is table calculations, so as with any challenge like this, I’m going to build out what I need in tabular form first, so I can thoroughly validate I’m getting the right values. Once that is done, I’ll build the chart, then finally I’ll look at how to get the measures needed for the subtitle text.

  • Defining the core calculations
  • Building the chart
  • Working out the measures for the subtitle

Defining the core calculations

For the pareto, we need to plot % of orders against cumulative profit, so we need to build up some fields to get to these.

Add Order ID to Rows and Profit to Text and sort by Profit descending.

For the cumulative profit, we can add a Running Total Quick Table Calculation to the Profit pill

Add another Profit pill back into the view, and you can see how the table calculation is adding up the values of the Profit from the previous rows.

The triangle symbol indicates the field is a table calculation. By default, if you edit the table calculation, the calculation is computing down the table. I always choose to ‘fix’ how my calculations are computing, so that the values don’t inadvertently change if I move the pill elsewhere. So I recommend you set the table calc to Compute Using Order ID

I also want to ‘bake’ this table calculation into the data model (ie create a dedicated calculated field) that I can pick up and reuse. The simplest way to do this is to press Ctrl, then drag the field into the left hand data field pane (this will effectively copy the field rather than remove it from the view). Name the field and then you can verify it’s contents.

Cumulative Profit

RUNNING_SUM(SUM([Profit]))

So that’s one of the measures we need. Onto the next.

First of all we need to get a cumulative count of the number of orders.

Count Orders

COUNTD([Order ID])

Add this to the measures and it will display the value 1 per row (since each row is an order). Add a Running Total Quick Table Calculation to this field too, and again set to Compute Using Order ID. ‘Bake’ this into the data model too, by dragging the field as described above, and create a new field

Cumulative Order Count

RUNNING_SUM([Count Orders])

Now we need to get a handle on the total number of orders. I could do this with a LoD, but will stick with table calcs

Total Order Count

WINDOW_SUM([Count Orders])

Add to the view, and compute using Order ID again.

Now we can calculate the cumulative % of total orders

Cumulative % of Total Orders

[Cumulative Order Count]/[Total Order Count]

Format this to a % with 2 dp.

Add to the view and again compute using Order ID. You should see the values increase until 100%.

NOTE – I could have got this value by adding a Running Total table calculation to the Order Count field, and then editing that table calculation and adding a secondary table calculation to get to the % of total. However, I want to be able to reference the output of this field later on, so having a dedicated calculated field is the better option.

Ok, so now we have the 2 measures we need to plot the basic chart – Cumulative Profit and Cumulative % of Total Orders.

Building the chart

I typically start by duplicating the data sheet and then moving pills around

  • Duplicate Sheet
  • Remove Cumulative Order Count and Total Order Count
  • Move Order ID to the Detail shelf. Reset the sort on this pill to sort by Profit Descending
  • Remove Measure Names
  • Move Cumulative Profit to Rows
  • Move Cumulative % of Total Orders to Columns
  • Move Profit to Tooltip
  • Change mark type to Line
  • Add Sales to Tooltip and adjust tooltip accordingly

The chart needs to be coloured based on whether the marks has a profit > 0 or not. So for this we need

Profit is +ve

SUM([Profit]) >0

Add this to the Colour shelf and adjust accordingly.

Now we can add the second axis by adding Sales to the Rows shelf, then

  • Change mark type of the Sales marks card to bar
  • Remove the Profit is +ve field from the Colour shelf
  • Change the size to the smallest value
  • Adjust the tooltip
  • Make dual axis
  • Bring the Cumulative Profit axis to the front (right click on the axis > move marks to front)

Now the chart just needs to be formatted

  • remove column and row borders
  • edit the axis titles
  • format all the axes to to 8pt, and change the font of the axis title to Times New Roman
  • format the % of Total Orders axis to be 0dp

Working out the measures for the subtitle

For this, we are going to revert back to the tabular view.

We need to identify the point at which the Profit value starts to become negative. Let’s add the Profit is +ve field to Rows.

We’re looking for the row highlighted, which is the row where the previous value is true, while itself is false, which is achieved by

Profitable Marker

LOOKUP([Profit is +ve],-1) AND NOT([Profit is +ve])

Let’s add this now (ensuring the compute using Order ID)

We need to get a handle on the Cumulative % of Total Orders value for this row, but spread it across all the rows in data set, which we can do by

% of Total Profitable

WINDOW_MAX(IF [Profitable Marker] THEN [Cumulative % of Total Orders] END)

Add this on, compute by Order ID, and you can see the value for the ‘true’ line is displayed against every row. Format this field to % 0 dp.

For the potential profitability decrease, we need to get the Cumulative Profit value for the Profitable Marker row, along with the final (total) Cumulative Profit value.

Total Cumulative Profit

WINDOW_MAX(IF LAST()=0 THEN [Cumulative Profit] END)

This takes the value from the very last row in the data and again spreads across the all the rows.

With this, we can now work out the potential decrease

Potential Profitability Decrease

WINDOW_MAX(IF [Profitable Marker] THEN ([Cumulative Profit]-[Total Cumulative Profit])/[Cumulative Profit] END)

of the profitable marker row, take the difference between the ‘current’ cumulative profit and the final cumulative profit, as a proportion of the current value. Spread this across every row. Format to % of 0dp.

Now, as we have worked out these 2 values, % of Total Profitable and Potential Profitability Decrease to be the same across every row, you can add them to the Detail shelf of the All marks card on the chart viz, and reference them in the Title of the viz. (Don’t forget to ensure all table calc fields are set to compute using Order ID).

My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Sales Goal Selection Tool

Ann Jackson set this week’s #WOW2021 challenge, based on a recent ‘real world’ situation she had encountered.

Analysing Ann’s solution (by interacting with her published solution), I deduced we’d need to use set actions to add and remove the selected Sub-Categories into and out of the set (I hadn’t noticed Ann had tagged the challenge on the main page with Set Actions 🙂 ). I also realised the initial visual in the first column, wasn’t using reference lines to depict the target, as the tooltip displayed on hover, was much more detailed than what you can add to a reference line tooltip.

So armed with this knowledge, I set about building what was required.

  • Defining the required calculations
  • Building the BANs
  • Building the viz
  • Adding the interactivity

Defining the required calculations

This is one of those challenges where I want to get all the calcs sorted up front in a tabular view, before even attempting the viz. I’ll go through what I ended up with, but be assured this did take a bit of time and change of direction to get what I needed.

Let’s start with parameters. Firstly I created a parameter to simulate ‘today’, that is hardcoded to 9th June 2021

I then created a parameter to store the % uplift we want to use for one of the goal options. I default this to 0.1 (ie 10%)

We need to work out the sales so far this year (ie the sales in 2021 up to and including 9th June 2021), and the equivalent sales for the previous year (ie the sales in 2020 up to and including 9th June 2020).

SALES YTD by Sub Cat

IF YEAR([Order Date]) = YEAR([pToday]) AND [Order Date] <= [pToday] THEN [Sales] END

SALES LY YTD by Sub Cat

IF YEAR([Order Date]) = YEAR([pToday])-1 AND [Order Date] <= DATEADD(‘year’,-1,[pToday]) THEN [Sales] END

Let’s pop these into a table with the Category and Sub-Category fields, and add grand totals, so we can see what figures we need to be aiming for the BANs at the top.

Now let’s work out the values for the goals.

One goal is based on finding the average sales per day in 2021, then extrapolating this across the whole year.

So firstly, we need to know how many days in the year up to ‘today’.

# Days So Far This Year

DATEDIFF(‘day’,DATETRUNC(‘year’,[pToday]), [pToday] ) +1

This finds the number of days between 01 Jan 2021 and 09 June 2021 (then adds 1, as we need to include 9th June too).

# Days in Year

DATEDIFF(‘day’,DATETRUNC(‘year’,[pToday]), DATEADD(‘year’,1,DATETRUNC(‘year’,[pToday])))

This finds the number of days between 01 Jan 2021 and 01 Jan 2022.

These methods ensure the right number of days is recorded if the ‘current’ year happens to be a leap year.

We can now work out one of the goals

Same Pace Goal

(SUM([SALES YTD by Sub Cat])/[# Days So Far This Year]) * [# Days In Year]

Add up the Sales so far this year and divide by the number of days so far to get the average sales value per day, then multiply by the total number of days in the year.

For the other goal, we need to determine the sales for the whole of the previous year, and multiple by the uplift %.

SALES LY

IF YEAR([Order Date]) = YEAR([pToday])-1 THEN [Sales] END

This gives the total sales for 2020.

LY + Percent Increase

SUM([SALES LY])*(1+[pPercentIncrease])

This applies the % increase parameter to the total sales for 2020.

Now we want to decide which of the goal values to use based on ‘selection’. To define the selected Sub-Categories, we’re going to use a set. Right-click on Sub-Category > Create > Set, name accordingly and select ‘Phones’ as the initial value in the set.

Use Last Year With Increase

Now we can use whether the record is in or out of the set in the logic to determine the goal to use

YEAR_END GOAL per Sub-Cat

IF ATTR([Use Last Year With Increase]) then [LY + Percent Increase Goal] ELSE [Same Pace Goal] END

Add the set and the goal field to the table, and you’ll see the value in the final column is matching the relevant previous columns, depending on whether the Sub-Category is IN or OUT of the set.

NOTE – when you do this, you’ll get In or Out displayed against each row. To get the description as I’ve got displayed, right click on the text In or Out and Edit Alias.

These data fields are going to be used to build out the central viz. There’s a couple of other fields we need to finish off the data requirements for the viz

Value to Goal

[YEAR-END GOAL Per Sub Cat]-SUM([SALES YTD by Sub Cat])

and

% of Goal

SUM([SALES YTD by Sub Cat])/[YEAR-END GOAL Per Sub Cat]

This field needs to be formatted to percentage with 0 dp (all other monetary fields need to be formatted to $ with 0 dp).

So now we have the core fields stored against each row that will help us build out the main viz. You can edit the set and add further sub-categories, so you can validate how the values change.

Now we want to work on the data we need for the BANs. You may think the grand totals sum up all the rows above, however while this works fine for the SALES YTD by Sub Cat and SALES LY YTD by Sub Cat columns, it isn’t the case for the YEAR-END GOAL by Sub Cat column.

If we select all the rows in that column, and examine the tooltip that appears on hover, the total of the selected columns differs from the total at the bottom of the column.

The value in the hover text is what we need. The discrepancy with the column grand total is because it’s working across the whole data and not row by row. Because at least 1 record is in the set, its taking the logic to use the same pace goal. You’ll notice the total of this column matches the total of the Same Pace Goal column. In fact as you add more values to the set, the total will match the Same Pace Goal total up until all values are in the set. At that point the total will match the LY + Percent Increase column.

So back to the summary measures. We’re going to use table calcs to solve this.

SALES YTD

WINDOW_SUM(SUM([SALES YTD by Sub Cat]))

SALES LY YTD

WINDOW_SUM(SUM([SALES LY YTD by Sub Cat]))

YEAR-END GOAL

WINDOW_SUM([YEAR-END GOAL Per Sub Cat])

These are all basically summing up the values displayed in the rows on the screen. Add them to the table, and you can see the same values are displayed on each row which tally to other data in the table

Right, we have all the data fields we need, let’s start building.

Building the BANs

On a new sheet, add the fields as below

We want to turn this into 1 row.

Create a field called

Index

INDEX()

and add this to the Rows shelf and change to be discrete (blue pill). Each row should be numbered from 1 to 17.

Now drag the Index field from the Rows shelf onto the Filter shelf and when prompted select 1.

You’ve now got 1 row displayed, but the data associated to all the rows is being included in the calcs (if you’d filtered just to Sub Category = Accessories for example, the date would be just related to the rows with the Accessories value).

Now we can tidy this to look how we want

  • Add Measure Names to Text shelf (change display to entire view if need be).
  • Hide Sub-Category from displaying (uncheck Show Header)
  • Right Click on each column title and Edit Alias to remove the ‘along…’ text
  • Format the text appropriately and align centrally
  • Adjust the tooltip to remove the Sub-Category info
  • Hide the Measure Names from displaying (uncheck Show Header).
  • Format the Row Dividers to be thick

Building the Viz

Add Category and Sub-Category to the Rows shelf on a new sheet. Change the sort of the Category field to sort by data source order, descending. Change the sort of the Sub-Category field to sort as below

Add SALES YTD by Sub Cat to Columns, then drag SALES LY YTD by Sub Cat onto the canvas and drop onto the SALES YTD by Sub Cat axis when you see the 2 columns appear

Drag Measure Names from the Rows shelf to the Size shelf. Add Measure Names to the Colour shelf. Adjust sizes and colours accordingly. Add a white border to the bars on the colour shelf.

Turn stack marks off (Analysis > Stack Marks > Off) to allow the bars to overlay each other.

Add both SALES YTD by Sub Cat and SALES LY YTD by Sub Cat to the Tooltip shelf, so you can reference both values regardless which bar you’re hovering over.

Add YEAR-END GOAL by Sub Cat to the Columns shelf and set to dual axis and synchronise axis. Reset the mark type of the Measure Values card to bar, and of the YEAR-END GOAL by Sub Cat card to Gantt.

Remove the Measure Names from the Size and Colour on the YEAR-END GOAL by Sub Cat card. Set the colour of this mark to black.

On the All marks card, add YEAR-END GOAL by Sub Cat to the Tooltip shelf so once again it’s value can be referenced by all marks.

Adjust the tooltip on the All marks card to match.

‘Type in’ MIN(0) to the Columns shelf (double click in the space next to the pills to enable the text edit feature). On this MIN(0) marks card, change mark type to Text and add Value to Goal to the Text shelf. Adjust the text to include the additional wording and adjust size too if need be. Adjust tooltip too.

Repeat this process for the % of Goal field too.

For the circles, create another MIN(0) column in the same way, but this time change mark type to circle, and add Use Last Year with Increase set to the Colour shelf and adjust. Adjust tooltips.

Format axes/gridlines/text/row banding accordingly and rotate text of the Category field.

Adding the interactivity

Add the sheets to a dashboard. Add a set action to add Sub-Categories to the set on click (Dashboard > Action > Add Action > Change Set Values). Set the action to run on select (on click) of the chart viz, to add values to the Use Last Year with Increase set.

We then need another set action to remove the Sub-Categories, but this needs to work by clicking a link in the tooltip which displays. The name of this action will display on the tooltip.

And fingers crossed.. that should be it! My published viz is here.

Happy vizzin’! Stay Safe!

Donna