Prep that and build this!

It’s a double whammy this week – a #PreppinData & #WOW2022 combo! And I’m going to attempt to blog a solution guide to both! Wish me luck… it may take some time!

The #PreppinData & #WOW2022 crew combined with the requirement to prep some Salesforce data with Tableau Prep as per the challenge here, and to then use the output to visualise in Tableau Desktop as per the challenge here.

Prepping the Data

Two files were provided for the input: Opportunity provided 1 row per opportunity, and Opportunity History providing multiple rows per Opportunity, where each row indicated the stage in the lifecycle the opportunity goes through.

The first requirement was to pivot the Opportunity data, to get 2 rows per Opportunity; 1 indicating the date opened, and the other indicating the date closed (or expected to be closed).

After connecting to the Opportunity date and adding a Clean step to view the contents, add a Pivot step to transform Columns to Rows, adding the CloseDate and CreatedDate fields to the pivot.

Before adding any further steps, rename Pivot1 Names to Stage and rename Pivot1 Values to Date.

Add a Clean Step.

Change the CloseDate value in the Stage field to be renamed to ExpectedCloseDate, and the change the CreatedDate value in the Stage field to be renamed to Opened. To do this, simply double click on the value and type in.

We then need to further update this field, based on whether the Opportunity record is closed or not. The requirement said to refer to the StageName field for this, even though there is an IsClosed field. I chose to stick with the requirements.

Create a calculated field

Stage

IF CONTAINS([StageName], ‘Closed’) AND [Stage]=’ExpectedCloseDate’ THEN [StageName] ELSE [Stage] END

Having applied this logic, your Stage field should now contain 4 values rather than 2.

Then remove all fields except for Date, Stage, Id and StageName, and rename the step Opportunity.

We now need to combine with the OpportunityHistory data. This data contains a row for each stage the Oppotrunity goes through. What we’re looking to do is to supplement this with a Stage = Opened record, and, in the event the opportunity hasn’t been closed, a Stage = ExpectedCloseDate record.

For this we’ll need a Union step.

Add in the OpportunityHistory data, and add a Clean step. Rename this step to History. To help with the union step, rename the CreatedDate field to Date. In the Opportunity step above, rename Id to OppID.

Add a Union step to the Opportunity path. Then drag History and add it to the Union. Add a Clean step and view the data. If the renaming worked ok, you should have 6 fields.

Update the SortOrder field by creating a new calculated field

SortOrder

If [Stage]=’Opened’ THEN 0
ELSEIF [Stage]=’ExpectedCloseDate’ THEN 11
ELSE [SortOrder] END

Update the Stage fields for those which have null records.

Stage

IF ISNULL([Stage]) THEN [StageName] ELSE [Stage] END

All the records with SortOrder = null are actually duplicates now, as we have them captured as part of the pivoting step we did initially. All these records can therefore be excluded (click on the null value in the SortOrder field, right click and Exclude).

Now remove the redundant fields of Table Naems, Stage Name, and you should be left with 4 columns and 876 rows of data, which you can output to csv or similar.

Building the Viz

Now we’ve got the data sorted, we can build the viz. If you haven’t done the PreppinData challenge, you will need to download the Opportunity.csv input file and the provided Output.csv files from the website.

Modelling the data

In Tableau Desktop, connect to the file generated in the above process (or download the output file from the PreppinData site if you haven’t built your own). My file was called 2022_06_08_SalesOpps.csv.

Then add an additional connection to the Opportunity.csv file you used in the Prep challenge (or again download from the PreppinData site).

Add a relationship from Opp ID to Id

Building the Open Opportunities chart

We need to work out what Lorna has used to identify an ‘open’ opportunity. After a bit of trial and error, I discovered it was any opportunity that had an ExpectedCloseDate stage. To identify these, create a new calculated field

Has Expected Close Date Stage

{FIXED [Opp ID]: SUM(IF [Stage]= ‘ExpectedCloseDate’ THEN 1 ELSE 0 END)}

This will return the value 1 against all the rows for an Opp ID which have an ExpectedCloseDate Stage, and 0 for those that don’t.

As you can see below, all the rows for the 1st two Opp IDs listed are flagged with 1 as they have at least 1 stage which is ExpectedCloseDate. For the other Opp IDs listed, they are all 0.

Right click this field and drag to the Filter shelf. When you release the mouse, choose the All values option from the dialog displayed, then Next

Then select 1 to filter the sheet just to the the Opportunity records we care about.

Add Opp ID and Name to Rows. Add Date to Columns as a continuous exact date (green pill).

Change the mark type to Circle and add the Stage field to Colour, and adjust accordingly using the relevant colour palette.

To add the grey lozenge marks, add another instance of Date next to itself. This will create a second marks card called Date2. Remove the Stage field from the Colour.

Change the mark type of the Date2 card to line, and adjust the colour to a light grey and the size to be a bit larger.

Make this chart dual axis and synchronise the axis. Move the lozenge marks ‘to the back’ (right click on the top axis and move marks to back.

We need to identify those with a Close Date in the past. For the purposes of this exercise, I hardcoded ‘today’ into a parameter pToday and set it to 8th June 2022. Otherwise in a ‘live’ situation I would refer to the TODAY() function rather than the parameter.

Past Close Date?

IF [Close Date]<[pToday] THEN ‘●’ ELSE ” END

I use this site to get the circle mark.

Add this field to the Rows before the Opp ID field. You won’t get the symbol against every row, but don’t worry about that just yet. Format this circle to be right aligned and red font.

Now we need to mange the sorting. Firstly create a new parameter

pSortBy

an integer containing values 1 and 2, defaulted to 2 where the values displayed are aliased as below

To determine how long an opportunity has been open we need

Days Open

DATEDIFF(‘day’, [Created Date], [Close Date])

We then need a field to utilise this parameter and determine the measure we can use to sort

Sort By

CASE [pSortby]
WHEN 1 THEN SUM([Days Open]) * -1
ELSE INT(MIN([Close Date]))
END

If we’re sorting based on the Longest Open we’ll use the number of days the opportunity has been open. By default the data will ascend from smallest to largest value but we want the opposite, so we multiple by -1.

If we’re sorting based on the Close Date. then we can just use the Close Date field itself, converted into a integer.

Add Sort By to Rows , change it to be discrete (blue) and then move it to be the first pill on the Rows shelf. The data should now be reordered, and you’ll now get a red circle per row.

Add the pSortBy parameter to the sheet and test the sorting. Once happy, hide the Sort By field (right click and uncheck Show Header), then Hide Field Labels for Rows. Remove row & column dividers and make the first column narrow. You should now have your chart.

Building the legend

Lorna’s decided the legend should be circles rather than the ‘out of the box’ squares, so a custom sheet is required.

On a new sheet, add Sort Order discrete dimension (blue pill) and Stage to Rows. Add Stage to Filter and exclude Closed Lost and Closed Won. We’re going to need to organise these Stages into 2 rows and 3 columns, so we need some fields to help.

Row Index

IF [Sort Order] ❤ THEN 1 ELSE 2 END

Column Index

IF [Sort Order] = 0 OR [Sort Order] = 4 THEN 1
ELSEIF [Sort Order]=2 OR [Sort Order]= 8 THEN 2
ELSE 3 END

Note I’m very much ‘hardcoding’ this as this is acceptable for this view I believe.

Add Row Index as discrete dimension to Rows and Column Index as discrete dimension to Columns, and move Stage to Label.

Type in FLOAT(MIN(0)) into Columns to create a fake axis. Change mark type to circle, and add Stage to Colour. Increase the width of each row if all your text isn’t showing.

Edit the axis so it is fixed to start at -0.1 and end at 0.5 – this will shift the circles to the left.

Uncheck Show Header against the Row Index, Column Index and the MIN(0) axis. Then remove all gridlines and row/column dividers. Turn off tooltips.

The Summary View

Now we need to work on the summary values at the top of the screen. We’re focussing on ‘open’ opportunities in the current month, which is June 2022 (based on the pToday parameter). We need Has Expected Close Date Stage = 1 on Filter, and to identify the current month we use

Is This Month?

DATETRUNC(‘month’, [Close Date]) = DATETRUNC(‘month’, [pToday])

Add this to Filter and set to True

Now let’s have closer look at the rows of data we’re got so far.

Add Opp ID, Sort Order (discrete dimension) and Stage to Rows and Amount to Text.

What we’re looking for is to aggregate the Amount by Stage, but if we remove the Opp ID and Sort Order fields, we don’t get the right values

This is because we’ve got multiple rows per Opp ID, so the value is counting in multiple Stages. We want to just limit the rows to the latest Stage before the ExpectedCloseDate stage.

First lets work out the maximum Sort Order value for each Opp ID that isn’t the Expected Close Date row (which has a Sort Order = 11).

Max Stage Sort Order

{FIXED [Opp ID]: MAX(IF [Sort Order]<=10 THEN [Sort Order] END)}

Add this to the tabular view as a discrete dimesion. Hopefully you can see from the image below that we’re getting the value we want.

Now let’s identify the matching row

Stage To Keep

[Sort Order] = [Max Stage Sort Order]

Add this to the Filter shelf and set to True.

Now if we remove Sort Order, Max Stage Sort Order and Stage from Rows, we get the correct summarised values.

Move Stage to Columns and add Stage to Text and Colour. Set to Fit Entire View. Align Text centred and remove all row/column dividers. Uncheck Show Header against the Stage field on Rows. Use the Colour Legend to manually sort the values into the correct order.

Update the sheet title.

Past Close Date

This is a bit simpler… add Past Close Date? to Filter and select ●. Add Stage to Filter and select ExpectedCloseDate. Add Amount to Text and the auto generated field Opportunity.csv (Count) to Text.

Format the text and add a title, and you’re all done You just need to add everything to the dashboard now.

My published version is here.

Happy vizzin’!

Donna

Advertisement

Filter Challenge : How well do you know Tableau’s Order of Operations?

It was Erica Hughes’ turn to set the #WOW challenge this week. This ended up being a bit of a journey for me, because of one requirement – to use the max date of the dataset.

I typically use a FIXED LOD to determine the max date, which I did initially in this case, and after building a solution, tripped up when it came to adding some of the filters ‘to context’. The context filter meant my Max Date was changing depending on the filter and subsequently I wasn’t getting the right results for the ‘days since last purchase’.

So I ended up having to put my thinking cap back on, and after a lot of trial and error and reading on the internet (including this article by the Flerlage Twins) , I finally managed to get a solution that involved both LOD calculations (including a rarely used INCLUDE LOD) and Table calculations, and no context filters at all. This was a pretty complex solution. The table of data had to include the Order Date on the Detail shelf, resulting in multiple rows per customer showing, which meant I had to then use an INDEX()=1 filter to only show 1 row for each customer. I then used an additional INDEX() against each customer, so I could filter to show only the customers in position 1-10. My solution to this is published here.

After publishing, I checked Erica’s solution and found how she’d handled the max date requirement. It used a concept I haven’t had to use so far, so I decided to rebuild a less complex solution, which is what I will now blog about.

Capturing the max date of data set which isn’t affected by context filters

The usual way to define the maximum date in the data set is to create a FIXED LOD calculation, and this is still required

Max Date

{FIXED :MAX([Order Date])}

This returns 30 Dec 2021.

The problem is, once a context filter is applied to a sheet, the value of this field can change. For example, if Region is applied as a context filter and set to ‘South’, then what Tableau will do, based on the ‘order of operations’, is first filter all the data to just those records where Region=South. It will then work out the max date of these filtered records, and if there are no orders on 30 Dec 2021 for the South Region, then the value of the Max Date field will differ. This is because context filters get applied before FIXED LOD calculations. We need a way to ensure we can retain the 30 Dec 2021 without hardcoding it.

The solution is to use a parameter.

pMaxDate

A date parameter which is set to use the value of the Max Date field when the workbook is opened.

This is quite a sneaky but clever way to retain this, which is why I’m reworking my solution to use it, so I have something for future reference myself. When the workbook opens, the pMaxDate parameter will get set to 30 Dec 2021 and won’t ever change, whereas the Max Date field will change if context filters are used.

Building the Table

It may seem a bit odd, but I’m going to start with the table of data that’s displayed, as this requires the most calculations, and the most validation.

On a new sheet, add Customer Name, Customer ID to Rows and Sales onto Text. Order by Sales desc. Format Sales to $ with 0dp.

We need to determine the latest order date for each customer in order to work out other information.

Latest Order Date

{FIXED [Customer ID] : MAX([Order Date])}

This finds the maximum Order Date for each Customer.

And with this, we can now work out

Days Since Last Purchase

DATEDIFF(‘day’, [Latest Order Date], [pMaxDate]

Note this is comparing the Latest Order Date to the pMaxDate parameter instead of Max Date.

Format this to custom number, with 0 dp and a suffix of ‘ days’.

Add Latest Order Date to Rows (exact date, discrete) and Days Since Last Purchase into the table.

Now we need to work out

Latest Order Amount

{FIXED [Customer ID]: SUM(IF [Order Date]=[Latest Order Date] THEN [Sales] END)}

For each customer, if the Order Date is the same as the Latest Order Date, then retrieve the Sales value.

Format this to $ with 0 dp and add to the view.

Now add Customer ID to the Filter shelf and set to filter by the Top 10 based on Sales.

To validate the behaviour of the calculations add Region to the Filter shelf and select ‘South’. You’ll notice only 8 rows are shown, and not 10.

This is because the data has been filtered based on the top 10 customers with the most sales first, and then restricted those top 10, to those with sales in the South. Only 8 of the top 10 customers have sales in the South, hence the 8 rows.

To resolve this, we need to add Region to Context (right click on the filter and Add to Context . This has the effect of filtering all the data by Region = South first, and then displaying the top 10 customers by Sales. We now have 10 rows displayed.

Add State to the Filters shelf, and add that to context too. Test the table of results by selecting different combinations of regions and/or states and comparing to the results on Erica’s solution to verify all the calculations are behaving as expected.

Now we’re happy the table is displaying the data as expected, we can format it and make it ready for the dashboard :

  • Remove Latest Order Date & Customer ID
  • Set the Row Banding
  • Set the Row Dividers to Level 0, so only row lines appear at top and bottom
  • Remove Column Dividers
  • Format all text (row/column headings & text data) to be 8pt.
  • Alias Sales to be Total Sales (right click on the Sales title in the columns and Edit Alias)
  • Remove Region & State from the Filter shelf. These will get re-added later.

Building the Map

On a new sheet, double click State to load a map (you may need to set your location to United States : Map menu -> Edit Locations).

Change mark type to a Filled Map, add Region to Colour and adjust colours.

Remove all map layers (Map menu -> Map Layers and uncheck all options listed on left hand side). Change border on Colour shelf to white.

Remove row & column dividers, and remove all map options (Map menu -> Map Options and uncheck all options).

Drag a new instance of State onto the canvas and Add a Marks Layer. This will create a 2nd marks card on the left. Change the colour of the circles to black. Add Sales to the Size shelf and adjust. Adjust the tooltip of both marks card (you’ll need to add Sales to the Tooltip on the map marks card).

Building the Legend

On a new sheet add Region to Columns and type in MIN(1) into the Columns shelf too. Add Region to the Label shelf, and the Colour shelf. Adjust the height of the rows, so you can see the text.

Edit the axis so it is fixed from 0 to 1. Then format the Label so the font is larger and centred.

Hide the axis and the Region (uncheck show header). Remove row & column dividers and any gridlines. Adjust the border on the Colour shelf to be white. Set the tooltip not to show.

Building the Bar Chart

On a new sheet, add Sub-Category to Rows and Sales to Columns. Order by Sales desc. Add Sub-Category to the Filter shelf and set to show Top 10 by Sales. Reduce the Size of the bars, and align the row labels to be left aligned, and the font to be 8.

Format the axis, so the values are displayed in $K.

Hide the row label, and adjust Tooltip. You may need to create an additional calculated field based on Sales to add to the Tooltip which you can then format to $ with 0dp.

Adding the interactivity & context filters

Add the sheets onto a dashboard. You’ll need to use a mixture of vertical & horizontal layout containers, inner and outer padding and background colours to get the required layout.

Add a dashboard filter action which runs on Select when the Legend sheet is selected, which affects all other sheets and shows all values when unselected.

Add another filter dashboard action, which this time runs on Select when the Map sheet is selected, which affects all sheets except the Legend sheet, and also shows all values when unselected.

Now click on one of the regions in the legend, then click on a state in the map. This has the effect of adding filters to the shelves on the other sheets. Navigate to the Table sheet, and add the 2 ‘Action’ filters to context

Do the same for the Bar char sheet.

And you should now have a completed viz. My published version based on this solution is here.

Happy vizzin’!

Donna

Adding more detail & context

This week it was Kyle’s turn to add a 3rd piece to the dashboard (see challenge here).

I chose to build on my previous week’s solution, which you can download from here, so all fields referenced etc will be based on that (ie I may reference fields that don’t exist in the solution published by the #WOW crew, and that have not been created as part of this stage of the solution).

For this part of the challenge, we’re looking to add a trend line, which displays a line for each row in the bar chart above, which means we need to display the median values of the Sales and Proft Ratio measures, since the ‘Other’ bar represents a group of multiple states.

We also need to colour and size the lines according to whether the bar chart has been clicked on or not.

First up, to build the line chart, we need to create some new LoD fields, as the ones we have so far have been FIXED at a State level, and now we need to consider the month of each order. I’m essentially going to repeat the steps I used when building the bar chart, but this time as the month of the Order Date into the calculation

Sales by State & Date

{FIXED [State], DATETRUNC(‘month’, [Order Date]): SUM([Sales])}

PR by State & Date

{FIXED [State], DATETRUNC(‘month’, [Order Date]): SUM([Profit])/SUM([Sales])}

From these, I can then create the measures I need for the display

Sales by Display State & Date

{FIXED [State To Display], DATETRUNC(‘month’, [Order Date]): MEDIAN([Sales by State & Date])}

format this to $ with 0 dp

PR by Display State & Date

{FIXED [State To Display], DATETRUNC(‘month’, [Order Date]): MEDIAN([PR by State & Date])}

format this to % with 0 dp

I can then build the initial line chart, with Order Date on Columns, set to the continuous month level (eg May 2020) and Sales by Display State & Date and PR by Display State & Date on Rows. Add State To Display to Detail.

Now we need to colour and size the lines based on a combination of whether the State is the one selected in the scatter plot, or whether the user has clicked on one of the bars.

I decided that I would capture the ‘state’ the user selected on the bar into a new parameter. So I created

pSelectedStateFromBar

A string parameter defaulted to ‘blank’

With this, I can the create

Size – Line

[State To Display] = [pSelectedStateFromBar]
OR
[Selected State]

This returns true if the State To Display field matches the value captured in the parameter, or the (existing) Selected State field is true.

Add this to the Size shelf on the All Marks card and adjust so the True option is slightly thicker that the False option. You may need also need to adjust the slider against the Size shelf to get the thickness just right.

We also use these 2 parameters to determine what the colour of the line needs to be

Colour – Line

IF [Selected State] AND ([pSelectedStateFromBar]=” OR [pSelectedStateFromBar]=[State]) THEN ‘dark’
ELSEIF [State To Display]=[pSelectedStateFromBar] THEN ‘mid’
ELSE ‘light’
END

Add this onto the Colour shelf. You’ll never have all 3 options displayed at the same time, so you’ll need to set the pSelectedSateFromBar to empty and to a value in order to adjust the colours. You’ll also need to ensure ‘dark’ is listed above ‘light’ and then ‘mid’ is listed above ‘light’ so these lines appear ‘on top’.

The chart just needs tidying up now – edit/remove the axis titles, reduce the size of the text on the axis, remove gridlines and set the tooltips.

Once done, you’re ready to add to the dashboard.

I rearranged my dashboard to have a horizontal layout container to with the scatter plot in the left hand column, and a vertical container in the right hand column. The vertical container then had the bar chart above the line chart.

The final step was to the then add a parameter action that on ‘select’ of the bar chart, it passed the State to Display value into the pSelectedStateFromBar parameter.

And that should be it. My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Adding Detail & Context

It was Sean Miller’s turn to expand on last week’s #WOW2022 challenge, by adding an additional viz to the scatter plot (challenge details here).

The assumption is you should be able to build on the challenge solution you have built in the previous week. My solution to the Reference Box challenge is here. I adopted (and blogged about) a table calc solution. However, the published solution used LoDs. Both methods achieved the desired result, but I decided when starting this challenge, that I would build this ‘extension’ using LoDs too, so anyone who uses the published solution as a starting point, gets help via this blog.

So if you used my previous blog to build the challenge, you’ll need to first create the LoD equivalent of the calculated fields we used (note I did not change the scatter plot viz to use these fields) :

PR- 25th Percentile LoD

{PERCENTILE({FIXED [State]: SUM([Profit])/SUM([Sales]) },0.25)}

PR-75th Percentile LoD

{PERCENTILE({FIXED [State]: SUM([Profit])/SUM([Sales]) },0.75)}

Sales- 25th Percentile LoD

{PERCENTILE({ FIXED [State]: SUM([Sales]) },0.25)}

Sales- 75th Percentile LoD

{PERCENTILE({ FIXED [State]: SUM([Sales]) },0.75)}

Note – It’s also worth reiterating at this point, that I will be referencing calculated fields/parameters/objects in this blog created as part of my initial challenge.

In order to build the bar chart, we need to categorise each State into a grouping; the selected/highlighted state, the states in the reference box, all other states.

To identify the states in the reference box, we need to use FIXED LoDs to get the value of the Sales and Profit Ratio at the State level.

Sales by State

{FIXED [State]: SUM([Sales])}

PR by State

{FIXED [State]: SUM([Profit])/SUM([Sales])}

We can then use these fields along with the LoDs further above to determine whether a State is in the reference box

In Reference Box?

[PR by State]>=[PR-25th Percentile LOD] AND
[PR by State]<= [PR-75th Percentile LOD] AND [Sales by State]>=[Sales – 25th Percentile LOD] AND
[Sales by State]<= [Sales – 75th Percentile LOD]

This simply returns a boolean.

Now we can categorise each State

State to Display

IF [Selected State] THEN [State]
ELSEIF [In Reference Box?] THEN [State]
ELSE ‘Other (median)’
END

And with the above, we can then define the measures we want to show

Sales to Display

{FIXED [State To Display]: MEDIAN([Sales by State])}

PR to Display

{FIXED [State To Display]: MEDIAN([PR by State])}

And with these fields we can now build the bar chart.

  • Add Selected State to Rows and drag the dimension value, so True is listed before False.
  • Add In Reference Box? to Rows, and again drag so the True is listed before False.
  • Add State to Display to Rows
  • Add Sales to Display to Columns and Sort descending
  • Add PR to Display to Columns
  • On the All Marks Card add Selected State to Colour
  • Then add In Reference Box? to the Detail shelf. Then click on the … icon to the left of the In Reference Box? pill on the marks card, and change to the Colour icon. This should result in 2 fields on the Colour shelf.
  • Adjust the colours accordingly.
  • Add Sales to Display and PR to Display to the Tooltip shelf and adjust.
  • Change the titles of the axis
  • Remove row banding
  • Uncheck Show Header against Selected State and In Reference Band?
  • Hide field labels for Rows against the State to Display column heading

Add this sheet onto the dashboard and you’re done 🙂

My published viz is here.

Happy vizzin’! Stay Safe!

Donna

How have successful products performed?

Luke Stanke provided us with this week’s #WOW2022 challenge, to test our LoD knowledge (amongst other things).

From interacting with Luke’s published solution, I figured I needed to start by working out some core calculated fields, before I even attempted to build the viz.

Defining the calculations

Firstly, we just need to determine how many distinct products (ie Product Names) in total were ordered per Category and Sub-Category. As the viz we’re building is already at the required level of detail, the calculation we need is simple

Count Products

COUNTD([Product Name])

That’s the easy bit :-)…. we then need to determine how many products are profitable (for the relevant Category & Sub-Category) every year, in each of the 4 years. This is a bit more complex, and requires several steps (at least it did for me).

We need to know what the Profit is for each Product Name in each year. We can use an LoD for this.

Profit per Product & Year

{FIXED YEAR([Order Date]),[Product Name]:SUM([Profit])}

Popping this into a table view so we can check what’s going on…

We now need an indicator to know whether each value is profitable (ie +ve) or not. I want to return a 1 if it is and 0/null if not, and then I want to be able to ‘sum up’ all the 1’s, so I can conclude if the total is 4, the product is profitable every year. I need another LoD for this

Is Profitable per Year?

{FIXED YEAR([Order Date]), [Product Name]: SUM(INT([Profit per Product & Year]>0))}

[Profit per Product & Year]>0 returns a true or false, and so when true and wrapped within an INT, will return 1. When this field is added to the above view, it is further aggregated by SUM.

At the Product Name & Year level, it just returns the 1’s, 0’s or <nothing> as expected

but when we remove Year from the table (which is necessary for the viz we’ll be building), this field aggregates….

…and scrolling down we’ll find some rows where the value is 4, which means the Product Name has been profitable for every year.

Using this information, we can then create a Set of these products – Right click Product Name > create > set

Profitable Products

contains the set of Product Names where SUM(Is Profitable per Year?) = 4

Now we can identify the profitable products, we need to count how many there are

Count Successful Products

COUNTD(IF ([Profitable Products]) THEN [Product Name] END)

If the Product Name is in the Profitable Products set, then capture the Product Name, and count the distinct number of them.

And now we can compute the percentage of successful products

Pct. Successful Products

ZN([Count Successful Products]/[Count Products])

ZN means I’ll get a 0 for those cases when there aren’t any successful products. Format this to % with 0 dp.

Let’s see all these values in a different table now to verify we’re getting what we expect :

Great! So this has given us the data we need to build the bar chart part of the viz, but what about the barbell? This is based on profit ratio and all we actually need for this is the standard profit ratio calculation

Profit Ratio

SUM([Profit])/SUM([Sales])

formatted to % with 1 dp.

Building the viz

On a new sheet create a basic bar chart with Category and Sub-Category on Rows and Pct. Successful Products on Columns. Sort descending, label the bars and colour accordingly.

Now add Profit Ratio to Columns, and on the Profit Ratio marks card only..

  • change mark type to circle
  • remove labels
  • add Profitable Products set to Colour and adjust
  • edit the alias of the values in the colour legend (right click > edit alias)

To create the bar between the circles, add another instance of Profit Ratio to Columns, and on this marks card…

  • change mark type to line
  • remove labels
  • add Profitable Products to Path
  • change colour to grey

Now right click on the 2nd instance of the Profit Ratio pill in the Columns shelf and select Dual Axis.

(note – if you lose your bars at this point, change the mark type of the Pct. Successful Products mark cards to bar).

Right click on the Profit Ratio axis at the top and synchronise axis, then right click again and move marks to back, then finally, right click again and uncheck show header.

And that’s the core of the viz really. It needs some final formatting to remove column headers, column gridlines and to add row banding. I also chose to make the tooltips more relevant, so in my solution there are some additional fields on the marks cards to provide the relevant details and commentary.

My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Can you visualise percent difference from a selected date?

Lorna Brown returned this week with her first challenge of 2022 which was aimed at practising tableau calculations, although when I completed the build, I realised I actually didn’t use any – oops! I’d ended up using an LoD based solution instead. As is often the case in Tableau, there can be multiple ways to solve a problem, so I created another solution involving table calcs. As I blog I will endeavour to provide both sets of calculations used and make it obvious when I do so. Hopefully, it’ll all make sense 🙂

Building the calculated fields

After connecting to the data, the first step I did was to create a parameter to store the ‘selected date’. I hard coded/defaulted this to 1st Nov 2020.

pSelectedDate

Date parameter defaulted to 01 Nov 2020

Show this on the worksheet, so you can test what happens when you manually change it later.

As I typically do, I’m going to work in a tabular format to sense check all the calculations we’re going to need before I start visualising the information graphically.

To start add Company then Date (set to exact date & discrete) to Rows then add Close to Text and change the aggregation to AVG.

I now want to work out what the value of the average Close is on the selected date (in this case 01 Nov 2020) for each company, and I want this value to be displayed against every row associated to that company.

Close on Selected Date per Company

LOD Option

{FIXED [Company]: AVG(IF [pSelectedDate]=DATETRUNC(‘month’,[Date]) THEN [Close] END)}

Table Calc Option

WINDOW_MAX(AVG(IF [pSelectedDate]=DATETRUNC(‘month’,[Date]) THEN [Close] END))

Drag this field onto the table. If you’re using the LoD option, change the aggregation to AVG (although as we’re working with the data at the lowest granularity, SUM gives the same value, it just makes ‘more sense’ to me). If you’re using the Table Calc option, change the table calculation so that it is computing by Date only. In the below screen shot, I have included both versions of the calculation which match each other. Scrolling down, you should be able to verify that the value displayed in the new field(s) for all rows for a specific company, matches the value in the first column for the selected date. Test by entering different dates in the parameter (though make sure you’re just entering 1st of month each time, as that is what our Date field contains).

Now that for each row we have an average Close value and the average Close value on a specific date, we can work out the percentage difference for each row.

% Diff from Selected Date

LOD option

(AVG([Close])-AVG([Close on Selected Date per Company]))/AVG([Close on Selected Date per Company])

Table Calc option

(AVG([Close])-[Close on Selected Date per Company])/[Close on Selected Date per Company]

There is a slight difference in the syntax as the table calc option is referring to a field that is a table calculation and which is already aggregated, so there’s no need to aggregate again (ie wrap in an AVG…. Tableau will error if you try to do this).

Format this to a Percentage with 0 dp.

Once again I am choosing to show both versions on the screenshot below. Again if you’re using the table calc version, you need to verify the calculation is set to compute by Date only.

So now we have the core fields we need to start building the viz.

I am now just going to refer to the calculated field names I’ve mentioned above and not differentiate between LoD or table calc version. The only thing to remember is when adding the table calc fields, set the compute by, and if you’re using the LoDs, set the aggregation to AVG (although SUM will work too due to the level of aggregation we’re at).

Building the viz

On a new sheet add Date to Columns and set to the continuous Month level or exact date (either will work in this instance, since we only have 1 date per month in the data set). Add % Diff from Selected Date to Rows. Add Company to Colour and adjust to suit (use the hue circle colour palette).

Add Company to the Label shelf and also add % Diff from Selected Date to the Label shelf.

Adjust the Label settings, so the labels only display at the end of the line.

To make the text match the mark colour, click Label >then expand the Font dialog by clicking the arrow to the right, and select Match Mark Colour.

To format the % value displayed, right-click on the % Diff from Selected Date pill on the Text shelf, and select format. In the formatting pane that appears on the left hand side, change the number formatting to custom ▲0.0%;▼0.0% ( I use this site to get the shapes I need – just copy and paste)

Now rearrange the text on the label so that company is displayed first.

To format the Date field within the tooltip and the axis, right click on the Date pill in Columns and select format. On the Format Date pane on the left hand side, set the Dates value on the Pane tab, in the Default section, to custom : mmm yy. Then click on the Axis tab and set the Dates value in the Scale section to the same custom mmm yy value.

The Tooltip

We need quite a few fields on the Tooltip to get the information required.

Add Close and Close on Selected Date per Company to Tooltip. We also need a couple more calculated fields

TOOLTIP – Selected Date

[pSelectedDate]

format this field to custom mmm yy

TOOLTIP – % Diff

[% Diff from Selected Date]

Format this to % with 0 dp.

Add both these to the Tooltip too, then adjust the wording on the tooltip as required.

The Reference Line

Add pSelectedDate to the Detail shelf. Then right click the date axis and Add Reference Line. Adjust the settings so there is a Line for the Entire Table, which references the pSelectedDate value and has a custom Label of Reference Date: <Value>. Set the tooltip to None and the Line to be dotted

Format the pSelectedDate value displayed, by right clicking on the pill on the Detail shelf and set the Dates format on the Pane tab to mmm yy.

The right click on the Reference Date: Nov 20 label itself and select Format, and change the alignment to the top.

Finally, remove the axes titles, remove the gridlines and all axes rulers and change the worksheet background colour. Again, format the reference line label, so there is 0 transparency.

Test the visualisation by manually changing the date of the parameter to the 1st of another month and seeing how the viz updates.

Building the interactivity

Add the sheet to a dashboard, then create a parameter action, the runs on select of a mark on your sheet. It should change (target) the pSelectedDate parameter passing in the Date field

The only problem with this, is that when selecting a mark, the line relating to the mark is highlighted while the others ‘fade’ into the background. To prevent this from happening :

  • Create a new calculated field called True which contains the value True
  • Create a new calculated field called False which contains the value False
  • Add both these fields to the Detail shelf on the line chart.
  • Add a Dashboard Filter action, which runs on select, and has the sheet on the dashboard as it’s source, and the sheet in the workbook as it’s target. Set to show all values when selection is cleared, and filter by selected fields, setting True = False.
And with that, you should have a functioning solution – there just may be a few further formatting tweaks to apply.

My published vizzes are here

Happy vizzin’! Stay Safe!

Donna

Can you recreate this difference chart?

Lorna created this challenge for #WOW2021 this week incorporating tips from the Speed Tipping session she and fellow WOW leader Ann Jackson had presented at TC21.

Defining the calculations

The requirements were to ensure there were only 7 calculated fields used, and no date hardcoding (including in the title – a feature I missed to start with). So let’s start by just going through the required calculations.

We need to identify the latest year in the data set

Current Year

YEAR({FIXED:MAX([Order Date])})

This uses an LoD (Level of Detail) calculation to identify the maximum date in the whole data set, which is 31st Dec 2021, and then extracts the Year of this ie 2021.

From this, we work out

Previous Year

[Current Year] – 1

Both of these fields return numbers, so automatically sit in the measures section of the left hand data pane (ie under the horizontal line). I want to treat these as dimensions, so I just drag the fields above the line.

We now need to create dedicated fields to store the Sales values for both years

CY Sales

IF YEAR([Order Date])=[Current Year] THEN [Sales] END

PY Sales

IF YEAR([Order Date])=[Previous Year] THEN [Sales] END

and with both of these, we can work out the

Difference

SUM([CY Sales])-SUM([PY Sales])

[TIP] This is custom formatted to △#,##0;▽#,##0.

I googled ‘UTF 8 triangles’ and used this link to find the suitable shapes which I just copied and pasted into the number format field.

We’re going to need to determine whether the difference is positive or not.

Is Loss?

MAX(0,[Difference]) =0

This is another [TIP] making use of the array function. If the Difference is negative, it will return 0 as this is the maximum of the two numbers. I’m not entirely sure if this is more efficient than simply writing Difference<=0, but I wanted to incorporate another of the tips presented.

The final calculation we need is another of the PY Sales field, as we need another distinct Measure Name value to display. I simply chose to duplicate the existing field to have a PY Sales (copy) field.

Building the viz

Add Category to Columns, Segment to Rows and then add CY Sales to Columns, which will create a horizontal bar chart. Then drag PY Sales to the CY Sales axis, and when the ‘two columns’ icon appears, drop the field.

This will automatically change the pills so Measure Values is on Columns and Measure Names is on Rows.

Swap the order of the pills on the Measure Values section on the left hand side, so PY Sales is listed before CY Sales.

Add Measure Names to the Colour shelf and adjust. Increase the width of the rows.

Check the Show Mark Labels option on the Label shelf and adjust alignment to display the text to the left

Increase the Size of the bars to the maximum size, and add a white border (via option on Colour shelf)

Add PY Sales (Copy) to Columns, and change the mark type to Gantt Bar. Remove Measure Names from the Colour shelf of this marks card, as it will automatically have been added. Instead add the Is Loss? field to Colour and adjust.

Add Difference to the Size shelf, then click on Size, and reduce it to as small as possible. Set the border of this mark to Automatic (it should become a little thicker).

Next add the Difference field to the Label shelf, align right and set the font colour to match mark colour.

Now make the chart dual axis, synchronise axis, and set the mark type of the Measure Names mark type back to a bar.

On the All marks card, add CY Sales, PY Sales and Difference to the Tooltip shelf. And add Current Year and Previous Year to the Detail shelf.

Adjust the Tooltip against the All marks card, so it is the same when you hover on all of the marks. And edit the title of the chart, referencing the Current Year and Previous Year fields.

The challenge has a ‘space’ between each Segment, and this is the final TIP I used.

On the Measure Values section on the left below the marks card, type in MIN(NULL). This will initially create a new ‘blank’ row between the bars and the gantt marks, which isn’t where we want the blank row to be.

To resolve this, simply click on the MIN(NULL) text in the chart and drag the text below the PY Sales (copy) text

And now you just need to uncheck Show Header against the Measure Names pill on Rows, and the Measure Values and PY Sales (copy) fields on the Columns. Then remove all row and column borders and gridlines and hide labels for rows and columns.

Hopefully you’ve got the final viz which you can now add to a dashboard. My published version is here.

Happy vizzin’! Stay Safe!

Donna

Filtering Table Calcs

It was Sean Miller’s turn this week to set the challenge, which had an added twist – to complete only using the web authoring feature in Tableau Public, something I’d never tried before.

Most challenges use a version of Superstore Sales which I usually already have on my laptop, due to the multiple instances of Tableau Desktop installed, but this time, I had to get the data from data.world. I saved this locally then logged into Tableau Public online and started the process to build a viz online, by clicking the Create a Viz (Beta) option.

When prompted I then chose to upload from computer, the data source file I’d downloaded, and once done I was presented with the online canvas to start building

Unsure as to whether there was any auto save facility online, I decided to save the workbook pretty much immediately, although I was first prompted to navigate to the Data Source tab and create an Extract before I could save.

Now I was ready to start building out the requirements.

I started with the trend chart. The grain of the chart was at the month level, so I first created a specific field to store the Order Date at this particular level

Order Date Month

DATE(DATETRUNC(‘month’,[Order Date]))

I then added this to the Columns shelf as a continuous (green) exact date, and added Sales to the Rows shelf. I changed the mark type to Area, set the Colour to a blue, dropping the opacity to 50% in an attempt to match Sean’s colouring and adjusted the tooltip.

I then added another instance of Sales to the Rows shelf next to the existing one, changing the mark type of this instance to Line and setting the Colour to a darker blue at 100% opacity. I then used the context menu on this pill to set a Quick Table Calculation of type Moving Average.

By default this sets the average to the 3-month rolling average as per the requirement. I adjusted the tooltip, made the chart dual axis and synchronised the axis, then removed Measure Names from the Colour shelf that had automatically been added.

My next step at this point was to work on the filtering requirement. To help with this, the first thing I did was to Duplicate as crosstab (right click on the chart tab). This created a tabular view of the data so I could work on getting the calculations I need for the filtering. I flipped the columns and rows, so my starting point was as below.

I chose to use parameters to capture the min and max dates that the user selects on the dashboard.

pMinDate

Date parameter defaulted to 01 Jan 1900

And I also created pMaxDate exactly the same way.

I then needed fields to store the relevant dates depending on whether a selection had been made or not

Min Date Selected

IF [pMinDate]=#1900-01-01# THEN {FIXED : MIN([Order Date Month])}
ELSE [pMinDate]
END

The FIXED statement sets the date to the minimum month in the data set, if the parameter is set to its default. The below doed similar to get the maximum date in the data set.

Max Date Selected

IF [pMaxDate]=#1900-01-01# THEN {FIXED : MAX([Order Date Month])}
ELSE [pMaxDate]
END

Using these dates, I then created a field to determine whether the month was within the min & max dates

Is Month Selected?

[Order Date Month]>=[Min Date Selected] AND
[Order Date Month]<= [Max Date Selected]

Pop this field onto the table view, and show the parameters and set them to some other values eg Min Date = 01 Jun 2018 and Max Date = 01 Sept 2018. You’ll also need to adjust the table calc setting of the Moving Average Sales pill to compute by Is Month Selected as well.

You’ll see True is stamped against the rows which match the date range. Now, if you make a note of the Moving Average values against these rows, and then add Is Month Selected? = True to the Filter shelf, you should get just the True rows displaying, BUT the Moving Average values are now different. This is because, with this type of field as a filter, the filter is being applied BEFORE the table calculation is being applied – the table calculation essentially has ‘no knowledge’ of the rest of the rows of data as they have been filtered out. Solving this issue is the crux of this challenge.

Remove Is Month Selected? from the filter shelf. We’re going to create a new field to filter by instead

FILTER

LOOKUP(MIN([Is Month Selected?]),0)

This field uses the LOOKUP table calculation, and is basically looking at the value of the Is Month Selected? field in it’s ‘own row’ (denoted by the 0 parameter).

Pop this onto the table view, and you can see that it’s essentially a duplicate of the Is Month Selected? field.

Now add the FILTER =True field to the Filter shelf instead, and this time the rows will once again filter as required, but the Moving Average values will be the same as those in the unfiltered list, which is what we need.

This works because the field we’re filtering by is a table calculation. When added to the filter shelf, these types of fields are applied at the ‘end’ of the processing; ie the table calculation is computed against all the rows in the data AND THEN the filter gets applied. As a result the Moving Average values are retained.

This is all part of Tableau’s Order Of Operations and is a fundamental concept when working with filters.

Armed with this knowledge, we can now apply the FILTER=True to the Filter shelf of the chart we built.

Now we can start to work on the bar charts, which will show Sales along with Average Monthly Sales. For this 2nd measure, I need to work out the number of months captured between the min & max dates.

No. Months

DATEDIFF(‘month’, [Min Date Selected],[Max Date Selected])+1

And with this I can now create

Avg Monthly Sales

SUM([Sales])/MIN([No. Months])

The bar chart viz can then be quickly created by adding Sub-Category to Rows and both Sales and Avg Monthly Sales to Columns, and sorting by Sales descending. The colours of each bar are adjusted to suit.

The bars also need to react to the dates selected, but we can simply add the Is Month Selected? = True to the Filter shelf for this, as none of the values displayed on this chart are reliant on table calculations.

To remove the gridlines from both charts, select Format > Workbook from the menu, scroll down the Format dialog displayed on the right and set Gridlines to Off

Finally, title both sheets appropriately, and then add to a dashboard. We need to add 3 actions to this dashboard – 2 parameter actions and 1 filter action.

Create a parameter action which will on Select, set the pMinDate parameter by passing the Minimum Order Date Month, and will reset back to 01 Jan 1900 when unselected.

Then create another instance, which sets the pMaxDate parameter instead, passing the Maximum Order Date Month.

Finally, create a Filter Action which on Select of the bar chart, filters the trend chart

And with that, the challenge should be complete. My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Can you use set actions to create a reference date range?

There’s a lot packed into the challenge this week, which was “set” by Lorna Brown and Erica Hughes to test our Set Action skills. How detailed this blog will go, I have yet to decide… I’ve got a couple of hours to get this nailed, so it could get quite brief as we get towards the end 🙂

I’ve got 6 sheets/charts making up this dashboard, so my intention is to summarise each one, and I’ll define the various calculations that are going to be needed as we go.

  • The overall summary table
  • The selected months summary table
  • The trend line
  • The donut chart
  • The top 3 states table
  • The map
  • Adding the interactivity

The overall summary table

This challenge is focused on understanding the Sales per month. Whilst its possible to use the built in aggregation features of a date field, I often prefer to create explicit date fields at the level I require, so it’s easier to reference. Therefore, the first field I created for this challenge was

Order Date To Plot

DATETRUNC(‘month’, [Order Date])

This essentially ‘groups’ every order placed in a month to be tagged with the 1st of the month. I custom formatted this field to MMM yyyy (ie Oct 21).

For the overall summary table, I need to capture the total sales of the whole data set, and I use a Fixed LoD calculation for this.

Total Sales

{FIXED: SUM([Sales])}

This field is formatted to $0.00M

NOTE – I actually named this field <space>Total Sales<space> as I want to display the name of the field (the measure name) in the summary table, but the ‘selected months’ summary table also has a Total Sales measure which is a different calculation (see later). Adding the <spaces> is a sneaky way to get two fields with what appears to be the same name. As this field when displayed will be centred, the <spaces> aren’t noticeable.

We also need to get the monthly average sales for the whole data set

Average Sales by Month

AVG({FIXED [Order Date To Plot]: SUM([Sales])})

Format this to to $0.0K

We can now build the summary table by adding Measure Names to the Filter shelf and selecting these 2 fields. The placing Measure Names on Rows and Measure Names and Measure Values on Text. Reorder the measures as required, hide Measure Names on Rows and format the Text as required.

Change the title of the sheet to Superstore Sales, ensure the tooltip doesn’t display and remove all gridlines / row banding etc.

The selected months summary table

The core requirement of this challenge is to make use of set actions, so we’re obviously going to need a set which will contain the dates (months) the user will select on the chart. This set will be based off of Order Date To Plot. Right click on the field > Create > Set. Name it Order Dates To Plot Set and by default, select all the months between Oct 20 to Jun 21 inclusive.

Later I’ll describe how the values of this set will get updated, but for now, we need to get some information relating to the sales in these selected months.

Firstly, we want the total sales for the months in this set.

Total Sales

IF [Order Date To Plot Set] THEN [Sales] END

The default format for this field is set to $ with 0 dp.

Note – this is the other ‘total sales’ field mentioned earlier. This field name has no leading/trailing spaces.

To get the average, I needed a field just to store each member of the set (ie each selected month)

Selected Dates

IF [Order Date To Plot Set] THEN [Order Date To Plot] END

and with this I can then work out

Average Sales

AVG({FIXED [Selected Dates]: SUM([Total Sales])})

The final measure required for this section is the change within the date range, which is basically comparing the value of sales at the first month in the selection with the sales in the final month selected. We need a few fields to get to this.

Firstly, we want to identify the first and last months

Min Selected Date

{FIXED:MIN(IF [Order Date To Plot Set] THEN [Order Date To Plot] END)}

If the date is in the set, then return the date and then take the minimum of all the dates, and store against all the rows in the data. Similarly we have

Max Selected Date

{FIXED:MAX(IF [Order Date To Plot Set] THEN [Order Date To Plot] END)}

Putting this info into a table, you can see how the calculations are working. The values for the Min & Max dates are the same across every row.

Next we need to get the Sales at the min & max points, and spread that value across all rows

Sales at Min Date

{FIXED: SUM(IF [Order Date To Plot]=[Min Selected Date] THEN [Sales] END)}

Sales at Max Date

{FIXED: SUM(IF [Order Date To Plot]=[Max Selected Date] THEN [Sales] END)}

Now we can work out the difference

Change within Date Range

([Sales at Max Date]-[Sales at Min Date])/[Sales at Min Date]

format this to a percentage set to 1 dp

Finally, we need to know the number of months in the set, which is displayed in the title of the monthly summary sheet.

Months in Set

{FIXED: COUNTD(IF [Order Date To Plot Set] THEN [Order Date To Plot] END)}

If the date is within the set, then capture the date, and the count the distinct set of dates captured.

Make this a discrete field (move from the measures section at the bottom of the data pane to the dimensions section at the top (above the line), and add to the tabular view

Now we can build the summary sheet.

Add Measure Names to the Filter shelf and this time filter by Total Sales, Average Sales and Change within Date Range. Add Measure Names to Rows and Measure Values to Text. Reorder the measures.

Format the Total Sales to be in $K, by selecting the Format option from the context menu of the Total Sales pill on the Measure Values shelf (hover on the pill and click the carrot/down arrow that appears – by formatting this way, we’re changing the display of this field for this sheet only).

Add Min Selected Date and Max Selected Date to the Detail shelf and set to be Exact Date. Format both these fields via the pill context menu to be the ‘March 2001’ format

Also add Months in Set to the Detail shelf.

Adjust the title of the sheet as below

Finally you need to set the background of the worksheet to the relevant purple (I used #8074a8), adjust the colours of all the fonts to white and adjust the size/style of the fonts in the table. Remove all gridlines/row banding etc, and you should have something like below

The Trend Line

By this point we’ve built all the calculated fields we need for this chart. This is a dual axis line chart, as we want the colour of the line for the selected dates to be different from the non selected ones, and we want to display a label for the highest sales in the selected timeframe.

  • Add Order Date to Plot to Columns, and set as a Continuous (green) pill set to Exact Date
  • Add Sales to Rows
  • Add Total Sales to Rows
  • Make the chart dual axis, and synchronise axis.
  • Adjust the colours of the Measure Names colour legend
  • On the Label shelf of the Total Sales marks card, set to label the maximum value only
  • On the All Marks Card, add Min Selected Date and Max Selected Date to the Detail shelf and set to Exact Date.
  • Right click on the Order Date To Plot axis and Add Reference Line
  • Create a reference band that starts at the constant Min Selected Date, ends at the constant Max Selected Date, is bounded by dotted lines and shaded between
  • Hide the Sales and Total Sales axis, format tooltips and adjust the row & column dividers.
  • Change the title and you should get to

The donut chart

Donut charts are 2 different sized pie charts on top of each other, created using a dual axis chart. On the Rows shelf type in MIN(0). Then type the same next to it. This gives you two axis and two marks cards.

We only care about information related to the selected dates for this chart, so we can add Order Date To Plot Set to the Filter shelf, which by default will just restrict the information to the data ‘in’ the set.

Change the mark type of the 1st MIN(0) marks card to Circle and add Sales to the Tooltip shelf. Adjust the size of the charts/mark.

Change the mark type of the 2nd MIN(0) marks card to Pie and add Sales to the Angle shelf. Add State to the Detail shelf. Sort the State field by Sales descending.

Note – the circles might look the same at this point, but if you hover over the bottom one, you should see that it’s segmented by State.

We need some new fields now to help us identify the top ranking states.

Sales Rank

RANK(SUM([Sales]))

This is a table calculation, so it’s best to see how this field will work in a table view – build one out as below, and set the table calculation on the Sales Rank field as shown

We’re now going to ‘group’ the ranks into the top 3 and everything else

Sales Rank Group

IF [Sales Rank]<=3 THEN [Sales Rank] ELSE 10 END

We can now use this Sales Rank Group field to colour the pie chart. On the 2nd MIN(0) marks card, add Sales Rank Group to the Colour shelf. Adjust the table calculation to compute using State as above, then change the field to be Discrete (blue). Adjust the colours to suit.

Now make the chart dual axis, and synchronise the axis. Adjust the size of the 1st MIN(0) circle to be smaller than the pie. If it’s not showing, right click on the right hand axis and move marks to back. Colour the circle white. Adjust tooltips to suit and hide axis, column/row dividers etc. Update the title. You should have

The top 3 states table

  • Add Order Date To Plot Set to Filter
  • Add State to Rows and Sales to Text and sort descending.
  • Add Sales Rank to Filter and set to At Most is 3. This will just show the top 3 states.
  • Add State to Text
  • Add a Percent of Total Quick Table Calculation to the existing Sales field that’s on the Text shelf (via the context menu of the pill)
  • Add another instance of Sales back onto the Text shelf
  • Adjust / format the font size and layout of the fields on the Text shelf
  • Add Sales Rank to the Size shelf and set to be discrete (blue) and set the mark type to be Text. Adjust the size of the marks – it’s likely it’ll need to be reversed and the range adjusted.
  • Hide the State field on Rows, adjust the font colours, remove row banding and row/column dividers. You should end up with…

The map

  • Add Order Date To Plot Set to the Filter shelf
  • Add State to Detail – this should create a map (edit locations to be US if need be – Map -> Edit Locations menu)
  • Add Sales to the Colour shelf
  • Edit the colour range to a suitable purple range ( I set the darkest colour of the range to #6c638f)
  • Adjust the map layers (Map -> Map Layers) so only the option highlighted below is selected.

Adding the interactivity

Add all the sheets to the dashboard, using vertical and horizontal containers to arrange the relevant layout. Then add a Set Action (Dashboard > Actions > Add Action > Change Set Values). The Set Action should be configured as below :

And fingers crossed, you should now be able to select marks on the trend line and see all the other charts, except the initial summary table, all update. My published viz is here.

Happy vizzin’! Stay Safe!

Donna

How much do top sub-categories contribute to sales?

A colourful #WorkoutWednesday challenge this week, courtesy of Ann Jackson incorporating pie charts, top N functionality and interactivity and a highlight table. Pie charts can cause much debate amongst the data viz community and if this one was just representing the multitude of sub-categories, it certainly wouldn’t be ideal. But when the core aim is to simply present 2 key measures (those in the top N against the rest), the pie is a familiar and effective visual. In this instance, the outer ring segmenting all the sub-categories provides additional context without detracting from the main purpose of the viz.

So lets build…

  • Creating the core calculations
  • Building the Pie Chart
  • Building the Highlight Table
  • Adding the Interactivity

Creating the core calculations

First up, we’re going to need a parameter to define the ‘Top N’. Create an integer parameter with a range from 1 to 17, that steps every 1 interval, and is defaulted to 5.

pTopN

Next we’re going to use a Set to capture the Sub-Categories that are in the Top N Sales. Right click on Sub-Category -> Create ->Set. Use the Top tab to define a set captures the Sum of Sales that is based on the pTopN parameter.

Now, we want to create a grouping of those in and out of the set, which will be used as part of the highlight table

Sub-Cat Group

IF [Sub-Category Set] THEN ‘IN TOP ‘ + STR([pTopN])
ELSE ‘ALL ELSE’
END

Pop all these fields out into a table so you can see what’s going on as you change the pTopN parameter. Sort the Sub-Category by Sales descending.

Now we need to identify the % value of Sales for the Sub-Categories that are in the Top N (this is the label on the darker segment of the central pie chart), so for that we need

Total Sales

{FIXED:SUM([Sales])}

Top N Sales (in hindsight, this should have been named Sales per Group or similar)

{FIXED [Sub-Category Set] : SUM([Sales])}

Top N Sales %

IF ATTR([Sub-Category Set]) THEN
SUM([Top N Sales])/SUM([Total Sales])
END

Format this to percentage with 0 dp.

Adding to the table, we can see the values

The final field we need in order to build the pie, is an additional one to store the label text

Label:SubText

IF [Sub-Category Set] THEN ‘TOP ‘ + STR([pTopN]) END

Building the Pie Chart

To achieve this we’re going to build a dual axis pie chart, where one pie is used to define the In/Out of Top N segmentation in the centre, and the other pie is used to create the outer ring.

Create an axis by typing in MIN(0) onto the Rows shelf, and then adding another instance of MIN(0) next to it. This will generate 2 marks cards, which is where the fields to build the pie charts will be placed.

In the first MIN(0) marks card, change the mark type to Pie, then add Top N Sales to the Angle shelf and Sub-Category Set to the Colour shelf. Adjust colours to suit. Then add Top N Sales % and Label:SubText to the Label shelf. Adjust size of the view and the chart to suit. Also remove all text from the Tooltip.

Positioning the text is a bit fiddly. If you click on the text so the cursor changes to a cross symbol, you can then drag it to a better location. However, when you change the Top N parameter, the text will move. You can go through each parameter value and reposition the text each time (which I did.. it wasn’t too onerous for 17 values), however I found when published to Tableau Public, the positioning wasn’t honoured. Ann’s solution was the same, so I didn’t get too hung up on this, although if anyone resolved it, I’d love to know!

Now on the 2nd MIN(0) marks card, again change the mark type to Pie, and this time add Sales to the Angle shelf and Sub-Category to Colour. Sort the Sub-Category field by Sales descending. Additionally add Sub-Category Set to the Detail shelf (this will be needed later on to make the interactivity work). Edit the colour palette to use the Hue Circle options. Adjust the size of the pie chart. Adjust the tooltip too.

Now make the chart dual axis and synchronize the axis. If the colourful chart is displayed ‘on top’, then right click on the right hand axis and select move marks to back. Adjust the sizes of both pies, so the colour wheel is slightly larger than the other one.

Now hide the axis, and remove all borders and gridlines.

Building the Highlight Table

I’ve built the highlight table as a bar chart. Start off by adding Sub-Category Set, Sub-Cat Group and Sub-Category to Rows. Sort Sub-Category by Sales descending. Then type in MIN(1) into the Columns shelf.

Now add subtotals via the Analysis > Totals > Add all Subtotals menu. This adds 2 additional rows to each section

But we don’t want the ‘grand total’, so click on the Sub-Category Set context menu, and uncheck Subtotals

To position the totals at the top, go to Analysis > Totals > Column Totals To Top

Then add Sub-Category to the Colour shelf, and adjust the colour of the Total bar to white

We now need to get some text onto those bars, but we need some additional calculations to help up with this. Firstly, we want to get the rank of the Sub-Category. We’ll use a table calculation for this

Sales Rank

RANK(SUM([Sales]))

We also need a way to identify the Total rows differently from the main Sub-Categories. I referred to this Tableau KB for help here, and subsequently created

Size

SIZE()

To see what this is doing, add Size to the Label shelf, and adjust the table calculation setting to compute by all fields except the Sub-Category Set. The size of the total rows is 1.

Based on this logic, we can then create

LABEL:Bar

IF [SIZE]=1 THEN ‘SUBTOTAL FOR GROUP’
ELSE ‘#’+STR([Sales Rank]) + ‘ ‘ + ATTR([Sub-Category])
END

Add this to the Label shelf instead of the Size field and adjust the table calc settings as above. Align left. Then add Sales to the Label shelf too and adjust so its on the same row. Adjust the tooltip too.

Now hide the Sub-Category Set and the Sub-Category fields. Right click on the ‘IN TOP x’ text and Rotate Label, then click on Sub-Cat Group text and Hide Field Labels for Rows. Format the header text to suit.

Hide the MIN(1) axis, and set columns and gridlines to None. Adjust the Row dividers to be darker

Adding the Interactivity

Add the 2 sheets onto a dashboard, and add a Highlight Dashboard Action, that on Hover of either of the charts, it highlights the other chart based on the Sub-Category Set only.

I think that’s covered everything. My published viz is here.

Happy vizzin’! Stay Safe!

Donna