Is Profit Ratio Influenced by Quantity?

For this week’s challenge, Ann asked we recreated a scatter plot that ‘on click’ became a connected scatter plot and also displayed an ‘insights’ box to the user.

Building out the data

The scatter plot is to show Profit Ratio by Quantity, so first up we need

Profit Ratio

SUM([Profit])/SUM([Sales])

formatted to percentage with 0 dp.

A mark is shown on the scatter plot per Category and Month/Year. When working with dates like this, I often prefer to created a dedicated field to use as an ‘exact date’ which I can then format. So I built

Month Year

DATE(DATETRUNC(‘month’,[Order Date]))

and formatted it to the March 2001 format (ie mmmm yyyy).

We’re also going to need a set based on the Category field, as that is going to drive the interactivity to help us connect the dots ‘on click’. So right click on Category and Create Set and just tick one of the categories displayed for now.

Selected Category

Let’s build these out into a table as follows

  • Order Date to Filter, filtered by years 2018 & 2019
  • Category, Selected Category & Month Year (exact date , discrete) onto Rows
  • Measure Names to Text and Filter to Quantity & Profit Ratio
  • Right click on Selected Category and select Show Set to display a selection control to easily allow the set values to be changed

To help drive the interactivity, we’re going to need to store the value of the Profit Ratio for the Selected Category in the set.

Selected PR

IF ATTR([Selected Category]) THEN [Profit Ratio] END

Format this to percentage 0 dp too, and add this to the table – it will only display the profit ratio against the rows associated to the category in the set

At this point, we’ve got what we need to build out the scatter plot, but I’m going to continue to create the fields needed for the ‘Insights’ piece.

These are looking for the months that contained the Max & Min values for the Profit Ratio and Quantity. I’m going to tackle this with Table Calcs.

Max PR per Category

WINDOW_MAX([Profit Ratio])

Adding this to the table, and setting the table calculation to compute by Month Year only, gives us the Max Profit Ratio for each Category

We need 3 more fields like this :

Max Qty per Category

WINDOW_MAX(SUM([Quantity]))

Min PR per Category

WINDOW_MIN([Profit Ratio])

Min Qty per Category

WINDOW_MIN(SUM([Quantity]))

Add all these to the table, setting the table calc in the same way as described above.

Now we know what the min & max values are, we need to know the month when these occurred.

Max PR Month

WINDOW_MAX(IF ATTR([Selected Category]) AND [Profit Ratio] = [Max PR per Category] THEN MIN([Month Year]) END)

Format this to March 2001 format.

If the category is that selected, and the profit ratio matches the maximum value, then retrieve the Month Year value, and duplicate that value across all the rows (this is what the outer WINDOW_MAX does).

Add this to the table in the Rows, and adjust the table calc settings for both the nested calculations to compute by Month Year only. The month for the selected category which had the highest profit ratio will be displayed against all rows for that category, and null for all other rows.

Again we need similar fields for the other months

Max Qty Month

WINDOW_MAX(IF ATTR([Selected Category]) AND SUM([Quantity]) = [Max Qty per Category] THEN MIN([Month Year]) END)

Min PR Month

WINDOW_MAX(IF ATTR([Selected Category]) AND [Profit Ratio] = [Min PR per Category] THEN MIN([Month Year]) END)

Min Qty Month

WINDOW_MAX(IF ATTR([Selected Category]) AND SUM([Quantity]) = [Min Qty per Category] THEN MIN([Month Year]) END)

Format all these too, and add to the table with the appropriate table calc settings applied.

If you now change the value in the the Selected Category set, you should see the table update.

Building the Scatter Plot

On a new sheet, build out the basic scatter plot by

  • Order Date to Filter, filtered by years 2018 & 2019
  • Profit Ratio to Rows
  • Quantity to Columns
  • Category to Colour and adjust accordingly
  • Month Year exact date, continuous to Detail
  • Change Shape to filled circle
  • Adjust the Tooltip to suit
  • Change the axis titles to be capitalised
  • Change the sheet title
  • Remove all row/column borders & gridlines. Leave the zero lines and set the axis rulers to be a sold grey line

Add Selected PR to Rows, and change the mark type of that card to line, and add Month Year to path. In the image below, I still have Furniture selected in my set.

Add Month Year to the Label shelf too and format so the text is smaller and coloured to match mark colour

Make the chart dual axis, synchronise the axis and remove the Measure Names from the colour shelf of each mark. Hide the Selected PR axis.

If you now change the selected set value, you can see how the other categories ‘join up’.

Building the Insights Box

Take a duplicate of the data table built originally, and add Selected Category to Filter. If you’ve still got an entry in your set, this should reduce the rows to just those records for that Category.

Remove the fields Profit Ratio, Quantity, Selected PR from the Measure Values section, and remove Selected Category from Rows.

Create a new field

Index = Size

INDEX()=SIZE()

and add this to the Filter shelf, setting the value to true.

The INDEX() function will number each row, the SIZE() function will number each row based on the total number of rows displayed (ie all rows will display the same number), so INDEX()=SIZE() will return true for the last row only

Now we’ve only got 1 row displayed, we’ve got the info we need to build the data in the Insights box by

  • Move Month Year to Detail
  • Category to Colour
  • Move Max PR Month, Max Qty Month, Min PR Month, Min Qty Month to Text
  • Move Max PR per Category, Max Qty per Category, Min PR Per Category, Min Qty per Category to Text

This will probably show duplicate instances of the values. This is because we didn’t fix the table calculation setting of the Index=Size field. Edit that to compute by Month Year

and then re-edit the Index=Size filter to just be true again, and expand the display so you can see all the values.

Now you can easily format the Text to the required display

I used this site to get the bullet point image from

Building the dashboard & interactivity

Add the scatter plot to the dashboard, and remove the legends that automatically get added.

Float the Insights sheet and position bottom right, remove the title. fit entire view, and also remove the colour legend which will have been added.

Add a dashboard Set Action to assign values to the Selected Category set and remove all values when the selection is cleared

Additionally, add a highlight action that highlights the Category field only (this keeps the lines highlighted on selection of a single mark).

Now practice clicking on and off marks on the scatter plot and you should see your lines & insights box being added and removed.

Finally add the title and header line by adding a vertical container at the top above the scatter plot.

Add a text box for the title, and underneath add a blank object.

Set the padding of the blank object to 0 and the background to black. Then edit the height to 4, and adjust the height of the container if it’s now too large.

Make adjustments to the position of the floating insights box if need be so it isn’t overlapping any axis.

And that should be it! My published viz is here.

I’ll be taking a break for a couple of weeks, so will be on catch up when I return from some R&R.

Happy vizzin’! Stay Safe!

Donna

What is the distribution of total orders by customer?

Luke Stanke set the challenge this week, and posted a ‘sneak peak’ on Twitter before the challenge was formally released

Challenges from Luke can sometimes be on the harder end of the scale, so with the bit of extra time available and only the gif in the tweet as a clue , I had a play to see if I could get close at all. And surprisingly I could, so once it was formally issued, it was just a case of tidying up some of my calcs.

Calculations

First up we need to establish the basic calculations

# Customers

COUNTD([Customer ID])

# Orders Per Customers

{FIXED [Segment],[Customer ID]: COUNTD([Order ID])}

Plotting these out onto a table with Segment we get

Note #Orders Per Customers needs to be set to be a discrete dimension.

So while this shows us a summary of the number of customers, we’ve only got 1 mark showing the summarised count. When plotting the chart we need something else in the view that will generate more marks. This is Customer ID.

At this point in order to help us build up our tabular view to ‘see’ what’s going on, I’ll filter the table to just show Segment = Consumer, and I’ll add Customer ID to Rows

As expected, our # Customers is now showing a count of 1 per row, but the data is now expanded as we have now got a row (ie a mark) per customer for each # Orders Per Customer ‘bucket’. But we still need a handle on the total customers in each ‘bucket’.

Customers Per Order Count

WINDOW_SUM([# Customers])

Adding this to the view and setting the table calc to compute based on Segment & Customer ID, we get the summarised value back again.

But we don’t want to actually show that number of marks; the number of circles to plot on the chart is dependent on a user parameter:

pMarkIndicator

Based on the requirements, if the number of customers is 15 and the user parameter is 5, then 3 circles should be drawn (15 / 5 =3 ), but if the number of customers was 14, only 2 circles should be drawn (14 / 5 = 2.8), ie the number of circles will always be set to the integer of the equal or lesser value (essentially the FLOOR() function). This can also be achieved by

Marks to Plot

INT([# Customers per Order Count]/[pMarkIndicator])

For some reason FLOOR can’t be used in the above as a table calculation is being used, but INT does the job just fine, and adding to the tabular view and adjust the table calculation accordingly we get

ie for the Consumer Segment, 6 customers have made 1 order in total, so based on batching the customers into groups of 5, this means 1 circle should be displayed. Whereas, 30 customers have made 3 orders in total, so 6 circles should be displayed.

But we can’t actually reduce the amount of rows (ie marks) displayed – we either have 1 row (by removing Customer ID) or a row per customer. But that’s fine, we don’t need to.

What we want is something in our data to group each row into the relevant batch size.

First up let’s generate an ID per row for each customer that restarts for each #Orders Per Customer.

Index

INDEX()

Add this as a discrete pill to Rows, and adjust the table calc

Now we have this, we calculate which ‘column’ each customer can sit in, a bit like what we would do if we were building a small multiple table, arranging objects in rows and columns (see here for an example of what I mean).

Cols

[Index]%([Marks to Plot])

This uses the modulo (%) notation which returns the remainder of the division sum. Lets put this on the view

For customers who have only ordered once, and where we’re only going to plot 1 mark, the Cols value is the same (0) for all rows.

Whereas for customers who have ordered twice, and where we want to plot 2 marks, the Cols value is either 0 or 1.

We’ve now got a value we can use to plot on an axis. We’re still going to plot a mark for each Customer ID, but some marks will be plotted in the same position, ie on top of each other, which therefore looks like just one mark.

Let’s show this more graphically, by duplicating the sheet, deleting some pills, moving some around, changing some to continuous, and setting the mark type to circle as below

Change the pMarksIndicator parameter and the number of circles will adjust as required.

So far, so good. We’ve got the right number of marks, it’s just not looking as nice and symmetrical as it should be.

We need to shift the marks to the left. But how far it shifts is dependent on whether we’ve plotted an odd or even number of marks.

If we have an odd number, the middle mark should be plotted at 0. If we have an even number the middle two marks should be plotted at -0.5 and +0.5 respectively. The calculation below will achieve this

Cols Shifted

IF [Marks to Plot]%2 = 0 //we’re even
THEN [Cols] – ([Marks to Plot]/2) + 0.5
ELSE //we’re odd
[Cols] – (([Marks to Plot]-1)/2)
END

To demonstrate this, I’ve added Cols Shifted along side Cols on the viz (this time make sure all the table calculation settings (including the nested calcs) are applied to compute based on Customer ID only which is different from the calcs above)..

Now you can see how it all works, you can remove the Cols and the Segment from the Filter shelf.

And now its just a case of applying the various formatting to clean up the display, and adding to the dashboard.

My published version is here.

Happy vizzin’! Stay Safe!

Donna

Can you compare a 3-day vs 14-day moving average and describe the latest trend?

This week for #WOW2020, Ann provided a table calculation feast of a challenge! This certainly is not for the faint-hearted! As well as cracking all the table calcs, the challenge features multiple views, measure swapping, parameters, BANs, filtering, sorting …. it’s got it all going on!

Ann hinted you’d probably want to start with the table, and even if there hadn’t been a table output in the display, this is what I would have done. If you’ve read enough of my blogs, you’ll know I often like to build up a ‘check data’ sheet, which just contains the data I need in tabular form as a quick reference. When working with table calculations this is an absolute must have!

So let’s build out that Check Data table to start with. I have a feeling this is going to be a lengthy blog 🙂

Initial Set up

First up, the requirements stated that the latest date would be 7 June, but I found records with a 8 June date. All the associated info for this date was null though, so I set a data source filter to exclude this. This means I wouldn’t get any issues if I needed to store the max date in a FIXED LoD calculation at any point.

I also found it easier to rename a couple of the measures provided to match the output, so rename PEOPLE_POSITIVE_NEW_CASES_COUNT to New Cases and PEOPLE_POSITIVE_CASES_COUNT to Reported Cases. I’ll refer to these renamed fields going forward.

Building all the Calculated Fields

To build out the table, we’re just going to focus on one State & County, as there’s a lot of data. So add Province State Name = Tennessee and County = Davidson to the Filter shelf.

Add Report Date (discrete exact date – blue pill) and New Cases & Reported to Rows. As you scroll down, you’ll see data starting to come in on 8 March.

We want to create our moving average calculations

3 Day Moving Avg

WINDOW_AVG(SUM([New Cases]), -2, 0)

14 Day Moving Avg

WINDOW_AVG(SUM([New Cases]), -13, 0)

Notice the number of rows to average over is 1 less than you might expect, as the current row is included, so the calculation is saying ‘current row’ and 2 | 13 previous rows.

Add these to the table, and adjust the table calculation so it is explicitly calculating by Report Date. This would have happened automatically, as the calculation would have been computing ‘down’ the table, but it’s best to fix the computation, so it doesn’t matter where the pill gets moved to in the view.

We now need to work out whether there is an increase or not between the 3-day and 14-day average.

Is Increase?

IF [3 Day Moving Avg] > [14 Day Moving Avg] THEN 1 ELSE 0 END

Is Decrease?

IF [3 Day Moving Avg] <= [14 Day Moving Avg] THEN 1 ELSE 0 END

I’m using 1s and 0s as it’s going to help with a later calculation.

NOTE – I’m assuming that if there is ‘no change’ it’ll be recorded as a decrease. This is how I interpreted the requirement, “ …whether it is an increase or a decrease (or no change)” and it wasn’t easy to find any matches anyway.

I also need some text to indicate the increase or decrease

Increase | Decrease

UPPER(IF [Is Increase?]=1 THEN ‘Increase’ ELSE ‘Decrease’ END)

The UPPER is used as that’s part of the tooltip formatting.

Let’s get these onto the view, always making sure the table calculations are set to Report Date.

We need to calculate the number of days that has been reported INCREASE in succession, and the number of days where successive DECREASE has been reported.

So first, let’s identify which rows match the previous row.

Match Prev Value?

LOOKUP([Is Increase?],-1) = [Is Increase?]

If the value of the Is Increase? field in the previous (-1) row is the same as the Is Increase? field in the current row, then this is true, else false.

Add to the view, and verify the table calculation for itself and all nested calculations being referenced, is set to Report Date.

We now have all the information we need to help us work out the number of days in the increase/decrease ‘trend’.

Days in Trend

IF (FIRST()=0) OR(NOT([Match Prev Value?])) THEN 1
ELSEIF [Increase | Decrease] = ‘INCREASE’ THEN ([Is Increase?]+PREVIOUS_VALUE([Is Increase?]))
ELSEIF [Increase | Decrease] = ‘DECREASE’ THEN ([Is Decrease? ]+PREVIOUS_VALUE([Is Decrease? ]))
END

If the row in the table is the very first entry (so there’s nothing previous to compare against), or the row in the table didn’t match it’s predecessor (ie there was a change), then we’re starting a new ‘trend run’, which obviously starts at 1.

Otherwise, if the current row we’re on indicates an increase, then we’ll add the value of the Is Increase? field (which is 1) to the previous value (which is also 1). PREVIOUS_VALUE works recursively though, so it essentially builds up a running sum, which gives our trend.

We ultimately do the same thing using the Is Decrease? column. This is why using 1 & 0s in the earlier calculation help.

Adding into the view, and setting the table calculation correctly, you should get something similar to this…

Finally, there’s one key field we need to add; something to help identify the latest row as we will need it for filtering in the table that’s displayed on the dashboard. Simply applying a standard ‘quick filter’ won’t work, as the table requires we show the 3-day & 14-day moving averages. A ‘quick filter’ to limit the data to the latest date (7th June), will show the wrong values, as the data related to the other days will be filtered out, so the table calc won’t have the information to correctly compute over.

We need to create another table calculation that we can use as a filter, and that due to Tableau’s ‘order of operations’ will apply later in the filtering process than a traditional quick filter.

Max Date

{FIXED : MAX([Report Date])}

The latest date in the whole data set.

Show Data for Latest Date

LOOKUP(MIN([Report Date]),0) = MIN([Max Date])

If the Report Date of the current row is the same as the maximum date in the whole data set, then return true.

We’ve now got all the core data components we need to create the various charts.

In the interest of time (my time in writing this out), I’m going to attempt not to describe the building of all the charts in too much detail, but just call out the useful bits you might need. If you’re attempting this challenge with the table calcs above, I’m assuming you know Tableau enough to not need everything defined to the lowest level.

The whole report is driven off a parameter which the user must enter a State – County combo.

You’ll need a calculated field to store the combo

State – County

[Province State Name] + ‘ – ‘ + [County]

and then create a parameter (State – County Parameter) off of this (right click, Create -> Parameter) which will create a string parameter with all the permutations.

When displaying on the dashboard, set this to be of type Type In

BAN

The BAN is a basic summary of the latest trend for the entered state county.

We need to filter the sheet to the value entered in the parameter

Is Selected State County?

[State – County Parameter] = [State – County]

Add this to the Filter shelf as true, along with the Show Data for Latest Date.

Add the relevant fields to the Text shelf to display the required text. The Report Date needs to be custom formatted to ddd, mmm d to get the Sun, Jun 7 display

Map

For the map, as well as filtering the latest date, we’re also going to need to filter just to the state only (not state & county) as above. So I created

Is Selected State?

LEFT([State – County Parameter], FIND([State – County Parameter],’-‘)-2) = [Province State Name]

This is unpicking the State – County combined string stored in the parameter, to just find the State part and compare to the Province State Name.

Build a filled map based on County and filter to the latest date and the selected state. I set the Map Layers to that below, which seems to match up

You’ll need to set both the Is Selected State County? and Increase|Decrease fields to the Colour shelf.

Bar & Line Chart

You’re going to need a few more calculated fields for this.

Moving Avg Selector

for the user to choose what the line should display. I’ve set it to an intger parameter that displays text

We then need a field to show on the display depending on what’s been selected in the parameter

Moving Avg to Display

If [Moving Avg Selector] = 3 THEN [3 Day Moving Avg] ELSE [14 Day Moving Avg] END

You’ll need a Dual Axis chart plotting New Cases and Moving Avg to Display against Report Date (continuous exact date)

The data only starts from 8th March, so I added Report Date to filter to start from 8th March. 8th March is also added as a constant reference line.

Table

Based on the State / County entered, the table is filtered to show the data for the latest date for all the counties in the state entered. Although not stated in the requirements, the first row is the county selected, with the rest ordered by Reported Cases.

You can get the selected county to the top, by adding Is Selected State County as a hidden field to the Rows, and moving ‘True’ to the top.

And that should be everything you need to build the dashboard, which is pretty much just stacking all the sheets one on top of each other in a single column.

My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Can you build a dynamic date drilling chart?

Week 19 of #WOW2020 saw Ann provide this challenge, which she declared would be easier than Luke’s from the previous week.

The aim was to start with a chart by week, but on selection drill in to show a by day view, with the slight twist that all days between the first and last selected dates should display, even if the specific week in between wasn’t explicitly selected.

I knew this was going to involve set actions (although there was a clue on the ‘latest challenges‘ page – I tend to pick up the challenges through Twitter, so often bypass this page, but it’s worth a look for extra clues :-))

So on the face of it, the challenge seemed as if it should be ok, but it’s usually only when I start building that things can start to unravel.

One area that Ann hadn’t been explicit about in the requirements, was the behaviour in certain scenarios. I had to revise some of my initial attempts/calculations in order to match Ann’s workbook. This week I’m not going to go into all the wrong turns, but this is the behaviour I observed in Ann’s workbook, and so ultimately tried to replicate.

  1. When at the week level, if multiple points are selected, the days then displayed should start from the first day of the first week selected (which will be a Sunday as we’re working with how the US standardises a week start), and finish on the last day of the last week selected (which will be a Saturday).
  2. When at the week level, if a single point is selected then the 7 days in that week only should be displayed.
  3. When at the day level, if multiple points are selected, the days then displayed start from the first day of the week the first day is in and end on the last day of the week the last selected day is in (so you may get more days than actually selected). eg if the 1st day selected is a Tuesday and the last day selected is a Friday, the ‘drill in’, will start on the previous Sunday, and end on the following Saturday. This isn’t necessarily what you may expect to happen.
  4. Based on the above, when at the day level, a single day is selected, the ‘drill in’ will show the 7 days in the same week. Once you’re down to displaying the 7 days in the same week, clicking on any single day or selecting a couple of days, won’t appear to do anything as the same results are displayed.

Being aware of the above, will explain why some the calculations I end up with look the way they do.

Ok, let’s get on with the build…..

Ann stated that she’d hardcoded to the ‘last 52 weeks’, but essentially it looked like she was displaying data for all of 2019. This wasn’t really something the challenge was testing, so I simply started by adding a data source filter for the year 2019 (right click on data source -> Edit data source filters). I also had to set the date properties of my data source to set a week to start on a Sunday as I’m UK based, so my week’s are defaulted to start on Mondays (again right click on data source -> Date Properties). As a result of my year filter, I didn’t seem to have the same starting/ending dates as Ann, but as already stated, this wasn’t the main aim of the challenge, so I didn’t stress about it.

I decided quite early on that I was going to also use a Parameter Action to decide the level I was at (ie had I drilled in or not). I wasn’t entirely sure when I started how many ‘levels’ I might need, so chose to use an integer parameter for this. I ended up only needing 2 levels, so a boolean could have worked equally as well, or, as I type, I think I could have used this parameter to store the ‘date level’ (day or week) I want to display my dates at, which would have made some of my calculations easier to read. I’m not going to do this though.

Drill Down

Integer parameter set to 0 by default.

Now I want to define a date field that I’m going to use on the axis, that will vary depending on the ‘level’ we’re at.

Date to Plot

IF [Drill Down]>0 THEN DATETRUNC(‘day’, [Order Date]) ELSE
DATETRUNC(‘week’, [Order Date])
END

Note – I used >0 as I wasn’t sure if I’d have levels 0, 1 & 2… as it turns out I just used 0 & 1 in the end.

Add Date to Plot to Columns as a Continuous, Exact Date and Sales to Rows and we’ve got our starting point

Change the Drill Down parameter to 1 and the chart will change to display at the day level

Drilling Down on selection

On selection, we want to add the selected dates into a set, so first up, we need to define that set.

Right click on Date to Plot and Create -> Set

Selected Dates

Select some random dates so we can test with. These will get set properly later based on the Set Action we define on the dashboard.

Based on the dates in the set, we need to determine a min and a max date we can then use to restrict the dates being plotted on the chart.

Min Date

{FIXED:MIN(IF [Selected Dates] OR [Drill Down]=0 THEN DATETRUNC(‘week’,[Date to Plot]) END) }

This either gets the 1st day of the week based on the earliest date in the set, or the 1st day of the week of the whole data set (when we’re at the ‘starting’ level with Drill Down =0).

We also need

Max Date

IF [Drill Down]=0 THEN
{FIXED:MAX(DATEADD(‘day’,-1,[Date to Plot]))}
ELSE
DATEADD(‘week’,1,{FIXED:MAX(IF [Selected Dates] THEN DATETRUNC(‘week’,[Date to Plot]) END)})-1
END

If we’re at the starting level (Drill Down = 0) then we want the last day within the latest week in the data set (this field is used in the title display, so necessary to get this to display right), otherwise we need to get the last day of the week associated to the maximum date in the set.

Now we need to be able to restrict the dates displayed in the chart based on these

Dates to Include

[Order Date]>= [Min Date] AND [Order Date]<= [Max Date]

Add this to the Filter shelf and set to True.

Your display shouldn’t change, as we’re still at ‘level 0’, even though our set has random dates selected. If you now manually change Drill Down to 1, you should see a change

Setting up the dashboard actions

Ok, now we’ve got the basic idea, let’s get it all working properly with dashboard actions before we sort out all the other bits n bobs.

First up, let’s manually reset everything by setting the Drill Down parameter to 0 and emptying all the values selected in the set.

Add the sheet onto the dashboard, and create the set action, which is set to target the Selected Dates set and to Keep set values if you click on a blank area of the chart after selection.

However, this on it’s own won’t change the display. We need to set the Drill Down parameter to 1 too.

For this we need another field

Set Drill Down Level

1

Add this to the Detail shelf of the chart.

Then back on the dashboard, add a Parameter Action that targets the Drill Down parameter using the value stored in the Set Drill Down Level field

Now if you select the dates in the dashboard, you should get the desired behaviour, and if you select again, you should filter the days selected further (as per the behaviour described at the top of the page).

Resetting the display

The Reset button is actually another sheet.

I created a ‘fake’ bar chart by adding MIN(1) to Rows, and double clicking in the space below the Detail and Tooltip shelves on the Marks card, and typing the text ‘CLEAR SELECTION’. This creates a ‘pill’ without having it defined explicitly as a calculated field, and I added this to the Text shelf, and centred/formatted appropriately. I then set the axis to be fixed from 0 to 1 and hid it.

I set the colour to #a26dc2 and set the text to ‘match mark colour.

We only want this sheet to display, if we’ve ‘drilled down’, so I need

Show Reset

[Drill Down]=1

which is added to the Filter shelf and set to True.

I also need a parameter action off this sheet, to reset the Drill Down to 0 on selection. For this I need another field

Reset

0

which is added to the Detail shelf.

Depending on what you’ve already been playing around with, there’s a chance this sheet may already be empty. Get the dashboard into a state where you’ve drilled down to the day level, then add this sheet, and add another Parameter Action.

Reset is set to run off the ‘button’ sheet only, to target the Drill Down parameter by using the value in the Reset field.

You should now be able to test all this out and get the desired behaviour.

Adding Animations

To get the chart to transition between selections, we need to use the Animations functionality.

On the Format menu, select Animations and adjust the settings as you choose.

Play around and you should have the main features of this challenge now working

Dynamic Title

The title needs to change based on whether you’re at the weekly or daily level. It also needs to show total and average sales. So for all this I need

LABEL: Level

IF [Drill Down]>0 THEN ‘Day’ ELSE ‘Week’ END

LABEL: Subtitle Level

IF [Drill Down]>0 THEN ‘Daily’ ELSE ‘Weekly’ END

LABEL: Instruction

IF [Drill Down] = 0 THEN ‘SELECT WEEKS TO DRILL DOWN TO DAILY VIEW’
ELSE ‘CLEAR SELECTION USING BUTTON’
END

Total Sales

WINDOW_SUM(SUM([Sales]))

Avg Sales

WINDOW_AVG(SUM([Sales]))

All these fields are added to the Detail shelf of the main chart along with Min Date and Max Date, and then the title is edited and formatted accordingly to reference these.

Tooltips

In a similar manner, the tooltips also need adjusting, they just need to refer to the LABEL: Level field

Colour of the Line

Although not explicitly mentioned, the line colour seemed to change from a lighter shade at the weekly level to more intense at the daily level. To achieve this I created

COLOUR

[Drill Down]

and added to the Colour shelf. This gives a colour legend displaying either 1 or 0 depending at what state you are in the interaction. Set colour values accordingly.

Average Line

Add this to the chart simply by selecting Average Line from the Analytics pane and dragging then dropping onto the Table shelf that appears. Format appropriately.

And apart from other basic formatting to remove gridlines/ axis titles etc, that should be it.

My published viz is here.

Happy vizzin’! Stay Safe!

Donna

How does the Sales Pipeline look?

Luke posted the challenge this week to build a Sales funnel using bar charts and table calculations.

The data provided was structured as follows

Each row identified a record in the pipeline along with its value. The last_stage field indicated what stage in the process the record was currently at. The value and last_stage are the only two bits of data needed for this challenge, along with the knowledge that the process moves through the stages in the following order :

  1. Prospect
  2. Lead
  3. Qualified
  4. Opportunity
  5. Negotiations
  6. Closed

which you can tell from the diagram anyway.

The challenge is that the value of any record currently at stage 2 (lead) or above, also needs to be incorporated into the total value of the previous stages.

When tackling these types of challenges involving table calculations, I start by creating a table view of all the data I need. So let’s get cracking…

Building up the data required

First up, I need a field to help drive the order of the stages, so I created

last_stage Order

CASE [last_stage]
WHEN ‘Prospect’ THEN 1
WHEN ‘Lead’ THEN 2
WHEN ‘Qualified’ THEN 3
WHEN ‘Opportunity’ THEN 4
WHEN ‘Negotiations’ THEN 5
ELSE 6
END

Add this to a view along with the value

This is essentially the data needed to build the Current Status column in the output.

Next we need to work out the Overall Funnel values. As Luke stated in the requirements, this is going to involve table calculations. What we want is a running sum but one that starts at the bottom at stage 6, and goes ‘up’ the table. But there is no Table Up option in Tableau, so we need to be creative.

We’ll still need a running sum though, so create this

Running Sum

RUNNING_SUM(SUM([value]))

Add this to the view, it will automatically be applied Table Down, which is what we need at this point – each row is the sum of the previous rows above it.

We also need the overall total captured against each row in the table.

Window Sum

WINDOW_SUM(SUM([value]))

Now we’re in a position to work out our ‘cumulative’ value or Overall Funnel, which is a calculation involving all 3 fields

Cumulative Value

[Window Sum]- ([Running Sum] – SUM([value]))

Note – for the values that will be displayed, I’ve applied formatting of $ to 0 dp.

Now onto the final Percent to Close calculation which is the value of records at stage closed, as a proportion of the cumulative value. So we need to get the closed value stored against every row

Closed Value

{FIXED : SUM(IF [last_stage order] = 6 THEN [value] END)}

so then we can determine

% To Close

SUM([Closed Value]) / [Cumulative Value]

which is formatted to percentage to 0 dp

So now we have all the data we need to build the viz. I like to save this sheet for future reference, and to double check as we start moving pills around to build the viz.

Building the viz

Start by duplicating the table sheet, and remove the calculations, so only the 3 necessary (Sales, Cumulative Value & % To Close) are listed.

Then move these 3 onto the Columns shelf, and in doing so, the viz should reformat as a bar chart automatically.

Set the colour of the bars to the dark green (#00646d) and tick the Show mark labels option on the Label shelf (you may need to expand the width of the rows to get the label to show.

The first 2 columns are displaying what we need, but the final one needs to show the bars ‘filling up’ to 100%. We need a dual axis for this.

Type into the Columns shelf Min(1) to create a fourth column. Uncheck show mark labels for this column only.

The set this field to be Dual axis, and synchronise the axis. It’s likely the marks will all change to circles, so reset the ‘All’ marks card back to bar. On the Min axis, right click and select Move marks to back, to ensure the % Closed values are sitting on top.

Now change the colour of the Min(1) field to light green (#7cadb2), untick Show Header against last_stage order & SUM([Value]) to hide the axis. Also right click on last_stage in the view, and select Hide field labels for rows.

Finally adjust the size of the labels displayed to 8pt. On the % To Close marks card, change the label alignment to left, and set the font colour to white. Uncheck Show Tooltips for all marks.

And you should now have the main display.

Oh, adjust the Size of the bars to suit.

When added to the dashboard, add a horizontal container above the viz, and use text boxes to display the column titles.

My published viz is here.

Happy vizzin’!

Donna

Can you build a dynamic scatter plot with proportional brushing?

It’s #data19 in Vegas, so this week’s #WorkoutWednesday challenge, set by Curtis was run ‘live’ at the Conference. Not being there, I had to wait patiently at home until it was released and build solo 😦

Given it was a live challenge, Curtis wanted to ensure it offered the right balance for those newer to Tableau & #WorkoutWednesday, while still giving the more experienced participants something to get their teeth into. And I think he managed this well.

Building the scatter plot

The scatter plot is built to be dynamic, with both the axes and displayed marks changing based on user selection. This is driven by 3 parameters

Set Level of Detail

String parameter, containing 3 values : Product, Manufacturer, Customer

(quick side note here…. the challenge states to use the Superstore Sales dataset that comes with v2019.1. To get the Manufacturer field, you need to use the provided Tableau data source and not the excel file – this always trips me up!).

Set X-Axis

String parameter containing 4 values : Quantity, Sales, Orders, Discount

(another note … the challenge says to use Profit, but the solution provided used Orders – I chose to use Orders too so I could verify a match)

Set Y-Axis

The same as above (just duplicate the parameter and rename).

So with the parameters in place, we need to build calculated fields to reference them.

LOD

CASE [Set Level of Detail]
WHEN ‘Customer’ THEN [Customer Name]
WHEN ‘Manufacturer’ THEN [Manufacturer]
WHEN ‘Product’ THEN [Product Name]
END

X-Axis

CASE [Set X-Axis]
WHEN ‘Quantity’ THEN SUM([Quantity])
WHEN ‘Sales’ THEN SUM([Sales])
WHEN ‘Orders’ THEN COUNTD([Order ID])
WHEN ‘Discount’ THEN SUM([Discount])
END

Y-Axis

CASE [Set Y-Axis]
WHEN ‘Quantity’ THEN SUM([Quantity])
WHEN ‘Sales’ THEN SUM([Sales])
WHEN ‘Orders’ THEN COUNTD([Order ID])
WHEN ‘Discount’ THEN SUM([Discount])
END

(basically duplicate X-Axis and edit & rename)

With these we can now build the basic scatter : [X-Axis] on Columns, [Y-Axis] on Rows and [LOD] on Detail :

Changing the parameters and the scatter will change too.

So now its just a case of formatting and adding a few additions to match the display.

First up, the solution has the X-Axis scale on the top, rather than the bottom. This is achieved by dual-axis.

Duplicate the [X-Axis] on the Columns shelf (Press Ctrl, click on the [X-Axis] pill and drag next to it.

Make dual-axis and synchronise

Edit the bottom axis, deleting the title of the axis, and changing the Tick Marks to None.

The solution also shows each axis is labelled based on the parameter selection. To do this, edit both the left and top axis to remove the titles, then add the [Set X-Axis] parameter to Columns and [Set Y-Axis] parameter to Rows.

Now right-click on the displayed Set X-Axis and Set Y-Axis and ‘hide field labels for columns/rows’, and right-click on Discount and ‘rotate label’.

Finally, adjust the tooltip to contain :

<Parameters.Set Level of Detail> : <LOD>

That’s the core viz built, it just needs some formatting applied, to

  • remove column and row lines
  • remove gridlines (the 0-lines are retained)
  • Change mark type to circle, adjust colour to suit and add border

Bar Charts & Proportional Brushing

When it’s all together on the dashboard, the selection of marks in the scatter should cause the bar charts to update to show the % of each variable accounted for in the selection. This concept is referred to as proportional brushing, and makes use of the Set Actions feature. Tableau’s own KB article describing Set Actions and proportional brushing can be seen here.

The first thing we need to do is build a set : right click on the [LOD] field and select Create -> Set. In the Set dialog, name the set Selected LOD and randomly select some values (not all)

We also need various calculated fields :

# LOD

COUNTD([LOD])

Just counts the number of distinct items in the user selected LOD. When bringing the Selected LOD set into play, we can the count the number of items in or out of the set, and using a quick table calculation, change to get the % of total. I tend to start using data tables at this point, to sense check the data I need before I viz :

I’ve turned on Grand Totals for columns too, just so I can check the numbers are what I expect.

Adding the [X-Axis] and [Y-Axis] values into the table, and finding % of total too

and we’ve got all the % needed to make up the bars. At this point, edit each table calculation and fix it to compute by the In/Out Selected LOD (so if you shift pills around later, you won’t change the values).

While we could build the viz at this point, I noticed that the tooltip on the bar chart displays the same information whether you hover the ‘in’ or the ‘out’ portion of the bar

With the data we currently have, we couldn’t do this, so we need a few more fields/calculations. We need to be able to get the total value of the LOD/measure against each row, as well as the number and % of the ‘In’ items.

We’ll use table calculations for this.

Total #LOD

WINDOW_SUM([# LOD])

# LOD Selected

ZN(WINDOW_MAX(IF ATTR([Selected LOD]) THEN COUNTD([LOD]) END))

This counts the number of items selected in the set, and the WINDOW_MAX statement will then ‘spread’ that value across both the rows (as only 1 row contains a value).

% LOD Selected

[# LOD Selected]/[Total # LOD]

This is formatted to 1 dp.

Adding into the ‘check’ table, you can see how the values with table calculations are matching the initial set

A similar set of steps are the repeated to get the X & Y axis values

So now I’m happy I’ve got all the fields needed to build the bars, we can, and this is done in 3 separate sheets (it can be done on a single sheet, but I noticed some spacing on Curtis’ solution that suggested it wasn’t).

LOD Bar

You can build from scratch, or as I tend to do, duplicate the check sheets, remove the pills I don’t want, and shuffle the rest around. You ultimately want :

  • [# LOD] as % of total on Columns
  • [Selected LOD] set on Colour (adjust colours, and make sure the Out is listed before In on the colour legend, so the darker values show first)
  • [Total LOD], [# LOD Selected] , [% LOD Selected] on Tooltip

The format to remove axis title, show axis ticks etc.

Name the sheet, then duplicate to create an X-Axis version, dragging the equivalent pills over the top, and do the same for the Y-Axis version. If you do this, just double check all the values are matching your check sheet, and the table calc settings haven’t been lost.

Putting it all together on the dashboard

Build the dashboard, and add the 4 sheets (the scatter and 3 bars) into their appropriate place, and expose the 3 parameters too.

To get the proportional brushing technique to apply to the bars, add a Set Action to change the value of the Selected LOD set when marks are selected in the scatter plot. The Set Action needs to be set to ‘remove all values from set’ when the marks are unselected.

And that’s it! My published version is here. Enjoy!

Happy vizzin’

Donna