Can you create a Monthly Driver Analysis dashboard?

November 23, 2023 Donna ColesLeave a comment

Global Recognition month continued this week for #WOW2023, and I was able to enlist Norbert Borbas to set the challenge this week, which was published in both Norbert’s native Hungarian, as well as English.

Norbert provided a challenge based on a solution he had implemented at his company, and involved the creation of 2 dashboards with interactivity between them both. There’s a fair amount going on with this one, so let’s get cracking.

Building the Sales KPI

For this viz, we need to get information about the latest year sales in conjunction with the previous year. Rather than hardcode any years relating to the data, I created

Latest Year

{MAX(YEAR([Order Date]))}

which for the data set I was using, returns 2022. Drag this field into the ‘dimensions’ section of the left hand data pane (ie above the line).

With this I then created

LY Sales

IF YEAR([Order Date]) = [Latest Year] THEN [Sales] END

to the sales for 2022. Format this to $ with 0 dp.

To get the sale for the previous year (ie 2021) I created

Previous Year

[Latest Year]-1

Drag this field into the ‘dimensions’ section of the left hand data pane (ie above the line), and then create

PY Sales

IF YEAR([Order Date]) = [Previous Year] THEN [Sales] END

Format this to $ with 0 dp.

We then needed the % difference between these values

% Diff Sales From PY

(SUM([LY Sales]) – SUM([PY Sales]))/SUM([PY Sales])

Apply custom formatting set to +0.0%;-0.0%;0.0% which explicitly adds the + sign in front of positive values.

To help with the interactivity that is required in the dashboard, I also created

Sales Label

‘Sales’

With all these, we can build the KPI ‘card’.

Add Sales Label to Detail, and LY Sales, PY Sales, Previous Year and % Diff Sales From PY to the Label/Text shelf.

Format the text so it is aligned middle centre, and the arrange the text as required

Set the view to Entire View and stop the Tooltip from displaying (uncheck show tooltip). Name the sheet Sales KPI.

Building the Profit KPI

Repeat similar steps as above, apply formatting to the fields as required. You’ll need

LY Profit

IF YEAR([Order Date]) = [Latest Year] THEN [Profit] END

PY Profit

IF YEAR([Order Date]) = [Previous Year] THEN [Profit] END

% Diff Profit From PY

(SUM([LY Profit]) – SUM([PY Profit]))/SUM([PY Profit])

and

Profit Label

‘Profit’

Name the sheet Profit KPI.

Building the Line Chart

The line chart needs to change based on whether the Sales KPI or the Profit KPI sheet has been selected. We need a parameter to capture this ‘decision’.

pMeasureToShow

string parameter defaulted to Sales

To the determine which actual value to display we need

Line – Measure to Display

IF [pMeasureToShow] = ‘Sales’ THEN [Sales] ELSE [Profit] END

format this to $ with 0dp.

I also created

Line – Measure to Display Axis (k)

IF [pMeasureToShow] = ‘Sales’ THEN [Sales] ELSE [Profit] END

ie the same field, but this was formatted to $ with 0 dp but display units of Thousands (k).

Having the two fields means that the axis can display in one format while the tooltip can show the more detailed value.

On a new sheet add Order Date to Columns and change to the discrete ‘month’ level (blue pill – May). Add Line – Measure to Display Axis (k) to Rows. Add Order Date to Colour. This will default to the Year level, and show all years from the data set, but we only want the latest 2 years. So create

Filter Years

YEAR([Order Date]) >= [Previous Year]

and add to the Filter shelf and set to True. Adjust colours to suit.

Add Order Date to the Label shelf and change to the Year level. By default the lines should be labelled. Edit the label and set the font to match mark colour. I also set the font to be Tableau Medium and bold. Adjust the order of the years in the colour legend, so 2022 is listed first which makes the line for 2022 sit ‘on top’ of the 2021 line.

Remove all gridlines/row column dividers, and set the axis lines to be bolder. Hide the Order Date label (right click > hide field labels for columns). Adjust the formatting of the Order Date axis, to display the months in an abbreviated form. Adjust the title of the y-axis to reference the pMeasureToShow parameter (right click the axis > edit).

Add Line – Measure to Display to the Tooltip shelf.

Adjust the Tooltip to display as

Finally, to help with the interactivity later, we will need

Month Order Date

DATEPART(‘month’, [Order Date])

This returns the number of the month ie 1 for January, 2 for February etc. Move this to the ‘dimensions’ section of the left hand data pane (drag above the line), and then add this to the Detail shelf. Change the field to be a discrete attribute

Name the sheet Line Chart.

Building the Symbol Chart

On a new sheet add Filter Years to the Filter shelf and set to True. Add Order Date to Columns and change to be at the discrete month level. Double click into the Rows shelf and manually type in MIN(0). Add Month Order Date to the Detail shelf.

We need to display coloured arrows depending on whether the change is up or down. For this we need

Symbol – Difference to Display is +ve

IF [pMeasureToShow] = ‘Sales’ THEN IIF([% Diff Sales From PY]>=0,TRUE,FALSE)
ELSE IIF([% Diff Profit From PY]>=0,TRUE,FALSE)
END

If the measure to display is Sales, and the difference in Sales from previous year is +ve, then return true, otherwise false, Else if the measure to display is Profit and the difference in Profit from the previous year is +ve, then return true, else false.

Change the mark type to Shape and then add this field to both the Colour shelf and the Shape shelf. Adjust colours and shapes accordingly.

Edit the axis and delete the title and set the major and minor tick marks to None. We need the axis to remain as we will need to ‘line up’ this chart with the line chart, and having a left hand axis will help.

Hide the months from showing (uncheck show header against the pill on Columns. Hide all gridlines, axis lines, zero lines & row/column dividers.

Name the sheet Symbol Chart.

Building the Main dashboard

Using horizontal and vertical layout containers, position the sheets in the required locations along with the title and the instructional text. Use background colours and inner & outer padding to give space between the objects.

For the line chart and symbol chart, these were placed in a vertical container, and the width of the ‘blank’ y-axis on the symbol chart widened to be in line with the axis on the line chart. The hierarchy of objects I used is pictured.

To make the Sales display on the line chart when the Sales KPI sheet is clicked, create a dashboard parameter action

Show Sales Line

On select of the Sales KPI sheet, set the pMeasureToShow parameter, passing in the value from the Sales Label field. When the selection is cleared, keep the parameter set to the current value.

And create a similar action to show the profit

Show Profit Line

On select of the Profit KPI sheet, set the pMeasureToShow parameter, passing in the value from the Profit Label field. When the selection is cleared, keep the parameter set to the current value.

We will need to return to this later, to add more interactivity, but for now we’ll move onto the analysis/drill down sheet.

Building the drill down table

We’re going to need a few more fields to build this type of display. Firstly, for the first bar chart column we need

Bar – LY Measure to Display

IF [pMeasureToShow] = ‘Sales’ THEN [LY Sales] ELSE [LY Profit] END

Bar – PY Measure to Display

IF [pMeasureToShow] = ‘Sales’ THEN [PY Sales] ELSE [PY Profit] END

Format both these fields to $ with 0dp.

Add State/Province to Rows and Bar – LY Measure to Display to Columns. Sort the states by the measure descending. Adjust the colour of the bar to suit. Add Bar – PY Measure to Display to the Detail shelf and add a reference line per cell that displays the average of this measure.

Show mark labels and adjust the font of the labels to be size 8pt. Widen each row, and align the State labels to the left, and change the font to be bold & black. Reduce the Size of the bars. Remove gridlines, but add row dividers.

Double click into the columns shelf and manually type MIN(0) to create a ‘fake’ axis and generate a MIN(0) marks card.

Create a new field

Measure Rank LY

RANK(SUM([Bar – LY Measure to Display]))

and add this to the Label shelf of the MIN(0) marks card. Adjust the table calculation so it is explicitly set to Compute by State/Province. Remove the Bar – PY Measure to Display field from the Detail shelf. Change the mark type to shape.

To determine what type of shape and what colour to apply, we need

Measure Rank PY

RANK(SUM([Bar -PY Measure to Display]))

and then

Measure Rank Change

IF [Measure Rank LY] < [Measure Rank PY] THEN ‘Up’ ELSEIF [Measure Rank LY] > [Measure Rank PY] THEN ‘Down’
ELSEIF ISNULL([Measure Rank LY]) THEN NULL
ELSE ‘N/A’
END

Add this field to both the Shape and the Colour shelf, and adjust the table calculation so it is explicitly set to Compute by State/Province.

Adjust the shapes, and use a transparent shape against the Null option (see here for details). Adjust colours to suit. Increase the size of the shape, and align the label to the left.

For the next column, create

Bar – Measure Difference

SUM([Bar – LY Measure to Display]) – SUM([Bar -PY Measure to Display])

and custom format to +”$”#,##0;-“$”#,##0

Add to the Columns shelf, and labels should automatically get added.

Create a field

Bar – Measure Diff is +ve

[Bar – Measure Difference] >=0

and add to the Colour shelf of this marks card. Adjust colours to suit.

For the final column, we need to separately identify the values when the YoY measure difference is positive from those that are negative, and then apply ranking to each of these fields. So we need

+ve Measure Diff

IF [Bar – Measure Diff is +ve] THEN [Bar – Measure Difference] END

-ve Measure Diff

IF NOT([Bar – Measure Diff is +ve]) THEN [Bar – Measure Difference]*-1 END

Note, as the difference in this instance is negative, the values returned will also be negative, but when it comes to ranking, we want the record with the biggest negative difference to be ranked 1st ie if one value had a difference of -10 and another had a value of -100, in typical ranking, -10 is ‘higher’ than -100, so -10 would be ranked 1 and -100 2. But we want -100 to be ranked 1. So by multiplying the values by -1 in the calculation we actually return values 10 and 100. So when we rank them later, 100 is ranked 1 as it is bigger than 10.

Ranke +ve Measure Diff

RANK_UNIQUE([+ve Measure Diff])

Rank -ve Measure Diff

RANK_UNIQUE([-ve Measure Diff])

We will be displaying the information for the positive and negative ranks in separate ‘columns’ which we can do with

Rank YoY X-axis

IIF([Bar – Measure Diff is +ve], 1,2)

Add this field to columns and change the mark type to Circle. Add Bar – Measure Diff is +ve to Colour. Add Rank +ve Measure Diff and Rank -ve Measure Diff to Label. Ensure the table calculations for both fields are explicitly set to Compute by State/Province. Increase the Size of the circle, and align the label to be middle centre using a bold white font.

Now we have all the information displayed, we need to sort the tooltips.

This sheet, will be accessed through interaction and will be ‘filtered’ to just a specific month. For now, we’ll ‘hardcode’ the month by adding Month Order Date to the Filter shelf and selecting 3 (for March).

On the All marks card, add Latest Year, Previous Year, Month Order Date, Measure Rank PY to the Tooltip shelf.

We will also need

TOOLTIP – Rank statement decrease

IF [Measure Rank Change] <> ‘Up’ THEN MIN([State/Province]) + “‘s ” + [pMeasureToShow] + ‘ Rank decreased vs. same month last year.’ END

TOOLTIP – Rank statement increase

IF [Measure Rank Change] = ‘Up’ THEN MIN([State/Province]) + “‘s ” + [pMeasureToShow] + ‘ Rank increased vs. same month last year.’ END

TOOLTIP – Rank YoY Statement Negative

IF NOT([Bar – Measure Diff is +ve]) THEN MIN([State/Province]) + ‘ is a negative (Rank: ‘ + STR([Rank -ve Measure Diff]) + ‘) contributor to the overall YOY increment in the selected month.’ END

TOOLTIP – Rank YoY Statement Positive

IF [Bar – Measure Diff is +ve] THEN MIN([State/Province]) + ‘ is a positive (Rank: ‘ + STR([Rank +ve Measure Diff]) + ‘) contributor to the overall YOY increment in the selected month.’ END

Add all 4 of these fields to the Tooltip shelf of the All marks card. Ensure all the table calculation fields are set to explicitly Compute by State/Province.

Now adjust the tooltip with all the relevant fields, applying colouring as required

Hide the axis and hide the null indicator. Hide the State/Province column label heading. Finally, remove the Month Order Date field from the Filter shelf. The tooltip will look a bit funny at this point, but that will get sorted later.

Name the sheet Drill Down Table.

Building the drill down dashboard

Again using vertical and horizontal containers, arrange the sheet on a dashboard along with the title. Use text boxes arranged in a horizontal container directly above the Drill Down Table sheet to display the column headings.

As I didn’t want to hardcode any years, I created the following parameters

pLatestYear

integer parameter defaulted to 2022 and with a display format that did not include thousand separators.

and

pPreviousYear

integer parameter defaulted to 2021 and with a display format that did not include thousand separators.

and

pMonth

integer parameter defaulted to 3 and with a display format that did not include thousand separators.

When building the column headings, I referenced all these parameters instead.

To ‘set’ these parameters, I added Previous Year and Latest Year to the Detail shelf of both the Line Chart and Symbol Chart sheets.

I then added 3 dashboard parameter actions to the main dashboard which on select of the Line Chart or Symbol Chart sheet, set the relevant parameter with the value from the appropriate field.

To ensure the drill down gets ‘filtered’ to the month selected on the main dashboard, add a dashboard filter action

Drill down

On select of the Line Chart or the Symbol Chart, target the Drill Down Table sheet on the Drill Down dashboard, passing the selected field of MONTH(Order Date) only. Exclude all values when selection is cleared.

The final step is to add a Navigation Button to the drill down dashboard which displays the text ‘Go back to landing page’ and navigates back to the main dashboard.

And hopefully, with all that, you have a completed interactive navigational dashboard! My published version is here.

Happy vizzin’!

Donna

Can you create a jitterfly chart?

May 15, 2023 Donna Coles1 Comment

This week it’s the Tableau Conference edition – and as I was at #data23, I’m a bit delayed in getting the challenge and solution posted – too much going on and too much sleep to catch up on over the weekend 🙂

It was great to catch up with the #WOW crew in person and to meet so many participants at the social nights out and at the hands on live session too. And thank you for all those who complimented me on this blog – it really does make the time and effort I put into it worthwhile, when I get to hear how much it has helped you all!

Here’s a few pics from the Tableau Conference, and a massive shout out to Chris McClellan, who made me a custom WOW t-shirt!

Ok, enough of the pre-amble, onto the challenge and solution. Lorna set this challenge to recreate what she referred to as a ‘jitterfly’ chart.

Modelling the data

The data needs to be downloaded from data world. This contains multiple files and it was expected to connect to the central_trend_2017_base.xslx file. This file contains multiple sheets. The Population – Females and Population – Males sheets need to be unioned together in the data source canvas.

Drag Population – Females onto the canvas, then drag Population- Males on too, and drop it when the Union option appears beneath the Population – Females

You’ll know you’ve done it right, if you only have a single object in the canvas and the union symbol is displayed

The data displays all years in separate columns – we need to transpose this, so we have a column for Year and a column for the population value.

Click on the first year column (2011), then scroll across to the last year column (2050). Hold down shift and click the last column, and all the columns in between should be highlighted/selected. Right-click on any of the selected columns and select Pivot.

Rename the Pivot Field Names column to Year and rename Pivot Field Values to Population.

To tidy things up a bit, hide the following fields (right click on column and hide) : Sheet, Table Name, Gss Code, Component. Also, change the data type of the Year field to be a number rather than a string. This should leave you with 5 columns

Navigate to Sheet 1 and in the left hand data pane, drag Age from the lower ‘measures’ section to above the line and into the dimensions section. Age isn’t going to be something we aggregate (sum/avg) etc – it is simply a categorical property of the record. This step isn’t critical, but I just like things to be neat :-).

Finally, though it isn’t stated in the requirements, the data relating to District = London needs to be excluded, as it is a summarised total of the other rows. To handle this I added this exclusion as a Data Source Filter, so once applied, I didn’t have to worry about it when I built the chart. Right click on the data source listed on the top left, and select Edit Data Source Filters, then add a condition to exclude the London District.

Creating the calculations

The chart requires a user to select two years to compare – I’ll refer to these as the Primary and Secondary year. We’ll create parameters to enable the selections.

pPrimaryYear

Integer parameter, defaulted to 2033 and displayed in ‘2033’ format (ie no commas). It should allow all other years to be selected (easiest way to get this populated is to right-click on the Year field in the data pane, and Create > Parameter). If you don’t do this, then use the Add values from button to select the Year field to populate the list.

pSecondaryYear

as above, but default to 2023 (the easiest way to create this parameter is to duplicate the first and just amend the name and default value).

With these parameters, we can then create the calculated fields needed to present the relevant values for each year and gender (as hinted by Lorna).

Population – Primary Male

IF [Year] = [pPrimaryYear] AND [Sex] = ‘male’ THEN [Population] END

format this to K with 1 dp

Population – Primary Female

IF [Year] = [pPrimaryYear] AND [Sex] = ‘female’ THEN [Population] END

format this to K with 1 dp

Population – Secondary Male

IF [Year] = [pSecondaryYear] AND [Sex] = ‘male’ THEN [Population] END

format this to K with 1 dp

Population – Secondary Female

IF [Year] = [pSecondaryYear] AND [Sex] = ‘female’ THEN [Population] END

format this to K with 1 dp

popping these into a table as below, you can see the results

The final calculation we need is for the age banding. We can’t use the in-built ‘bins’ function as the final ‘bin’ contains the ‘rest’ and not just 10 values. Also if we used ‘bins’ the ‘labels’ would be based on the data values and not a custom display as we have here.

Age Bracket

IF [Age] <= 10 THEN ‘<=10’
ELSEIF [Age] <=20 THEN ’11-20′
ELSEIF [Age] <=30 THEN ’21-30′
ELSEIF [Age] <=40 THEN ’31-40′
ELSEIF [Age]<=50 THEN ’41-50′
ELSEIF [Age]<=60 THEN ’51-60′
ELSEIF [Age]<=70 THEN ’61-70′
ELSEIF [Age]<=80 THEN ’71-80′
ELSE ’80+’
END

Note – I altered the calculation and label to be <=10 rather than just <10 as is shown in the solution

Building the Viz

On a new sheet, add Age Bracket to Rows, Population Primary Male to Columns and District to Detail. Change the mark type to circle. Manually re-sort the Age Bracket values so <=10 is listed at the top (just drag the value from the bottom to the top).

Drag Population Primary Female onto the canvas and drop it on the Population Primary Male axis when the double green column symbol appears

This will make the values for both male & female display on the same axis, and Measure Names and Measure Values automatically gets added to the viz. But the values are all displaying in the same direction (the positive axis).

To resolve this, double click into the Population Primary Female pill that is in the Measure Values box underneath the marks card and type in *-1 to the end of the pill and press return

The female values will then be displayed in the opposite direction. Adjust the colours of the Measure Names legend to suit.

To make the dots ‘jitter’, that is appear in a random vertical position, we need a measure (green pill) on the Rows so we generate a y-axis.

Now typically when I am creating jitter plots I use the undocumented RANDOM() function which generates a random number between 0 and 1. The function is undocumented, as it only works for some data sources (excel being one). Using RANDOM() was something I mentioned to several attendees of the Live WOW session at Tableau Conference.

However, due to a later requirement, you’ll need to use a different function instead – in this case INDEX(). For clarity I created an explicit calculated field for this

Jitter

INDEX()

INDEX()is a table calculation that creates a unique sequence number from 1 to n for each record in the table partition. In this case the partition is each Age Bracket. Add Jitter to Rows and adjust the table calculation setting so it is computing by District only. Set the fit of the chart to Fit Width to ensure you can see the display better.

Adjust the colour of the marks to 50% opacity.

To handle the detail displayed on the Tooltip we need to create some additional fields for the population values, these ones not split based on gender.

Population – Primary Year

IF [Year] = [pPrimaryYear] THEN [Population] END

format to K with 1 dp

Population – Secondary Year

IF [Year] = [pSecondaryYear] THEN [Population] END

format to K with 1 dp

Then

Population Change

(SUM([Population – Primary Year]) – SUM([Population – Secondary Year]))/SUM([Population – Primary Year])

custom format this as ▲0.0%;▼0.0%;0.0%

Add Sex, Population – Secondary Year, Population – Primary Year and Population Change to the Tooltip and adjust accordingly.

The coloured bands are based on the average for the ‘primary’ year and sex. Add Population Primary Female and Population Primary Male to the Detail shelf. Double click on Population Primary Female and type in *-1 to ensure you get the relevant negative value.

Right click on the ‘value’ x-axis and Add Reference Line.

Change the option to be a reference band per pane, and set a band to go from Population Primary Female *-1 : Average to a Constant of 0. Ensure no labels/tooltips display, and set the fill colour of the band accordingly.

Add another reference band for the which goes from constant 0 to the Average of Population Primary Male. Adjust fill colour to suit.

Add Population Secondary Male and Population Secondary Female to the Detail shelf. Double click into the Population Secondary Female pill and add *-1 to the end.

Right click on the ‘value’ x-axis and Add Reference Line.

Add a reference line that is the Average of the Population Secondary Female * -1 field. Adjust colour and thickness of line to suit (I used the middle thickness and line coloured at 80% transparency).

Repeat the same to add an average reference line for the Population Secondary Male field.

The final step is to add the age banding label into the centre of the viz.

Double click into the Columns shelf and type MIN(0). This will create a secondary axis with a new marks card. Remove all the pills except District from the MIN(0) marks card. Add Age Bracket to the Label shelf. Adjust the label properties as below – to label the Min/Max per pane; at the District field level, and label the maximum value only.

This positions the label in the same place on each age banding. This works because we have used INDEX() to control the jittering which means the maximum value is always the same for each bracket. If we had used RANDOM() to define the jittering, there would be no guarantee the same maximum value would have existed for every banding.

Reduce the opacity to 0% and size of the circle to be as small as possible on the MIN(0) marks card, and then make the chart dual axis and synchronise the axis.

Finally format the chart by

Hide the Age Bracket column (uncheck show header)
Hide the Jitter axis (uncheck show header)
Hide the MIN(0) axis (again uncheck show header)
Format the Value axis so the title is Population, and the scale is in 0K format and positive in both directions (custom format ,##0,”K”;#,##0,”K”) NOTE wrapping the K in “” ensures the display is retained when publishing to Tableau Public – thank you Deborah for the tip!)
Hide the nulls indicator (right click – hide indicator)
Remove all gridlines, zero line, axis ticks etc
Remove column dividers
Set Row dividers to white with the widest thickness
Set the chart to fit entire view

Add the sheet to a dashboard. Use a text object to display the title and sub-tile which should reference the pPrimaryYear and pSecondaryYear parameters. Float the parameter controls and resize to give the appearance of just the drop down option – this will take a bit of tweaking to get just right, and you may need to edit via Tableau Public to get the positioning right.

My published viz is here.

Happy vizzin’!

Donna

Can you create a normalised jitter plot?

February 9, 2023 Donna Coles1 Comment

Continuing the theme of alternative chart types, Kyle decided to challenge us to recreate this jitter plot inspired by an example from The Big Book of Dashboards.

I’ve built jitter plots in the past for #WorkoutWednesday challenges (the hidden RANDOM() function is your friend in this), but I wanted to see how far I could get without having to peak at my previous solutions.

So I connected to the baseball data provided, and cracked on, building the jitter in a single sheet using Measure Names on the columns. But then I got stuck when it came to labelling the tooltips…. surely this didn’t need a sheet per measure did it…

…maybe it did… so I proceeded to recreate as 4 separate sheets, and felt quite smug that I’d managed to make use of the ‘little used’ worksheet caption to provide the summary detail at the bottom of each measure. When I’d finished, I checked Kyle’s solution, as the summary values for the SLG & OPS measures seemed to be mixed up…. and what did I find…. he had managed to build the jitter within a single sheet as he had pivoted the data first! Argggghhhh! It just hadn’t crossed my mind, and Kyle had chosen not to drop that hint in the requirements…. hey ho! c’est la vie! I may well recreate with a pivoted version at a later date, but for now, I am blogging what I did…

My solution has ended up with a lot of calculated fields as a result, as equivalent fields needed to be created for every measure. Most of this was managed via duplicating and editing existing fields, so it actually wasn’t too onerous.

Building the calculated fields

We’ll start as usual by building out the fields required, and will focus on the BA measure initially.

Add Name and BA into a tabular view and Sort descending. Format the BA measure to be a number with 3 decimal places.

We need to know the rank of each player. Create a calculated field

Rank BA

RANK(SUM([BA]))

Add this to the view, and we should get the player rankings displayed from 1 downwards.

Now the requirement wants us to normalise the measures so they can be displayed on the same axis (or in my case, since it’s not a single chart I’m building), within the same axis range.

What this means is we want to plot the measure on a scale between 0 and 1 where 0 represents the lowest measure value, and 1 the highest. for this we need

Min BA

{FIXED :MIN([BA])}

and

Max BA

{FIXED :MAX([BA])}

The normalised value then becomes

Normalise BA

([BA] – [Min BA])/([Max BA]-[Min BA])

The difference between the current value and the lowest value, as a proportion of the range (ie the difference between the highest and lowest values).

Adding this to the table, you should see that the Normalise BA value for the highest ranked player is 1 and that for the lowest ranked is 0.

As part of the information displayed, we also need to know the percentile each player is in.

Percentile BA

RANK_PERCENTILE(SUM([BA]))

Format this to a percentage with 0 dp and add into the table.

Next we need to identify the player selected, so we’re going to create a parameter based off of the Name.

Select a Player

Right click Name -> create -> Parameter. This will open the parameter dialog and auto populate the list of options with the values from the Name field. Default the parameter to Julio Rodriguez.

We can then create a field to identify if the player is the one selected

Is Selected Player?

[Name] = [Select a Player]

Add this into the table, on the Rows before Name and sort so True is listed at the top (just easier to check the results).

So now we need to identify the rank and percentile of the selected player only

Selected Player BA Rank

WINDOW_MAX(IF ATTR([Is Selected Player]) THEN [Rank BA] END)

format this to a number with 0 dp

Selected Player BA Percentile

WINDOW_MAX(IF ATTR([Is Selected Player]) THEN [Percentile BA] END)

format this to a percentage with 0 dp.

The window_max function has the effect of ‘spreading’ the result over all the rows.

Finally we need to get a count of all the players

Count Players

{FIXED:COUNTD([Name])}

format this to a number with 0 dp.

Building the Jitter Plot

To build a jitter plot, we need to plot each mark against 2 axes. The Normalise BA measure is one axis, but we need to create ‘something’ for the other. This is the value to make the ‘jitter’ which is essentially an arbitrary value between 0 and 1 that we can plot the mark against, and means the marks don’t all end up in a single line on top of each other, and we can get a better ‘feel’ for the volume of data being represented.

Jitter

RANDOM()

The random() function is a ‘hidden’ function that will, as it’s name suggests, generate a random number. It is ‘hidden’ as it only works with some data sources. Excel for example is fine, but if you were connected to a Snowflake database, you can’t use it.

The nature of random, also means that you can’t guarantee the value it produces, and it will regenerate on data refresh, so if you’re looking to compare your solution directly, your dots will not be positioned exactly the same.

On a new sheet add Jitter to Columns and Normalise BA to Rows. Add Name to Detail and change the mark type to Circle.

Add Is Selected Player to Colour, adjust accordingly and add a border to the circle. I dropped the opacity to 70%. Order the colour legend, so True is listed first, to ensure this circle is always ‘on top’.

Then add Is Selected Player to Size. Edit the sizes so they are reversed and adjust the sizes until you’re happy.

To label just the selected player mark

Label:BA

IF [Is Selected Player] THEN [BA] END

format this with a custom number font ,##.000;-#,##.000

Add this to the Label shelf and adjust the font colour, and align centrally

Add Rank BA and BA to the Tooltip shelf and adjust tooltip to suit. You will need to adjust the table calculation setting of the Rank BA field so that it is computing by all the fields.

Add Selected Player BA Rank and Selected Player BA Percentile and Count Players to the Detail shelf. Adjust the table calculations as above (including any nested calcs), then show the worksheet caption (Worksheet -> Show Caption), and edit the caption to display the relevant text.

From the analytics pane, drag the Median with Quartiles option onto the canvas and drop it on the table / Normalise BA axis option. Remove the quartile reference lines (right cick axis -> remove reference line), and edit the median reference line to be a dashed line with no label.

Finally remove all gridlines/dividers/axes lines and hide the axes. Title the sheet as per the measure ie BA, and align centrally.

Format the Caption and the Title to have a light grey background and a slightly darker thin border.

Now, repeat all that for the other measures 🙂 This isn’t that bad. All the fields above labelled BA, need duplicating, renaming and updated to reference the next measure eg OBP.

Once done, duplicate the BA jitter plot sheet, and replace all the ‘BA’ related fields with the equivalent ones, by dragging the equivalent field and dropping it directly on top. Sense check the table calculation settings are all ok. You may need to update the text in the caption, as that seems to lose anything to do with the table calculation fields referenced when they get touched.

Ultimately you should end up with 4 sheets.

Putting it all together

On a dashboard, use a horizontal container to position all 4 sheets in side by side. Show the worksheet caption for each sheet. Reduce the outer padding for each sheet to 0, and add a thin border around each sheet.

Add a parameter action to drive the interactivity ‘on click’ of a circle

Select Player

On select of any of the source sheets, update the Select a Player parameter with the value from the Name field. Retain the selected value on ‘unclick’

To prevent the selection on click from being ‘highlighted’, and al the other marks ‘fading’, we need one final step.

Create new calculated fields

True

TRUE

False

FALSE

Add both these fields to the Detail shelf of each of the 4 sheets.

Then add a dashboard filter action for each sheet which on select, goes from the sheet on the dashboard to the worksheet itself, passing the selected fields of True = False. Show all values when unselected.

My published viz is here. Kyle’s solution with a lot less calculated fields, and only 2 sheets (1 for the jitter and 1 for the summary section at the bottom) is here. You will need to pivot the data via the data source pane first through 🙂 Next time, when I really feel something should be able to be done in 1 sheet, I’ll try to think a little longer…. upshot though, I impressed myself at the use of the caption for the summary – something I must consider using more often!

Happy vizzin’!

Donna

Can you build a funnel chart?

February 2, 2023 Donna Coles1 Comment

For my first #WOW2023 challenge, as an ‘official’ part of the #WOW crew, I challenged everyone to build this funnel chart. Inspiration came from a colleague who posed me the challenge of building something similar, ensuring the bars were a single entity when clicked on. In that example we’d just used Superstore data, but it wasn’t really reflective of how a funnel would typically work, so I sourced some dummy sales opportunity data and made the challenge more relevant to a real business scenario.

Building the Funnel Chart

I saw this as a table calculation exercise, though I have subsequently seen solutions that manage it all through LODs. I’ll obviously blog my solution.

Let’s start by getting the core data into a table, so we can see what we’re aiming for

The total value of all the Opportunities is $91.4M. Based on the sales pipeline process and the assumptions documented in the requirements, all of that value will have passed through the first stage – Prospecting.

There is $6.1M still sitting in Prospecting, so when we move onto the next stage, stage 2 -Qualification, the total amount that has passed through will be $91.4M – $6.1M = $85.3M

As there is $6.5M sitting in Qualification, as we move to stage 3, the total amount that has passed through will be $85.3M – $6.5M = $78.7M. A similar pattern repeats for Stage 4, but for Stage 5, Closed Won, the value that has ‘passed’ through that stage, is equivalent to itself.

If we amend the Value field to be a Running Total quick table calculation (and add Value back into the view too), we get

The Running Sum is computing down the table , with the value in each row being the sum of the preceding rows. But this is opposite to what we want. We want Stage 1 to have the biggest value, the total, and the values to get smaller as you look down the table.

To resolve this, edit the table calculation and change the Sort order to custom sort by Minimum Stage No Descending

This results in the running total to sum from the last row (Stage 6) up to the first row (Stage 1), and you can see the value against Stage 1 matches the total for the Value column.

However, this still isn’t quite what we need. While we need to consider the amount for Closed Lost in stages 1-4, we shouldn’t be including it in stage 5, Closed Won.

While excluding the Stage 6 from the view (Stage No on Filter shelf set to Exclude 6) , gives the correct value for Stage 5, we’ve now lost the amount for Closed Lost in the other rows

So, we need some additional calculations to help resolve this.

Amount Lost

{FIXED : SUM(IF [Stage No]=6 THEN [Value] END)}

This just captures the amount of Stage 6 and ‘spreads it across every row of data.

Let’s now get our running total field, baked into it’s own calculation, as this will help with other calcs

Cumulative Value Per Stage

RUNNING_SUM(SUM([Value]))

Replace the Value table calc field with this one, ensuring the table calculation settings are identical to those described above.

Now let’s add the Amount Lost onto the Cumulative Value Per Stage unless we’re at Stage 5, Closed Won

Total Amount Per Stage Inc Lost

IF MIN([Stage No]) = 5 THEN [Cumulative Value Per Stage]
ELSE [Cumulative Value Per Stage] + SUM([Amount Lost]) END

Pop this into the table (checking the table calc settings are as expected) and the total amount now stored against Stage 1, is the total value of all the Opportunities.

Ok, so now we know the values, how do we plot this in such a way that it represents a funnel. I chose to do this by first working out the proportion of the total value that the ‘reverse’ cumulative value represented.

Total Value

{FIXED:SUM([Value])}

meant I could determine

Proportion of Total

[Total Amount Per Stage Inc Lost] / SUM([Total Value])

I formatted this to percentage with 0 dp.

Adding these into the table

If we plotted this information on a bar chart, we’d get this

but we need to ‘centre’ each bar in relation to the first bar to give us the funnel effect. This means we need to shift each ‘bar’ to the right, by half the amount that is the difference between 100% and the Proportion of Total.

Position to Plot

(1 – [Proportion of Total])/2

We can’t represent this with a actual bar chart, but instead we can use a Gannt chart.

Stage No to Rows
Stage No to Filter and exclude Stage 6
Stage to Detail
Position to Plot to Columns, adjusting the table calculation as previously described
Change mark type to Gantt bar
Add Proportion of Total to Size (and verify the table calc is set properly)

Hey presto! A funnel!

To finalise

add Stage to Label and align centrally. Make the font bold and match mark colour.
add Cumulative Value per Stage to Colour and reverse the colour range (via the Edit Colours dialog)
Widen each row a bit.
Fix the Position to Plot axis to start at 0 and end at 1 to ensure the funnel is centred exactly.
Add Value and Total Amount Per Stage Inc Lost to the Tooltip and adjust the tooltip text accordingly (don’t forget to sense check the table calc settings!).
Hide the Position to Plot axis, and the Stage No column. Remove all gridlines, zero lines, axis rulers /row/column banding etc.

Building the KPIs

We need a few calculated fields to store the required numbers

Won

{FIXED:SUM(IF [Stage No]=5 THEN [Value] END)} / [Total Value]

formatted to a percentage with 0 dp.

Lost

{FIXED:SUM(IF [Stage No]=6 THEN [Value] END)} / [Total Value]

formatted to a percentage with 0 dp.

Outstanding

1-([Lost] + [Won])

formatted to a percentage with 0dp.

On a new sheet add Measure Names to Columns and Measure Values to Text. Add Measure Names to Filter and select Total Value, Won, Lost, Outstanding. Manually re-order the columns.

Format Total Value to be a number displayed in millions with 1 decimal place & prefixed with $.

Add Measure Names to Text and adjust the text as required. Align the text to be centred.

Remove the row banding, and hide the column heading.

Then arrange the two sheets on the dashboard, ensuring both are set to fit entire view.

My published viz is here.

Hope you enjoyed this

Happy vizzin’!

Donna

Which months have the higher number of orders?

July 18, 2019July 18, 2019 Donna ColesLeave a comment

For week 29 of #WorkoutWednesday2019, Luke Stanke set the challenge above (described here), which is comparing the overall average of orders placed per day against the average for each month.

On the face of it, this didn’t seem too bad (especially for a challenge set by Luke). Some days when I tackle these challenges, the path I take can be long and arduous with several false starts along the way; the result being more of a ‘happy accident’ than anything of real coherence. Luckily for me, since I’d promised to start blogging on these challenges, this wasn’t one of them.

My usual approach to any of these challenges is to take some time reviewing the published viz on Tableau Public; hovering over the various marks to understand what’s on the tooltips and seeing if I can get any clues into how the various objects (marks, views, legends, titles etc) have been rendered, whether axis are being sneakily used, understanding the interactivity at play etc.

My next step is to then get the figures right based on what’s presented. I typically like to create a ‘data’ sheet in the workbook – a tabular view of the data and associated calculations I have built, so I can easily sense check with the published viz whether my assumptions and computations are valid, and it provides a useful reference point if I’m trying to figure out what I did sometime later.

So I started this challenge the same way.

First up, I assumed the ‘line’ was probably a reference line, and hovering over the viz confirmed this. Everything else seemed pretty straightforward, so onto the figures.

1. Average no of orders per day per segment

This is what the line represents, and do to this I need to find the total number of orders placed per segment and the total number of days on which orders were placed for each segment. The average is then just count orders / count days.

Some people will create everything in a single calculated field, but I like to break things up to help me troubleshoot if things don’t quite work as intended, so I ended up with 3 calculated fields, and since the overall average was required at a level higher than the level of detail being displayed (which is month in this viz), I figured LoD calculations were the way to go.

Count Orders Per Segment

{FIXED [Segment]: COUNTD([Order ID])}

For each segment, count the number of distinct orders that exist.

Count Days Per Segment

{FIXED [Segment]: COUNTD([Order Date])}

For each segment, count the number of distinct days on which an order was placed.

Overall Avg Orders Per Day Per Segment

SUM([Count Orders per Segment]) / SUM([Count Days Per Segment])

Format this to 2dp.

2. Avg no of orders per day per segment per month

This is what each coloured ‘bar’ represents, and for this I needed to find the total number of orders placed per segment per month, and the total number of days in each month on which orders were placed for each segment. I chose to stick with LoDs again for this :

Count Orders Per Segment Per Month

{FIXED [Segment], MONTH([Order Date]): COUNTD([Order ID])}

For each segment and month, count the number of distinct orders that exist.

Count Days Per Segment Per Month

{FIXED [Segment], MONTH([Order Date]): COUNTD([Order Date])}

For each segment and month, count the number of distinct days that an order was placed.

Avg Orders Per Day Per Segment Per Month

SUM([Count Orders Per Segment Per Month]) / SUM([Count Days Per Segment Per Month])

Format this to 2dp.

Putting these fields out in a table, you can see the first three columns contain the same values for each segment even though the data is being displayed at the month level. The final 3 columns are the monthly figures. These numbers all reconcile back to the data displayed on the viz.

So I’m heading in the right direction, now onto the next bit.

The requirement is to “Label the bars with the percent difference between the monthly value and the overall value”, so I need another field…

3. % Difference

…which is basically the difference between column 3 and column 6 above, as a proportion of column 3

([Avg Orders Per Day Per Segment Per Month]-[Overall Avg Orders Per Day Per Segment])/[Overall Avg Orders Per Day Per Segment]

By default though, this shows 0s when added to the table as the numbers calculated are actually 0.38, -0.33 etc, so it needs formatting as a percentage to 0 decimal places. However, applying the standard ‘percentage’ number format won’t quite cut it for this challenge, as Luke has labelled the positive numbers with a + too. To get this, I need to apply custom formatting.

The easiest way to get this right I find, is to use one of the default number formatting options to set the number in whatever ‘main’ format you need, eg if you have a monetary value to display in £k, use the Currency (Custom) to get all the settings right. In this instance though I want Percentage set to 0 dp..

Once done, press ok to close the dialog box, then go to set the number format again, but this time choose Custom. The formatting ‘style’ applied previously will be shown

and it can then be modified to get the desired format, in this instance I change to

It’s often these little formatting tips that get thrown into the #WorkoutWednesday challenges that I love the most, although sometimes they can be tucked away and hard to find (or remember).

So at this point I now think I can start to build the viz, so adding the various pills I need, and adding a reference line I get this…

A line chart isn’t what I want though, and it’s not bars either, as they start from 0. The mark type I need is gantt

which makes me realise I need to create another field…

4. Difference

The gantt chart has the marks in the right position, but to make the ‘bars’ I need to alter the size, and that size is the difference between the mark position and reference line, which is

([Avg Orders Per Day Per Segment Per Month]-[Overall Avg Orders Per Day Per Segment])

Placing this field on size, gives me this but while the bars are the right height, they’re not in the right place.

Using gantt bars in this way is akin to the technique used in building waterfall charts, and is rectified simply by applying a multiple of -1 to the pill on the size shelf (in this instance [Difference]).

As a shortcut, I simply type this into the pill on the size shelf itself, which is a nifty little trick.

And voila! The viz is essentially there now. Just need to add colour and further formatting …

The ‘bars’ are coloured based on whether they’re above or below the line, ie whether the difference is positive or negative, so another field is needed :

5. COLOUR : Difference

IF [Difference]>=0 THEN ‘green’ ELSE ‘blue’ END

And popping this on the colour shelf, and adjusting the colours to suit, gives me

The main thing left now is the little formatting bits and pieces :

Borders : remove columns

Axis : remove title, set to be independent, and set not to start at zero

Reference Line : change the label font size, align left middle, and set the shading to have a white background and 100% opacity

Reference Line TootlTip (new feature in v2019.2): set to Custom as below

Note if I’d simply called my field ‘Average’ rather than ‘Overall Avg Orders Per Day Per Segment’, I wouldn’t have needed this step, though it’s always useful to try out the new ‘little’ features if you can 🙂

Label : add [% Difference] to label shelf, and format centre middle

Tooltip : add relevant fields and adjust formatting as appropriate

The final thing I noticed was the axis scale – my scale was 2dp due to the formatting of my [Avg Orders Per Segment Per Month] field. Luke’s axis was mixed – some scales at 1dp and some with 0.

I tried a few things, like formatting the axis to be Number (Standard) which has the effect of ‘automatically’ showing a number as a decimal or a whole numbers (something I recall from a very early WorkoutWednesday challenge a couple of years back). But this didn’t give me the desired effect. I ended up setting the axis format to be 1 dp. But this then meant the value on the tooltip also ended up displaying as 1dp, when I wanted it to be 2 *sigh*

To fix this I created a duplicate field of the measure being displayed (a copy of [Avg Orders Per Segment Per Month]), and formatted it to 2dp, and placed this on the tooltip instead.

I’m going to have to have a peak at Luke’s solution to figure out what magic he’s done here….

Very final step was to add to a dashboard, and add the title and my own custom footer.

So phew done! My published version is here.

If you’ve got this far, thank you for reading J I can’t guarantee all write ups will be to this level – it’ll partly depend on the challenge itself, and what path I head down to solve it.

I’m now off to have a peak at Luke’s challenge to figure out that pesky axis……. or I would if his workbook was downloadable 😦

Happy vizzin!

Donna

Donna + DataViz

ramblings in all things Tableau (and occasionally some other Stuff….)

Category: WorkoutWednesday

Can you create a Monthly Driver Analysis dashboard?

Can you create a jitterfly chart?

Can you create a normalised jitter plot?

Can you build a funnel chart?

Which months have the higher number of orders?