Data Normalisation Methods

Erica set this fun and incredibly useful challenge this week, based on the TC25 talk by Lorna Brown & Robbin Vernooij, to showcase different methods of normalising data when comparing measures which have drastically different scales.

Building the Raw Values chart

Add Sales to Rows. Then drag Quantity on to the canvas and drop the pill on the Sales axis (when you see the ‘2 column’ icon appear). This has the affect of adding the fields onto a shared axis, and the sheet will update to automatically reference Measure Names and Measure Values. Swap Quantity so it is displayed below Sales in the Measure Values section.

Add Region and Category to Detail and change the Mark type to Circle.

I’m going to incorporate the last requirement at this stage, as it helps with the build, so create parameters

pSelectedRegion

string parameter, defaulted to West

pSelectedCategory

string parameter, defaulted to Furntiture

show both these parameters on the sheet.

Create a new field

Is Selected Region & Category

[pSelectedCategory]=[Category] AND [pSelectedRegion]=[Region]

Add this field to Colour, and swap the values in the legend, so True is listed first. Then change the Region on the Detail shelf, so it is also on colour, by adjusting the icon to the left of the pill. Adjust the colours as required and then reduce the opacity on colour to 80%.

Manually update the entry in the pSelectedRegion parameter to each Region, so the True-<Region> colour combination can be updated to the dark grey.

Add Is Selected Region & Category to Size. Edit the size so they are reversed and the range in size is closer than the default. Once done, then manually adjust the dial on the Size shelf.

Show mark labels, selecting the option to only show the min & max values per cell and aligning middle right

Update the Tooltip. Then create fields True = TRUE and False = FALSE and add both of these to the Detail shelf. We’ll need these to disable the default highlighting later (adding now, as for all the other sheets, we’ll duplicate this one, so makes things easier).

Show the caption (Worksheet menu > show caption) and update the caption to reference the website Erica refers to. Then update the title of the sheet, and name the tab Raw or similar.

Building the Decimal Normalisation chart

Duplicate the Raw sheet, and name Decimal or similar. Update the title.

Create new fields

Sales – Decimal

SUM([Sales]) / 10^6

Quantity – Decimal

SUM([Quantity]) / 10^4

Drag Sales – Decimal onto the canvas and drop directly over the existing Sales pill in the Measure Values section, so it replaces it. Do the same with the Quantity – Decimal pill. Uncheck Show Labels.

Add constant reference line of 0 that displays as a black solid line at 100% opacity

Repeat and create a constant reference line with value of 1. Edit the axis and fix from -0.05 to 1.05 and remove the axis title.

Update the text in the caption.

Building the Max-Min Normalisation Chart

Duplicate the Decimal sheet and rename Max-Min or similar. Update the title.

Create new fields

Sales – Max-Min

(SUM([Sales])- WINDOW_MIN(SUM(Sales))) / (WINDOW_MAX(SUM(Sales)) – WINDOW_MIN(SUM(Sales)))

Quantity – Max-Min

(SUM([Quantity])- WINDOW_MIN(SUM([Quantity]))) / (WINDOW_MAX(SUM([Quantity])) – WINDOW_MIN(SUM([Quantity])))

Drag Sales – Max-Min onto the canvas and drop directly over the existing Sales – Decimal pill in the Measure Values section, so it replaces it. Do the same with the Quantity – Max-Min pill.

Adjust the table calculation setting for each of the measures so they are computing by Category, Region and Is Selected Region & Category.

Adjust the Tooltip if required. Right click on the bottom column headings and Edit Alias to update the text- you may not be able to rename Sales – Max-Min along xyz… just to ‘Sales’, so you may need to be creative and add spaces eg ‘ Sales ‘ or similar. Update the caption.

Building the Z-Score Normalisation Chart

Duplicate the Max-Min sheet, and name Z-Score or similar. Update the title.

Create new fields

Sales – Z-Score

(SUM([Sales]) – WINDOW_AVG(SUM([Sales]))) / WINDOW_STDEV(SUM([Sales]))

Quantity – Z-Score

(SUM([Quantity]) – WINDOW_AVG(SUM([Quantity]))) / WINDOW_STDEV(SUM([Quantity]))

Drag Sales – Z-Score onto the canvas and drop directly over the existing Sales – Max-Min pill in the Measure Values section, so it replaces it. Do the same with the Quantity – Z-Score pill.

Adjust the table calculation setting for each of the measures so they are computing by Category, Region and Is Selected Region & Category. Remove the reference line for the constant value of 1. Edit the axis, so the range is now Automatic rather than fixed.

As before, adjust the Tooltip again if required, edit the column labels using the alias feature, and update the caption.

Creating the dashboard and adding the interactivity

Add all 4 charts onto a dashboard, using a horizontal container to arrange the charts side by side. From the object context menu on the dashboard, select the option to show the caption

To disable the default highlighting ‘on click’ create a dashboard filter action based on the True/False method described here – you’ll need to create an action per sheet.

To set the parameters, create a parameter action

Set Category

On select of all the sheets, set the pSelectedCategory parameter passing in the value from the Category field.

Create another similar action called Set Region which sets the pSelectedRegion parameter with the value from the region field.

Finally, add a text section to the top right of the dashboard that references the pSelectedRegion and pSelectedCategory parameters.

And that should be it. My published via is here.

Happy vizzin’!

Donna

Can you display a ratio and volume on the same chart?

This week’s #WOW2025 challenge was a guest post by Kieran Adair and had two parts to it; one to show Sales and Profit Ratio for 2024 only on the same chart, and the other to show comparisons to 2023 data on the same chart.

I built the first one way, and had to make changes in order to build the second. I could have chosen to rebuild the first chart using the fields I ended up creating for the second, but I chose not to, so this will ultimately be 2 blogs in one, as for each chart I needed different fields.

Restricting the data

We only need data for 2023 and 2024 for this challenge, so apply a data source filter (right click data source > Add data source filter) and restrict to Years 2023 and 2024 only.

Building the 2024 chart

We will be plotting Profit Ratio so need to create

Profit Ratio

SUM([Profit])/SUM([Sales])

format this to % with 0 dp.

Add Sub-Category to Rows and Profit Ratio to Columns. Add Order Date to Filter and restrict to the Year 2024 only. Sort the bars descending |(just click the icon on the toolbar)

Change the Mark type to circle and set the colour to black. Widen the rows a little.

Add Sales to Columns. Change the mark type of the sales marks card to bar. Set the colour to white and add a black border.

Adjust the Tooltip on the All marks card and verify both measures are displayed whether hovering over a circle or a bar.

Edit the Sales axis and set to custom to ensure it is fixed to start at 0 and end ‘automatic’. Adjust the axis title to include the text [Bars◻].

Make the chart dual axis BUT DO NOT synchronise axis. Remove Measure Names from the Colour shelf of the All marks card. Right click on the Sales axis at the to and Move marks to back.

Edit the title of the Profit Ratio axis so that it includes the text [Circles ●]. Remove all gridlines and zero lines and hide the Sub-Category row heading (right click the text and hide field labels for rows). Update the title and name the sheet RAVE 2024 or similar.

Building the YoY chart

For this version, we need to create some additional calculated fields

Sales – 2024

IF YEAR([Order Date]) = 2024 THEN [Sales] END

Sales 2023

IF YEAR([Order Date]) = 2023 THEN [Sales] END

Profit – 2024

IF YEAR([Order Date]) = 2024 THEN [Profit] END

Profit 2023

IF YEAR([Order Date]) = 2023 THEN [Profit] END

PR – 2024

SUM([Profit – 2024])/SUM([Sales – 2024])

format to % with 0dp

PR – 2023

SUM([Profit – 2023])/SUM([Sales – 2023])

format to % with 0dp

On a new sheet, add Sub-Category to Rows and PR – 2024 to Columns. Sort descending. Change the mark type to circle.

Drag PR – 2023 onto the PR -2024 axis and release the mouse when the 2 green column symbol appears.

This will automatically update the viz to include Measure Names and Measure Values. Adjust the Colour legend so PR 2024 is black and PR 2023 is grey, and PR 2024 is listed first (so black circles are on top when the marks overlap)

Add Sales 2024 to Columns. Change the mark type of the Sales 2024 marks card to bar. Remove Measure Names from the Colour shelf of the Sales 2024 marks card. Set the colour to white and add a black border.

Add Sales – 2023 to the Detail shelf of the Sales 2024 marks card. Right click on the Sales 2024 axis and Add Reference Line per cell referencing the Sales 2023 value as below.

Update the title of the Sales – 2024 axis to Sales [Bars◻] and set the axis to be fixed from 0 to automatic. Update the title of the other axis (Value) to Profit Ratio [Circles ●].

On the All marks card add PR 2023, PR 2024, Sales – 2023 and Sales -2024 to the Tooltip shelf and update to suit.

Make the chart Dual Axis but once again DO NOT synchronise the axis. As before move the marks (bars) of the Sales axis to the back, and remove the Sub-Category label. Remove all gridlines and zero lines and update the title.

And that’s it. I added both my sheets to dashboards, juts to improve the presentation slightly. My published viz is here.

Happy vizzin’

Donna

Formatting & Intermediate Tableau Charts

Erica set this challenge primarily aimed at building a beautifully presented dashboard, with the requirement to consider the use of layout containers and padding. She threw in creating some very specific chart types too. The easiest way to blog this, is by chart type.

Building the Histogram

Add Quantity to Columns as continuous dimension (green unaggregated pill) and add Order ID as a measure using the CNT aggregation to Rows. The easiest way to do this is right click and drag Order ID from the left hand date pane and drop onto rows. When you release the mouse, the option to select the aggregation should be available.

Change the mark type to bar and adjust the colour. Edit the title of the y-axis and remove the title from the x-axis. Update the Tooltip.

Double -click into Columns and manually type ‘Quantity in Order’ (including the quotes). Right click on the first text displayed and hide field labels for columns. Adjust the font of the Quantity in Order label that remains.

Remove row and column dividers and column gridlines. Remove Row axis rulers.

Note, when you add to the dashboard , you may find you want to adjust the Size of the bars.

Building the Peas in a Pod chart

On a new sheet, add Category to Filter and select Technology. Add Order Date to Filter and select Years then choose 2022,2023 and 2024.

Rename the Sub-Category field to Sub-Cat and add to Rows. Add Sales to Columns. Change the mark type to circle. Add Order Date to Colour. By default it should display YEAR(Order Date). Adjust colours to suit. Widen each row a bit.

Add another instance of Sales to Columns.

On the Sale (2) marks card change the mark type to line and move YEAR(Order Date) to Path. Increase the size and adjust the colour so it’s a grey lozenge.

Make the chart dual axis and synchronise the axis. Right click the top axis and move marks to back. Adjust the Tooltip. Edit the title of the x-axis.

Hide the top axis. Remove row and column dividers. Remove row gridlines. Remove axis rulers for both columns and rows.

Note, when you add to the dashboard , you may find you want to adjust the Size of the circles and the line. I found it was best adjusted on the web after I published to Tableau Public.

Building the +/- Bar Chart

On a new sheet add Order Date to Filter and select Years then choose 2022,2023 and 2024. Add Order Date to Columns and select to be at the continuous month level (green pill, May 2015 format). Add Sales to Rows and change the mark type to bar.

Add a quick table calculation of Difference to the Sales pill.

Adjust the size of the bars (select manual over fixed and adjust the slider).

Create a new calculated field

Diff is +ve

ZN(SUM([Sales])) – LOOKUP(ZN(SUM([Sales])), -1) > 0

and add to the Colour shelf. Adjust colours to suit. Hide the null indicator. Adjust the Tooltip. Adjust the title of the x-axis.

Remove all gridlines and axis rulers. Remove the columns zero line. Set the rows zero line to be a continuous unbroken line.

Note – once again the size may need further adjusting once on the dashboard and/or after publishing.

Building the slope chart

Add Category to filter and select Office Supplies. Add Region to filter and select West. Add Order Date to filter and select Years then choose 2021 and 2024 only.

Add Order Date to Columns and Sales to Rows. Add Sub-Cat to Detail.

Add Sales to Colour then add a quick table calculation of Percentage Difference. This only sets a value against the 2024 marks though, whereas we want a value for the whole line for each Sub-Cat.

Double-click into the Sales pill on Colour to edit it, and wrap the whole calculation in a WINDOW_MAX() function – the whole calculation should look like

WINDOW_MAX((ZN(SUM([Sales])) – LOOKUP(ZN(SUM([Sales])), -1)) / ABS(LOOKUP(ZN(SUM([Sales])), -1)))

Adjust the colour legend. I set the start & end colours to #ff00ff (hot pink) and #5d6068 (dark grey) and then applied an upper limit to the range and centred at 0 as below.

Hide the Order Date heading at the top of the chart. Adjust the Tooltip.

Remove column gridlines, zero lines and axis rulers.

Create new fields

2021 Sales

IF YEAR([Order Date]) = 2021 THEN [Sales] END

and

2024 Sales

IF YEAR([Order Date]) = 2024 THEN [Sales] END

then create

% Difference

(SUM([2024 Sales]) – SUM([2021 Sales]))/SUM([2021 Sales])

Edit the Sort of the Sub-Cat pill on the Detail shelf, so it is sorting by % Difference ascending. This will ensure the lines are displayed overlapping in the expected manner.

Building the Bar-in-Bar Chart

On a new sheet, add Category to filter and select Furniture. Add Region to filter and select West. Add Order Date to filter and select Years then choose 2023 and 2024 only.

Create a new field

2023 Sales

IF YEAR([Order Date]) = 2023 THEN [Sales] END

Add Sub Cat to Rows and 2023 Sales to Columns. Add a sort to the Sub-Cat pill to sort by 2024 Sales descending. Add 2024 Sales to Columns. Make the chart dual axis and synchronise the axis. Change the mark type on the All marks card to bar. Remove Measure Names from the Colour shelf on the All marks card. Set the colour of the 2023 Sales marks card to light grey. Increase the width of each row, then reduce the size of the bar on the 2024 Sales marks card.

Create a new field

Sales Decreased

SUM([2024 Sales]) < SUM([2023 Sales])

and add to the Colour shelf of the 2024 Sales marks card. Adjust colours to suit.

In the solution, the Tooltip shows an indicator – I’m not sure if this was necessary, but I added it just in case

2024 Sales > 2023 Sales

IF [Sales Decreased] THEN ‘●’ END

Add this to the Tooltip shelf of the All marks card, along with the 2023 Sales and 2024 Sales fields. Adjust the Tooltip accordingly.

Hide the top axis. Remove the title of the x-axis.

Remove row and column dividers. Remove row gridlines and row axis rulers and ticks. Remove all zero lines.

Building the side-by-side bar chart

On a new sheet, add Category to filter and select Technology. Add Region to filter and select West. Add Order Date to filter and select Years then choose 2023 and 2024 only.

Add Sub Cat to Rows and Sales to Columns. Apply a Sort to Sub-Cat based on 2024 Sales descending.

Create a new field

Year

YEAR([Order Date])

And add to Rows and Colour. Adjust colour to suit. Widen each row.

Create new field

Diff is Neg Indicator

IF NOT([Diff is +ve]) THEN ‘●’ ELSE ” END

Add to Rows before Year and then adjust the table calculation setting so it is just computing by Year only.

Adjust the alignment of the Sub-Cat column so it is aligned middle right. Narrow the width of the Diff is Neg Indicator column to try to remove all the column heading text. If some still shows, rename the field so it is padded with some spaces at the front. Adjust the Tooltip.

Remove the x-axis title. Remove Column dividers. Adjust the row dividers so they are at level 1 and are partitioning each Sub Cat only and not splitting the Year column.

Remove all gridlines

Building the dashboard

It’s always hard to walk through the steps for placing objects on a dashboard in the specified places. My general rules are

  1. Start with a floating vertical container that is positioned 0,0 and set to the dashboard height and width. I name this Base.
  2. Then add tiled objects such as a text object for the title, blank objects, other containers, charts etc.
  3. When you add a container, add a blank object initially to help get everything into place. Remove once you have at least 2 objects side by side / on top of each other depending on the direction you’re organising.
  4. The item hierarchy shouldn’t have any containers of type Tiled listed.
  5. Try to name your containers to help maintenance in the future

Below is a picture of the item hierarchy I ended up with using this approach

I created a floating vertical container called Base, positioned 0,0 and 1200 x 850. Background set to None, no border and inner and outer padding all 0.

I added a text object to contain the title. Background set to None and no border. Outer padding set to 10 all round, and inner padding 0.

I added a blank object, which I renamed Horizontal divider. Background set to light grey, no border. Outer padding set to left and right 10 and top and bottom 0. Inner padding all 0. Height set to 2.

I added another Vertical container, which I renamed Body. Background set to None, no border and all inner and outer padding set to 0.

I added 3 horizontal containers on top of each other, and set the property of the Body vertical container to distribute contents evenly so each horizontal container was the same height.

1st horizontal container

I named Row 1 – Level 1. I set the background to the pale green. No border. Outer padding set to left & right 10, top & bottom 5. Inner padding all 0.

Into this I added a text field to describe the levels. Background of this was white, no border and outer padding set to 0 (so the green background disappears). Inner padding was set to top: 20 and 10 for the rest.

Next the Histogram chart. Border set to green. Background white. Outer padding right:5, rest 2. Inner padding set to 10 all round. Width of chart fixed to 380 px.

Next the Level 1 text object. No border, no background. Outer padding 4 all round; inner padding 0. Formatted text object to rotate text. Width of object set to 40 px.

2nd horizontal container

I named Row 2- Level 2. I set the background to the pale blue. No border. Outer padding set to left & right 10, top & bottom 5. Inner padding all 0.

Into this I added a text field to describe the challenge. Background of this was white, no border and outer padding set to 0 (so the blue background disappears). Inner padding was set to 10 all round. Width of object set to 380px.

Next the Peas in a Pod chart. Border set to blue. Background white. Outer padding right:5, rest 2. Inner padding set to 10 all round.

Next the +/- bar chart. Border set to blue. Background white. Outer padding right and left 5, top & bottom 2. Inner padding set to 10 all round. Width of object set to 380px.

Next the Level 2 text object. No border, no background. Outer padding 4 all round; inner padding 0. Formatted text object to rotate text. Width of object set to 40 px.

3rd horizontal container

I named Row 3- Level 3. I set the background to the pale purple. No border. Outer padding set to left & right 10, top & bottom 5. Inner padding all 0.

I added the Slope chart. Border set to purple. Background white. Outer padding right:5, rest 2. Inner padding set to 10 all round. Width of object set to 380px.

Next the bar-in -bar chart. Border set to purple. Background white. Outer padding right & left 5, top & bottom 2. Inner padding set to 10 all round.

Next the side-by-side bar chart. Border set to purple. Background white. Outer padding right and left 5, top & bottom 2. Inner padding set to 10 all round. Width of object set to 380px.

Next the Level 3 text object. No border, no background. Outer padding 4 all round; inner padding 0. Formatted text object to rotate text. Width of object set to 40 px.

It was a bit of trial and error to get the spacing as required, and a few calculations to work out how wide I wanted each chart to be, based on the width of the dashboard and the other items in each row.

Anyway, my published viz is here.

Happy vizzin’!

Donna

Can you compare salaries to the median?

For the first WOW challenge of 2024, I set the task of completing a viz related to HR salary data.

Creating the calculations

To start, we’ll build out all the necessary calculations required, and display them in tabular form. First create

Employee

[Forename] + ‘ ‘ + [Surname]

then add Employee ID and Employee to Rows, along with Department and Job Title. Add Pay Level as a discrete dimension (blue pill) to Rows too, then format Salary to £ with 0 dp, and add to Text.

We want to compare each employee’s salary with the median salary of all the employees in the same pay level. For this we need

Median Salary per Pay Level

{FIXED [Pay Level]: MEDIAN([Salary])}

Add this to the view too. You should see that the value is the same for those rows where the Pay Level is the same.

We need to compute the difference between these numbers (to draw a line from the median to the salary)

Difference From Median

SUM([Salary])- SUM([Median Salary per Pay Level])

and we also need to understand the salary as a proportion of the median

Salary / Median %

SUM([Salary])/SUM([Median Salary per Pay Level])

format this to % with 0 dp and add both these fields to the table

Each employee is given a scale value based on the proportion

Salary Scale

IF [Salary / Median %] > 1.2 THEN 6
ELSEIF [Salary / Median %] >1.1 AND [Salary / Median %] <=1.2 THEN 5

ELSEIF [Salary / Median %] >1 AND [Salary / Median %] <= 1.1 THEN 4

ELSEIF [Salary / Median %] > 0.9 AND [Salary / Median %] <= 1 THEN 3

ELSEIF [Salary / Median %] > 0.8 AND [Salary / Median %] <=0.9 THEN 2
ELSE 1
END

Set this to be discrete, then add to the view on Rows.

Finally we need the markers for the 80%, 90% etc of the median salary

80% Median

0.8 * [Median Salary per Pay Level]

90% Median

0.9 * [Median Salary per Pay Level]

110% Median

1.1 * [Median Salary per Pay Level]

120% Median

1.2 * [Median Salary per Pay Level]

Building the Viz

On a new sheet, add Employee ID, Employee, Department and Job Title to Rows. Add Salary to Columns and set the mark type to Circle. Set the Colour of the circle to suit (or use #64cdcc).

Add Median Salary per Pay Level, 80% Median, 90% Median, 100% Median, 120% Median to the Detail shelf. Add Pay Level, Salary Scale and Salary / Median % to Tooltip.

Format the axis (right click) and set the scale to be at £k with 0dp

This will also change the formatting of the salary on the tooltip, but we want it to be more detailed so create

Tooltip-Salary

[Salary]

and format this to £ with 0 dp. Add this to Tooltip too and adjust tooltip to match.

Add a reference line to the axis (right click axis > add reference line) for the Median Salary per Pay Level. Set it to be per cell and display a dashed line with no labels/tooltips displaying.

Add another reference line. This time set it to be a band at the cell level which ranges from the 80% Median to the 90% Median. Don’t display any labels or tooltips or lines and fill the band with the appropriate colour.

Repeat the above process, adding a reference band from 90% – 110% of the median, and a further band from 110% – 120% of the median.

Add another instance of Salary to Columns. Change the mark type of the 2nd Salary marks card to gantt and adjust the colour. Add Difference From Median to Size and decrease the size as small as possible. Double click into the Difference From Median pill on the size shelf and manually type in * -1 to the end.

Now make the chart dual axis and synchronise the axis. Right click on the top axis and move marks to back.

We want the user to be able to filter the chart, so add Department, Pay Level (select All values) and Salary Scale to the Filter shelf.

Finally, format the chart – make each row a bit wider; set the background colour of the worksheet to #f9f8f7; remove column dividers; set row dividers to a thick white line; remove all gridlines, zero lines and axis rulers. Reduce the size of the axis values and increase the width of the initial 3 columns so the text doesn’t wrap as much. Test your filters.

If all working, then add the sheet to a dashboard.

Building the Legend

To build the legend, I simply duplicated the core sheet, then filtered to a specific Employee ID and hid the Employee ID, Employee, Department and Job Title fields. I then edited the reference lines to add custom labels to label each line band, formatting the text to display to the top or bottom. I added an annotation to the Salary circle mark to label that as ‘Employee Salary’ which I then manually moved into position.

When I added this to the dashboard, I then floated a blank object over the top so the legend could not be interacted with.

My published viz is here.

Happy vizzin’!

Donna

Can you build an UpSet Plot with just one calculated field?

For this week’s #WOW2023 challenge , guest poster, Venkatesh Iyer asked us to create an UpSet Plot, with the added requirement of using just 1 calculated field.

To start with, I had to read up on what an UpSet plot was and looked through the blog post by Chris Love that was referenced in the challenge introduction. While this post gave me more clarity, it introduced more calculations than I was hoping for, so I started looking a bit wider for a bit of help. This YouTube video set me on my way.

Setting up the data

The requirements stated to limit to the Category of Furniture only, so after connecting to the Superstore data source, I added a Data Source Filter (right click data source > Edit Data Source filters) to restrict the information throughout the workbook just to the Furniture Category.

Doing this means I don’t have to keep adding the Category to the Filter shelf, and any FIXED LoDs I create will only be based on the subset of data that has been ‘pre-filtered’.

There are multiple charts in this challenge, and I used 5 sheets in total. Let’s start with the easy ones.

Building the Customer List

On a new sheet add Customer Name to Rows and Sales to Text. Format Sales to $ with 0dp, and widen each row. Remove row/column dividers and remove the ‘Customer Name’ column heading (right click and hide field labels for rows). Uncheck the Show Tooltips option from the Tooltip shelf. Name the sheet Customer List.

Building the Sub-Category Bar Chart

Add Sub-Category to Rows. Add Customer ID to Columns, then use the context menu to change the field to use the Count (Distinct) measure.

Note – I would typically create a specific calculated field containing the function COUNTD([Customer ID]), but as we only want a single calculated field in the solution, then this is the method to adopt.

Sort the resulting bar chart descending, and add Sub-Category to the Colour shelf and adjust to suit.

Widen each row and then click on the Label shelf and check the Show mark labels tick box. Align the labels middle left and format the font. Hide the axis and the Sub-Category column heading (uncheck show header). Remove all gridlines, row/column dividers/ zero lines etc. Uncheck the Show Tooltips option from the Tooltip shelf. Name the sheet Sub Cat Bar.

Identifying the groupings

Ok, now we’re at the point we need to identify the different ‘cohorts’ of customers based on what Sub-Categories they have purchased. Let’s build out a tabular ‘check’ sheet so we can see what we’re up to…

On a new sheet add Customer Name and Sub-Category to Rows. This simple table shows us that Aaron Bergman has at some point only ever bought Bookcases & Chairs, while Aaron Hawkins has purchased Chairs & Furnishings. These 2 customers are in different cohorts as they haven’t bought exactly the same combination of Sub-Categories. There are 15 different combinations in total.

Based on what I observed in the video, I can create a FIXED LoD calculated field to identify if a customer has bought Bookcases.

BC

{FIXED [Customer ID]: MAX(IIF([Sub-Category] = ‘Bookcases’,1,0))}

Pop this into the view, and we can see that there is a 1 reported against all the rows associated to each customer who bought Bookcases. So Aaron Bergman has a 1 against both rows, and Aaron Hawkins has 0 against both rows.

Creating similar calculated fields, specific for each of the 4 different Sub-Category values, and putting them into the table, we can see we have various combinations of 1s and 0s for each customer. Adam Shillingburg has bought all 4 types, so has 1’s across the board, while Adrian Sharni has only bought Furnishings, so has 3 0’s and a single 1.

Based on our understanding of what these fields are doing, we can combine what each one is doing into a single calculated string field.

Combo

STR({FIXED [Customer ID]: MAX(IIF([Sub-Category] = ‘Furnishings’,1,0))}) + ‘-‘ +
STR({FIXED [Customer ID]: MAX(IIF([Sub-Category] = ‘Chairs’,1,0))}) + ‘-‘ +
STR({FIXED [Customer ID]: MAX(IIF([Sub-Category] = ‘Tables’,1,0))}) + ‘-‘ +
STR({FIXED [Customer ID]: MAX(IIF([Sub-Category] = ‘Bookcases’,1,0))})

Note the order is based on the order of Sub-Category bar chart. Add this into the view on Rows (rejig the order of the measure values to match).

So with this one calculated field Combo, we now have a dimension we can use to count the customers against. The calculated fields I used to demonstrate the concept are now superfluous and can be deleted if you wish, if you remove the check sheet too. I chose to keep mine in for reference.

Building the Combo bar chart

ON a new sheet, add Combo to Columns and then add Customer ID to Rows, but as before, set it to use the COUNTD aggregation.

Sort the bar to be ascending. Check the Show mark labels option on the Label shelf and adjust the alignment to be bottom middle, and the font to be bold. Change the colour of the bar to suit. Hide the axis and the Combo heading (uncheck show header). Remove all gridlines, row/column dividers/ zero lines etc. Uncheck the Show Tooltips option from the Tooltip shelf. Name the sheet Combo Bar.

Building the dot plot

Add Sub-Category to Rows and Combo to Columns. Manually sort the Sub-Category rows so they are listed with Furnishings at the top then Chairs > Tables and Bookcases at the bottom. Sort the Combo field by COUNTD of Customer ID ascending

Double click into Columns and manually type MIN(0) to generate an axis. Change the mark type to circle. Add Sub-Category to Colour. Widen each row.

Add Sub-Category to Label. Then double click into the pill on the label shelf and manually change to add the LEFT function around the pill, so the pill becomes LEFT([Sub-Category],1) to get the initial. Again, this is typically something I would explicitly store in its own calculated field. Manually re-sort the rows again, as this seems to break the sorting.

Align the label middle centre and bold the font. Then add another instance of MIN(0) on Columns to create a 2nd marks card. Change this mark type to Line. Remove the field from the Text shelf, and move the Sub-Category pill from Colour to Path.

Make the chart dual axis and synchronise axis. Right click on the MIN(0) axis title at the top of the chart and move marks to back.

Then hide the axis and the Combo column and Sub-category row (uncheck show header). Remove all gridlines, row/column dividers/ zero lines etc. Adjust the size of the marks as required. Uncheck the Show Tooltips option from the Tooltip shelf. Name the sheet Dot Plot. .

Building the Legend

The quickest way to do this is to duplicate the dot plot sheet. Then remove Combo from the Columns. mark type of the line from line to circle and decrease the size to the smallest possible. Move Sub-Category from Detail to Label and align middle right.

Edit the label and add some spaces in front of the text to push the labels further to the right. Format the font to suit.

Adjust the MIN(0) fields to both be MIN(0.2) instead (just double click into the fields to edit). Then edit the axis to be fixed from 0 to 1. This forces the display to the left.

Hide the axis, and name the sheet Legend.

Adding the interactivity

Arrange the items on a dashboard, using layout containers and padding to organise the 4 main charts and ensure the dot plot aligns both vertically and horizontally with the other two bar charts.. The legend chart is a floating object.

Add a filter dashboard action to filter the customer list

Filter Customers

On select of the Combo Bar or Sub Cat Bar sheets, target the Customer List sheet, showing all values when the selections are cleared.

And that should be it! My published viz is here.

Happy vizzin’!

Donna

Can you build a dumbbell chart?

Week 32 of #WOW2022 and Kyle set this challenge to build a dumbbell chart (also sometimes referred to as a barbell chart, a gap chart, a connected dot plot, a DNA chart). I first built one of these a long time ago, and have built many since, so chose not to refer to Ryan Sleeper’s blog Kyle so generously provided. I figured I’d have a go myself from memory. Looking back now, I actually started by going down the same route a Ryan, but then switched my approach in order to handle the tooltip requirements.

So here’s what I did.

Defining the calculations required

I’m going to start with these, as it’s then easier just to explain how everything then gets built. In reality, some of these calculations came along mid-build.

I decided to create explicit fields to store the Ownership values for the 2015 & 2019 years ie

Ownership 2015

IF [Year]=2015 THEN [Ownership] END

Ownership 2019

IF [Year]=2019 THEN [Ownership] END

Both these fields I formatted to % with 0dp.

Given these, I can then work out the difference

Difference

SUM([Ownership 2019])-SUM([Ownership 2015])

also formatted to % with 0dp

and finally, I can work out whether the change exceeds 20% or not

Difference>0.2

[Difference]>0.2

These are all the fields required 🙂

Building the Dot Plot

Add Age to Columns as a discrete dimension (blue pill) and Ownership 2015 to Rows. This will create a bar chart.

Then drag Ownership 2019 onto the canvas, and release your mouse at the point it is over the Ownership 2015 axis and the 2 green columns symbol appears.

This will have the effect of adding Measure Names and Measure Values to the view.

Change the mark type to Circle and move Measure Names from Columns onto Colour. Adjust colours accordingly.

Show mark labels and adjust so they are formatted middle centre, and are bold. Increase the size of the circles so the label text is completely within the circle, and the font colour should automatically adjust to white on the darker circles and black on the lighter ones.

Add Ownership 2015, Ownership 2019 and Difference onto the Tooltip shelf, and adjust the tooltip to suit. Note – I chose to adjust the wording of the tooltip to not assume there was always an increase. I used the word ‘change’ instead and as a result applied the custom formatting of ▲0%;▼0% to the Difference field that was on the Tooltip shelf.

So my tooltip was

which generated a tooltip that looked like

Connecting the dots

Add another instance of Measure Values to Rows alongside the existing one. This will generate a 2nd Measure Values marks card.

On that marks card, remove Measure Names from Colour and add Difference>0.2 to the Colour shelf instead, and adjust the colours.

Change the mark type to Line and add Measure Names to the Path shelf.

Move the Difference pill from the Tooltip shelf to the Label shelf. Edit the label controls, so the label only displays at the start of the line, which should make the label only show at the bottom. Adjust the label alignment so it is bottom centre.

Delete all the text from the Tooltip of the ‘line’ marks card.

Now make the chart dual axis and synchronise the axis. You’ll notice the bars sit on top of the circles, and the difference text is not displaying under the circles.

To remedy this, first, right click on the right hand axis, and select move marks to back. The lines should now be behind the circles.

Then, on the lines marks card, edit the Label text and just add a single carriage return in the dialog box, so the text is shifted down.

Adding Age to be displayed at the top

The quickest way to do this is to double click in the space on the Columns shelf, next to the Age pill, and type in ‘Dummy’ or just ” if you prefer. This has the affect of adding another ‘dimension’ into the view and as a result the Age values immediately get moved to the top.

Now just remove all axis, all gridlines, all row & column dividers. Then uncheck show header against the Dummy pill and hide field labels for columns for the Age header (right click on the Age label in the view itself to get this option, not the pill). Then format the Age text to be a bit bigger and bolder, and also increase the Size of the lines mark type. And that’s your viz 🙂

Additional Note

If you had followed Ryan’s blog post, then you would still have needed to create separate calculated fields to store the ownership values for 2015 and 2019, but these would have had to been LOD fields eg

Ownership 2015

{FIXED [Age]:SUM(IF [Year]=2015 THEN [Ownership] END)}

and a similar one for Ownership 2019. The other 2 fields related to the difference would then need to reference these instead.

My published viz is here.

Happy vizzin’!

Donna

Can you highlight profits with Measure Names?

Luke Stanke provided his last challenge of #WOW2020 for the year with this viz to utilise Measure Names & Measure Values. This plots Cost against Sales per Sub-Category (the circles), which are then bounded by a lozenge. The Sales circle and the lozenge are coloured based on profitability, and the lozenges are labelled by Profit, although the position of the label is also based on profitability.

Obviously as the title of the challenge suggests, using Measure Names & Measure Values is key to this. As usual I’ll try to just pull out the main points I think you might find most tricky.

  • Building the core chart
  • Aligning the Sub-Categories against a horizontal line
  • Colouring the circles & lozenge
  • Positioning the labels
  • Formatting the labels
  • Creating a border around the white Cost circle

Building the core chart

You’ll need to create a calculated field to store the cost

Cost

SUM([Sales]) – SUM([Profit])

Once done, then add Sub-Category to Rows and sort by descending by Profit. Add Measure Values to Columns and filter just to Sales and Cost. Change the mark type to circle.

Add a 2nd instance of Measure Values to Columns and change the mark type to Line. Add Measure Names to Path.

Then make this chart dual axis and synchronise the axis. This will give the base chart.

Aligning the Sub-Categories against a horizontal line

Whenever you add a blue discrete pill to the rows or columns, you’ll get a ‘header’ pane, which means the label for the pill is by default positioned in the centre of a row, as demonstrated below if I add row dividers to the chart

The requirement is to show the row label aligned with a horizontal line. To achieve this, we need to add a MIN(0) field the Rows. Adding this green pill, creates an axis which has a 0 horizontal 0 line, which is positioned exactly in alignment with the row label

Removing the row dividers, and uncheck Show Header on the MIN(0) pill, and the right aligning the row label, we get the desired result.

Colouring the circles & lozenge

The circles are either coloured white for Cost or dark red or green for Sales depending on the profitability of the Sub-Category.

So we need a new calculated field

Is Profitable?

SUM([Profit])>0

This returns true or false.

On the circles Measure Names mark card, add Is Profitable to the Colour shelf. If it’s not already there, then add Measure Names to the Detail shelf, then select the detail icon and choose the colour icon to also add Measure Names to the Colour shelf

This has the effect of giving you 4 colour combinations

which can then be edited to set the 2 combinations relating to Cost to white, Sales & True to dark green and Sales & False to dark red (at this point, you probably want to be setting the background colour of the whole worksheet to light grey (right click on sheet, format, select the Shading icon)

For the lozenge, you just need to add the Is Profitable field the the Colour shelf of the line Measure Names card, and adjust the size of the to make it slightly bigger than the circles (you may need to move marks to back to get the lozenge to sit behind the circles).

Positioning the labels

So the tricky thing about the labels is where they are positioned – they are to the right for the profitable sub-categories, and to the left for the non-profitable sub-categories. When you add a label, you can’t provide that sort of conditional logic to control the positioning. However, we happen to have 2 sets of marks cards (as dual axis), and so can use different fields to apply the labelling.

Label Profit is +ve

IF [Is Profitable] THEN SUM(Profit) END

This will only return a value for profitable sub-categories, otherwise the field will store blank.

Label Profit is -ve

IF NOT([Is Profitable]) THEN SUM(Profit) END

Conversely, this will only return a value for the non-profitable sub-categories.

Add the Label Profit is +ve field to the Label shelf of the Circle marks card. Adjust the positioning to be middle right, and adjust the settings as below (tbh this was just a bit of trial and error to find this worked :-)). You may want to additionally add a few <spaces> to the start of the label text to add a bit of breathing room.

Then on the Line marks card, add the Label Profit is -ve field to the Label shelf, but this time choose to Label Line Ends, and select Label end of line. Note – you can try to apply the same settings as above, but I found for some reason it increased the size of the mark around the line end… weird…

Formatting the labels

Right click on the Label Profit is +ve/-ve fields created and custom format with

+”$”#,##0;-“$”#,##0

Creating a border around the white Cost circle

Finally, a sneaky requirement was to add a border around the white circle that represented Cost only. This implied that using the formatting options on the Colour shelf that allows a border to be added to the mark wasn’t going to work, as this will apply to all marks and not just to the Cost circle.

I handled this by creating a 2nd instance of the Cost field (right click Cost and select duplicate).

I then added this Cost (copy) field to the Measure Values section of the chart on the left hand side. This doesn’t appear to do much, since the mark is directly on top of an existing mark, but you will see the Colour legend associated to the Circle marks card change and add 2 new entries.

I changed the colours of the 2 new entries in the Colour legend to a dark grey, then added Measure Names to the Size shelf of the Circles marks card. I then adjusted the range and size of the circles so that the Cost measure is slightly smaller than the Cost (copy) measure. Ultimately all 3 circles are slightly different sizes, but the range is so narrow, it’s hard to tell.

And that’s it. A lovely looking viz with a fair bit going on under the bonnet, but nothing too complex! My published version is here.

Happy vizzin’! Stay Safe!

Donna