Can you build an UpSet Plot with just one calculated field?

For this week’s #WOW2023 challenge , guest poster, Venkatesh Iyer asked us to create an UpSet Plot, with the added requirement of using just 1 calculated field.

To start with, I had to read up on what an UpSet plot was and looked through the blog post by Chris Love that was referenced in the challenge introduction. While this post gave me more clarity, it introduced more calculations than I was hoping for, so I started looking a bit wider for a bit of help. This YouTube video set me on my way.

Setting up the data

The requirements stated to limit to the Category of Furniture only, so after connecting to the Superstore data source, I added a Data Source Filter (right click data source > Edit Data Source filters) to restrict the information throughout the workbook just to the Furniture Category.

Doing this means I don’t have to keep adding the Category to the Filter shelf, and any FIXED LoDs I create will only be based on the subset of data that has been ‘pre-filtered’.

There are multiple charts in this challenge, and I used 5 sheets in total. Let’s start with the easy ones.

Building the Customer List

On a new sheet add Customer Name to Rows and Sales to Text. Format Sales to $ with 0dp, and widen each row. Remove row/column dividers and remove the ‘Customer Name’ column heading (right click and hide field labels for rows). Uncheck the Show Tooltips option from the Tooltip shelf. Name the sheet Customer List.

Building the Sub-Category Bar Chart

Add Sub-Category to Rows. Add Customer ID to Columns, then use the context menu to change the field to use the Count (Distinct) measure.

Note – I would typically create a specific calculated field containing the function COUNTD([Customer ID]), but as we only want a single calculated field in the solution, then this is the method to adopt.

Sort the resulting bar chart descending, and add Sub-Category to the Colour shelf and adjust to suit.

Widen each row and then click on the Label shelf and check the Show mark labels tick box. Align the labels middle left and format the font. Hide the axis and the Sub-Category column heading (uncheck show header). Remove all gridlines, row/column dividers/ zero lines etc. Uncheck the Show Tooltips option from the Tooltip shelf. Name the sheet Sub Cat Bar.

Identifying the groupings

Ok, now we’re at the point we need to identify the different ‘cohorts’ of customers based on what Sub-Categories they have purchased. Let’s build out a tabular ‘check’ sheet so we can see what we’re up to…

On a new sheet add Customer Name and Sub-Category to Rows. This simple table shows us that Aaron Bergman has at some point only ever bought Bookcases & Chairs, while Aaron Hawkins has purchased Chairs & Furnishings. These 2 customers are in different cohorts as they haven’t bought exactly the same combination of Sub-Categories. There are 15 different combinations in total.

Based on what I observed in the video, I can create a FIXED LoD calculated field to identify if a customer has bought Bookcases.

BC

{FIXED [Customer ID]: MAX(IIF([Sub-Category] = ‘Bookcases’,1,0))}

Pop this into the view, and we can see that there is a 1 reported against all the rows associated to each customer who bought Bookcases. So Aaron Bergman has a 1 against both rows, and Aaron Hawkins has 0 against both rows.

Creating similar calculated fields, specific for each of the 4 different Sub-Category values, and putting them into the table, we can see we have various combinations of 1s and 0s for each customer. Adam Shillingburg has bought all 4 types, so has 1’s across the board, while Adrian Sharni has only bought Furnishings, so has 3 0’s and a single 1.

Based on our understanding of what these fields are doing, we can combine what each one is doing into a single calculated string field.

Combo

STR({FIXED [Customer ID]: MAX(IIF([Sub-Category] = ‘Furnishings’,1,0))}) + ‘-‘ +
STR({FIXED [Customer ID]: MAX(IIF([Sub-Category] = ‘Chairs’,1,0))}) + ‘-‘ +
STR({FIXED [Customer ID]: MAX(IIF([Sub-Category] = ‘Tables’,1,0))}) + ‘-‘ +
STR({FIXED [Customer ID]: MAX(IIF([Sub-Category] = ‘Bookcases’,1,0))})

Note the order is based on the order of Sub-Category bar chart. Add this into the view on Rows (rejig the order of the measure values to match).

So with this one calculated field Combo, we now have a dimension we can use to count the customers against. The calculated fields I used to demonstrate the concept are now superfluous and can be deleted if you wish, if you remove the check sheet too. I chose to keep mine in for reference.

Building the Combo bar chart

ON a new sheet, add Combo to Columns and then add Customer ID to Rows, but as before, set it to use the COUNTD aggregation.

Sort the bar to be ascending. Check the Show mark labels option on the Label shelf and adjust the alignment to be bottom middle, and the font to be bold. Change the colour of the bar to suit. Hide the axis and the Combo heading (uncheck show header). Remove all gridlines, row/column dividers/ zero lines etc. Uncheck the Show Tooltips option from the Tooltip shelf. Name the sheet Combo Bar.

Building the dot plot

Add Sub-Category to Rows and Combo to Columns. Manually sort the Sub-Category rows so they are listed with Furnishings at the top then Chairs > Tables and Bookcases at the bottom. Sort the Combo field by COUNTD of Customer ID ascending

Double click into Columns and manually type MIN(0) to generate an axis. Change the mark type to circle. Add Sub-Category to Colour. Widen each row.

Add Sub-Category to Label. Then double click into the pill on the label shelf and manually change to add the LEFT function around the pill, so the pill becomes LEFT([Sub-Category],1) to get the initial. Again, this is typically something I would explicitly store in its own calculated field. Manually re-sort the rows again, as this seems to break the sorting.

Align the label middle centre and bold the font. Then add another instance of MIN(0) on Columns to create a 2nd marks card. Change this mark type to Line. Remove the field from the Text shelf, and move the Sub-Category pill from Colour to Path.

Make the chart dual axis and synchronise axis. Right click on the MIN(0) axis title at the top of the chart and move marks to back.

Then hide the axis and the Combo column and Sub-category row (uncheck show header). Remove all gridlines, row/column dividers/ zero lines etc. Adjust the size of the marks as required. Uncheck the Show Tooltips option from the Tooltip shelf. Name the sheet Dot Plot. .

Building the Legend

The quickest way to do this is to duplicate the dot plot sheet. Then remove Combo from the Columns. mark type of the line from line to circle and decrease the size to the smallest possible. Move Sub-Category from Detail to Label and align middle right.

Edit the label and add some spaces in front of the text to push the labels further to the right. Format the font to suit.

Adjust the MIN(0) fields to both be MIN(0.2) instead (just double click into the fields to edit). Then edit the axis to be fixed from 0 to 1. This forces the display to the left.

Hide the axis, and name the sheet Legend.

Adding the interactivity

Arrange the items on a dashboard, using layout containers and padding to organise the 4 main charts and ensure the dot plot aligns both vertically and horizontally with the other two bar charts.. The legend chart is a floating object.

Add a filter dashboard action to filter the customer list

Filter Customers

On select of the Combo Bar or Sub Cat Bar sheets, target the Customer List sheet, showing all values when the selections are cleared.

And that should be it! My published viz is here.

Happy vizzin’!

Donna

Leave a comment