Can you create a scatter plot investigation?

This week’s #WOW2024 challenge was set by a guest poster, Robbin Vernooij, who wanted us to build a scatterplot with additional features to aid analysis. The main focus was on using Set Actions, so that’s what I used throughout the challenge, although parameters could be also be used.

Modelling the data

I took the simpler route when combining the data sources. After connecting to the Life Expectancy (lex.csv) data source, I deleted all the columns relating to the years except 2022 (Ctl Click to multi select the columns, and then right click and ‘hide’) . I then renamed the column from 2022 to Life Expectancy. The data source just contained 2 fields Country and Life Expectancy.

I then added the Co2 Pcap Cons.csv data source and related it via the Country field. Again I removed all the unnecessary year fields except the 2022 column, and renamed this to  Co2 Pcap Cons.

Building the Scatter Plot

On a new sheet, add Co2 Pcap Cons to  Columns and Life Expectancy to Rows. Add Country to Detail.

Hide the null indicator.

We need to identify a ‘selected’ country. We could use a parameter for this, but as mentioned above, I’ll use a set.

Selected Country Set

Right click on Country > Create  > Set.  Select a single country from the list (I chose Russia).

From this we need to determine the Life Expectancy and Co2 Pcap Con values for the selected country, but this value needs to be associated to every Country in the data set (ie every row of data), so we can use a FIXED LoD.

Selected Country Co2

{FIXED:SUM(IF [Selected Country Set] THEN [Co2 Pcap Cons] END)}

Selected Country Life Expectancy

{FIXED:SUM(IF [Selected Country Set] THEN [Life Expectancy] END)}

With these, we then want to define a min and max range for each measure so we can build the reference bands. The tolerance for this range wasn’t mentioned in the requirements, so I checked the solution to ensure I could validate other calculations later on.

Min Co2

[Selected Country Co2] – 1

Max Co2

[Selected Country Co2] + 1

Min Life Expectancy

[Selected Country Life Expectancy] – 4

Max Life Expectancy

[Selected Country Life Expectancy] + 4

Add all four fields to the Detail shelf.

Add a reference band to the Co2 Pcap Cons axis (right click axis > add reference line). Select band and set it to be from the Min Co2 field to the Mx Co2 field.

Repeat the steps for the Life Expectancy axis, to run from the Min Life Expectancy field to the Max Life Expectancy field.

In the example above, I have Russia as the selected country. We now want to identify all the countries that are falling within the bands.

Within Co2 Band

[Co2 Pcap Cons] <= [Max Co2] AND [Co2 Pcap Cons]>=[Min Co2]

Within Life Expectancy Band

[Life Expectancy]<= [Max Life Expectancy] AND [Life Expectancy] >= [Min Life Expectancy]

And with this, we create another set

Within Band Set

Select the Condition tab, and enter the formula

MIN([Within Co2 Band]) OR MIN([Within Life Expectancy Band])

Add Selected Country Set to Colour, to Size and to Shape. Adjust shape and size to suit. Then add Within Band Set to Detail and then adjust the icon to the left of the pill to the Colour icon, so 2 pills are now on the Colour shelf. Adjust the colours to suit.

Then create

Label – Country

IF [Selected Country Set] THEN [Country] END

And add to the Label shelf. Align bottom centre, and allow labels to overlap other marks.

Hide the Tooltip, Hide all the gridlines and row/column dividers. Format Co2 Pcap Cons and Life Expectancy to 1 dp. Name sheet Scatter or similar.

Building the Average Bar

On a new sheet, add Selected Country Set to Rows. Add Co2 Pcap Cons  and Life Expectancy to Columns and change the aggregation of both from SUM to AVG. Manually reorder the In/Out header so Out is listed first. Show the labels. Add Selected Country Set to Colour on the All marks card and adjust accordingly.

Double click into Columns and type MIN(0.0), then move the pill so it’s the first one listed. Change the mark type of the MIN(0.0) marks card to shape. Add Selected Country Set to shape and adjust.

Create a new field

Header Label

IF [Selected Country Set] THEN [Country] ELSE ‘All others’ END

Add this to the Label shelf of the MIN(0.0) marks card. Align the label middle left.

Edit the MIN(0.0) axis to be fixed from -5 to 1 to shift the display to the right

Then remove the axis title, and set the tick marks to None so the axis for this section is hidden

Add Header Label to the Tooltip on the All marks card, and update the tooltip. Remove all gridlines, row & column dividers and hide the Selected Country Set pill on Rows (uncheck show header). Name the sheet Avg Bar or similar.

Building the Count Bar

On a new sheet, add Within Band Set to Columns and lex.csv(Count) to Rows. Add Within Band Set to Colour and Country to Detail. Adjust Colour and tooltip. Name the sheet Count Bar or similar.

Adding the interactivity

Add the sheets to a dashboard and arrange accordingly, Add a dashboard set action

Select Country

On hover of the Scatter chart, target the Selected Country Set. Only allow single selection. Assign values to the set on hover, and retain the values in the set when the selection is cleared.

And hopefully that should be it. My published viz is here.

Happy vizzin’!

Donna

Can you find how many quarters it took for a customer to order again?

It was Lorna’s turn to set the challenge this week, and she opted for a gentle workout to soothe us into the Christmas holidays – I certainly appreciated it as I am super busy and super stressed trying to finalise Christmas presents and preparations!

I built this with 4 calculations in total. Let’s start by just building out a tabular view of the data, so we know what we’re looking for.

Add Customer ID and Order Date set to be a discrete (blue) exact date onto Rows.

To start with, we want to capture, against each customer, the earliest order date

First Purchase Date

{FIXED [Customer ID]: MIN([Order Date])}

Add this into Rows as a discrete exact date.

You can see that for every row associated to the same customer, the First Purchase Date matches the first row.

Now I want to identify the row that represents the first order for each customer

Order is first order

[Order Date] = [First Purchase Date]

Add this onto Rows and we can see the first row against each customer is True, while the rest are False.

I can now use this to identify the date of the 2nd order for each customer – it is the earliest date where the order isn’t the first order

Second Purchase Date

{FIXED [Customer ID]: MIN(IF NOT [Order is first order] THEN [Order Date] END)}

Add this onto Rows as a discrete exact date, and you can see that every row associated to the customer, now has the date that matches the 2nd Order Date listed.

Remove Order Date and Order is first order from the Rows and we have 1 row per customer with the 2 dates we’re interested in.

From this we can work out

Quarters Since First Purchase

DATEDIFF(‘quarter’, [First Purchase Date], [Second Purchase Date])

Add this onto Text, so you can validate the result.

Now we can build out the matrix.

On a new sheet add First Purchase Date to Rows and set to be at the discrete (blue) quarter level. Then add Quarters Since First Purchase to Columns and change to be a discrete Dimension.

This already gives us the shape we’re after. Right click on the Null value in the column heading and select Edit Alias and update the word to Lapsed.

Add Customer ID to the Colour shelf, and the click on the pill and change to be a Measure of Count Distinct. This will change how the viz is displayed. Adjust the colours accordingly.

Finally update the Tooltip to match the required text, and adjust the formatting so there are pale grey thin dotted row and column dividers on the pane only, but not the header.

Add the viz to a dashboard, and you’re done! My published viz is here.

Happy vizzin’!

Donna

Can you make a pie chart?

In his final challenge for 2022, Luke set this challenge asking us to recreate this pie chart. Although not mentioned in the challenge text, there was a hint on the splash page that map layers would be required.

I’ve only really used map layers in other #WOW challenges, and they actually involve maps. This challenge was obviously a bit different – utilising a functionality built for one purpose in an entirely different way. I remembered when map layers were first released there was a big buzz about the potential possibilities, and had seen some examples, but I’d never gotten round to trying out for myself, so this was the perfect opportunity (and one of the many plus points as to why I love doing #WOW challenges).

So where to start… good question. If you read up on the official Tableau KLs relating to map layers, it’s all about geography, and while the data source does have geographic data (State, City etc), they aren’t relevant in this case. In my ‘googling’ I found the following resources of use

The first 2 blogs helped me understand the need for the use of the MAKEPOINT function, Sam Parson’s Pies & Doughnuts viz helped me understand the calculation I’d need for the MAKEPOINT function, and the final blog post really helped with putting in all together.

Feel free to ignore the rest of this blog and use the above to help you out 🙂

Building the first map layer

The first step is to create the geometry field we need to base this off of.

Zero

MAKEPOINT(0,0)

Double click this field, and it will automatically add the point centrally onto a map with Longitude and Latitude fields automatically generated too.

This is a key step in getting things started and enabling the use of map layers which we’re going to utilise.

Add Segment to Colour and adjust the colours. Change the mark type to pie chart and add Sales to angle. Increase the size to be as large as possible.

However the size isn’t as big as we need. To increase further use Ctrl-Shift-B (windows) or Cmd-Shift-B (mac) to increase the size further (Ctrl-B / Cmd-B) to reduce. This trick I found in the Interworks blog above. All Tableau key shortcuts are listed here.

Add Segment to the Label shelf. This completes our lowest map layer.

Building the second map layer

Drag Zero onto the map canvas, and drop it over the Add a Marks layer section that displays. This will add a second marks card called Zero(2).

On the marks card that is named Zero, rename it to Outer Pie.

On the marks card named Zero (2) rename to White Circle. Change the mark type to Circle, change the Colour to white and increase the size to leave a narrow border of the coloured pie underneath.

Building the third map layer

Drag another instance of Zero onto the canvas and add another marks layer. Rename Zero (3) to Inner Pie. Change the mark type to Pie chart and add Segment to Colour and Sales to angle. Increase the Size so it’s just smaller than the white circle. Change the opacity of the colour to 70% and add a white border (Colour shelf).

Adding the labels in the pie chart

The simplest way to do this is just to label the inner pie chart with the required fields and then manually move the labels from outside the pie to the desired location. However if your data changed in some way, eg the proportion of the slices changed, the labels may not be where you wanted without further tweaks.

So instead I’ve added a 4th map layer.

Add Zero once again to the sheet and add a marks layer. Rename this marks card to Labels -Inner Pie. Change mark type to Pie chart and add Segment to Detail, Sales to Angle and Sales to Label. Create a new calculated field

Pct

SUM([Sales]) / TOTAL(SUM([Sales]))

Format to % with 0 dp and and add to Label. Adjust the fonts of the labels so the Sales value is larger.

Increase the size of the pie chart so the labels are positioned ‘nicely’ within the segments of the Inner pie

Reduce the opacity of the ‘label’ pie chart to 0% and set the mark layer to be disabled

Finishing up

Adjust the tooltips to display as required (you’ll need to add Pct to the Tooltip shelf on both the Outer and Inner Pie mark cards).

Then remove the map background via the Map menu -> Background Maps -> None. Hide all axis and remove all gridlines/zero lines/row/columns dividers. You should now be left with a ‘clean’ pie chart which can be added to a dashboard.

My published viz is here.

Happy vizzin’!

Donna

Which Sub-Categories are frequently ordered together?

Week 49 of 2021, and it was Ann Jackson’s turn to set her final #WOW challenge (sniff sniff!). Ann’s been setting challenges for the last 4 years, and we’re all going to miss her (I’m sure she may pop up as a guest challenger at some point in the future ….).

Ann decided to revisit her first ever challenge which I did complete here.

Obviously in 4 years, Tableau has moved on, and new features make building this concept a bit more straightforward. So let’s crack on.

Modelling the data

Over the years Ann has adapted her style which includes capitalisation of text. Whilst the challenge can be built with the Tableau provided Superstore Sales data source, Ann very kindly provided her own version which already contained the capitalised fields. I used this version.

To do this type of ‘basket analysis’ you need to use 2 instances of the data source which you model using relationships as follows :

Add a relation of Order ID = Order ID and Sub-Category <> Sub-Category

This results in a data pane which lists a set of fields from the Ann Jackson Superstore data source and a set of the same fields from the Ann Jackson Superstore1 data source. In this blog, I’ll reference these data sources as AJS and AJS1. If you examine a single order with multiple sub-categories you’ll see how the data is being related

The first 2 columns above are from the AJS data source – the order contains 4 different sub-categories. The last 2 columns are from the AJS1 data source, and you can see that for each AJS.Sub-Category, only 3 AJS1.Sub-Category records are listed. This allows us to see the combination of what was ordered with what.

Building the matrix

Add AJS.Sub-Category to Rows and AJS1.Sub-Category to Columns and you have the start of your matrix.

The number of orders is displayed in each cell, which is

# Orders

COUNTD([Order ID])

Add this onto Text

This adds Null column, which is the number of orders containing a single Sub-Category.

Right-click on the word Null in the column to select the column, and Edit Alias to rename the Null to NONE.

To remove the top right hand side of the matrix, I indexed each row/column.

Sub Cat Index

INDEX()

Sub Cat 2 Index

INDEX()

Add Sub Cat Index to Rows, change to discrete (blue pill) and move to be in front of AJS.Sub-Category. Change the table calculation so that it is computing by the AJS.Sub-Category field. Each row should now be numbered from 1 to 17.

Now add Sub Cat 2 Index to Columns, again change to discrete and move to be in front of the AJS1.Sub-Category pill. Edit the table calculation to verify it is computing by the AJS1.Sub-Category field. All the columns should now be numbered 1-18.

We want to remove the data for the records where the Row Index is greater then the Column Index

FILTER

[Sub Cat Index]>=[Sub Cat 2 Index]

Add this field to the Filter shelf, and set to true.

To colour the cells, add # Orders to the Colour shelf, and change the mark type to Square. Then edit the colour palette to use the Colour Brewer Blue Purple colour palette Ann advises (she’s used these palettes in the past, so I already have them installed, but you should be able to get them quickly from here).

Format the text to match mark colour and to be aligned centre. I also set the font to be Tableau Medium and bold. Set the column and row border divider lines to white.

Now uncheck show header against the two Sub Cat Index fields, and hide labels for column/ rows on the other column/row headings to remove them. Rotate the label for the columns and adjust the alignment, Adjust the font on both the column and row headers to bold and you should have the finished visual.

The Tooltips

We need to calculate the average sales contribution per order. We need 2 calculations for this, as we need to show a value for the AJS.Sub-Category on the rows and a value for the AJS1.Sub-Category displayed on the columns.

Average Sales (Rows)

SUM([Sales])/[# Orders]

Average Sales (Cols)

SUM([Sales (Ann Jackson Superstore1)])/[# Orders]

Add both of these to the Tooltip shelf. It’ll make a Null row appear – just right click and exclude.

Ann’s also been particular about some of the text displayed, so we need calculated fields to apply the required logic.

TOOLTIP – Sub Cat 2 Title

“& ” + [Sub-Category (Ann Jackson Superstore1)]

TOOLTIP- Sub Cat Text

IF [Sub Cat 2 Index]=1 THEN “SUB-CATEGORY” ELSE “SUB-CATEGORIES” END

TOOLTIP – Sub Cat 2

IF NOT(ISNULL([Sub-Category (Ann Jackson Superstore1)])) THEN [Sub-Category (Ann Jackson Superstore1)] + “:” END

Add all three fields to the Tooltip shelf, and adjust the tooltip as below

And with that, you should now have the fully completed challenge. My published version is here.

Finally, once again I want to thank Ann for all her contributions to the #WOW community over the last 4 years. I will miss her capitals and bold colour palettes – I’ll be storing away her pre-capitalised version of Superstore for sure :-)!

Happy vizzin’!

Donna

Can you toggle between AND|OR filtering logic?

Week 49 of #WOW2020 was a guest challenge by Sam Epley to demonstrate the ability to toggle between filtering using AND logic and filtering using OR logic. Sam stated that it had been something he and colleagues had been pondering for a while but v2020.2 provided a possible solution.

So the first thing I did was a quick check on what functionality had been released in v2020.2 which included set controls. I also noticed when examining the published solution, that if I hovered over the Select Values filter controls, a small set icon would appear, which indicated these filters were indeed based on set controls

This blog will focus on

  • Select a Field for Slicer x parameter
  • Select Values for Slicer x filter
  • Building the Logic

Select a Field for Slicer x parameter

I had a slight ‘moment’ with this, until I realised I just had to copy the ╚ character and paste it into the parameter list I created

You’ll need to create 3 instances of this parameter (or create once, duplicate and rename).

Select Values for Slicer x filter

Note – what I describe here will need to be duplicated for each slicer.

There’s a chance I may well have been a bit long-winded in how I went about this, but it worked for me…

First up, I extracted the value selected in the Select a Field parameter, only returning a value if it contained the special ╚ character. So this would return the word ‘Region’ or ‘Segment’ but would return NULL if ‘Location’ or ‘Customer’ had been selected.

Selected Slicer 1

IF CONTAINS([Select a Field for Slicer 1:],’╚ ‘) THEN REPLACE([Select a Field for Slicer 1:],’╚ ‘,”)
ELSE NULL
END

I then needed to ‘map’ this value to the actual field in the data source, so I could get a handle on the actual values associated to the field

Selected Slicer 1 Values

IF [Selected Slicer 1] = ‘Region’ THEN [Region]
ELSEIF [Selected Slicer 1] = ‘State’ THEN [State]
ELSEIF [Selected Slicer 1] = ‘Category’ THEN [Category]
ELSEIF [Selected Slicer 1] = ‘Sub-Category’ THEN [Sub-Category]
ELSEIF [Selected Slicer 1] = ‘Segment’ THEN [Segment]
ELSEIF [Selected Slicer 1] = ‘Ship Mode’ THEN [Ship Mode]
ELSE ‘No Selection’
END

From this I could create a set to store the values, by right-clicking on the Selected Slicer 1 Values field and choosing Create -> Set, and selecting the Use all option.

The values in the set can then be accessible on the sheet foe selection by right-clicking on the set field and choosing Show Set

Building the Logic

To help with this, you can build out a basic view by Region that returns In or Out for each of the sets

As you can see when everything is set to ‘None Selected’, everything is In.

As we change the filter options, you can see the values change to be In and Out

For the AND logic to work, we’re looking for the rows of data where every In/Out column is In.

For the OR logic to work, we’re looking for the rows of data where In exists in at least one column.

However, when there is a No Selection option selected, all values are In the set, and when we’re using the OR logic, we don’t want these.

So we need to identify when No Selection is selected in the filter

Slicer 1 is Not Selected?

[Selected Slicer 1 Values] = ‘No Selection’

and we need this for the other slicers too.

Then we need to build a new calculated field that is going to handle all this logic, which is driven by a pLogic parameter which simply contains the values of AND and OR.

In Combined Set

IF [pLogic]=’AND’ THEN [Select Values for Slicer 1:] AND [Select Values for Slicer 2:] AND [Select Values for Slicer 3:]
ELSE //It’s OR
IF [Slicer 2 Is Not Selected?] AND [Slicer 3 Is Not Selected?] THEN [Select Values for Slicer 1:]
ELSEIF [Slicer 1 Is Not Selected?] AND [Slicer 3 Is Not Selected?] THEN [Select Values for Slicer 2:]
ELSEIF [Slicer 1 Is Not Selected?] AND [Slicer 2 Is Not Selected?] THEN [Select Values for Slicer 3:]
ELSEIF [Slicer 3 Is Not Selected?] THEN [Select Values for Slicer 1:] OR [Select Values for Slicer 2:]
ELSEIF [Slicer 2 Is Not Selected?] THEN [Select Values for Slicer 1:] OR [Select Values for Slicer 3:]
ELSEIF [Slicer 1 Is Not Selected?] THEN [Select Values for Slicer 2:] OR [Select Values for Slicer 3:]
ELSE [Select Values for Slicer 1:] OR [Select Values for Slicer 2:] OR [Select Values for Slicer 3:]
END
END

This field returns a true or false against each row in the data set, and can be used to build the bar chart display.

This covers the most ‘complex’ bit of this challenge – below shows what one of the charts looks like

If you’re not familiar with how to use measure swapping, which is another feature of this challenge, then check out a previous blog I wrote.

I also created a field to add to the Tooltip to show a $ symbol in the event the measure selected was Sum of Sales.

My published solution is here.

Happy vizzin’! Stay Safe!

Donna