Connected Scatterplots

Sean set this challenge this week, to build a connected scatterplot to allow additional insights to be gained.

We need to show a circle per country related to the specified year, and then show the data for all years if a country is specified from the drop down, or ‘clicked on’ by the user; these data points are all then connected.

Let’s start by building up the various parameters and calculated fields needed to help with this.

Setting up the data

For the user inputs, I used parameters

pYear

Integer parameter, defaulted to 2000, displayed in a format so no thousand separators are shown. I populated the list using the values from the Year field.

pCountry

string parameter defaulted to All. I populated the list of entries by first adding values from the Country field. I then manually added an All entry to the bottom of the list and dragged it to the top. I could then set All as the default value.

On a new sheet, show these parameters.

We’re going to use a dual axis chart to display the viz, and for this, we’re going to get the relevant measures for the specific Year and for the specific Country.

To see what I’m aiming for, lets’ build out the data in a table. Add Country to Columns and Year to Rows. Display the values of Fertility Rate and Life Expectancy. This just gives us all the data points

But we only want the points related to the pYear (2000) or if pCountry if it’s not All (in this case Afghanistan).

So we create

Fertility Rate for Year

[Year] = [pYear] THEN [Fertility Rate] END

Life Expectancy for Year

IF [Year] = [pYear] THEN [Life Expectancy] END

format these to 1 dp and then add to the table. The fields only contain values for the specified pYear.

Create

Fertility Rate for Country

IF [Country] = [pCountry] THEN [Fertility Rate] END

Life Expectancy for Country

IF [Country] = [pCountry] THEN [Life Expectancy] END

format these to 1 dp and also add to the table. We now have these entries only existing for the selected pCountry.

If pCountry is set to All, the Fertility Rate for Country and Life Expectancy for Country are empty for every Country.

We now have the basics we need to build the viz.

Building the ScatterPlot

On a new sheet, show the parameters, then add Fertility Rate for Year to Columns and Life Expectancy for Year to Rows and Country to Detail. Change the mark type to circle. Adjust the Tooltip.

Create a new field

All Countries

[pCountry] = ‘All’

and add to the Colour shelf. Adjust the colours so when pCountry = All, the All Countries colour legend is True and displays a darker instance of a colour, as opposed to when pCountry is set to something else, and the All Countries colour legend is false.

Now add Fertility Rate for Country to Columns and Life Expectancy for Country to Rows. Change both fields to be Dimensions. On the 2nd marks card, add Year to Detail. and remove the All Countries field from the Colour shelf. Change the mark type to line and move Year to Path. Set the colour accordingly.

Then set both the Rows and Columns to be Dual Axis and synchronise both axis. Remove Measure Names from the colour shelf on the all marks card.

Adjust the Tooltip of the 2nd (line) marks card.

Add Year to the Label shelf of the 2nd marks card and update so it just displays for the Min & Max value. Adjust font size and style.

On the first marks card (the circle) add Country to Label and adjust so it only displays when selected.

Hide the right and top axis. Remove row & column dividers. Hide the null indicator and update the title of the axes. Name the sheet Scatterplot or similar.

Building the Dashboard

Add the sheet to a dashboard, and float the parameters into a suitable location. Add a floating text box that references the pYear parameter and position bottom left of the chart. Add a parameter action to update the pCountry parameter when a circle is clicked.

Click Country

On select of the Scatterplot Viz, set the pCountry parameter, passing in the value from the Country field. When cleared, set the parameter back to All.

Finally, if you click a circle to select a Country, you’ll find that the circles ‘fade out’ more than what you want – you want them to look the same as the colour when a country is selected via the dropdown. Essentially, you want all the circles to be ‘highlighted’ on click. To do this, create a new field

HL

“Highlight”

and add this to the Detail shelf on the scatterplot sheet. Then on the dashboard, add a Highlight action

HL marks on click

On select of the scatterplot viz, target itself, highlighting the HL selected field

And that should be it. My published viz is here.

Happy vizzin’!

Donna

Is Profit Ratio Influenced by Quantity?

For this week’s challenge, Ann asked we recreated a scatter plot that ‘on click’ became a connected scatter plot and also displayed an ‘insights’ box to the user.

Building out the data

The scatter plot is to show Profit Ratio by Quantity, so first up we need

Profit Ratio

SUM([Profit])/SUM([Sales])

formatted to percentage with 0 dp.

A mark is shown on the scatter plot per Category and Month/Year. When working with dates like this, I often prefer to created a dedicated field to use as an ‘exact date’ which I can then format. So I built

Month Year

DATE(DATETRUNC(‘month’,[Order Date]))

and formatted it to the March 2001 format (ie mmmm yyyy).

We’re also going to need a set based on the Category field, as that is going to drive the interactivity to help us connect the dots ‘on click’. So right click on Category and Create Set and just tick one of the categories displayed for now.

Selected Category

Let’s build these out into a table as follows

  • Order Date to Filter, filtered by years 2018 & 2019
  • Category, Selected Category & Month Year (exact date , discrete) onto Rows
  • Measure Names to Text and Filter to Quantity & Profit Ratio
  • Right click on Selected Category and select Show Set to display a selection control to easily allow the set values to be changed

To help drive the interactivity, we’re going to need to store the value of the Profit Ratio for the Selected Category in the set.

Selected PR

IF ATTR([Selected Category]) THEN [Profit Ratio] END

Format this to percentage 0 dp too, and add this to the table – it will only display the profit ratio against the rows associated to the category in the set

At this point, we’ve got what we need to build out the scatter plot, but I’m going to continue to create the fields needed for the ‘Insights’ piece.

These are looking for the months that contained the Max & Min values for the Profit Ratio and Quantity. I’m going to tackle this with Table Calcs.

Max PR per Category

WINDOW_MAX([Profit Ratio])

Adding this to the table, and setting the table calculation to compute by Month Year only, gives us the Max Profit Ratio for each Category

We need 3 more fields like this :

Max Qty per Category

WINDOW_MAX(SUM([Quantity]))

Min PR per Category

WINDOW_MIN([Profit Ratio])

Min Qty per Category

WINDOW_MIN(SUM([Quantity]))

Add all these to the table, setting the table calc in the same way as described above.

Now we know what the min & max values are, we need to know the month when these occurred.

Max PR Month

WINDOW_MAX(IF ATTR([Selected Category]) AND [Profit Ratio] = [Max PR per Category] THEN MIN([Month Year]) END)

Format this to March 2001 format.

If the category is that selected, and the profit ratio matches the maximum value, then retrieve the Month Year value, and duplicate that value across all the rows (this is what the outer WINDOW_MAX does).

Add this to the table in the Rows, and adjust the table calc settings for both the nested calculations to compute by Month Year only. The month for the selected category which had the highest profit ratio will be displayed against all rows for that category, and null for all other rows.

Again we need similar fields for the other months

Max Qty Month

WINDOW_MAX(IF ATTR([Selected Category]) AND SUM([Quantity]) = [Max Qty per Category] THEN MIN([Month Year]) END)

Min PR Month

WINDOW_MAX(IF ATTR([Selected Category]) AND [Profit Ratio] = [Min PR per Category] THEN MIN([Month Year]) END)

Min Qty Month

WINDOW_MAX(IF ATTR([Selected Category]) AND SUM([Quantity]) = [Min Qty per Category] THEN MIN([Month Year]) END)

Format all these too, and add to the table with the appropriate table calc settings applied.

If you now change the value in the the Selected Category set, you should see the table update.

Building the Scatter Plot

On a new sheet, build out the basic scatter plot by

  • Order Date to Filter, filtered by years 2018 & 2019
  • Profit Ratio to Rows
  • Quantity to Columns
  • Category to Colour and adjust accordingly
  • Month Year exact date, continuous to Detail
  • Change Shape to filled circle
  • Adjust the Tooltip to suit
  • Change the axis titles to be capitalised
  • Change the sheet title
  • Remove all row/column borders & gridlines. Leave the zero lines and set the axis rulers to be a sold grey line

Add Selected PR to Rows, and change the mark type of that card to line, and add Month Year to path. In the image below, I still have Furniture selected in my set.

Add Month Year to the Label shelf too and format so the text is smaller and coloured to match mark colour

Make the chart dual axis, synchronise the axis and remove the Measure Names from the colour shelf of each mark. Hide the Selected PR axis.

If you now change the selected set value, you can see how the other categories ‘join up’.

Building the Insights Box

Take a duplicate of the data table built originally, and add Selected Category to Filter. If you’ve still got an entry in your set, this should reduce the rows to just those records for that Category.

Remove the fields Profit Ratio, Quantity, Selected PR from the Measure Values section, and remove Selected Category from Rows.

Create a new field

Index = Size

INDEX()=SIZE()

and add this to the Filter shelf, setting the value to true.

The INDEX() function will number each row, the SIZE() function will number each row based on the total number of rows displayed (ie all rows will display the same number), so INDEX()=SIZE() will return true for the last row only

Now we’ve only got 1 row displayed, we’ve got the info we need to build the data in the Insights box by

  • Move Month Year to Detail
  • Category to Colour
  • Move Max PR Month, Max Qty Month, Min PR Month, Min Qty Month to Text
  • Move Max PR per Category, Max Qty per Category, Min PR Per Category, Min Qty per Category to Text

This will probably show duplicate instances of the values. This is because we didn’t fix the table calculation setting of the Index=Size field. Edit that to compute by Month Year

and then re-edit the Index=Size filter to just be true again, and expand the display so you can see all the values.

Now you can easily format the Text to the required display

I used this site to get the bullet point image from

Building the dashboard & interactivity

Add the scatter plot to the dashboard, and remove the legends that automatically get added.

Float the Insights sheet and position bottom right, remove the title. fit entire view, and also remove the colour legend which will have been added.

Add a dashboard Set Action to assign values to the Selected Category set and remove all values when the selection is cleared

Additionally, add a highlight action that highlights the Category field only (this keeps the lines highlighted on selection of a single mark).

Now practice clicking on and off marks on the scatter plot and you should see your lines & insights box being added and removed.

Finally add the title and header line by adding a vertical container at the top above the scatter plot.

Add a text box for the title, and underneath add a blank object.

Set the padding of the blank object to 0 and the background to black. Then edit the height to 4, and adjust the height of the container if it’s now too large.

Make adjustments to the position of the floating insights box if need be so it isn’t overlapping any axis.

And that should be it! My published viz is here.

I’ll be taking a break for a couple of weeks, so will be on catch up when I return from some R&R.

Happy vizzin’! Stay Safe!

Donna