
The challenge involves the following key components
- modelling the data
- building the scatter chart
- displaying the year in the chart
- highlighting the regions
Modelling the Data
The data provided consisted of 4 files
- Life Expectancy
- Population
- Income
- Region Mapping
The first 3 files are stored as a matrix of Country (rows) by Year (columns) with the appropriate measure in the intersection. This isn’t the ideal format for Tableau so the data needs to pivoted to show 3 columns – Country, Year and the relevant measure (life expectancy, population or income depending on which file is referenced).
In Tableau Desktop, connect to the Life Expectancy file. If you find the column names all seem to nave generic labels, then right click on the ‘object’ in the data source pane, and select Field names are in the first row.

This will change so all the years are now listed as column headings


Your data will be reshaped into 3 columns

I then renamed each column as
- Country
- Year
- Life Expectancy
I also chose to change the datatype of the Year column to be a number (whole), as I knew I’d be needing to relate the data on this field later, and working with numeric data is more efficient than strings.
Now I want to bring in the next file – Population (via the Add connection option). But before following these steps, just read on a bit, as I hit a hurdle, which at the point of writing I’m still a bit unclear about…..
I added the Population file into the data source pane. By default it showed the Edit Relationship dialog, but I had the same issues with the column headings.

So I closed the dialog, and fixed the headings by setting the Field names are the first row option. Trying to apply the pivot at this point wasn’t available – no Pivot option when I selected all the columns. So I then manually edited the relationship and set County = country (population)


I’m not sure whether this is possible (Google hasn’t shown me that it isn’t, but it hasn’t shown me that it is either…). I’ve tried using csv and excel versions of the file…..
So until I see the official solution (or I find out otherwise), I carried out the following approach instead :
I removed Population object, so I was just left with the Life Expectancy object. I then went to sheet 1, right clicked on the data source and selected Extract Data -> Extract and when prompted saved the file as Life Expectancy.hyper.
I then started a new instance of Desktop, and connected to the Population file, and went through the steps outlined above – set the Field names are first row, selected all the years to pivot, renamed the fields, set the datatype of the Year field to a whole number, then extracted and saved as Population.hyper.
I then repeated the exact same process with the Income file.
Once I had the 3 hyper files created, I then instantiated another instance of Desktop, and connected to the Life Expectancy.hyper file.
This automatically added an object called Extract into the pane. I renamed this (via right click) to DS-Life Expectancy. I also found my field names hadn’t been retained, so I renamed these too.

I then added the Population.hyper file. I set the relationships as below (once again I’d lost field names 😦 )

I then renamed the object & columns – note it won’t let you add fields with the same name as in the other object in the data pane, so I suffixed the columns

The process was once again repeated to add the Income data

and finally the Region data was also added matching Country to Name

And now, after all that, I have the data in the format I need to build the scatter. Phew!
[Side Note] – whilst writing this blog, I have been messaging my fellow #WOW participant Rosario Gauna. We often check in with each other, if we’ve noticed something odd with the challenges, to sense check whether its an issue with the data/the requirement, or just our own personal interpretation/development. In this instance we confirmed we were both using v2020.4.2, BUT Rosario had been able to do exactly what I set out to do at the start of this blog!! She had a pivot option on her 2nd connection. We’ve tried to figure it out, but have been unable to… I am using Windows, but Rosario is on a Mac, so there might be something there….. It will be interesting to understand if anyone else encountered my problem…. I really didn’t think the modelling part of the challenge was intended to be as fiddly as I’ve described above.
Building the Scatter Chart
The data we have goes beyond 2021, so we need to filter. Add Year to the Filter shelf and set the condition as below

Set both these filters to Apply to All Using this Datasource as they’re relevant across all sheets that will be built.
Now build the scatter by
- Add Year to the Pages shelf
- Add Income to Columns and change to Dimension (rather than Sum) – still leave to be Continuous
- Add Life Expectancy to Rows and change to Dimension (rather than Sum) – still leave to be Continuous
- Add Country to Detail
If you change the Year on the pages control to 2021 you should get something like

If you examine the Income axis on the solution, you’ll see the scale isn’t uniform. This is because it’s using a logarithmic scale instead, which you can set by right-clicking on the Income axis -> Edit Axis and selecting the relevant checkbox.

Now we can add Population to Size, change the mark type to circle, and we also want to colour by the region, but we need a new field for this to get it in the format we need for tooltips etc:
Region
UPPER(REPLACE([Eight Regions],’_’,’ ‘))
Add this to Colour and adjust accordingly, reducing the transparency a bit. Format the tooltips and you should have the main chart

Displaying the Year in the Chart
On a new sheet build a very simple view that just shows the Year in the Text shelf, that has been formatted to a very large (72pt) point, centred in the middle, and a pale grey colour.

On the dashboard, add this Year sheet. Then when you add the Scatter sheet, add it as a Floating object and position directly on top of the Year sheet. The Year sheet won’t be visible, so navigate to your Scatter sheet, right click on the canvas and format. Set the background of the worksheet to None ie transparent

If you return to the dashboard, the Year sheet should now be visible.
Note, as you add more objects to your dashboard, you will have to move things around a bit to get the layout as desired.
Highlighting the Regions
Create a simple table that displays the Regions in a row as below (the Show Header on the Region pill on the Columns shelf is unchecked).

Add this to the dashboard, then add a Highlight Dashboard Action that sources from the Region view and affects/targets the Scatter chart.

The final comment to make is that you can’t remove the ‘speed control’ from the page control display. However, when published to Tableau Public, it doesn’t show, so I assume is not enabled (or broken) there.

Hopefully this now gives you what’s needed to rebuild this fabulous viz. My published version is here.
Please do let me know if you had issues with your data modelling like I did!
Happy vizzin’! Stay Safe!
Donna