Visualising Ranges

This week’s challenge was set by Sean and was a recreation of a PowerBI WOW challenge set a few weeks ago. The data set just contains 2 pieces of information – a datetime field and a glucose measurement.

Creating the calculations

On a new sheet, add Device Timestamp to Rows as a discrete (blue) pill at the hour level.

Then create the following fields, which are aggregations of the glucose levels at each hourly time period.

Minimum

MIN([Glucose mg/dL])

5th Percentile

PERCENTILE([Glucose mg/dL],0.05)

25th Percentile

PERCENTILE([Glucose mg/dL],0.25)

Median

PERCENTILE([Glucose mg/dL],0.5)

75th percentile

PERCENTILE([Glucose mg/dL],0.75)

95th percentile

PERCENTILE([Glucose mg/dL],0.95)

Maximum

MAX([Glucose mg/dL])

Add all these fields to the table and then format the Hour field to use the 12hr format by default

Building the Viz

On a new sheet, add Device Timestamp as a continuous (green) pill at the hour level to Columns. Then add Median to Rows. Set the mark type to line and set the path to be a dashed line and colour  black. Format the Device Timestamp pill so the default format on the pane is 12hr format (axis should remain as numeric).

Add Minimum to Rows to create a second marks card. Change the mark type to Area and set the colour to be 100% opacity.

Then drag 5th Percentile to the Minimum axis and drop when the ‘2 green column’ symbol appears.

This will automatically add Measure Values to Rows instead and Measure Names to Colour. Add 25th Percentile, 75th Percentile, 95th Percentile and Maximum to the axis too. Reorder the fields in the Measure Values section so the fields are listed in the appropriate order from Min to Max. Adjust the Colours accordingly. Unstack the marks (Analysis -> Stack Marks -> Off).

Add Measure Names to Label and align middle right. Allow labels to overlap marks.

Add Measure Names to the Label on the Median marks card. Also align middle right and format to be underlined.

Make the chart dual axis, synchronise the axis, and then right click on the right hand axis and move marks to back.

Hide the right hand axis (uncheck show header). Edit the Device Timestamp axis to fix from 0 to 26, set the axis to display a value every 3 hours and remove the title.

Add constant Reference Lines to the left hand axis for the values 80 and 180

Edit the left hand axis title. Remove row and column dividers. Add axis rulers. Add a title.

Add to a dashboard and then publish. My published version is here.

Note – the vertical grid lines that appear on the Tableau Public version are a nuance of Tableau Public. I originally recreated on Desktop thinking it was a requirement of the challenge, by adding multiple constant reference lines for each hour. But when I checked Sean’s solution after publishing, I found he didn’t have any and the intention was not to have any.

Happy vizzin’!

Donna

Can you build a reference box?

Erica Hughes set her first #WOW challenge this week, which is all about reference lines. The intention by the #WOW crew is to use this initial challenge as the basis for the next few weeks, so it’s going to be interesting to see how that works out.

But back to this week. We’re building a basic Sales vs Profit Ratio scatter plot by State, but then need various calculated fields to plot the reference lines and the reference box.

  • Building the basic chart
  • Adding the dotted reference lines
  • Adding the reference box
  • Making the y-axis symmetrical
  • Highlighting the selected state

Building the basic chart

First up we need to create our favourite calculated field

Profit Ratio

SUM([Profit]) / SUM( [Sales])

format to percentage with 0 dp.

Create the basic scatter plot as below

Remove all gridlines, zero lines and axis rulers and tick marks.

Adding the dotted reference lines

These lines are the median values for the Profit Ratio and Sales.

Sales – Median

WINDOW_MEDIAN(SUM([Sales]))

format to Custom Currency with 1 dp, $ prefix and display units to Thousands (K).

PR- Median

WINDOW_MEDIAN([Profit Ratio])

format to percentage with 0 dp.

Add both these fields to the Detail shelf, and adjust the table calculation settings of both fields so they are computing by State

Add a reference line to the Sales axis (I do this by right clicking on the axis > Add Reference Line, but you can also drag from the Analytics pane).

Set the reference line to be against the Entire Table, using the Sales – Median field, no label, and the tooltip set to custom as below. Set the line to be dotted.

Apply a similar reference line to the Profit Ratio axis, using the PR – Median value instead.

Adding the reference box

I have to admit, this bit took a bit of trial and error. I knew it was going to involve a combination of bands and lines and colouring above and below, but things didn’t always go to plan. It was this post by Jonathan Allenby that helped.

The block is basically bounded by the 25th & 75th percentile of the Sales and Profit Ratio values. So lets create them

Sales – 25th Percentile

WINDOW_PERCENTILE(SUM([Sales]),0.25)

Sales – 75th Percentile

WINDOW_PERCENTILE(SUM([Sales]),0.75)

Add these fields to the Detail shelf and set the table calculation on both to compute by State.

Add a reference line again to the Sales axis, and this time select Band that goes from the Sales – 25th Percentile to the Sales – 75th Percentile and fill based on the colour stated in the requirements.

Now add a reference line to the Profit Ratio axis. This time select Distribution.

Select the computation to be Percentiles and enter 25 as the value. Set Label, Tooltip and Line to None. Now this is the odd bit… set the Fill to the colour white, then check the Fill Below box, which then seems to set the Fill option to ‘No Fill’. Doing this seems to have the desired effect. Trying this in any other order, or manually selecting the Fill option to be No Fill, just doesn’t seem to work…

Now repeat the process adding another reference line on the Profit Ratio axis, and this time create a distribution for the 75th percentile. This time, after setting the Fill to white, check the Fill Above checkbox. Weirdly this time, the fill on my laptop got set the Grey Light, although it was correctly displaying as white on the screen. I manually changed it to ‘No Fill’.

It feels like there’s something a bit ‘buggy’ with all this, which might explain while my initial attempts were failing. I’ll be interested in knowing if you see this behaviour too (I created on v 2021.4.2).

Making the y-axis symmetrical

This was a ‘bonus’ feature, but is again achieved with a reference line. What we’re looking for here is to understand what is the maximum absolute value so we can determine where to place a hidden reference line – ie if the maximum profit ratio is 20% and the minimum is -30%, the maximum absolute value is 30%, and we’d need a reference line +30%, so the axis extends from -30% to +30%. Conversely if the maximum profit ratio is 40% and the minimum is -25%, the maximum absolute value is 40% and we need a reference line at -40% for symmetry.

I’ve encapsulated all this logic within one field below

Ref Line – Profit Ratio Symmetry

IF MAX(WINDOW_MAX([Profit Ratio]), ABS(WINDOW_MIN([Profit Ratio]))) = WINDOW_MAX([Profit Ratio]) THEN -1*WINDOW_MAX([Profit Ratio]) ELSE -1*WINDOW_MIN([Profit Ratio])
END

The statement MAX(WINDOW_MAX([Profit Ratio]), ABS(WINDOW_MIN([Profit Ratio]))) is returning the maximum of the max and min values (ie MAX(20, ABS(-30)) = MAX(20,30) = 30, MAX(40, ABS(-25)) = MAX(40,25) = 40.

Add this onto the Detail shelf and add another reference line to the Profit Ratio axis, setting it to use the Minimum of the Ref Line – Profit Ratio Symmetry field

Test how the field works, by selecting a few of the State marks towards the top and exclude; the axis should change, but still be symmetrical.

Highlighting the selected state

For this we need a parameter

pSelectedState

string parameter defaulted to Virginia

Then we need a field to identify which state has been captured

Selected State

[State]=[pSelectedState]

which returns a boolean. Change the mark type to Circle and add Selected State to the Colour shelf, and adjust accordingly (add a border to the circle too via the Colour shelf). Also add Selected State to the Size shelf and again tweak.

Once the sheet has been added to the dashboard, create a dashboard parameter action that passes the State field into the pSelectedState parameter on Select of the chart.

The final step to add is to stop all the other marks from ‘fading’ out when a State mark is selected. This is achieved by creating the following calculated fields

True

True

False

False

Add both of these to the Detail shelf of the scatter chart.

On the dashboard, create a dashboard filter action as below, which passes selected fields setting True = False, which can never be true and prevents the mark from being highlighted.

And with that, you should have the desired dashboard. I’m interested to see if this matches Erica’s solution (it’s likely I’ll start with the provided workbook next week, rather than mine, just in case there are some discrepancies – eg I’ve used a lot of table calcs… LoDs may have been possible…)

My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Can you make Spine Charts?

Sean Miller provided the challenge for this week, resurrecting a challenge originally set by Emma Whyte in 2017. Revisiting these older challenges is great fun, as often newer product features provide a different way of solving. For me, I also like the fact I know I’ve already solved it once, and have my own work to reference if I get stuck – ha ha!

Sean hinted that this wasn’t a challenge to ‘overthink’ – no table calcs or LoDs required. You need to be able to display average responses per question per university alongside the overall average response for the question. Simply filtering by university isn’t going to cut it, as the quick filter will immediately eliminate all the data that isn’t associated to the selected university, which means you can’t compute an ‘overall average’ without using LoDs.

The key to this challenge is to use a parameter to drive the University selection. Create this by right clicking on the University field -> Create > Parameter. This will create the parameter dialog box, prepopulated with all the university values. Set the default to University of Liverpool.

pUniversity

With this, we can now create calculated fields to store the values associated to the selected university only.

Sample Size

AVG(IF [University]=[pUniversity ]THEN [Sample Size] END)

Note 1, there is already field called Sample Size in the data set. The actual name of this field is <space>Sample Size<space> which Tableau sees as a different name. In hindsight I should have just renamed the original field, so I could then have ‘Sample Size‘. Be mindful of this when I refer to the field later; unless I call it out, I’m referring to my version.

Note 2, I chose to apply the AVG aggregation within the calc rather than changing the default aggregation on the pill when added to the view. There was a reason I did this, but I can’t recall what it was, and think it wasn’t necessary in the end….

University Avg

AVG(IF [University] = [pUniversity ] THEN [% Agree] END)

formatted to percentage, 1 dp

We can also then define the overall average for comparison

Overall Avg

AVG([% Agree])

formatted to percentage, 1 dp

and with that can calculate the variance between the two

Delta

([University Avg]) – ([Overall Avg])

This is custom formatted to 0.00%▲;0.00%▼ (I use this site to get the arrow characters)

And then we need a field to define how the mark needs to be coloured

Colour

[Delta]>=0

We can put these all out in a view

  • Question Number (which I renamed to No) on Rows
  • Question Text on Rows
  • Sample Size on Rows (set to be a discrete blue pill)
  • University Avg on Rows (set to be discrete)
  • Overall Avg on Rows (set to be discrete)
  • Delta on Rows (set to be discrete)
  • University Avg on Columns (continuous green pill)
  • Change Mark Type to Circle
  • Add Colour to the Colour shelf and adjust

Now we need to work on adding the various lines and bands on the chart. This is all managed by adding reference lines (or bands).

Drag % Agree to the Detail shelf, and change to be AVG.

The drag the same field % Agree on Detail again, this time change to MIN. Repeat again, and change to MAX.

Right click on the University Avg axis > Add Reference Line. Create a line, per pane using the AVG(% Agree) field.

Add another reference line (by right clicking on the axis again). This time create a band that starts at MIN(% Agree) and ends at MAX(% Agree). Set the Fill colour to light grey.

We need to create some new fields for the quartile values.

Lower Quartile

PERCENTILE([% Agree],0.25)

Upper Quartile

PERCENTILE([% Agree],0.75)

Add both these fields to the Detail shelf again, then add another reference line (band) similar to that above, but referencing the quartile fields. Set the Fill colour to be a darker grey.

Adjust the formatting and set the tooltips and you’ve got the main chart…. well almost…

In the solution, the first column, the question no, is not labelled. I couldn’t figure out how to do this, which is why I relabelled to simply No. I tried various things, including using text boxes as column headings on the dashboard, but the layout just didn’t work.

BUT I’ve now found out how to do it… because I googled, and I didn’t yesterday when I was building 😦 Andy Kriebel explains it all here. He searches for a ‘zero width space’ character on this site , and then copies the resulting ‘character image’ displayed and pastes into the label of a calculated field. Watch the video to see it in action, but I’ve noted the steps here, just as much for my own benefit when I can’t remember what to do in future… I can see this type of feature cropping up often 🙂

The legend utilises a lot of the concepts above, but we don’t what the mark changing with each university selection. So let’s just hardcode

Legend Avg

AVG(IF [University] = ‘Middlesex University’ THEN [% Agree] END)

and we’ll need a dedicated field for the colour

Legend Colour

[Legend Avg] – [Overall Avg] >=0

The legend sheet can then be built just by plotting the Legend Avg pill on the Columns shelf, with a mark type of circle, the Legend Colour on the Colour shelf, and the same pills used in the reference lines above on the Detail shelf.

When adding the reference lines and bands this time, you will need to add labels and format their position.

The quartile band, also has dotted lines indicating the end of the band, which you can apply as part of the band properties. However the quartile band also has one label left aligned, while the other is right aligned. For a single reference band, the labels can either both be formatted left aligned or both right aligned. To resolve this, don’t add a label to the ‘band to’ section. Create another reference line for the Upper Quartile value, and you can then format the label of this independently.

My published via associated to this challenge is here.

My version based on the original challenge from 2017 is here. The requirement was a bit more complicated it would seem, and it looks like I utilised FIXED LODs quite heavily.

Happy vizzin’! Stay Safe!

Donna

Can you find all counties within n miles of a selected county?

I’ve been on annual leave for a few weeks. I’ve managed to catch up on all the challenges but haven’t blogged a solution for a while. It’s been a real struggle to get back into things to be honest – back to work, back to school, back to football clubs for my kids, and I’m wondering where I found the time before 😦

Trying to think about when/if I was going to get this blog out has caused me a bit of stress, which I don’t need, and so I need to change my mindset a bit…try to relax a bit … if I don’t manage to post a blog, just accept it and move on. I’ve also got to try to reduce the time it takes me to blog. I’m a ‘detail’ person, so often end up documenting to such a low level, but for my own sanity, I’m going to have to make an effort to be a bit briefer. I can’t guarantee I’ll stick to this though.. will just have to see how things pan out.

Going forward, I’m going to try to focus on the points that I think are key to the challenge, or those I found a bit tricksy.

So onto this week’s challenge. Sean Miller returned as guest poster with a COVID-19 related distance challenge.

Note – I connected to the provided hyper extract file rather than the csv file.

The core points I’m going to discuss are

  • Identify counties within n miles
  • How to select a county without set actions
  • How to handle ‘All’ counties being selected
  • Colouring based on ‘percentile’ of hospital beds
  • How to only show county borders of counties within selected range

Identify counties within n miles

The number of miles is stored within an integer parameter, n miles, that is defaulted to 100.

A county to act as the ‘start point’ is stored in a set, Selected County, based on the County, State field.

The long/lat coordinates of this selected county need to be captured.

Selected County Lat

{FIXED : AVG(IF [Selected County] THEN [Latitude] END)}

Selected County Long

{FIXED : AVG(IF [Selected County] THEN [Longitude] END)}

By using a FIXED LoD calculation, the values are stored against every ‘row’ in the data set.

With these, the starting point can be determined

Selected County Start Point

MAKEPOINT([Selected County Lat],[Selected County Long])

The position of every other county – the ‘destination’ / end point – also needs to be determined

End Point

MAKEPOINT([Latitude],[Longitude])

With these, the distance between can be computed

Distance

DISTANCE([Selected County Start Point],[End Point],’miles’)

And the counties can then be restricted by

Within n miles

[Distance]<= [n miles]

which can be used as a filter and set to True.

How to select a county without set actions

This is managed via the set control feature; right click on the Selected County set and choose Show Set to display the list of counties with the option to select which ones are in or out of the set. Change the display of the control via the dropdown arrow (top right) to be Single value dropdown which automatically provides as ‘All’ option or the ability to select a single set only.

How to handle ‘All’ counties being selected

When ‘All’ is chosen via the Set Control selector, this has the effect of adding all the State, County values into the set, which means we don’t really have a starting state. So the n miles parameter is essentially redundant. But we need to make the Within n miles filter understand this.

We can manage this by first identifying how many values we have in the set

Count Selected States

{FIXED : COUNTD(IF [Selected County] THEN [County, State] END)}

There will either be 1 or if All is selected, then they’ll be 3000+, and we use the FIXED LoD, so the total is stored against all rows.

We can then update our Within n miles filter to consider this value

Within n miles

([Count Selected States]=1 AND [Distance]<= [n miles]) OR [Count Selected States] > 1

This now returns true if either one State, County is selected and the other records are within n miles OR all the states are selected (the count is > 1).

Colouring based on ‘percentile’ of hospital beds

This caused me a little bit of headscratching. I assumed ‘percentile’ would be based on a percentage of the total num_staffed_beds (note I simply renamed this field to Hospital Beds), associated to the state,counties being displayed (ie within n miles). But after building the calculation I thought, and adding it to colour and choosing the Green-Gold colour range, I didn’t get the same spread as shown in the solution.

I messaged Sean to question whether I’d made the right assumption, but while waiting for a response, I did an online search for “Tableau Percentile Rank”, and quickly spotted that a Table Calculation exists. As you can guess, I don’t use this particular calculation very much at all 🙂

So to colour, add Hospital Beds , and simply add a Percentile table calculation

How to only show county borders of counties within selected range

When working with Maps, you can use the options within the Map Layers menu to select which features of the map should be displayed. One of these options is State/Province Borders, which you might think is what is needed.

However, this will show all the state borders, which is evident if you zoom out a bit, including those for states that don’t have counties within n miles.

This isn’t the requirement – we only want to show the borders of the states which have counties within the desired range. So instead, we don’t want to show any State/Province Borders via the Map Layers. And we’ll utilise a dual axis instead, by first identifying the states that are within n miles

States within n miles

If [Within n miles] THEN [State] END

Duplicate the Latitude (generated) field on the Rows, remove all the fields on that mark, and then add the States within n miles to the Detail shelf. Set the colour of this mark to white, and swap the Latitude (generated) fields if needed, so the States are on the top, as below.

Then make dual axis 🙂

And that’s the key points that I hope will help you solve this challenge. My published version is here, which is available to be downloaded. If there’s anything I haven’t covered that you’re not sure about, feel free to contact me.

Happy vizzin’! Stay safe!

Donna