Visualising Ranges

This week’s challenge was set by Sean and was a recreation of a PowerBI WOW challenge set a few weeks ago. The data set just contains 2 pieces of information – a datetime field and a glucose measurement.

Creating the calculations

On a new sheet, add Device Timestamp to Rows as a discrete (blue) pill at the hour level.

Then create the following fields, which are aggregations of the glucose levels at each hourly time period.

Minimum

MIN([Glucose mg/dL])

5th Percentile

PERCENTILE([Glucose mg/dL],0.05)

25th Percentile

PERCENTILE([Glucose mg/dL],0.25)

Median

PERCENTILE([Glucose mg/dL],0.5)

75th percentile

PERCENTILE([Glucose mg/dL],0.75)

95th percentile

PERCENTILE([Glucose mg/dL],0.95)

Maximum

MAX([Glucose mg/dL])

Add all these fields to the table and then format the Hour field to use the 12hr format by default

Building the Viz

On a new sheet, add Device Timestamp as a continuous (green) pill at the hour level to Columns. Then add Median to Rows. Set the mark type to line and set the path to be a dashed line and colour  black. Format the Device Timestamp pill so the default format on the pane is 12hr format (axis should remain as numeric).

Add Minimum to Rows to create a second marks card. Change the mark type to Area and set the colour to be 100% opacity.

Then drag 5th Percentile to the Minimum axis and drop when the ‘2 green column’ symbol appears.

This will automatically add Measure Values to Rows instead and Measure Names to Colour. Add 25th Percentile, 75th Percentile, 95th Percentile and Maximum to the axis too. Reorder the fields in the Measure Values section so the fields are listed in the appropriate order from Min to Max. Adjust the Colours accordingly. Unstack the marks (Analysis -> Stack Marks -> Off).

Add Measure Names to Label and align middle right. Allow labels to overlap marks.

Add Measure Names to the Label on the Median marks card. Also align middle right and format to be underlined.

Make the chart dual axis, synchronise the axis, and then right click on the right hand axis and move marks to back.

Hide the right hand axis (uncheck show header). Edit the Device Timestamp axis to fix from 0 to 26, set the axis to display a value every 3 hours and remove the title.

Add constant Reference Lines to the left hand axis for the values 80 and 180

Edit the left hand axis title. Remove row and column dividers. Add axis rulers. Add a title.

Add to a dashboard and then publish. My published version is here.

Note – the vertical grid lines that appear on the Tableau Public version are a nuance of Tableau Public. I originally recreated on Desktop thinking it was a requirement of the challenge, by adding multiple constant reference lines for each hour. But when I checked Sean’s solution after publishing, I found he didn’t have any and the intention was not to have any.

Happy vizzin’!

Donna

Can you compare salaries to the median?

For the first WOW challenge of 2024, I set the task of completing a viz related to HR salary data.

Creating the calculations

To start, we’ll build out all the necessary calculations required, and display them in tabular form. First create

Employee

[Forename] + ‘ ‘ + [Surname]

then add Employee ID and Employee to Rows, along with Department and Job Title. Add Pay Level as a discrete dimension (blue pill) to Rows too, then format Salary to £ with 0 dp, and add to Text.

We want to compare each employee’s salary with the median salary of all the employees in the same pay level. For this we need

Median Salary per Pay Level

{FIXED [Pay Level]: MEDIAN([Salary])}

Add this to the view too. You should see that the value is the same for those rows where the Pay Level is the same.

We need to compute the difference between these numbers (to draw a line from the median to the salary)

Difference From Median

SUM([Salary])- SUM([Median Salary per Pay Level])

and we also need to understand the salary as a proportion of the median

Salary / Median %

SUM([Salary])/SUM([Median Salary per Pay Level])

format this to % with 0 dp and add both these fields to the table

Each employee is given a scale value based on the proportion

Salary Scale

IF [Salary / Median %] > 1.2 THEN 6
ELSEIF [Salary / Median %] >1.1 AND [Salary / Median %] <=1.2 THEN 5

ELSEIF [Salary / Median %] >1 AND [Salary / Median %] <= 1.1 THEN 4

ELSEIF [Salary / Median %] > 0.9 AND [Salary / Median %] <= 1 THEN 3

ELSEIF [Salary / Median %] > 0.8 AND [Salary / Median %] <=0.9 THEN 2
ELSE 1
END

Set this to be discrete, then add to the view on Rows.

Finally we need the markers for the 80%, 90% etc of the median salary

80% Median

0.8 * [Median Salary per Pay Level]

90% Median

0.9 * [Median Salary per Pay Level]

110% Median

1.1 * [Median Salary per Pay Level]

120% Median

1.2 * [Median Salary per Pay Level]

Building the Viz

On a new sheet, add Employee ID, Employee, Department and Job Title to Rows. Add Salary to Columns and set the mark type to Circle. Set the Colour of the circle to suit (or use #64cdcc).

Add Median Salary per Pay Level, 80% Median, 90% Median, 100% Median, 120% Median to the Detail shelf. Add Pay Level, Salary Scale and Salary / Median % to Tooltip.

Format the axis (right click) and set the scale to be at £k with 0dp

This will also change the formatting of the salary on the tooltip, but we want it to be more detailed so create

Tooltip-Salary

[Salary]

and format this to £ with 0 dp. Add this to Tooltip too and adjust tooltip to match.

Add a reference line to the axis (right click axis > add reference line) for the Median Salary per Pay Level. Set it to be per cell and display a dashed line with no labels/tooltips displaying.

Add another reference line. This time set it to be a band at the cell level which ranges from the 80% Median to the 90% Median. Don’t display any labels or tooltips or lines and fill the band with the appropriate colour.

Repeat the above process, adding a reference band from 90% – 110% of the median, and a further band from 110% – 120% of the median.

Add another instance of Salary to Columns. Change the mark type of the 2nd Salary marks card to gantt and adjust the colour. Add Difference From Median to Size and decrease the size as small as possible. Double click into the Difference From Median pill on the size shelf and manually type in * -1 to the end.

Now make the chart dual axis and synchronise the axis. Right click on the top axis and move marks to back.

We want the user to be able to filter the chart, so add Department, Pay Level (select All values) and Salary Scale to the Filter shelf.

Finally, format the chart – make each row a bit wider; set the background colour of the worksheet to #f9f8f7; remove column dividers; set row dividers to a thick white line; remove all gridlines, zero lines and axis rulers. Reduce the size of the axis values and increase the width of the initial 3 columns so the text doesn’t wrap as much. Test your filters.

If all working, then add the sheet to a dashboard.

Building the Legend

To build the legend, I simply duplicated the core sheet, then filtered to a specific Employee ID and hid the Employee ID, Employee, Department and Job Title fields. I then edited the reference lines to add custom labels to label each line band, formatting the text to display to the top or bottom. I added an annotation to the Salary circle mark to label that as ‘Employee Salary’ which I then manually moved into position.

When I added this to the dashboard, I then floated a blank object over the top so the legend could not be interacted with.

My published viz is here.

Happy vizzin’!

Donna

Can you create a normalised jitter plot?

Continuing the theme of alternative chart types, Kyle decided to challenge us to recreate this jitter plot inspired by an example from The Big Book of Dashboards.

I’ve built jitter plots in the past for #WorkoutWednesday challenges (the hidden RANDOM() function is your friend in this), but I wanted to see how far I could get without having to peak at my previous solutions.

So I connected to the baseball data provided, and cracked on, building the jitter in a single sheet using Measure Names on the columns. But then I got stuck when it came to labelling the tooltips…. surely this didn’t need a sheet per measure did it…

…maybe it did… so I proceeded to recreate as 4 separate sheets, and felt quite smug that I’d managed to make use of the ‘little used’ worksheet caption to provide the summary detail at the bottom of each measure. When I’d finished, I checked Kyle’s solution, as the summary values for the SLG & OPS measures seemed to be mixed up…. and what did I find…. he had managed to build the jitter within a single sheet as he had pivoted the data first! Argggghhhh! It just hadn’t crossed my mind, and Kyle had chosen not to drop that hint in the requirements…. hey ho! c’est la vie! I may well recreate with a pivoted version at a later date, but for now, I am blogging what I did…

My solution has ended up with a lot of calculated fields as a result, as equivalent fields needed to be created for every measure. Most of this was managed via duplicating and editing existing fields, so it actually wasn’t too onerous.

Building the calculated fields

We’ll start as usual by building out the fields required, and will focus on the BA measure initially.

Add Name and BA into a tabular view and Sort descending. Format the BA measure to be a number with 3 decimal places.

We need to know the rank of each player. Create a calculated field

Rank BA

RANK(SUM([BA]))

Add this to the view, and we should get the player rankings displayed from 1 downwards.

Now the requirement wants us to normalise the measures so they can be displayed on the same axis (or in my case, since it’s not a single chart I’m building), within the same axis range.

What this means is we want to plot the measure on a scale between 0 and 1 where 0 represents the lowest measure value, and 1 the highest. for this we need

Min BA

{FIXED :MIN([BA])}

and

Max BA

{FIXED :MAX([BA])}

The normalised value then becomes

Normalise BA

([BA] – [Min BA])/([Max BA]-[Min BA])

The difference between the current value and the lowest value, as a proportion of the range (ie the difference between the highest and lowest values).

Adding this to the table, you should see that the Normalise BA value for the highest ranked player is 1 and that for the lowest ranked is 0.

As part of the information displayed, we also need to know the percentile each player is in.

Percentile BA

RANK_PERCENTILE(SUM([BA]))

Format this to a percentage with 0 dp and add into the table.

Next we need to identify the player selected, so we’re going to create a parameter based off of the Name.

Select a Player

Right click Name -> create -> Parameter. This will open the parameter dialog and auto populate the list of options with the values from the Name field. Default the parameter to Julio Rodriguez.

We can then create a field to identify if the player is the one selected

Is Selected Player?

[Name] = [Select a Player]

Add this into the table, on the Rows before Name and sort so True is listed at the top (just easier to check the results).

So now we need to identify the rank and percentile of the selected player only

Selected Player BA Rank

WINDOW_MAX(IF ATTR([Is Selected Player]) THEN [Rank BA] END)

format this to a number with 0 dp

Selected Player BA Percentile

WINDOW_MAX(IF ATTR([Is Selected Player]) THEN [Percentile BA] END)

format this to a percentage with 0 dp.

The window_max function has the effect of ‘spreading’ the result over all the rows.

Finally we need to get a count of all the players

Count Players

{FIXED:COUNTD([Name])}

format this to a number with 0 dp.

Building the Jitter Plot

To build a jitter plot, we need to plot each mark against 2 axes. The Normalise BA measure is one axis, but we need to create ‘something’ for the other. This is the value to make the ‘jitter’ which is essentially an arbitrary value between 0 and 1 that we can plot the mark against, and means the marks don’t all end up in a single line on top of each other, and we can get a better ‘feel’ for the volume of data being represented.

Jitter

RANDOM()

The random() function is a ‘hidden’ function that will, as it’s name suggests, generate a random number. It is ‘hidden’ as it only works with some data sources. Excel for example is fine, but if you were connected to a Snowflake database, you can’t use it.

The nature of random, also means that you can’t guarantee the value it produces, and it will regenerate on data refresh, so if you’re looking to compare your solution directly, your dots will not be positioned exactly the same.

On a new sheet add Jitter to Columns and Normalise BA to Rows. Add Name to Detail and change the mark type to Circle.

Add Is Selected Player to Colour, adjust accordingly and add a border to the circle. I dropped the opacity to 70%. Order the colour legend, so True is listed first, to ensure this circle is always ‘on top’.

Then add Is Selected Player to Size. Edit the sizes so they are reversed and adjust the sizes until you’re happy.

To label just the selected player mark

Label:BA

IF [Is Selected Player] THEN [BA] END

format this with a custom number font ,##.000;-#,##.000

Add this to the Label shelf and adjust the font colour, and align centrally

Add Rank BA and BA to the Tooltip shelf and adjust tooltip to suit. You will need to adjust the table calculation setting of the Rank BA field so that it is computing by all the fields.

Add Selected Player BA Rank and Selected Player BA Percentile and Count Players to the Detail shelf. Adjust the table calculations as above (including any nested calcs), then show the worksheet caption (Worksheet -> Show Caption), and edit the caption to display the relevant text.

From the analytics pane, drag the Median with Quartiles option onto the canvas and drop it on the table / Normalise BA axis option. Remove the quartile reference lines (right cick axis -> remove reference line), and edit the median reference line to be a dashed line with no label.

Finally remove all gridlines/dividers/axes lines and hide the axes. Title the sheet as per the measure ie BA, and align centrally.

Format the Caption and the Title to have a light grey background and a slightly darker thin border.

Now, repeat all that for the other measures 🙂 This isn’t that bad. All the fields above labelled BA, need duplicating, renaming and updated to reference the next measure eg OBP.

Once done, duplicate the BA jitter plot sheet, and replace all the ‘BA’ related fields with the equivalent ones, by dragging the equivalent field and dropping it directly on top. Sense check the table calculation settings are all ok. You may need to update the text in the caption, as that seems to lose anything to do with the table calculation fields referenced when they get touched.

Ultimately you should end up with 4 sheets.

Putting it all together

On a dashboard, use a horizontal container to position all 4 sheets in side by side. Show the worksheet caption for each sheet. Reduce the outer padding for each sheet to 0, and add a thin border around each sheet.

Add a parameter action to drive the interactivity ‘on click’ of a circle

Select Player

On select of any of the source sheets, update the Select a Player parameter with the value from the Name field. Retain the selected value on ‘unclick’

To prevent the selection on click from being ‘highlighted’, and al the other marks ‘fading’, we need one final step.

Create new calculated fields

True

TRUE

False

FALSE

Add both these fields to the Detail shelf of each of the 4 sheets.

Then add a dashboard filter action for each sheet which on select, goes from the sheet on the dashboard to the worksheet itself, passing the selected fields of True = False. Show all values when unselected.

My published viz is here. Kyle’s solution with a lot less calculated fields, and only 2 sheets (1 for the jitter and 1 for the summary section at the bottom) is here. You will need to pivot the data via the data source pane first through 🙂 Next time, when I really feel something should be able to be done in 1 sheet, I’ll try to think a little longer…. upshot though, I impressed myself at the use of the caption for the summary – something I must consider using more often!

Happy vizzin’!

Donna

Let’s build a trellis chart!

It only seems like yesterday I was writing a solution guide, and I’m back at it again. This week Sean asked us to recreate this challenge to build a small multiple / trellis chart using table calculations only.

A note on the data

After downloading and connecting to the provided data source, I found the dates weren’t coming through as intended – they’d been transposed from dd/mm/yyyy to a mm/dd/yyyy so consequently the only dates I was getting were for the first 12 days in January for every year. Rather than trying to solve this at source, I just created a new field which transposed the Date field back so it behaved as I expected

Date Corrected

(MAKEDATE(YEAR([Date]), DAY([Date]),MONTH([Date])))

You may not need to do this if the data pulls in correctly.

Filters

There are two filters that should be applied, which can either be added as data source filters (right click data source > Edit data source filters) or can be applied to the Filter shelf on any sheets created. Ultimately, this challenge only requires 1 sheet, but when building and verifying logic, I tend to have additional ‘check sheets’. I therefore added the filters below to the Filter shelf of the first sheet I started working with, but set them to apply to all worksheets using the data source (right click pill once it’s on the filter shelf -> Apply to Worksheets -> All using this data source).

Gender : All

Date Corrected : starting date = 01 Jan 2012

Setting up the data

As is good practice when working with table calculations, I start by building out the calculations I need and validating them in a tabular format before I build any vizzes. So let’s do that.

All the countries are displayed in capital letters, so we need

Country UPPER

UPPER([Country Name])

Add this to Rows

Additionally, for the purpose of validation and performance only, add this field to the Filter shelf too and just filter to Australia and Austria.

If you haven’t already added them as data source filters, apply the filters mentioned in the section above to this sheet too and set to apply to all worksheets using the data source.

Add Date Corrected to Rows as a discrete (blue pill) exact date. Format the date so it displays in month year format.

Add Unemployment Rate to Text. Format this number to 1 decimal place and add a % as a suffix.

Now for the table calcs

Median

WINDOW_MEDIAN(SUM([Unemployment Rate]))

Format this to display as a % using the same option as above. Add this to the table and set to compute using Date Corrected

You should find that your median value only differs by country.

Now we work out

Variance

SUM([Unemployment Rate]) – [Median]

Format this to display as a % and add to the table, setting the table calc to compute by Date Corrected again. This is the measure that will be used to plot the trend line against.

We also need to display the range of Unemployment rates for each country – ie we need to work out the minimum and maximum values.

Max Unemployment Rate

WINDOW_MAX(MAX([Unemployment Rate]))

Min Unemployment Rate

WINDOW_MIN(MIN([Unemployment Rate]))

Format both of these to display as 5 with 1dp, and add to the table, verifying the calculations are computing by Date Corrected once again. Verify you get the same values for all the rows associated to a single country.

Now we know the calculations are as expected, we can start to build out the viz.

Building the core chart

To start with we’ll just focus on getting the line chart with the associated text displayed for the two countries Australia & Austria. So on a new sheet add Country (UPPER) to Filter and filter to these selections. The other filters should automatically add.

Add Country UPPER and Date Corrected (green continuous exact date) to Columns and Variance to Rows. Set Variance to Compute By Date Corrected.

Add Unemployment Rate to the Tooltip and adjust the text to match.

To add the country title and other displayed text, we’re going to use a ‘fake axis’ and plot a mark at a central date. On hovering over the solution, October 2016 seems to be the appropriate date selected. So we need

Title Position to Plot

IF [Date Corrected] = #2016-10-01# THEN 1 END

Add this to Rows in front of the existing pill. Change the mark type of this measure only to a circle and re-Size to make it as small as possible and adjust the Colour Opacity to 0%. This will make the mark ‘disappear’.

Add Country UPPER, Median, Max Unemployment Rate and Min Unemployment Rate to the Label shelf of this marks card. Ensure all the table calculation fields are set to compute by Date Corrected. Adjust the text as required, and align centre. Ensure the Tooltip is blank for this marks card.

Change the colour of the variance line to grey, then remove all gridlines, row dividers and axis. Set the Column Dividers to be a thick white line (this will help provide a separator between the small multiples later).

Creating the trellis

There are multiple blog posts about creating trellis charts. My go to post has always been this one by Chris Love. It’s a more complex solution that dynamically flexes the number of rows and columns based on the number of members in the dimension you’re visualising. There have also been other Workout Wednesday challenges involving trellis charts, which I’ve blogged about too (see here).

Ultimately we’re aiming to determine a ‘grid position’ for each member of our dimension. In this case the dimension is Country UPPER and its a static list of 36 values, which we can display in a 6 x 6 grid. So Australia needs to be in row 1 column 1, Austria in row 1 column 2….. Costa Rica in row 2 column 1… USA in row 6 column.

As our members are static, the calculations we can use for this can be a bit simpler than those in Chris’ blog.

Firstly, let’s get our data in a tabular layout so we can ‘see’ the values as we go.

Duplicate the data sheet we built up, then move Measure Name and Date Corrected from Rows/Columns to the Detail shelf. Remove the Country UPPER field from the Filter shelf. You should have something like below, showing 1 row per country

Double click into the Rows shelf and type in INDEX(), then change the resulting pill to discrete (blue). You will see that index numbers every row. It’s a table calculation and although working as expected, let’s explicitly set it to compute using County UPPER.

Let’s now create our grid position values.

Cols

FLOAT(INT((INDEX()-1)%6))

This takes the Index value and subtracts 1, and returns the remainder when divided by 6 (%6=modulus of 6 – ie 6%6=0, 7%6=1). 6 is the number of columns we want.

Rows

FLOAT(INT((INDEX()-1)/6))

This takes the Index value and subtracts 1, and returns the integer part of the value when divided by 6. Again we’re using 6 as this is the number of rows we want to display.

Add these to the table, set to be discrete (blue) and compute using Country UPPER.

You can see that the first 6 countries are all in the same row (row 0) but different columns (0-5).

Now that’s understood, we can create the small multiples on the viz.

Duplicate the sheet we created further above which displays the trend graph for Australia & Austria. As we’re now going to make the changes to create the charts for every country, if things go a bit screwy, you can always get back to this one to try again :-).

Add Cols to Columns. Set to discrete and compute using Country UPPER. Add Rows to Rows and do the same thing. Move Country UPPER from Columns to the Detail shelf on the All marks card. Then remove Country UPPER from the Filter shelf.

Hopefully everything worked as expected and you have

Final step is to uncheck Show Header against the Cols and Rows pills so they don’t display and you can add to a dashboard.

My published viz is here.

Happy vizzin’!

Donna

Can you create a jittered box plot?

It was Kyle’s turn to set this jittered box plot challenge this week. While it may sound complicated, this is quite a straightforward challenge this week, made more so as Kyle very kindly provides references to other blogs which help you.

So let’s build…

Firstly you need to download the 2 excel files Kyle provides, then relate them via the Team field (this should happen automatically).

On a new sheet add Team to Columns, Age to Rows and Player Code to Detail. Change the mark type to circle, and reduce the Size slightly.

From the Analytics pane, drag box plot onto the sheet and drop onto the Cell image that displays.

Create a new field which is the key field for the jittering functionality

Jitter

RANDOM()

This just generates a random number between 0 and 1.

Add this to Columns and change the view to Entire View so its not all squashed up.

We’ve got our jittered box plot. Now we just have to add in the additional functionality required.

Drag the Playoffs field so that it’s above the line in the data pane on the right hand side (ie change it from a measure to a dimension). Then right click > Aliases to alias to 0 and 1 values to the labels required

Add this field to the Columns in front of the Team field. Re-order so ‘Playoff Teams’ is listed first (I just click on the field name and drag it to the left).

We also need to sort the data based on the median value per team. We need a new field for this.

Median Age per Team

{FIXED [Team]:MEDIAN([Age])}

Add a sort to the Team field, so it sorts by the Median Age per Team descending

Add League to the Filter shelf and set to AL.

Format the worksheet and set the Column Banding to be level 0 and band size 1 to shade the sections as required.

Then adjust the format of the gridlines to remove all row and column gridlines.

Finally, add Name to the Tooltip and adjust accordingly. Then hide the Jitter axes (uncheck Show Header) and adjust the Age axis so it is fixed to start at 18.

You can now add this to a dashboard and you’re done! My published viz is here.

Happy vizzin!

Donna

Adding Detail & Context

It was Sean Miller’s turn to expand on last week’s #WOW2022 challenge, by adding an additional viz to the scatter plot (challenge details here).

The assumption is you should be able to build on the challenge solution you have built in the previous week. My solution to the Reference Box challenge is here. I adopted (and blogged about) a table calc solution. However, the published solution used LoDs. Both methods achieved the desired result, but I decided when starting this challenge, that I would build this ‘extension’ using LoDs too, so anyone who uses the published solution as a starting point, gets help via this blog.

So if you used my previous blog to build the challenge, you’ll need to first create the LoD equivalent of the calculated fields we used (note I did not change the scatter plot viz to use these fields) :

PR- 25th Percentile LoD

{PERCENTILE({FIXED [State]: SUM([Profit])/SUM([Sales]) },0.25)}

PR-75th Percentile LoD

{PERCENTILE({FIXED [State]: SUM([Profit])/SUM([Sales]) },0.75)}

Sales- 25th Percentile LoD

{PERCENTILE({ FIXED [State]: SUM([Sales]) },0.25)}

Sales- 75th Percentile LoD

{PERCENTILE({ FIXED [State]: SUM([Sales]) },0.75)}

Note – It’s also worth reiterating at this point, that I will be referencing calculated fields/parameters/objects in this blog created as part of my initial challenge.

In order to build the bar chart, we need to categorise each State into a grouping; the selected/highlighted state, the states in the reference box, all other states.

To identify the states in the reference box, we need to use FIXED LoDs to get the value of the Sales and Profit Ratio at the State level.

Sales by State

{FIXED [State]: SUM([Sales])}

PR by State

{FIXED [State]: SUM([Profit])/SUM([Sales])}

We can then use these fields along with the LoDs further above to determine whether a State is in the reference box

In Reference Box?

[PR by State]>=[PR-25th Percentile LOD] AND
[PR by State]<= [PR-75th Percentile LOD] AND [Sales by State]>=[Sales – 25th Percentile LOD] AND
[Sales by State]<= [Sales – 75th Percentile LOD]

This simply returns a boolean.

Now we can categorise each State

State to Display

IF [Selected State] THEN [State]
ELSEIF [In Reference Box?] THEN [State]
ELSE ‘Other (median)’
END

And with the above, we can then define the measures we want to show

Sales to Display

{FIXED [State To Display]: MEDIAN([Sales by State])}

PR to Display

{FIXED [State To Display]: MEDIAN([PR by State])}

And with these fields we can now build the bar chart.

  • Add Selected State to Rows and drag the dimension value, so True is listed before False.
  • Add In Reference Box? to Rows, and again drag so the True is listed before False.
  • Add State to Display to Rows
  • Add Sales to Display to Columns and Sort descending
  • Add PR to Display to Columns
  • On the All Marks Card add Selected State to Colour
  • Then add In Reference Box? to the Detail shelf. Then click on the … icon to the left of the In Reference Box? pill on the marks card, and change to the Colour icon. This should result in 2 fields on the Colour shelf.
  • Adjust the colours accordingly.
  • Add Sales to Display and PR to Display to the Tooltip shelf and adjust.
  • Change the titles of the axis
  • Remove row banding
  • Uncheck Show Header against Selected State and In Reference Band?
  • Hide field labels for Rows against the State to Display column heading

Add this sheet onto the dashboard and you’re done 🙂

My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Can you build a reference box?

Erica Hughes set her first #WOW challenge this week, which is all about reference lines. The intention by the #WOW crew is to use this initial challenge as the basis for the next few weeks, so it’s going to be interesting to see how that works out.

But back to this week. We’re building a basic Sales vs Profit Ratio scatter plot by State, but then need various calculated fields to plot the reference lines and the reference box.

  • Building the basic chart
  • Adding the dotted reference lines
  • Adding the reference box
  • Making the y-axis symmetrical
  • Highlighting the selected state

Building the basic chart

First up we need to create our favourite calculated field

Profit Ratio

SUM([Profit]) / SUM( [Sales])

format to percentage with 0 dp.

Create the basic scatter plot as below

Remove all gridlines, zero lines and axis rulers and tick marks.

Adding the dotted reference lines

These lines are the median values for the Profit Ratio and Sales.

Sales – Median

WINDOW_MEDIAN(SUM([Sales]))

format to Custom Currency with 1 dp, $ prefix and display units to Thousands (K).

PR- Median

WINDOW_MEDIAN([Profit Ratio])

format to percentage with 0 dp.

Add both these fields to the Detail shelf, and adjust the table calculation settings of both fields so they are computing by State

Add a reference line to the Sales axis (I do this by right clicking on the axis > Add Reference Line, but you can also drag from the Analytics pane).

Set the reference line to be against the Entire Table, using the Sales – Median field, no label, and the tooltip set to custom as below. Set the line to be dotted.

Apply a similar reference line to the Profit Ratio axis, using the PR – Median value instead.

Adding the reference box

I have to admit, this bit took a bit of trial and error. I knew it was going to involve a combination of bands and lines and colouring above and below, but things didn’t always go to plan. It was this post by Jonathan Allenby that helped.

The block is basically bounded by the 25th & 75th percentile of the Sales and Profit Ratio values. So lets create them

Sales – 25th Percentile

WINDOW_PERCENTILE(SUM([Sales]),0.25)

Sales – 75th Percentile

WINDOW_PERCENTILE(SUM([Sales]),0.75)

Add these fields to the Detail shelf and set the table calculation on both to compute by State.

Add a reference line again to the Sales axis, and this time select Band that goes from the Sales – 25th Percentile to the Sales – 75th Percentile and fill based on the colour stated in the requirements.

Now add a reference line to the Profit Ratio axis. This time select Distribution.

Select the computation to be Percentiles and enter 25 as the value. Set Label, Tooltip and Line to None. Now this is the odd bit… set the Fill to the colour white, then check the Fill Below box, which then seems to set the Fill option to ‘No Fill’. Doing this seems to have the desired effect. Trying this in any other order, or manually selecting the Fill option to be No Fill, just doesn’t seem to work…

Now repeat the process adding another reference line on the Profit Ratio axis, and this time create a distribution for the 75th percentile. This time, after setting the Fill to white, check the Fill Above checkbox. Weirdly this time, the fill on my laptop got set the Grey Light, although it was correctly displaying as white on the screen. I manually changed it to ‘No Fill’.

It feels like there’s something a bit ‘buggy’ with all this, which might explain while my initial attempts were failing. I’ll be interested in knowing if you see this behaviour too (I created on v 2021.4.2).

Making the y-axis symmetrical

This was a ‘bonus’ feature, but is again achieved with a reference line. What we’re looking for here is to understand what is the maximum absolute value so we can determine where to place a hidden reference line – ie if the maximum profit ratio is 20% and the minimum is -30%, the maximum absolute value is 30%, and we’d need a reference line +30%, so the axis extends from -30% to +30%. Conversely if the maximum profit ratio is 40% and the minimum is -25%, the maximum absolute value is 40% and we need a reference line at -40% for symmetry.

I’ve encapsulated all this logic within one field below

Ref Line – Profit Ratio Symmetry

IF MAX(WINDOW_MAX([Profit Ratio]), ABS(WINDOW_MIN([Profit Ratio]))) = WINDOW_MAX([Profit Ratio]) THEN -1*WINDOW_MAX([Profit Ratio]) ELSE -1*WINDOW_MIN([Profit Ratio])
END

The statement MAX(WINDOW_MAX([Profit Ratio]), ABS(WINDOW_MIN([Profit Ratio]))) is returning the maximum of the max and min values (ie MAX(20, ABS(-30)) = MAX(20,30) = 30, MAX(40, ABS(-25)) = MAX(40,25) = 40.

Add this onto the Detail shelf and add another reference line to the Profit Ratio axis, setting it to use the Minimum of the Ref Line – Profit Ratio Symmetry field

Test how the field works, by selecting a few of the State marks towards the top and exclude; the axis should change, but still be symmetrical.

Highlighting the selected state

For this we need a parameter

pSelectedState

string parameter defaulted to Virginia

Then we need a field to identify which state has been captured

Selected State

[State]=[pSelectedState]

which returns a boolean. Change the mark type to Circle and add Selected State to the Colour shelf, and adjust accordingly (add a border to the circle too via the Colour shelf). Also add Selected State to the Size shelf and again tweak.

Once the sheet has been added to the dashboard, create a dashboard parameter action that passes the State field into the pSelectedState parameter on Select of the chart.

The final step to add is to stop all the other marks from ‘fading’ out when a State mark is selected. This is achieved by creating the following calculated fields

True

True

False

False

Add both of these to the Detail shelf of the scatter chart.

On the dashboard, create a dashboard filter action as below, which passes selected fields setting True = False, which can never be true and prevents the mark from being highlighted.

And with that, you should have the desired dashboard. I’m interested to see if this matches Erica’s solution (it’s likely I’ll start with the provided workbook next week, rather than mine, just in case there are some discrepancies – eg I’ve used a lot of table calcs… LoDs may have been possible…)

My published viz is here.

Happy vizzin’! Stay Safe!

Donna