This week’s challenge was set by Sean and was a recreation of a PowerBI WOW challenge set a few weeks ago. The data set just contains 2 pieces of information – a datetime field and a glucose measurement.
Creating the calculations
On a new sheet, add Device Timestamp to Rows as a discrete (blue) pill at the hour level.
Then create the following fields, which are aggregations of the glucose levels at each hourly time period.
Minimum
MIN([Glucose mg/dL])
5th Percentile
PERCENTILE([Glucose mg/dL],0.05)
25th Percentile
PERCENTILE([Glucose mg/dL],0.25)
Median
PERCENTILE([Glucose mg/dL],0.5)
75th percentile
PERCENTILE([Glucose mg/dL],0.75)
95th percentile
PERCENTILE([Glucose mg/dL],0.95)
Maximum
MAX([Glucose mg/dL])
Add all these fields to the table and then format the Hour field to use the 12hr format by default
Building the Viz
On a new sheet, add Device Timestamp as a continuous (green) pill at the hour level to Columns. Then add Median to Rows. Set the mark type to line and set the path to be a dashed line and colour black. Format the Device Timestamp pill so the default format on the pane is 12hr format (axis should remain as numeric).
Add Minimum to Rows to create a second marks card. Change the mark type to Area and set the colour to be 100% opacity.
Then drag 5th Percentile to the Minimum axis and drop when the ‘2 green column’ symbol appears.
This will automatically add Measure Values to Rows instead and Measure Names to Colour. Add 25th Percentile, 75th Percentile, 95th Percentile and Maximum to the axis too. Reorder the fields in the Measure Values section so the fields are listed in the appropriate order from Min to Max. Adjust the Colours accordingly. Unstack the marks (Analysis -> Stack Marks -> Off).
Add Measure Names to Label and align middle right. Allow labels to overlap marks.
Add Measure Names to the Label on the Median marks card. Also align middle right and format to be underlined.
Make the chart dual axis, synchronise the axis, and then right click on the right hand axis and move marks to back.
Hide the right hand axis (uncheck show header). Edit the Device Timestamp axis to fix from 0 to 26, set the axis to display a value every 3 hours and remove the title.
Add constant Reference Lines to the left hand axis for the values 80 and 180
Edit the left hand axis title. Remove row and column dividers. Add axis rulers. Add a title.
Add to a dashboard and then publish. My published version is here.
Note – the vertical grid lines that appear on the Tableau Public version are a nuance of Tableau Public. I originally recreated on Desktop thinking it was a requirement of the challenge, by adding multiple constant reference lines for each hour. But when I checked Sean’s solution after publishing, I found he didn’t have any and the intention was not to have any.
For the first WOW challenge of 2024, I set the task of completing a viz related to HR salary data.
Creating the calculations
To start, we’ll build out all the necessary calculations required, and display them in tabular form. First create
Employee
[Forename] + ‘ ‘ + [Surname]
then add Employee ID and Employee to Rows, along with Department and Job Title. Add Pay Level as a discrete dimension (blue pill) to Rows too, then format Salary to £ with 0 dp, and add to Text.
We want to compare each employee’s salary with the median salary of all the employees in the same pay level. For this we need
Median Salary per Pay Level
{FIXED [Pay Level]: MEDIAN([Salary])}
Add this to the view too. You should see that the value is the same for those rows where the Pay Level is the same.
We need to compute the difference between these numbers (to draw a line from the median to the salary)
Difference From Median
SUM([Salary])- SUM([Median Salary per Pay Level])
and we also need to understand the salary as a proportion of the median
Salary / Median %
SUM([Salary])/SUM([Median Salary per Pay Level])
format this to % with 0 dp and add both these fields to the table
Each employee is given a scale value based on the proportion
Salary Scale
IF [Salary / Median %] > 1.2 THEN 6 ELSEIF [Salary / Median %] >1.1 AND [Salary / Median %] <=1.2 THEN 5
ELSEIF [Salary / Median %] >1 AND [Salary / Median %] <= 1.1 THEN 4
ELSEIF [Salary / Median %] > 0.9 AND [Salary / Median %] <= 1 THEN 3
ELSEIF [Salary / Median %] > 0.8 AND [Salary / Median %] <=0.9 THEN 2 ELSE 1 END
Set this to be discrete, then add to the view on Rows.
Finally we need the markers for the 80%, 90% etc of the median salary
80% Median
0.8 * [Median Salary per Pay Level]
90% Median
0.9 * [Median Salary per Pay Level]
110% Median
1.1 * [Median Salary per Pay Level]
120% Median
1.2 * [Median Salary per Pay Level]
Building the Viz
On a new sheet, add Employee ID, Employee, Department and Job Title to Rows. Add Salary to Columns and set the mark type to Circle. Set the Colour of the circle to suit (or use #64cdcc).
Add Median Salary per Pay Level, 80% Median, 90% Median, 100% Median, 120% Median to the Detail shelf. Add Pay Level, Salary Scale and Salary / Median % to Tooltip.
Format the axis (right click) and set the scale to be at £k with 0dp
This will also change the formatting of the salary on the tooltip, but we want it to be more detailed so create
Tooltip-Salary
[Salary]
and format this to £ with 0 dp. Add this to Tooltip too and adjust tooltip to match.
Add a reference line to the axis (right click axis > add reference line) for the Median Salary per Pay Level. Set it to be per cell and display a dashed line with no labels/tooltips displaying.
Add another reference line. This time set it to be a band at the cell level which ranges from the 80% Median to the 90% Median. Don’t display any labels or tooltips or lines and fill the band with the appropriate colour.
Repeat the above process, adding a reference band from 90% – 110% of the median, and a further band from 110% – 120% of the median.
Add another instance of Salary to Columns. Change the mark type of the 2nd Salary marks card to gantt and adjust the colour. Add Difference From Median to Size and decrease the size as small as possible. Double click into the Difference From Median pill on the size shelf and manually type in * -1 to the end.
Now make the chart dual axis and synchronise the axis. Right click on the top axis and move marks to back.
We want the user to be able to filter the chart, so add Department, Pay Level (select All values) and Salary Scale to the Filter shelf.
Finally, format the chart – make each row a bit wider; set the background colour of the worksheet to #f9f8f7; remove column dividers; set row dividers to a thick white line; remove all gridlines, zero lines and axis rulers. Reduce the size of the axis values and increase the width of the initial 3 columns so the text doesn’t wrap as much. Test your filters.
If all working, then add the sheet to a dashboard.
Building the Legend
To build the legend, I simply duplicated the core sheet, then filtered to a specific Employee ID and hid the Employee ID, Employee, Department and Job Title fields. I then edited the reference lines to add custom labels to label each line band, formatting the text to display to the top or bottom. I added an annotation to the Salary circle mark to label that as ‘Employee Salary’ which I then manually moved into position.
When I added this to the dashboard, I then floated a blank object over the top so the legend could not be interacted with.
I’ve built jitter plots in the past for #WorkoutWednesday challenges (the hidden RANDOM() function is your friend in this), but I wanted to see how far I could get without having to peak at my previous solutions.
So I connected to the baseball data provided, and cracked on, building the jitter in a single sheet using Measure Names on the columns. But then I got stuck when it came to labelling the tooltips…. surely this didn’t need a sheet per measure did it…
…maybe it did… so I proceeded to recreate as 4 separate sheets, and felt quite smug that I’d managed to make use of the ‘little used’ worksheet caption to provide the summary detail at the bottom of each measure. When I’d finished, I checked Kyle’s solution, as the summary values for the SLG & OPS measures seemed to be mixed up…. and what did I find…. he had managed to build the jitter within a single sheet as he had pivoted the data first! Argggghhhh! It just hadn’t crossed my mind, and Kyle had chosen not to drop that hint in the requirements…. hey ho! c’est la vie! I may well recreate with a pivoted version at a later date, but for now, I am blogging what I did…
My solution has ended up with a lot of calculated fields as a result, as equivalent fields needed to be created for every measure. Most of this was managed via duplicating and editing existing fields, so it actually wasn’t too onerous.
Building the calculated fields
We’ll start as usual by building out the fields required, and will focus on the BA measure initially.
Add Name and BA into a tabular view and Sort descending. Format the BA measure to be a number with 3 decimal places.
We need to know the rank of each player. Create a calculated field
Rank BA
RANK(SUM([BA]))
Add this to the view, and we should get the player rankings displayed from 1 downwards.
Now the requirement wants us to normalise the measures so they can be displayed on the same axis (or in my case, since it’s not a single chart I’m building), within the same axis range.
What this means is we want to plot the measure on a scale between 0 and 1 where 0 represents the lowest measure value, and 1 the highest. for this we need
Min BA
{FIXED :MIN([BA])}
and
Max BA
{FIXED :MAX([BA])}
The normalised value then becomes
Normalise BA
([BA] – [Min BA])/([Max BA]-[Min BA])
The difference between the current value and the lowest value, as a proportion of the range (ie the difference between the highest and lowest values).
Adding this to the table, you should see that the Normalise BA value for the highest ranked player is 1 and that for the lowest ranked is 0.
As part of the information displayed, we also need to know the percentile each player is in.
Percentile BA
RANK_PERCENTILE(SUM([BA]))
Format this to a percentage with 0 dp and add into the table.
Next we need to identify the player selected, so we’re going to create a parameter based off of the Name.
Select a Player
Right click Name -> create -> Parameter. This will open the parameter dialog and auto populate the list of options with the values from the Name field. Default the parameter to Julio Rodriguez.
We can then create a field to identify if the player is the one selected
Is Selected Player?
[Name] = [Select a Player]
Add this into the table, on the Rows before Name and sort so True is listed at the top (just easier to check the results).
So now we need to identify the rank and percentile of the selected player only
Selected Player BA Rank
WINDOW_MAX(IF ATTR([Is Selected Player]) THEN [Rank BA] END)
format this to a number with 0 dp
Selected Player BA Percentile
WINDOW_MAX(IF ATTR([Is Selected Player]) THEN [Percentile BA] END)
format this to a percentage with 0 dp.
The window_max function has the effect of ‘spreading’ the result over all the rows.
Finally we need to get a count of all the players
Count Players
{FIXED:COUNTD([Name])}
format this to a number with 0 dp.
Building the Jitter Plot
To build a jitter plot, we need to plot each mark against 2 axes. The Normalise BA measure is one axis, but we need to create ‘something’ for the other. This is the value to make the ‘jitter’ which is essentially an arbitrary value between 0 and 1 that we can plot the mark against, and means the marks don’t all end up in a single line on top of each other, and we can get a better ‘feel’ for the volume of data being represented.
Jitter
RANDOM()
The random() function is a ‘hidden’ function that will, as it’s name suggests, generate a random number. It is ‘hidden’ as it only works with some data sources. Excel for example is fine, but if you were connected to a Snowflake database, you can’t use it.
The nature of random, also means that you can’t guarantee the value it produces, and it will regenerate on data refresh, so if you’re looking to compare your solution directly, your dots will not be positioned exactly the same.
On a new sheet add Jitter to Columns and Normalise BA to Rows. Add Name to Detail and change the mark type to Circle.
Add Is Selected Player to Colour, adjust accordingly and add a border to the circle. I dropped the opacity to 70%. Order the colour legend, so True is listed first, to ensure this circle is always ‘on top’.
Then add Is Selected Player to Size. Edit the sizes so they are reversed and adjust the sizes until you’re happy.
To label just the selected player mark
Label:BA
IF [Is Selected Player] THEN [BA] END
format this with a custom number font ,##.000;-#,##.000
Add this to the Label shelf and adjust the font colour, and align centrally
Add Rank BA and BA to the Tooltip shelf and adjust tooltip to suit. You will need to adjust the table calculation setting of the Rank BA field so that it is computing by all the fields.
Add Selected Player BA Rank and Selected Player BA Percentile and Count Players to the Detail shelf. Adjust the table calculations as above (including any nested calcs), then show the worksheet caption (Worksheet -> Show Caption), and edit the caption to display the relevant text.
From the analytics pane, drag the Median with Quartiles option onto the canvas and drop it on the table / Normalise BA axis option. Remove the quartile reference lines (right cick axis -> remove reference line), and edit the median reference line to be a dashed line with no label.
Finally remove all gridlines/dividers/axes lines and hide the axes. Title the sheet as per the measure ie BA, and align centrally.
Format the Caption and the Title to have a light grey background and a slightly darker thin border.
Now, repeat all that for the other measures 🙂 This isn’t that bad. All the fields above labelled BA, need duplicating, renaming and updated to reference the next measure eg OBP.
Once done, duplicate the BA jitter plot sheet, and replace all the ‘BA’ related fields with the equivalent ones, by dragging the equivalent field and dropping it directly on top. Sense check the table calculation settings are all ok. You may need to update the text in the caption, as that seems to lose anything to do with the table calculation fields referenced when they get touched.
Ultimately you should end up with 4 sheets.
Putting it all together
On a dashboard, use a horizontal container to position all 4 sheets in side by side. Show the worksheet caption for each sheet. Reduce the outer padding for each sheet to 0, and add a thin border around each sheet.
Add a parameter action to drive the interactivity ‘on click’ of a circle
Select Player
On select of any of the source sheets, update the Select a Player parameter with the value from the Name field. Retain the selected value on ‘unclick’
To prevent the selection on click from being ‘highlighted’, and al the other marks ‘fading’, we need one final step.
Create new calculated fields
True
TRUE
False
FALSE
Add both these fields to the Detail shelf of each of the 4 sheets.
Then add a dashboard filter action for each sheet which on select, goes from the sheet on the dashboard to the worksheet itself, passing the selected fields of True = False. Show all values when unselected.
My published viz is here. Kyle’s solution with a lot less calculated fields, and only 2 sheets (1 for the jitter and 1 for the summary section at the bottom) is here. You will need to pivot the data via the data source pane first through 🙂 Next time, when I really feel something should be able to be done in 1 sheet, I’ll try to think a little longer…. upshot though, I impressed myself at the use of the caption for the summary – something I must consider using more often!
It only seems like yesterday I was writing a solution guide, and I’m back at it again. This week Sean asked us to recreate this challenge to build a small multiple / trellis chart using table calculations only.
A note on the data
After downloading and connecting to the provided data source, I found the dates weren’t coming through as intended – they’d been transposed from dd/mm/yyyy to a mm/dd/yyyy so consequently the only dates I was getting were for the first 12 days in January for every year. Rather than trying to solve this at source, I just created a new field which transposed the Date field back so it behaved as I expected
You may not need to do this if the data pulls in correctly.
Filters
There are two filters that should be applied, which can either be added as data source filters (right click data source > Edit data source filters) or can be applied to the Filter shelf on any sheets created. Ultimately, this challenge only requires 1 sheet, but when building and verifying logic, I tend to have additional ‘check sheets’. I therefore added the filters below to the Filter shelf of the first sheet I started working with, but set them to apply to all worksheets using the data source (right click pill once it’s on the filter shelf -> Apply to Worksheets -> All using this data source).
Gender : All
Date Corrected : starting date = 01 Jan 2012
Setting up the data
As is good practice when working with table calculations, I start by building out the calculations I need and validating them in a tabular format before I build any vizzes. So let’s do that.
All the countries are displayed in capital letters, so we need
Country UPPER
UPPER([Country Name])
Add this to Rows
Additionally, for the purpose of validation and performance only, add this field to the Filter shelf too and just filter to Australia and Austria.
If you haven’t already added them as data source filters, apply the filters mentioned in the section above to this sheet too and set to apply to all worksheets using the data source.
Add Date Corrected to Rows as a discrete (blue pill) exact date. Format the date so it displays in month year format.
Add Unemployment Rate to Text. Format this number to 1 decimal place and add a % as a suffix.
Now for the table calcs
Median
WINDOW_MEDIAN(SUM([Unemployment Rate]))
Format this to display as a % using the same option as above. Add this to the table and set to compute using Date Corrected
You should find that your median value only differs by country.
Now we work out
Variance
SUM([Unemployment Rate]) – [Median]
Format this to display as a % and add to the table, setting the table calc to compute by Date Corrected again. This is the measure that will be used to plot the trend line against.
We also need to display the range of Unemployment rates for each country – ie we need to work out the minimum and maximum values.
Max Unemployment Rate
WINDOW_MAX(MAX([Unemployment Rate]))
Min Unemployment Rate
WINDOW_MIN(MIN([Unemployment Rate]))
Format both of these to display as 5 with 1dp, and add to the table, verifying the calculations are computing by Date Corrected once again. Verify you get the same values for all the rows associated to a single country.
Now we know the calculations are as expected, we can start to build out the viz.
Building the core chart
To start with we’ll just focus on getting the line chart with the associated text displayed for the two countries Australia & Austria. So on a new sheet add Country (UPPER) to Filter and filter to these selections. The other filters should automatically add.
Add Country UPPER and Date Corrected (green continuous exact date) to Columns and Variance to Rows. Set Variance to Compute By Date Corrected.
Add Unemployment Rate to the Tooltip and adjust the text to match.
To add the country title and other displayed text, we’re going to use a ‘fake axis’ and plot a mark at a central date. On hovering over the solution, October 2016 seems to be the appropriate date selected. So we need
Title Position to Plot
IF [Date Corrected] = #2016-10-01# THEN 1 END
Add this to Rows in front of the existing pill. Change the mark type of this measure only to a circle and re-Size to make it as small as possible and adjust the Colour Opacity to 0%. This will make the mark ‘disappear’.
Add Country UPPER, Median, Max Unemployment Rate and Min Unemployment Rate to the Label shelf of this marks card. Ensure all the table calculation fields are set to compute by Date Corrected. Adjust the text as required, and align centre. Ensure the Tooltip is blank for this marks card.
Change the colour of the variance line to grey, then remove all gridlines, row dividers and axis. Set the Column Dividers to be a thick white line (this will help provide a separator between the small multiples later).
Creating the trellis
There are multiple blog posts about creating trellis charts. My go to post has always been this one by Chris Love. It’s a more complex solution that dynamically flexes the number of rows and columns based on the number of members in the dimension you’re visualising. There have also been other Workout Wednesday challenges involving trellis charts, which I’ve blogged about too (see here).
Ultimately we’re aiming to determine a ‘grid position’ for each member of our dimension. In this case the dimension is Country UPPER and its a static list of 36 values, which we can display in a 6 x 6 grid. So Australia needs to be in row 1 column 1, Austria in row 1 column 2….. Costa Rica in row 2 column 1… USA in row 6 column.
As our members are static, the calculations we can use for this can be a bit simpler than those in Chris’ blog.
Firstly, let’s get our data in a tabular layout so we can ‘see’ the values as we go.
Duplicate the data sheet we built up, then move Measure Name and Date Corrected from Rows/Columns to the Detail shelf. Remove the Country UPPER field from the Filter shelf. You should have something like below, showing 1 row per country
Double click into the Rows shelf and type in INDEX(), then change the resulting pill to discrete (blue). You will see that index numbers every row. It’s a table calculation and although working as expected, let’s explicitly set it to compute using County UPPER.
Let’s now create our grid position values.
Cols
FLOAT(INT((INDEX()-1)%6))
This takes the Index value and subtracts 1, and returns the remainder when divided by 6 (%6=modulus of 6 – ie 6%6=0, 7%6=1). 6 is the number of columns we want.
Rows
FLOAT(INT((INDEX()-1)/6))
This takes the Index value and subtracts 1, and returns the integer part of the value when divided by 6. Again we’re using 6 as this is the number of rows we want to display.
Add these to the table, set to be discrete (blue) and compute using Country UPPER.
You can see that the first 6 countries are all in the same row (row 0) but different columns (0-5).
Now that’s understood, we can create the small multiples on the viz.
Duplicate the sheet we created further above which displays the trend graph for Australia & Austria. As we’re now going to make the changes to create the charts for every country, if things go a bit screwy, you can always get back to this one to try again :-).
Add Cols to Columns. Set to discrete and compute using Country UPPER. Add Rows to Rows and do the same thing. Move Country UPPER from Columns to the Detail shelf on the All marks card. Then remove Country UPPER from the Filter shelf.
Hopefully everything worked as expected and you have
Final step is to uncheck Show Header against the Cols and Rows pills so they don’t display and you can add to a dashboard.
It was Kyle’s turn to set this jittered box plot challenge this week. While it may sound complicated, this is quite a straightforward challenge this week, made more so as Kyle very kindly provides references to other blogs which help you.
So let’s build…
Firstly you need to download the 2 excel files Kyle provides, then relate them via the Team field (this should happen automatically).
On a new sheet add Team to Columns, Age to Rows and Player Code to Detail. Change the mark type to circle, and reduce the Size slightly.
From the Analytics pane, drag box plot onto the sheet and drop onto the Cell image that displays.
Create a new field which is the key field for the jittering functionality
Jitter
RANDOM()
This just generates a random number between 0 and 1.
Add this to Columns and change the view to Entire View so its not all squashed up.
We’ve got our jittered box plot. Now we just have to add in the additional functionality required.
Drag the Playoffs field so that it’s above the line in the data pane on the right hand side (ie change it from a measure to a dimension). Then right click > Aliases to alias to 0 and 1 values to the labels required
Add this field to the Columns in front of the Team field. Re-order so ‘Playoff Teams’ is listed first (I just click on the field name and drag it to the left).
We also need to sort the data based on the median value per team. We need a new field for this.
Median Age per Team
{FIXED [Team]:MEDIAN([Age])}
Add a sort to the Team field, so it sorts by the Median Age per Team descending
Add League to the Filter shelf and set to AL.
Format the worksheet and set the Column Banding to be level 0 and band size 1 to shade the sections as required.
Then adjust the format of the gridlines to remove all row and column gridlines.
Finally, add Name to the Tooltip and adjust accordingly. Then hide the Jitter axes (uncheck Show Header) and adjust the Age axis so it is fixed to start at 18.
You can now add this to a dashboard and you’re done! My published viz is here.
It was Sean Miller’s turn to expand on last week’s #WOW2022 challenge, by adding an additional viz to the scatter plot (challenge details here).
The assumption is you should be able to build on the challenge solution you have built in the previous week. My solution to the Reference Box challenge is here. I adopted (and blogged about) a table calc solution. However, the published solution used LoDs. Both methods achieved the desired result, but I decided when starting this challenge, that I would build this ‘extension’ using LoDs too, so anyone who uses the published solution as a starting point, gets help via this blog.
So if you used my previous blog to build the challenge, you’ll need to first create the LoD equivalent of the calculated fields we used (note I did not change the scatter plot viz to use these fields) :
Note – It’s also worth reiterating at this point, that I will be referencing calculated fields/parameters/objects in this blog created as part of my initial challenge.
In order to build the bar chart, we need to categorise each State into a grouping; the selected/highlighted state, the states in the reference box, all other states.
To identify the states in the reference box, we need to use FIXED LoDs to get the value of the Sales and Profit Ratio at the State level.
Sales by State
{FIXED [State]: SUM([Sales])}
PR by State
{FIXED [State]: SUM([Profit])/SUM([Sales])}
We can then use these fields along with the LoDs further above to determine whether a State is in the reference box
In Reference Box?
[PR by State]>=[PR-25th Percentile LOD] AND [PR by State]<= [PR-75th Percentile LOD] AND [Sales by State]>=[Sales – 25th Percentile LOD] AND [Sales by State]<= [Sales – 75th Percentile LOD]
This simply returns a boolean.
Now we can categorise each State
State to Display
IF [Selected State] THEN [State] ELSEIF [In Reference Box?] THEN [State] ELSE ‘Other (median)’ END
And with the above, we can then define the measures we want to show
Sales to Display
{FIXED [State To Display]: MEDIAN([Sales by State])}
PR to Display
{FIXED [State To Display]: MEDIAN([PR by State])}
And with these fields we can now build the bar chart.
Add Selected State to Rows and drag the dimension value, so True is listed before False.
Add In Reference Box? to Rows, and again drag so the True is listed before False.
Add State to Display to Rows
Add Sales to Display to Columns and Sort descending
Add PR to Display to Columns
On the All Marks Card add Selected State to Colour
Then add In Reference Box? to the Detail shelf. Then click on the … icon to the left of the In Reference Box? pill on the marks card, and change to the Colour icon. This should result in 2 fields on the Colour shelf.
Adjust the colours accordingly.
Add Sales to Display and PR to Display to the Tooltip shelf and adjust.
Change the titles of the axis
Remove row banding
Uncheck Show Header against Selected State and In Reference Band?
Hide field labels for Rows against the State to Display column heading
Add this sheet onto the dashboard and you’re done 🙂
Erica Hughes set her first #WOW challenge this week, which is all about reference lines. The intention by the #WOW crew is to use this initial challenge as the basis for the next few weeks, so it’s going to be interesting to see how that works out.
But back to this week. We’re building a basic Sales vs Profit Ratio scatter plot by State, but then need various calculated fields to plot the reference lines and the reference box.
Building the basic chart
Adding the dotted reference lines
Adding the reference box
Making the y-axis symmetrical
Highlighting the selected state
Building the basic chart
First up we need to create our favourite calculated field
Profit Ratio
SUM([Profit]) / SUM( [Sales])
format to percentage with 0 dp.
Create the basic scatter plot as below
Remove all gridlines, zero lines and axis rulers and tick marks.
Adding the dotted reference lines
These lines are the median values for the Profit Ratio and Sales.
Sales – Median
WINDOW_MEDIAN(SUM([Sales]))
format to Custom Currency with 1 dp, $ prefix and display units to Thousands (K).
PR- Median
WINDOW_MEDIAN([Profit Ratio])
format to percentage with 0 dp.
Add both these fields to the Detail shelf, and adjust the table calculation settings of both fields so they are computing by State
Add a reference line to the Sales axis (I do this by right clicking on the axis > Add Reference Line, but you can also drag from the Analytics pane).
Set the reference line to be against the Entire Table, using the Sales – Median field, no label, and the tooltip set to custom as below. Set the line to be dotted.
Apply a similar reference line to the Profit Ratio axis, using the PR – Median value instead.
Adding the reference box
I have to admit, this bit took a bit of trial and error. I knew it was going to involve a combination of bands and lines and colouring above and below, but things didn’t always go to plan. It was this post by Jonathan Allenby that helped.
The block is basically bounded by the 25th & 75th percentile of the Sales and Profit Ratio values. So lets create them
Sales – 25th Percentile
WINDOW_PERCENTILE(SUM([Sales]),0.25)
Sales – 75th Percentile
WINDOW_PERCENTILE(SUM([Sales]),0.75)
Add these fields to the Detail shelf and set the table calculation on both to compute by State.
Add a reference line again to the Sales axis, and this time select Band that goes from the Sales – 25th Percentile to the Sales – 75th Percentile and fill based on the colour stated in the requirements.
Now add a reference line to the Profit Ratio axis. This time select Distribution.
Select the computation to be Percentiles and enter 25 as the value. Set Label, Tooltip and Line to None. Now this is the odd bit… set the Fill to the colour white, then check the Fill Below box, which then seems to set the Fill option to ‘No Fill’. Doing this seems to have the desired effect. Trying this in any other order, or manually selecting the Fill option to be No Fill, just doesn’t seem to work…
Now repeat the process adding another reference line on the Profit Ratio axis, and this time create a distribution for the 75th percentile. This time, after setting the Fill to white, check the Fill Above checkbox. Weirdly this time, the fill on my laptop got set the Grey Light, although it was correctly displaying as white on the screen. I manually changed it to ‘No Fill’.
It feels like there’s something a bit ‘buggy’ with all this, which might explain while my initial attempts were failing. I’ll be interested in knowing if you see this behaviour too (I created on v 2021.4.2).
Making the y-axis symmetrical
This was a ‘bonus’ feature, but is again achieved with a reference line. What we’re looking for here is to understand what is the maximum absolute value so we can determine where to place a hidden reference line – ie if the maximum profit ratio is 20% and the minimum is -30%, the maximum absolute value is 30%, and we’d need a reference line +30%, so the axis extends from -30% to +30%. Conversely if the maximum profit ratio is 40% and the minimum is -25%, the maximum absolute value is 40% and we need a reference line at -40% for symmetry.
I’ve encapsulated all this logic within one field below
Ref Line – Profit Ratio Symmetry
IF MAX(WINDOW_MAX([Profit Ratio]), ABS(WINDOW_MIN([Profit Ratio]))) = WINDOW_MAX([Profit Ratio]) THEN -1*WINDOW_MAX([Profit Ratio]) ELSE -1*WINDOW_MIN([Profit Ratio]) END
The statement MAX(WINDOW_MAX([Profit Ratio]), ABS(WINDOW_MIN([Profit Ratio]))) is returning the maximum of the max and min values (ie MAX(20, ABS(-30)) = MAX(20,30) = 30, MAX(40, ABS(-25)) = MAX(40,25) = 40.
Add this onto the Detail shelf and add another reference line to the Profit Ratio axis, setting it to use the Minimum of the Ref Line – Profit Ratio Symmetry field
Test how the field works, by selecting a few of the State marks towards the top and exclude; the axis should change, but still be symmetrical.
Highlighting the selected state
For this we need a parameter
pSelectedState
string parameter defaulted to Virginia
Then we need a field to identify which state has been captured
Selected State
[State]=[pSelectedState]
which returns a boolean. Change the mark type to Circle and add Selected State to the Colour shelf, and adjust accordingly (add a border to the circle too via the Colour shelf). Also add Selected State to the Size shelf and again tweak.
Once the sheet has been added to the dashboard, create a dashboard parameter action that passes the State field into the pSelectedState parameter on Select of the chart.
The final step to add is to stop all the other marks from ‘fading’ out when a State mark is selected. This is achieved by creating the following calculated fields
True
True
False
False
Add both of these to the Detail shelf of the scatter chart.
On the dashboard, create a dashboard filter action as below, which passes selected fields setting True = False, which can never be true and prevents the mark from being highlighted.
And with that, you should have the desired dashboard. I’m interested to see if this matches Erica’s solution (it’s likely I’ll start with the provided workbook next week, rather than mine, just in case there are some discrepancies – eg I’ve used a lot of table calcs… LoDs may have been possible…)