Can you label & sort small multiples?

A colourful #WOW2022 challenge this week set by Kyle Yetter and using his favourite data – Baseball. Let’s jump straight in.

Building the required calculations

First up we need to calculate the core measure the viz is based on – % of wins

Win %

SUM([Wins])/SUM([Games])

I formatted this to 3 decimal places, then applied a custom number format to remove the leading 0 (custom number format looks like ,##.000;-#,##.000).

We also need to know the number of losses as this is part of the tooltip.

Losses

SUM([Games]) – SUM([Wins])

Let’s pop all these out into a table (I formatted all the whole numbers to display without any decimal places).

The viz however isn’t plotting the actual Win%, it’s plotting the difference from 50% (or 0.5), so values less than 50% are negative and those above are positive.

Plot Postion

[Win %] – 0.5

And we also need to know whether the Win% is above 50% or not

Above 50%

[Win %]>0.5

Pop these out onto the table too

The viz also displays the overall Win% for each team, and also uses this to sort the data. As it is used for sorting, we need to use an LoD calculation (rather than a table calculation).

Overall Win% LOD

{FIXED [Team]:SUM([Wins])} / {FIXED [Team]: SUM([Games])}

for each team, get the total wins, and divide by the total games for the team. Format this to 3 dp with no leading 0 as before.

pop this into the view (you’ll see it’s the same value for each row for a single team), and then apply a Sort on the Team field to sort descending by the Overeall Win% LOD.

Now we have the data sorted, we can create the fields needed to build the trellis chart.

I have already blogged challenges relating to trellis charts / small multiples (see here) which in turn reference other blogs in the community, so I’m not going to go into all the details. We just need to build two calculated fields to identify which row and which column each Team will sit in. The table is fixed at 6 columns wide as the data wea re using is static. Some solutions work with a more dynamic layout depending on how many entities you need to display. We’re keeping things simpler.

Cols

FLOAT(INT((INDEX()-1)%6))

Rows

FLOAT(INT((INDEX()-1)/6))

Add both these fields to the table as discrete dimensions (blue pills), and as they are both table calculations, set them both to Compute Using – > Team.

Building the Core Viz

On a new sheet, add Cols to Columns as discrete dimension, Rows to Rows as discrete dimension and Team to Detail. Set both Rows and Cols to Compute Using Team.

Add Year as continuous (green) pill to Columns and Plot Position to Rows and change the mark type to Bar and reduce the size. Sort the Team field based on Overall Win% LOD descending.

Add Wins, Losses, and Win% to the Tooltip shelf and adjust the tooltip to display as required. Add Above 50% to the Colour shelf (you may need to readjust the size). Leave the colours as they are for now – we’ll deal with this later.

Adding the labels

Create a new calculated field

Dummy Plot

FLOAT(IF [Year]=2000 OR Year = 2020 THEN 0.35 END)

This is basically going to position a mark at height 0.35 but only if the year is either 2000 or 2020. These values were all just based on a bit of trial and error as to what worked to get the desired result.

Also create a field

LABEL:Team

IF [Year]=2000 THEN [Team] END

and

LABEL:Win%

IF [Year]=2020 THEN [Overall Win % LOD] END

format this to 3dp and exclude the leading 0.

Add Dummy Plot onto Rows and change the mark type of this measure to circle. Amend the Tooltip of this marks card so it’s empty.

Add LABEL:TEAM and LABEL:Win% to the Label shelf, and adjust the label so both fields sit side by side (only 1 value will only ever actually display). Adjust the table calculation of both the Rows and Cols pills so they now compute using both the Team and the LABEL:Team fields.

Adjust the alignment of the labels so they are positioned bottom centre. Set the font colour to match mark colour and bold.

Then reduce the size of the circle mark to as small as possible, reduce the opacity of the mark colour to 0.

Now make the chart dual axis and synchronise the axis. Remove the Measure Names field that has automatically been added to the All marks card.

Hide all the headers and axis (uncheck Show Header), remove all grid lines, zero line, axis rulers.

Hide the null indicator (bottom right).

Colouring by Team

Copy the colour palette text Kyle provided into your preferences.tps file (usually located in the My Tableau Repository directory). For more information on working with custom colour palettes see thisĀ Tableau help article.

You’ll need to save your workbook and re-open for the new palette to be available for use.

In order to prevent having to manually set all the colours (and believe me you don’t want to do this!), perform the following steps in order

  • Add Team to also be on the Colour shelf. Click on the 3 dots (…) that are to the left of the Team pill on the All marks card, and change it to Colour. This means there are now 2 fields on colour. Move the Team field so it is listed above the Above 50% pill. This means your colour legend should be listed as <Team>, <True|False>
  • Adjust the Sort of the Above 50% pill, so it is manually sorted to list True before False.

  • Now change the Sort on the Team field so it is sorted alphabetically ascending instead. This will cause the viz to change its sort order, but don’t worry for now. It also changes the list on the colour legend, so ARI, True is listed first then ARI, False etc.

  • Now edit the Colour Legend and select the new MLB Team Colours palette we added. Click the Assign Palette button to automatically assign the colours. As we’ve made sure the entries listed are in the right order, they should get the correct colours.

  • Change the Sort on the Team field back to be based on Overall Win% LOD descending

And that should be it. You can now add the viz to a dashboard and publish. My published version is here.

Happy vizzin’!

Donna

Advertisement

Thanksgiving Day NFL Games

Sean Miller posted this week’s challenge based on the results of the annual NFL games hosted on Thanksgiving Day. It immediately reminded me of a previous #WOW challenge that Lorna posted in 2019 when she visualised Rugby League wins (see my viz here).

This is a table calculations based challenge. I did start using FIXED LoDs to help calculate the summary measures (Total Games and Win %) displayed at the front, but found that as there are 2 years (1975 and 1977) when the Dallas Cowboys did not host a game, I ended up with some pesky NULL values displaying which affected how the running sum area chart displayed.

Defining the calculations

As its a table calc challenge, I’ll build out what I can into a table to start with, to sense check I’m getting the correct numbers.

First up add Home Team, Game Date and Visiting Team to Rows and display Home Score and Visiting Score.

We start by determining the result of the fixture, based on whether it’s a home or away win or a tie. In the lollipop chart home wins are plotted at 1 and away wins at -1, so we’re going to store the result as a numeric value rather than text.

Result

FLOAT(IF [Home Score]>[Visiting Score] THEN 1
ELSEIF [Home Score]<[Visiting Score] THEN -1
ELSE 0 END)

The output is wrapped within a FLOAT, as this will help how the axis displays. Without it, by default Tableau will define the field to be a whole number, and the axis will extend to +/-2 which is too much room. We can’t adjust (fix) the axis to a decimal if the field itself is an integer, and adjusting to +/-1 chops off the displayed marks.

If you add this to the display, it will show 1, 0 -1 as you expect. You’ll notice though that the Axis on the lollipop chart is labelled as Win/Loss. This is achieved by applying a custom format to the field – “Win”;”Loss”;”Tie”

This is a sneaky but effective trick. The information stated before the first semi-colon applies to positive numbers, the info after the first semi-colon applied to negative numbers, and the information after the optional second semi-colon applies to zero.

Unfortunately though, it would appear that, at the point of writing, Tableau Public, isn’t honoring the zero formatting, and is displaying Win rather than Tie. The display works on Desktop though.

The win/loss/tie text is just a formatting feature and affects what is displayed, but the underlying value is still a number.

The Result field will be used to plot the lollipop chart. We now want a field to plot the area chart against. This is a running total of the Result values (ie win =1, win, win = 1+1, win, win, loss = 1+1 -1) and we need a table calculation.

However, as stated above due to a couple of missing years, I had to make an adjustment to ensure the running total displayed as Sean had in his challenge. I created another field

Result Adjusted

IIFNULL(SUM([Result]),0)

If the Result field doesn’t exist, as there is no data, then use 0 instead.

To see what’s going on, we’re going to need a different view of the data where the date field is continuous (green) rather than discrete (blue).

Build the below, and filter just for the first 10 years – you’ll see the gaps where the are no marks in 1975 and 1977 for Dallas

Use the context menu of the green YEAR(Game Date) pill and select the option to Show Missing Values. Marks will now display

Add Result to Label. Each mark is labelled Win or Loss, except the ones for Dallas for 1975 & 1977 as there is no data

Now add Result Adjusted to Label. A 0 value is now displayed against those two marks.

We can now build a running total off of this measure instead

Running Total Wins

RUNNING_SUM(([Result Adjusted]))

Add this to the Label too and verify the table calculation is computing by the Game Date field only. The running total for the 2 ‘missing’ dates is displaying a value which is the same as the previous value (since we’ve added 0 onto the running total). This will give us the flat line in the area chart when we come to build it.

Now back to our table of data, we can focus on the other calculated fields we need….

Total Games

WINDOW_COUNT(COUNTD([Game Date]))

This is a table calculation and is simply counting the number of distinct dates displayed. Add this to the table display we were building to start with, and adjust the table calculation to compute by all fields except Home Team. The total should display the same value for all the rows against each Home Team.

Next we want a field to indicate if the row is a win.

Is Win?

INT([Home Score]>[Visiting Score])

This is taking a boolean of true or false and converting to an INT (1 or 0).

From this we can work out the Win rate

Win %

WINDOW_SUM(SUM([Is Win?]))/[Total Games]

Add up all the Is Win? values associated to the Home Team as a proportion of the Total Games played. Format this field to a percentage with 0 dp. Again, add to the table and adjust the table calc to compute by all fields except Home Team, and verify the same settings applied to both the calculations nested in this calculation

For the All-Time Record, we need to know the number of wins and number of losses. We have a field to help us with the wins, but need an equivalent for the losses

Is Loss?

INT([Home Score]<[Visiting Score])

And from this we can work out

All-Time Record

STR({FIXED [Home Team]: SUM([Is Win?])}) + ‘-‘ +
STR({FIXED [Home Team]: SUM([Is Loss?])})

This is the one field I kept from my LoD based attempt.

The circles on the lollipop chart are coloured based on the difference in the score, so lets’s create that

Score Difference

[Home Score]-[Visiting Score]

And finally we need some fields to help display the tooltips properly. The tooltip indicates whether the result was ‘won’ or ‘lost’ which is different text to the axis labels.

TOOLTIP-Result

IF [Result]=1 THEN ‘won’
ELSEIF [Result]=-1 THEN ‘lost’
ELSE ‘tied’
END

The tooltip also displays the scores, but the scores are always presented as highest score – lowest score and not home score – visiting score. So we need fields to store the right values

TOOLTIPHigher Score

IF [Is Win?]=1 THEN [Home Score] ELSE [Visiting Score] END

TOOLTIP – Lower Score

IF [Is Loss?]=1 THEN [Home Score] ELSE [Visiting Score] END

Pop all these fields out onto the table, so you can validate you’ve got all your calcs right before building the viz.

Building the area chart

Add Home Team to Rows, Game Date (continuous, show missing values) to Columns and Running Total Wins to Rows (ensure table calculation set as required). Change to mark type of Area. You should have 2 horizontal lines from 1974-1975 and 1976-1977 against the Dallas Cowboys row.

Adjust the tooltip, edit the label of the Running Total Wins axis , and remove the label of the Game Date axis.

Building the lollipop chart

Now add Result to Rows directly after the Home Team pill. Change the mark type to circle.

Add Score Difference to the Colour shelf of the circle mark, and adjust the starting colour range to a dark grey. Readjust the colour of the area chart to blue too. Add a border to the area chart too (via the colour shelf).

Add another instance of Result to the Rows shelf, next to the existing one. Set the mark type of this to bar. Reduce the size to the smallest possible, set the colour to grey and remove the border.

Now set this to be dual axis, synchronise the axis, and set the marks of the 2nd Result axis displayed on the right hand side to move marks to back. Uncheck Show Header to remove this axis from displaying.

Add Visiting Team, TOOLTIP-Result, TOOLTIP-Higher Score and TOOLTIP-Lower Score to the Tooltip shelf of both the Result marks cards, and adjust the tooltip on both to

Remove the Column dividers.

Now drag Total Games to Rows and drop next to the Home Team field. Change to be discrete (blue). Verify the number is what you expect and adjust the table calc if need be.

Add All-Time Record and Win % (set to discrete) to the view too. Then format these 4 fields so the text is larger and aligned centrally.

All that’s left now is to add the sheet to a dashboard. My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Can you create a clustered histogram?

Ann Jackson returned with this week’s #WOW2020 challenge, to create a ‘clustered histogram’, whereby the orders in 2020 were placed in ‘bins’ based on the sale value.

I found this challenge quite straightforward this week, as its very similar to a previous challenge I’ve already blogged about in week 23, which created a side by side bar chart by month.

So what are the main points for this blog

  • Creating the bins
  • Getting the bars side by side
  • Tooltips

Creating the bins

I first created an LoD to store the total value of the sale for an Order

Order Value

{FIXED [Order ID]: SUM([Sales])}

then I need to ‘bin’ this, but as the sales over $2000 needs to be lumped together, we can’t use the traditional binning functionality. We also don’t want to have a massive case statement to assign the values. Instead we can do a bit of maths…. we want to essentially round each order value to the nearest 100.

Round Up to 100

(CEILING([Order Value]/100) * 100)

So if the Order Value is 39 for example, when divided by 100 this will be 0.39. The CEILING function always rounds up to the nearest whole number, so in this case will return 1, which is then multipled by 100 to give us 100.

Doing this, every order value is then assigned a ‘bin’ of 100, 200, 300 etc

I then created

Sales Bin

IF [Round Up to 100] >2100 THEN 2100 ELSE [Round Up to 100] END

to apply the grouping of all the values which were greater than 2100 (since 1999 would be rounded up to 2000 and 2001 would be rounded up to 2100).

Getting the bars side by side

We have the Sales Bin measure and will also need to plot the count of orders

# Orders

COUNTD([Order ID])

Plotting this out and splitting by Segment we get

but we don’t want the segments stacked, we want them side by side.

The Sales Bin axis is ‘continuous’, which means a value can be plotted at any number along the line, it just happens to be at the ‘100’ marks as that’s where are bins are.

So we use a sort of ‘jittering’ to plot each bar at a slightly different value depending on the segment

SALE AMOUNT

CASE [Segment]
WHEN ‘Consumer’ THEN [Sales Bin] – 75
WHEN ‘Corporate’ THEN [Sales Bin]- 50
ELSE [Sales Bin]- 25
END

So for all the orders in the ‘100’ bin (ie the order value was between 1 and 100), all the Consumer orders will actually get plotted at 25, Corporate at 50 and Home Office at 75.

All this is explained in much more detail in the Week 23 blog post referenced at the top of this post.

Tooltips

The tooltip has different text depending on where you hover.

We need the lower value for the bin range (eg $0-100$), so I created

Round Up to 100 minus 100

[Round Up to 100]-100

And I then created 4 different ‘tooltip’ calculations which I could place on the tooltip to give me the display I needed:

TOOLTIP Upper

IF [SALE AMOUNT]<2000 THEN [Round Up to 100] END

TOOLTIP Lower

IF [SALE AMOUNT]<2000 THEN [Round Up to 100 minus 100] END

TOOLTIP between

IF [SALE AMOUNT]< 2000 THEN ‘ between’ END

TOOLTIP Symbol

IF [SALE AMOUNT]< 2000 THEN ‘ – ‘ ELSE ‘$2000+’ END

The final thing needed is to create a reference band to colour the section at the end, and fix the axis to start after 0 and end at 2099, so you don’t get 0 and 2100 displayed on the axis.

So a relatively short write up today – think this is the quickest blog I’ve written.

My published viz is here.

Happy vizzin’! Stay Safe!

Donna