Can you build a divergent histogram?

December 18, 2020 Donna ColesLeave a comment

It was Ann Jackson’s last #WOW2020 challenge of the year, to build this divergent histogram depicting the average life expectancy by gender, with each bar indicating the number of countries with that life expectancy.

Ann hinted that reshaping the data may help, so I chose to do this by pivoting the data. To do this, in the data pane, I selected both the Life Expectancy Male and Life Expectancy Female columns, then right-clicked and selected Pivot.

This results in the number of rows in the data set being doubled, with a field called Pivot Field Names containing the values Life Expectancy Male and Life Expectancy Female, and then a field called Pivot Field Values which contains the life expectancy value. This field I renamed to Life Expectancy Age and moved it to the top section of the fields pane (above the line), so it is treated like a dimension rather than a measure.

We need to count the countries, so I created

Country Count

COUNTD([Country/Region])

Let’s see what this all looks like

Filtering by Year (2012), adding Life Expectancy Age to Rows, Pivot Field Names to Columns and Country Count to Text we get

Filtering out the Null and we have the basic data we want to plot, which we can do by

Year on Filter set to 2012
Life Expectancy Age on Filter set to exclude Null
Pivot Field Names on Columns
Life Expectancy Age on Columns (continuous)
Country Count on Rows
Mark Type set to Bar
Size set to Fixed

If we move Pivot Field Names from Columns to Colour, we get an overlapping bar.

But we want the males to be going down the chart, so let’s change the Country Count field to be

Country Count

IF CONTAINS(ATTR([Pivot Field Names]),’Female’) THEN COUNTD([Country/Region]) ELSE COUNTD([Country/Region])*-1 END

which is basically inverting the values for the male, and is pretty much the crux of the challenge 🙂

We need to label the max bar for each gender, or in this case the max & min. So showing mark labels and setting to just label the max and min values, we have a couple of issues….

… the labels are positioned in the centre, and males is labelled -13, rather than 13. To fix the positioning, we need to turn stacked marks off via the Analysis menu (took a bit of while to figure this one out).

To resolve the negative labelling, I formatted the County Count field to custom format to #,##0;#,##0

The final requirement is to ensure the axis range is identical. To do this I created reference lines based on the maximum value

Max Count Ref Line

WINDOW_MAX([Country Count])

which I then copied

Max Count Ref Line (Copy)

WINDOW_MAX([Country Count]) *-1

Adding both of these to the Detail shelf, and setting the table calculation to compute by the fields below

And apart from some formatting and the tooltips, that should be complete. My published viz is here. When I looked at Ann’s version, she tackled it differently, which is the beauty of these challenges – there’s often more than one solution.

Happy vizzin’! Stay Safe!

Donna

Can you highlight profits with Measure Names?

December 11, 2020 Donna ColesLeave a comment

Luke Stanke provided his last challenge of #WOW2020 for the year with this viz to utilise Measure Names & Measure Values. This plots Cost against Sales per Sub-Category (the circles), which are then bounded by a lozenge. The Sales circle and the lozenge are coloured based on profitability, and the lozenges are labelled by Profit, although the position of the label is also based on profitability.

Obviously as the title of the challenge suggests, using Measure Names & Measure Values is key to this. As usual I’ll try to just pull out the main points I think you might find most tricky.

Building the core chart
Aligning the Sub-Categories against a horizontal line
Colouring the circles & lozenge
Positioning the labels
Formatting the labels
Creating a border around the white Cost circle

Building the core chart

You’ll need to create a calculated field to store the cost

Cost

SUM([Sales]) – SUM([Profit])

Once done, then add Sub-Category to Rows and sort by descending by Profit. Add Measure Values to Columns and filter just to Sales and Cost. Change the mark type to circle.

Add a 2nd instance of Measure Values to Columns and change the mark type to Line. Add Measure Names to Path.

Then make this chart dual axis and synchronise the axis. This will give the base chart.

Aligning the Sub-Categories against a horizontal line

Whenever you add a blue discrete pill to the rows or columns, you’ll get a ‘header’ pane, which means the label for the pill is by default positioned in the centre of a row, as demonstrated below if I add row dividers to the chart

The requirement is to show the row label aligned with a horizontal line. To achieve this, we need to add a MIN(0) field the Rows. Adding this green pill, creates an axis which has a 0 horizontal 0 line, which is positioned exactly in alignment with the row label

Removing the row dividers, and uncheck Show Header on the MIN(0) pill, and the right aligning the row label, we get the desired result.

Colouring the circles & lozenge

The circles are either coloured white for Cost or dark red or green for Sales depending on the profitability of the Sub-Category.

So we need a new calculated field

Is Profitable?

SUM([Profit])>0

This returns true or false.

On the circles Measure Names mark card, add Is Profitable to the Colour shelf. If it’s not already there, then add Measure Names to the Detail shelf, then select the detail icon and choose the colour icon to also add Measure Names to the Colour shelf

This has the effect of giving you 4 colour combinations

which can then be edited to set the 2 combinations relating to Cost to white, Sales & True to dark green and Sales & False to dark red (at this point, you probably want to be setting the background colour of the whole worksheet to light grey (right click on sheet, format, select the Shading icon)

For the lozenge, you just need to add the Is Profitable field the the Colour shelf of the line Measure Names card, and adjust the size of the to make it slightly bigger than the circles (you may need to move marks to back to get the lozenge to sit behind the circles).

Positioning the labels

So the tricky thing about the labels is where they are positioned – they are to the right for the profitable sub-categories, and to the left for the non-profitable sub-categories. When you add a label, you can’t provide that sort of conditional logic to control the positioning. However, we happen to have 2 sets of marks cards (as dual axis), and so can use different fields to apply the labelling.

Label Profit is +ve

IF [Is Profitable] THEN SUM(Profit) END

This will only return a value for profitable sub-categories, otherwise the field will store blank.

Label Profit is -ve

IF NOT([Is Profitable]) THEN SUM(Profit) END

Conversely, this will only return a value for the non-profitable sub-categories.

Add the Label Profit is +ve field to the Label shelf of the Circle marks card. Adjust the positioning to be middle right, and adjust the settings as below (tbh this was just a bit of trial and error to find this worked :-)). You may want to additionally add a few <spaces> to the start of the label text to add a bit of breathing room.

Then on the Line marks card, add the Label Profit is -ve field to the Label shelf, but this time choose to Label Line Ends, and select Label end of line. Note – you can try to apply the same settings as above, but I found for some reason it increased the size of the mark around the line end… weird…

Formatting the labels

Right click on the Label Profit is +ve/-ve fields created and custom format with

+”$”#,##0;-“$”#,##0

Creating a border around the white Cost circle

Finally, a sneaky requirement was to add a border around the white circle that represented Cost only. This implied that using the formatting options on the Colour shelf that allows a border to be added to the mark wasn’t going to work, as this will apply to all marks and not just to the Cost circle.

I handled this by creating a 2nd instance of the Cost field (right click Cost and select duplicate).

I then added this Cost (copy) field to the Measure Values section of the chart on the left hand side. This doesn’t appear to do much, since the mark is directly on top of an existing mark, but you will see the Colour legend associated to the Circle marks card change and add 2 new entries.

I changed the colours of the 2 new entries in the Colour legend to a dark grey, then added Measure Names to the Size shelf of the Circles marks card. I then adjusted the range and size of the circles so that the Cost measure is slightly smaller than the Cost (copy) measure. Ultimately all 3 circles are slightly different sizes, but the range is so narrow, it’s hard to tell.

And that’s it. A lovely looking viz with a fair bit going on under the bonnet, but nothing too complex! My published version is here.

Happy vizzin’! Stay Safe!

Donna

Can you toggle between AND|OR filtering logic?

December 4, 2020 Donna ColesLeave a comment

Week 49 of #WOW2020 was a guest challenge by Sam Epley to demonstrate the ability to toggle between filtering using AND logic and filtering using OR logic. Sam stated that it had been something he and colleagues had been pondering for a while but v2020.2 provided a possible solution.

So the first thing I did was a quick check on what functionality had been released in v2020.2 which included set controls. I also noticed when examining the published solution, that if I hovered over the Select Values filter controls, a small set icon would appear, which indicated these filters were indeed based on set controls

This blog will focus on

Select a Field for Slicer x parameter
Select Values for Slicer x filter
Building the Logic

Select a Field for Slicer x parameter

I had a slight ‘moment’ with this, until I realised I just had to copy the ╚ character and paste it into the parameter list I created

You’ll need to create 3 instances of this parameter (or create once, duplicate and rename).

Select Values for Slicer x filter

Note – what I describe here will need to be duplicated for each slicer.

There’s a chance I may well have been a bit long-winded in how I went about this, but it worked for me…

First up, I extracted the value selected in the Select a Field parameter, only returning a value if it contained the special ╚ character. So this would return the word ‘Region’ or ‘Segment’ but would return NULL if ‘Location’ or ‘Customer’ had been selected.

Selected Slicer 1

IF CONTAINS([Select a Field for Slicer 1:],’╚ ‘) THEN REPLACE([Select a Field for Slicer 1:],’╚ ‘,”)
ELSE NULL
END

I then needed to ‘map’ this value to the actual field in the data source, so I could get a handle on the actual values associated to the field

Selected Slicer 1 Values

IF [Selected Slicer 1] = ‘Region’ THEN [Region]
ELSEIF [Selected Slicer 1] = ‘State’ THEN [State]
ELSEIF [Selected Slicer 1] = ‘Category’ THEN [Category]
ELSEIF [Selected Slicer 1] = ‘Sub-Category’ THEN [Sub-Category]
ELSEIF [Selected Slicer 1] = ‘Segment’ THEN [Segment]
ELSEIF [Selected Slicer 1] = ‘Ship Mode’ THEN [Ship Mode]
ELSE ‘No Selection’
END

From this I could create a set to store the values, by right-clicking on the Selected Slicer 1 Values field and choosing Create -> Set, and selecting the Use all option.

The values in the set can then be accessible on the sheet foe selection by right-clicking on the set field and choosing Show Set

Building the Logic

To help with this, you can build out a basic view by Region that returns In or Out for each of the sets

As you can see when everything is set to ‘None Selected’, everything is In.

As we change the filter options, you can see the values change to be In and Out

For the AND logic to work, we’re looking for the rows of data where every In/Out column is In.

For the OR logic to work, we’re looking for the rows of data where In exists in at least one column.

However, when there is a No Selection option selected, all values are In the set, and when we’re using the OR logic, we don’t want these.

So we need to identify when No Selection is selected in the filter

Slicer 1 is Not Selected?

[Selected Slicer 1 Values] = ‘No Selection’

and we need this for the other slicers too.

Then we need to build a new calculated field that is going to handle all this logic, which is driven by a pLogic parameter which simply contains the values of AND and OR.

In Combined Set

IF [pLogic]=’AND’ THEN [Select Values for Slicer 1:] AND [Select Values for Slicer 2:] AND [Select Values for Slicer 3:]
ELSE //It’s OR
IF [Slicer 2 Is Not Selected?] AND [Slicer 3 Is Not Selected?] THEN [Select Values for Slicer 1:]
ELSEIF [Slicer 1 Is Not Selected?] AND [Slicer 3 Is Not Selected?] THEN [Select Values for Slicer 2:]
ELSEIF [Slicer 1 Is Not Selected?] AND [Slicer 2 Is Not Selected?] THEN [Select Values for Slicer 3:]
ELSEIF [Slicer 3 Is Not Selected?] THEN [Select Values for Slicer 1:] OR [Select Values for Slicer 2:]
ELSEIF [Slicer 2 Is Not Selected?] THEN [Select Values for Slicer 1:] OR [Select Values for Slicer 3:]
ELSEIF [Slicer 1 Is Not Selected?] THEN [Select Values for Slicer 2:] OR [Select Values for Slicer 3:]
ELSE [Select Values for Slicer 1:] OR [Select Values for Slicer 2:] OR [Select Values for Slicer 3:]
END
END

This field returns a true or false against each row in the data set, and can be used to build the bar chart display.

This covers the most ‘complex’ bit of this challenge – below shows what one of the charts looks like

If you’re not familiar with how to use measure swapping, which is another feature of this challenge, then check out a previous blog I wrote.

I also created a field to add to the Tooltip to show a $ symbol in the event the measure selected was Sum of Sales.

My published solution is here.

Happy vizzin’! Stay Safe!

Donna

Can you create a variable width bar chart?

November 27, 2020 Donna ColesLeave a comment

For #WOW2020 Week 48, Jami Delagrange set the challenge based on a problem she’d been puzzling over for a while.

The data set was one Jami created herself about turkeys creating a ‘conference’ (it is Thanksgiving week in the US).

The above is basically 2 bar charts plotting the time an event happened (on the x-axis) against the number of turkeys attending the event (on the y-axis). One bar chart is coloured based on the location, the other is a white bar which has a width based on the duration of the event.

There are a couple of calcs needed for the basic data

# Turkeys

COUNTD([User Name (Original Name)])

Duration (mins)

DATEDIFF(‘minute’,[Start time],[End time])

With these we can get a basic set of data

The Start Time and End Time fields can be custom formatted to hh:mm AMPM.

The first bar chart we need is very simple, plotting Start Time against # Turkeys, coloured by Location.

For the second chart, we need to make use of the Size shelf, and the feature that allows you to fix the size based on a field. If we add Duration (mins), we don’t get what we need

The bars are far too wide. As the Start Time being plotted is at the ‘minute’ level, each unit of the axis is 1 minute of a day. There are 24 * 60 = 1440 minutes in a day. So the size needs to be based as a proportion of the number of minutes in a day ie

Duration Proportion of Day

[Duration (mins)] / (24 * 60)

Adding into our data table, you can see what these values translate to

and if we drop this onto the Size shelf instead, we get the required display

Build these charts as a dual axis, and you’ve got the output. Tableau’s own help article, discusses the Size feature a bit more.

My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Can you create a clustered histogram?

November 20, 2020 Donna ColesLeave a comment

Ann Jackson returned with this week’s #WOW2020 challenge, to create a ‘clustered histogram’, whereby the orders in 2020 were placed in ‘bins’ based on the sale value.

I found this challenge quite straightforward this week, as its very similar to a previous challenge I’ve already blogged about in week 23, which created a side by side bar chart by month.

So what are the main points for this blog

Creating the bins
Getting the bars side by side
Tooltips

Creating the bins

I first created an LoD to store the total value of the sale for an Order

Order Value

{FIXED [Order ID]: SUM([Sales])}

then I need to ‘bin’ this, but as the sales over $2000 needs to be lumped together, we can’t use the traditional binning functionality. We also don’t want to have a massive case statement to assign the values. Instead we can do a bit of maths…. we want to essentially round each order value to the nearest 100.

Round Up to 100

(CEILING([Order Value]/100) * 100)

So if the Order Value is 39 for example, when divided by 100 this will be 0.39. The CEILING function always rounds up to the nearest whole number, so in this case will return 1, which is then multipled by 100 to give us 100.

Doing this, every order value is then assigned a ‘bin’ of 100, 200, 300 etc

I then created

Sales Bin

IF [Round Up to 100] >2100 THEN 2100 ELSE [Round Up to 100] END

to apply the grouping of all the values which were greater than 2100 (since 1999 would be rounded up to 2000 and 2001 would be rounded up to 2100).

Getting the bars side by side

We have the Sales Bin measure and will also need to plot the count of orders

# Orders

COUNTD([Order ID])

Plotting this out and splitting by Segment we get

but we don’t want the segments stacked, we want them side by side.

The Sales Bin axis is ‘continuous’, which means a value can be plotted at any number along the line, it just happens to be at the ‘100’ marks as that’s where are bins are.

So we use a sort of ‘jittering’ to plot each bar at a slightly different value depending on the segment

SALE AMOUNT

CASE [Segment]
WHEN ‘Consumer’ THEN [Sales Bin] – 75
WHEN ‘Corporate’ THEN [Sales Bin]- 50
ELSE [Sales Bin]- 25
END

So for all the orders in the ‘100’ bin (ie the order value was between 1 and 100), all the Consumer orders will actually get plotted at 25, Corporate at 50 and Home Office at 75.

All this is explained in much more detail in the Week 23 blog post referenced at the top of this post.

Tooltips

The tooltip has different text depending on where you hover.

We need the lower value for the bin range (eg $0-100$), so I created

Round Up to 100 minus 100

[Round Up to 100]-100

And I then created 4 different ‘tooltip’ calculations which I could place on the tooltip to give me the display I needed:

TOOLTIP Upper

IF [SALE AMOUNT]<2000 THEN [Round Up to 100] END

TOOLTIP Lower

IF [SALE AMOUNT]<2000 THEN [Round Up to 100 minus 100] END

TOOLTIP between

IF [SALE AMOUNT]< 2000 THEN ‘ between’ END

TOOLTIP Symbol

IF [SALE AMOUNT]< 2000 THEN ‘ – ‘ ELSE ‘$2000+’ END

The final thing needed is to create a reference band to colour the section at the end, and fix the axis to start after 0 and end at 2099, so you don’t get 0 and 2100 displayed on the axis.

So a relatively short write up today – think this is the quickest blog I’ve written.

My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Can you build a Premier League table?

November 13, 2020 Donna ColesLeave a comment

Luke provided this fun challenge this week. It was nice to see an English sport that I’m very familiar with feature, even if I had to keep retyping the word ‘Draw’ for ‘Tie’ 🙂

This blog will cover

Re-modelling the data
Building the table & bar chart
Building the Last 5 matches chart
Getting the data for the Last 5 matches chart tooltip

Re-modelling the data

Sometimes the #WOW challenges are set with a very specific direction in mind eg no LODs, use 1 sheet only, no data modelling etc. This challenge had no real restrictions, and there were minimal clues. The Latest Challenges page indicated they’d be table calculations involved

and Luke’s tweet introducing the challenge suggested there was a ‘trick’ that would make this a lot simpler to solve…. hmmmm???

I looked at the data referenced to get a feel of the shape and layout – it showed 1 row per game, with the home & away team for each game stored in different columns. Trying to count the number of games a specific team played was likely to be complicated in this format. Given Luke had suggested a ‘trick’, I decided I was going to pivot the data to expand the data to give me 1 row per game per team, hoping this was the ‘trick’ he implied, and there was no requirement to state that data modelling wasn’t allowed.

I did this in Tableau itself, in the data source pane. After connecting to the data, highlight the columns Home Team and Away Team, right click and select Pivot.

This has the effect of duplicating the rows of data and adding 2 additional columns to the end Pivot Field Names containing the column names that had been pivoted, and Pivot Field Values containing the values of those fields.

I then simply renamed Pivot Field Names to Home or Away and Pivot Field Values to Team.

With the data structured this way it was very simple to calculate the various measures required :

Win-Loss-Draw

IF [Home or Away] = ‘Home Team’ AND FTR = ‘H’ THEN ‘W’
ELSEIF [Home or Away] = ‘Away Team’ AND FTR = ‘A’ THEN ‘W’
ELSEIF FTR = ‘D’ THEN ‘T’
ELSE ‘L’
END

FTR in the data set stands for full-time result and indicated whether it was the Home (H) or the Away (A) team that won, or whether it was a Draw (D). So this field indicates whether the Team in each row won, lost or drew (tied) the match.

From this we can then work out the points the Team gained in the match.

Points

IF [Win-Loss-Draw] = ‘W’ THEN 3
ELSEIF [Win-Loss-Draw] = ‘T’ THEN 1
ELSE 0
END

To determine the total number of Wins for each team I created

Wins

FIXED [Team]: SUM(IF [Win-Loss-Draw] = ‘W’ THEN 1 ELSE 0 END)}

and then repeated this similarly to get a Losses and Ties measure.

Finally I determined the total number of games each team played by using the provided tablename (count) field that is now added in more recent versions of v2020 (supersedes the Number of Records field).

Matches Played

{FIXED [Team]: (COUNT([2020_11_11_WW46_EPL.csv]))}

Building the table & bar charts

In my initial built, I created one sheet to show both the table summary with the bar chart, and a separate sheet to show the last 5 matches, with the intention the charts would be lined up side by side on the dashboard, removing padding to make the charts look to be one.

But with the table and bar on the same chart, I couldn’t get the labels of each column and the ‘Total Points’ to be aligned in the way displayed. So I ended up using 3 different sheets.

Table

Given we’ve already got all the measures, this was a very simple table, with the Team field sorted descending by the Points measure.

Bar Chart

Again this is a pretty basic chart with Team on Rows,(which is then hidden) and Points on Columns, and Win-Lose-Draw on Colour.

To label the end of the chart with the total points, I created an additional measure

Total Points

{FIXED [Team]: SUM([Points])}

which I added to Columns and made dual axis, with the Total Points being a Gantt type, and the mark being labelled.

Building the last 5 matches chart

This chart is slightly more complex, and where the table calculations clue comes into play.

In the data, a match for each Team is identified by the Date, but we can’t use Date to plot against as not everyone has their matches on the same date. We just want to number each match played by each team in order of the date played.

We can use a field Index = INDEX() to number our matches per team – you’ll see that some teams have played 8 matches while others 7.

But we only want the last 5 matches, which for some teams will be matches 3-7 and others 4-8.

To do this, we need to determine the number of matches played. In hindsight I should have realised, I’ve already got this in a variable, but as my head was in ‘table calc’ mode, I created a field Size = SIZE()

As you can see, this returns the number of rows for each team against every row for that team.

I then created another calculated field

Last 5 matches only

[Index] > [Size]-5

Adding this into the table, and verifying the table calc settings for both the nested table calcs are set as above, you can see the last 5 rows of each team are set to True

This can then be added to the Filter shelf to reduce the rows shown.

However, as some teams have played less matches than others, the Index values shown aren’t consistent for every team, and we need this.

So another field is required

Index to Plot

//want a value from 1 to 5 based on the index
[Index] – ([Size]-5)

This gives us the 1-5 sequence for each row we need to build the chart

The layout below is the basic premise.

The circles need to be coloured based on the Win-Loss-Draw field, and the labelled with the same field which is set to match mark colour. Reducing the transparency of the colour will fade the circle colour and make the label stand out.

Getting the data for the Last 5 matches chart tooltip

So just by pivoting the data, I was able to crack through the measures and the charts pretty quickly, and felt happy that I’d worked out Luke’s trick.

But then I came to adding the tooltip to the circle chart which needs information about the opposition, and the score.

The score is always displayed on the tooltip as

Team ‘s score – Opposition score.

For example when Arsenal played Sheffield United at Home where Arsenal won 2-1, the score is displayed on the tooltip as 2-1.

And when Arsenal played Manchester City Away, where Manchester City won 1-0, the score for Arsenal is displayed on the tooltip as 0-1.

So to get the team’s score I created

Team Score

IF [Home or Away]=’Home Team’ THEN [Fthg] ELSE [Ftag] END

Where FTHG is Full Time Home Goals and FTAG is Full Time Away Goals. So if the team is listed as the Home team, their score is identified by FTHG, but if they are the Away team, their score is identified by FTAG.

Correspondingly, we need

Opposition Score

IF [Home or Away]=’Away Team’ THEN [Fthg] ELSE [Ftag] END

If the team we’re interested in is listed as the Away Team, then the opposition will be the Home team, so FTHG identifies the goals they scored, otherwise the opposition is the Away team, so FTAG is the goals they scored.

Both of these fields may sound complicated, but it was retrievable in the data I had.

However the actual name of the Opposition team wasn’t – this was lost once I pivoted the data – the Opposition is now located on a completely different row of data and as we have no ‘match id’ or similar in the data, I can’t actually identify who the opposition might have been, since multiple matches can play on the same date.

This is where I started to question whether ‘my trick’ was the right one…

I had a bit of a think, and came up with a solution, but it was a bit ‘off piste’ for a #WOW challenge as it required blending… typically when blending is required they’d be some mention of it in the requirements, but since there was nothing stated to say it couldn’t be used, I went with it.

I added the original unpivoted data set to my workbook as an additional data source. I renamed this with the word ‘Home‘ on the end.

When I work with blending, I try to be specific which fields the data sources are going to be linked/blended with.

So in the pivoted data source I’d been working with for most of the challenge, I create a new field

BLEND – Team

[Team]

Then in the Home data source I’d added, I also created a field with the same name, but it contained a different field.

BLEND – Team

[Home Team]

I then added another instance of the original unpivoted data source to my workbook, this time renaming it with the word Away on the end. In this data source I created

BLEND – Team

[Away Team]

so from the 3 data sources, I could now build up a table to verify I could get to the data I needed:

Pivoted Data.Team
Pivoted Data.Date
Home Data.Away Team
Away Data.Home Team

and when blended to the secondary data sources, the links for both BLEND – Team and Date fields are enabled

From this data, we can see that either the Home Team column or the Away Team column is populated, and provides the name of the opposition. So this means we can then create in the primary data source (the pivoted data source)

Opposition

IF ISNULL(ATTR([2020_11_11_WW46_EPL – Home].[Away Team])) THEN ATTR([2020_11_11_WW46_EPL – Away].[Home Team])
ELSE ATTR([2020_11_11_WW46_EPL – Home].[Away Team])
END

If the Away Team is NULL, then store the Home Team value, else store the Away Team value

And from this we now have all the building blocks we need to finalise the Last 5 matches chart.

Just make sure that as you add fields to the chart, you may need to alter the various table calculations to ensure they are still computing correctly (typically all the fields other than Team should be selected in the table calc dialog). And also keep an eye on the fields being blended – BLEND – Team and Date should always be selected for both.

After I posted my solution up, I did indicate I thought I might have done something a bit ‘out of the ordinary’, but apparently my approach match Luke’s! That was a real surprise.

My published version is here.

Happy vizzin’! Stay Safe!

Donna

Can you show the top 10 products for each state?

November 6, 2020 Donna ColesLeave a comment

For this week’s #WOW2020 challenge, Sean Miller stripped things right back and went ‘back to basics’.

This blog should be brief as I’m only going to touch on the bits that I think some people might find a little tricky.

The Map Colours

Use the Red-Black Diverging colour palette, centred at 0 to ensure the colours match exactly (this is most noticeable on the Viz in Tooltip table if it’s not centred at 0).

Map Background

On the Map -> May Layers menu, ensure all the items under the Map Layers section are unchecked

Seaboard States

I used the MIN(1) on the Columns shelf and fixed the axis from 0-1 to fill it up.

Top 10 Products

Orders Count

I dragged Order ID into the ‘measures’ section (below the line on the left hand pane if you’re using later versions of Tableau), and chose the COUNTD aggregation. When I added this to the table, I then changed the alias of the field and called it ‘Orders’

Top 10

Add Product Name to the Filter shelf and select the Top tab.

Colouring the columns

This uses the Legend Per Measure functionality. Add Measure Values to the Colour shelf and select the Use Separate Legends option

This will add 3 colour legends onto the canvas. Set the colours of the Profit measure to the Red – Black diverging as with the map.

For the other 2 legends select any diverging colour palette, then click on the coloured square at each end, and select white from the palette displayed. Change the stepped colour to 2, and you’ll find that the measures now don’t look like they actually have a background colour.

Viz in Tooltip

When adding the sheet as a tooltip, I adjusted the size to 500×350

The size of the Top 10 Products sheet should be set to Entire View to ensure you don’t get a ‘View is too large to display’ message on the tooltip

Getting the Top 10 filtered properly

Once the viz has been added as a ‘viz in tooltip’ a State related filter pill will automatically be added to the Filter shelf of the the Top 10 Products sheet. To ensure the top 10 products gets filtered by the state BEFORE the top 10 products by sales are identified, the filter needs to be Added to Context

Arranging on the Dashboard

I managed to tile all the items, except for the ‘Eastern Seaboard States’ title which I floated.

My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Can you create a small multiple waterfall chart?

October 30, 2020 Donna ColesLeave a comment

For this week’s #WOW2020 challenge, Lorna Brown asked us to recreate a waterfall chart – a chart style that hasn’t featured in many previous challenges (if at all), and is always a useful one to know how to build.

I’m familiar with these, and this challenge didn’t cause me too many issues, so this blog is going to be brief.

Building the waterfall
Small multiple / grid layout
Adding the month label

Building the waterfall

So I’d built out the basic waterfall for each month and day by plotting Year(Order Date), Month(Order Date) and Day(Order Date) on Columns. The Day(Order Date) field was set to Show Missing Values so each day without an order was still plotted, and I was trying to figure out how to get the additional ‘long’ bar at the end.

I worked out it was essentially a ‘total’ bar and when I duplicated my data as crosstab and played round with the data in a tabular form, I got the subtotals I needed displayed.

But I seemed to be having issues displaying these on the chart view. So I turned to my usual route, Google, and had a search, and came across this blog from Tim Ryan at The Data School, which gives you the complete guide to building the waterfall, so there’s no need for me to repeat it all – thanks Tim! 🙂

My issue was I had a green continuous Day(Order Date) field rather than a blue discrete one – doh!

The only couple of things you need to make note of – you need to ensure you have 0 displayed for the missing dates

Actual Profit

ZN(SUM([Profit]))

and the gantt bars should be coloured red for negative profit, blue for positive and grey for the missing days

Colour

IF SUM([Profit])<0 THEN ‘Red’ ELSEIF SUM([Profit]) > 0 THEN ‘Blue’
ELSE ‘Grey’
END

Small multiple / grid layout

For this you need fields to add to the Rows and Columns shelf that position the month in the appropriate cell.

The Quarter(Order Date) (blue discrete) on Rows, is just used to define the row a month lands in.

You then need

Cols

IF MONTH([Order Date])%3 = 0 THEN 3
ELSE MONTH([Order Date])%3
END

which assigns each month a value of 1,2 or 3. You’ll need this on the Columns shelf.

Adding the month label

The label shows the month and the total profit in the month, so I created an LOD for this

Profit for Month

{FIXED YEAR([Order Date]),MONTH([Order Date]): SUM([Profit])}

From this I wanted the maximum value of all the monthly profits

Max Monthly Profit in Year

{FIXED Year([Order Date]): MAX([Profit for Month])}

I then used a dual axis to plot this field on the Rows, set the mark type to a line and set the opacity of the line colour to 0%, so it disappears.

The month and value were then added to the Label shelf and the label set to Label start of line only and also right aligned to get the required positioning.

And that’s it. I said this would be brief 🙂 My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Designing KPIs for Mobile

October 22, 2020October 22, 2020 Donna ColesLeave a comment

Luke Stanke returned for week 43 of #WOW2020 with a challenge focussed on building KPIs for mobile consumption.

In general this looked to be (and was) less taxing than some from previous weeks, but Luke did throw in some very specific requirements which did prove to be a bit tricksy.

To deliver this solution I built 8 sheets, 1 for each KPI heading and 1 for each bar chart. The dashboard then uses a vertical layout container to arrange the 8 objects in. A filter control on each bar chart determines whether the bar chart should ‘show’ or not. When a particular bar chart is displayed it fills up the space, which makes the display look to ‘expand’. Parameter actions are used to drive the ‘expand/collapse’ functionality.

The areas of focus for this blog are

Building the KPI chart
Formatting the Bar chart
Expand / Collapse function
Ensuring the KPI isn’t highlighted on selection
Making the display work for mobile

Building the KPI Chart

Because we have text on the left and the right, then I built this as a dual axis chart.

I’m going to build the Sales KPI.

I used MIN(1) on Columns, with Mark Type of bar, and fixed the axis to range from 0 to 1. SUM(Sales) is then added to Label, right aligned and formatted appropriately.

For the 2nd axis, we’re going to use MIN(0) positioned alongside MIN(1) on Columns, and this time, set the Mark Type to Gantt. I type the word ‘SALES’ into the Label field.

We also need to display a + or – icon on this label too. This will rely on a parameter that is going to be set.

Create a string parameter Selected Measure which is just empty.

For the icon, then create

Sales – Icon

IF [Selected Measure] = ‘SALES’ THEN ‘▬’ ELSE ‘✚’ END

I use this site to get the characters I use for these types of things.

Add Sales – Icon to the Label shelf of the MIN(0) axis, and position in front of the ‘SALES’ text.

If you show the Selected Measure parameter on the screen, and enter the word SALES, you’ll see the shape change to ‘-‘. Make this a dual axis chart and synchronise axis, and you should have the basis of the KPI (once all the axis have been hidden of course).

We also need to add an additional field onto the chart that we’ll need to use with the parameter action later.

Sales – String to pass

IF [Selected Measure] <> ‘SALES’ THEN ‘SALES’ ELSE ” END

When the Selected Measure parameter is empty (or contains another value), then this field will contain ‘SALES’ otherwise it’ll be blank.

Add this onto the Detail shelf of the ‘All’ Marks card (ie it needs to exist on both axes).

You essentially need to duplicate this sheet and build instances of the calculated fields for for Profit, Margin (SUM(Profit)/SUM(Sales)), and Customers (COUNTD(Customer ID)).

Formatting the Bar chart

Luke is very specific in his requirements that the axis tick for the bars displayed needs to be centred. This means there needs to be a continuous (green) pill on the date axis.

This very tiny requirement gave me a lot of grief 😦

I initially used MONTH(Order Date) (which will return numbers 1-12), but when I formatted the axis to First Letter, I get a D (for 0) and J (for 13) at each end.

Fixing the axis to start from 1-12 doesn’t work, as this chops off part of the first and last bars.

I tried a variety of mechanisms but to no avail. A chat with my fellow #WOW participant Rosario Gauna, who was also mulling over the issue eventually provided a solution, courtesy of Annabelle Rincon.

We need to use an ‘undocumented’ formatting option of mmmmm, which is the formatting for first letter. However, this doesn’t work with the above. I need to change the axis to use an actual date field (which was one of the mechanisms I’d already tried).

Baseline Date

DATETRUNC(‘month’,MAKEDATE(2019,MONTH([Order Date]), DAY([Order Date])))

This is taking every date in the data set, and ‘baselining’ it to all be on the same year (I chose 2019, but this could be any year you choose). The DATETRUNC is then setting all dates within the same month to be reported as being the 1st of the month.

Adding this to Columns instead as an exact date, and then formatting the axis and changing the Dates option to mmmmm will display the 1st letter

There is also a requirement to just show a label and tick mark for every other month, starting in Feb. You need to edit the axis, and on the Tick Marks tab

Set Major Ticked Marks to Fixed
Set the Tick Origin to 01 Dec 2018, and the interval to every 2 months (2 months after 1st Dec 2018 is 1st Feb 2019, so the F displays while the J for Jan doesn’t)
Set Minor Tick Marks to None

An additional ‘formatting’ setting I used on the bar charts was to fix the vertical axis to a number high enough to ensure the label always displayed after the bar and not in it. What this needs to be set to will vary and is only really noticeable once added to the dashboard, so may require some tweaking to get it correct.

Once again, you’ll need an instance of this bar chart for each of the measures Sales, Profit, Margin & Customers.

Expand / Collapse Function

On the dashboard, you need to use a vertical layout container, and place all the sheets in order underneath each other, so you have KPI, bar, KPI, bar etc.

Each bar chart needs a filter which will determine whether it shows any data or not. Create a calculated field

FILTER : Selected Measure

[Selected Measure]

On the Sales bar chart, make sure the Selected Measure parameter contains the value ‘SALES’, then add the Filter: Selected Measure to the Filter shelf and select the value SALES (it’ll be the only option available)

Then go to the Profit bar chart, and change the Selected Measure parameter to PROFIT. Now add the FILTER : Selected Measure to the Filter shelf and select the value PROFIT. If you switch back to the Sales bar, you’ll find the display is empty. Repeat this for the MARGIN and CUSTOMERS bar charts.

If you return to the dashboard you should find you probably have the 4 KPIs displayed and 1 bar chart showing. The final step is to add parameter actions.

For each KPI sheet, create a parameter action that targets the Selected Measure parameter by passing the relevant <Measure> – String to Pass field.

Ensuring the KPI isn’t highlighted on selection

I’ve used the True = False concept which I’ve applied multiple times to previous challenges.

Create a calculated field called True that just stores the value of True and and a calculated field called False that stores the value False. Add both of these to the Detail shelf of teach KPI viz.

Then on the dashboard, add a Filter action for each KPI sheet that goes from the KPI object on the dashboard to the sheet itself, passing the values Source: True = Target: False

Making the display work for mobile

This frustrated me no end. When I click on the Phone layout option (right image below), the layout container just doesn’t behave as it does on the Default view – the space for the bar charts is retained even when there’s no data.

I tried all sorts of combinations of containers to try to resolve this, and just couldn’t get it, and I struggled to find anything online that would help. I published to Tableau Public to test what the result would actually look like on my mobile, in the vain hope it might ‘just work’ but it didn’t 😦

When Sam Epley posted his solution, and I found it worked on mobile, I had to look. and found that you could just ‘delete’ the phone layout.

A couple of gems in this one that I need to store away for future! My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Can you build a Strava workout calendar?

October 16, 2020 Donna ColesLeave a comment

For 2020 Week 42, the #WoW founder, Andy Kriebel, returned with a challenge to reproduce the Strava training calendar. Compared to some challenges recently, this looked to be quite straight forward; Andy threw in some specific requirements to test certain features – ie no data modelling and no LoDs.

I’ve been doing #WorkoutWednesday challenges since they first started, so I know that Andy is a stickler for formatting and layout – points not necessarily listed as a requirement, just expected as part of the challenge to reproduce. I kept my fingers tightly crossed when I published that I’d got all the finer details, but alas, Andy still found fault – my month summaries weren’t right aligned (my bad – missed that little nuance completely), and my bars had borders on them… Andy must have eyes like Superman to have seen that, as it wasn’t obvious. It also wasn’t a setting I’d intentionally added. I later found out that adding a particular type of pill to the Detail shelf caused borders to automagically be added… There’s always something to learn when Andy’s about!

So onto the challenge – as with previous weeks, I’m going to try to focus on the areas that may be a bit trickier / newer to some rather than detail the complete build step by step.

Using the data sets – blending
Building the calendar grid
Ensuring a 0 measure value is displayed for missing days
Adding the monthly hours summary
Building the BANs
Year Filter control
Remove highlighting
Setting the colour of the Calendar chart background

Using the data sets – blending

Andy was very specific that the 2 data sets provided should be used separately and not joined in any via the data pane.

This meant the data sources would need to be blended (further detail on this is here). Blending used to be one of the only ways within Tableau you could combine data together.

When blending, the number of rows in your output will never be more than the number of rows in your primary* data source. If there are multiple matching rows in the secondary data source, then the results will be aggregated in the display.

* whatever data source the first pill you add to your canvas comes from, will be the primary, and is denoted by a small blue icon by your data source. Secondary data sources are denoted by a small orange icon.

In the case of this challenge, we had a data set containing a list of dates (1 row per day from 01 Jan 2014 up to 31 Dec 2021), along with Andy’s Strava activity, containing a row for each activity recorded, which included the date time the activity occurred. This data could vary in that there could be multiple activities on the same day, and equally days when no activity occurred at all.

So the Calendar data set is our primary data source, as we need to show a bar on the calendar chart for every day of the year, regardless if there’s any activity. The Activity data set is our secondary data source. The number of hours, number of activities etc can all be aggregated from this data set.

When blending data sources, especially on dates, I prefer to create explicit calculated fields that define the fields I want to blend on. So in the Calendar data source I created

BLEND: Date

[Date]

essentially just a duplicate of the existing Date field, and in the Activity data source, I also created

BLEND: Date

DATE([Date Time])

Note the fields are spelled exactly the same, so Tableau automatically uses them as the linking fields when the view is built.

If you now do the following

Add the Calendar.BLEND: Date field to the Filter shelf, and select the Year = 2020,
Add Calendar.BLEND: Date as an exact date to Rows
Add Activity.BLEND: Date as an exact date to Rows
Add Activity.Seconds to Text

You can see that fields from the secondary data source have an orange icon by them; and that there are Null/missing values for the records from the secondary data source as these were the days when there was no activity recorded. You can also see a red link icon against the BLEND: Date field in the left hand data source pane, as this identifies how the two data sets are being matched.

Building the calendar grid

The calendar is essentially a ‘small multiple’ layout with each month being positioned in a particular row or column. To build out this layout we need to define the row number and the column number. There are many ways to build a dynamic small multiple grid which can flex based on the number of items you might be trying to organise, but for the purpose of this exercise, we can keep it simple. We’re working with 12 months that are to be displayed in a 4 x 3 grid layout. Create the following calculated fields in the Calendar data source.

Rows

IF MONTH([Date])<=4 THEN 0
ELSEIF MONTH([Date]) <=8 THEN 1
ELSE 2
END

Cols

(MONTH([Date])-1)%4

I make both of these to be dimensions rather than measures by dragging them above the line on the left hand data source pane. If you build out the view as below, you can see how these calcs are working

As we want to show a mark for every day in the month, we need to add the day of the month from the Calendar data source to Columns. Drag Calendar.BLEND: Date to Columns, then select the drop down to change to the Day date part

We need to show the amount of time in hours rather than seconds. In the Activity data source, create the field

Hours

([Seconds]/60)/60

and drag this onto the Rows, and change the mark type to bar. If need be re-add the YEAR(BLEND: Date) = 2020 to the Filter shelf. Now add Calendar.BLEND: Date as an exact date to the Detail shelf. You should now have

where you can see the gaps in the days where no activities took place, and if you hover vertically, you should find that the days of the month are vertically aligned – ie 30th Jan aligns with 30th May etc.

Ensuring a 0 measure value is displayed for missing days

With the above we displayed the Activity.Hours field, but if you hover over the day when there is no activity, nothing displays on the tooltip rather then 0.

To fix this, create a calculated field in the primary Calendar data source

Hours

ZN(SUM([Sheet1 (Activities Summary)].[Hours]))

This is basically just referencing the field in the secondary blended data source, but wrapping in a ZN() function means it will display 0 when no match can be found

Use this field from the primary data source instead on the calendar viz.

Adding the monthly hours summary

The requirements meant Andy expected the summary to be displayed within the same sheet as the daily calendar viz.

For this I used an old friend MIN(0) to create another axis, which is placed on the Rows in front of the Hours measure.

What I now plan to do is set this axis to be Text and plot the month, monthly hours, and the word ‘hours’ at a specific point to the right of the each cell – I’m choosing day 28 – you might want to experiment and choose a different day.

First up though, I need to build some fields to plot.

Month Name Abbrev

IF DAY([Date]) = 28 THEN
UPPER(LEFT(DATENAME(‘month’,[Date]),3))
END

Hours in Month

IF MIN(DAY([Date])) = 28 THEN
WINDOW_SUM([Hours])
END

LABEL: Hours

IF DAY([Date]) = 28 THEN ‘HOURS’ END

Month

DATE(DATETRUNC(‘month’,[Date]))

Add Month as an exact discrete date to Detail and the other 3 fields to the Text shelf of the Min(0) marks card (change the mark type to Text if you haven’t already done so). Alter the table calculation setting of the Hours in Month field to compute by all fields except Month

Building the BANs

These use a similar concept as above, by using 3 instances of MIN(0) placed side by side on the Columns shelf and set to the Text mark type. This creates 3 marks cards which you can then add the relevant measures and text on.

The measures are all coming from fields in the primary data source that reference measures in the seconday data source ie

# Activities (in Activity data source)

COUNT([Activity ID])

# Activities (in Calendar data source)

ZN([Sheet1 (Activities Summary)].[# Activities])

#Miles (in Calendar data source)

ZN(SUM([Sheet1 (Activities Summary)].[Miles]))

Year Filter Control

All the sheets you are building need to be filtered by the same BLEND: Date field from the Calendar data source (set the filter to Apply to all worksheets).

When this field is added to the dashboard, you can customise it so the All values does not show and the slider control also doesn’t display

Remove highlighting

To stop items on the dashboard from highlighting when they are clicked on, I use a trick that has been probably been the ‘most used trick of #WOW2020’ 🙂

In the primary data source, create a field called True which contains the value TRUE and a field False containing the value FALSE. Add both these fields to the Detail shelf of each sheet you don’t want highlighting on.

On the dashboard, create a Filter URL action for the each sheet that goes from the sheet on the dashboard to the sheet itself, and passes selected fields setting true = false. As this condition will never be true, then there is nothing to ‘filter’ so the marks don’t highlight. This needs to be repeated for each sheet on the dashboard, so I had 3 filter dashboard actions.

NOTE – a consequence of adding the True and False fields to the Detail shelf on the bar sheets, was that it caused a border to be added around the bars.

This wasn’t something I noticed, as it isn’t at all obvious, but Andy called it out!

Setting the colour of the Calendar chart background

You need to format the sheet and set the fill colour of the Pane rather than the whole sheet to grey.

There’s obviously a lot of other formatting settings to apply to get rid of all the row/column borders and gridlines etc, but this was a slight difference that I wanted to call out, as ended up with a ‘border’ on my dashboard that wasn’t required when I set the whole worksheet background.

Right, I think that’s about it for this week! Thanks for the fun challenge Andy – great to have you back!

My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Donna + DataViz

ramblings in all things Tableau (and occasionally some other Stuff….)

Category: #WOW2020

Can you build a divergent histogram?

Can you highlight profits with Measure Names?

Can you toggle between AND|OR filtering logic?

Can you create a variable width bar chart?

Can you create a clustered histogram?

Can you build a Premier League table?

Can you show the top 10 products for each state?

Can you create a small multiple waterfall chart?

Designing KPIs for Mobile

Can you build a Strava workout calendar?