# Can you build a Premier League table?

Luke provided this fun challenge this week. It was nice to see an English sport that I’m very familiar with feature, even if I had to keep retyping the word ‘Draw’ for ‘Tie’ 🙂

This blog will cover

• Re-modelling the data
• Building the table & bar chart
• Building the Last 5 matches chart
• Getting the data for the Last 5 matches chart tooltip

Re-modelling the data

Sometimes the #WOW challenges are set with a very specific direction in mind eg no LODs, use 1 sheet only, no data modelling etc. This challenge had no real restrictions, and there were minimal clues. The Latest Challenges page indicated they’d be table calculations involved

and Luke’s tweet introducing the challenge suggested there was a ‘trick’ that would make this a lot simpler to solve…. hmmmm???

I looked at the data referenced to get a feel of the shape and layout – it showed 1 row per game, with the home & away team for each game stored in different columns. Trying to count the number of games a specific team played was likely to be complicated in this format. Given Luke had suggested a ‘trick’, I decided I was going to pivot the data to expand the data to give me 1 row per game per team, hoping this was the ‘trick’ he implied, and there was no requirement to state that data modelling wasn’t allowed.

I did this in Tableau itself, in the data source pane. After connecting to the data, highlight the columns Home Team and Away Team, right click and select Pivot.

This has the effect of duplicating the rows of data and adding 2 additional columns to the end Pivot Field Names containing the column names that had been pivoted, and Pivot Field Values containing the values of those fields.

I then simply renamed Pivot Field Names to Home or Away and Pivot Field Values to Team.

With the data structured this way it was very simple to calculate the various measures required :

Win-Loss-Draw

IF [Home or Away] = ‘Home Team’ AND FTR = ‘H’ THEN ‘W’
ELSEIF [Home or Away] = ‘Away Team’ AND FTR = ‘A’ THEN ‘W’
ELSEIF FTR = ‘D’ THEN ‘T’
ELSE ‘L’
END

FTR in the data set stands for full-time result and indicated whether it was the Home (H) or the Away (A) team that won, or whether it was a Draw (D). So this field indicates whether the Team in each row won, lost or drew (tied) the match.

From this we can then work out the points the Team gained in the match.

Points

IF [Win-Loss-Draw] = ‘W’ THEN 3
ELSEIF [Win-Loss-Draw] = ‘T’ THEN 1
ELSE 0
END

To determine the total number of Wins for each team I created

Wins

FIXED [Team]: SUM(IF [Win-Loss-Draw] = ‘W’ THEN 1 ELSE 0 END)}

and then repeated this similarly to get a Losses and Ties measure.

Finally I determined the total number of games each team played by using the provided tablename (count) field that is now added in more recent versions of v2020 (supersedes the Number of Records field).

Matches Played

{FIXED [Team]: (COUNT([2020_11_11_WW46_EPL.csv]))}

Building the table & bar charts

In my initial built, I created one sheet to show both the table summary with the bar chart, and a separate sheet to show the last 5 matches, with the intention the charts would be lined up side by side on the dashboard, removing padding to make the charts look to be one.

But with the table and bar on the same chart, I couldn’t get the labels of each column and the ‘Total Points’ to be aligned in the way displayed. So I ended up using 3 different sheets.

Table

Given we’ve already got all the measures, this was a very simple table, with the Team field sorted descending by the Points measure.

Bar Chart

Again this is a pretty basic chart with Team on Rows,(which is then hidden) and Points on Columns, and Win-Lose-Draw on Colour.

To label the end of the chart with the total points, I created an additional measure

Total Points

{FIXED [Team]: SUM([Points])}

which I added to Columns and made dual axis, with the Total Points being a Gantt type, and the mark being labelled.

Building the last 5 matches chart

This chart is slightly more complex, and where the table calculations clue comes into play.

In the data, a match for each Team is identified by the Date, but we can’t use Date to plot against as not everyone has their matches on the same date. We just want to number each match played by each team in order of the date played.

We can use a field Index = INDEX() to number our matches per team – you’ll see that some teams have played 8 matches while others 7.

But we only want the last 5 matches, which for some teams will be matches 3-7 and others 4-8.

To do this, we need to determine the number of matches played. In hindsight I should have realised, I’ve already got this in a variable, but as my head was in ‘table calc’ mode, I created a field Size = SIZE()

As you can see, this returns the number of rows for each team against every row for that team.

I then created another calculated field

Last 5 matches only

[Index] > [Size]-5

Adding this into the table, and verifying the table calc settings for both the nested table calcs are set as above, you can see the last 5 rows of each team are set to True

This can then be added to the Filter shelf to reduce the rows shown.

However, as some teams have played less matches than others, the Index values shown aren’t consistent for every team, and we need this.

So another field is required

Index to Plot

//want a value from 1 to 5 based on the index
[Index] – ([Size]-5)

This gives us the 1-5 sequence for each row we need to build the chart

The layout below is the basic premise.

The circles need to be coloured based on the Win-Loss-Draw field, and the labelled with the same field which is set to match mark colour. Reducing the transparency of the colour will fade the circle colour and make the label stand out.

Getting the data for the Last 5 matches chart tooltip

So just by pivoting the data, I was able to crack through the measures and the charts pretty quickly, and felt happy that I’d worked out Luke’s trick.

But then I came to adding the tooltip to the circle chart which needs information about the opposition, and the score.

The score is always displayed on the tooltip as

Team ‘s score – Opposition score.

For example when Arsenal played Sheffield United at Home where Arsenal won 2-1, the score is displayed on the tooltip as 2-1.

And when Arsenal played Manchester City Away, where Manchester City won 1-0, the score for Arsenal is displayed on the tooltip as 0-1.

So to get the team’s score I created

Team Score

IF [Home or Away]=’Home Team’ THEN [Fthg] ELSE [Ftag] END

Where FTHG is Full Time Home Goals and FTAG is Full Time Away Goals. So if the team is listed as the Home team, their score is identified by FTHG, but if they are the Away team, their score is identified by FTAG.

Correspondingly, we need

Opposition Score

IF [Home or Away]=’Away Team’ THEN [Fthg] ELSE [Ftag] END

If the team we’re interested in is listed as the Away Team, then the opposition will be the Home team, so FTHG identifies the goals they scored, otherwise the opposition is the Away team, so FTAG is the goals they scored.

Both of these fields may sound complicated, but it was retrievable in the data I had.

However the actual name of the Opposition team wasn’t – this was lost once I pivoted the data – the Opposition is now located on a completely different row of data and as we have no ‘match id’ or similar in the data, I can’t actually identify who the opposition might have been, since multiple matches can play on the same date.

This is where I started to question whether ‘my trick’ was the right one…

I had a bit of a think, and came up with a solution, but it was a bit ‘off piste’ for a #WOW challenge as it required blending… typically when blending is required they’d be some mention of it in the requirements, but since there was nothing stated to say it couldn’t be used, I went with it.

I added the original unpivoted data set to my workbook as an additional data source. I renamed this with the word ‘Home‘ on the end.

When I work with blending, I try to be specific which fields the data sources are going to be linked/blended with.

So in the pivoted data source I’d been working with for most of the challenge, I create a new field

BLEND – Team

[Team]

Then in the Home data source I’d added, I also created a field with the same name, but it contained a different field.

BLEND – Team

[Home Team]

I then added another instance of the original unpivoted data source to my workbook, this time renaming it with the word Away on the end. In this data source I created

BLEND – Team

[Away Team]

so from the 3 data sources, I could now build up a table to verify I could get to the data I needed:

• Pivoted Data.Team
• Pivoted Data.Date
• Home Data.Away Team
• Away Data.Home Team

and when blended to the secondary data sources, the links for both BLEND – Team and Date fields are enabled

From this data, we can see that either the Home Team column or the Away Team column is populated, and provides the name of the opposition. So this means we can then create in the primary data source (the pivoted data source)

Opposition

IF ISNULL(ATTR([2020_11_11_WW46_EPL – Home].[Away Team])) THEN ATTR([2020_11_11_WW46_EPL – Away].[Home Team])
ELSE ATTR([2020_11_11_WW46_EPL – Home].[Away Team])
END

If the Away Team is NULL, then store the Home Team value, else store the Away Team value

And from this we now have all the building blocks we need to finalise the Last 5 matches chart.

Just make sure that as you add fields to the chart, you may need to alter the various table calculations to ensure they are still computing correctly (typically all the fields other than Team should be selected in the table calc dialog). And also keep an eye on the fields being blended – BLEND – Team and Date should always be selected for both.

After I posted my solution up, I did indicate I thought I might have done something a bit ‘out of the ordinary’, but apparently my approach match Luke’s! That was a real surprise.

My published version is here.

Happy vizzin’! Stay Safe!

Donna