How are EPL Football Teams performing this season compared to previous seasons?

January 17, 2025 Donna ColesLeave a comment

I set this week’s challenge and I tried to deliver something to hopefully suit everyone wherever they are on their Tableau journey. The primary focus for this challenge is on table calculations, but there’s a few other features/functionality included too.

I love football – all the family are involved in it in some way, so thought I’d see what I could get from a different data set this week. The data set contains the results of matches played in the English Premier League for the last five seasons – from the 2020-21 season to the current season 2024-25. Matches up to the end of 2024 only are included.

As I wrote the challenge in such a way that should allow it to be ‘built upon’ from the beginner challenge, I’m going to author this blog in the same way. I’ll describe how to build the beginner challenge first, then will adapt /add on to that solution. Obviously, as in most cases, this is just how I built my solution – there may well be other ways to achieve the same result.

Beginner Challenge

Modelling the data

The data set provided displays 1 row per match with the teams being displayed in the Home and the Away columns respectively, and the FTR (full time result) column indicating if the Home team won (H), the Away team won (A) or if the match was a draw (D).

The first thing we need to do is pivot this data so we have 2 rows per match, with a column displaying the Team and another indicating if the team is the Home or Away team. To do this, in the data source window, select the Home and the Away columns (Ctrl-Click to multi-select – they’ll be highlighted in blue), then from the context menu on either column, select the Pivot option.

This will duplicate the rows, and generate two new fields called Pivot Field Names and Pivot Field Values. Rename the fields by double-clicking into the field heading

Pivot Field Names to Home or Away
Pivot Field Values to Team

Creating the points calculation

We need to determine how many points each team gained out of the match which is based on whether they won (3pts), lost (0pts) or drew (1pt). Create a calculated field

Pts

IF [Home or Away] = ‘Home’ AND [FTR] = ‘H’ THEN 3
ELSEIF [Home or Away] = ‘Away’ AND [FTR] = ‘A’ THEN 3
ELSEIF [FTR] = ‘D’ THEN 1
ELSE 0
END

Creating the cumulative points line chart

We want to create a line chart that displays the cumulative number of points each team has gained by each week in each season.

Start by adding Wk to Columns and Pts to Rows as these are the core 2 fields we want to plot. But we need to split this by Team and by season, so add Team and Season End Year to the Detail shelf.

This gives us all the data points we need, but at the moment, it’s currently just showing how many points each team gained per week in each season.

To get the cumulative value, we add a running total quick table calculation to the Pts field.

which gives us the display we need

While we can leave the SUM(Pts) field as is in the Rows, I tend to like to ‘bake’ this field into the data set, so I have a dedicated field representing the running total. I can create this field in 2 ways

Create a calculated field called Cumulative Points Per Season which contains the text RUNNING_SUM(SUM([Pts])). Add this field to Rows instead of just Pts.
Hold down Ctrl and then click on the SUM([Pts]) pill in Rows and drag into the left hand data pane and then release the mouse. This will automatically create a new field which you can rename to Cumulative Points Per Season. The field will already contain the text from the calculation used within the quick table calculation, and the field on Rows will automatically be updated.

Filtering the data

Add Team to the Filter shelf and when prompted just select a single entry eg Arsenal.

Show the Filter control. From the context menu, change the control to be a single value (list) display, and then select customise and uncheck the show all value option, so All can’t be selected.

Format the line

To make the line stepped, click the path button and select the step option

To identify the latest season without hardcoding, we can create

Is Latest Season

[Season End Year] = {MAX([Season End Year])}

The {MAX([Season End Year])} is a FIXED LOD (level of detail) calculation, and is a shortened notation for {FIXED: MAX([Season End Year])} which basically returns the latest value of Season End Year across every row in the data set. This calculation returns true if the Season End Year value of the row matches the overall value.

Add this field to the Colour shelf and adjust colours to suit. Also add the same field to the Size shelf

The line for the latest season is being displayed ‘behind’ the other seasons. To fix this, drag the True value in either the colour or size legend to be listed before the False value. Then edit the sizes from the context menu of the size legend, and check the reversed checkbox to make True the thicker line.

If the thick line seems too thick, adjust the mark size range to get it as you’d prefer

Label the line

Add Season End Year to the Label shelf. Align middle right. Allow labels to overlap marks. And match mark colour.

Create the Tooltip

We need an additional calculated field for this, as we want to display the season in the tooltip in the format 2024-2025 and not just the ending year of the season.

Season Start Year

[Season End Year]-1

Add this field to the Tooltip shelf, and then edit the Tooltip to build up the text in the required format. Use the Insert button on the tooltip to add referenced fields.

Final Formatting

To tidy up the display we want to

Change the axis titles
- Right click on each axis and Edit Axis then adjust the title (or remove altogether if you don’t want a title to display)
Remove gridlines and axis ruler
- Right click anywhere on the chart canvas, and select Format. In the left hand pane, select the format lines option and set grid Lines to None and axis rulers to None.

Set all text to dark purple
- Select the Format menu option at the top of the screen and select Workbook. The under the All fonts section, change the colour to that required

Update the title to reference the selected team
- double click in to the title of the sheet and amend accordingly, using the Insert option to add relevant fields.

Intermediate Challenge

For this part of the challenge we’re looking at using a dual axis to display another set of marks – these ones are circular and only up to 1 mark per season should display. As this now takes a bit more thought, and to help verify the calculations required, I’m going to build up the calculations I need in a tabular form first.

Defining the additional mark to plot

On a new sheet add Team, Season End Year and Wk to Rows. Set the latter 2 fields to be discrete (blue) pills. Add Cumulative Points Per Season to Text. Add Team to Filter and select Arsenal.

We need to identify the date of the last match played by each team, so we can use an LOD for this

Latest Date Per Team

{FIXED [Team] : MAX([Date])}

Add this to Rows as a discrete (blue pill) exact date. For Arsenal, the last match was on 27 Dec 2024, whereas for Chelsea it’s 22 Dec 2024.

With this, we can work out what the latest points are for each team in the current season.

Latest Points Per Team

WINDOW_MAX(IF MIN([Date]) = MIN([Latest Date Per Team]) THEN [Cumulative Points Per Season] END)

Breaking this down : the inner part of the statement says “if the date associated to the row matches the latest date, then return the points associated with that row”. Only 1 row in the table of data has a value at this point, all the rest of the rows are ‘nothing’. The WINDOW_MAX statement, then essentially ‘floods’ that value across every row in the data, because the ‘value’ returned by the inner statement is the maximum value (it’s higher than nothing). Add this field into the table.

We’re trying to identify the week in each season where the points are at least the same as the latest points. We’re going to capture the previous week’s points against every row.

Previous Points

LOOKUP([Cumulative Points Per Season],-1)

This is a table calculation that returns the value of the Cumulative Points Per Season from the previous row (-1). If we wanted the next row, the function parameter would be 1. 0 identifies the ‘current row’.

Add this to the table.

We can see the behaviour – The Previous Points associated to 2025 week 18 for Arsenal is 33, which is the value associated to Cumulative Points Per Season for week 17. But we can also see that week 38 from season 2024 is being reported as the previous points for week 1 of season 2025, which is wrong – we don’t want a previous value for this row.

To resolve, edit the table calculation of the Previous Points field and adjust so the calculation for Previous Points is just computing by the Wk field only.

With this we can identify the week in each season we want to ‘match’ against. In the case of the latest season, we just want the last row of data, but for previous seasons, we want to identify the first row where the number of points was at least the same as the latest points; the row where the points in the row are the same or greater than the latest points, and the points in the previous row are less.

Matching Week

//for latest season, just label latest record, otherwise find the week where the team had scored at least the same number of points as the current season so far
IF MIN([Is Latest Season]) AND LAST()=0 THEN TRUE
ELSE
IF NOT(MIN([Is Latest Season])) AND [Cumulative Points Per Season]>= [Latest Points Per Team] AND [Previous Points] < [Latest Points Per Team] THEN TRUE
ELSE FALSE
END
END

Add this to Rows and check the data. In the example below, in Season 2020-21, in week 26, Arsenal had 37 points. The previous week they had 34 points. Arsenal’s latest points are 36, so since 37 >=36 and 34 < 36, then week 26 is the matching week.

Looking at season 2023-24, in both week 15 and 16, Arsenal had 36 points. But only week 15 is highlighted as the match, as in week 14, Arsenal had 33 points so 36 >=36 and 33 <36, but for week 16, as the previous week was also 36, the 2nd half of the logic isn’t true : 36 is not less than 36.

So now we’ve identified the row in each season we want to display on the viz, we need to get the relevant points isolated in their own field too.

Matching Week Points

IF [Matching Week] THEN [Cumulative Points Per Season] END

Add this to the table

We now have data in a field we can plot.

Visualise the additional mark

If you want to retain your ‘Beginner’ solution, then the first step is to duplicate the Beginner worksheet, other wise, just build on what you have.

Add Matching Week Points to Rows to create an additional axis. By default only 1 mark may have displayed. Adjust the table calculation setting of the field, so the Latest Points Per Team calculation is computing by all fields except the Team field.

Change the mark type of the Matching Week Points marks card to Circle and remove Season End Year from the Label shelf (simply drag the pill against the T symbol off off the marks card)

Size the circles

We want the circles to be bigger, but if we adjust the Size, the lines change too, as the Size is being set based on the Is Latest Year pill on both marks cards. To resolve this, create a duplicate instance of Is Latest Year (right click pill and duplicate). This will automatically create Is Latest Season (copy) in the dimensions pane. Drag this onto the size shelf of the Matching Week Points marks card instead to make the sizing independent (you will probably find you need to readjust the table calculation for the Matching Week Points pill to include Is Latest Season (copy)). Then adjust the sizes as required.

Label the circles

Add Latest Points Per Team to the Label shelf of the Matching Week Points marks card. Adjust the table calculation setting, so the Latest Points Per Team calc is computing just by Wk only, so only the latest value displays.

Then format the Label so the text is aligned middle centre, is white & bold and slightly larger font.

Adjust the Tooltip text on the Matching Week Points mark, so it reads the same as on the line mark. You will need to reference the Matching Week Points value instead.

Then make the chart dual axis and synchronise the axis. Remove the Measure Names field that has automatically been added to the All marks card

Remove Label for Latest Season

We don’t want the season label to display for the current year, so create a new field

Label:Season

IF NOT([Is Latest Season]) THEN [Season End Year] END

Replace the Season End Year pill on the Label shelf of the Cumulative Points Per Season marks card, with this one instead.

Final Formatting

To tidy up

Remove right hand axis
- right click on axis and uncheck show header
Remove 165 nulls indicator
- right click on indicator and hide indicator
Remove row & column dividers
- Right click on the canvas and Format. Select Format Borders and set Row and Column Divider values to None

Advanced Challenge

For the final part of the challenge we want to add some additional text to the tooltip and adjust the filter control. As before either duplicate the Intermediate challenge sheet or just build on.

We’ll start with the tooltip text.

Expand the Tooltip Text

For this, we’ll go back to the data table sheet we were working with to validate the calculations required.

We want to calculate the difference in the number of weeks between the latest week of the current season, and the week number of the matching record from previous seasons. So first, we want to identify the latest week of the current season, and ‘flood’ that over every row.

Latest Week Number Per Team

WINDOW_MAX(MAX(IF ([Date]) = ([Latest Date Per Team]) THEN [Wk] END))

This is the same logic we used above when getting the Latest Points Per Team, although as this time the Wk field isn’t already an aggregated field like Cumulative Points Per Season is, we have to wrap the conditional statement with an aggregation (eg MAX) before applying the WINDOW_MAX.

Add this to the table.

And then we need, the week number in each season where the match was found, but this needs to be spread across every row associated to that season.

Matching Week No Per Team and Season

WINDOW_MAX(IF [Matching Week] THEN MIN([Wk]) END)

Add to the table, but adjust the table calculation so the field is computing by all fields except the Season End Year.

Now we have these 2 fields, we can compute the difference

Week Difference

[Latest Week Number Per Team] – [Matching Week No Per Team and Season]

Add to the table. With this we can then define what this means for the text in the tooltip

Less/More

IF [Week Difference]<0 THEN ‘more’ ELSEIF [Week Difference]>0 then ‘less’
ELSE ”
END

and also build the tooltip text completely

Tooltip- other seasons

IF NOT(MIN([Is Latest Season])) THEN
IF [Week Difference] = 0 THEN ‘It took the same amount of weeks to accrue at least the same number of points as the current season’
ELSE ‘It took ‘ + STR(ABS([Week Difference])) + ‘ weeks ‘ + [Less/More] + ‘ to accrue at least the same number of points as the current season’
END
END

Add this to the Tooltip shelf of the Matching Week Points marks card. Adjust the Tooltip to reference the field.

Adjust the Tooltip – other seasons table calculation, so the Latest Points Per Team nested calculation is computing for all fields except Team (this will make the circles reappear)

and also adjust the Latest Week Number Per Team nested calculation to compute by all fields except Team. This should make the text in the tooltip appear.

Filtering by teams in the current season only

We need to get a list of the teams in the current season only, which we can define by

Filter Team

IF {FIXED [Team]: MAX([Season End Year])} = {MAX([Season End Year])} THEN [Team] END

Breaking this down: {FIXED [Team]: MAX([Season End Year])} returns the maximum season for each team, which is compared against the maximum season in the data set. So if there is a match, then name of the team is returned. Listing this out in a table we get Null for all the teams whose latest season didn’t match the overall maximum

We can then build a set off of this field. Right click on the Filter Team field Create > Set and click the Null field and select Exclude

Then add this to the Filter shelf, and the list of teams displayed in the Team filter display will be reduced.

And with that the challenge is completed. My published viz is here.

Happy vizzin’!

Donna

Can you replicate quick table calcs?

January 3, 2025 Donna ColesLeave a comment

Erica kicked off the 1st year of #WOW2025 with a table calculation based challenge asking us not to just use quick tableau calculations, to avoid the use of the RUNNING_SUM table calculation and to just create 2 calculated fields.

Defining the calculations

On a new sheet, add Order Date to Rows as a discrete (blue) pill at the Month-Year format and add Sales to Text.

For the running total, we want

Running Total

SUM([Sales]) + PREVIOUS_VALUE(0)

which takes the Sales from the current row, and adds it to the value of the cumulative Sales (ie this calculation) in the previous row

For the moving average, we want

3 Month Moving Average

WINDOW_AVG(SUM([Sales]), -2, 0)

which averages the Sales for the current row plus the previous 2 rows (ie 3 rows in total)

Building the Viz

ON a new sheet add Order Date as a continuous (green) pill at the Month-Year format to Columns and Running Total to Rows.

Change the mark type to Area, and set the Colour to #a16eaf with a 25% Opacity.

Add Order Date to Tooltip, and set to the MIN aggregation. Format the pill on the Tooltip shelf to have the <month year> format when displayed on the pane.

Add another instance of Running Total to Rows. Set the Mark type to be Line and reset the opacity on the colour shelf to be 100%. Make the chart dual axis and synchronise the axis.

Add 3 Month Moving Average to Rows. Set the Colour to be #67c79c at 25% opacity.

The add another instance of 3 Month Moving Average to Rows. Set the colour to #67c79c and the mark type to line and increase opacity to 100%. Make dual axis and synchronise the axis.

Update the Tooltip on the All Marks card, referencing the MIN(Order Date) pill and adding $ before the values

Edit the Order Date axis, to start from 01 Jan 2021 and end on 31 Dec 2024. Remove the axis title.

Format the Order Date axis, and set the Dates on the scale to be formatted with the MMMMM notation to just show the first letter of the month

Add a reference line to the Order Date axis, set to be a constant value of 01 Jan 2021, with a custom label of 2021 and the line is formatted to be a white dashed line of 100%

Then format the reference line so the label is aligned top right

Repeat this step 3 more times, adding Reference Lines for 01 Jan 2022 (labelled 2022), 01 Jan 2023 (labelled 2023) and 01 Jan 2024 (labelled 2024).

Finally, hide the right hand axis (uncheck show header), remove column dividers, but show the axis rules on the rows.

And that should complete the challenge – just pop the viz onto a dashboard and publish. My published version is here.

Happy vizzin’!

Donna

Sleep Survey Marimekko

May 3, 2024 Donna ColesLeave a comment

Week 18 2024, was the #Data24 TC edition, which we shared live during our session at TC. It was so wonderful to see so many people at our session, and to be able to meet one of our new coaches, Yusuke Nakanishi, in person.

Sean set the challenge to recreate a Marimekko chart based on some sleep survey data.

Let’s dive in…

Setting up the data

The data provided contained a row per respondent indicating the hours slept and their responses to 4 questions. For this viz, we only care about the data associated to the Wake Up Well Rested and Sleep Hours fields.

The hours of sleep is grouped into custom bins

Hours Bin

IF [Sleep Hours] <= 4 THEN “<= 4” ELSEIF [Sleep Hours] >= 10 THEN “10+”
ELSE STR([Sleep Hours])
END

On a new sheet add Hours Bin to Rows and Wake up Well rested to Columns and Sleep Survey.csv (Count) to Text. Manually adjust the ordering of the Hours Bin field, so it is displayed in the correct sequence. This gives us the number of respondents per bin per response.

For the y-axis, we’re looking to display the % of total of respondents for each bin

% of Total

COUNT([Sleep Survey.csv]) / TOTAL(COUNT([Sleep Survey.csv]))

Add this into the table, and adjust the table calculation to ensure it is computing by Wake up Well rested only

This should mean each row adds up to 100% (you can check by displaying row grand totals).

Now for the x-axis, we need to plot a point based on how wide each bar needs to be. This width is based on the total number of respondents in each Hours Bin

Count People per Bin

{FIXED [Hours Bin]: COUNT([Sleep Survey.csv])}

Add to the table, and you should see the values are the same for each Hours Bin regardless of the response column.

But if we plotted at these points, we won’t get the points in the right place, we need a further step… we need to plot at the cumulative values. So add a Running Total Quick Table Calculation to the Count People per Bin field, that is set to compute by the Hours Bin only.

We now have the core fields we need to build the chart.

Building the Marimekko chart

On a new sheet, add Hours Bin to Detail. Add % of Total to Rows and Count People per Bin to Rows. Manually sort the Hours Bin field so that it is logically ordered.

Add Wake up Well Rested to Colour and adjust accordingly, and manually reorder the values in the colour legend.

Adjust the table calculation of the % of Total field, so that is computing by Wake up Well rested only – all the bars should extend to 100%

Then adjust the table calculation of the Count People per Bin Running Total so that its is computing by Hours Bin only – the bars should shift, into their expected logical position based on the Hours Bin ordering – ie <=4 is the first bar and 10+ is the last.

Now to get the bars to be wider, add Count People per Bin to the Size shelf, then as we have measures (green pills) on both the rows and column shelves, we can set the size to be Fixed and Right aligned, to get the display we require

Hide the Tooltips, hide the bottom axis, and remove the title of the y-axis, and you should have the required display.

When I added to the dashboard, I just used text boxes carefully positioned to simulate the labelling of the x-axis, and then I floated text boxes on top of the chart to label the sections.

My published viz is here.

Happy vizzin’!

Donna

Can you build this segmented bar chart?

March 28, 2024 Donna Coles1 Comment

My #WOW2024 challenge this week, was to recreate this bar chart which displayed the total value in conjunction with the values split by another dimension, in this instance year. It was inspired by Sam Parsons’ film franchises viz.

The viz is built on a single sheet, and only uses 3 fields from the data set, the Order Date the Sub-Category and the Sales value.

Let’s build…

On a new sheet, add Sub-Category to Rows and Sales to Columns and sort by Sales descending. Add Order Date to Colour to build a basic stacked bar chart.

We need to be able to show two bars directly ‘side by side’, but with different properties, so we need to end up with 2 marks cards. The ‘heights’ of the bars need to differ, so we’re going to create a ‘fake axis’ on Rows to help.

Double click into the Rows shelf and manually type MIN(0.5)

Change the mark type to bar. Each bar is now positioned based on it’s specific value (ie not stacked).

To get the bars to be positioned where we want them, apply a Running Total Quick Table Calculation to the Sales pill in the Columns shelf. Then edit the table calculation so that is computes by Year of Order Date only.

Add Sales to the Size shelf, and edit the size to be Fixed and aligned Right. You should now have a stacked bar effect.

Add Sales to Tooltip, then add a Percentage of Total quick table calculation. Edit the calculation so it is computing by Year of Order Date.

Adjust the Tooltip to reference the relevant details (note in the above gif, and my original published version, at some point the labelling went awry and the label was showing the cumulative sales value, and not the value for that year).

So now we’ve managed to build a segmented bar chart in a ‘different way’ by using an axis and the size field. We build on this to get the total bar. As this bar needs to be wider than the segmented bar, double click into the Rows shelf and type MIN(1.5), then from the MIN(1.5) marks card, remove the Order Date field from Colour. We now have 2 bars – 1 segmented and 1 not.

Make the chart dual axis and synchronise the axis, then right click on the right hand axis and move marks to back.

On the Min(0.5) marks card, move Order Date from Colour to Detail. Then on the All marks card, click on the Colour shelf and adjust the colour to suit (I set to #dec7b6) and add a white border.

Adjust the Tooltips on both marks cards so the MIN(1.5) is just showing the Sub-Category and Sales value, while the MIN(0.5) card is showing the Order Date year and % of sales value as well.

Narrow the width of the rows. Hide the right hand and the bottom axis (uncheck show header).

We want to make each pair of bars more distinctive, by providing more space between them. Edit the axis to fix the axis from -0.3 to 1.5.

Now hide the axis, and the Sub-Category column heading (right click label and hide field label for rows). Remove all row/column divider lines, zero lines and gridlines. Set the worksheet background colour (I used #fefaf1). Adjust the colour/style of the Sub-Category labels (I used bold, brown) and align top right.

And ta-dah! this should be your finished viz. My published version is here.

We’re going to building on this in a later challenge (I felt putting it all together in a single challenge might be a bit much), so look out for Part 2 in a few weeks time.

Happy vizzin’!

Donna

Donna + DataViz

ramblings in all things Tableau (and occasionally some other Stuff….)

Tag: Running Total

How are EPL Football Teams performing this season compared to previous seasons?

Can you replicate quick table calcs?

Sleep Survey Marimekko

Can you build this segmented bar chart?