How many consecutive starts?

Another table calculation related challenge this week, set by Luke, visualising cumulative starts for NFL Quarterbacks per team from 2006.

Luke provided the data within a Tableau workbook template on Tableau Public, so I started by downloading the workbook and understanding the data structure.

The challenge talks about teams playing over 17 weeks, but the data showed some data associated to weeks 28-32. So I excluded these weeks by filtering them out.

I then started to build out the data in tabular form, so I could start to build up what was required. I added Team, Season, Week and Player ID to Rows, and just to reduce the amount of data I was working with while I built up the calcs, chose to filter to the Teams ARZ, ATL & BLT.

What we’re looking to do is examine each Player ID and work out whether it is the first record for the Team or whether it differs from the previous row’s data. If so then we’re at the ‘start’ of a run, so we record a ‘counter’ value of 1. If not, the values match, so we need to increment the counter.

We’ll do this in stages.

Firstly, let’s get the previous Player ID value.

Prev Player ID

LOOKUP(MIN([Player Id]),-1)

This ‘looks up’ the record in the previous (-1) row. Change this field to be discrete and add to the Rows. Set the table calculation to compute by all fields except Team.

Each Prev Player ID value matches the Player ID from the row before, unless its the first row for a new Team in which case the value is Null.

Then we can create a field to check if the values match

Match Prev Player ID

MIN([Player Id])=[Prev Player ID]

Add this to the view and set the table calc as above, and the data shows True, False or NULL

Now we can work out the consecutive streak values

Consecutive Streak

IF (NOT([Match Prev Player ID])) OR ISNULL([Match Prev Player ID]) THEN 1
ELSE 1+PREVIOUS_VALUE(-1)
END

If we don’t match the previous value or we’re at the start of a new team (as value is NULL), then start the streak by setting the counter to 1, otherwise increment the counter. Add this to the view and set the table calc for both the nested calculations as per the settings described above.

Next we need to identify the last value of the Consecutive Streak for each Team.

Current Streak

WINDOW_MAX(IF LAST()=0 THEN [Consecutive Streak] END)

The inner IF statement, will return the value of Consecutive Streak stored against the last row for the Team. All other rows will be Null/blank. The WINDOW_MAX() statement then ‘spreads’ this value across all the rows for the Team.

Add this onto the view, and set the table calc for all the nested calcs.

Finally, we need one more bit of data. The chart essentially plots values from 2006 week 1 through to 2020 week 17. We need to ‘index’ these values, so we have a continuous week number value from the 1st week. We can use the Index table calculation for this

Index

INDEX()

Add this field to the view, set it to be discrete (blue pill) and position after the Week field on the Rows. Set the table calc as usual, so the Index restarts at each Team.

Now we’ve got all the data points we need, we can build the viz. I did this by duplicating the tabular view and then

  • Remove Prev Player ID and Match Prev Player ID
  • Move Season, Week and Player ID from Rows to Detail
  • Move Current Streak from Text to Rows and change to be discrete (blue)
  • Move Index from Rows to Columns and change to be continuous (green)
  • Move Consecutive Streak from Text to Rows
  • Change mark type to bar, and set to fit width to expand the view.
  • Change Size to be Fixed, width size 1 and aligned right
  • Set the border on the Colour shelf to be None.
  • Remove the Team filter and adjust the row height

All that’s left now is to set the tooltip (add Player to the Tooltip shelf to help this), and then apply the formatting. You can use workbook formatting (Format -> Workbook menu) to set all the font to Times New Roman.

Hopefully this is enough to get you to the end 🙂 My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Advertisement

Is Profit Ratio Influenced by Quantity?

For this week’s challenge, Ann asked we recreated a scatter plot that ‘on click’ became a connected scatter plot and also displayed an ‘insights’ box to the user.

Building out the data

The scatter plot is to show Profit Ratio by Quantity, so first up we need

Profit Ratio

SUM([Profit])/SUM([Sales])

formatted to percentage with 0 dp.

A mark is shown on the scatter plot per Category and Month/Year. When working with dates like this, I often prefer to created a dedicated field to use as an ‘exact date’ which I can then format. So I built

Month Year

DATE(DATETRUNC(‘month’,[Order Date]))

and formatted it to the March 2001 format (ie mmmm yyyy).

We’re also going to need a set based on the Category field, as that is going to drive the interactivity to help us connect the dots ‘on click’. So right click on Category and Create Set and just tick one of the categories displayed for now.

Selected Category

Let’s build these out into a table as follows

  • Order Date to Filter, filtered by years 2018 & 2019
  • Category, Selected Category & Month Year (exact date , discrete) onto Rows
  • Measure Names to Text and Filter to Quantity & Profit Ratio
  • Right click on Selected Category and select Show Set to display a selection control to easily allow the set values to be changed

To help drive the interactivity, we’re going to need to store the value of the Profit Ratio for the Selected Category in the set.

Selected PR

IF ATTR([Selected Category]) THEN [Profit Ratio] END

Format this to percentage 0 dp too, and add this to the table – it will only display the profit ratio against the rows associated to the category in the set

At this point, we’ve got what we need to build out the scatter plot, but I’m going to continue to create the fields needed for the ‘Insights’ piece.

These are looking for the months that contained the Max & Min values for the Profit Ratio and Quantity. I’m going to tackle this with Table Calcs.

Max PR per Category

WINDOW_MAX([Profit Ratio])

Adding this to the table, and setting the table calculation to compute by Month Year only, gives us the Max Profit Ratio for each Category

We need 3 more fields like this :

Max Qty per Category

WINDOW_MAX(SUM([Quantity]))

Min PR per Category

WINDOW_MIN([Profit Ratio])

Min Qty per Category

WINDOW_MIN(SUM([Quantity]))

Add all these to the table, setting the table calc in the same way as described above.

Now we know what the min & max values are, we need to know the month when these occurred.

Max PR Month

WINDOW_MAX(IF ATTR([Selected Category]) AND [Profit Ratio] = [Max PR per Category] THEN MIN([Month Year]) END)

Format this to March 2001 format.

If the category is that selected, and the profit ratio matches the maximum value, then retrieve the Month Year value, and duplicate that value across all the rows (this is what the outer WINDOW_MAX does).

Add this to the table in the Rows, and adjust the table calc settings for both the nested calculations to compute by Month Year only. The month for the selected category which had the highest profit ratio will be displayed against all rows for that category, and null for all other rows.

Again we need similar fields for the other months

Max Qty Month

WINDOW_MAX(IF ATTR([Selected Category]) AND SUM([Quantity]) = [Max Qty per Category] THEN MIN([Month Year]) END)

Min PR Month

WINDOW_MAX(IF ATTR([Selected Category]) AND [Profit Ratio] = [Min PR per Category] THEN MIN([Month Year]) END)

Min Qty Month

WINDOW_MAX(IF ATTR([Selected Category]) AND SUM([Quantity]) = [Min Qty per Category] THEN MIN([Month Year]) END)

Format all these too, and add to the table with the appropriate table calc settings applied.

If you now change the value in the the Selected Category set, you should see the table update.

Building the Scatter Plot

On a new sheet, build out the basic scatter plot by

  • Order Date to Filter, filtered by years 2018 & 2019
  • Profit Ratio to Rows
  • Quantity to Columns
  • Category to Colour and adjust accordingly
  • Month Year exact date, continuous to Detail
  • Change Shape to filled circle
  • Adjust the Tooltip to suit
  • Change the axis titles to be capitalised
  • Change the sheet title
  • Remove all row/column borders & gridlines. Leave the zero lines and set the axis rulers to be a sold grey line

Add Selected PR to Rows, and change the mark type of that card to line, and add Month Year to path. In the image below, I still have Furniture selected in my set.

Add Month Year to the Label shelf too and format so the text is smaller and coloured to match mark colour

Make the chart dual axis, synchronise the axis and remove the Measure Names from the colour shelf of each mark. Hide the Selected PR axis.

If you now change the selected set value, you can see how the other categories ‘join up’.

Building the Insights Box

Take a duplicate of the data table built originally, and add Selected Category to Filter. If you’ve still got an entry in your set, this should reduce the rows to just those records for that Category.

Remove the fields Profit Ratio, Quantity, Selected PR from the Measure Values section, and remove Selected Category from Rows.

Create a new field

Index = Size

INDEX()=SIZE()

and add this to the Filter shelf, setting the value to true.

The INDEX() function will number each row, the SIZE() function will number each row based on the total number of rows displayed (ie all rows will display the same number), so INDEX()=SIZE() will return true for the last row only

Now we’ve only got 1 row displayed, we’ve got the info we need to build the data in the Insights box by

  • Move Month Year to Detail
  • Category to Colour
  • Move Max PR Month, Max Qty Month, Min PR Month, Min Qty Month to Text
  • Move Max PR per Category, Max Qty per Category, Min PR Per Category, Min Qty per Category to Text

This will probably show duplicate instances of the values. This is because we didn’t fix the table calculation setting of the Index=Size field. Edit that to compute by Month Year

and then re-edit the Index=Size filter to just be true again, and expand the display so you can see all the values.

Now you can easily format the Text to the required display

I used this site to get the bullet point image from

Building the dashboard & interactivity

Add the scatter plot to the dashboard, and remove the legends that automatically get added.

Float the Insights sheet and position bottom right, remove the title. fit entire view, and also remove the colour legend which will have been added.

Add a dashboard Set Action to assign values to the Selected Category set and remove all values when the selection is cleared

Additionally, add a highlight action that highlights the Category field only (this keeps the lines highlighted on selection of a single mark).

Now practice clicking on and off marks on the scatter plot and you should see your lines & insights box being added and removed.

Finally add the title and header line by adding a vertical container at the top above the scatter plot.

Add a text box for the title, and underneath add a blank object.

Set the padding of the blank object to 0 and the background to black. Then edit the height to 4, and adjust the height of the container if it’s now too large.

Make adjustments to the position of the floating insights box if need be so it isn’t overlapping any axis.

And that should be it! My published viz is here.

I’ll be taking a break for a couple of weeks, so will be on catch up when I return from some R&R.

Happy vizzin’! Stay Safe!

Donna