How much change has occurred?

September 2, 2021 Donna Coles2 Comments

A relatively straightforward challenge was set by Luke this week, to visualise the difference in Sales between 2020 and 2021 in a slightly different format than what you might usually think of.

Start by filtering the data to just the years 2020 and 2021 (add Order Date to the Filter shelf and select specific years, or add a data source filter to limit the whole data set).

Add Sub-Category to Rows, and Sales to Columns, then add Order Date to Colour which by default will display as YEAR(Order Date). Colour the years appropriately.

Now unstack the marks (Analysis menu -> Stack Marks -> Off), and re-order the colour legend, so 2021 is listed first (this makes the 2021 bars sit ‘on top’ of 2020).

Adjust the size to make the bars thinner.

Now add another instance of Sales to the Columns shelf, and make the chart dual axis (synchronising the axis). Reset the mark type of the original SUM(Sales) marks card back to bar.

We need the circle mark for the 2021 Sales to be blue. To do this, duplicate the Order Date field, then add Order Date (copy) to the Colour shelf of the SUM(Sales)(2) marks card. This will show another colour legend, and you can set the colours accordingly. Add a white border around the circle marks.

To work out the % difference to display on the label, we need the following fields

2021 Sales

{FIXED [Sub-Category]: SUM(IF YEAR([Order Date])=2021 Then [Sales] END)}

This returns the value of the 2021 sales for each Sub-Category against all the years in the data set. Similarly we need

2020 Sales

{FIXED [Sub-Category]: SUM(IF YEAR([Order Date])=2020 Then [Sales] END)}

which means we can then create

% Difference

(SUM([2021 Sales])-SUM([2020 Sales]))/SUM([2020 Sales])

format this using custom formatting to display as +0%;-0%

Now we can add % Difference to the Label field of the Sum(Sales)(2) marks card.

You’ll notice you’ll have duplicate labels displayed. To resolve this, you need to adjust the label settings as below

To sort the rows, you need to sort the Sub-Category field by 2021 Sales descending

And finally to show the value of the 2021 Sales, add this field to the Rows shelf, and change to be discrete (blue pill).

All that’s left to do now is adjust the wording of the tooltips as you see fit, and format to remove gridlines, headers etc.

My published version is here.

Happy vizzin’! Stay Safe!

Donna

How many consecutive starts?

August 31, 2021 Donna Coles1 Comment

Another table calculation related challenge this week, set by Luke, visualising cumulative starts for NFL Quarterbacks per team from 2006.

Luke provided the data within a Tableau workbook template on Tableau Public, so I started by downloading the workbook and understanding the data structure.

The challenge talks about teams playing over 17 weeks, but the data showed some data associated to weeks 28-32. So I excluded these weeks by filtering them out.

I then started to build out the data in tabular form, so I could start to build up what was required. I added Team, Season, Week and Player ID to Rows, and just to reduce the amount of data I was working with while I built up the calcs, chose to filter to the Teams ARZ, ATL & BLT.

What we’re looking to do is examine each Player ID and work out whether it is the first record for the Team or whether it differs from the previous row’s data. If so then we’re at the ‘start’ of a run, so we record a ‘counter’ value of 1. If not, the values match, so we need to increment the counter.

We’ll do this in stages.

Firstly, let’s get the previous Player ID value.

Prev Player ID

LOOKUP(MIN([Player Id]),-1)

This ‘looks up’ the record in the previous (-1) row. Change this field to be discrete and add to the Rows. Set the table calculation to compute by all fields except Team.

Each Prev Player ID value matches the Player ID from the row before, unless its the first row for a new Team in which case the value is Null.

Then we can create a field to check if the values match

Match Prev Player ID

MIN([Player Id])=[Prev Player ID]

Add this to the view and set the table calc as above, and the data shows True, False or NULL

Now we can work out the consecutive streak values

Consecutive Streak

IF (NOT([Match Prev Player ID])) OR ISNULL([Match Prev Player ID]) THEN 1
ELSE 1+PREVIOUS_VALUE(-1)
END

If we don’t match the previous value or we’re at the start of a new team (as value is NULL), then start the streak by setting the counter to 1, otherwise increment the counter. Add this to the view and set the table calc for both the nested calculations as per the settings described above.

Next we need to identify the last value of the Consecutive Streak for each Team.

Current Streak

WINDOW_MAX(IF LAST()=0 THEN [Consecutive Streak] END)

The inner IF statement, will return the value of Consecutive Streak stored against the last row for the Team. All other rows will be Null/blank. The WINDOW_MAX() statement then ‘spreads’ this value across all the rows for the Team.

Add this onto the view, and set the table calc for all the nested calcs.

Finally, we need one more bit of data. The chart essentially plots values from 2006 week 1 through to 2020 week 17. We need to ‘index’ these values, so we have a continuous week number value from the 1st week. We can use the Index table calculation for this

Index

INDEX()

Add this field to the view, set it to be discrete (blue pill) and position after the Week field on the Rows. Set the table calc as usual, so the Index restarts at each Team.

Now we’ve got all the data points we need, we can build the viz. I did this by duplicating the tabular view and then

Remove Prev Player ID and Match Prev Player ID
Move Season, Week and Player ID from Rows to Detail
Move Current Streak from Text to Rows and change to be discrete (blue)
Move Index from Rows to Columns and change to be continuous (green)
Move Consecutive Streak from Text to Rows
Change mark type to bar, and set to fit width to expand the view.
Change Size to be Fixed, width size 1 and aligned right
Set the border on the Colour shelf to be None.
Remove the Team filter and adjust the row height

All that’s left now is to set the tooltip (add Player to the Tooltip shelf to help this), and then apply the formatting. You can use workbook formatting (Format -> Workbook menu) to set all the font to Times New Roman.

Hopefully this is enough to get you to the end 🙂 My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Can you create a moving average chart with a focus on selected subcategories?

August 29, 2021 Donna ColesLeave a comment

Following the #WOW survey where practice in table calculations was the most requested feature, Lorna continues with the theme in this challenge, where the focus is on the moving average table calculation, plus a couple of extra features thrown in.

Moving average

This is based on the values of data points before and after the ‘current’ point, as defied by the parameters which will need to be created.

pPrior

Integer parameter ranging from 1 to 6 and defaulted to 3. You need to explicitly set the Step size to 1 to ensure the step control slider appears when you add the parameter to the dashboard. This will be used to define the number of data points prior to the current to use in the calculation.

Create an identical parameter **pPost** to define the number of data points to use after the current one.

With these parameters, we can now create the core calculation

Moving Avg

WINDOW_AVG(SUM([Sales]), (-1*[pPrior])+1, [pPost])

As the requirement states that the ‘prior’ parameter needs to include the ‘current’ value, then we need to adjust the calculation – ie if the parameter is 3, we actually only want to include 2 prior data points, as the 3rd will be the current point itself. This is what the +1 is doing in the 2nd argument of the function.

Lorna has stated that 3 Sub-Categories are grouped to form a Misc category, so we need to create a group off of Sub-Category (right click Sub-Category -> Create -> Group).

Multi-select the 4 options that need to be grouped (hold down Ctrl as you select), and then group, and rename the group Misc.

Now we can check what the calculation is doing. If you add the fields onto the view as below, and set the Moving Avg table calculation to compute using Month of Order Date only (see further below), you should be able to see that each month’s moving avg value is calculated based on the sales value of the set of previous & post months as defined by your parameters. In the image below the Moving Avg for Accessories in June 2018, is the average of the Sales values from April 2018 – Sept 2018.

With this you can start the beginnings of the viz – don’t forget to set the table calc as above.

Colouring the lines

This will be managed by using a set.

Right click on the Sub-Category (group) field -> Create -> Set. Initially select all values. Add this field to the Colour shelf. Additionally, click the Detail symbol (…) to the left of the Sub-Category (group), and select the Colour symbol, so this field is also added to the Colour shelf.

The resulting colour legend will look something like this

Edit the colour legend, then choose **Hue Circle** and select **Assign Palette** to randomly assign colours to all the options

To show the set values, click on the context menu of the Sub-Category (Group) field on the Colour shelf, and Show Set.

This will add the list of options for selection

Uncheck All so none are selected, which will change the colour legend to read ‘Out, xxx’. Edit the colour legend again, and control-click to multi select all options, then set to a single grey

Now if you select a few options, the ones selected will be coloured, while the others remain grey

Additionally add the set field onto the Size shelf and make the In option bigger than the Out.

Shading the background

For this we need to create an unstacked area chart with one measure representing the maximum moving average value for the month, and the other representing the minimum moving average value for the month. We’ll need new calculated fields for this:

Window Max Avg

WINDOW_MAX([Moving Avg])

Window Min Avg

WINDOW_MIN([Moving Avg])

If you’ve still got your data sheet available, then move Sub-Category (Group) onto Rows, then add the two newly created fields.

In this case there are ‘nested’ table calcs. You need to ensure the setting related to the Moving Avg is computing by Month Order Date only, but the setting related to the Window Max Avg (or Window Min Avg) is computing by Sub-Category (Group).

If set properly, you should see that for each month the max / min values are displayed against every row.

Back to your chart viz sheet, and add Window Max Avg to Rows. Set the table calc settings as described above, then remove the Sub-Category (group) Set field from the Colour shelf of this measure, and change the Sub-Category (group) to be on the Detail rather than Colour shelf.

Change the mark type to Area, set the Opacity of the colour to 100% and set stack Marks to be Off (Analysis Menu -> Stack Marks -> Off).

Now drag Window Min Avg onto the Window Max Avg axis and drop it when the ‘2 columns’ image appears.

This will change the view so Measure Values is now on the Rows shelf and Window Min Avg is now displayed in the Measure Values section on the left hand side.

Adjust the table calc setting of Window Min Avg to be similar to how we set the Max field. And now drag the fields so Window Min Avg is listed before Window Max Avg. Measure Names will now be on the Colour shelf of this marks card, so adjust so Window Min Avg is white and Window Max Avg is pale grey.

Now make the chart dual axes, synchronise the axes, and set the Measure Values axis to the ‘back’.

Everything else is now just formatting and adding onto a dashboard. My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Visualise Our Survey Data

August 12, 2021 Donna Coles1 Comment

This week, Ann Jackson set a table calculations based challenge, using the responses from a recent survey on #WorkoutWednesday, as the most requested topic was for table calcs!

There’s a lot of visuals going on in this challenge, and I’m shortly off on my holibobs (so will be playing catch up in a couple of weeks), so I’m going to try to pare down this write up and attempt just to focus on key points for each chart.

Donut Chart

By default when you connect, Respondent is likely to be listed in the ‘Measures’ part of the data pane – towards the bottom. This needs to be dragged into the top half to turn it into a dimension. You can then create

# of Respondents

COUNTD([Respondent])

which is the key field measures are based on throughout this dashboard.

When building donuts, we need to get a handle on the % of total respondents for each track, along with the inverse – the % of total non-respondents for each track. To do this I created fields

Total # of Respondents

TOTAL(COUNTD([Respondent]))

and then

Track – % of Total

[# of Respondents]/([Total # of Respondents])

along with the ‘inverse’ of

Non Track – % of Total

1-[Track – % of Total]

To then build the donut, we ultimately need to create a dual axis chart, with Which track do you participate in? on Columns and a MIN(1) field on Rows. Manually reorder the entries so the tracks are listed in the relevant order.

On the first MIN(1) axis/marks card, build a pie chart. Add Measure Names to the filter shelf and filter to the Track & Non Track % of Total fields. Set Mark Type to Pie Chart and add Measure Values to the angle shelf. Add both Which track do you participate in? and Measure Names to the Colour shelf. Set a white border on the Colour shelf. Reorder the entries in the colour legend, and set the colours appropriately.

The create another MIN(1) field next to the existing one on the Rows shelf

Set this marks type to circle, and remove all the fields from the colour & detail shelves. Set the colour to white. Add Which track do you participate in? and Track – % of Total to the Label shelf and format. Reduce the Size. Make dual axis, and synchronise. Further adjust sizes to suit.

Participation Bar Chart

Plot How often do you participate? against # of Respondents, and then add a Quick Table Calculation to the measure using Percent of Total. Manually re-sort the order of the entries, show mark labels and Colour the bars light grey. Apply relevant formatting.

Diverging Bar Chart

In this bar chart, the percentage of ‘agree’ responses are plotted to the right on the +ve scale and the percentage of the ‘disagree’ responses are plotted to the left on the -ve scale. The percentage of the ‘inbetweeners’ (neither agree nor disagree) is halved, and displayed on both sides. To address this, I created the following:

# of Respondents – Diverging +ve

CASE ATTR([Answer])
WHEN ‘Agree’ THEN [# of Respondents]
WHEN ‘Strongly Agree’ THEN [# of Respondents]
WHEN ‘Neither Agree nor Disagree’ THEN [# of Respondents]/2
END / [Total # of Respondents]

This is the % of total respondents for the ‘agree’ responses and half of the ‘inbetweeners’.

Similarly I then have

# of Respondents – Diverging -ve

(CASE ATTR([Answer])
WHEN ‘Disagree’ THEN -1*[# of Respondents]
WHEN ‘Strongly Disagree’ THEN -1 *[# of Respondents]
WHEN ‘Neither Agree nor Disagree’ THEN ([# of Respondents]/2) * -1
END) / [Total # of Respondents]

which is doing similar for the ‘disagree’ responses, except all results are multiple by -1 to make it negative.

The Question field is added to the Filter shelf and the relevant 5 questions are selected. Answer is also on the Filter shelf with the N/A answer excluded.

Add Question to Rows (and manually sort the entries), then add # of Respondents – Diverging -ve and # of Respondents – Diverging +ve to Columns and add Answer to the Colour shelf. Manually resort the entries in the colour legend and adjust the colours accordingly.

Make the chart dual axis, and synchronise the axis. Change the mark type back to bar and remove Measure Names from the colour shelf if it was added. Edit the bottom axis to fix the range from -0.99 to 0.99 and amend the title. Format the axis to display as percentage to 0 dp. Hide the top axis.

Additionally format both the measures to be percentage 0dp, but for the # of Respondents – Diverging -ve custom format, so the negative value is displayed as positive on the tooltip.

Adjust formatting to set row banding, remove gridlines etc and set tooltips.

Vertical bar chart

The best way to start building this chart is to duplicate the diverging one. Then remove both measures from the Columns shelf and add Answer to Columns. Manually re-sort the answers. Add # of Respondents to Rows and add a Quick Table Calculation of percent of total.

Show the marks label, and align bottom centre, and **match mark colour**. Hide the axis from displaying, and also hide the **Question** field (uncheck show header). Update the **Tooltip**.

Heatmap

Right click on the Question field > Aliases and set the alias for the relevant questions

Also add Question to Filter and select relevant values. Add Answer to Filter too and exclude NULL.

Add Question to Columns and add Answer to the Text shelf. Add # of Respondents to the Text shelf, and set to Percent of Total quick table calculation. Edit the table calculation to compute using the Answer field only

We need to get each of these columns ‘sorted’ from high to low – we want to rank them. To do this, add # of Respondents to Rows, then change it to be a blue discrete pill. Add a Rank quick table calculation and once again set to compute by Answer only. Also set the rank to be Unique

Now change the mark type to square, and then add the # of Respondents percent of total field onto Colour as well as Text (the easiest way to do this to retain all the table calc settings, is to hold down Ctrl then click and drag the pill from the Text shelf onto Colour. This should duplicate the pill.

Format the % of total displayed to be 0dp, and adjust the label. Change the Colour to use the purple range and set a white border too. Hide the ‘rank’ field from displaying and hide field labels for columns too.

The dashboard

I used a vertical container then added the objects as required, using nested horizontal containers to organise the side by side charts.

To make the diverging bar and vertical bar charts look like they are one chart, adjust the padding of diverging bar chart object to have 0 to the right, and similarly, adjust the padding of the vertical bar to have 0 padding to the left.

I found it a bit fiddly to get the charts to line up exactly. Both charts were set to fit entire view. The diverging bar chart displays it’s title. I also displayed a title on the vertical bar chart, but made the text white so it’s invisible.

Dashboard filter actions are set against the donut and the participation bar charts.

The filter uses selected fields, which for the donut chart references the Which track do you partcipate in? field. A similar dashboard action needs creating for the participation chart as the source and references the How often do you partcipate? field.

A highlight dashboard action is required for the diverging and vertical bar charts. They only impact each other and should be set up as below on hover.

Hopefully I’ve covered everything… my published version is here.

Happy vizzin’! Stay Safe!

Donna

Can you find the needle in the haystack?

August 5, 2021 Donna ColesLeave a comment

It was Candra’s turn to ‘set’ the #WOW2021 challenge this week providing a hint in the challenge description that the solution would involve sets.

As with many challenges, I built the data out in tabular format to start with to verify I had all the components and calculations correct. The areas of focus are

Identify number of distinct customers per product
Identify overall average number of distinct customers per product
Identify if product above or below average distinct customers
Identify Top 50 products by Sales
Identify Unprofitable Products
Identify products that are both in the top 50 AND unprofitable
Building the viz

Identify number of distinct customers per product

To start off, add Product Name, Sub-Category, Category to the Rows shelf to begin building out a table. Add Sales (formatted to $k 0dp) and Profit (formatted to $k 0dp with negative values as () ) to Text and sort by Sales descending.

To identify the distinct customers per product, we can create

Customer Count per Product

{FIXED [Product Name] : COUNTD([Customer ID])}

Add this to the view.

Identify overall average number of distinct customers per product

What we’re looking for here is the average of all the values we’ve got listed in the Customer Count per Product column. Ie we want to sum up those values displayed and divide by the number of rows.

The number of rows is equivalent to the number of products, which we can get from

Count Products

{FIXED : COUNTD([Product Name])}

And so to get the overall average we calculate

Avg Overall Customer Count

{FIXED: SUM([Customer Count Per Product])} / [Count Products]

Add these fields to the view as well, so you can see how the values work per row. The last two calculations give you the same value across all rows.

Identify if product above or below average distinct customers

Given the above display, this is just a case of comparing values in 2 columns

Higher than Avg Customer Count

AVG([Customer Count Per Product]) > SUM([Avg Overall Customer Count])

this returns true or false – add this to the view too.

Identify Top 50 products by Sales

We can create a set for this. Right click on Product Name > Create > Set. Name the set something suitable eg Top 50 Products, and on the Top tab, state the number (50) and the field (Sales) and the aggregation (Sum)

Add this to the view, and if you’ve sorted by the sales, you should find the top 50 rows are all In the set, and the rest are Out.

Identify Unprofitable Products

We can use another set for this. Again create a set off of Product Name, call it Unprofitable Products, and on the Condition tab, set the condition so that the Sum of Profit is less than 0

Add this onto the view too.

Identify products that are both in the top 50 AND unprofitable

For this, we’re explicitly looking for the rows that are both In the Top 50 Products set and In the Unprofitable Products set.

We can use the Combined Set functionality to do this.

In the left hand data pane, select both the Top 50 Products and the Unprofitable Products sets (hold down ctrl to multi select), then right click and Create Combined Set. I called the set Products to Include, and select to combine the sets by including Shared members in both sets

If you then add this field to the Filter shelf, you will be left with just the 13 Products that match

This is the single filter field you can use as per Candra’s requirements.

Building the viz

To get the text to display to the left of the bar, you actually need to create a ‘fake’ bar chart.

Add Products to Include to Filter
Add Product Name to Rows
On the Columns shelf, double click and type in MIN(1)
Add Sales to Columns to the right of MIN(1)
Sort by Sales descending

Against the MIN(1) marks card

Change the Size to small
Set the Opacity of the Colour to 0% and the border to None
Add Product Name, Sub-Category and Category to the Label shelf and adjust accordingly, aligning left
Increase the height of each row to make the text visible

On the Sales marks card

Add Higher than Avg Customer to the Colour shelf and adjust
Show mark labels
Create a new field Profit Ratio : SUM([Profit])/SUM([Sales]) Format to % with 0dp and add to Tooltip
Add Profit, and Customer Count by Product to Tooltip and adjust accordingly

Finally, uncheck Show Header against Product Name and MIN(1) and Sales and format the borders/gridlines etc. Add the title, then add to the dashboard.

All done (I hope…)! My published version is here.

Happy vizzin’! Stay Safe!

Donna

Let’s go streaking!

July 29, 2021 Donna ColesLeave a comment

It was Sean Miller’s turn to set the challenge this week, where the primary focus was to find the highest number of consecutive months where the monthly sales value was higher than the previous month.

This was a table calculations based challenge, and I always tackle these by building out the data required in a tabular format. The challenge was also reminiscent of a previous challenge Sean has set, which I’ve blogged about here, and admit I used as a reference myself.

So let’s get started.

To start with, we need the month date, the Sub-Category, the Sales value and the difference in Sales from the previous month. For the month date, I like to define this explicitly

Order Date Month

DATE(DATETRUNC(‘month’,[Order Date]))

This aligns all Order Dates to the 1st of the relevant month.

Add Sales Category, Order Date Month (set to discrete exact date blue pill), and Sales into a view, then set a Quick Table Calculation of Difference on the Sales pill

Edit the table calculation to compute by Order Date Month only, so the previous calculation restarts at each Sub-Category.

Then drag this pill from the marks card into the left hand data pane to ‘bake’ the calculated field into the data model. Name the field Sales Diff. The re-add Sales back into the view too, so you can double check the figures.

Identify whether there is an increase with the field

Diff is +ve

IF [Sales Diff]>0 THEN 1 ELSE 0 END

Add this into the view too, and verify the calculation is computing by Order Date Month only again.

Now we need to work out if the row matches the previous value

Match Prev Value

LOOKUP([Diff Is +ve],-1) = [Diff Is +ve]

The LOOKUP is looking at the previous row (identified by the -1) and comparing to the current. If they match then it returns True else False.

Again add into the view, and again double check the table calc settings. In this case there is nested calculations so you need to double check the settings against each calc referenced in the drop down

Now we need to work out when there are consecutive increases, and how many of them there are

Increase Streak

IF (NOT([Match Prev Value])) AND ([Diff Is +ve] = 1) THEN 1
ELSEIF [Diff Is +ve] = 1 THEN ([Diff Is +ve]+PREVIOUS_VALUE([Diff Is +ve]))

END

If the current row has a +ve difference and the previous row wasn’t +ve, then we’re at the start of an increase streak, so set to 1. Else, if the current row has a +ve difference then we must be on a consecutive increase, so add to the previous row, and this becomes a recursive calculation, so builds up the values..

Add this onto the view, set the table calc settings, and you can see how this is working…

So now we’ve identified the streaks in each Sub-Category, we just want the maximum value.

Longest Streak

WINDOW_MAX([Increase Streak])

Add this and set the table calc setting again. You’ll see the max value is spread across every row per Sub-Category.

Finally we need to identify Sales values in the months when the streak is at its highest.

Sales of Month with Longest Streak

IF [Longest Streak]=[Increase Streak] THEN SUM([Sales]) END

Add this into the view again (don’t forget those table calc settings), and you’ll notice that for some Sub-Categorys there are multiple points with the same max streak

With all this we can now build the viz, which is relatively straight forward….

Add Order Date Month (exact date, continuous green pill) to Columns, Sub-Category to Rows and Sales to Rows. Edit the Sales axis to be independent, then change the line type of the Path to stepped

Add Sales of Month with Longest Streak to Rows and set to dual axis, and synchronise. Make sure the mark type of the 2nd axis is set to circle, and remove Measure Names from the colour shelf of both marks.

Manually set the colour of the line chart to grey. Add Longest Streak to the Colour shelf of the circle marks card. Adjust the colour to use the green palette, set to stepped of 5 value and ensure the range starts at 0 and ends at 5 (don’t forget to edit the table calc settings!).

Now add Longest Streak as a discrete blue pill to the view too.

This is all the core components. The last thing we need to do is sort the list. I wasn’t entirely sure how it had been sorted, apart from the largest Longest Streak at the top. I created a new field for this

Sort

[Longest Streak]*-1

and added this as a blue discrete pill in front of Sub-Category….

…, then hid the column.

Then just apply the tooltip and relevant formatting on the chart.

For the legend, I created a new field

Legend

CASE [Sub-Category]
WHEN ‘Art’ THEN 0
WHEN ‘Chairs’ THEN 1
WHEN ‘Labels’ THEN 2
WHEN ‘Paper’ THEN 3
WHEN ‘Phones’ THEN 4
ELSE 5 END

and added this into a new sheet as below

The components then just need to be added to the dashboard. My published version is here.

Happy vizzin’! Stay Safe!

Donna

Can you rebuild the Olympic Schedule?

July 23, 2021 Donna ColesLeave a comment

This week’s #WOW challenge was a joint one with the #PreppinData crew, with the intention to use the #PreppinData challenge to create the data set needed for the Tableau challenge. I completed the Prep challenge, but decided to use the output provided by the #PreppinData crew as the input to this challenge (just in case I had inadvertently ended up with discrepancies).

Sport Selector
Adjusting the time
The Schedule Viz
Event Counts in Tooltip
Event listing Viz in Tooltip
Other Sports bar chart Viz in Tooltip
Dashboard interactivity

Sport Selector

This is a simple chart that lists the Sport Groups. I chose to build a bar chart using MIN(1) on the Columns and Sport Group on Rows. The axis was then fixed from 0-1.

Create a set based on Sport Group and select a few values to be ‘in’ the set (eg Boxing, Gymnastics, Martial Arts).

Add the Sport Group Set to the Colour shelf to identify the selected sports. Adjust colours accordingly.

Adjusting the Time

Create a parameter pTimeAdjust which is an integer paramater, defaulted to 0 and ranges from -12 to +12. Set the step value to 1 as this will ensure when you add the parameter to the dashboard, the prev/next buttons can be displayed alongside the slider.

Create a calculated field to store the time of the event based on the ‘timezone’ selected via the above parameter

Date Time Adjust

DATEADD(‘hour’, [pTimeAdjust], [UK Date Time])

This field will be used to display the full event date & time on the event listing viz in tooltip, along with building the schedule viz itself.

Additionally, create a field based on the above, which just stores the day of the adjusted datetime field above

Day of Adjusted Date

DATE(DATETRUNC(‘day’,[Date Time Adjust]))

This field is needed to help with the filtering required for the viz in tooltips to display.

The Schedule Viz

Add Date Time Adjust set to the Month datepart (blue pill) to the Columns shelf, and alongside it add the same field set to the Day datepart (blue pill). On the Rows, add Sport Group and Sport. Add the Sport Group Set to the Filter shelf. This will give you the ‘bones’ of the schedule

In viewing the provided solution, there was a bit of a discrepancy between when a ‘medal’ icon should show or not, compared to the Medal Ceremony? field provided in the data. It transpired Lorna had made an adjustment, as there were some events that had a ‘final’, but did not include a gold medal or ceremony event.

So to try to match up with Lorna’s output, I too made adjustments, but I can’t guarantee it matches any published solution.

First up I identify the Victory Ceremony events

Is Victory Ceremony?

CONTAINS([Event],’Victory Ceremony’)

I chose to exclude all these events from the schedule, so this field is added to the Filter shelf and set to False.

I also identify the events which appear to be a ‘final’

Is Final?

CONTAINS([Event],’Gold Medal’) OR CONTAINS([Event],’ Final’)

This field will separate the events into two types. Change the Mark Type to Shape, then add this field onto that shelf. Set the shapes accordingly. Note – to add the medal shape, save the image Lorna provided to your machine, then follow these instructions so it’s available for selection.

I chose to add the Is Final? to the Size shelf too, so the shapes can be adjusted to something more suitable.

If you add the rows and columns dividers, you’ll notice the single circles aren’t centred. To resolve this, we’re going to need some axis.

Add MIN(1) to the Rows shelf (y-axis). This will give us some vertical headspace.

Now we need to manage the horizontal space, and ensure the marks don’t overlap each other. When there’s no finals, we want the circle to be plotted in the middle. When there’s both non-final and final ‘events’ we want the two marks to be off-centre, one to the left and one to the right.

We need some calculations to help with this.

#Events by Sport Per Day

{FIXED [Day of Adjusted Date], [Is Victory Ceremony?],[Sport]: COUNT([Event Schedule])}

This helps us count the number of of events per day for a specific sport

#Event Finals By Sport By Day

{FIXED [Day of Adjusted Date], [Is Victory Ceremony?],[Sport]: SUM(IIF([Is Final?],1,0))}

This basically helps us count the number of finals for each sport on a day.

With this we can build

X-Axis

IF [#Event Finals by Sport Per Day] =0 THEN 5
ELSEIF [#Event Finals by Sport Per Day]-[#Events by Sport Per Day] =0 THEN 5
ELSEIF [Is Final?] THEN 7
ELSE 3
END

If there’s no final, plot at 5, if there’s only a final, plot at 5 otherwise plot a final at 7 and a non-final at 3.

Add this to the Columns shelf (set to be a dimension ie not SUM), and edit the axis to be fixed from 0-10.

Events Count in Tooltip

I was also a bit puzzled by some of the numbers being displayed in the tooltip, so chose to compute and show the following 3 measures

Number of Events for Sport on that day (this is the #Events by Sport per Day already calculated)
Number of Event Finals for Sport on that day (this is the #Event Finals by Sport by Day already calculated)
Total Number of events for all other sports on that day (ie the selected sport is excluded from the count).

For this last measure we need

#Events per Day

{FIXED [Day of Adjusted Date], [Is Victory Ceremony?] : COUNT([Event Schedule])}

#Events per Day of Other Sports

SUM([# Events Per Day]) – SUM([#Events by Sport Per Day])

Pop all these on the Tooltip shelf and format appropriately.

You’ll also need to add the Day of Adjusted Date to the Tooltip. This should be set to exact date and discrete (blue pill).

Event Listing in Tooltip

Build out a data listing view of Sport Group, Sport, Date Time Adjust and add Event (I renamed the Event Split field) to the Text shelf. Add Day of Adjusted Date to the Detail shelf. Hide Sport Group and format.

On the schedule viz, add this worksheet to the tooltip, passing Sport and Day of Adjusted Date as filters on the string

Other Sports bar chart Viz in Tooltip

Once again this is a relatively simple chart to build out, with the Day of Adjusted Date field hidden in the display (but necessary for the VIT to filter properly).

However, this will display all sports, and we need this chart to not show the sport that has been initially selected (hovered).

Create a parameter pSportToExclude which is a string parameter. For the purpose of demonstration, enter the text Football.

Create a field

Excluded Sport?

[Sport]=[pSportToExclude]

Add this field to the Filter shelf and set to False, and the sport will disappear from the list

Add a reference to this sheet from the tooltip of the schedule viz, this time passing just Day of Adjusted Date as the filter.

Dashboard Interactivity

Hide / Show Sport Selector

When adding the Sport Selector sheet and the Schedule viz to the dashboard, you need to make sure they exist side by side in the same horizontal container.

Then, providing you are using v2021.2, you can set the Sport Selector object to Hide/Show. See this video for help.

Add remove sports

You will need 2 dashboard set actions for this. They should run on ‘menu’, and one will add items to the set, and the other remove

Set the selected sport to exclude

We’ll use a parameter action for this to run on hover and set the pSportToExclude parameter

Stop Sport Selector highlighting

Create a new field Dummy containing the text “Dummy” and add this to the Detail shelf of the Sport Selector viz.

The add a highlight action against this sheet only

Hopefully I’ve ticked off all the core elements here. There was a fair bit going on, and I’m conscious I’ve drafted this blog fairly quickly in comparison. My published viz is here .

Note, there are a couple of elements in my viz that I added which weren’t on the original solution. I’ve chosen not to include in the blog as the images/characters I chose to use didn’t render on Tableau Public. If you download the workbook, you’ll be able to see what my intention was.

I did also create an alternative view ‘heatmap’ style view as well which you can see here.

Happy vizzin’! Enjoy the Olympics! Stay Safe!

Donna

Can you build an app to visualise wildfires?

July 15, 2021 Donna ColesLeave a comment

Ann Jackson’s husband Josh (@VizJosh) set the challenge this week, to build an ‘application’ to help visualise the scale of wildfires; that is when a fire is said to be 5000 acres, you can use the app to view how that compares to an area of the world you may know, so you can really appreciate just how large (or small) the fire is.

I have to hold my hand up here, and say that after reading the requirements several times, I was absolutely stumped as to where to start. We were provided with some ‘data’ to copy which consisted of 5 rows, which I duly copied and pasted into Desktop, but I then like ‘what now….?’ I knew I needed something geographic to build a map, but couldn’t understand the relevance of the 5 rows… I’ve said before I don’t use maps that often, so was unsure whether there was something I needed to do with this data. After staring at the screen for what seemed like an age, I ended up looking at the solution.

The data is just ‘dummy’ data and is just something to allow you to ‘connect’ Tableau to. You can’t build anything in Tableau without a data source. It could just have been 1 row with a column headed ‘Dummy’ and a value of 0. If it had been that, it might have been more obvious to me 🙂

Defining the parameters
Building the map
Apply button
Dashboard Actions

Defining the parameters

Ultimately the ‘data’ being used to build the viz is driven by parameters – the Location selector and the Latitude & Longitude inputs.

pLocation

An integer list parameter that stores values, but displays worded locations – wherever you choose. I opted for my hometown of Didcot in the UK alongside locations Josh had used, mainly so I could validate how the rest of the ‘app’ would work when I came to build it.

pLongitude

Float input, defaulted to the longitude of location 1 (ie Didcot) above.

I just googled Didcot Latitude and Longitude to find the relevant values

Note – Longitude W means an input of -1 * value. Similarly for Latitude S needs to be a negative input.

Then I created

pLatitude

Since we’re talking about parameters, there’s a couple more required, so lets create them now

Acres

Integer parameter defaulted to 5000

pZoom

Integer, list parameter with the values below, defaulted to 2.

Building the map

Now we have some lat & long values (in the pLatitude and pLongitude parameters), we can create some geographic data needed to build a map.

Location

MAKEPOINT([pLatitude], [pLongtitude])

This gives us the centre point which we want to build the ‘fire size’ buffer around. For this we need the calculation JOsh kindly provided :

Acres to Feet

SQRT(([Acres]*43560)/PI())

and then we can create the buffer

Fire Size

BUFFER([Location],[Acres to Feet],’ft’)

Double click on this field and it should automatically create you a ‘map’

Adjust the map ‘format’ via the Map > Map Layers menu option. I chose to set it to the dark style at 20% washout, then ticked various selections to give the details I needed (I added and removed options as I was testing against Josh’s version). I also set the colour of the mark via the Colour shelf to be pale red.

Also, as per the requirement, turn off the map options via Map > Map Options menu, and unchecking all the selections.

So this is the basic map, and you can input different lats & longs into the parameters to test things out.

Now we need to deal with the zoom requirement.

I wasn’t entirely sure about this, so had a bit of a search and found Jeffrey Shaffer’s blog post How to create a map zoom with buffer calculation in Tableau – bingo!

The zoom had to be x times the size of the circle on the map, so achieved by

Zoom

BUFFER([Location],[pZoom] * [Acres to Feet],’ft’)

Add this a map layer (drag field onto the map and drop onto the Add a Marks Layer section that displays)

This has generated a 2nd circle and consequently caused the background map to zoom out. We don’t want this circle to show, nor to be selected, so on the Colour shelf, set the Opacity to 0%, and the Border and Halo to None. To prevent the circle from showing when you hover your mouse on the map, you need to Disable Selection of the Zoom marks card

Apply Button

On a separate sheet, double click into the space below the Marks card, and type ‘Apply’ into the resulting ‘text pill’ that displays, and then press return.

This will create a blue pill, which you can then add to the Label/Text shelf. Align the text to be middle centre

This view is essentially going to act as your ‘Apply’ button on the dashboard. When it is clicked on, we want it to take the Lat & Long values associated to the place listed in the pLocation parameter, and update the pLatitude & pLongitude parameter values.

For this, we need a couple of extra calculated fields

Location Lat

CASE [pLocation]
WHEN 1 THEN 51.6080
WHEN 2 THEN 40.7812
WHEN 3 THEN 51.5007
WHEN 4 THEN 48.8584
END

Note – as before, all these values were worked out via Google as shown above.

Location Long

CASE [pLocation]
WHEN 1 THEN -1.2448
WHEN 2 THEN -73.9665
WHEN 3 THEN 0.1246
WHEN 4 THEN 2.2945
END

Add both these fields to the Detail shelf of the Apply sheet.

Dashboard Actions

When you add the 2 sheets to the dashboard, you then need to add parameter actions to set the values of the pLongitude & pLatitude parameters on click of the Apply button

Set Lat

A parameter action that runs on Select of the Apply sheet, setting the pLatitude parameter with the value from the Location Lat field.

You need another action Set Long which does a similar thing by passing the Location Long field into the pLongitude variable.

Finally, you don’t want the ‘Apply’ button to look ‘selected’ (ie highlighted pale blue) once clicked. Create calculated fields True = True and False = False and add both of these to the Detail shelf on your Apply button sheet.

Then add a dashboard filter action that uses Selected Fields and maps True to False

Hopefully, this should provide you with all the core features to get the functionality working as required. Ultimately once I got out of the starting blocks, it wasn’t too bad…

My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Can you find the top and bottom performers?

July 8, 2021 Donna ColesLeave a comment

The challenge this week came from Candra McRae, where the focus was to use statistics to identify the top & bottom performers, rather than the more common ‘top n’ and ‘bottom n’. By statistics, we’re specifically looking records within the top 25th percentile and the bottom 25th percentile.

So let’s dive in.

Identify Current date
Defining the calculations
Building the chart
Month Selector and interaction

Identify Current date

The data we’re using is the Superstore Sales data from 2021.1 which includes data up to 31st Dec 2021. The requirement talks about the current rolling x months worth of data compared to the previous x months worth of data.

This means we need a way to determine what ‘current’ is. Typically, in a real sales environment, you’d probably only have data up to ‘today’, and I did consider working up a solution based on ‘today’, but equally I like to deliver a solution that I know matches the challenger, as it helps to validate my workings, and also I like to have a solution that I can look back on in the future and know there’s data.

So I took Candra’s hint and based ‘current’ off of the maximum date in the data set, derived by

Max Date

DATE({FIXED:MAX([Order Date])})

Defining the calculations

If you’re a regular reader of my blogs, you’ll know when I can, I like to build the data out into a tabular form, so I can verify the calculations, and then I’ll build out the viz.

First up, we want to get a value for the ‘current rolling n months’.

To define ‘n’ we need a parameter.

pRollingMonth

An integer defaulted to 12. It doesn’t need to be a list, as this will be populated via a parameter action from another sheet – more on that later.

IF [Order Date]>= DATEADD(‘month’, -1 * [pRollingMonth] ,DATETRUNC(‘month’,[Max Date])) AND [Order Date] <= [Max Date] THEN [Sales] END

let’s break this down… DATETRUNC(‘month’,[Max Date]) truncates the Max Date which is 31st Dec 2021 to the 1st of the month ie it returns 01 Dec 2021.

DATEADD(‘month’, -1 * [pRollingMonth] ,DATETRUNC(‘month’,[Max Date])) , is then going back to the 12 months prior (-1×12=-12) , so is 01 Dec 2020.

So we’re only going to get a sales value if the Order Date is >= 01 Dec 2020 and <= 31 Dec 2021 (essentially 13 months of sales data).

For the previous sales, we first need

Prev Month

DATEADD(‘month’, -1, [Max Date])

so in our current example, this will be 30 Nov 2021.

and then to get the previous rolling 12 month sales, we can apply similar logic using Prev Month instead of Max Date

Previous Rolling Sales

IF [Order Date]>= DATEADD(‘month’, -1 * [pRollingMonth] ,DATETRUNC(‘month’,[Prev Month])) AND [Order Date] <= [Prev Month] THEN [Sales] END

Both these fields can be formatted to 1 decimal place, $ prefix and format in thousands (k).

And then we also need a difference to display on the tooltip

Difference

SUM([Current Rolling Sales]) – SUM([Previous Rolling Sales])

This needs to be additionally formatted so that negatives are displayed in brackets ().

Add all these into a table and sort by the Current Rolling Sales descending

So we’ve got the data needed for the bar, the line and the tooltip. We now need to work on the crux of the challenge – the calculations needed to identify the top & bottom.

We’re looking to identify the 25th percentile value based on Current Rolling Sales values displayed on screen

25th Percentile

WINDOW_PERCENTILE(SUM([Current Rolling Sales]),0.25)

and also we need the 75th percentile

75th Percentile

WINDOW_PERCENTILE(SUM([Current Rolling Sales]),0.25)

If you pop these table calcs into the table, you’ll see the values for each field are the same for each row

and with these we can now identify where each row falls

Colour

IF SUM([Current Rolling Sales]) >= [75th Percentile] THEN ‘Top’
ELSEIF SUM([Current Rolling Sales]) <= [25th Percentile] THEN ‘Bottom’
ELSE ‘Middle’
END

Finally we need to identify the rows with a negative difference and flag with a circle.

Sales Contraction Indicator

IF [Difference]<0 THEN ‘●’ ELSE ” END

I use this site to get the symbols for these types of requirements,

Pop these two fields in to the table, and you’ve got all the data needed to build the chart:

Building the chart

Candra states that we can’t use a reference line to display the previous sales data, so for the core chart we need to build a dual axis chart plotting Sub-Category against Current Rolling Sales (bar chart) and Previous Rolling Sales (gantt chart).

Current Rolling Sales is coloured by Colour. I created a Label:Current Rolling Sales field just based on Current Rolling Sales but formatted to 0dp to add to the Label shelf.

To get the circles displayed, and to retain the order of the display, duplicate the **Sub-Category** field so you have a **Sub-Category (copy)** field. Add this to **Rows** alongside the existing **Sub-Category** field.

Then add the Sales Contraction Indicator field between these 2 fields, and format the font of that field so it is in red text (getting a coloured circle, was the part of this challenge I struggled most over, yet it really was very simple once the penny dropped!).

Then hide the first Sub-Category field (uncheck Show Header) so it no longer displays.

Apply various formatting to remove the row & column lines, gridlines etc, and adjust the tooltip and you should be done.

Month Selector and Interaction

A separate sheet is needed for this. We need to build a basic viz that has 12 data points with values 1-12. And we can get this from the Order Date field

Month Order Date

MONTH([Order Date])

will return the month number

Add this field as a discrete (blue) pill to the Columns shelf, set the mark type to square, and add Month Order Date to the Label shelf too. Colour the mark pale grey, and remove all borders etc, and hide the headers

When you add the 2 sheets onto the dashboard, you need to set a parameter action from the Month Selector sheet that sets the pRollingMonth parameter, using the value from the Month Order Date field. When unselecting, the value should default back to 12.

Hopefully there’s enough here to get you to the end! My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Profitability with Dual Axis Charts

July 1, 2021 Donna ColesLeave a comment

Luke Stanke returned for this week’s challenge, to build a pareto chart & bar chart on an unsynchronised dual axis. The crux of this challenge is table calculations, so as with any challenge like this, I’m going to build out what I need in tabular form first, so I can thoroughly validate I’m getting the right values. Once that is done, I’ll build the chart, then finally I’ll look at how to get the measures needed for the subtitle text.

Defining the core calculations
Building the chart
Working out the measures for the subtitle

Defining the core calculations

For the pareto, we need to plot % of orders against cumulative profit, so we need to build up some fields to get to these.

Add Order ID to Rows and Profit to Text and sort by Profit descending.

For the cumulative profit, we can add a Running Total Quick Table Calculation to the Profit pill

Add another Profit pill back into the view, and you can see how the table calculation is adding up the values of the Profit from the previous rows.

The triangle symbol indicates the field is a table calculation. By default, if you edit the table calculation, the calculation is computing down the table. I always choose to ‘fix’ how my calculations are computing, so that the values don’t inadvertently change if I move the pill elsewhere. So I recommend you set the table calc to Compute Using Order ID

I also want to ‘bake’ this table calculation into the data model (ie create a dedicated calculated field) that I can pick up and reuse. The simplest way to do this is to press Ctrl, then drag the field into the left hand data field pane (this will effectively copy the field rather than remove it from the view). Name the field and then you can verify it’s contents.

Cumulative Profit

RUNNING_SUM(SUM([Profit]))

So that’s one of the measures we need. Onto the next.

First of all we need to get a cumulative count of the number of orders.

Count Orders

COUNTD([Order ID])

Add this to the measures and it will display the value 1 per row (since each row is an order). Add a Running Total Quick Table Calculation to this field too, and again set to Compute Using Order ID. ‘Bake’ this into the data model too, by dragging the field as described above, and create a new field

Cumulative Order Count

RUNNING_SUM([Count Orders])

Now we need to get a handle on the total number of orders. I could do this with a LoD, but will stick with table calcs

Total Order Count

WINDOW_SUM([Count Orders])

Add to the view, and compute using Order ID again.

Now we can calculate the cumulative % of total orders

Cumulative % of Total Orders

[Cumulative Order Count]/[Total Order Count]

Format this to a % with 2 dp.

Add to the view and again compute using Order ID. You should see the values increase until 100%.

NOTE – I could have got this value by adding a Running Total table calculation to the Order Count field, and then editing that table calculation and adding a secondary table calculation to get to the % of total. However, I want to be able to reference the output of this field later on, so having a dedicated calculated field is the better option.

Ok, so now we have the 2 measures we need to plot the basic chart – Cumulative Profit and Cumulative % of Total Orders.

Building the chart

I typically start by duplicating the data sheet and then moving pills around

Duplicate Sheet
Remove Cumulative Order Count and Total Order Count
Move Order ID to the Detail shelf. Reset the sort on this pill to sort by Profit Descending

Remove Measure Names
Move Cumulative Profit to Rows
Move Cumulative % of Total Orders to Columns
Move Profit to Tooltip
Change mark type to Line

Add Sales to Tooltip and adjust tooltip accordingly

The chart needs to be coloured based on whether the marks has a profit > 0 or not. So for this we need

Profit is +ve

SUM([Profit]) >0

Add this to the Colour shelf and adjust accordingly.

Now we can add the second axis by adding Sales to the Rows shelf, then

Change mark type of the Sales marks card to bar
Remove the Profit is +ve field from the Colour shelf
Change the size to the smallest value
Adjust the tooltip
Make dual axis
Bring the Cumulative Profit axis to the front (right click on the axis > move marks to front)

Now the chart just needs to be formatted

remove column and row borders
edit the axis titles
format all the axes to to 8pt, and change the font of the axis title to Times New Roman
format the % of Total Orders axis to be 0dp

Working out the measures for the subtitle

For this, we are going to revert back to the tabular view.

We need to identify the point at which the Profit value starts to become negative. Let’s add the Profit is +ve field to Rows.

We’re looking for the row highlighted, which is the row where the previous value is true, while itself is false, which is achieved by

Profitable Marker

LOOKUP([Profit is +ve],-1) AND NOT([Profit is +ve])

Let’s add this now (ensuring the compute using Order ID)

We need to get a handle on the Cumulative % of Total Orders value for this row, but spread it across all the rows in data set, which we can do by

% of Total Profitable

WINDOW_MAX(IF [Profitable Marker] THEN [Cumulative % of Total Orders] END)

Add this on, compute by Order ID, and you can see the value for the ‘true’ line is displayed against every row. Format this field to % 0 dp.

For the potential profitability decrease, we need to get the Cumulative Profit value for the Profitable Marker row, along with the final (total) Cumulative Profit value.

Total Cumulative Profit

WINDOW_MAX(IF LAST()=0 THEN [Cumulative Profit] END)

This takes the value from the very last row in the data and again spreads across the all the rows.

With this, we can now work out the potential decrease

Potential Profitability Decrease

WINDOW_MAX(IF [Profitable Marker] THEN ([Cumulative Profit]-[Total Cumulative Profit])/[Cumulative Profit] END)

of the profitable marker row, take the difference between the ‘current’ cumulative profit and the final cumulative profit, as a proportion of the current value. Spread this across every row. Format to % of 0dp.

Now, as we have worked out these 2 values, % of Total Profitable and Potential Profitability Decrease to be the same across every row, you can add them to the Detail shelf of the All marks card on the chart viz, and reference them in the Title of the viz. (Don’t forget to ensure all table calc fields are set to compute using Order ID).

My published viz is here.

Happy vizzin’! Stay Safe!

Donna

Donna + DataViz

ramblings in all things Tableau (and occasionally some other Stuff….)

How much change has occurred?

How many consecutive starts?

Can you create a moving average chart with a focus on selected subcategories?

Visualise Our Survey Data

Can you find the needle in the haystack?

Let’s go streaking!

Can you rebuild the Olympic Schedule?

Can you build an app to visualise wildfires?

Can you find the top and bottom performers?

Profitability with Dual Axis Charts