# Can you show the Year-on-Year Trend?

It’s Community Month over at #WOW HQ this month, which means guest posters, and Kyle Yetter kicked it all off with this challenge. Having completed numerous YoY related workbooks both through work and previous #WOW challenges, this looked like it might be relatively straight forward on the surface. But Kyle threw in some curve balls, which I’ll try to explain within this blog. The points I’ll be focussing on

• YoY % calculation for colouring the map
• Displaying the circles on the map
• Restricting the Date parameter to 1st Jan – 14th July only
• Showing Daily or Weekly dates on the viz in tooltip
• Restricting to full weeks only (in weekly view)

YoY % calculation

The data provided includes dates from 1st Jan 2019 to 21st July 2020. We need to be able to show Current Year (CY) values alongside Previous Year (PY) values and the YoY% difference. I built up the following calculations for all this

Today

#2020-07-15#

This is just hardcoded based on the requirement. In a business scenario where the data changes, you may use the TODAY() function to get the current date.

Current Year

YEAR([Today])

simply returns 2020, which I could have hardcoded as well, but I prefer to build solutions as if the data were more dynamic.

CY

IF YEAR([Subscription Date]) = [Current Year] THEN [Subscription] END

stores the value of the Subscription field but only for records associated to 2020

PY

IF YEAR([Subscription Date]) = [Current Year]-1 THEN [Subscription] END

stores the value of the Subscription field but only for records associated to 2019 (ie 2020-1)

YoY%

(SUM([CY])- SUM([PY]))/SUM([PY])

format this to a percentage with 0 decimal places. This ultimately is the measure used to colour the map. CY, PY & YoY% are also referenced on the Tooltip.

Displaying circles on the map

This is achieved using a dual axis map (via a second instance of the Latitude pill on Rows). One ‘axis’ is a map mark type coloured by the YoY% and the other is a circle mark type, sized by CY, explicitly coloured black.

The Tooltip for the circle mark type also shows the % of Total subscriptions for the current year, which is a Percent of Total Quick Table Calculation

Restricting the Date parameter to 1st Jan – 14th July only

As mentioned the Subscription Date contains dates from 01 Jan 2019 to 21 July 2020, but we can’t simply add a filter restricting this date to 01 Jan 20 to 14 Jul 20 as that would remove all the rows associated to the 2019 data which we need available to provide the PY and YoY% values.

So to solve this we need a new date field, and we need to baseline / normalise the dates in the data set to all align to the same year.

Baseline Date

//set all dates to be based on current year
MAKEDATE([Current Year], MONTH([Subscription Date]), DAY([Subscription Date]))

So if the Subscription Date is 01 Jan 2019, the equivalent Baseline Date associated will be 01 Jan 2020. The Subscription Date of 01 Jan 2020 will also have a Baseline Date of 01 Jan 2020.

We also want to ensure we don’t have dates beyond ‘today’

Include Dates < Today

[Baseline Date]< [Today]

Add Include Dates < Today to the Filter shelf, and set to True.

Add Baseline Date to the Filter shelf, choose Range of Dates , and by default the dates 01 Jan 2020 to 14 Jul 2020 should be displayed

Select to Show Filter, and when the filter displays, select the drop down arrow (top right) and change to Only Relevant Values

Whilst you can edit the start and end dates in the filter to be before/after the specific dates, this won’t actually use those dates, and the filter control slider can only be moved between the range we want.

The Baseline Date field should then be custom formatted to mmmm dd to display the dates in the January 01 format.

Showing Daily or Weekly dates on the viz in tooltip

The requirements state that if the date range selected is <=30 days, the trend chart shown on the Viz in Tooltip should display daily data, otherwise it should be weekly figures, where the week ‘starts’ on the minimum date selected in the range.

There’s a lot going on to meet this requirement.

First up we need to be able to identify the min & max dates selected by the user via the Baseline Date filter.

This did cause me some trouble. I knew what I wanted, but struggled. A FIXED LOD always gave me the 1st Jan 2020 for the Min Date, regardless of where I moved the slider, whereas a WINDOW_MIN() table calculation function caused issues as it required the data displayed to be at a level of detail that I didn’t want.

A peak at Kyle’s solution and I found he’d added the date filters to context. This means a FIXED LOD would then return the min & max dates I was after.

Min Date

{MIN([Baseline Date])}

Note this is a shortened notation for {FIXED : MIN([Baseline Date])}

Max Date

{MAX([Baseline Date])}

With these, we can work out

Days between Min & Max

DATEDIFF(‘day’,[Min Date], [Max Date])

which in turn we can categorise

Daily | Weekly

IF [Days between Min & Max]<=30 THEN ‘Daily’ ELSE ‘Weekly’ END

We also need to understand the day the weeks will start on.

Day of Week Min Date

DATEPART(‘weekday’,[Min Date])

This returns a number from 1 (Sunday) to 7 (Saturday) based on the Min Date selected.

Using this we can essentially ‘categorise’ and therefore ‘group’ the Baseline Date into the appropriate week.

Baseline Date Week

CASE [Day of Week Min Date]
WHEN 1 THEN DATETRUNC(‘week’,([Baseline Date]),’Sunday’)
WHEN 2 THEN DATETRUNC(‘week’,([Baseline Date]),’Monday’)
WHEN 3 THEN DATETRUNC(‘week’,([Baseline Date]),’Tuesday’)
WHEN 4 THEN DATETRUNC(‘week’,([Baseline Date]),’Wednesday’)
WHEN 5 THEN DATETRUNC(‘week’,([Baseline Date]),’Thursday’)
WHEN 6 THEN DATETRUNC(‘week’,([Baseline Date]),’Friday’)
WHEN 7 THEN DATETRUNC(‘week’,([Baseline Date]),’Saturday’)
END

Ideally we want to simplify this using something like DATETRUNC(‘week’, [Baseline Date], DATEPART(‘weekday’, [Min Date])), but unfortunately, at this point, Tableau won’t accept a function as the 3rd parameter of the DATETRUNC function.

Let’s just have a look at what we’ve got so far

So now we have all this information, we need yet another date field that will be plotted on the date axis of the Viz in Tooltip.

Date to Plot

IF [Days between Min & Max] <=30 THEN ([Baseline Date]) ELSE [Baseline Date Week] END

If you add this field to the tabular display I built out above, you can see how the value changes as you move the filter dates to be within 30 days of each other and out again.

When added to the actual viz, this field is formatted to dd mmm ie 01 Jan, and then is plotted as a continuous, exact date (green pill) field on the Columns alongside the Daily | Weekly field, with State & Subscription on Rows. The YEAR(Subscription Date) provides the separation of the data into 2 lines.

Restricting to full weeks only (in weekly view)

The requirements state only full weeks (ie 7 days of data) should be included when the data is plotted at a weekly level. For this we need to ascertain the ‘week’ the maximum date falls in

Max Date Week

CASE [Day of Week Min Date]
WHEN 1 THEN DATETRUNC(‘week’,([Max Date]),’Sunday’)
WHEN 2 THEN DATETRUNC(‘week’,([Max Date]),’Monday’)
WHEN 3 THEN DATETRUNC(‘week’,([Max Date]),’Tuesday’)
WHEN 4 THEN DATETRUNC(‘week’,([Max Date]),’Wednesday’)
WHEN 5 THEN DATETRUNC(‘week’,([Max Date]),’Thursday’)
WHEN 6 THEN DATETRUNC(‘week’,([Max Date]),’Friday’)
WHEN 7 THEN DATETRUNC(‘week’,([Max Date]),’Saturday’)
END

so if the maximum date selected is a Thursday (eg Thurs 11th June 2020) but the minimum date happens to be a Tuesday, then the week starts on a Tuesday, and this field will return the previous Tuesday date (eg Tues 9th June 2020).

And then to restrict to complete weeks only…

Full Weeks Only

IF [Daily | Weekly]=’Weekly’ THEN
[Date To Plot]< [Max Date Week]
ELSE TRUE
END

If we’re in the ‘weekly’ mode, the Date To Plot field will be storing dates related to the start of the week, so will return true for all records where the field is less than the week of the max date. Otherwise if we’re in ‘daily’ mode we just want all records.

This field is added to the Filter shelf and set to true.

Hopefully that covers off all the complicated bits you need to know to complete this challenge. My published solution is here.

Happy vizzin’! Stay Safe!

Donna

# Can you find all counties within n miles of a selected county?

I’ve been on annual leave for a few weeks. I’ve managed to catch up on all the challenges but haven’t blogged a solution for a while. It’s been a real struggle to get back into things to be honest – back to work, back to school, back to football clubs for my kids, and I’m wondering where I found the time before 😦

Trying to think about when/if I was going to get this blog out has caused me a bit of stress, which I don’t need, and so I need to change my mindset a bit…try to relax a bit … if I don’t manage to post a blog, just accept it and move on. I’ve also got to try to reduce the time it takes me to blog. I’m a ‘detail’ person, so often end up documenting to such a low level, but for my own sanity, I’m going to have to make an effort to be a bit briefer. I can’t guarantee I’ll stick to this though.. will just have to see how things pan out.

Going forward, I’m going to try to focus on the points that I think are key to the challenge, or those I found a bit tricksy.

So onto this week’s challenge. Sean Miller returned as guest poster with a COVID-19 related distance challenge.

Note – I connected to the provided hyper extract file rather than the csv file.

The core points I’m going to discuss are

• Identify counties within n miles
• How to select a county without set actions
• How to handle ‘All’ counties being selected
• Colouring based on ‘percentile’ of hospital beds
• How to only show county borders of counties within selected range

Identify counties within n miles

The number of miles is stored within an integer parameter, n miles, that is defaulted to 100.

A county to act as the ‘start point’ is stored in a set, Selected County, based on the County, State field.

The long/lat coordinates of this selected county need to be captured.

Selected County Lat

{FIXED : AVG(IF [Selected County] THEN [Latitude] END)}

Selected County Long

{FIXED : AVG(IF [Selected County] THEN [Longitude] END)}

By using a FIXED LoD calculation, the values are stored against every ‘row’ in the data set.

With these, the starting point can be determined

Selected County Start Point

MAKEPOINT([Selected County Lat],[Selected County Long])

The position of every other county – the ‘destination’ / end point – also needs to be determined

End Point

MAKEPOINT([Latitude],[Longitude])

With these, the distance between can be computed

Distance

DISTANCE([Selected County Start Point],[End Point],’miles’)

And the counties can then be restricted by

Within n miles

[Distance]<= [n miles]

which can be used as a filter and set to True.

How to select a county without set actions

This is managed via the set control feature; right click on the Selected County set and choose Show Set to display the list of counties with the option to select which ones are in or out of the set. Change the display of the control via the dropdown arrow (top right) to be Single value dropdown which automatically provides as ‘All’ option or the ability to select a single set only.

How to handle ‘All’ counties being selected

When ‘All’ is chosen via the Set Control selector, this has the effect of adding all the State, County values into the set, which means we don’t really have a starting state. So the n miles parameter is essentially redundant. But we need to make the Within n miles filter understand this.

We can manage this by first identifying how many values we have in the set

Count Selected States

{FIXED : COUNTD(IF [Selected County] THEN [County, State] END)}

There will either be 1 or if All is selected, then they’ll be 3000+, and we use the FIXED LoD, so the total is stored against all rows.

We can then update our Within n miles filter to consider this value

Within n miles

([Count Selected States]=1 AND [Distance]<= [n miles]) OR [Count Selected States] > 1

This now returns true if either one State, County is selected and the other records are within n miles OR all the states are selected (the count is > 1).

Colouring based on ‘percentile’ of hospital beds

This caused me a little bit of headscratching. I assumed ‘percentile’ would be based on a percentage of the total num_staffed_beds (note I simply renamed this field to Hospital Beds), associated to the state,counties being displayed (ie within n miles). But after building the calculation I thought, and adding it to colour and choosing the Green-Gold colour range, I didn’t get the same spread as shown in the solution.

I messaged Sean to question whether I’d made the right assumption, but while waiting for a response, I did an online search for “Tableau Percentile Rank”, and quickly spotted that a Table Calculation exists. As you can guess, I don’t use this particular calculation very much at all 🙂

So to colour, add Hospital Beds , and simply add a Percentile table calculation

How to only show county borders of counties within selected range

When working with Maps, you can use the options within the Map Layers menu to select which features of the map should be displayed. One of these options is State/Province Borders, which you might think is what is needed.

However, this will show all the state borders, which is evident if you zoom out a bit, including those for states that don’t have counties within n miles.

This isn’t the requirement – we only want to show the borders of the states which have counties within the desired range. So instead, we don’t want to show any State/Province Borders via the Map Layers. And we’ll utilise a dual axis instead, by first identifying the states that are within n miles

States within n miles

If [Within n miles] THEN [State] END

Duplicate the Latitude (generated) field on the Rows, remove all the fields on that mark, and then add the States within n miles to the Detail shelf. Set the colour of this mark to white, and swap the Latitude (generated) fields if needed, so the States are on the top, as below.

Then make dual axis 🙂

And that’s the key points that I hope will help you solve this challenge. My published version is here, which is available to be downloaded. If there’s anything I haven’t covered that you’re not sure about, feel free to contact me.

Happy vizzin’! Stay safe!

Donna

# Are Sales on track with Goal?

This week’s #WOW challenge was set by Ann primarily to demonstrate the recent 2020.2 release of relationships in Tableau. This is a change to how the data modelling works when combining data together, in this case actual sales data and sales goal (target) data. Ann provided a ‘tweaked’ version of superstore sales data for the actual sales, alongside a custom ‘goals’ data set which provided a sales goal value for each month per Segment and Category (ie the goals data was not as granular as the actual sales data).

Combining the Data

Obviously, to make use of the new relationships feature, you’ll need at least v2020.2.1 installed.

When you add the 2 data sources to the Data Source pane and add the Orders sheet from the Superstore data and then add the Sheet 1 from the Sales Goals data, you’re prompted to add a relationship if matches can’t automatically be found (ie the field names aren’t identical).

I added relationships for the Segment, Category and Month of Order Date fields, which then provides a ‘noodle’ link between the two sets of data

Note – to add multiple relationships, you have to close the field listing to then see the Add more fields option.

With relationships, Tableau is now smarter at how it manages the queries between the data. I’m not going to go into it at length (as there’s plenty of good Tableau documentation on the subject), but I’ll quickly show the difference between this and a traditional ‘join’.

In the Relationships model, if I show Sales and Goal by Month & Segment, the Goal is based on the sum of the rows associated to that Month and Segment that you can find if you looked at the source spreadsheet directly

But if I was using a traditional join with these data sets :

and then presented the same info as above

The Goal data is way off, as it’s being influenced by all the additional rows of data that have been constructed by the join clause.

If we expand to the Category level in each example above (the level at which the Goal data is stored), the Relationship version still shows the correct value

while the Join version is still incorrect, and only changing to an Average aggregation do we get the figure needed

Joining the data requires more work (calculations / knowledge) to ensure the Goal data is reporting correctly at the various levels, whereas the Relationships model ‘just works’.

Building the Table

When I started building the solution, I actually started with the YoY trend chart, but found I had to revisit it (swap a couple of pills about) as I progressed through, so I’m going to start with the table of data itself.

This turned out to be really quite tricksy… it’s just a table right… looks harmful enough…. but Ann had lobbed a few ‘gotcha’ grenades into the mix, that took a bit of head-scratching!

First up, Ann likes to use capitalisation, so I simply renamed the following fields

• Category (from Sheet 1 – the Goals data) -> CATEGORY
• Segment (from Sheet 1 – the Goals data) -> SEGMENT
• Month of Order Date (from Sheet 1 – the Goals data) -> MONTH
• Sub-Category (from Orders – the Superstore Sales data) -> SUB-CATEGORY

And I then created a hierarchy called SEGMENT which stored SEGMENT – > CATEGORY -> SUB-CATEGORY.

We also needed to ‘fake’ today so we can show sales to date. The Orders Superstore data set contains data from the start of 2016 to the end of 2019. We needed to pretend ‘today’ was 1st July 2019 and not show any sales beyond the end of the month. So I created the field

Today

#2019-07-01#

and then

ACTUAL SALES

IF DATETRUNC(‘month’,[Order Date])<= [Today] THEN [Sales] END

If we build out the basics of the table

• SEGMENT hierarchy on Rows
• MONTH (discrete, exact date, formatted to mmmm yyyy) on Rows
• Measure Names on Columns and on Filter, filtered to ACTUAL SALES and Goal
• Measure Values on Text
• MONTH on Filter, set to exclude 2016

we get

and as we expand the hierarchy to the next level, the ACTUAL SALES and Goal adjusts as expected

but when we reach the 3rd level we see a couple of issues

1. The Goal is showing the values that were displayed at the CATEGORY level, since it isn’t set at any further level. The requirement we have is to show the value ‘No Goal’ instead once we reach this point.
2. We also have a set of values with a NULL SUB-CATEGORY. This is because there are dates with goal values in the Goals data set, but we have no Sales data for those dates, so a match at the SUB-CATEGORY level can’t be found. We need to remove these records when at this level, and actually only show records where we have ACTUAL SALES, so records from July 2019 onwards should also be excluded.

At this point, I did end up having a sneak look at Ann’s solution. I had an idea of what I wanted to do, BUT I was worried that I may be missing something related to the new data model… that perhaps there was a feature of relationships that I needed to be exploiting as part of this, rather than go down the table calculation route I was heading towards. After all, part of doing these challenges for me is to make use of new features & functions if I can, so I have a reference workbook for future use. I had a read up on the blog posts related to relationships, but couldn’t spot anything obvious, hence a quick check at Ann’s solution…. nope nothing special required in respect of the data model….so instead, we need some additional calculations to help resolve all this.

What we need is some way to identify when we’re at the lowest SUB-CATEGORY level. We’ll start by counting the number of distinct sub-categories we have

# Sub Cats

ZN(COUNTD([SUB-CATEGORY]))

Wrapped in a ZN will return the value 0 when none exist.

Dropping this into the table of data, we can see the 0s for all the future Goal records, and 1 for the rest of the rows, since we’ve expanded to the SUB-CATEGORY level.

If you collapse up to the CATEGORY level, the #Sub Cats count changes, as each row typically contains more than 1 SUB-CATEGORY.

The term ‘typically’ in the above statement though is the reason we can’t assume we’re at the lowest level if the count is 1 though. It is possible for the sales against a particular CATEGORY in a MONTH to only span a single SUB-CATEGORY (Check out Home Office -> Furniture in Jan 2018). So we also need

Max Sub-Cats in Window

WINDOW_MAX([# Sub-Cats])

Which will count the maximum number of SUB-CATEGORY records for each SEGMENT and CATEGORY. In the case of Furniture Sales in the Home Office in Jan 2018, you can see below the count of sub-categories is 1 but the maximum count in the window is 4.

This shows we’re not at the SUB-CATGEORY level, as when we are, the #Sub Cats = Max Sub Cats in Window (ie 1=1)

So with that knowledge we can now derive an alternative field for the goal

SALES GOAL

IF ([# Sub-Cats] = 1 AND [# Sub-Cats]=[Max Sub-Cats in Window]) THEN 0 ELSE SUM([Goal]) END

Remove the original Goal field from the table and add this instead, setting the table calculation to compute by MONTH and you’ll see the SALES GOAL value change from a value to 0 as you expand down to the SUB-CATEGORY level.

So how do we show ‘NO GOAL’ instead of 0? Well this is a bit of a sneaky formatting trick, and one I couldn’t figure out (this step was actually the last thing I changed before publishing).

In my original SALES GOAL calculation I had set the value to be NULL rather than 0 expecting there to be an option to show the NULL values as something else (ie NO GOAL). But the feature I was looking for wasn’t available (maybe I was mistaken about when it appears.. maybe it’s disappeared in 2020.2… I still need to investigate). Anyhow, my alternative was to create a string field to store the value in, but this would make formatting the number quite cumbersome, and I was sure that wasn’t the case.

So I had to check Ann’s solution. Instead of NULL, she had set the value to be 0, and consequently used the following custom number formatting :

“\$”#,##0;-“\$”#,##0;”NO GOAL”

where the last bit is how to format 0. Very very sneaky huh!

So we’ve worked out how to identify when we’re at the SUB-CATEGORY level, and managed to display ‘NO GOAL’ when at that level, now we need to filter out some of the rows (but only when at this level).

Records to Show

IF [Max Sub-Cats in Window] = 0 THEN FALSE
ELSEIF ([# Sub-Cats] = 1 AND [# Sub-Cats]=[Max Sub-Cats in Window])
THEN //at lowest level
IF ATTR([MONTH]) < [Today] THEN TRUE
ELSE FALSE
END
ELSEIF [# Sub-Cats] = 0 AND [Max Sub-Cats in Window] = 1 THEN FALSE //still at lowest level but no sales so exclude
ELSE TRUE
END

This took a bit of trial & error to work out, and there may well be something much more simplified.

Adding this to the table, and again setting to compute by MONTH you can see which rows are reporting True and False. When collapsed to the SEGMENT or CATEGORY level, all rows are True, but the values start to vary when at the SUB-CATEGORY level.

Now, we need to work on the fields to help show us the actual & % difference.

RESULT

IF ([# Sub-Cats] = 1 AND [# Sub-Cats]=[Max Sub-Cats in Window]) THEN NULL ELSE SUM([ACTUAL SALES])-([SALES GOAL]) END

The difference should only show if we’re not at the lowest level. This needs to be custom formatted as

↑#,##0;↓#,##0

I use this site to find the arrows needed.

ACTUAL vs GOAL

[RESULT]/SUM([ACTUAL SALES])

will give me the % difference (format to percentage with 0 dp).

And the final piece of data we need (I think) is to compute a RAG status for each row. This is dependent on some parameters

GREEN (WITHIN X%)

Type Float, set to 0.1 but displayed as a percentage

and similarly

RED (ABOVE Y%)

With these we can work out

RAG

IF ABS([ACTUAL vs GOAL]) > [RED (ABOVE Y%)] THEN ‘WAY OFF TRACK’
ELSEIF ABS([ACTUAL vs GOAL]) < [GREEN (WITHIN X%)] THEN ‘ON TRACK’ ELSEIF ATTR([MONTH])>=[Today] THEN ‘NO COMPARISON’
ELSEIF [SALES GOAL]=0 THEN ‘NO COMPARISON’
ELSE ‘OFF TRACK’
END

If you add all these fields to your table then you can check how they all compare/change as you expand/collapse the hierarchy (make sure you set the compute by MONTH on all fields).

So now we’ve got all the building blocks you can create the actual table.

I’d recommend duplicating the above (and keeping as a ‘check data’ sheet). Then remove the fields you don’t need, and add the Records To Keep = True to the Filter shelf. Format the table appropriately, and add the title so you’re left with

Building the Line Chart

As mentioned above, I started with this chart, but ended up having to revisit it as I found I needed to use different pills that got created as I built the table. There is actually 1 additional field needed, which isn’t obvious until you start playing with the dashboard actions later:

GOAL

IF [SALES GOAL] > 0 THEN [SALES GOAL] END

This is needed so that when we filter by SUB-CATEGORY later and there is no goal, we only get a single line displaying the sales values, rather than also having a line plotted at 0.

• MONTH on Filter, excluding Year 2016
• MONTH (exact date, continuous) on Columns
• Measure Values on Rows and Filtered to ACTUAL SALES and GOAL
• Coloured accordingly by Measure Names

Then add TODAY to Detail and add a Reference Line to the MONTH axis, setting it to TODAY, adjusting the format of the line, and setting the fill above colour to grey.

Format the gridlines, tooltips & axis accordingly to match

Barcode / Strip Chart

For this we’re creating a bar chart using MIN(1), and fixing the axis to range from 0-1 so it fills up the entire space as follows

• MONTH on Filter excluding Year 2016
• MONTH as exact date, discrete (blue pill) on Columns
• MIN(1) on Rows
• RAG on Colour adjusted accordingly
• ACTUAL vs GOAL on Tooltip

Then use the ‘type in’ functionality to create a pill labelled ‘vs Goal’ on the Rows

Creating the Dashboard

I used a vertical container to add the components onto the dashboard, as I used blank objects that are formatted with a black background, 0 padding and a height of 4 to create the horizontal line separators.

The interactivity between the table and the other charts is managed via Dashboard Filter actions.

To get these to work properly though, and in such a way that the selected SEGMENT and/or CATEGORY and/or SUB-CATEGORY can be referenced in the title of the line chart, 3 separate dashboard actions are required, with each one specifying the field to filter by, as follows :

Once these are set up, the associated pills will appear on the Filter shelf of the Line and Barcode charts, and are then available for selection when building the title of the line chart

You should now have your completed chart.

My published version is here.

Happy vizzin’! Stay Safe!

Donna

# Can you compare a 3-day vs 14-day moving average and describe the latest trend?

This week for #WOW2020, Ann provided a table calculation feast of a challenge! This certainly is not for the faint-hearted! As well as cracking all the table calcs, the challenge features multiple views, measure swapping, parameters, BANs, filtering, sorting …. it’s got it all going on!

Ann hinted you’d probably want to start with the table, and even if there hadn’t been a table output in the display, this is what I would have done. If you’ve read enough of my blogs, you’ll know I often like to build up a ‘check data’ sheet, which just contains the data I need in tabular form as a quick reference. When working with table calculations this is an absolute must have!

So let’s build out that Check Data table to start with. I have a feeling this is going to be a lengthy blog 🙂

Initial Set up

First up, the requirements stated that the latest date would be 7 June, but I found records with a 8 June date. All the associated info for this date was null though, so I set a data source filter to exclude this. This means I wouldn’t get any issues if I needed to store the max date in a FIXED LoD calculation at any point.

I also found it easier to rename a couple of the measures provided to match the output, so rename PEOPLE_POSITIVE_NEW_CASES_COUNT to New Cases and PEOPLE_POSITIVE_CASES_COUNT to Reported Cases. I’ll refer to these renamed fields going forward.

Building all the Calculated Fields

To build out the table, we’re just going to focus on one State & County, as there’s a lot of data. So add Province State Name = Tennessee and County = Davidson to the Filter shelf.

Add Report Date (discrete exact date – blue pill) and New Cases & Reported to Rows. As you scroll down, you’ll see data starting to come in on 8 March.

We want to create our moving average calculations

3 Day Moving Avg

WINDOW_AVG(SUM([New Cases]), -2, 0)

14 Day Moving Avg

WINDOW_AVG(SUM([New Cases]), -13, 0)

Notice the number of rows to average over is 1 less than you might expect, as the current row is included, so the calculation is saying ‘current row’ and 2 | 13 previous rows.

Add these to the table, and adjust the table calculation so it is explicitly calculating by Report Date. This would have happened automatically, as the calculation would have been computing ‘down’ the table, but it’s best to fix the computation, so it doesn’t matter where the pill gets moved to in the view.

We now need to work out whether there is an increase or not between the 3-day and 14-day average.

Is Increase?

IF [3 Day Moving Avg] > [14 Day Moving Avg] THEN 1 ELSE 0 END

Is Decrease?

IF [3 Day Moving Avg] <= [14 Day Moving Avg] THEN 1 ELSE 0 END

I’m using 1s and 0s as it’s going to help with a later calculation.

NOTE – I’m assuming that if there is ‘no change’ it’ll be recorded as a decrease. This is how I interpreted the requirement, “ …whether it is an increase or a decrease (or no change)” and it wasn’t easy to find any matches anyway.

I also need some text to indicate the increase or decrease

Increase | Decrease

UPPER(IF [Is Increase?]=1 THEN ‘Increase’ ELSE ‘Decrease’ END)

The UPPER is used as that’s part of the tooltip formatting.

Let’s get these onto the view, always making sure the table calculations are set to Report Date.

We need to calculate the number of days that has been reported INCREASE in succession, and the number of days where successive DECREASE has been reported.

So first, let’s identify which rows match the previous row.

Match Prev Value?

LOOKUP([Is Increase?],-1) = [Is Increase?]

If the value of the Is Increase? field in the previous (-1) row is the same as the Is Increase? field in the current row, then this is true, else false.

Add to the view, and verify the table calculation for itself and all nested calculations being referenced, is set to Report Date.

We now have all the information we need to help us work out the number of days in the increase/decrease ‘trend’.

Days in Trend

IF (FIRST()=0) OR(NOT([Match Prev Value?])) THEN 1
ELSEIF [Increase | Decrease] = ‘INCREASE’ THEN ([Is Increase?]+PREVIOUS_VALUE([Is Increase?]))
ELSEIF [Increase | Decrease] = ‘DECREASE’ THEN ([Is Decrease? ]+PREVIOUS_VALUE([Is Decrease? ]))
END

If the row in the table is the very first entry (so there’s nothing previous to compare against), or the row in the table didn’t match it’s predecessor (ie there was a change), then we’re starting a new ‘trend run’, which obviously starts at 1.

Otherwise, if the current row we’re on indicates an increase, then we’ll add the value of the Is Increase? field (which is 1) to the previous value (which is also 1). PREVIOUS_VALUE works recursively though, so it essentially builds up a running sum, which gives our trend.

We ultimately do the same thing using the Is Decrease? column. This is why using 1 & 0s in the earlier calculation help.

Adding into the view, and setting the table calculation correctly, you should get something similar to this…

Finally, there’s one key field we need to add; something to help identify the latest row as we will need it for filtering in the table that’s displayed on the dashboard. Simply applying a standard ‘quick filter’ won’t work, as the table requires we show the 3-day & 14-day moving averages. A ‘quick filter’ to limit the data to the latest date (7th June), will show the wrong values, as the data related to the other days will be filtered out, so the table calc won’t have the information to correctly compute over.

We need to create another table calculation that we can use as a filter, and that due to Tableau’s ‘order of operations’ will apply later in the filtering process than a traditional quick filter.

Max Date

{FIXED : MAX([Report Date])}

The latest date in the whole data set.

Show Data for Latest Date

LOOKUP(MIN([Report Date]),0) = MIN([Max Date])

If the Report Date of the current row is the same as the maximum date in the whole data set, then return true.

We’ve now got all the core data components we need to create the various charts.

In the interest of time (my time in writing this out), I’m going to attempt not to describe the building of all the charts in too much detail, but just call out the useful bits you might need. If you’re attempting this challenge with the table calcs above, I’m assuming you know Tableau enough to not need everything defined to the lowest level.

The whole report is driven off a parameter which the user must enter a State – County combo.

You’ll need a calculated field to store the combo

State – County

[Province State Name] + ‘ – ‘ + [County]

and then create a parameter (State – County Parameter) off of this (right click, Create -> Parameter) which will create a string parameter with all the permutations.

When displaying on the dashboard, set this to be of type Type In

BAN

The BAN is a basic summary of the latest trend for the entered state county.

We need to filter the sheet to the value entered in the parameter

Is Selected State County?

[State – County Parameter] = [State – County]

Add this to the Filter shelf as true, along with the Show Data for Latest Date.

Add the relevant fields to the Text shelf to display the required text. The Report Date needs to be custom formatted to ddd, mmm d to get the Sun, Jun 7 display

Map

For the map, as well as filtering the latest date, we’re also going to need to filter just to the state only (not state & county) as above. So I created

Is Selected State?

LEFT([State – County Parameter], FIND([State – County Parameter],’-‘)-2) = [Province State Name]

This is unpicking the State – County combined string stored in the parameter, to just find the State part and compare to the Province State Name.

Build a filled map based on County and filter to the latest date and the selected state. I set the Map Layers to that below, which seems to match up

You’ll need to set both the Is Selected State County? and Increase|Decrease fields to the Colour shelf.

Bar & Line Chart

You’re going to need a few more calculated fields for this.

Moving Avg Selector

for the user to choose what the line should display. I’ve set it to an intger parameter that displays text

We then need a field to show on the display depending on what’s been selected in the parameter

Moving Avg to Display

If [Moving Avg Selector] = 3 THEN [3 Day Moving Avg] ELSE [14 Day Moving Avg] END

You’ll need a Dual Axis chart plotting New Cases and Moving Avg to Display against Report Date (continuous exact date)

The data only starts from 8th March, so I added Report Date to filter to start from 8th March. 8th March is also added as a constant reference line.

Table

Based on the State / County entered, the table is filtered to show the data for the latest date for all the counties in the state entered. Although not stated in the requirements, the first row is the county selected, with the rest ordered by Reported Cases.

You can get the selected county to the top, by adding Is Selected State County as a hidden field to the Rows, and moving ‘True’ to the top.

And that should be everything you need to build the dashboard, which is pretty much just stacking all the sheets one on top of each other in a single column.

My published viz is here.

Happy vizzin’! Stay Safe!

Donna

# Which products are most profitable?

I’m starting to write this blog with a bit of uncertainty today as I’m not ultimately sure where I’m going to end up….

I wasn’t even sure I was going to pen an entry this week…. I found Luke’s challenge tough, and the path I took full of multiple wrong turns, that meant trying to write out a comprehensible ‘how I did it’ quite tricky, as reviewing now I’m questioning ‘what did I do that for…’.

Whilst with perseverance and a bit of inspiration from Rob Saunders, I did manage to post a working solution, I knew I wasn’t overly happy with it, particularly because my expand and collapse functions didn’t behave as I saw others do… I had to click twice to collapse.

Before starting to write, I decided to check out Luke’s workbook which he’d finally published to see if I could understand where I was going wrong.

I’d used both a set action and parameter action in my attempt; Luke had just a parameter action. In trying to understand why I ended up with a set action and see if I could do without it, my solution gradually started unravelling, as various calculated fields needed changing.

So, I think the best way to approach this blog is to rebuild my solution from scratch, using only my existing workbook as a reference, and I will attempt to write and screen shot as I build. This could take some time, and I have no idea how successful I’ll be…. I may well get to a point where I’ve taken a wrong turn again, and everything I’ve written needs to be scrapped… at which point I may just have to say ‘sorry, I tried!’……

Ok, let’s get cracking.

Top N Sub-Categories

At the ‘first’ level of the table, we need to display the top n sub-categories ordered by Profit. Those not in the top n should be displayed under an ‘All Others’ grouping, and always displayed at the bottom. The top n can vary based on the user input.

We need a parameter to define the value of the top n.

SUB-CATEGORIES TO SHOW

This is an integer parameter ranging from 1-10, defaulted to 3. I’ve titled it exactly as displayed on the output, so it’s one less change to make later (hence the capitals).

The easiest way to group the sub-categories into those in the top n, is to use a set. Right click Sub-Category and Create -> Set.

Top N SubCats by Profit

Use the Top tab to define the rules for which Sub-Category to include, referencing the SUB-CATEGORIES TO SHOW parameter and the Profit field as shown below.

SubCat Group

IF [Top N SubCats by Profit] THEN [Sub-Category] ELSE ‘All Others’ END

If the Sub-Category is in the set then the name will display, otherwise it will be grouped under the ‘All Others’.

Add SubCat Group to Rows and change the sort on the pill to sort by Field = Profit desc

This will result in All Others being listed at the top, but to resolve that, drag the Top N SubCats by Profit set onto the Rows an place in front of the SubCat Group.

Let’s get some of the measures we need out onto the canvas.

Margin

SUM([Profit])/SUM([Sales])

format to percentage with 0dp.

Add Margin, Profit & Sales onto the sheet.

Top N Products

The next level in the table displays the Top N products per SubCat Group based on their margin. For those not in the Top N, the products should be grouped under ‘All Others’, and listed at the bottom. The Top N is once again defined by a user input via a parameter.

PRODUCTS TO SHOW

Once again an integer parameter ranging from 1-10 but defaulted at 5 this time.

Unfortunately, this time we can’t use a set to define our Top N grouping. This is because the set will only consider the Top N across all Products and will not consider the fact the Products are nested per Sub-Category grouping. So we need to come up with an alternative. I did do a bit of research to find ideas, and found this article by Emma Whyte at The Information Lab to get me started : Showing a Nested Top N with Other in Tableau.

The Top N needs to be based on the order of the Margin, or the rank.

Margin Rank

RANK_UNIQUE([Margin])

Change this to be a Discrete field.

This will give us a unique ‘number’ per row displayed based on the value of Margin. If records have the same Margin value, using RANK_UNIQUE will mean they get a different rank number (as opposed to how other ranking functions work). The table calculation of INDEX() could work just as well. NOTE – there is potential though that when the Margin values are the same for different products, what makes the Top N and what doesn’t may differ, so you might find you get a slightly different list from some of the solutions you see.

Add Margin Rank and Product Name onto the sheet, and edit the table calculation on Margin Rank, to compute by Product Name only

We now want to group the Product Name based on the Top N products.

Product Name Group

IF [Margin Rank] <= MIN([PRODUCTS TO SHOW])
THEN ATTR([Product Name])
ELSE ‘All Others’ END

We also only want to show (PRODUCTS TO SHOW + 1) rows per SubCat Group ie if PRODUCTS TO SHOW = 5, we want to display 6 rows per SubCat Group, where the 6th row displays ‘All Others’. The 6th row also needs to show the Profit, Sales and Margin values associated to all the Products in the ‘All Others’ Product Name Group.

So we need to calculate some new fields that will store a revised value for Proft, Sales and Margin, depending what row we’re working with.

Sales For Others

WINDOW_SUM(IF [Margin Rank] > [PRODUCTS TO SHOW] THEN SUM([Sales]) END)

If the Product Name isn’t in the Top N, then get it’s Sales value, and then sum all of those rows that meet the same condition.

If you put this onto the sheet, and set the table calculation for each of the nested calculations (Sales for Others & Margin Rank) to be by Product Name only, you’ll see that the value displayed in every row for each SubCat Group is the sum of the values associated to the rows in the ‘All Others’ group.

So now we need a field that’s either going to display the Sales for the 1 product or the sales for the group of products, depending on what row we’re on.

Grouped Sales

IF [Margin Rank]<= MIN([PRODUCTS TO SHOW]) THEN SUM([Sales]) ELSE [Sales For Others] END

Again the table calculation settings need to be set to compute by Product Name only. This is the field we ultimately want to display, so it needs to be formatted accordingly. The Sales and Sales For Others fields can be removed.

Along similar lines, we need

Profit For Others

WINDOW_SUM(IF [Margin Rank] > [PRODUCTS TO SHOW] THEN SUM([Profit]) END)

Grouped Profit

IF [Margin Rank]<= MIN([PRODUCTS TO SHOW]) THEN SUM([Profit]) ELSE [Profit For Others] END

Margin For Others

IF [Margin Rank] > [PRODUCTS TO SHOW] THEN [Profit For Others]/[Sales For Others] END

Grouped Margin

IF [Margin Rank] <= MIN([PRODUCTS TO SHOW]) THEN [Margin] ELSE [Margin For Others] END

When added to the sheet, once again make sure all the table calculation properties for all the nested calculations are set to compute by Product Name only.

As mentioned above, we only want to show PRODUCTS TO SHOW + 1 rows, so let’s create a field we can filter by :

Show?

[Margin Rank] <= MIN([PRODUCTS TO SHOW]) + 1

Add this to the Filter shelf and set to True. Once again the table calculation needs to be set to compute by Product Name only.

We want a grand total and subtotals only at the level of SubCat Group. Add the totals by Analysis -> Totals -> Show Column Grand Totals to get the overall total. Then on the SubCat Group pill, click and add SubTotals

We’ve now got the main components of the table. We now need to get the interactivity working to allow the expand / contract on arrow selection.

Expand / Contract All Others Product Group

First up, we’ll just tidy up our table display

• Hide the In/Out Top N SubCats By Profit field
• Hide the Margin Rank field
• Hide the Product Name field.
• Remove the Margin field
• Alias the Grouped Margin, Grouped Profit, Grouped Sales fields
• Right click and Edit Alias. Name the field ‘ Margin ‘ (note the trailing and leading spaces. You can’t alias just as Margin as a field already exists with that name, but the spaces make it think it’s a new name.

We need an additional field that will store our ‘arrow’ icon. We’re going to revisit this field. For starters

IF [Margin Rank] <= [PRODUCTS TO SHOW] THEN ”
ELSE ‘►’
END

I use this site to get my icon characters from. Add this field to the sheet after the SubCate Group pill, once again setting the table calculation to compute by Product Name.

In the dashboard, the aim is to click on an arrow associated to a single SubCat Group, which will expand the Product Group Name field to display the actual Product Name (rather than ‘All Others’) with their associated Margin, Sales & Profit values, and also show a ▼ icon.

This will be achieved using Parameter Actions, for which we need a parameter :

Selected Sub Category Group

String parameter defaulted to ”

Display this parameter on the sheet, as we can start to test the interactivity ‘manually’ without the need for the dashboard. What the dashboard action will do is on ‘click’, it will be set to populate the value of this parameter with the associated SubCat Group value. We can then do some checks based off of this and set various fields accordingly. It means we need to revisit some of the fields.

First up let’s set the arrow….

Edit Product Group Header to be

IF [Selected SubCat Group] = MIN([SubCat Group]) THEN
IF [Margin Rank] > [PRODUCTS TO SHOW] THEN ‘▼’ ELSE ” END
ELSEIF [Margin Rank] <= [PRODUCTS TO SHOW] THEN ”
ELSE ‘►’
END

Test this by entering the value of ‘Copiers’ into the Selected SubCat Group parameter. The arrow against ‘All Others’ should change.

We also need to change the value of the Product Name Group to show the actual Product Name on selection, so

edit the Product Name Group

IF [Margin Rank] <= MIN([PRODUCTS TO SHOW])
THEN ATTR([Product Name])
ELSE
IF [Selected SubCat Group] = MIN([SubCat Group])
THEN ATTR([Product Name])
ELSE ‘All Others’ END
END

Again test this out by changing the value in the parameter.

But we need to make more rows show too, so

edit Show?

([Margin Rank] <= MIN([PRODUCTS TO SHOW]) + 1) OR (MIN([SubCat Group]) =[Selected SubCat Group])

The values of our measures are still the totals though, so we need to edit these fields to

Grouped Sales

IF ([Margin Rank]<= MIN([PRODUCTS TO SHOW]) OR ([Selected SubCat Group] = MIN([SubCat Group])) ) THEN SUM([Sales]) ELSE [Sales For Others] END

Grouped Profit

IF ([Margin Rank]<= MIN([PRODUCTS TO SHOW]) OR ([Selected SubCat Group] = MIN([SubCat Group])) ) THEN SUM([Profit]) ELSE [Profit For Others] END

Grouped Margin

IF ([Margin Rank] <= MIN([PRODUCTS TO SHOW]) OR ([Selected SubCat Group]=MIN([SubCat Group]))) THEN [Margin] ELSE [Margin For Others] END

However while this works if you play with setting and clearing the parameter on the sheet, it won’t quite fully work if added as a dashboard action, as while the action can set the parameter we can’t ‘clear it’.

We need to ‘tie’ the parameter action to another field

SubCat Group for Reset

IF [Selected SubCat Group] = MIN([SubCat Group]) THEN ”
ELSE MIN([SubCat Group])
END

Add this to the Detail shelf (if you want to see how it changes based on the parameter value, add it to the Rows and test changing the parameter).

It needs to be on the sheet so it can be referenced from the dashboard action.

Add the sheet to a dashboard, then add Parameter Action as below where the Target Parameter is Selected SubCat Group and the field it references is SubCat Group for Reset

And after all that, you should have a working solution. Phew!

Pretty pleased I got there without taking a detour 🙂 The table just now needs various formatting applied, which I’m going to leave to you to do 🙂 Just tweet me if you’re having problems!

The parameter action may be confusing you a bit – it took a while to really get my head round it, so I’ve tried to explain this a bit more below…

How the parameter action works

On initial load of the sheet, the Selected SubCat Group parameter is blank. So for the SubCat Group = Copiers, the SubCat Group for Reset will also be Copiers as SubCat Group is not the same as Selected SubCat Group parameter. Other fields are also set based on the fact these two fields aren’t the same (like the arrow pointing to the right etc).

When the right arrow is clicked on the dashboard, the value of SubCat Group for Reset is used to populate the Selected SubCat Group parameter. So in this example, Selected SubCat Group will now contain the value Copiers. As the Selected SubCat Group parameter is now the same as the SubCat Group, various fields change their behaviour (like the arrow now points down, and more rows are displayed). But also, the value of the SubCat Group for Reset is also changed; as the SubCat Group is the same as the Selected Sub Cat Group parameter, SubCat Group for Reset now contains a blank string.

So at the point the down arrow is now clicked again on the dashboard, the value of the SubCat Group for Reset is again used to populate the Selected SubCat Group parameter. As SubCat Group for Reset is blank, then the parameter will now be populated with a blank value, and so all the login in the fields will be based on the fact that SubCat Group is not the same as the Selected SubCat Group parameter, and the table will display just as it did on first load.

Hope that helps to demystify what’s going on… it’s certainly helped me!

Thanks for sticking with me if you got this far 🙂

The version of the challenge I built while writing this (my 3rd published version) is here.

Happy vizzin’! Stay Safe!

Donna

# Can you show the adjusted target and missing pipeline?

For #WOW2020 Week 16, Lorna set a slightly different challenge that involved data blending. Blending is a technique in Tableau used to combine data from different data sources. You can read more about it here.

Lorna’s scenario is quite a common one – you have a data source which stores some ‘actual’ data (that in a typical scenario is likely to change as you move through the year), along with a more static data source, storing plan/budget/target data for each month. This is typically created at the start of the year and rarely changes. Comparing actuals to target is a very common business requirement.

Once again, I’m going to tackle this challenge but working out all the numbers I need for each month in a tabular format, before I go onto build the viz.

Building out the data

For this challenge we have 2 data sources, the pipeline data containing multiple years and the target data just containing data for 2020. So the first this we need to do is add a filter for Closed Date from the Pipeline Data source to be the Year 2020.

The data has been specially crafted as if it’s at a particular point in time in April, in my case at the point of building it was 15 April 2020. If this was being built for a real life scenario, we’d want to be reporting based off the Today() function. To simulate this, I created a calculated field to hardcode my ‘today’ date, but if I was doing this ‘for real’, I’d have set it to TODAY().

Today

#2020-04-15#

I need to be able to report the Pipeline Data that is at Stage=Closed Won separately from data that is still in the pipeline (hasn’t been closed as won or lost). I’ll use some calculated fields for this

Closed Won

ZN(IF Stage = ‘Closed Won’ THEN [Sales] END)

Note – the ZN will display as 0 if there is no Sales.

Pipeline

IF [Stage]= ‘Negotiating’ OR [Stage] = ‘Proposing’ THEN [Sales] END

Let’s start to build the table out:

• Month of Closed Date on Rows
• Closed Won and Pipeline on Cols (as Measure Values)
• Year of Closed Date = 2020 on Filter Shelf

Let’s now add in the target from the Target Data source. This will be a blend. When we blend we need to define how to ‘join’ the data sources together. I prefer to make it obvious what fields I am blending on, so although I can use existing fields and define a rule, I prefer created explicit calculated fields so it’s clear.

The Target Data contains a record for each month, dated as per the 1st of each month. In the Target Data, create a new field

BLEND – Month

[Date]

The in the Pipeline Data, create a field named exactly the same

BLEND – Month

DATE(DATETRUNC(‘month’,[Closed Date]))

but in this case we’re truncating the Closed Date to the 1st of each month, and ensuring it too is a Date rather than Datetime data type, so the fields can match.

Now add Target from the Target Data onto the table. If you get a warning message, click ok, then click the ‘link’ symbol that is currently greyed out against the Blend – Month field in the Target Data.

The ‘link’ symbol will go red and indicate that the data is being ‘joined’ on this field. The Target values in the table will now match the values if you check the data source excel file directly, and the Target pill in the Measure Values will show a ‘database’ symbol with an ‘orange tick’ which indicates it’s from a secondary data source. The data sources listed in the Data pane (top left) will also be coloured blue (primary) and orange (secondary), which indicates data blending is being used.

We now need to start working out how much off the YTD target we are so far, so we then work out how much pipeline is potentially missing from each future month.

So first up, how much has been closed won so far this year (only considering complete months). Ie how much has been won in Jan, Feb & March?

YTD Closed

WINDOW_SUM(SUM(IF [Closed Date] < DATETRUNC(‘month’, [Today]) THEN [Closed Won] END))

If the Closed Date is before the 1st of the current month (ie April in this example), get the Closed Won value already computed, but SUM all the values we have for all the months.

Add this onto the table, and you can see the total of the Closed Won values for Jan, Feb & Mar is listed against every month.

The table calc has automatically computed ‘table down’, but I’m, going to explicitly set it, as I know I’m going to move the fields around later, and I don’t want that value to change based on where it gets moved to,

Right click on the YTD Closed pill -> Edit Table Calculation and check Month of Closed Date

We need to work out how much we should have closed in the first 3 months of the year too. So in the Target Data, create a similar calculated field

YTD Target

WINDOW_SUM(SUM(IF [Date] < DATETRUNC(‘month’, [Today]) THEN Target END))

(Note – a Today calculated field also hardcoded to 15th April 2020 needs to be added to this data source too).

Add this field into the table too, and again, set the table calculation to be explicitly set against Month of Closed Date.

In order to work out how much missing pipeline to add to each month, we need to figure out how far we’re currently ‘off’, and then distribute this value across the remaining months.

I’m doing all this in steps, so I can sense check the calcs as I go. We can work out how much we’re off by creating a new field in the Pipeline Data

Missed Sales Value

[Monthly Target (2020_04_15_WW16_Sales Pipeline)].[YTD Target] – [YTD Closed]

Basically this is YTD Target YTD Closed, but when you refer to a field from the secondary data source, the field will be prefixed by the data source name.

Add this to the table, and again verify the table calc is set explicitly.

As this is a field based on other table calcs, you will see them listed as Nested Calculations, and you need to verify each one listed is set appropriately.

To work out how many months in the year are remaining that we need to distribute the above value over, we need

Remaining Months

12 – (DATEPART(‘month’, [Today]) -1)

As Today is in April, which is month 4, then the remaining months is 12 – (4-1) = 9.

An now we can work out how much needs to be added per month

Distributed Missed Sales Value

[Missed Sales Value] / [Remaining Months]

Pop this onto the table, and verify the table calc again.

Now we have this, we can work out what the Target needs to be adjusted to for each of the remaining months to make up the shortfall, which is basically adding the monthly shortfall above to the existing Target for the month (but only for the current and future months).

IF MIN([Closed Date]) >= DATETRUNC(‘month’, [Today]) THEN SUM([Monthly Target (2020_04_15_WW16_Sales Pipeline)].[Target]) + [Distributed Missed Sales Value] END

Note – we wrap Closed Date in a MIN function as we’re working with aggregated fields, so the Date needs to be aggregated too. MAX would work just the same.

Finally we need to work out what the shortfall is in the existing Pipeline to meet the Adjusted Target (if there is any).

For the months beyond the current month, this is simply the difference between the Pipeline value and the Adjusted Target (but only if the Pipeline is less than the Adjusted Target). For the current month though, it’s the difference between the Pipeline + Closed Won values and the Adjusted Target.

Missing Pipeline

IF ZN(SUM([Pipeline])) = 0 THEN NULL

ELSEIF DATETRUNC(‘month’, MIN([Closed Date])) = DATETRUNC(‘month’,[Today]) THEN
//it’s current month, so need to consider what’s closed & what’s remaining
IF (SUM([Closed Won]) + SUM([Pipeline])) < [Adjusted Target]
THEN ZN([Adjusted Target] – (SUM([Closed Won]) + SUM([Pipeline])))
END

ELSEIF SUM([Pipeline]) < [Adjusted Target] THEN ZN([Adjusted Target] – SUM([Pipeline]))

ELSE 0
END

Add this onto the table

And we’ve now got all the pieces we need to start to build the viz. Name this sheet Check Data or similar. We want this as our reference sheet to make sure our figures remain correct.

Building the Bar Chart

Firstly, duplicate the table viz, and remove the fields we don’t need in the final display (YTD Closed, YTD Target, Missed Sales Value, Distributed Missed Sales Value).

Now move the pills as follows :

• Closed Date from Rows to Columns
• Measure Values from Text to Rows
• Measure Names from Rows to Colour shelf
• Change Mark Type to Bar

Now move Adjusted Target and Target to the Detail shelf.

Adjust the colours of the remaining measures to suit, and reorder, so that the bars a stacked with Closed Won on the bottom and Missing Pipeline on the top.

Before we deal with the target lines, we’re going to sort the Tooltip out. It’s quite tricky… it might be there’s a better way, but I had to create a few custom calculated fields to get the display required.

Creating the Tooltip

For the first 3 months, the tooltip just needs to display the Closed Won value, but from April onwards, we need to display values for Closed Won, Pipeline & Missing Pipeline, even if the values are 0. Also the first 3 months just show the Target, but the remaining months need the Adjusted Target too. These values are displayed with | symbols in between along with labels, which should only show if relevant.

Firstly, we need to make sure all the measure values displayed, are accessible regardless as to which bar we hover over. So all of the 3 measures (Closed Won, Pipeline & Missing Pipeline) need to be added to the Tooltip. This is done by holding down Ctrl as you drag each pill from the Measure Values area onto the Tooltip shelf. This has the effect of duplicating the pill, and retaining any table calc settings that have been applied.

We only want the text ‘| Adjusted Target :’ to display if there is an Adjusted Target value :

Tooltip : Adjusted Target

IF [Adjusted Target] > 0 THEN ‘ | Adjusted Target : ‘
END

Add this to the Tooltip shelf.

We only want the text ‘| Pipeline :’ to display if there is a Pipeline value

Tooltip : Pipeline

IF [Pipeline] > 0 THEN ‘ | Pipeline : ‘
END

Add this to the Tooltip shelf.

And we only want the text ‘| Missing Pipeline:’ to display if we’re in the current or future months.

Tooltip : Missing Pipeline

IF DATETRUNC(‘month’, [Closed Date]) >= DATETRUNC(‘month’,[Today]) THEN ‘ | Missing Pipeline : ‘ END

Add this to the Tooltip shelf.

Now modify the Tooltip so the various pills are referenced and formatted as required

Finally adjust the Month axis, to set the months to be displayed as abbreviated values.

Adding the Target lines

At first glance, you might think the two target lines are both reference lines. However, if you hover over the tooltip of the Target (the solid line), you’ll see you have the same tooltip as the bars. Whilst there is some ability to control the tooltip of a reference line now, you can’t reference all the pills this tooltip requires.

So the Target is actually a dual axis mark. The Adjust Target however, is a reference line.

To get the Target to display, hold ctrl & drag the Target pill from the Detail shelf to the Rows shelf (to duplicate the pill), next to Measure Values.

On the Target marks card,

• Remove Measure Names from the Colour shelf
• Change the Mark Type to Gantt
• Change the Colour to black, and add a black border too (to make the mark thicker)
• Make the chart Dual Axis and Synchronise Axis
• Uncheck Show Header on the Target axis

If you hover over the Gantt mark/Target line, you should have the same tooltip as when you hover over the bar.

The Adjusted Target is a reference line. To add this, right click on the left hand axis and Add Reference Line. Adjust settings as follows :

• Scope – per cell
• Value – Adjusted Target
• Label – None
• Tooltip – Custom, set to ‘Adjusted Target (Dashed) :’ then add Value from the selector
• Change the Line to be black and dashed

Both target lines should now be displayed. It’s just now a case of applying some formatting to remove gridlines, row & column lines, adjust font sizes and remove axis title and column titles.

Building the legend

The dashboard displays a custom colour legend. As always there are multiple ways to do this. I chose to ‘fake it’ using aliases and some values associated to a completely different and unused dimension in the data.

Duplicate the Opportunity Name dimension. I just left it as Opportunity Name (copy). On a new sheet, add Oppotunity Name (copy) to the Filter shelf, and select 5 values only.

Then right click on Opportunity Name (copy) and select Aliases. For each of the values you selected in the filter, set an alias based on the legend names to display

Then build the legend as follows

• Add Opportunity Name (copy) to Columns
• Type in MIN(1) to the Columns shelf to create a fake axis
• Add Opportunity Name (copy) to the Text shelf
• Add Opportunity Name (copy) to the Colour shelf
• Fix the axis of Min(1) to start at 0 and end at 1
• Reorder the displayed values to suit.
• Format to remove all rows/column lines and hide the headers.
• Format the Label to be centred and size font
• Clear the tooltip.

Note – I chose to copy the Opportunity Name pill just to make sure I didn’t inadvertently break anything, and to easily revert if things didn’t go to plan :-).

Now the 2 sheets can be placed on the dashboard along with a suitable title.

One final tip – to prevent the user from inadvertently clicking on the legend viz when on the dashboard, add a floating blank image and position over the top of the legend.

My published viz is here.

Happy vizzin’ & stay safe!

Donna

# Can you create a chart with a dynamic week start?

This week Ann set the challenge and suggested she had ‘distilled it down a bit’ so it wouldn’t take as long to solve….hmmmm… not sure if that really worked out for me…. I certainly managed to crack through it to begin with, but the requirement to ‘show the data when there were missing values’ seemed to end up being trickier than I thought it would be, and the trying to display a date on the tooltip for those days…. well, that was just so freakin’ hard! Given this was ‘distilled down’, I thought I must have been missing something pretty obvious to be struggling so much. I played round with so many different table calcs to try to get this to work, but ended up admitting defeat and peaked at Ann’s solution.

The missing dates bit… that ended up being a bit odd. I rebuilt my chart multiple times, starting with a table of data as a ‘check’ and then carefully converting that into the required Viz. Eventually it just seemed to work, which left me very puzzled, as I was pretty certain what I ended up with, I’d had previously… I can only assume there is something about the order of the steps I took that may have caused the problem.

As I write this blog out, I’ll be rebuilding as I go, to generate the various screen shots, so fingers crossed, if you follow step by step, you’ll get a working solution 🙂

Right let’s get onto it…

Building out the data we need

As I’ve done before, I’m going to build up the fields I need into a tabular format, so I can ratify I’ve got everything correct before I build up the viz.

First off, we need a couple of parameters to drive the inputs for the chart :

Week Ending On

A date parameter set to 24th Oct 2019, that allows all values

Include X Prior Weeks

an integer parameter set to 10 by default

From the requirements, we need to display data for the 7 days of the week up to the Week Ending On date, along with data for each of the 7 days in the previous number of weeks as stated in the Include X Prior Weeks parameter.

So first up we need to work out when the ‘latest’ week starts

Start of Selected Week

Then I want to group all the Order Dates into their respective weeks, where each week needs to start on the same day of the week as the Start of Selected Week field. eg if Start of Selected Week is Friday 18th Oct, then I want all the dates of the previous x weeks to be grouped to the Friday of the week they’re in. So all the dates from Friday 11th Oct to Thurs 17th Oct are grouped as Fri 11th Oct, dates from Friday 4th Oct to Thurs 10th Oct are grouped as Fri 4th Oct, and so on.

Now the ideal way I think I should be able to do this is use a DATETRUNC function passing in the name of the day of the week, the week should start, which is based on day associated to the Start of Selected Week field. So I wanted to write something like this:

DATETRUNC(‘week’, [Order Date],DATENAME(‘weekday’,[Start of Selected Week]))

but Tableau doesn’t like the fact the 3rd parameter is a calculation rather than simply the text ‘Friday’.

So instead I had to build up the logic as follows

Order Date Week

CASE DATEPART(‘weekday’, [Start of Selected Week])
WHEN 1 THEN DATETRUNC(‘week’, [Order Date], ‘sunday’)
WHEN 2 THEN DATETRUNC(‘week’, [Order Date], ‘monday’)
WHEN 3 THEN DATETRUNC(‘week’, [Order Date], ‘tuesday’)
WHEN 4 THEN DATETRUNC(‘week’, [Order Date], ‘wednesday’)
WHEN 5 THEN DATETRUNC(‘week’, [Order Date], ‘thursday’)
WHEN 6 THEN DATETRUNC(‘week’, [Order Date], ‘friday’)
WHEN 7 THEN DATETRUNC(‘week’, [Order Date], ‘saturday’)
END

This Order Date Week is essentially the field that represents each line on the final viz.

I also need a field that I can used to represent the x-axis on the final viz, which is labelled based on the days of the week, but (as per the challenge), is starting at a different day.

To do this, I chose to ‘baseline’ all the dates in each week, to be aligned to the ‘current’ week. What I mean by this is that I want to create a field whereby if the Order Date is a Friday, the field will store the date of the Friday of the ‘latest’ week, in this case Friday 18th Oct. Any Order Date on a Saturday will be ‘aligned’ to Saturday 19th Oct and so on.

Order Date Baseline

DATE(DATEADD(‘day’, 7 * DATEDIFF(‘week’,[Order Date Week],[Start of Selected Week]), [Order Date]))

This finds the number of weeks between the week the order date is in and the start of the ‘latest’ week, which is then multiplied by 7 to convert to a days difference, which is then added onto the Order Date itself.

So from below, you can see both Fri 13th Sept & Fri 20th Sept map to 18th Oct; Tue 17th Sept and Tue 24th Sept both map to 22nd Oct etc.

I now need another field to restrict the dates being displayed to those within the timeframe dictated by the Include X Prior Weeks parameter.

Dates to Include

[Order Date]>=DATEADD(‘week’,-1 * [Include X Prior Weeks], [Start of Selected Week])
AND [Order Date] <= [Week Ending On]

So let’s start building the check table of data as follows:

• Order Date Week on Rows as discrete, exact date
• Order Date Baseline on Columns as discrete, exact date
• Dates To Include on Filter shelf, set to True
• Sales on Text

You can see we have some gaps where there are no Sales, we need these to display 0, which we do using

Inc Null Sales

IFNULL(LOOKUP(SUM([Sales]),0),0)

This is a simple table calc that is ‘looking up’ its own value, and if it’s null, sets itself to 0 otherwise uses it’s own value.

So let’s put this on the Text shelf instead

We’re also going to need a field to define the ‘latest week’, as this is coloured differently on the viz

Is Latest Week

[Order Date Week]=[Start of Selected Week]

Bring this into the table

We’ve now got the core things we need to build the majority of the chart.

Building the chart

Firstly, duplicate the table above, then move the pills round as follows:

• Move Is Latest Week to Colour shelf, and adjust colours accordingly
• Move Order Date Week to Detail shelf
• Move Inc Null Sales to Columns shelf
• Swap the Colour Legend so True is first (makes the latest line to the front)
• Format the Order Date Baseline field to custom formatting of dddd which will just the display the day of the week rather than the full date (sneaky huh?)

Then tidy up the formatting to remove gridlines, get the axis in the right format, change the font of the axis labels etc.

And that’s the main viz… now we need to sort the tooltips.

Tooltips

On the tooltip we need to display

• The order date
• The sales value or ‘no sales’ if there are no sales
• The start and end day of the week

Let’s start backwards.

Day of Week Start

DATENAME(‘weekday’,[Start of Selected Week])

Day of Week End

DATENAME(‘weekday’, [Week Ending On])

Put these on the Detail shelf (as we’re going to need them for the title too).

Add the Sales onto Tooltip and format it to \$ with 0dp. This will display the Sales value when there is one, and ‘nothing’ if there are no sales (as opposed to the 0 that the Inc Null Sales field will show.

To get the ‘no sales’ text we need

Tooltip: No Sales

IF IFNULL(LOOKUP(SUM([Sales]),0),0)=0 THEN ‘no sales’ END

Note – this is what I did and is in my published solution, but I could have just checked whether Inc Null Sales = 0 – doh!

Add this onto the Tooltip and position directly adjacent to the SUM([Sales]) field, so either one will be displayed (they can never both display).

Right, now we need to get the date.

The problem we have is that like the no sales, if there’s no orders on a date there’s no date to display either, but unlike the ‘no sales’ value, we haven’t got a single default value like 0 to use. Instead we need to work out what the missing date should be.

I tried various things for this, but finally had to look at Ann’s solution to get this, which is :

Tooltip: Order Date

IFNULL(ATTR([Order Date]),

Urgh! Horrid right!… so what is this saying…

• If the current order date doesn’t exist, then
• lookup the previous order date and add 1 day to it, but if that is also null then
• lookup the next order date and take 1 day off it

So this is only working on the assumption that there are not 2 days in a row with no orders.

Add this to the Tooltip and format accordingly to get the layout required.

Finally amend the sheet title to match the title displayed. The title can reference the Day of Week Start and Day of Week End fields along with the Week Ending On and the Include X Prior Weeks parameters.

When you then add onto the dashboard, make the parameters floating and position them top right.

Hopefully, you’ve ended up with a working solution. I know I approached mine slightly differently from Ann’s, so it’s worth checking hers out too.

My published viz is here.

Happy vizzin’! Stay Safe!

Donna

# What is the 90-day reorder rate?

Luke set the #WOW challenge this week, stating it had a difficulty rating of 10/10, so it was with some trepidation when I sat down to tackle it. Was it going to be as difficult as Luke’s radial bar challenge from 2018 week 10….?

Looking at the viz, and reading through the requirements, it didn’t seem as bad to me – it’s a table calculations challenge, and I’m OK with those. I started using Tableau before LoDs were invented, so table calculations don’t scare too much. I’m no expert though, and it still often takes a bit of trial & error to get the settings right.

Building out the Calcs

With a lot of challenges, I often start by just trying to build a tabular view of the data to sense check I’m getting the right numbers, which I then publish onto Tableau Public along with my viz. With a table calculation challenge, building the table of data is crucial.

The 3 key pieces of existing data needed for this challenge are Customer Name, Order Date & Order ID.

To start we want to put these on the rows, but given this will initially generate a lot of rows of data, I chose to arbitrarily filter to a random set of 20 or so customers (include Noel Staavos, as he’s a slight exception, which may catch you out later).

You can see from the above, Noel has multiple orders on the same day, which we need to handle.

The first requirement states we need to order the data by total number of orders per customer, so for this we need the 1st of many calculated fields (this one isn’t a table calculation, though it could be….):

Total Orders

{FIXED [Customer Name]: COUNTD([Order ID])}

Add this as a discrete pill on the Rows (since the final viz needs to show it this way too). I’ve placed it after the Customer Name but it doesn’t have to be there at this point.

We need to sort the customers by this field, so click on Customer Name and select Sort to bring up the Sort dialog. Choose to Sort By Field, Sort Order Descending, Field Name Total Orders. Aggregation Sum

We’ve now got our data ordered in the way we need. The next step is to work out the 90-day reorder rate per customer, which we will tackle in several steps. In much of what follows, some of the calculations could well be done in one calculation, but I like to see the ‘building blocks’ to help verify the calcs are correct.

First up we need to work out how many days between each order, and to figure this out we need to compare the Order Date on each line to the Order Date of the previous line.

I want to display the date of the previous order on the same line as the current order, so create

Previous Order Date

LOOKUP(ATTR([Order Date]),-1)

This ‘looks up’ the Order Date on the previous row. If I wanted to look at the next row, the second attribute would be 1 rather than -1. Or if I wanted to look at the data in the row 2 rows before the current one, I’d use -2.

Add Previous Order Date to the Rows as a discrete field. By default it’ll probably show as ‘Null’, but this is because the table calculation is computing across the table (so is looking for a previous column containing Order Date which doesn’t exist), whereas we want to look down it (ie by Row).

Click on the triangle against the Previous Order Date pill (the triangle indicates its a table calculation), and select Edit Table Calculation

We need to change the settings so they calculate for each Customer Name. Set to Specific Dimensions, and uncheck Customer Name

Note : Its worth having a read/watch of Andy Kriebel‘s Table Calculation Overview post to help you understand table calcs better.

You can see from the above, that for Noel Staavos’ multiple order on the same day, one of the orders is comparing against an order on a different date, while the other is comparing to the other order made on the same date.

Ok, so now we have the current date & previous date on the same row, we now need to work out the number of days between the dates.

Days Since Previous Order

DATEDIFF(‘day’, [Previous Order Date],ATTR([Order Date]))

Add this to the Text shelf. Hopefully it should automatically use the same table calculation settings defined for Previous Order Date, but if the numbers look off, double check the settings. They should match.

Now we can work out if the order is within 90 days of the previous order

Reorder Within 90 Days

IF ISNULL([Previous Order Date]) THEN NULL
ELSEIF [Days Since Previous Order]<=90 THEN 1
ELSE 0 END

Note, I am purposely choosing to output 1 or 0 (ie a number) rather than true or false (a boolean), as it will make the next calculation easier.

Again add Reorder Within 90 Days to the table, and again sense check the table calculation settings if things don’t look right.

Now we have the information we need to work out the reorder rate per customer, which is the number of records where Reorder Within 90 Days is 1 as a proportion of the number of records with a Reorder Within 90 Days value of either 0 or 1, since the requirements state the first order for each customer shouldn’t be included in the metric.

90-Day Reorder Rate

ZN(WINDOW_SUM([Reorder Within 90 Days])/WINDOW_COUNT([Reorder Within 90 Days]))

Set this to be a percentage with 0 dp.

By choosing to set Reorder Within 90 Days to an integer, the WINDOW_SUM() is simply summing up the 1s & 0s in the column. WINDOW_COUNT() is just counting the number of 1s & 0s there are in the column. Wrapping with a ZN means any Customers without any reorders will report as 0% rather than NULL.

Let’s add that to the table now too. You’ll see it reports the same value down the whole table across all customers, whereas we need it to show a different value per customer (although still show the same value for all the rows of the same customer).

We need to once again adjust the table calculation settings for this field.

What you’ll notice this time though, is there are Nested Calculations within this field, so the settings need to be checked for both the Previous Order Date calculation and the 90-Day Reorder Rate calculation

Both need to be identical with Order Date and Order ID being checked.

At this point, we have enough that we could start building the main viz, but I’m going to continue building out the ‘check sheet’ with the data I need for the KPIs too.

For the Overall Reorder Rate < 90 days we need to get a count of all the reorders within 90 days across all customers, as a proportion of all reorders (ie not the 1st order for each customer).

Count Reorders < 90 days per customer

IF FIRST()=0 THEN WINDOW_SUM([Reorder Within 90 Days]) END

I’m choosing to output the value only against the 1st row for each customer. This is because I’m going to be adding up this column shortly, and if I outputted the value against each row, I’d be double-counting.

Add this field to the table, and once again verify the table calculation settings are applied to each nested calculation. If done right, you should get the correct number against the first row for each customer

Now I want a sum of this column

Total Reorders < 90 Days

IF FIRST() =0 THEN WINDOW_SUM([Count Reorders <90 days per Customer]) END

Again, I’m choosing just to show this value against the 1st row, but this time it will be the first row in the whole table, as this time the table calculation needs to considering all the customers.

Add this field to the view. It’ll probably automatically give you the right number, but it’s worth having a look at the table calculation settings.

It contains 3 nested calculations this time : Total Reorders < 90 days, Count Reorders < 90 days per Customer, Previous Order Date.

The settings for Count Reorders < 90 days per Customer and Previous Order Date should be as before, with Order Date & Order ID both checked.

The setting for Total Reorders < 90 days is different though. By default it like shows Table(down), which gives the right result, but to ensure things don’t go awry if the pill gets moved around for some reason, I like to explicitly set the computation, by changing to Specific Dimensions and selecting all 3 fields

So this gives me one of the values I need to help me work out the overall rate; I now need the other

Count All Reorders

IF FIRST()=0 THEN WINDOW_COUNT([Reorder Within 90 Days]) END

This again is a nested calculation. Previous Order Date should be set with Order Date & Order ID as usual, and Count All Reorders should be set across all 3 fields.

Now we have the numbers to work out

Overall Reorder Rate

[Total Reorders <90 days]/[Count All Reorders]

which is set to a Percentage to 0 dp.

Add this onto the table and check the calculation settings again. This time there are 4 nested calculations

• Total Reorders < 90 days : set to all 3 fields
• Count Reorders < 90 days per Customer : set to Order Date & Order ID
• Previous Order Date : set to Order Date & Order ID
• Count All Reorders : set to all 3 fields

Remember, you may have filtered the customers in your view, so you may get a different value from the solution at this point!

Right onto the final KPI – average number of reorders per customer. For this we need the total number of reorders, which we’ve already got- Count All Reorders, and number of customers.

There’s probably a simpler way of doing this, but I’m continuing down the same route I’ve been working on.

First I want a count of the customer for each customer (which is 1), then I want to sum that up.

Count Customers

IF FIRST()=0 THEN COUNTD([Customer Name]) END

then

Total Customers

IF FIRST()=0 THEN WINDOW_SUM([Count Customers]) END

When added to the table, this is again a nested calculation with Count Customers set to Order Date & Order ID, and Total Customers set to all 3.

Now we can work out

Average Reorders Per Customer

[Count All Reorders]/[Total Customers]

which is set to be a number to 1 decimal place.

Adding this to the table, and verify the nested calculations are all set properly

• Count All Reorders : set to all 3 fields
• Previous Order Date : set to Order Date & Order ID
• Total Customer : set to all 3 fields
• Count Customers : set to Order Date & Order ID

Yay! We’ve now got all the values we need to build the KPI table and the main Viz.

I would recommend naming this sheet as Check Data or similar.

Building the KPI Viz

First step, duplicate the Check Data sheet.

For the KPIs, we only need 2 of the measures we’ve created, so remove all the pills from the Measure Values section except Overall Reorder Rate and Avg Reorders per Customer.

Then remove Total Orders and Previous Order Date from the Rows.

Due to the way table calculations work, we need to keep Customer Name, Order Date & Order ID in the view, but we only need the 1st row in the whole table, as that is where the data we want has been stored.

Create a new field

First Row

FIRST()=0

Add this to the Filter shelf and select True

Hide Customer Name, Order Date & Order ID by unchecking Show Header from each pill

We need to create a fake axis to get the display we need. In the Columns type in Min(0) twice, then remove Measure Names

Expand one of the Agg(Min(0)) cards on the left, and then drag the Overall Reorder Rate pill from the Measure Values section at the bottom, so it sits directly on top of the Measure Values pill

The Avg Reorders per Customer will now be on the Label shelf of the other AGG(Min(0)) card.

Change the mark type of both cards to Text.

Uncheck Show Header against the MIN(0) pills, and format to remove all columns/row/grid & zero lines.

Now you can format the text for each mark to be the appropriate size and with the relevant wording.

Then finally turn all tooltips off, and remove the Customer Name filter if it’s still there.

Hopefully you should get the numbers that now match the solution! Rename the sheet KPIs.

Building the main Viz

Once again start by duplicating the Check Data sheet we built.

Move Order ID and Previous Order Date onto the Detail shelf.

Move 90-Day Reorder Rate onto the Rows, and set to discrete

Move Order Date to Columns and change to continuous. Remove Measure Names from Columns.

Change the mark type to shape and move the Reorder Within 90 Days pill from the Measure Values section onto the Shape shelf. Change the shapes so NULL is the triangle and 0 and 1 are both diamonds.

Add Reorder Within 90 Days to the Colour shelf too, by clicking on the one on the Shape shelf, holding down Ctrl and dragging onto Colour. This will have the affect of duplicating the pill and will retain the table calculation settings. If you can’t get this to work, then drag from the Measures pane on the right, but make sure you set the table calc settings to match.

Change the colours so Null & 0 are grey (#CAC4BE) and 1 is purple (#5557EB).

The Tooltip for each shape also needs to be coloured accordingly, based on whether the mark represents the First order, a reorder over 90 days, or a reorder under 90 days (I chose to add the logic for first). I created 3 calculated fields for this:

Tooltip: First

IF FIRST()=0 THEN ‘First Order’ END

Tooltip:Over 90

IF [Reorder Within 90 Days]=0 THEN ‘Over 90 days’ END

Tooltip:Witin 90

IF [Reorder Within 90 Days]=0 THEN ‘Over 90 days’ END

I placed all 3 of these fields on the Tooltip shelf which was then formatted as:

while all 3 fields are in the tooltip, only 1 will ever actually display a value,

Make sure the table calculation settings for all 3 tooltip fields are set with Order Date & Order ID checked.

Now we need to create the line to join the marks up.

Duplicate the Order Date pill on Columns to create 2 instances of the pill sitting side by side. Set to Dual axis, synchronise and s’move marks to the back’ so the lines are behind the shapes.

Change the mark type on the Order Date (2) card to line. The requirements state

Make sure the “line” color matches with the following mark. It must be a single color“.

Hmmmm… I don’t think that last sentence was there when I tackled the challenge… I just read the line colour ‘matches’ with the following mark, so to do this I added Previous Order Date to the Path shelf. The line graduates in colour, but from the colour of preceding mark to the colour of the following mark.

Looking at Luke’s solution now that I’ve seen this statement, he chose to use a Gantt mark type to represent the line, which meant each gantt block was coloured fully. Oh well… I didn’t quite get this then 🙂

Stop the tooltips from showing on the line, by deleting the text in the tooltip, then its just a case of re-sizing the marks, and formatting to suit. Set the background colour of the worksheet to grey.

Adding the red header on the dashboard

To get the red header displayed above the table, add, a container onto the dashboard and add the viz to sit inside it. With the container selected (identified by the blue border), set the background of the container object to red (#D81159).

Then select the viz itself (identified by the grey border), and adjust the outer padding settings, to have 0 left, bottom & right, and 4 at the top.

Hopefully this should now be pretty much everything needed to get this challenge sorted. If I’ve missed anything, please comment. My published viz is here.

It’s a pretty lengthy read, so if you’ve got this far – well done & thank you!

Happy vizzin’!

Donna

# Can you create a concatenated list of values?

The challenge for this week’s #WOW was set by Sean Miller, with a focus on creating a more ‘traditional’ list style report. The challenge was related to the structure of the data, and using Tableau to take multiple rows of data for a patient, and present as a single row, with some of the information concatenated into a string of comma separated values.

Whilst there are other tools & methods that could be used to shape the data before it hits Tableau, which may be more appropriate/performant for a business use, this is after all a Tableau challenge.

I figured out what I was going to need pretty quickly – a table calculation to compare rows to build up the string. However I did struggle with the volume of data being used. It caused Tableau to keep crashing initially, so I had to ping Sean to sense check I was on the right track, and there wasn’t something I was missing. Sean confirmed I was doing the right thing, so I decided to start building against a much smaller set of data. I restricted the data by adding a data source filter, and using a wildcard filter against Member Name starts with ‘a’.

So onto the build.

Building up the concatenated string

Add Member Name and Health Check Name to Rows to get the set of rows we need to work with.

What we’re aiming for is to get a concatenated string of each Health Check Name for each Member Name, on a row.

We’re going to achieve this by building up a string that combines the Health Check Name for the current row, with the combined list from the previous row.

This requires the use of a couple of Table Calculations. The first one is Index(). This can be used to number the rows in a table. I typically create a calculated field to store this.

Index

INDEX()

Set it to be discrete rather than continuous, and add the field to the Rows.

By default, the index is basically displaying a number per row, starting from 1 to however many rows are being displayed. This is because the table calculation has been set to compute Table Down.

We actually want the calculation to restart at 1 when the Member Name changes, ie for each Member Name.

Change this by clicking the ‘carrot’ / arrow on the Index pill and set to Compute Using -> Health Check Name

Next we need to create new calculated field that will build up the string

Health Check Name List

IF [Index]=1 THEN ATTR([Health Check Name])
ELSE PREVIOUS_VALUE(ATTR([Health Check Name])) + ‘, ‘ + ATTR([Health Check Name])
END

If we’re the first row, then store the name of the current row, else get the value from the previous row and concatenate to the current value, separating with a ‘,’.

Add this field to the Text shelf, and verify the table calculation is set to Compute by Health Check Name

You can see how the list is building up for each row, so by the time you get to the last row for the Member Name you’ve got the complete list.

Restrict to 1 row per Member Name

The last row for each Member Name is ultimately what we want. To identify this, I need another table calculation

Size

SIZE()

This does exactly what it ‘says on the tin’; returns the number of rows being displayed, and like INDEX() it can be computed over different partitions in the table.

Change it to be discrete and add to Rows, once again setting it to Compute By -> Health Check Name

Size is displaying the total number of rows associated to each Member Name.

To reduce the data to 1 row per Member Name, we just need the following:

Index = Size?

Index = Size

This returns true if the values match, so add this to the Filter shelf and set to True, and hey presto! 1 row per Member Name.

You can now start adding all the other fields to the output and remove Index & Size from the rows.

You will need to ensure Health Check Name remains on the canvas though, as otherwise the table calculations will break. You can either hide it from displaying as a column, by unchecking Show Header, or move it to the Detail shelf.

To ‘label’ the concatenated string field, so it ‘looks’ like just another regular column, I type the text directly into the Columns shelf

This gives us a ‘double’ column heading, and we can simply remove one

Min # Healthchecks filter

This is achieved by adding Size to the Filter shelf, but to get a range slider, the field needs to be a green continuous pill rather than blue.

List Must Contain filter

Right click on the Health Check Name field and Create -> Parameter

The Edit Parameter dialog will open with the list of distinct values pre-populated. Add the value ‘All’ to the bottom.

This parameter will be presented on the dashboard to allow user selection. But as it stands it won’t do anything. We need another field to read in the value selected in the parameter and filter the rows.

FILTER: Health Check Names

CONTAINS([Health Check Name List],[Health Check Name Parameter]) OR [Health Check Name Parameter]=’All’

This returns True if ‘All’ is selected or the concatenated list we built, contains the value from the parameter.

Add this to the Filter shelf, and set to True.

Note throughout this challenge, all the table calculations must be set to compute by Health Check Name. This includes Nested Calculations

It’s worth double checking, and if you do alter them, recheck the filter value is selected to True and hasn’t reset itself.

And that’s the core of the tricky stuff all sorted. The additional filters need adding for Physician & Age Bracket, formatting applied and then all displayed nicely on a dashboard.

Once I’d got all this I saved my work, and then took off my data source filter, keeping my fingers crossed it didn’t all grind to a halt… it didn’t, but it wasn’t the quickest either.

My published solution is here.

Apparently someone completed the challenge with LODs… I’m off to investigate further!

Happy vizzin’!

Donna