Can you create a common starting point?

Kyle set the challenge this week, revisiting his favourite topic – baseball. The aim was to build what I’ve often referred to as a ‘rocket chart’, as it charts progress from a single ‘launch’ date/point. However having had a quick google, I can’t see any other reference to this being used for this type of chart….no idea where it came from <shrug>.

Anyway, the requirement was to compare the profiles of when home runs (HRs) had been accumulated over the course of a player’s career, restricting to just the players who are in the all-time top 10. These players hadn’t necessarily played during the same years or even decades, so there was a need to baseline the information according to the days since they started. Kyle also threw in the requirement that this was to be an LoD based challenge only, with no use of table calculations.

Build the basic chart

As mentioned above, we first need to ascertain how many days have passed between when the player hit their first home run, and the subsequent dates. We use a FIXED LoD to work out the minimum date per player

Min Date Per Player

DATE({FIXED [Player] : MIN([Date])})

And with that we can the work out the number of days that have passed

Days Since Min Date

DATEDIFF(‘day’,[Min Date Per Player], [Date])

And with this, we can quickly build out the main crux of the chart. Add Days Since in Date to Columns, and change to be a continuous dimension. Add Career HR to Rows and amend the aggregation to use AVG rather than SUM, as I found there looked to be duplicate records for some dates for the same player. Add Player to Detail.

Colouring the lines

Kyle provided a custom colour palette to use based on the team colours of the player. I updated by preferences.tps file with this data, and closed and reopened Tableau Desktop to ensure it picked it up. For more information on working with custom colour palettes see this Tableau help article.

Along with the player colours, we also need to identify which player has been selected.

For that we need a parameter to define who the selected player is

pPlayer

string parameter using a List where the values are added from the Player dimension. this causes the default to be set to Albert Pujols.

Show the parameter on the display.

We can now create

Is Selected Player?

[Player] = [pPlayer]

which will return a boolen true/false.

Kyle stated that we should be able to set the colours without having to manually click against every Player|T or F combination.

Now I managed this when I first built my solution, but in writing this blog and trying to replicate the steps, I’m not getting the same behaviour. So I have managed to come up with another way. The gif below hopefully demonstrates, but I’ll list the steps too.

Move the Player pill from Detail onto Colour

Edit the Is Selected Player field to just return True (use // to just comment out the original calculation)

Add Is Selected Player to the Detail shelf, then click the detail icon to the left of the pill and change it to Colour. This is a way to get multiple pills on the Colour shelf. Dragging will just replace the field being used for colour.

The colour legend dialog box should display a list of <Player>, True entries (if the legend isn’t displaying go to Worksheet > Show Cards > Reset Cards – you may then have to add the parameter to the display again).

Edit the colour legend, select the MLB HR Top 10 colour palette and click Assign Palette. This will automatically assign the relevant colour to each entry, since they were added based on alphabetical order.

Re-edit the Is Selected Player field, so it is back to [Player] = [pPlayer].

The entries in the colour legend will now only list one <Player>, True entry and the rest all false.

Edit the colour legend, and multi-select (ctrl-click) all the False entries, and then select the lightest shade of grey from the Seattle Grays palette. This should give you the desired display.

Select Alan Rodriguez from the parameter control. Both Albert, False & Alan, True should now be coloured. Edit the colour legend again and manually set the Albert Pujols, False entry to the same grey shade.

Now if you select any other player, only 1 line should be coloured, and it should be coloured to the corresponding player’s colour.

Setting the Tooltip

Add Season HR to Tooltip and change the aggregation to AVG. Add Date to Tooltip too and set it to be an Attribute. Amend the tooltip accordingly.

Adding the highest season HR indicator

Firstly we need to determine what the maximum Season HR value is per player

Max Season HR Per Player

{FIXED [Player]: MAX([Season HR])}

With this, we then want to get the corresponding Career HR value for that same time.

Career HR | Max Season HR

IF [Season HR] = [Max Season HR Per Player] THEN [Career HR] END

Add this field to Rows and change the aggregation to Avg.

Set to Dual Axis, Synchronise Axis and then set the mark type to Circle. Adjust the size of the circle mark slightly if need be.

Labelling the lines

On the Line marks card, add Player and Career HR to the Label shelf. Adjust the aggregation of Career HR to Avg. Edit the label, so only line ends are labelled. Adjust the font size to something quite small, and set the colour to Match Mark Colour.

Finally remove all gridlines, row & column dividers, and hide the axis. Title the chart.

When added to a dashboard, I then used a floating text object for the introductory text and positioned the parameter as a floating object underneath the text.

My published viz is here.

Happy vizzin’!

Donna

Advertisement

Can you visualise daily and weekly sales in the same view?

This week, another guest poster, Jami Delagrange, provided the #WOW2020 challenge. And I’m not gonna lie, I found this pretty tough.

The challenge suggests you can build the data model yourself, and I thought about doing this, but then figured I’d solve the visualisation challenge itself before looking back at the data model… and I’m glad I did, because it took some time to get through the challenge (and I haven’t attempted to build the model – I’ll wait for Jami’s solution for that).

So what were the areas I struggled with….

Initially it was the mark type, and how to get the ‘bars on bars’, in such a way that the fat weekly bar was sat in the middle of the thin daily bar. I tried a variety of things and spent a lot of time, with no success, before I gave up and had to look at Rosario Guana‘s solution. The answer Gantt bars. But I’d already attempted them, and hadn’t made much progress, and even after trying again having seen that Gantt;s were needed, I was still puzzled. So I had to have another look, and realised that my problem was that I was trying to plot a measure to make a vertical axes (ie put a measure on the rows), and that wasn’t needed.

So now armed with those 2 bits of information – I needed Gantt bars and set the measures for the dual axis on the Columns – I could start figuring out all the pieces I needed.

Within this blog, I’ll focus on

  • Calcs required for plotting the Weekly Sales data
  • Calcs required for plotting the Daily Sales data
  • Building the Gantt chart
  • Comparison filter
  • Event Group filter

Weekly Sales data

There isn’t a single ‘launch date’, across all the events, but the charts all start at the ‘same point’, so we can’t be plotting the date on the axis. Instead we need to calculate the week no relative to the launch date per event, using

Week of Launch Date

DATE(DATETRUNC(‘week’,[Launch Date],’Monday’))

(the requirements state the week should start on a Monday)

Week of Actual Date

DATE(DATETRUNC(‘week’,[Actual Date],’Monday’))

Week No From Launch

DATEDIFF(‘week’, [Week of Launch Date], [Week of Actual Date]) + 1

This returns an integer starting from 1, and is essentially counting the weeks from launch up until the week of the Event itself.

For the weekly data, we also need the ticket sales for the week, the cumulative weekly ticket sales (both displayed on the tooltip), the total sales and the % of sales per week (used to colour the mark, and displayed on the tooltip).

Ticket Sales

ZN([Sold Amount])

by wrapping this in ZN the field will return 0 in the event there are no tickets sales in a week. This means a mark will exist for every week.

For the cumulative ticket sales, I just used a Running Sum quick table calculation on Ticket Sales, so didn’t create a dedicated field, but if I had if would have been RUNNING_SUM(SUM([Ticket Sales])).

Total Sales Per Event

TOTAL(SUM([Sold Amount]))

% Total Sales

RUNNING_SUM(SUM([Ticket Sales])) / ([Total Sales Per Event])

This is formatted to a percentage with 1 dp.

If we put this all out into a table as follows, then you can see how the data all holds together

  • Event Description (Dim Event) &Week No From Launch (blue discrete pill) on Rows
  • Measure Names on Columns/ Measure Values on Text, filtered to measures : Ticket Sales, Ticket Sales (running sum table calc), Total Sales Per Event, & % Total Sales.. All table calcs are set to compute using Week No from Launch.

Daily Sales Data

For each week of sales, we need to plot 7 marks for each day of sales. Again we can’t plot using the actual date of sale. Instead we need to ‘index’ each day in the week – ie number each day from 1 – 7.

The data set already has a Day of Week field associated to Actual Date, but this returns 1 for a Sunday and 7 for a Saturday, but we need 1 to be associated to a Monday, so we just need to translate this field

Day No of Week

IF [Day of Week] = 1 THEN 7
ELSE [Day of Week] – 1
END

The days plotted are also coloured based on which ‘type of day’ they are

Type of Day

IF [Actual Date] = [Launch Date] THEN ‘Launch Date’
ELSEIF [Actual Date] = [Event Date (Dim Event)] THEN ‘Event Date’
ELSE ‘Regular’
End

Once again let’s plot the info we need into a table, so you can see what’s going on:

Put Event Description (Dim Date), Actual Date (exact date, discrete blue pill), Week No from Launch, Day Of Week Abbrev, Day No of Week, Type of Day on Rows and Ticket Sales on Text.

So these are the basic building blocks we need.

Building the Gantt Chart

For the ‘daily’ view, we need

  • Event Description (DimEvent) on Rows
  • Week No from Launch (discrete, blue bill) on Columns
  • Day No of Week (Continuous, dimension, green pill) on Columns

By just placing these 3 pills in these locations, the mark type is ‘automatic’ but has immediately selected the ‘gantt bar’ type. Change it to be specific.

You can now add Type of Day to Colour and format the marks appropriately (change the colour, set a border around each mark).

For the ‘weekly’ view, we just need a way of creating a mark that ‘spans’ the 7 days, as we’ve already ‘segmented’ the display by Week No From Launch. I typed in MIN(0) to Columns for this

I then ‘typed’ in Min(1) on the Marks shelf and added it to the Size shelf

Editing the MIN(0) axis to fix it from 0 to 1, fills the space, and then adjust the Size of this mark to make it narrower.

If you now make this ‘dual axis’, you’ve got the view you need

And now it’s just a case of formatting the colours, the Tooltips for each mark, hiding axis headers etc.

Comparison Filter

This filter changes the order in which the rows are displayed.

This is driven by a Parameter which contains the values ‘Event Year’ and ‘YoY Event’ and a calculated field

Display

IF [Comparison] = ‘Event Year’ THEN LEFT([Event Description (DimEvent)],4)
ELSE RIGHT([Event Description (DimEvent)],2)
END

Add the Display to the Rows in front of the Event Description (DimEvent) field and hide it.

Event Group Filter

Another parameter Event Group is created which I’ve set to store integers but display as a string

I also then created

Event Group Filter

([Event Group] = 1 AND QUARTER([Event Date (Dim Event)])<=2) OR ([Event Group] =2 AND QUARTER([Event Date (Dim Event)])>2)
OR [Event Group]=0

which is added to the Filter shelf and set to true.

And hopefully that should be all you need to get this challenge sorted. My published viz is here, and if you download, I also have other sheets which try to breakdown some of the information.

Happy vizzin’! Stay safe!

Donna

Can you build a comparative line chart with dynamic inputs?

My initial thoughts when I first saw this week’s challenge posted by Ann Jackson, was this would be pretty straightforward. And on the whole it was. I’ve dealt with defining specific date periods for comparison in the past, both with other #WorkoutWednesday challenges and within my job too. But Ann threw a curveball with her dynamic input requirement using what she termed as ‘custom buttons’.

The date listing didn’t phase me – the date forms part of the dataset, so displaying that to look like formatted ‘buttons’ (aka circle marks) was something I knew I could do.

It was the time range options that had me puzzled. The values ‘Last 3 Months’, ‘Last 6 Months’ etc aren’t values associated to any field in the dataset. Typically they would be options of a parameter. However parameters can’t be formatted in the way Ann presented. The closest you can get with a traditional parameter is a radio button list, where you have limited formatting options – certainly nothing as colourful as Ann’s requirement.



When I came to tackle this challenge, as this was the area I was most stumped over, I chose to just use the basic parameter concept to start with to get everything else working as I hoped. When I finally figured out how to do it, I adapted what I’d built to incorporate my change. For the purposes of this write up though, I’m going to start with the ‘dynamic inputs’.

Pick Time Range

From observing the PICK TIME RANGE selector in Ann’s published viz and interacting with it (hovering over the marks, clicking on an option etc) I deduced I needed to build a viz using Circle marks, and a row per option. The problem, as discussed above, was there was no dimension in the dataset that I could use to give me the values I wanted.

I needed to create one, but how?

By using a type of data densification technique, that I’ve used on other challenges, to essentially ‘fake’ a new field. My fellow #WorkoutWednesday pal, Rosario Gauna, recently co-authored a blog all about this technique, which you can read here. As a consequence I’m not going to go into too much detail about this.

I created the following calculated field

Pick Time Range

IF MONTH([Order Date]) = 1 THEN ‘LAST 3 MONTHS’
ELSEIF MONTH([Order Date]) = 2 THEN ‘LAST 6 MONTHS’
ELSEIF MONTH([Order Date]) = 3 THEN ‘LAST 9 MONTHS’
ELSEIF MONTH([Order Date]) = 4 THEN ‘LAST 12 MONTHS’
ELSE ‘YEAR TO DATE’
END

Given the dataset contains records for every month in the 4 years, I could guarantee there would be a record for each of these options.

I used the MIN(0) trick (or MIN(0.0) to create the display I was after, as this ensured I could align the text to the right of the circle mark I needed. This was plotted against my new Pick Time Range field (which has Show Header unchecked). Using MIN(0.0) meant I had a bit more precision when it came to fixing my axis to keep everything starting from the left when I eventually put on the dashboard. My axis were fixed from -0.01 to 0.1.

To make a value look like it is ‘selected’ did still require the use of a parameter….

Time Range

… and another calculated field

Time Selected

[Pick Time Range]=[Time Range]

This was then added to the Colour shelf, with the colours adjusted accordingly

Finally to make the colour change ‘on selection’, I needed a Parameter Action.

First I added the sheet created above to a dashboard. By default, this automatically displays the Time Range parameter and the colour legend to the dashboard. Leave these for now. I then selected Dashboard > Actions > Add Action > Change Parameter, and configured the options as below

When an option is clicked on, the Time Range parameter can be seen to change, and in turn, the selected option changes to pink.

The legend, and the Time Range parameter can now be removed from the display on the dashboard. This bit is done, onto the Date Selector 🙂

Pick End Month

Initially this follows similar principles to the above. I created a calculated field to store the month of the date

Order Date Month

DATE(DATETRUNC(‘month’,[Order Date]))

This sets every date to be the 1st of the associated month. I formatted this date to display as <month> <year>, and added these to a view in the same way I did the above, using MIN(0.0), a circle mark and, right aligning the label. The dates need to be sorted to descending.

Another parameter is needed to store the ‘selected’ end month, which I defaulted to 01 October 2019 (aka October 2019)

Date

But this time, we don’t just need to highlight the date selected, we need to colour the circles based on whether the date is within the ‘current period’, which in turn is based on the Time Range selected above, or within the ‘previous period’.

So if the end month is October 2019 and the Time Range is LAST 6 MONTHS, the Current period is the dates from May 2019 to October 2019, and the previous period is the 6 months prior to that.

To determine all this, I need to determine the start and end of the current period and the start and end of the previous period.

The end of the current period is simple, its the date selected

End Date Current Period

DATETRUNC(‘month’,DATE([Date]))

this is probably a bit overkill, as the default date is 1st October and all the other dates have been truncated to be the 1st of the month, but this just ‘makes sure’ 🙂

To get the end of the previous period, I need to either go back a number of months from the end date, or set to the same date of the previous year, if Year To Date is selected. I need to use a DATEADD function, but I need the number of months as an integer, so I need another calculated field to help with this, which will return me a number based on the value in the Time Range parameter

Time Value

CASE [Time Range]
WHEN ‘LAST 3 MONTHS’ THEN 3
WHEN ‘LAST 6 MONTHS’ THEN 6
WHEN ‘LAST 9 MONTHS’ THEN 9
WHEN ‘LAST 12 MONTHS’ THEN 12
ELSE 0
END

So now I have this, I can work out

End Date Previous Period

IF [Time Value] = 0 THEN
//take off a year
DATEADD(‘year’,-1,[End Date Current Period])
ELSE
//go back a few months
DATEADD(‘month’,([Time Value] * -1), [End Date Current Period])
END

and

Start Date Current Period

IF [Time Value] = 0 THEN
//1st of Jan for current year
DATETRUNC(‘year’,[End Date Current Period])
ELSE
//calc start time period as 1 month on from previous period end
DATEADD(‘month’, 1, [End Date Prior Period])
END

and

Start Date Prior Period

IF [Time Value] = 0 THEN
//1st of Jan for previous year
DATETRUNC(‘year’,[End Date Prior Period])
ELSE
//calc start time period (x months – 1) before end date
DATEADD(‘month’, ([Time Value]*-1)+1, [End Date Prior Period])
END

With all the dates now sorted, we can determine which colour each month should be set to

COLOUR:Date Selector

IF [Order Date Month]>=[Start Date Current Period] AND [Order Date Month]<=[End Date Current Period] THEN ‘blue’ ELSEIF [Order Date Month]>=[Start Date Prior Period] AND [Order Date Month]<=[End Date Prior Period] THEN ‘dark grey’
ELSE ‘grey’
END

which can then be added to the Colour shelf of my date list sheet.

To get the Date parameter to change on selection of the sheet, we need to use parameter actions again, so add the sheet to the dashboard, and create a new Dashboard Action

Now we’re ready to build the line chart…

Period by Period Line Chart

This is essentially a ‘Sales Over Time’ chart, but we can’t just plot a date on the columns shelf as we need to be able to compare the time periods, ie the value of sales associated to the date of the current start period must be plotted in the same ‘column’ as the value of the sales associated to the the date of the previous start period.

This is managed using the table calculation INDEX() to essentially number the dates from 1 to x for each period band (which I could use my COLOUR:Date Selector field to segregate). This concept of plotting dates from a set point is sometimes referred to as a rocket chart, although there are other names for them.

My Order Date Month field has to exist on the Detail shelf, and and the Index is set to compute using that field too


However, as you can see, I have 3 date periods – the dates based on the current period, the dates in the previous, and all other dates (the light grey line). We don’t want these, so another calculated field…

Dates To Include

([Order Date Month]>=[Start Date Prior Period] AND [Order Date Month]<=[End Date Prior Period]) OR ([Order Date Month]>=[Start Date Current Period] AND [Order Date Month]<=[End Date Current Period])

… is added to the Filter shelf and set to True (the dates to display must either be within the Current Period, or within the Previous Period).

Add this to the Dashboard, and you can now start to see how the line chart will change as the dates are changed and the time period are changed.

So final step now…

Title

The title of the dashboard updates based on the inputs selected, so the title is just built on another sheet, placing all the relevant fields on the Text shelf and formatting accordingly

Adding this onto the dashboard too, and you have all the components this challenge requires. You just need to sort any formatting/placement changes.

My version can be viewed here.

Happy vizzin!

Donna