# How often is Sean listening to his favourite songs?

A “scatter plot with a twist” music based challenge this week was posed by Sean Miller, using data from his last.fm account. On the surface, a simple scatter plot of each song, comparing the number of days between first and last listen against total plays. But click on song, and you get to see a timeline depicting days since first play vs cumulative plays, on the same chart. Hmmm…

The data set provided consists of 1 row per song per date played since 2017. For the scatter plot, we’re looking to summarise the data at a song level. For the timeline related to the selected song, we need to have the data at the song and date level. Ultimately we’re mixing levels of granularity within a single chart.

This certainly took some thinking. Creating the 2 charts independently was pretty straightforward, but trying to put them together took some thought. I knew I was going to want to use a set action to drive the interactivity and apply different logic based on whether a song was selected or not, but it took a bit of trial and error to get a solution.

First up, the data provided contained some timestamp date fields, but these were string data types. I chose to use the Timestamp UTC field to get a ‘proper’ date. I actually did this by duplicating the field, renaming it to Date Played and changing the datatype to a date. This generated a calculation which is below… I’d have never of typed this myself đź™‚

Date Played

DATE(IF NOT ISNULL( DATEPARSE ( “MM/dd/yyyy HH:mm”, [TimeStamp UTC] ) ) THEN DATEPARSE ( “MM/dd/yyyy HH:mm”, [TimeStamp UTC] ) ELSEIF NOT ISNULL ( DATEPARSE ( “MM/dd/yyyy hh:mm:ss a”, [TimeStamp UTC] ) ) THEN DATEPARSE ( “MM/dd/yyyy hh:mm:ss a”, [TimeStamp UTC] ) END)

Now we have that, we can work out the first date a song was played

Min Date Per Song

{FIXED [Song ID]:MIN([Date Played])}

and the latest dates

Max Date Per Song

{FIXED [Song ID]:MAX([Date Played])}

and then we can derive the days between

Days since first listen

DATEDIFF(‘day’, [Min Date per Song], [Max Date per Song])

We can also get the total plays per song using

Total Plays per Song

{FIXED [Song ID]: [Total Plays]}

and with these calculated fields, we can build the basic scatter, using Total Plays per Song as a range filter, and setting both the axes not to start at zero.

So far so good. Now let’s think about the timeline. We need to identify a ‘selected song’ to help build this, so let’s create a set by right clicking on Song ID > Create > Set, and selecting a single option

We need to capture the number of plays on a date

Total Plays

COUNT([2021_05_26_WW21_My Streaming Activity.csv])

This is the default count field generated (the equivalent of the Number of Records if you’re on older versions of Tableau).

And we also need to capture the number of days from first play date to the current date, as we need to plot on a consistent axis when putting the charts together (ie we can’t plot with date on the axis).

Days to Date

DATEDIFF(‘day’,[Min Date per Song],[Date Played])

Let’s put this out into a table so you can see what’s going on.

Add the Selected Song = True to the filter shelf

There are days when there were multiple plays, so the Days To Date field needs to be set to AVG rather than SUM, to get the correct figure.

For Total Plays we need to plot the cumulative value, so we can set a quick table calculation against the Total Plays field of Running Sum (right click on Total Plays field). We’re going to explicitly set the table calculation to compute by Date Played as when depicted on a viz, the default of table down, might not give the correct values.

So with these fields, we can build the timeline viz (duplicate the table sheet and move the fields around). Set the mark type to line, and change the setting on the Path to make a stepped chart.

Ok, so now we have the 2 charts and hopefully understand what we’re aiming for. But how do we now go about getting everything onto a single chart?

We’re going to need a dual axis chart, since we have different mark types in play. And we’re going to want to plot different measures depending on whether we’re working with the selected song or not.

We’re going to build the data up in a table to get the logic for the fields we need. To just test the concept, we’ll filter to just a few songs, including the one in the Selected Song set. Add Song ID to the filter shelf and filter to a few songs.

Then build out a table as below:

What we’re aiming for, is for the records where In/Out of the set is Out, we want to plot the information we’re getting from the 1st two columns, but for the records where In/Out the set is In, we want the information from the other columns.

So let’s build this out.

Days to Plot

IF ATTR([Selected Song]) THEN AVG([Days To Date]) ELSE SUM([Days since first listen]) END

Plays to Plot

IF ATTR([Selected Song]) THEN RUNNING_SUM([Total Plays]) ELSE SUM([Total Plays per Song]) END

Pop these in the table, making sure any fields which are table calculations are set to compute by Date Played.

If you scroll to find where the data changes from a song out of the set to the one in, you can see how the two new fields are working.

So lets try plotting the chart out using these fields instead.

• Days to Plot on Columns
• Total Plays Per Song on Rows
• Song ID on Detail shelf
• Plays to Plot on Columns
• Date Played (set to exact date) on Detail of the All Marks card.
• Set the table calculation of Plays to Plot to compute by Date Played only.
• Change mark type of the Plays to Plot to Line and set the Path to stepped line
• Add Total Plays Per Song to Filter shelf and set to range from 50 to 100 (just to make the chart less busy).

You should end up with the below

You’ll notice we have the line of circles in the top chart, which is plotting a mark per day for the Selected Song. If we remove the Date Played pill from the Detail shelf of the Total Plays Per Song marks card, we lose these marks including the mark for the Selected Song too, which we need.

To resolve this, we need another field.

Is Last?

LAST()=0

Last Plays to Plot

IF [Is Last?] THEN Sum([Total Plays per Song]) END

This is just saying give me the total plays for the last mark in the list. Add these to the table to see what’s going on, making sure to set the table calc to compute by Date Played only

• Set the marks type on the Last Plays to Plot to circle
• Remove Measure Names from the Colour shelf, and set the mark colour to #3dde3c. Reduce the opacity to about 90%. Add a dark grey border.
• Add Selected Song to the Size shelf, and adjust so the selected song is larger than the others.
• On the Plays to Plot marks card, again remove Measure Names from the Colour shelf, and set the colour to black.
• Set the Path to stepped line.
• Click on the right hand axis and select Move marks to back
• Reduce the Size of the mark.
• Add Selected Song to the Colour shelf, and adjust the colours so the line is black and the tiny dots for all the other marks that you can see in the circles, is set to #3dde3c

You’ve now got the core chart, which needs to be further tidied to remove grid lines, axes, add tooltip etc. Once done, you can add to a dashboard, where you can then set the interactivity.

Add a dashboard action to change set values that sets the Selected Song set on Select. Set to work on single-select only.

Finally, you’ll find that if you select a mark, while you’ll get the trend line, the other points will now ‘fade out’

Now if you click on and off a mark, you should get the trendline show and disappear, and all marks remain at the same transparency throughout.

Hopefully you’ve got enough now to complete this challenge. My published viz is here.

Happy vizzin’! Stay Safe!

Donna