What is the distribution of total orders by customer?

Luke Stanke set the challenge this week, and posted a ‘sneak peak’ on Twitter before the challenge was formally released

Challenges from Luke can sometimes be on the harder end of the scale, so with the bit of extra time available and only the gif in the tweet as a clue , I had a play to see if I could get close at all. And surprisingly I could, so once it was formally issued, it was just a case of tidying up some of my calcs.

Calculations

First up we need to establish the basic calculations

# Customers

COUNTD([Customer ID])

# Orders Per Customers

{FIXED [Segment],[Customer ID]: COUNTD([Order ID])}

Plotting these out onto a table with Segment we get

Note #Orders Per Customers needs to be set to be a discrete dimension.

So while this shows us a summary of the number of customers, we’ve only got 1 mark showing the summarised count. When plotting the chart we need something else in the view that will generate more marks. This is Customer ID.

At this point in order to help us build up our tabular view to ‘see’ what’s going on, I’ll filter the table to just show Segment = Consumer, and I’ll add Customer ID to Rows

As expected, our # Customers is now showing a count of 1 per row, but the data is now expanded as we have now got a row (ie a mark) per customer for each # Orders Per Customer ‘bucket’. But we still need a handle on the total customers in each ‘bucket’.

Customers Per Order Count

WINDOW_SUM([# Customers])

Adding this to the view and setting the table calc to compute based on Segment & Customer ID, we get the summarised value back again.

But we don’t want to actually show that number of marks; the number of circles to plot on the chart is dependent on a user parameter:

pMarkIndicator

Based on the requirements, if the number of customers is 15 and the user parameter is 5, then 3 circles should be drawn (15 / 5 =3 ), but if the number of customers was 14, only 2 circles should be drawn (14 / 5 = 2.8), ie the number of circles will always be set to the integer of the equal or lesser value (essentially the FLOOR() function). This can also be achieved by

Marks to Plot

INT([# Customers per Order Count]/[pMarkIndicator])

For some reason FLOOR can’t be used in the above as a table calculation is being used, but INT does the job just fine, and adding to the tabular view and adjust the table calculation accordingly we get

ie for the Consumer Segment, 6 customers have made 1 order in total, so based on batching the customers into groups of 5, this means 1 circle should be displayed. Whereas, 30 customers have made 3 orders in total, so 6 circles should be displayed.

But we can’t actually reduce the amount of rows (ie marks) displayed – we either have 1 row (by removing Customer ID) or a row per customer. But that’s fine, we don’t need to.

What we want is something in our data to group each row into the relevant batch size.

First up let’s generate an ID per row for each customer that restarts for each #Orders Per Customer.

Index

INDEX()

Add this as a discrete pill to Rows, and adjust the table calc

Now we have this, we calculate which ‘column’ each customer can sit in, a bit like what we would do if we were building a small multiple table, arranging objects in rows and columns (see here for an example of what I mean).

Cols

[Index]%([Marks to Plot])

This uses the modulo (%) notation which returns the remainder of the division sum. Lets put this on the view

For customers who have only ordered once, and where we’re only going to plot 1 mark, the Cols value is the same (0) for all rows.

Whereas for customers who have ordered twice, and where we want to plot 2 marks, the Cols value is either 0 or 1.

We’ve now got a value we can use to plot on an axis. We’re still going to plot a mark for each Customer ID, but some marks will be plotted in the same position, ie on top of each other, which therefore looks like just one mark.

Let’s show this more graphically, by duplicating the sheet, deleting some pills, moving some around, changing some to continuous, and setting the mark type to circle as below

Change the pMarksIndicator parameter and the number of circles will adjust as required.

So far, so good. We’ve got the right number of marks, it’s just not looking as nice and symmetrical as it should be.

We need to shift the marks to the left. But how far it shifts is dependent on whether we’ve plotted an odd or even number of marks.

If we have an odd number, the middle mark should be plotted at 0. If we have an even number the middle two marks should be plotted at -0.5 and +0.5 respectively. The calculation below will achieve this

Cols Shifted

IF [Marks to Plot]%2 = 0 //we’re even
THEN [Cols] – ([Marks to Plot]/2) + 0.5
ELSE //we’re odd
[Cols] – (([Marks to Plot]-1)/2)
END

To demonstrate this, I’ve added Cols Shifted along side Cols on the viz (this time make sure all the table calculation settings (including the nested calcs) are applied to compute based on Customer ID only which is different from the calcs above)..

Now you can see how it all works, you can remove the Cols and the Segment from the Filter shelf.

And now its just a case of applying the various formatting to clean up the display, and adding to the dashboard.

My published version is here.

Happy vizzin’! Stay Safe!

Donna

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s