Tuesday, 8 November 2016

Cross Database Joins - Bethany Lyons and Alex Ross

Bethany Lyons and Alex Ross

The unexpected solution to many tough analytics problems

Bethany has looked forward to delivering this session for a long time as CDJ (Cross Database Joins) can be used to solve so much

Often most analysts have read only access so don't have the chance to create data
Identify (create vs deriving)
Understand (complexity and volume vs performance)
Use - focus on CDJs as the solution

Example 1
Taking a 12 month subscription from just one row and spreading it across the 12 months (on a monthly basis) to show the monthly revenue. Use a simple excel scaffold of month number and a key of 1 to create the product join.

To create the filter of finding when the contracts will earn revenue up to, Bethany used a Boolean filter calf but then added it to the data source filters to cut down on the processing done locally on your machine.

V10.2 adding joins based on calcs

Example 2

Counting staff employed at any single point in time - use missing values to fill the gaps on the table calcs

A scaffold of all dates is needed though and the just return those that are after hire and pre-termination date.

What if the scaffold creates huge amounts of rows?
I.e. If you have seconds a tool is active, you can scaffold on a day level and the create a calc that then counts a full day’s worth of seconds (86,400) but if a partial day then use a datediff() calc

No comments:

Post a Comment