Survival Analysis 01: Data Prep(Python)
Loading...

Our goal over the next few notebooks is to examine how two core Survival Analysis techniques can be applied to better understand patterns around customer attrition in a subscription model. In this notebook, we will prepare a publicly available dataset for analysis. This data will then serve as the basis for some exploratory analysis and modeling intended to assist us in understanding and predicting customer churn.

Note This notebook has been revised as of July 20, 2020

Step 1: Access the Raw Data Files

The dataset we will be working with is from the KKBox Churn Prediction Challenge hosted on Kaggle in 2018. Access to the data will require you to authenticate to Kaggle and agree to the organization's terms and conditions. Once you've done this, you can download the data, decompress it and upload it to your preferred cloud storage layer. For the purposes of this blog, we assume this has already been done and each file has been uploaded to an appropriate subfolder under a mount point named /mnt/kkbox/.

While there are numerous files available with this dataset, we will be making use of the transaction log and member files only. The information in some of the other files could be very interesting for this analysis. However, only a fraction of the subscriptions referenced in the transaction log have data in these other files. Instead of limiting our analysis to a small number of customers for whom we can create a more complete picture of service utilization, we'll focus our analysis to the more foundational information in the transaction log and member files.

Under our mount point, i.e. /mnt/kkbox/, it is expected that the transactions.csv and transactions_v2.csv files are uploaded to a subfolder named transactions and members_v3.csv is uploaded to a subfolder named members. With these files loaded to the proper locations, we can now read the files into Delta Lake-backed tables to make them more easily consumable in subsequent steps:

import shutil
from pyspark.sql.types import *
# delete the old database and tables if needed
_ = spark.sql('DROP DATABASE IF EXISTS kkbox CASCADE')

# drop any old delta lake files that might have been created
shutil.rmtree('/dbfs/mnt/kkbox/silver', ignore_errors=True)
# create database to house SQL tables
_ = spark.sql('CREATE DATABASE kkbox')
# transaction dataset schema
transaction_schema = StructType([
  StructField('msno', StringType()),
  StructField('payment_method_id', IntegerType()),
  StructField('payment_plan_days', IntegerType()),
  StructField('plan_list_price', IntegerType()),
  StructField('actual_amount_paid', IntegerType()),
  StructField('is_auto_renew', IntegerType()),
  StructField('transaction_date', DateType()),
  StructField('membership_expire_date', DateType()),
  StructField('is_cancel', IntegerType())  
  ])

# read data from csv
transactions = (
  spark
    .read
    .csv(
      '/mnt/kkbox/transactions',
      schema=transaction_schema,
      header=True,
      dateFormat='yyyyMMdd'
      )
    )

# persist in delta lake format
( transactions
    .write
    .format('delta')
    .partitionBy('transaction_date')
    .mode('overwrite')
    .save('/mnt/kkbox/silver/transactions')
  )

# create table object to make delta lake queriable
spark.sql('''
  CREATE TABLE kkbox.transactions
  USING DELTA 
  LOCATION '/mnt/kkbox/silver/transactions'
  ''')
Out[9]: DataFrame[]
# members dataset schema
member_schema = StructType([
  StructField('msno', StringType()),
  StructField('city', IntegerType()),
  StructField('bd', IntegerType()),
  StructField('gender', StringType()),
  StructField('registered_via', IntegerType()),
  StructField('registration_init_time', DateType())
  ])

# read data from csv
members = (
  spark
    .read
    .csv(
      '/mnt/kkbox/members/members_v3.csv',
      schema=member_schema,
      header=True,
      dateFormat='yyyyMMdd'
      )
    )

# persist in delta lake format
(
  members
    .write
    .format('delta')
    .mode('overwrite')
    .save('/mnt/kkbox/silver/members')
  )

# create table object to make delta lake queriable
spark.sql('''
  CREATE TABLE kkbox.members 
  USING DELTA 
  LOCATION '/mnt/kkbox/silver/members'
  ''')
Out[10]: DataFrame[]

Step 2: Construct Membership Event Dataset

With the data now loaded into a more accessible format, let's begin constructing subscription lifespan information. To do this, we need to examine the membership-relevant information contained in the transaction logs. As is typical of many applications, the transaction log data records quite a bit more information than just subscription updates. We'll need to weed out entries that are not relevant to our analysis and begin stitching together a picture of changes to subscriptions that indicate when a customer joined, how the subscription changed over time, and ultimately when a customer abandoned the subscription, should that event have transpired.

Given some limitations of the dataset and some quirks of the KKBox business model addressed in the Kaggle challenge documentation, this process is quite involved. To assist us in developing the logic required, let's narrow our focus to one customer, msno = WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=, examining the transaction log information associated with this person:

%sql  -- all transactions for customer msno WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=

SELECT *
FROM kkbox.transactions a
WHERE a.msno='WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8='
ORDER BY a.transaction_date, a.membership_expire_date
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014911912015-01-102018-06-020
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-01-212018-07-030
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-01-222018-08-030
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-01-252018-09-030
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014911912015-02-102018-10-010
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-02-212018-10-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-02-222018-11-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-02-252018-12-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014911912015-03-102019-01-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-03-212019-02-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-03-222019-03-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-03-252019-04-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014911912015-04-102019-05-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-04-212019-06-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-04-222019-07-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=410014912015-04-252019-08-250
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=410011912015-05-102019-09-250
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-05-212019-10-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-05-222019-11-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-05-252019-12-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014911912015-06-102020-01-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-06-212020-02-250
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-06-222020-03-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-06-252020-04-250
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014911912015-07-102020-05-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-07-212020-06-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-07-222020-07-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-07-252020-08-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-08-102020-09-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-08-212020-10-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-08-222020-11-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-08-252020-12-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-09-102021-01-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-09-212021-02-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-09-222021-03-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-09-252021-04-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-10-102021-05-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-10-212021-06-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-10-232021-07-300
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-10-252021-08-300
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-11-102021-09-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-11-212021-10-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-11-222021-11-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-11-252021-12-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-12-102022-01-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-12-122022-01-181
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413012912912015-12-122022-02-070
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912015-12-242013-12-261
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413012912912015-12-242014-08-111
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912016-01-102016-02-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912016-02-102016-03-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912016-03-102016-04-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912016-04-102016-05-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912016-05-102016-06-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912016-06-102016-07-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912016-07-102016-08-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912016-08-102016-09-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912016-09-102016-10-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=4130999912016-09-252016-09-251
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=413014914912016-09-252016-09-251
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=4130999912016-09-252016-09-251
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=4130999912016-09-252016-10-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=4130999912016-09-252016-10-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=4130999912016-09-252016-10-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=4130999912016-10-242016-11-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=4130999912016-11-242016-12-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=4130999912016-12-242017-01-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=4130999912017-01-242017-02-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=4130999912017-02-242017-03-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=4130999912017-03-242017-04-240

The transaction log is quite busy, often with multiple entries recorded on given transaction date (transaction_date) as can be observed for this customer on several dates including 2016-09-25. Based on documentation and the analysis of others who have used this dataset, we might consider records with a value of zero or less for payment plan days (payment_plan_days) as not relevant for our analysis. This still leaves a lot of entries in the log for a given transaction date.

Looking a little more closely at the transaction entries on a given date, it appears many are changing the subscription's expiration date (membership_expire_date). There is no way to determine which is the final entry for a given date as the time part of the transaction date field is truncated. Given this limitation, we'll make the assumption that the expiration date furthest into the future on a given transaction date is the expiration date on the account at that point in time. While this may not be a perfectly valid assumption, it appears that subscription change events trigger follow up transactions on later transaction dates and the expiration date associated with these seems to be more stable. So while we may not have the information exactly right on a given transaction date, we can still get an accurate understanding of the expiration date as we look over the range of entries associated with the account:

%sql  -- drop payment_plan_days of zero or less
      --   and select largest exp date on a trans date  

SELECT 
  a.msno, 
  a.transaction_date as trans_at, 
  MAX(a.membership_expire_date) as expires_at 
FROM kkbox.transactions a
WHERE  
  a.payment_plan_days > 0 AND 
  a.msno='WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8='
GROUP BY
  a.msno, a.transaction_date
ORDER BY msno, trans_at, expires_at
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-01-102018-06-02
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-01-212018-07-03
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-01-222018-08-03
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-01-252018-09-03
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-02-102018-10-01
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-02-212018-10-29
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-02-222018-11-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-02-252018-12-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-03-102019-01-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-03-212019-02-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-03-222019-03-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-03-252019-04-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-04-102019-05-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-04-212019-06-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-04-222019-07-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-05-212019-10-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-05-222019-11-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-05-252019-12-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-06-102020-01-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-06-212020-02-25
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-06-222020-03-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-06-252020-04-25
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-07-102020-05-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-07-212020-06-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-07-222020-07-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-07-252020-08-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-08-102020-09-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-08-212020-10-28
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-08-222020-11-28
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-08-252020-12-29
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-09-102021-01-28
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-09-212021-02-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-09-222021-03-29
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-09-252021-04-28
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-10-102021-05-29
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-10-212021-06-29
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-10-232021-07-30
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-10-252021-08-30
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-11-102021-09-29
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-11-212021-10-29
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-11-222021-11-28
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-11-252021-12-28
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-12-102022-01-28
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-12-122022-02-07
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-12-242014-08-11
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-01-102016-02-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-02-102016-03-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-03-102016-04-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-04-102016-05-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-05-102016-06-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-06-102016-07-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-07-102016-08-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-08-102016-09-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-09-102016-10-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-09-252016-10-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-10-242016-11-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-11-242016-12-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-12-242017-01-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-01-242017-02-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-02-242017-03-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-03-242017-04-24

Examining the more condensed transaction history for this customer, we can see another quirk in the data occurring on Deceber 24, 2015 (trans_at = 2015-12-24). On this date, the expiration date on the account is reset to a date that occurs prior to subscription initiation. The next transaction date, the account is reset to a future date.

The backdated record is likely due to some kind of subscription management activity such as a change in auto-renewal status or the like. Regardless of the reason, it's clear the backdated value should not be considered for churn identification purposes. Instead, we might simply add logic so that if a backdated expiration date appears in the log, it is reset to the transaction date with which it is associated:

%sql  -- correct expiration dates that have been backdated

SELECT
  x.msno,
  x.trans_at,
  CASE   -- if expiration date is prior to transaction date, then expiration date = transaction date
    WHEN x.expires_at < x.trans_at THEN trans_at
    ELSE expires_at 
    END as expires_at
FROM (
  SELECT
    a.msno, 
    a.transaction_date as trans_at, 
    MAX(a.membership_expire_date) as expires_at 
  FROM kkbox.transactions a
  WHERE  
    a.payment_plan_days > 0 
  GROUP BY
    a.msno, a.transaction_date
  ) x
WHERE x.msno='WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8='
ORDER BY msno, trans_at, expires_at
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-01-102018-06-02
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-01-212018-07-03
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-01-222018-08-03
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-01-252018-09-03
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-02-102018-10-01
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-02-212018-10-29
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-02-222018-11-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-02-252018-12-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-03-102019-01-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-03-212019-02-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-03-222019-03-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-03-252019-04-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-04-102019-05-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-04-212019-06-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-04-222019-07-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-05-212019-10-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-05-222019-11-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-05-252019-12-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-06-102020-01-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-06-212020-02-25
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-06-222020-03-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-06-252020-04-25
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-07-102020-05-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-07-212020-06-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-07-222020-07-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-07-252020-08-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-08-102020-09-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-08-212020-10-28
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-08-222020-11-28
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-08-252020-12-29
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-09-102021-01-28
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-09-212021-02-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-09-222021-03-29
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-09-252021-04-28
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-10-102021-05-29
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-10-212021-06-29
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-10-232021-07-30
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-10-252021-08-30
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-11-102021-09-29
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-11-212021-10-29
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-11-222021-11-28
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-11-252021-12-28
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-12-102022-01-28
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-12-122022-02-07
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-12-242015-12-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-01-102016-02-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-02-102016-03-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-03-102016-04-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-04-102016-05-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-05-102016-06-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-06-102016-07-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-07-102016-08-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-08-102016-09-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-09-102016-10-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-09-252016-10-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-10-242016-11-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-11-242016-12-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-12-242017-01-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-01-242017-02-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-02-242017-03-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-03-242017-04-24

Finally, we must note that the overall transaction dataset terminates on March 31, 2017. In that regard, we are observing the transaction history backward from an effective current date of April 1, 2017.

In the Kaggle challenge documentation associated with this dataset, KKBox explains that customers should not be considered as having churned until 30-days have passed since a subscription's expiration with no changes to the account that push the expiration date forward. With this in mind, we need to place a dummy entry into the transaction dataset for April 1, 2017 that will prevent us from considering accounts that have not yet completed the 30-day expiration window as churned. To fully understand the effect of this entry, you'll need to take a look at how we identify churn in the next section of this notebook:

%sql  -- add dummy transaction entries for 2017-04-01 to prevent misclassification of churn in subsequent steps 

SELECT
  x.msno,
  x.trans_at,
  CASE 
    WHEN x.expires_at < x.trans_at THEN trans_at
    ELSE expires_at 
    END as expires_at
FROM (
  SELECT
    a.msno, 
    a.transaction_date as trans_at, 
    MAX(a.membership_expire_date) as expires_at 
  FROM kkbox.transactions a
  WHERE  
    a.payment_plan_days > 0
  GROUP BY
    a.msno, a.transaction_date
  UNION ALL
  SELECT DISTINCT  -- dummy entries to protect churn calculations
    msno,
    TO_DATE('2017-04-01') as trans_at,
    TO_DATE('2017-04-01') as expires_at
  FROM kkbox.transactions
  ) x
WHERE x.msno='WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8='
ORDER BY msno, trans_at, expires_at
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-01-102018-06-02
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-01-212018-07-03
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-01-222018-08-03
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-01-252018-09-03
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-02-102018-10-01
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-02-212018-10-29
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-02-222018-11-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-02-252018-12-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-03-102019-01-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-03-212019-02-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-03-222019-03-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-03-252019-04-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-04-102019-05-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-04-212019-06-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-04-222019-07-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-05-212019-10-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-05-222019-11-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-05-252019-12-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-06-102020-01-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-06-212020-02-25
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-06-222020-03-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-06-252020-04-25
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-07-102020-05-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-07-212020-06-26
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-07-222020-07-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-07-252020-08-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-08-102020-09-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-08-212020-10-28
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-08-222020-11-28
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-08-252020-12-29
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-09-102021-01-28
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-09-212021-02-27
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-09-222021-03-29
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-09-252021-04-28
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-10-102021-05-29
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-10-212021-06-29
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-10-232021-07-30
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-10-252021-08-30
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-11-102021-09-29
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-11-212021-10-29
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-11-222021-11-28
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-11-252021-12-28
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-12-102022-01-28
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-12-122022-02-07
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-12-242015-12-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-01-102016-02-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-02-102016-03-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-03-102016-04-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-04-102016-05-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-05-102016-06-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-06-102016-07-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-07-102016-08-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-08-102016-09-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-09-102016-10-10
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-09-252016-10-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-10-242016-11-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-11-242016-12-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-12-242017-01-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-01-242017-02-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-02-242017-03-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-03-242017-04-24
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-04-012017-04-01

Step 3: Identify Churn Events

We now have a relatively clean history of membership-relevant events for this customer. This will allow us to compare each transaction to its immediately preceding transaction to see if it resulted in a new expiration date being set. Such events may represent subscription renewals or extensions:

%sql  -- compare membership-relevant transactions to prior transactions to identify expiration changes

WITH trans AS (  -- membership event dataset (from previous section)
  SELECT  -- -----------------------------------------------
    x.msno,
    x.trans_at,
    CASE 
      WHEN x.expires_at < x.trans_at THEN trans_at
      ELSE expires_at 
      END as expires_at
  FROM (
    SELECT
      a.msno, 
      a.transaction_date as trans_at, 
      MAX(a.membership_expire_date) as expires_at 
    FROM kkbox.transactions a
    WHERE  
      a.payment_plan_days > 0
    GROUP BY
      a.msno, a.transaction_date
    UNION ALL
    SELECT DISTINCT 
      msno,
      TO_DATE('2017-04-01') as trans_at,
      TO_DATE('2017-04-01') as expires_at
    FROM kkbox.transactions
    ) x
  )      -- -----------------------------------------------
  SELECT
    msno,
    LAG(expires_at) OVER(PARTITION BY msno ORDER BY trans_at) as previous_expires_at,
    trans_at,
    expires_at,
    CASE   -- idnetify meaningful events that change expiration date
      WHEN (LAG(expires_at, 1) OVER(PARTITION BY msno ORDER BY trans_at)) IS NULL THEN 1  -- new customer registration
      WHEN (LAG(expires_at, 1) OVER(PARTITION BY msno ORDER BY trans_at)) != expires_at THEN 1 -- change in expiration date
      ELSE 0
      END as meaningful_event
  FROM trans
  WHERE msno='WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8='
  ORDER BY msno, trans_at, expires_at
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=null2015-01-102018-06-021
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-06-022015-01-212018-07-031
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-07-032015-01-222018-08-031
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-08-032015-01-252018-09-031
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-09-032015-02-102018-10-011
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-10-012015-02-212018-10-291
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-10-292015-02-222018-11-261
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-11-262015-02-252018-12-241
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-12-242015-03-102019-01-241
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-01-242015-03-212019-02-241
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-02-242015-03-222019-03-271
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-03-272015-03-252019-04-271
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-04-272015-04-102019-05-271
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-05-272015-04-212019-06-261
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-06-262015-04-222019-07-261
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-07-262015-05-212019-10-261
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-10-262015-05-222019-11-261
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-11-262015-05-252019-12-271
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-12-272015-06-102020-01-261
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-01-262015-06-212020-02-251
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-02-252015-06-222020-03-261
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-03-262015-06-252020-04-251
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-04-252015-07-102020-05-261
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-05-262015-07-212020-06-261
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-06-262015-07-222020-07-271
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-07-272015-07-252020-08-271
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-08-272015-08-102020-09-271
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-09-272015-08-212020-10-281
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-10-282015-08-222020-11-281
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-11-282015-08-252020-12-291
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-12-292015-09-102021-01-281
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-01-282015-09-212021-02-271
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-02-272015-09-222021-03-291
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-03-292015-09-252021-04-281
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-04-282015-10-102021-05-291
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-05-292015-10-212021-06-291
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-06-292015-10-232021-07-301
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-07-302015-10-252021-08-301
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-08-302015-11-102021-09-291
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-09-292015-11-212021-10-291
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-10-292015-11-222021-11-281
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-11-282015-11-252021-12-281
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-12-282015-12-102022-01-281
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2022-01-282015-12-122022-02-071
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2022-02-072015-12-242015-12-241
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-12-242016-01-102016-02-101
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-02-102016-02-102016-03-101
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-03-102016-03-102016-04-101
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-04-102016-04-102016-05-101
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-05-102016-05-102016-06-101
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-06-102016-06-102016-07-101
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-07-102016-07-102016-08-101
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-08-102016-08-102016-09-101
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-09-102016-09-102016-10-101
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-10-102016-09-252016-10-241
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-10-242016-10-242016-11-241
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-11-242016-11-242016-12-241
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-12-242016-12-242017-01-241
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-01-242017-01-242017-02-241
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-02-242017-02-242017-03-241
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-03-242017-03-242017-04-241
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-04-242017-04-012017-04-011

If we narrow our focus to just the meaningful membership events, we can now examine the days between the previous expiration date and the current transaction date (where the expiration date is pushed forward). If more than 30 days has passed since expiration without a meaningful transaction, i.e. one that moves the expiration date forward, we would consider that churn has taken place.

Of course, this calculation depends on a transaction having been recorded after the expiration date. With many accounts, customers appear to have walked away for 6-months to a year to then renew their account. In these scenarios, a churn event would be recorded with the subsequent activation forming a new potential chain of meaningful membership events. In situations where no subsequent transactions occur, our dummy records inserted for April 1, 2017 will ensure we can correctly identify a churn event as having taken place:

%sql  -- identify churn events as those there the next transaction is 30+ days from prior expiration date

WITH trans AS (  
  SELECT  
    x.msno,
    x.trans_at,
    CASE 
      WHEN x.expires_at < x.trans_at THEN trans_at
      ELSE expires_at 
      END as expires_at
  FROM (
    SELECT
      a.msno, 
      a.transaction_date as trans_at, 
      MAX(a.membership_expire_date) as expires_at 
    FROM kkbox.transactions a
    WHERE  
      a.payment_plan_days > 0
    GROUP BY
      a.msno, a.transaction_date
    UNION ALL
    SELECT DISTINCT  
      msno,
      TO_DATE('2017-04-01') as trans_at,
      TO_DATE('2017-04-01') as expires_at
    FROM kkbox.transactions
    ) x
  )      
  
  SELECT
    msno,
    previous_expires_at,
    trans_at,
    expires_at,
    CASE  -- identify churn events
      WHEN DATEDIFF(trans_at, previous_expires_at) > 30 THEN 1
      ELSE 0
      END as churn_event
  FROM (
    SELECT
      msno,
      LAG(expires_at) OVER(PARTITION BY msno ORDER BY trans_at) as previous_expires_at,
      trans_at,
      expires_at,
      CASE 
        WHEN (LAG(expires_at, 1) OVER(PARTITION BY msno ORDER BY trans_at)) IS NULL THEN 1  
        WHEN (LAG(expires_at, 1) OVER(PARTITION BY msno ORDER BY trans_at)) != expires_at THEN 1 
        ELSE 0
        END as meaningful_event
    FROM trans
    )
  WHERE 
    meaningful_event=1
    AND msno='WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8='
  ORDER BY msno, trans_at, expires_at
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=null2015-01-102018-06-020
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-06-022015-01-212018-07-030
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-07-032015-01-222018-08-030
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-08-032015-01-252018-09-030
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-09-032015-02-102018-10-010
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-10-012015-02-212018-10-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-10-292015-02-222018-11-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-11-262015-02-252018-12-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-12-242015-03-102019-01-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-01-242015-03-212019-02-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-02-242015-03-222019-03-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-03-272015-03-252019-04-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-04-272015-04-102019-05-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-05-272015-04-212019-06-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-06-262015-04-222019-07-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-07-262015-05-212019-10-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-10-262015-05-222019-11-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-11-262015-05-252019-12-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-12-272015-06-102020-01-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-01-262015-06-212020-02-250
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-02-252015-06-222020-03-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-03-262015-06-252020-04-250
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-04-252015-07-102020-05-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-05-262015-07-212020-06-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-06-262015-07-222020-07-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-07-272015-07-252020-08-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-08-272015-08-102020-09-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-09-272015-08-212020-10-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-10-282015-08-222020-11-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-11-282015-08-252020-12-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-12-292015-09-102021-01-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-01-282015-09-212021-02-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-02-272015-09-222021-03-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-03-292015-09-252021-04-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-04-282015-10-102021-05-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-05-292015-10-212021-06-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-06-292015-10-232021-07-300
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-07-302015-10-252021-08-300
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-08-302015-11-102021-09-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-09-292015-11-212021-10-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-10-292015-11-222021-11-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-11-282015-11-252021-12-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-12-282015-12-102022-01-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2022-01-282015-12-122022-02-070
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2022-02-072015-12-242015-12-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-12-242016-01-102016-02-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-02-102016-02-102016-03-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-03-102016-03-102016-04-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-04-102016-04-102016-05-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-05-102016-05-102016-06-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-06-102016-06-102016-07-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-07-102016-07-102016-08-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-08-102016-08-102016-09-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-09-102016-09-102016-10-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-10-102016-09-252016-10-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-10-242016-10-242016-11-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-11-242016-11-242016-12-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-12-242016-12-242017-01-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-01-242017-01-242017-02-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-02-242017-02-242017-03-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-03-242017-03-242017-04-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-04-242017-04-012017-04-010

Notice in the results that the downstream transaction, whether a resubscription or simply a dummy transaction, is the one flagged as being associated with the churn. Pulling the flags back one transaction date associates the flag with the event where we'd intuitively understand churn to have occurred:

%sql -- move churn flag to prior transaction record

WITH trans AS ( 
  SELECT  
    x.msno,
    x.trans_at,
    CASE 
      WHEN x.expires_at < x.trans_at THEN trans_at
      ELSE expires_at 
      END as expires_at
  FROM (
    SELECT 
      a.msno, 
      a.transaction_date as trans_at, 
      MAX(a.membership_expire_date) as expires_at 
    FROM kkbox.transactions a
    WHERE  
      a.payment_plan_days > 0
    GROUP BY
      a.msno, a.transaction_date
    UNION ALL
    SELECT DISTINCT  
      msno,
      TO_DATE('2017-04-01') as trans_at,
      TO_DATE('2017-04-01') as expires_at
    FROM kkbox.transactions
    ) x
  )      
  SELECT
    msno,
    previous_expires_at,
    trans_at,
    expires_at,
    LEAD(churn_event, 1) OVER (PARTITION BY msno ORDER BY trans_at) as churned -- adjust churn flag assignment
  FROM (
    SELECT
      msno,
      previous_expires_at,
      trans_at,
      expires_at,
      CASE
        WHEN DATEDIFF(trans_at, previous_expires_at) > 30 THEN 1
        ELSE 0
        END as churn_event
    FROM (
      SELECT
        msno,
        LAG(expires_at) OVER(PARTITION BY msno ORDER BY trans_at) as previous_expires_at,
        trans_at,
        expires_at,
        CASE 
          WHEN (LAG(expires_at, 1) OVER(PARTITION BY msno ORDER BY trans_at)) IS NULL THEN 1  
          WHEN (LAG(expires_at, 1) OVER(PARTITION BY msno ORDER BY trans_at)) != expires_at THEN 1
          ELSE 0
          END as meaningful_event
      FROM trans
      )
    WHERE meaningful_event=1
    )
  WHERE msno='WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8='
  ORDER BY msno, trans_at, expires_at
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=null2015-01-102018-06-020
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-06-022015-01-212018-07-030
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-07-032015-01-222018-08-030
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-08-032015-01-252018-09-030
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-09-032015-02-102018-10-010
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-10-012015-02-212018-10-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-10-292015-02-222018-11-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-11-262015-02-252018-12-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-12-242015-03-102019-01-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-01-242015-03-212019-02-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-02-242015-03-222019-03-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-03-272015-03-252019-04-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-04-272015-04-102019-05-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-05-272015-04-212019-06-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-06-262015-04-222019-07-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-07-262015-05-212019-10-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-10-262015-05-222019-11-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-11-262015-05-252019-12-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-12-272015-06-102020-01-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-01-262015-06-212020-02-250
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-02-252015-06-222020-03-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-03-262015-06-252020-04-250
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-04-252015-07-102020-05-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-05-262015-07-212020-06-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-06-262015-07-222020-07-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-07-272015-07-252020-08-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-08-272015-08-102020-09-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-09-272015-08-212020-10-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-10-282015-08-222020-11-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-11-282015-08-252020-12-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-12-292015-09-102021-01-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-01-282015-09-212021-02-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-02-272015-09-222021-03-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-03-292015-09-252021-04-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-04-282015-10-102021-05-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-05-292015-10-212021-06-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-06-292015-10-232021-07-300
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-07-302015-10-252021-08-300
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-08-302015-11-102021-09-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-09-292015-11-212021-10-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-10-292015-11-222021-11-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-11-282015-11-252021-12-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-12-282015-12-102022-01-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2022-01-282015-12-122022-02-070
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2022-02-072015-12-242015-12-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-12-242016-01-102016-02-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-02-102016-02-102016-03-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-03-102016-03-102016-04-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-04-102016-04-102016-05-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-05-102016-05-102016-06-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-06-102016-06-102016-07-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-07-102016-07-102016-08-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-08-102016-08-102016-09-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-09-102016-09-102016-10-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-10-102016-09-252016-10-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-10-242016-10-242016-11-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-11-242016-11-242016-12-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-12-242016-12-242017-01-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-01-242017-01-242017-02-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-02-242017-02-242017-03-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-03-242017-03-242017-04-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-04-242017-04-012017-04-01null

This now leaves our dummy transaction records as having a NULL flag value. By removing these, we now have an accurate subscription history log for our customer:

%sql  -- remove the dummy transactions added earlier

WITH trans AS (  
  SELECT  
    x.msno,
    x.trans_at,
    CASE 
      WHEN x.expires_at < x.trans_at THEN trans_at
      ELSE expires_at 
      END as expires_at
  FROM (
    SELECT 
      a.msno, 
      a.transaction_date as trans_at, 
      MAX(a.membership_expire_date) as expires_at 
    FROM kkbox.transactions a
    WHERE  
      a.payment_plan_days > 0
    GROUP BY
      a.msno, a.transaction_date
    UNION ALL
    SELECT DISTINCT  
      msno,
      TO_DATE('2017-04-01') as trans_at,
      TO_DATE('2017-04-01') as expires_at
    FROM kkbox.transactions
    ) x
  )      
SELECT *
FROM (
  SELECT
    msno,
    previous_expires_at,
    trans_at,
    expires_at,
    LEAD(churn_event, 1) OVER (PARTITION BY msno ORDER BY trans_at) as churned 
  FROM (
    SELECT
      msno,
      previous_expires_at,
      trans_at,
      expires_at,
      CASE
        WHEN DATEDIFF(trans_at, previous_expires_at) > 30 THEN 1
        ELSE 0
        END as churn_event
    FROM (
      SELECT
        msno,
        LAG(expires_at) OVER(PARTITION BY msno ORDER BY trans_at) as previous_expires_at,
        trans_at,
        expires_at,
        CASE 
          WHEN (LAG(expires_at, 1) OVER(PARTITION BY msno ORDER BY trans_at)) IS NULL THEN 1 
          WHEN (LAG(expires_at, 1) OVER(PARTITION BY msno ORDER BY trans_at)) != expires_at THEN 1 
          ELSE 0
          END as meaningful_event
      FROM trans
      )
    WHERE meaningful_event=1
    )
  )
WHERE churned IS NOT NULL  -- remove dummy transaction entries
      AND msno='WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8='
ORDER BY msno, trans_at, expires_at
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=null2015-01-102018-06-020
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-06-022015-01-212018-07-030
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-07-032015-01-222018-08-030
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-08-032015-01-252018-09-030
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-09-032015-02-102018-10-010
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-10-012015-02-212018-10-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-10-292015-02-222018-11-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-11-262015-02-252018-12-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2018-12-242015-03-102019-01-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-01-242015-03-212019-02-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-02-242015-03-222019-03-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-03-272015-03-252019-04-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-04-272015-04-102019-05-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-05-272015-04-212019-06-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-06-262015-04-222019-07-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-07-262015-05-212019-10-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-10-262015-05-222019-11-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-11-262015-05-252019-12-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2019-12-272015-06-102020-01-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-01-262015-06-212020-02-250
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-02-252015-06-222020-03-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-03-262015-06-252020-04-250
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-04-252015-07-102020-05-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-05-262015-07-212020-06-260
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-06-262015-07-222020-07-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-07-272015-07-252020-08-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-08-272015-08-102020-09-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-09-272015-08-212020-10-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-10-282015-08-222020-11-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-11-282015-08-252020-12-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2020-12-292015-09-102021-01-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-01-282015-09-212021-02-270
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-02-272015-09-222021-03-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-03-292015-09-252021-04-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-04-282015-10-102021-05-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-05-292015-10-212021-06-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-06-292015-10-232021-07-300
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-07-302015-10-252021-08-300
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-08-302015-11-102021-09-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-09-292015-11-212021-10-290
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-10-292015-11-222021-11-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-11-282015-11-252021-12-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2021-12-282015-12-102022-01-280
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2022-01-282015-12-122022-02-070
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2022-02-072015-12-242015-12-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2015-12-242016-01-102016-02-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-02-102016-02-102016-03-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-03-102016-03-102016-04-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-04-102016-04-102016-05-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-05-102016-05-102016-06-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-06-102016-06-102016-07-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-07-102016-07-102016-08-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-08-102016-08-102016-09-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-09-102016-09-102016-10-100
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-10-102016-09-252016-10-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-10-242016-10-242016-11-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-11-242016-11-242016-12-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2016-12-242016-12-242017-01-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-01-242017-01-242017-02-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-02-242017-02-242017-03-240
WAMleIuG124oDSSdhcZIECwLcHHHbk4or0y6gxkK7t8=2017-03-242017-03-242017-04-240

Let's remove the customer-specific constraint, clean up our final SELECT-list and convert our query to a view to make its use across all customer subscriptions a little easier going forward: