Marketing Analytics RM Report

Marketing Analytics Group 2: Logan Moore, Jennifer Eickert, Madeline Rynkiewicz, Lauryn Jashinski

Model Overview

Special Considerations

The main factors affecting the performance of our model were how we optimized the selected attributes and the parameters within the decision tree.

1. Weight by Gini Index: We ran six different weighting operators and Gini Index provided the most balanced results.

2. Select by Weight: This allowed us to easily choose the top 10 attributes on which to base the model.

3. Replace Missing Values: After trial and error, the model predicted best when missing attribute values were replaced with the attribute's average.

4. Filter Examples: To find the best prediction, we rigorously examined the weight of each attribute, the mean and standard deviation of each attribute, and the overall effect of outliers on the model. (*Important note: ‘Custom Filters’ can only be applied in RapidMiner 6, downloaded on Logan’s personal computer.)

Trial and Error Process for Filtering

eqpdays   1          eqpdays   1          eqpdays   1
months    0.644121   months    0.797907   months    0.790851
retcalls  0.45587    mou       0.248054   mou       0.232883
webcap    0.37495    retcalls  0.19691    retcalls  0.193697
creditde  0.28094    webcap    0.160545   webcap    0.160489
changem   0.266191   incalls   0.136913   incalls   0.137575
changer   0.243359   creditde  0.120039   creditde  0.12025
mou       0.156781   changem   0.110988   changem   0.111546
retaccpt  0.135313   changer   0.106593   outcalls  0.105707
phones    0.079818   outcalls  0.105258   unansvce  0.104361

Chi-Square Info Gain Gini

eqpdays   1          mou       1          retcalls  1
retcalls  0.623101   changem   0.494791   eqpdays   0.804925
webcap    0.59594    eqpdays   0.489207   webcap    0.684583
creditde  0.515741   revenue   0.084119   changer   0.60161
mou       0.500321   changer   0.075769   creditde  0.498742
incalls   0.372892   unansvce  0.072891   months    0.487047
retaccpt  0.353905   outcalls  0.066249   changem   0.467689
phones    0.334674   incalls   0.031656   retaccpt  0.326418
outcalls  0.331368   blckvce   0.02       mou       0.215435
changem   0.327312   months    0.018341   callwait  0.182593

Deviation Uncertainty Correlation
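The weighting and selection steps (items 1 and 2) can be illustrated with a minimal Python sketch. It assumes the weight of an attribute is the largest Gini impurity reduction achievable by a single binary split, scaled so that the top attribute gets weight 1, as in the tables above. RapidMiner's Weight by Gini Index operator may compute its weights differently, and the function names and toy values here are hypothetical, not taken from the project data.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split_gain(values, labels):
    """Largest impurity reduction over all binary splits `value <= t`."""
    parent = gini_impurity(labels)
    n = len(labels)
    best = 0.0
    # Skip the maximum value: splitting there would leave the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        child = (len(left) * gini_impurity(left)
                 + len(right) * gini_impurity(right)) / n
        best = max(best, parent - child)
    return best

def gini_weights(columns, labels):
    """Map attribute name -> gain, scaled so the top attribute has weight 1."""
    gains = {name: best_split_gain(vals, labels) for name, vals in columns.items()}
    top = max(gains.values())
    if top == 0:
        return {name: 0.0 for name in gains}
    return {name: g / top for name, g in gains.items()}

def select_top_k(weights, k):
    """Names of the k highest-weighted attributes, best first."""
    return sorted(weights, key=weights.get, reverse=True)[:k]

# Hypothetical toy data (not the project data set).
columns = {
    "eqpdays": [100, 200, 700, 800],
    "months": [10, 30, 12, 40],
    "mou": [5, 5, 5, 5],
}
labels = ["yes", "yes", "no", "no"]

weights = gini_weights(columns, labels)
top_two = select_top_k(weights, 2)
```

In the report's setting, `k` would be 10 and the columns would be the full attribute set.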

eqpdays   < 907
months    < 44
mou       < 1000
retcalls  < 1
webcap    NA  NA
creditde  NA  NA
incalls   < 42
outcalls  < 95
changem   > -500  < 500
retaccpt  < 1
changer   < 204

MASTER FILTER @95%
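As a rough illustration of items 3 and 4, the sketch below replaces missing values with the attribute average and then applies the bounds from the master filter table. It assumes each "<" entry is a strict upper bound and each ">" entry a strict lower bound, and that the NA rows (webcap, creditde) carry no filter. The helper names and the example rows are hypothetical, not the project data.

```python
def impute_means(rows, attributes):
    """Return copies of rows with None replaced by the attribute's mean."""
    means = {}
    for a in attributes:
        present = [r[a] for r in rows if r[a] is not None]
        means[a] = sum(present) / len(present) if present else None
    return [{**r, **{a: means[a] for a in attributes if r[a] is None}}
            for r in rows]

# (lower bound, upper bound) per attribute, copied from the master filter table.
# None means no bound on that side. webcap and creditde are NA, so they are omitted.
MASTER_FILTER = {
    "eqpdays": (None, 907),
    "months": (None, 44),
    "mou": (None, 1000),
    "retcalls": (None, 1),
    "incalls": (None, 42),
    "outcalls": (None, 95),
    "changem": (-500, 500),
    "retaccpt": (None, 1),
    "changer": (None, 204),
}

def passes(row, bounds):
    """True if every bounded attribute lies strictly inside its limits."""
    for attr, (low, high) in bounds.items():
        value = row[attr]
        if low is not None and not value > low:
            return False
        if high is not None and not value < high:
            return False
    return True

def apply_filter(rows, bounds):
    """Keep only the rows that satisfy all bounds."""
    return [r for r in rows if passes(r, bounds)]

# Hypothetical example rows (not the project data).
base = {"eqpdays": 300, "months": 18, "mou": 450, "retcalls": 0, "incalls": 5,
        "outcalls": 20, "changem": -10, "retaccpt": 0, "changer": 3}
rows = [
    dict(base),
    {**base, "eqpdays": 1200},  # removed: eqpdays is not < 907
    {**base, "changem": -650},  # removed: changem is not > -500
    {**base, "mou": None},      # missing value, filled with the mean of mou
]

clean = impute_means(rows, ["mou"])
kept = apply_filter(clean, MASTER_FILTER)
```

Dropping a key from `MASTER_FILTER` corresponds to removing that attribute from the custom filter.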

5. Decision Tree: The Gini Index was used as the splitting criterion within the decision tree, matching the Gini Index weighting measure. Decision trees are among the least restrictive models and do not assume normally distributed attributes. This is especially useful because the values of some attributes were skewed. A trial and error process was used to tune the parameters (shown below).
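To make the idea concrete, here is a minimal sketch of a decision tree that splits on Gini impurity, together with a trial-and-error loop over one parameter. It is not the RapidMiner Decision Tree operator: `max_depth` merely stands in for the parameters that were tuned, and the training and validation rows are hypothetical (eqpdays, months) pairs.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0

def best_split(rows, labels):
    """Return (column index, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for j in range(len(rows[0])):
        # Skip the largest value so that both sides of the split are non-empty.
        for t in sorted({r[j] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (j, t), score
    return best

def build_tree(rows, labels, max_depth):
    """Nested tuples (column, threshold, left, right); leaves are class labels."""
    split = best_split(rows, labels) if max_depth > 0 else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    j, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > t]
    return (j, t,
            build_tree([r for r, _ in left], [y for _, y in left], max_depth - 1),
            build_tree([r for r, _ in right], [y for _, y in right], max_depth - 1))

def predict(tree, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

def accuracy(tree, rows, labels):
    return sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(labels)

# Hypothetical (eqpdays, months) rows with churn labels.
train_x = [(100, 5), (150, 30), (200, 8), (600, 6), (700, 35), (800, 40)]
train_y = ["yes", "no", "yes", "no", "no", "no"]
val_x = [(120, 4), (180, 28), (650, 7), (900, 20)]
val_y = ["yes", "no", "no", "no"]

# Trial and error: try several depths and keep the best validation accuracy.
scores = {d: accuracy(build_tree(train_x, train_y, d), val_x, val_y)
          for d in range(4)}
best_depth = max(scores, key=scores.get)
```

Selecting parameters on a validation set rather than on the training data is what keeps the tuning from simply rewarding the deepest tree.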

Base Optimized

Model 1 Performance

Training

Validation

Scoring

Filters

This model performs solidly: the ‘No’ validation performance is well above 40%, and the ‘Yes’ validation performance is relatively high at 76.95%. The scoring data shows a balanced prediction of ‘Yes’ and ‘No’, and the ‘Yes’ predictions can be held with reasonable confidence. The model actually performs better on the validation data than on the training data, which is an anomaly, but it further indicates solid all-around performance. Five filters were chosen to remove outliers of highly weighted attributes, which adequately scrubbed the data. More research into individual responses that contain outlier values may boost both ‘Yes’ and ‘No’ validation performance. That process is very rigorous, even with adequate RapidMiner operators, and is beyond the scope of this course.
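For readers who want to reproduce per-class figures like the ‘Yes’ and ‘No’ percentages above, the sketch below computes class precision and class recall from actual and predicted labels. Which of the two the reported percentages correspond to is an assumption left open here, and the labels in the example are hypothetical, not the project's results.

```python
def class_metrics(actual, predicted, positive):
    """Return (precision, recall) for one class label."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0  # share of predictions that were right
    recall = tp / (tp + fn) if tp + fn else 0.0     # share of actual cases that were found
    return precision, recall

# Hypothetical validation labels (not the project's results).
actual = ["yes", "yes", "yes", "no", "no", "no", "no", "no"]
predicted = ["yes", "yes", "no", "yes", "no", "no", "no", "no"]

yes_precision, yes_recall = class_metrics(actual, predicted, "yes")
no_precision, no_recall = class_metrics(actual, predicted, "no")
```

Reporting both classes side by side makes it easier to see when a gain for ‘Yes’ comes at the expense of ‘No’.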

Model 2 Performance

Training

Validation

Scoring

Filters

By simply removing ‘outcalls’ from the custom filter, the performance of the model changed drastically. This model can predict churn with 86% confidence, about 9 percentage points higher than the previous model. However, the retention prediction drops considerably (21%). Ultimately, the marginal gain in validation performance for churn skews this model, and it appears that too many customers are now predicted to churn.

Model 3 Performance

Training

Validation

Scoring

Filters

When only the 6 most important attributes are selected and filtered accordingly, the validation performance of this model reflects the higher ‘Yes’ of Model 2 and the higher ‘No’ of Model 1 with respect to whether a customer will churn. Once again, the validation performance for retention seems too low, leaving the predictions too far out of balance.

Profile of Churning Customers

Customers who churn are expected to have had their equipment for fewer days and to have fewer months of service than loyal customers. They do not place calls to the retention team or accept retention offers. They are also more likely to have lower or poor credit.
