Subject: DM: RE: churn
From: Randy Kerber
Date: Fri, 10 Apr 1998 05:12:23 -0400 (EDT)

Hallberg,

While many of the factors mentioned by previous responders can cause an increase in the error, none, aside from the suggestion to look for a data-encoding error, seems likely to explain such a dramatic drop (from 90% to 11%). I would tend to doubt both hypothesis B and hypothesis C, though the wrong software bug could certainly cause a large error.

As for hypothesis B: despite all the noise made by tool vendors telling you that you need a supercomputer to munge through millions of data points, 8000 data points is usually quite adequate (unless you have way too many variables). Except in rare circumstances, a million data points would likely buy only a small increase in accuracy.

One question I have is what exactly is meant by "90% of real churn anticipated". Does it mean that 90% of the time you predict churn, it really is churn? That 90% of your predictions are correct? That you predict churn for 90% of the cases that actually are churn? It can make a big difference: if I always predict "churn", I score 100% by the third definition.
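To make the three readings concrete, here is a small sketch in plain Python (the confusion-matrix counts are invented for illustration; they are not Hallberg's numbers):

    # Hypothetical counts for 2000 customers -- illustration only.
    tp = 900   # actual churners predicted "churn"
    fn = 100   # actual churners predicted "no churn"
    fp = 400   # actual non-churners predicted "churn"
    tn = 600   # actual non-churners predicted "no churn"

    # Reading 1: when you predict churn, how often is it really churn?
    precision = tp / (tp + fp)                  # 0.69

    # Reading 2: what fraction of all predictions are correct?
    accuracy = (tp + tn) / (tp + fn + fp + tn)  # 0.75

    # Reading 3: what fraction of actual churners do you catch?
    recall = tp / (tp + fn)                     # 0.90

    print(f"precision={precision:.2f}  accuracy={accuracy:.2f}  recall={recall:.2f}")

The same model is "90% accurate" or not depending entirely on which of the three you report.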
It would be helpful to know the accuracy results on a test set for the same time period. That would give some indication of how much the neural net itself might be overfitting. But overfitting in this sense, unless it's really extreme, generally increases the error by no more than a few percent.

However, even if the test set indicates that the model is not overfitting, it really only tells you the expected accuracy for that time period. Once you train on one time period to predict the next (which you can't avoid in this situation), all bets are off as to accuracy estimates from test sets. The accuracy achieved on a test set relies on the assumption that the data points come from the same population as the training set, which is no longer true when you move to a future time period. Thus seasonality could cause a major drop in accuracy. It doesn't need to be based on seasons, though; it just depends on changes in the world from one time period to the next.

Consider a contrived, exaggerated example (simulated in the sketch below). Suppose that for the July-September period the key indicator was the length of a certain kind of call: the churners made shorter calls (say 3-5 minutes) and the non-churners made longer calls (say 7-10 minutes). Then assume that, for whatever reason, people in the following period generally double the length of their calls (perhaps you offer incentives for longer calls). Now the neural net is hardwired to predict churn for the 3-5 minute range, but the churners are making 6-10 minute calls, squarely in the range the model is wired to call "non-churn".

There is nothing here that is particular to neural nets; decision trees, rules, and clustering would all get snared in the same trap. The problem with neural nets, of course, is that you're given little clue as to what the model has actually learned. With the other methods there is at least the possibility that you'll anticipate the trouble: if you see a rule about churners and 3-5 minute calls, and you know that call lengths are increasing, it should make you suspicious. The bottom line is that it is COMPLETELY up to the ANALYST to avoid this trap.
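A minimal simulation of that trap, with synthetic call lengths and a one-variable cutoff rule standing in for the network (all numbers invented; nothing here comes from the actual project):

    import random
    random.seed(0)

    # Period 1 (Jul-Sep): churners make 3-5 minute calls,
    # non-churners 7-10 minutes.  Label 1 = churn.
    train = [(random.uniform(3, 5), 1) for _ in range(4000)] \
          + [(random.uniform(7, 10), 0) for _ in range(4000)]

    # Stand-in for the fitted model: predict churn below the
    # midpoint of the gap seen in the period-1 data.
    CUTOFF = 6.0
    def predict(minutes):
        return 1 if minutes < CUTOFF else 0

    caught = sum(predict(m) for m, y in train if y == 1) / 4000
    print("churners caught, Jul-Sep:", caught)    # 1.0

    # Period 2 (Oct-Dec): call lengths double, so churners now make
    # 6-10 minute calls -- inside the model's "non-churn" region.
    later = [(random.uniform(6, 10), 1) for _ in range(4000)]
    caught = sum(predict(m) for m, y in later if y == 1) / 4000
    print("churners caught, Oct-Dec:", caught)    # ~0.0

The rule was a perfect fit for the first period and is almost perfectly wrong for the second; no amount of training data or tool quality fixes that.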
The other prime candidate to explain your results is quite different: it could be that your model is completely valid!

You didn't say what the typical churn rate is, but I would guess (and hope, for the sake of the business) that churners over a 3-month period are a small minority. In the dataset you created, there are equal numbers of each group. At this point it would really help to know exactly what the 90% accuracy means. Assume that for the actual churners you predict churn 90% of the time, and that for the non-churners you predict churn 10% of the time. Also assume that the total population is 100,000 and that the actual churn rate is 1%. In that case you will predict "churn" for 900 of the 1,000 actual churners and for 9,900 of the 99,000 actual non-churners. You therefore predict "churn" 10,800 times (900 + 9,900) but are correct only 8.33% of the time (900 / 10,800)! You have gone from 90% on the training set to 8.33% in the real world, even with no overfitting and no seasonality. I don't know whether this explains your situation; it would be necessary to know the actual numbers.
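Spelled out in a few lines of Python (the 90%, 10% and 1% figures are the assumed rates above, not measured values):

    population   = 100_000
    churn_rate   = 0.01                      # assumed true churn prior
    churners     = population * churn_rate   # 1,000
    non_churners = population - churners     # 99,000

    hit_rate    = 0.90   # "churn" predicted for actual churners
    false_alarm = 0.10   # "churn" predicted for actual non-churners

    predicted_churn = hit_rate * churners + false_alarm * non_churners
    fraction_correct = hit_rate * churners / predicted_churn

    print(f"'churn' predictions: {predicted_churn:.0f}")   # 10,800
    print(f"fraction correct:    {fraction_correct:.2%}")  # 8.33%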
Randy Kerber
Knowledge Discovery Group
NCR Human Interface Technology Center
kerber@best.com
Randy.Kerber@atlantaga.ncr.com
(408) 244-0624   fax: (408) 260-8551

-----Original Message-----

Dear Friend,

I have a problem that I would like to share with you, and to receive your opinion on if possible. Here is the picture of the situation:

1. I'm presently involved in a customer profiling project for a large mobile operator.
2. The goal is to set up a system able to anticipate the likelihood of customer churn.
3. As a pilot step I extracted call records for 10,000 active customers plus 4,000 churned.
4. Using SPSS Neural Connection I built a neural network from a set of 4,000 active + 4,000 churned.
5. The data was the calling patterns of July, August and September; the target was the churn/no-churn status in December.
6. The results were promising: 90% of real churn anticipated, with a cut-off probability of 80%.
7. The same network was used on October, November and December data to anticipate March churn; the results dropped to a terrible 11% with the same cut-off of 80%: totally useless.

I have formulated some hypotheses:

A. The low time span (three months) is affected by seasonality.
B. The data used are not sufficient to build a reliable network.
C. The tool (SPSS Neural Connection) is not reliable.

Could you give your opinion? Many thanks in advance.