Retention is regularly quoted as the number one priority of CEOs and CMOs (Forbes 2011, 2014). Despite recent advances in machine-learning-based prediction tools for customer defection, companies still have difficulty managing defection effectively. As a result, defection rates remain high. This puts pressure on customer intelligence departments to develop more reliable predictive and prescriptive analytics for retention management.
My research agenda is articulated around the following questions: whom to target, when to target, with what retention incentive, and, finally, “and so what”: what is the expected return on investment of the action, and how can we engage retained customers? These are the building blocks of a data-driven, customer-centric decision support system for retention management. The goal is to build the system on top of the data infrastructure of organizations and to inform them in real time about the optimal targeting policy. Customers interact with organizations through myriad touch points in multiple channels and media, and each interaction provides useful information on the customer’s attitude and future behavior throughout the customer journey.
Customer-Centric Decision Support System for Retention Management
- Whom to Target?
- With What Incentive?
- When to Target?
- And so What?
In my earlier work with Christophe Croux, I showed how bagging and boosting, two ensemble methods from machine learning built on classification trees, can substantially improve the accuracy of churn predictions (i.e. whether a customer defects from the company, e.g. cancels her subscription in a contractual setting) and yield more profitable retention campaigns. Part of the challenge in modeling churn is that the dependent variable is a rare event, which calls for a balanced sampling scheme. In that paper, we show that balanced sampling is recommended and propose two correction methods to deal with the bias that the sampling scheme introduces. In a sequel paper, we propose an improvement of the bagging algorithm, called trimmed bagging, which, unlike the original bagging, can be used in conjunction with low-variance base classifiers such as support vector machines.
In my ongoing research, I focus on two key challenges that arise when answering the “whom to target” question. First, the profitability of retention campaigns depends not only on the accuracy of churn predictions but also on other elements, which we take into account in the project with Sunil Gupta described below. Second, the idea that a single best prediction method outperforms all others across all business contexts and applications is a myth. In a project with Bas Donkers and Peter Verhoef, we show that analysts are better off using a portfolio approach that balances the risks and returns associated with each method.
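To make the bias introduced by balanced sampling concrete, here is a minimal sketch of one standard correction, which rescales the odds using the true and the sample churn rates. It is an illustration only and not necessarily one of the two corrections proposed in the paper; the function name and the example rates are hypothetical.

```python
def correct_for_balanced_sampling(p_balanced, true_rate, sample_rate=0.5):
    """Map a churn probability estimated on a balanced sample back to the
    population scale by reweighting churners and non-churners."""
    if not 0.0 < true_rate < 1.0 or not 0.0 < sample_rate < 1.0:
        raise ValueError("rates must lie strictly between 0 and 1")
    # Churners are over-represented in the balanced sample, stayers under-represented.
    churn_weight = true_rate / sample_rate
    stay_weight = (1.0 - true_rate) / (1.0 - sample_rate)
    numerator = p_balanced * churn_weight
    return numerator / (numerator + (1.0 - p_balanced) * stay_weight)


# Hypothetical setting: 2% of customers churn, the model was trained on a 50/50 sample.
corrected = correct_for_balanced_sampling(0.5, true_rate=0.02)
```

With these assumed rates, a score of 0.5 on the balanced sample maps back to the 2% base rate. The ranking of customers is unchanged; only the probability scale is.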
- Managing Churn to Maximize Profits, with Sunil Gupta (Harvard Business School), forthcoming at Marketing Science
This project focuses on the “whom to target” decision in order to maximize the return on investment of retention incentives. While both research and business practice generally advocate targeting customers based on their defection probability, we develop a new algorithm based on Stochastic Gradient Boosting that makes it possible to target customers based on the expected profit that the targeting action would generate. The algorithm uses a loss function that takes the determinants of profitability into account, namely (1) future spending, (2) probability of accepting the offer, and (3) incentive cost, and optimizes the ROI of the retention campaign using gradient descent. We have already tested our methodology on various data sets obtained from different companies (a Belgian TV subscription provider, a US telco, and a membership organization) and are looking for more applications. Our work has already received press coverage from several media outlets, including Forbes.
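The difference between targeting on defection probability and targeting on expected profit can be illustrated with a toy ranking. The sketch below is not the loss function used in the project. It assumes a deliberately simplified profit score (saved revenue minus expected incentive cost), and the customers and all their numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    churn_prob: float       # probability of defecting if left alone
    accept_prob: float      # probability of accepting the retention offer
    future_spending: float  # spending retained if the customer stays
    incentive_cost: float   # cost of the offer when it is accepted


def expected_profit(c):
    # Simplifying assumption: an accepted offer retains a would-be churner,
    # and the incentive is paid whenever the offer is accepted.
    saved_revenue = c.churn_prob * c.accept_prob * c.future_spending
    expected_cost = c.accept_prob * c.incentive_cost
    return saved_revenue - expected_cost


def top_k(customers, key, k):
    return [c.name for c in sorted(customers, key=key, reverse=True)[:k]]


customers = [
    Customer("A", 0.90, 0.10, 50.0, 20.0),
    Customer("B", 0.40, 0.60, 400.0, 20.0),
    Customer("C", 0.70, 0.50, 30.0, 20.0),
    Customer("D", 0.20, 0.80, 600.0, 20.0),
]

by_risk = top_k(customers, lambda c: c.churn_prob, 2)  # ['A', 'C']
by_profit = top_k(customers, expected_profit, 2)       # ['B', 'D']
```

In this made-up example the two criteria select entirely different customers: the highest-risk customers are not the ones for whom the action pays off most.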
- Whom to Target: Managing Risk when Predicting Retention, with Bas Donkers (ESE) and Peter Verhoef (Groningen)
This project tackles an important problem that organizations and academics alike face when using predictive analytics: the predictive performance of currently available algorithms (including the most advanced machine learning tools) varies widely across business contexts and thus entails a sizeable risk for those who rely on them. We propose to capitalize on this risk (rather than seeing it as a problem) by developing a portfolio diversification algorithm for predictive analytics. The method builds on the risk-return tradeoff and uses the notion of an efficient portfolio to combine prediction methods optimally. The project is based on a large Monte Carlo simulation study and the analysis of 13 real data sets (insurance, banking, telco, public transportation, etc.). To ensure the feasibility of this project, I made use of the LISA cluster computing facilities, which allowed us to run more than 1 million models.
Many firms rely on reactive management, mostly because they lack sufficient analytic resources or fear the financial consequences of false positives and false negatives. As a result, they wait for unhappy clients to call in to cancel their contract and react by offering special incentives to change their minds. Reactive actions require less analytics but often come too late. Consequently, firms inflate the financial incentives they offer to change their clients’ decisions. The advantages and drawbacks inherent to reactive vs. proactive interventions suggest that firms might be better off combining both approaches and, in particular, determining the optimal contact timing for every customer. Surprisingly, there is barely any research that guides decision makers in this choice. Existing work focuses exclusively either on whom to target in proactive campaigns or on the conditions for successful win-back offers. Proactive and reactive interventions are hardly ever considered jointly. In a project with Maurits Kaptein and our PhD student Zoltan Puha, we address this challenge.
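The risk-return logic can be sketched for the simplest case of two prediction methods. The code below treats each method's performance across business contexts like the return of an asset and computes the minimum-variance mix. It is only an illustration of the portfolio idea, not the project's algorithm: it assumes that the combined performance is the weighted average of the individual performances, and the performance figures are invented.

```python
from statistics import mean


def covariance(xs, ys):
    """Sample covariance; covariance(xs, xs) is the sample variance."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def min_variance_weight(perf_a, perf_b):
    """Weight on method A that minimizes the variance of the two-method mix."""
    var_a = covariance(perf_a, perf_a)
    var_b = covariance(perf_b, perf_b)
    cov_ab = covariance(perf_a, perf_b)
    denom = var_a + var_b - 2.0 * cov_ab
    if denom == 0.0:
        return 0.5  # methods are indistinguishable in risk; split evenly
    w = (var_b - cov_ab) / denom
    return min(1.0, max(0.0, w))  # keep the weight between 0 and 1


def portfolio_variance(w, perf_a, perf_b):
    combined = [w * a + (1.0 - w) * b for a, b in zip(perf_a, perf_b)]
    return covariance(combined, combined)


# Hypothetical performance (e.g. lift) of two methods on five data sets.
perf_a = [2.0, 3.5, 1.5, 4.0, 2.5]
perf_b = [3.0, 2.0, 3.5, 2.0, 3.0]

w_a = min_variance_weight(perf_a, perf_b)
risk_mix = portfolio_variance(w_a, perf_a, perf_b)
risk_a = covariance(perf_a, perf_a)
risk_b = covariance(perf_b, perf_b)
```

Because the two hypothetical methods do well on different data sets, the mix is far less variable across contexts than either method alone, which is the diversification effect the portfolio approach exploits.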
- Proactive vs Reactive Retention Strategies using Multi-Armed Bandit, with Maurits Kaptein (Jheronimus Academy of Data Science) and Zoltan Puha (Tilburg)
This project focuses on the “when to target” question. The goal is to determine, in real time, the optimal time to target customers. A key decision is whether to contact a customer before she indicates that she wants to churn, or to wait and make her a special offer after she calls to cancel her subscription. We are currently developing a theoretical framework for the underlying act/react problem. A key challenge is posed by the omitted counterfactuals. To address this, we rely on a multi-armed bandit model combined with machine learning classification tools and Thompson sampling to balance exploration and exploitation.
So far, there is not much work on the best choice of retention offers to make to customers in order to retain them. In fact, most research does not even use data from actual retention campaigns. A few exceptions observe customer responses to one type of offer. These offers usually take the form of “thank you” emails, small gifts or discounts for future purchases.
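As a rough illustration of how Thompson sampling balances exploration and exploitation, the sketch below runs a plain Beta-Bernoulli bandit with two arms, “proactive” and “reactive”. It is a simplification under stated assumptions: it ignores customer covariates and the classification models used in the project, and the retention rates are simulated, hypothetical values rather than results.

```python
import random


def thompson_sampling(true_rates, n_rounds, seed=0):
    """Simulate a Beta-Bernoulli Thompson sampler; return pulls per arm."""
    rng = random.Random(seed)
    arms = list(true_rates)
    successes = {arm: 0 for arm in arms}
    failures = {arm: 0 for arm in arms}
    pulls = {arm: 0 for arm in arms}
    for _ in range(n_rounds):
        # Draw one plausible retention rate per arm from its Beta posterior
        # (uniform Beta(1, 1) prior), then play the arm with the best draw.
        draws = {
            arm: rng.betavariate(1 + successes[arm], 1 + failures[arm])
            for arm in arms
        }
        chosen = max(draws, key=draws.get)
        # Simulated outcome: was the customer retained?
        if rng.random() < true_rates[chosen]:
            successes[chosen] += 1
        else:
            failures[chosen] += 1
        pulls[chosen] += 1
    return pulls


# Hypothetical retention rates, used only to drive the simulation.
pulls = thompson_sampling({"proactive": 0.6, "reactive": 0.3}, n_rounds=2000)
```

Early on, both arms are tried because their posteriors overlap. As evidence accumulates, the sampler concentrates on the arm with the higher simulated retention rate while still occasionally testing the other.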
- Agency as Retention Incentive, with Emilie Estrezon (ULB) and Bram Van Den Berg (RSM)
This project focuses on the “what retention offer to make” question and, in particular, on the potential for non-monetary incentives to foster customer retention. Based on two large-scale field experiments we ran last year with two well-known charities in Belgium, we developed a novel framework based on the notion of agency theory in order to retain donors and nudge them to donate more. The manipulation consists of giving donors the perceived ability to choose between projects they would invest in. Our results show a large positive impact on donor retention and future donation behavior.
A key element of a good decision support system is the ability to answer the question “and so what did we gain?” Evaluating the return on investment of retention campaigns remains a challenge. In particular, it requires an accurate estimation of the customer lifetime value (CLV) of the retained customers. In a project with Nicolas Glady and Christophe Croux, we propose to use copulas to improve CLV prediction across four distinct product categories, namely online music album sales, securities transactions, and utilitarian and hedonic fast-moving consumer goods retail sales. Our methodology allows organizations to understand the relationship between the various sources of a customer’s value, i.e. how much she spends, how often, and when she defects.
Key challenges remain when evaluating the profitability of retention campaigns. First, as organizations do not observe the counterfactual outcome of not targeting a customer with a retention campaign, one needs to rely on randomized field experiments to draw causal inferences. In the project “Managing Churn to Maximize Profits” described above, we use the Imbens and Rubin causal inference framework and inverse propensity scores to make so-called “offline evaluations” of the profitability of the retention actions.
Second, the value of customers goes beyond the notion of customer lifetime value. In fact, customers also contribute to the welfare of an organization through their engagement and their ability to convince other customers to join via word-of-mouth. In an earlier project with Tammo Bijmolt and colleagues, we discuss the state of the art of models for customer engagement analytics and the problems inherent to calibrating and implementing these models. Moreover, in an ongoing project with Rik Pieters and our PhD student Constant Pieters, described next, we investigate how word-of-mouth propagates.
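The idea of an offline evaluation with inverse propensity scores can be sketched as follows. The estimator below is the textbook inverse-propensity-weighted average, shown under the assumption that the logged data come from a randomized experiment with known targeting probabilities. The log entries, profit figures and policy names are hypothetical, and this is not the project's actual evaluation code.

```python
def ips_value(logs, policy):
    """Estimate the average profit per customer of `policy` from logged data.

    Each log entry is (context, logged_action, propensity, observed_profit),
    where propensity is the probability with which the logged action was
    chosen in the experiment.
    """
    total = 0.0
    for context, action, propensity, profit in logs:
        # Only customers whose logged action agrees with the policy contribute,
        # reweighted by how likely that action was in the experiment.
        if policy(context) == action:
            total += profit / propensity
    return total / len(logs)


# Hypothetical log from a 50/50 randomized experiment.
logs = [
    ("high_risk", "target", 0.5, 30.0),
    ("high_risk", "skip", 0.5, 0.0),
    ("low_risk", "target", 0.5, -5.0),
    ("low_risk", "skip", 0.5, 10.0),
]


def target_high_risk(context):
    return "target" if context == "high_risk" else "skip"


def target_everyone(context):
    return "target"


value_selective = ips_value(logs, target_high_risk)  # 20.0
value_blanket = ips_value(logs, target_everyone)     # 12.5
```

The reweighting lets one compare targeting policies that were never run, using only the outcomes observed under randomization.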
- The Mediating Role of Customer Satisfaction on Referral Cascades, with Constant Pieters (Tilburg) and Rik Pieters (Tilburg)
Information cascades in the domain of product referrals emerge when the recipient of a referral to use a service or acquire a product acts on it and passes the referral on in his or her network. Referrals are a key source of information for customers, help them in their purchase decisions, and constitute a key engine of growth for firms. Conventional wisdom suggests that information cascades emerge because referred customers are more satisfied than non-referred customers, which results from the matchmaking ability of referrers. Surprisingly few studies have actually tested this belief. Based on three independent studies with different methodologies, we find that satisfaction does indeed mediate the impact of being referred on the likelihood of referring, but that it accounts for less than half of the referral propagation effect. The mere fact of receiving information via a referral increases the probability of referring in turn, thus bringing more information to the market and leading to longer information cascades.