Improving response from the house file: beyond RFM (Part 1)

October 18, 2010


One sure source of income for any retailer is people who have bought from them before. If the product you’ve sold is good (and this includes customer service), it is a reasonable assumption that the customer would want to buy from you again.

If you believe this assumption, then a large part of your marketing budget should be spent on attracting these people to buy from you again. You can spend money on ensuring top-quality after-sales customer service, for instance. But here we talk about your direct marketing efforts: reaching the customers in your ‘house file’ via catalogs, mailers, emails and anything else that might showcase the products you want to sell them, products that they might want to buy.

This leads to a lot of relevant questions: What should you sell to which customer? How much of your budget should you allocate for direct marketing to existing customers? We do have answers to all of those questions! Do contact us if you’d like to hear more. We’ll come back to those questions later; in this series of posts, we focus on response modeling.

Response modeling is just a term for the techniques used to identify which customers to focus on first when precious marketing dollars are being spent on attracting them back. After all, your CEO is not going to give you an unlimited budget! In a perfect world, if you knew which 50K of your 500K customers were going to buy from you again in the next two months, you’d mail just those 50K people your catalog (or other mailers, emails, etc.), and they’d see your products and dutifully buy them! But it is not a perfect world, alas. So you need a magic wand: good response modeling! The better your response modeling techniques, the more revenue you can generate from your customers at a lower marketing cost, improving your ROI. If your response model is good, you’ll end up contacting the more responsive customers, who will return and buy. This not only increases the size of your L12M (last 12 month buyers) file, it also fills that file with people who are more likely to return and buy from you again.


The industry standard for identifying whom to mail is called RFM (Recency, Frequency, Monetary Value). Some companies employ a slight variation or an extended version to improve results. RFM is intuitive and easy to understand and interpret. However, it has limited utility.

The idea is simple: if a customer purchased from you recently, or purchases from you frequently, or has spent a lot of money with you, they should be sent a catalog, because they are likely to want to buy from you again. There is no argument that ‘how many days has it been since the customer last bought from us?’ (Recency), ‘how many times has the customer bought from us?’ (Frequency) and ‘how much revenue has the customer generated for us?’ (Monetary value) are all excellent variables for predicting whether the customer will return to make another purchase. This approach has been used for years and has definitely yielded some good results.
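
As a rough illustration, the three variables can be derived from a plain order history. The sketch below is a minimal example, not a production implementation: the `orders` data, the `rfm_table` function and the "as of" date are all made up for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order history: (customer_id, order_date, order_amount)
orders = [
    ("A", date(2010, 9, 30), 40.0),
    ("A", date(2010, 6, 12), 25.0),
    ("B", date(2009, 12, 1), 300.0),
    ("C", date(2010, 8, 15), 15.0),
    ("C", date(2010, 7, 4), 20.0),
    ("C", date(2010, 3, 22), 35.0),
]


def rfm_table(orders, as_of):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    last_order = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, order_date, amount in orders:
        # Recency is measured from the most recent order only.
        if customer_id not in last_order or order_date > last_order[customer_id]:
            last_order[customer_id] = order_date
        frequency[customer_id] += 1      # number of orders placed
        monetary[customer_id] += amount  # total spend to date
    return {
        cid: ((as_of - last_order[cid]).days, frequency[cid], monetary[cid])
        for cid in last_order
    }


rfm = rfm_table(orders, as_of=date(2010, 10, 18))
for customer_id, (recency, freq, spend) in sorted(rfm.items()):
    print(customer_id, recency, freq, spend)
```

Here Monetary value is taken as total spend; some companies use average order value instead, which is one of the variations mentioned above.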

Limitations of RFM

However, this technique has its limitations. At best, it is a model that uses just these three highly intuitive predictor variables to predict whether a customer will return or not. For instance, it severely restricts how companies use their own data. Many other variables can be derived from a company’s data and can serve as excellent additional predictors.

It also limits the terms in which marketing people think about predicting response. Typically, most companies divide all their customers into RFM ‘segments’. Each segment is defined by values of these three attributes: R, F and M. For example, one segment could be ‘Customers who last bought 3-6 months ago, who have bought 3-5 times, and who have spent $100 or more’. Then all segments are ranked from most to least valuable, and all customers from the top X most valuable segments are mailed. This makes intuitive sense, but it confines marketing thinking to these segments and makes it hard for people to think in any other terms.
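
The segment-and-rank procedure described above can be sketched in a few lines. Everything here is an assumption made for illustration: the band cutoffs, the tiny `house_file`, and the choice to rank segments by their response rate in a past mailing (companies may define segment "value" differently).

```python
from collections import defaultdict

# Hypothetical house file:
# id -> (recency_days, frequency, monetary, responded_to_last_mailing)
house_file = {
    "A": (30, 4, 220.0, True),
    "B": (45, 3, 150.0, True),
    "C": (200, 1, 40.0, False),
    "D": (210, 1, 35.0, True),
    "E": (400, 6, 500.0, False),
    "F": (60, 3, 120.0, False),
}


def band(value, cutoffs):
    """Number of cutoffs the value meets or exceeds (0 = lowest band)."""
    return sum(value >= cutoff for cutoff in cutoffs)


def segment(recency_days, frequency, monetary):
    """Map a customer to an (R, F, M) band triple; cutoffs are illustrative."""
    r = band(recency_days, [90, 180, 365])  # band 1 is roughly 3-6 months ago
    f = band(frequency, [2, 3, 6])          # band 2 is 3-5 purchases
    m = band(monetary, [50, 100])           # band 2 is $100 or more
    return (r, f, m)


def segment_response_rates(house_file):
    """Response rate of each segment in the past mailing."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [responders, customers]
    for r, f, m, responded in house_file.values():
        seg = segment(r, f, m)
        counts[seg][0] += int(responded)
        counts[seg][1] += 1
    return {seg: resp / n for seg, (resp, n) in counts.items()}


def mail_list(house_file, top_x):
    """Mail every customer who falls in one of the top X segments."""
    rates = segment_response_rates(house_file)
    ranked = sorted(rates, key=rates.get, reverse=True)
    chosen = set(ranked[:top_x])
    return sorted(
        cid for cid, (r, f, m, _) in house_file.items()
        if segment(r, f, m) in chosen
    )


print(mail_list(house_file, top_x=1))
```

Note that the decision is made per segment: once a segment is chosen, every customer in it is mailed, whatever their individual history looks like.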

Thus, good, responsive customers might end up in not-so-valuable segments and get ignored for mailings. The opposite might happen as well: not-so-great customers might end up in valuable segments and get mailed. A response model that ranks customers by their own value, as opposed to the value of their segment, can take care of this problem.
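
A toy comparison shows the difference between the two selection rules. The segment labels and predicted response probabilities below are invented, and the sketch assumes a per-customer score is already available from some response model; it says nothing about how that score is produced.

```python
# Hypothetical scored file: id -> (segment_label, predicted_response_probability)
scored = {
    "A": ("top", 0.30),
    "B": ("top", 0.05),
    "C": ("low", 0.25),
    "D": ("low", 0.02),
}


def mail_by_segment(scored, segments_to_mail):
    """Segment rule: mail everyone in the chosen segments."""
    return sorted(cid for cid, (seg, _) in scored.items() if seg in segments_to_mail)


def mail_by_customer(scored, budget):
    """Customer rule: mail the `budget` customers with the highest scores."""
    ranked = sorted(scored, key=lambda cid: scored[cid][1], reverse=True)
    return sorted(ranked[:budget])


print(mail_by_segment(scored, {"top"}))    # B is mailed despite a low score
print(mail_by_customer(scored, budget=2))  # C replaces B at the same mail volume
```

With the same two-piece budget, the segment rule mails B and skips C, while the customer-level rule does the reverse.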

Mathematically, another limitation is that the joint effects of variables often get ignored; at best, RFM segmentation is a CHAID-type (Chi-squared Automatic Interaction Detection) analysis.

Next up is a series of examples of our approach to response modeling. At every client where we have implemented a response model, we have seen at least a 20% improvement over RFM models. In each case we approached the client’s data in a customized manner, gave the data some pre-modeling treatment (removing outliers, etc.) and employed the most suitable modeling techniques to produce the best results. Stay tuned!