Why Data Mining in CRM?

“CRM is about acquiring and retaining customers, improving customer loyalty, gaining customer insight, and implementing customer-focused strategies. A true customer-centric enterprise helps your company drive new growth, maintain competitive agility, and attain operational excellence.” SAP

Customer Relationship Management (CRM) is a business philosophy involving identifying, understanding and better providing for your customers while building a relationship with each customer to improve customer satisfaction and maximise profits. It’s about understanding, anticipating and responding to customers’ needs.

To manage the relationship with the customer, a business needs to collect the right information about its customers and organise that information for proper analysis and action. It needs to keep that information up-to-date, make it accessible to employees, and provide the know-how for employees to convert that data into products better matched to customers’ needs.

The secret to an effective CRM package is not just in what data is collected but in the organising and interpretation of that data. Computers can’t, of course, transform the relationship you have with your customer. That takes a cross-department, top-to-bottom, corporate desire to build better relationships. But a good computer-based CRM solution can increase sales by as much as 40-50%, as some studies have shown.

This is where Data Mining, Artificial Intelligence, and intelligent search applications come in. Wait, back up a minute, what are all these terms, you ask…

A good CRM application will provide the facility for the business to store and manage the data it collects on its customers and products. A better CRM will be able to group the data, convert them into information and display that information in its search results whenever a user types in a word that matches the group of keywords associated with the query.

Okay, before we proceed, let’s get an understanding of what data mining is about…

Data mining, a branch of computer science,[1] is the process of extracting patterns from large data sets by combining methods from statistics and artificial intelligence with database management. Data mining is seen by modern business as an increasingly important tool for transforming data into business intelligence, giving an informational advantage. It is currently used in a wide range of profiling practices, such as marketing, surveillance, fraud detection, and scientific discovery.

The related terms data dredging, data fishing and data snooping refer to the use of data mining techniques to sample portions of the larger population data set that are (or may be) too small for reliable statistical inferences to be made about the validity of any patterns discovered. These techniques can, however, be used in the creation of new hypotheses to test against the larger data populations.

Basically, data can be collected, but on their own they are just random numbers and words. Arranging these data into meaningful information used to be a tedious and arduous task for the people compiling them. Over time, the theoretical knowledge of statisticians was translated into programs, and data mining applications were developed.

The manual extraction of patterns from data has occurred for centuries. Early methods of identifying patterns in data include Bayes’ theorem (1700s) and regression analysis (1800s). The proliferation, ubiquity and increasing power of computer technology have increased data collection, storage and manipulation. As data sets have grown in size and complexity, direct hands-on data analysis has increasingly been augmented with indirect, automatic data processing. This has been aided by other discoveries in computer science, such as neural networks, clustering, genetic algorithms (1950s), decision trees (1960s) and support vector machines (1990s). Data mining is the process of applying these methods to data with the intention of uncovering hidden patterns.[2] It has been used for many years by businesses, scientists and governments to sift through volumes of data such as airline passenger trip records, census data and supermarket scanner data to produce market research reports. (Note, however, that reporting is not always considered to be data mining.)

A primary reason for using data mining is to assist in the analysis of collections of observations of behaviour. Such data are vulnerable to collinearity because of unknown interrelations. An unavoidable fact of data mining is that the subset(s) of data being analysed may not be representative of the whole domain, and therefore may not contain examples of certain critical relationships and behaviours that exist across other parts of the domain. To address this sort of issue, the analysis may be augmented using experiment-based and other approaches, such as Choice Modelling for human-generated data. In these situations, inherent correlations can be either controlled for, or removed altogether, during the construction of the experimental design.
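As a rough illustration of how collinearity might be spotted before any modelling starts, the sketch below computes the Pearson correlation between pairs of customer attributes. The attribute names and figures are invented for the example; a correlation close to 1 or -1 suggests that two variables carry much the same information.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical observations for five customers (invented for illustration).
visits_per_month = [2, 4, 6, 8, 10]
spend_per_month = [21, 39, 62, 80, 98]   # rises almost in step with visits
age = [34, 51, 27, 45, 38]               # no obvious link to visits

r_visits_spend = pearson(visits_per_month, spend_per_month)
r_visits_age = pearson(visits_per_month, age)

print(f"visits vs spend: {r_visits_spend:.3f}")
print(f"visits vs age:   {r_visits_age:.3f}")
```

Here visits and spend are almost perfectly correlated, so a model using both would struggle to separate their effects, while visits and age are close to independent.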

Data mining commonly involves four classes of tasks:[12]

  • Clustering – is the task of discovering groups and structures in the data that are in some way or another “similar”, without using known structures in the data.
  • Classification – is the task of generalising known structure to apply to new data. For example, an email program might attempt to classify an email as legitimate or spam.
  • Regression – attempts to find a function which models the data with the least error.
  • Association rule learning – searches for relationships between variables. For example, a supermarket might gather data on customer purchasing habits. Using association rule learning, the supermarket can determine which products are frequently bought together and use this information for marketing purposes. This is sometimes referred to as market basket analysis.
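To make the supermarket example concrete, here is a minimal sketch of market basket analysis using only the Python standard library. The baskets are invented, and the two measures shown (support and confidence) are the standard ones for association rules; real tools use more efficient algorithms on far larger data sets.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets (invented for illustration).
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "cereal"},
    {"bread", "butter", "jam"},
]

# Count how often each item and each pair of items appears.
item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    pair_counts.update(combinations(sorted(basket), 2))

def rule_stats(antecedent, consequent):
    """Support and confidence for the rule 'antecedent => consequent'."""
    pair = tuple(sorted((antecedent, consequent)))
    support = pair_counts[pair] / len(baskets)                  # share of all baskets
    confidence = pair_counts[pair] / item_counts[antecedent]    # share of antecedent baskets
    return support, confidence

support, confidence = rule_stats("bread", "butter")
print(f"bread => butter: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data, bread and butter appear together in three of five baskets, and three of the four baskets containing bread also contain butter, which is the kind of pattern a marketer might act on.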

With technology advancing in leaps and bounds, data mining is increasingly being built into customer relationship management applications. Rather than randomly contacting a prospect or customer through a call center or by mail, a company can concentrate its efforts on prospects that are predicted to have a high likelihood of responding to an offer. More sophisticated methods may be used to optimise resources across campaigns so that one may predict which channel and which offer an individual is most likely to respond to, across all potential offers. Additionally, applications could be used to automate the mailing. Once the results from data mining (potential prospect/customer and channel/offer) are determined, these applications can be programmed either to send e-mail or regular mail automatically, or to let a user send mail to customers in bulk with a few clicks. Of course, the issues of bulk mail and spamming should be given due consideration here; the onus is on the business to ensure that its mass mailing is not construed as spam.
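One very simple way to picture channel selection is to look at past campaign results and pick, for each customer segment, the channel with the best observed response rate. The sketch below does exactly that; the segment names, channels and history are all hypothetical, and a real system would more likely use a trained predictive model than raw historical rates.

```python
from collections import defaultdict

# Hypothetical campaign history: (customer_segment, channel, responded).
history = [
    ("young_urban", "email", True),
    ("young_urban", "email", True),
    ("young_urban", "email", False),
    ("young_urban", "post", False),
    ("young_urban", "post", False),
    ("retired", "email", False),
    ("retired", "email", False),
    ("retired", "post", True),
    ("retired", "post", True),
    ("retired", "post", False),
    ("retired", "post", True),
]

def response_rates(records):
    """Observed response rate for every (segment, channel) pair."""
    sent = defaultdict(int)
    responded = defaultdict(int)
    for segment, channel, did_respond in records:
        sent[(segment, channel)] += 1
        responded[(segment, channel)] += int(did_respond)
    return {key: responded[key] / sent[key] for key in sent}

def best_channel(rates, segment):
    """Channel with the highest observed response rate for a segment."""
    candidates = {ch: r for (seg, ch), r in rates.items() if seg == segment}
    return max(candidates, key=candidates.get)

rates = response_rates(history)
for segment in ("young_urban", "retired"):
    print(segment, "->", best_channel(rates, segment))
```

With so few records the rates are not statistically reliable; the point is only to show the shape of the decision, not a production approach.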

Finally, in cases where many people will take an action without an offer, uplift modeling can be used to determine which people will have the greatest increase in responding if given an offer. Data clustering can also be used to automatically discover the segments or groups within a customer data set.
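The idea behind uplift can be sketched with a small calculation: compare the response rate of people who received an offer with that of a similar group who did not, and rank segments by the difference. The figures below are invented, and comparing whole segments is a simplification; full uplift modeling estimates this difference for individual customers.

```python
# Hypothetical results of a test in which part of each segment received
# an offer ("treated") and part did not ("control").
experiment = {
    "loyal":      {"treated": 200, "treated_resp": 90, "control": 200, "control_resp": 86},
    "occasional": {"treated": 200, "treated_resp": 50, "control": 200, "control_resp": 20},
    "lapsed":     {"treated": 200, "treated_resp": 12, "control": 200, "control_resp": 10},
}

def uplift(stats):
    """Increase in response rate attributable to the offer."""
    treated_rate = stats["treated_resp"] / stats["treated"]
    control_rate = stats["control_resp"] / stats["control"]
    return treated_rate - control_rate

uplift_by_segment = {segment: uplift(stats) for segment, stats in experiment.items()}

# Segments where the offer makes the biggest difference come first.
ranked = sorted(uplift_by_segment, key=uplift_by_segment.get, reverse=True)
for segment in ranked:
    print(f"{segment}: uplift {uplift_by_segment[segment]:+.2f}")
```

Note that the loyal segment has the highest response rate overall but very little uplift, because most of those customers would have acted anyway; the occasional segment is where the offer changes behaviour most.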

Businesses employing data mining may see a return on investment, but they also recognise that the number of predictive models can quickly become very large. Rather than one model to predict how many customers will churn, a business could build a separate model for each region and customer type. Then, instead of sending an offer to everyone who is likely to churn, it may want to send offers only to those customers who are likely to take them up. And finally, it may also want to determine which customers are going to be profitable over a window of time and only send the offers to those that are likely to be profitable. In order to maintain this quantity of models, businesses need to manage model versions and move to automated data mining.
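To see how quickly the number of models grows, and what managing versions might involve, consider the following sketch of a tiny model registry keyed by region and customer type. The class name, regions, customer types and the placeholder "models" are all assumptions made for illustration, not part of any particular CRM product.

```python
from dataclasses import dataclass, field
from itertools import product

@dataclass
class ModelRegistry:
    """Keeps every version of every model, keyed by (region, customer_type)."""
    _versions: dict = field(default_factory=dict)

    def register(self, region, customer_type, model):
        """Store a new version and return its 1-based version number."""
        versions = self._versions.setdefault((region, customer_type), [])
        versions.append(model)
        return len(versions)

    def latest(self, region, customer_type):
        """Most recently registered model for this region and customer type."""
        return self._versions[(region, customer_type)][-1]

    def count(self):
        """Number of distinct (region, customer_type) models being maintained."""
        return len(self._versions)

# Hypothetical segmentation of the customer base.
regions = ["north", "south", "east", "west"]
customer_types = ["consumer", "small_business", "enterprise"]

registry = ModelRegistry()
for region, customer_type in product(regions, customer_types):
    # Placeholder "model": in practice this would be a trained churn model.
    registry.register(region, customer_type, {"churn_threshold": 0.5})

# Retraining one segment creates a second version alongside the first.
new_version = registry.register("north", "consumer", {"churn_threshold": 0.4})
print(registry.count(), "models; north/consumer is now at version", new_version)
```

Four regions and three customer types already give twelve churn models; adding separate response and profitability models for each would triple that, which is why automation becomes necessary.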

An example of a CRM application would be in a car manufacturing business (assuming they sell directly to end users). If they maintained a database of which customers buy what type of product, and when, how often they make that purchase, what type of options they choose with their typical purchase, their colour preferences, whether the purchase needed financing etc., the manufacturer knows what marketing material to send out, what new products to promote to each customer, what preferences/options may swing the sale, whether a finance package should be included in the marketing material and when would be a good time to target each customer. They could use the information to build a relationship with the customer by reminding customers of service dates, product recalls, and maybe even to send the customer a birthday card.
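A minimal sketch of the kind of record such a manufacturer might keep, and of a service reminder built on it, could look like this. The field names, the yearly service interval and the 30-day notice period are assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Purchase:
    customer: str
    model: str
    colour: str
    financed: bool
    purchased_on: date

SERVICE_INTERVAL = timedelta(days=365)   # assumed: one service per year

def next_service_date(purchase, today):
    """First service date on or after today, counting from the purchase date."""
    due = purchase.purchased_on + SERVICE_INTERVAL
    while due < today:
        due += SERVICE_INTERVAL
    return due

def due_for_reminder(purchases, today, notice=timedelta(days=30)):
    """Customers whose next service falls within the notice period."""
    return [p.customer for p in purchases
            if next_service_date(p, today) - today <= notice]

# Hypothetical purchase history.
purchases = [
    Purchase("A. Jones", "hatchback", "red", True, date(2023, 3, 1)),
    Purchase("B. Smith", "estate", "blue", False, date(2023, 9, 15)),
]

reminders = due_for_reminder(purchases, today=date(2024, 2, 10))
print("Send service reminders to:", reminders)
```

The same records could drive the other uses mentioned above, for example selecting customers who financed their last purchase when promoting a new finance package.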

A good place to start would be to make a list of your objectives and the benefits your organisation hopes to achieve. When looking at CRM solutions you want to check:

– the features and functionality “out of the box” (customisation is all very nice but it takes time and may not be as easy as you think)

– supported platforms in terms of hardware, operating systems, databases, online activities and online ordering systems (not just your back office systems but third-party software you use too)

– integration with those systems

– global perspective

– price – preferably a one-off purchase price with no annual licence fee.

Therefore, if you are looking to grow your business in leaps and bounds, and you know that the way to do it is to grow your customer base, improve your relationship with your customers, and gain real insight into their buying behaviour and patterns, then you need a CRM application.

Not just any CRM application. You need a CRM application that can collect the right information about your customers and organise that information for proper analysis and action: an application that keeps information up-to-date, makes it accessible to employees, and gives employees the know-how to convert that data into products better matched to customers’ needs.


[1] Clifton, Christopher (2010). “Encyclopedia Britannica: Definition of Data Mining”. http://www.britannica.com/EBchecked/topic/1056150/data-mining. Retrieved 2010-12-09.

[2] Kantardzic, Mehmed (2003). Data Mining: Concepts, Models, Methods, and Algorithms. John Wiley & Sons. ISBN 0471228524. OCLC 50055336.

[3] Alex Guazzelli, Wen-Ching Lin, Tridivesh Jena. PMML in Action: Unleashing the Power of Open Standards for Data Mining and Predictive Analytics. CreateSpace, 2010.

[4] The Data Mining Group (DMG). The DMG is an independent, vendor-led group which develops data mining standards, such as the Predictive Model Markup Language (PMML).

[5] PMML Project Page.

[6] Alex Guazzelli, Michael Zeller, Wen-Ching Lin, Graham Williams. PMML: An Open Standard for Sharing Models. The R Journal, vol 1/1, May 2009.

[7] Y. Peng, G. Kou, Y. Shi, Z. Chen (2008). “A Descriptive Framework for the Field of Data Mining and Knowledge Discovery”. International Journal of Information Technology and Decision Making 7 (4): 639–682. doi:10.1142/S0219622008003204.

[8] Proceedings, International Conferences on Knowledge Discovery and Data Mining, ACM, New York.

[9] SIGKDD Explorations, ACM, New York.

[10] International Conference on Data Mining: 5th (2009); 4th (2008); 3rd (2007); 2nd (2006); 1st (2005).

[11] IEEE International Conference on Data Mining: ICDM09, Miami, FL; ICDM08, Pisa (Italy); ICDM07, Omaha, NE; ICDM06, Hong Kong; ICDM05, Houston, TX; ICDM04, Brighton (UK); ICDM03, Melbourne, FL; ICDM02, Maebashi City (Japan); ICDM01, San Jose, CA.

[12] Fayyad, Usama; Gregory Piatetsky-Shapiro, and Padhraic Smyth (1996). “From Data Mining to Knowledge Discovery in Databases”. http://www.kdnuggets.com/gpspubs/aimag-kdd-overview-1996-Fayyad.pdf. Retrieved 2008-12-17.