
June 07, 2007

Recency Frequency Monetary Value: Primer

RFM Analysis

RFM analysis evaluates the recency, frequency, and monetary value of customer behaviors to determine the likelihood that a given customer will respond to a campaign. It is an empirical method that has long been applied to campaign planning and optimization as an alternative to segmenting a customer base by demographic means. The results of RFM analysis provide marketing departments the financial justification for specific marketing campaigns. RFM analysis provides the measurements for success from both planned and actual perspectives. The RFM analytic engine performs two processes:

    * Segmentation — Similar to clustering, RFM analysis segments customers into different target groups. The main distinction is that RFM analysis is focused on specific aspects of customer behavior, namely:

        1   Recency — When was the last purchase made? The most recent customers are sought based on the assumption that most-recent purchasers are more likely to purchase again than less-recent purchasers.

        2   Frequency — How often were purchases made? The most frequent customers are sought based on the assumption that customers with more purchases are more likely to buy products than customers with fewer purchases.

        3   Monetary value — What was the amount of the purchase? The biggest-spending customers are sought based on the assumption that purchasers who spent the most are more likely to purchase again than small spenders.

    * Response rate calculation — Once segments have been established, the response rate for each customer segment is calculated from the actual response rates of past campaigns. The resulting rates are saved and then used during the segment-building process, which models target groups by specifying attributes and building customer profiles for use in marketing activities such as running a campaign.
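
The response rate calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not how the RFM analytic engine is implemented: the record layout (a segment score paired with a responded flag) and the name `segment_response_rates` are assumptions made for the example, and the sample data is invented.

```python
from collections import defaultdict


def segment_response_rates(history):
    """Return {segment_score: response_rate} from past campaign contacts.

    `history` is an iterable of (segment_score, responded) pairs, where
    segment_score is an RFM score such as "555" and responded is a bool.
    This record layout is an assumption made for the example.
    """
    contacted = defaultdict(int)
    responded = defaultdict(int)
    for score, did_respond in history:
        contacted[score] += 1
        if did_respond:
            responded[score] += 1
    # Every score in `contacted` has at least one contact, so no division by zero.
    return {score: responded[score] / contacted[score] for score in contacted}


# Hypothetical contact history from one earlier, comparable campaign.
past_campaign = [
    ("555", True), ("555", True), ("555", False), ("555", False),
    ("111", False), ("111", False), ("111", False), ("111", True),
    ("111", False),
]
rates = segment_response_rates(past_campaign)
```

The saved rates can then be looked up by segment score when a new target group is built for a similar campaign.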

For the analysis to be effective, representative data must be used from prior campaigns. A campaign is considered sufficiently similar if the nature of the campaign and the customer target groups hold similar attributes. If historical data cannot be found for the representative target group, investments in learning must be made by launching new campaigns targeting the desired representative group so that RFM analysis can be applied. Using random, non-representative data for RFM analysis can render it useless.

To segment the customers, one needs to know which customers to segment, how many RFM segments to determine, and where to get the values for RFM analysis.  The segmentation starts with recency, then frequency, and finally monetary value. First, customers are ranked into recency segments and given a score based on the number of segments. Within the recency segments, the frequency segments are then ranked and scored. Finally, within the frequency segments, monetary value scores are determined.

Figure 7.4 RFM segmentation process.
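
The nested ranking (recency first, then frequency within each recency segment, then monetary value within each frequency segment) can be sketched as follows. This is only an illustration under stated assumptions: the customer fields (`days_since_last`, `orders`, `spend`), the equal-sized slicing, and both function names are hypothetical and are not taken from SAP BW.

```python
def split_into_ranked_groups(items, key, n_groups):
    """Sort items from worst to best by `key` and cut them into n_groups
    nearly equal slices. Returns a list of (score, slice); score 1 is worst."""
    ordered = sorted(items, key=key)
    groups = []
    for i in range(n_groups):
        start = i * len(ordered) // n_groups
        end = (i + 1) * len(ordered) // n_groups
        groups.append((i + 1, ordered[start:end]))
    return groups


def rfm_scores(customers, n_groups=5):
    """Return {customer_id: three-digit RFM score string}.

    Each customer is a dict with hypothetical keys: id, days_since_last
    (recency), orders (frequency), and spend (monetary value).
    """
    scores = {}
    # Fewer days since the last purchase is better, so negate for sorting.
    for r, r_group in split_into_ranked_groups(
            customers, lambda c: -c["days_since_last"], n_groups):
        # Frequency is ranked only among customers in the same recency group.
        for f, f_group in split_into_ranked_groups(
                r_group, lambda c: c["orders"], n_groups):
            # Monetary value is ranked only within the same frequency group.
            for m, m_group in split_into_ranked_groups(
                    f_group, lambda c: c["spend"], n_groups):
                for c in m_group:
                    scores[c["id"]] = f"{r}{f}{m}"
    return scores


# Invented sample data, scored with two groups per dimension to keep it small.
sample = [
    {"id": 1, "days_since_last": 1, "orders": 10, "spend": 500},
    {"id": 2, "days_since_last": 2, "orders": 9, "spend": 300},
    {"id": 3, "days_since_last": 3, "orders": 2, "spend": 20},
    {"id": 4, "days_since_last": 4, "orders": 1, "spend": 10},
    {"id": 5, "days_since_last": 50, "orders": 8, "spend": 90},
    {"id": 6, "days_since_last": 60, "orders": 7, "spend": 40},
    {"id": 7, "days_since_last": 70, "orders": 1, "spend": 5},
    {"id": 8, "days_since_last": 80, "orders": 2, "spend": 15},
]
scores = rfm_scores(sample, n_groups=2)
```

Because each dimension is ranked inside the group produced by the previous one, a frequency score of 2 in the top recency group is not comparable to a frequency score of 2 in the bottom recency group. That is a property of the nested approach, not of this sketch.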

Another consideration is the number of segments created relative to the number of records available, and the effect this has on the accuracy of the response rate. For example, if the default of five segments per RFM dimension is configured, 125 segments are calculated (5×5×5 = 125). With a customer base of 1,250,000 customers (10,000 customers per segment), one response changes the calculated response rate by a hundredth of a percent (1/10,000). With a customer base of 1,250 customers (10 customers per segment), one response swings the response rate by 10 percentage points. For smaller customer bases, the number of segments should be decreased.

There are two options for handling customers that have no data for the evaluation periods under consideration. The first option is the default, where all those customers fall to the bottom of the RFM analysis in the segment with the score of 111. The second option is to place these customers into a special segment with a score of 000, so that they fall out of the RFM analysis entirely (which prevents skewing of the segmentation results when many customers are missing data).
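
The segment-size arithmetic above can be checked with a short Python helper. It assumes customers are spread evenly across segments, which real data will not do exactly; the function name `response_rate_step` is hypothetical.

```python
def response_rate_step(customer_count, segments_per_dimension=5):
    """Return (segment_count, customers_per_segment, step).

    `step` is the change in one segment's response rate caused by a single
    extra response, assuming customers are spread evenly across segments.
    """
    segment_count = segments_per_dimension ** 3
    customers_per_segment = customer_count / segment_count
    return segment_count, customers_per_segment, 1 / customers_per_segment


large = response_rate_step(1_250_000)   # 125 segments of 10,000 customers
small = response_rate_step(1_250)       # 125 segments of 10 customers
coarse = response_rate_step(1_250, segments_per_dimension=2)  # 8 segments
```

Dropping the small customer base from five to two segments per dimension raises the average segment size from 10 to about 156 customers, so a single response moves the rate by well under one percentage point instead of ten.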

RFM


Hosting by Yahoo!

August 23, 2006

BEx Broadcaster

The BEx Broadcaster may be accessed from various entry points, depending on the object you are attempting to distribute. Broadcaster settings may be maintained for each of the distribution objects supported. Distributable objects include the following:

    *  Web template — An HTML container for Web items (for example, grids, charts, and so on) that bind to queries or query views

    *  Queries — See the section, “Query Processing Overview,” toward the beginning of this chapter.

    *  Query Views — See the section, “Query Processing Overview,” toward the beginning of this chapter.

    *  Workbooks — An Excel container for BEx Analyzer Queries

    *  Reports — An HTML-structured layout and format for queries and query views



Association Analysis Output Terms Used by SAP BW

Association Analysis Output Terms 

To configure association analysis in SAP BW and understand its output, one must first understand certain data mining terms:

    *  Support: This is a percentage of how often a collection of items in an association appears. For example, five percent of the total purchases at an airport sundry shop support the sale of both toothbrush and toothpaste.

    *  Confidence: From a statistical perspective, confidence has a close association with conditional probability. It is the percentage likelihood that a dependent item occurs in a data set when a lead item has occurred. For the diapers (lead item) and beer (dependent item) association, confidence could be expressed as a 50 percent probability of beer being purchased given a sale of diapers. This number is observational rather than predictive.

    *  Lift: This is a measure of the effectiveness of an association analysis, taken as the ratio of the results with and without the association rule. Lift can be used to eliminate records that do not have a true association but are picked up in the analysis because they appear frequently. More explicitly, lift is the confidence of an association divided by the support of its dependent item. Written out in terms of support, lift is equal to the actual support of the lead and dependent items together divided by the product of the actual support of the lead item and the actual support of the dependent item. Fractions are used in the equation rather than percentages.
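
To make the three terms concrete, here is a small Python sketch that computes them for a single rule over a list of transactions. It uses the common textbook definition of lift (confidence divided by the support of the dependent item). That definition is an assumption here, since the exact formula SAP BW applies has not been verified, and the basket data is invented.

```python
def association_measures(transactions, lead, dependent):
    """Return (support, confidence, lift) for the rule lead -> dependent.

    `transactions` is a list of sets of item names. All three values are
    fractions, not percentages. Lift follows the textbook definition.
    """
    total = len(transactions)
    n_lead = sum(1 for t in transactions if lead in t)
    n_dependent = sum(1 for t in transactions if dependent in t)
    n_both = sum(1 for t in transactions if lead in t and dependent in t)

    support = n_both / total                   # share of baskets with both items
    confidence = n_both / n_lead               # share of lead baskets that also have dependent
    lift = confidence / (n_dependent / total)  # confidence relative to dependent's base rate
    return support, confidence, lift


# Invented baskets: 4 of 8 contain diapers, 3 contain beer, 2 contain both.
baskets = [
    {"diapers", "beer"},
    {"diapers", "beer"},
    {"diapers", "wipes"},
    {"diapers"},
    {"beer"},
    {"milk"},
    {"milk", "bread"},
    {"bread"},
]
support, confidence, lift = association_measures(baskets, "diapers", "beer")
```

In this data the confidence is 0.5, matching the 50 percent diapers-and-beer example, while beer appears in only 37.5 percent of all baskets. A lift above 1 means the lead item raises the chance of the dependent item relative to that base rate.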



August 16, 2006

Does SAP BW 'Close the loop'?

Closing the Loop with SAP BW

SAP BW provides a few options for extracting data for use in interface programs that feed data streams into operational or analytic applications:

    *  DataMart interface: From a technical point of view, the DataMart interface is nothing more than the BI Service API. The DataMart interface allows building complex information logistics models, as shown in the topology examples in Chapter 5 (assuming both local and global data warehouses are on SAP BW). All InfoProviders available are at the DataMart interface’s disposal. However, use of the DataMart interface is restricted to SAP BW systems.

    *  Retractors: These are dedicated ABAP programs that re-extract data from SAP BW into an SAP ERP system. Retractors are a relatively new concept and are currently available for only a few applications, including SAP SEM-BPS and SAP CRM.

    *  SAP BW Java Connectors: This is an official API for executing queries and retrieving query results. Custom programs can access OLAP cubes to retrieve query results for any purpose. By using either the ODBO or XMLA interface through the Java SDK, front-end applications in almost any programming language can access SAP BW query results. However, this approach is not recommended for large result data sets, because the OLAP interface is optimized for interactive applications, not extractions.

    *  Open Hub Services: These allow you to define controlled and monitored data-export processes, exporting data target contents into a flat file, a flat database table, or an application for further processing. Open Hub Services may end up as the dodo bird did; time will tell.

Are these options good enough? Share your comments with us and look for our perspectives and commentary in future posts.




August 15, 2006

Analytic Processes

Data mining methods share similar processes:

    *  A data mining model is created: The configuration of the models differs, depending on the data mining method (especially the model variables). The model typically consists of two levels of configuration: one for the model itself and one for the modeling variables. The configuration settings for model variables and parameters differ per model.

    *  The model is trained: Training allows the model to learn against a subset of data. The source for the training exercise should take this into account by selecting enough data to train the model appropriately, but not so much that performance is adversely affected. Where predictions are not needed (such as association analysis), training is not needed. A model has a status indicator used to identify whether or not it has been trained.

    *  The model is evaluated: After a model has been trained, it may be evaluated to determine whether it was trained appropriately. A sample of historic data is used to verify the results (but not the historic data to be used for predictions). A model has a status indicator used to identify whether or not it has been evaluated.

    *  Predictions are made: Once a model is appropriately trained and evaluated, it is ready to make predictions. Predictions may be performed either online or as a batch process, depending on the volume of data. Predictions should be made on yet another source of historic data, separate from what was used in training and evaluation. The actual output of data mining depends on the method picked. Decision trees output probability trees, while clusters show percentage slices of an overall pie. A status indicator shows whether predictions have been made with a model.

    *  The results are stored or forwarded: Predicted values may be stored in InfoObjects. For example, customer classifications generated by scoring models may be saved to customer master data. When loading the results as a characteristic attribute, you must take care to match the metadata of the master data with the metadata of the model fields. More explicitly, the domain values in the model must match the domain values of the master data. When loading cross-selling rules determined via association analysis to SAP CRM, you must specify the logical system name and target group to export the data. Predicted values may also be exported to a file or another target type. Once these results are stored or forwarded, they may be used for operational purposes or “embedded” into decision points in business processes. For example, a bank may decide to offer or deny credit to a potential customer based on the customer’s attributes and behavior. This may be done in an automated fashion if, for example, the loan request is performed online.
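
Two points from the steps above lend themselves to a short Python sketch: keeping the data used for training, evaluation, and prediction separate, and checking that model output values exist in the master data domain before loading them. Both helpers, their names, and the 60/20/20 split are hypothetical illustrations, not SAP BW functions or recommended settings.

```python
import random


def split_history(records, train_share=0.6, evaluate_share=0.2, seed=0):
    """Shuffle records and cut them into three disjoint lists:
    training, evaluation, and prediction. The shares are illustrative."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    train_end = round(len(shuffled) * train_share)
    evaluate_end = train_end + round(len(shuffled) * evaluate_share)
    return (shuffled[:train_end],
            shuffled[train_end:evaluate_end],
            shuffled[evaluate_end:])


def unmatched_domain_values(model_values, master_data_values):
    """Return model output values that the master data attribute does not allow."""
    return sorted(set(model_values) - set(master_data_values))


# Invented data: 100 customer IDs and a small set of classification codes.
customer_ids = list(range(100))
train, evaluate, predict = split_history(customer_ids)
missing = unmatched_domain_values(["A", "B", "D"], ["A", "B", "C"])
```

A non-empty `missing` list signals that the model produces a classification the master data attribute cannot hold, which is the metadata mismatch the last step warns about.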

