
Analytic Processes

Data mining methods share similar processes:

    *  A data mining model is created. The configuration differs by data mining method, especially for the model variables. A model typically has two levels of configuration: one for the model itself and one for its modeling variables. The settings available for model parameters and model variables differ per model.

    *  The model is trained. Training allows the model to learn from a subset of the data. The source for the training exercise should take this into account by selecting enough data to train the model appropriately, but not so much that performance is adversely affected. Where predictions are not needed (such as association analysis), training is not needed. A model has a status indicator that shows whether or not it has been trained.

    *  The model is evaluated. After a model has been trained, it may be evaluated to determine whether it was trained appropriately. A sample of historic data, separate from the historic data to be used for predictions, is used to verify the results. A model has a status indicator that shows whether or not it has been evaluated.

    *  Predictions are made. Once a model is appropriately trained and evaluated, it is ready to make predictions. Predictions may be performed either online or as a batch process, depending on the volume of data. Predictions should be made on yet another source of historic data, separate from what was used in training and evaluation. The actual output of data mining depends on the method picked. Decision trees output probability trees, while clusters show percentage slices of an overall pie. A status indicator shows whether the model has been used for prediction.

    *  The results are stored or forwarded. Predicted values may be stored in InfoObjects. For example, customer classifications generated by scoring models may be saved to customer master data. When loading the results as a characteristic attribute, you must take care to match the metadata of the master data with the metadata of the model fields. More explicitly, the domain values in the model must match the domain values of the master data. When loading cross-selling rules determined via association analysis to SAP CRM, you must specify the logical system name and target group to export the data. Predicted values may also be exported to a file or another target type. Once these results are stored or forwarded, they may be used for operational purposes or “embedded” into decision points in a business process. For example, a bank may decide to offer or deny credit to a potential customer based on the customer’s attributes and behavior. This may be done in an automated fashion if, for example, the loan request is made online.
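
The steps above can be sketched in a few lines of Python. This is only an illustration of the process, not how the data mining workbench is implemented: the `MiningModel` class, its three status flags, the toy "most frequent class per attribute value" rule, and the `unmatched_domain_values` helper are all hypothetical names and simplifications chosen for the example, and the sample data is made up.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field


@dataclass
class MiningModel:
    """Hypothetical stand-in for a data mining model with status indicators."""
    trained: bool = False
    evaluated: bool = False
    predicted: bool = False
    rules: dict = field(default_factory=dict)
    default: str = ""

    def train(self, rows):
        # Learn the most frequent class per attribute value (a toy method).
        counts = defaultdict(Counter)
        for attribute, label in rows:
            counts[attribute][label] += 1
        self.rules = {a: c.most_common(1)[0][0] for a, c in counts.items()}
        # Fall back to the overall most frequent class for unseen values.
        self.default = Counter(l for _, l in rows).most_common(1)[0][0]
        self.trained = True

    def evaluate(self, rows):
        # Accuracy on historic data that was not used for training.
        if not self.trained:
            raise RuntimeError("train the model before evaluating it")
        hits = sum(self.rules.get(a, self.default) == l for a, l in rows)
        self.evaluated = True
        return hits / len(rows)

    def predict(self, attributes):
        if not (self.trained and self.evaluated):
            raise RuntimeError("train and evaluate the model before predicting")
        self.predicted = True
        return [self.rules.get(a, self.default) for a in attributes]


def unmatched_domain_values(predictions, master_domain):
    """Return predicted values that the master data attribute cannot hold."""
    return sorted(set(predictions) - set(master_domain))


# Three separate sources: one each for training, evaluation, and prediction.
training = [("high", "A"), ("high", "A"), ("high", "A"),
            ("low", "C"), ("low", "C"), ("mid", "B")]
evaluation = [("high", "A"), ("low", "C"), ("mid", "B"), ("mid", "A")]
to_score = ["high", "mid", "low", "unknown"]

model = MiningModel()
model.train(training)
accuracy = model.evaluate(evaluation)
classes = model.predict(to_score)

# Before storing results as a master data attribute, compare domain values.
problems = unmatched_domain_values(classes, master_domain={"A", "B"})
print(accuracy, classes, problems)
```

Here the master data attribute is assumed to allow only the values A and B, so the predicted class C is reported as a mismatch that would have to be resolved before the results are loaded.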
