7.2 An outline of the data preparation procedure

Transactional data covering the first five months of 2012 were retrieved for the planned mining applications (value-based segmentation, RFM cell segmentation, cross-selling model). The original data, as we’ll see in the next paragraphs, contained multiple lines per order transaction, recording the specifics of each transaction: transaction date, invoice number, store id, payment type, amount, and item code. However, customer-level data were required for all mining applications. Therefore, through extensive data preparation, the detailed raw information had to be transformed and aggregated to summarize the purchase habits of each cardholder, providing a customer “signature,” a unified view of the customer. The usage aspects summarized at a customer level included:

● The frequency and recency of purchases
● The total spending amount
● Relative spending amount per product group
● Size of basket (average spending amount per transaction)
● Preferred payment method
● Preferred period (day/time) of purchases
● Preferred store
● Tenure of each customer

Table 7.1 presents the initial data for two fictional cardholders (Card IDs C1, C2) and three purchase transactions (Invoices INV1, INV2, INV3). Each transactional record (line) contains information about, among other things, the invoice number, the date and time of the transaction, the code of the product purchased, and the amount paid.

The data preparation procedure is presented in detail in the next paragraphs. Its main steps are outlined below.
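As a rough illustration of the aggregation idea in this section, the following Python sketch rolls item-level transaction lines up into one summary record per cardholder. It is not the procedure used in the case study: the field layout, the sample values, the reference date, and the function name `build_signatures` are all assumptions made for the example, and "preferred" store and payment type are defined here, arbitrarily, as the ones with the highest spending. Only a few of the listed usage aspects are computed.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical transaction lines, one per purchased item. All values are made up.
# Layout (assumed): card_id, invoice, date, store, payment_type, item_code, amount
LINES = [
    ("C1", "INV1", date(2012, 1, 10), "S1", "card", "A100", 12.0),
    ("C1", "INV1", date(2012, 1, 10), "S1", "card", "B200", 8.0),
    ("C1", "INV2", date(2012, 3, 5), "S2", "cash", "A100", 30.0),
    ("C2", "INV3", date(2012, 2, 20), "S1", "cash", "C300", 15.0),
]

# Assumed end of the observation window, used to measure recency.
REFERENCE_DATE = date(2012, 5, 31)


def build_signatures(lines, reference_date):
    """Aggregate item-level lines into one summary record per card."""
    by_card = defaultdict(list)
    for line in lines:
        by_card[line[0]].append(line)

    signatures = {}
    for card_id, rows in by_card.items():
        # Collapse lines into invoices first, so that frequency counts
        # transactions rather than individual items.
        invoices = {}
        for _card, invoice, day, store, payment, _item, amount in rows:
            inv = invoices.setdefault(
                invoice,
                {"date": day, "store": store, "payment": payment, "amount": 0.0},
            )
            inv["amount"] += amount

        # Spending per store and per payment type (assumed definition of "preferred").
        spend_by_store = Counter()
        spend_by_payment = Counter()
        for inv in invoices.values():
            spend_by_store[inv["store"]] += inv["amount"]
            spend_by_payment[inv["payment"]] += inv["amount"]

        total = sum(inv["amount"] for inv in invoices.values())
        last_purchase = max(inv["date"] for inv in invoices.values())

        signatures[card_id] = {
            "frequency": len(invoices),
            "recency_days": (reference_date - last_purchase).days,
            "total_amount": total,
            "basket_size": total / len(invoices),
            "preferred_store": spend_by_store.most_common(1)[0][0],
            "preferred_payment": spend_by_payment.most_common(1)[0][0],
        }
    return signatures


signatures = build_signatures(LINES, REFERENCE_DATE)
```

In this sketch the two item lines of INV1 are merged before any counting, which is why cardholder C1 ends up with a frequency of 2 rather than 3. The same two-stage pattern (lines to invoices, invoices to customers) extends to the other aspects, such as relative spending per product group or preferred day and time.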
