The first task you should complete is a data investigation


A major retail chain that specialises in electronic goods has hired you as their Data Scientist. Over the past 12 months they have embarked on the rollout of a Loyalty/Affinity card for their customers.

The retail chain would like to complete the rollout of the Loyalty/Affinity card to the remainder of their stores and customers. You have been assigned to the project and your role is to build a predictive model that can be used to determine, from their database of customers, who is most likely to join the Loyalty/Affinity card scheme.

You have been provided with a file containing the customers who have been involved in the Loyalty/Affinity card project so far. The data set contains both the customers who have joined the Loyalty/Affinity card scheme and those who have not joined.

The target variable in the data set, which indicates whether the customer has taken up a Loyalty/Affinity card, is AFFINITY_CARD.

You will need to load the customer file, which is in CSV format, into your SAS Enterprise Miner workspace. From there you will analyse the data, develop a number of predictive models, evaluate these models to determine which one gives the best results, and then make your recommendations.

Data Description

Electronics Data
Name                     Description
CUST_ID                  Unique identifier of each customer
CUST_GENDER              The gender of the customer: M or F
AGE                      The current age of the customer. You can assume this is correct and up to date
CUST_MARITAL_STATUS      Marital status of the customer
COUNTRY_NAME             The country where the customer lives
CUST_INCOME_LEVEL        The salary range for the customer
EDUCATION                The highest level of education the customer has completed
OCCUPATION               The current occupation category for the customer
HOUSEHOLD_SIZE           The number of people in the household of the customer. This number includes the customer
YRS_RESIDENCE            How long the customer has lived at their current residence
AFFINITY_CARD            Target variable. 0 = no affinity card, 1 = has taken an affinity card
BULK_PACK_DISKETTES      Indicator for this item purchased. 0 = no purchase, 1 = purchased
FLAT_PANEL_MONITOR       Indicator for this item purchased. 0 = no purchase, 1 = purchased
HOME_THEATER_PACKAGE     Indicator for this item purchased. 0 = no purchase, 1 = purchased
BOOKKEEPING_APPLICATION  Indicator for this item purchased. 0 = no purchase, 1 = purchased
PRINTER_SUPPLIES         Indicator for this item purchased. 0 = no purchase, 1 = purchased
OS_DOC_SET_KANJI         Indicator for this item purchased. 0 = no purchase, 1 = purchased
Y_BOX_GAMES              Indicator for this item purchased. 0 = no purchase, 1 = purchased
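
Before loading the file into the mining tool, it can help to confirm that the coded columns really follow the description above. The following is a minimal Python sketch of such a check, done outside SAS Enterprise Miner, which is an assumption on our part since the assignment itself only requires the SAS tool. The function name unexpected_codes and the sample rows are invented for illustration and are not taken from Assignment_Data.csv.

```python
import csv
import io

# Columns that the data description says are coded 0/1.
BINARY_COLUMNS = [
    "AFFINITY_CARD", "BULK_PACK_DISKETTES", "FLAT_PANEL_MONITOR",
    "HOME_THEATER_PACKAGE", "BOOKKEEPING_APPLICATION",
    "PRINTER_SUPPLIES", "OS_DOC_SET_KANJI", "Y_BOX_GAMES",
]
ALLOWED = {name: {"0", "1"} for name in BINARY_COLUMNS}
ALLOWED["CUST_GENDER"] = {"M", "F"}


def unexpected_codes(rows):
    """Return {column: set of values that fall outside the documented coding}."""
    problems = {}
    for row in rows:
        for column, allowed in ALLOWED.items():
            if column not in row:
                continue  # column not present in this extract
            value = (row[column] or "").strip()
            if value not in allowed:
                problems.setdefault(column, set()).add(value)
    return problems


# Invented rows for illustration only (hypothetical, not real customer data).
SAMPLE = """CUST_ID,CUST_GENDER,AFFINITY_CARD,Y_BOX_GAMES
1,M,0,1
2,F,1,0
3,X,2,1
"""

sample_problems = unexpected_codes(csv.DictReader(io.StringIO(SAMPLE)))
print(sample_problems)
```

To run the same check on the real file, one would pass csv.DictReader(open("Assignment_Data.csv", newline="")) instead of the in-memory sample. An empty result means every coded column matches the description. Missing values show up as an empty string in the reported set.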

See the separate instructions on the Notes webpage for how you can load external data into your account on the SAS OnDemand server.

Required Tasks

You are required to produce a report (following the CRISP-DM report structure as closely as possible) detailing your work investigating the data and classifying the provided data.

The first task you should complete is a data investigation exercise, where you will document the characteristics and other information that you can determine about each feature.
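
As a rough idea of what "characteristics of each feature" could cover, the Python sketch below summarises every column: row count, missing values, number of distinct values, the most common value, and a numeric range where all present values are numbers. It is only an illustration, assuming the profiling is done outside SAS Enterprise Miner. The helper name profile_features and the sample rows (including the category label "Married") are invented.

```python
import csv
import io
from collections import Counter


def profile_features(rows):
    """Summarise each column: rows, missing, distinct values, mode, numeric range."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            if name is None:
                continue  # extra unnamed fields in a malformed row
            columns.setdefault(name, []).append((value or "").strip())

    report = {}
    for name, values in columns.items():
        present = [v for v in values if v != ""]
        summary = {
            "rows": len(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "most_common": Counter(present).most_common(1)[0][0] if present else None,
        }
        try:
            numbers = [float(v) for v in present]
        except ValueError:
            numbers = []  # at least one non-numeric value: treat as categorical
        if numbers:
            summary["min"] = min(numbers)
            summary["max"] = max(numbers)
            summary["mean"] = sum(numbers) / len(numbers)
        report[name] = summary
    return report


# Invented rows for illustration only (hypothetical, not real customer data).
SAMPLE = """CUST_ID,AGE,CUST_MARITAL_STATUS,AFFINITY_CARD
1,30,Married,1
2,40,,0
3,50,Married,0
"""

profile = profile_features(list(csv.DictReader(io.StringIO(SAMPLE))))
for name, summary in profile.items():
    print(name, summary)
```

For a 0/1 column such as AFFINITY_CARD, the mean is the share of customers coded 1, which gives a quick view of how balanced the target is. Identifier columns such as CUST_ID will also be reported as numeric, so the summary still needs to be read with the data description in hand.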

You will need to develop a number of classification models using the data mining tool used in class. The tool offers a number of different classification techniques, and within each of these you can modify the various parameter settings.

When you have developed all of your models (using the appropriate classification techniques available in the tool), you will have to evaluate them and identify the classification model and configuration that gives the best or most appropriate answer.
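
Which model is "best or most appropriate" depends on the measure used to compare them. The Python sketch below illustrates this with accuracy, precision, recall and F1 computed for the positive class (AFFINITY_CARD = 1). It is an assumption-laden illustration rather than the required method: the labels, the predictions and the names model_a and model_b are invented, and in the assignment the comparison would be carried out with the evaluation facilities of the mining tool.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for one model."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def evaluate(actual, predicted, positive=1):
    """Return accuracy, precision, recall and F1 for the positive class."""
    tp, fp, tn, fn = confusion_counts(actual, predicted, positive)
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented holdout labels and predictions (hypothetical, for illustration only).
ACTUAL = [1, 0, 0, 1, 0, 0, 1, 0]
PREDICTIONS = {
    "model_a": [1, 0, 0, 0, 0, 0, 1, 0],
    "model_b": [1, 1, 0, 1, 0, 1, 1, 0],
}

scores = {name: evaluate(ACTUAL, preds) for name, preds in PREDICTIONS.items()}
best_by_f1 = max(scores, key=lambda name: scores[name]["f1"])
best_by_recall = max(scores, key=lambda name: scores[name]["recall"])

for name, result in scores.items():
    print(name, result)
print("best by F1:", best_by_f1, "| best by recall:", best_by_recall)
```

In this invented example the two measures pick different models: model_a has the higher F1 and accuracy, while model_b finds every actual cardholder at the cost of more false positives. For the card rollout, the choice would depend on whether missing a likely joiner or contacting an unlikely one is the more costly mistake, and the report should state which criterion was used.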

Attachment:- Assignment_Data.csv
