You will perform data mining steps data preprocessing, Database Management System

Data Mining Project

In this project you will use the Sentiment Labelled Sentences dataset, available at the following link: https://archive.ics.uci.edu/ml/datasets/Sentiment+Labelled+Sentences This dataset contains review sentences labeled (classified) as positive or negative, such as the following two sentences from IMDb movie reviews:

Wasted two hours. 0
Saw the movie today and thought it was a good effort, good messages for kids. 1

A sentence labeled 0 is a negative comment; a sentence labeled 1 is a positive comment. There are 3 files (imdb_labelled.txt, amazon_cells_labelled.txt, yelp_labelled.txt), each containing 500 positive and 500 negative sentences. (The Amazon and Yelp datasets contain more instances, but only those labeled 0 or 1 should be considered.) This data is used in the following paper: Dimitrios Kotzias, Misha Denil, Nando de Freitas, Padhraic Smyth: From Group to Individual Labels Using Deep Features. KDD 2015: 597-606.
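The file layout described above can be read with a few lines of Python. This is only a minimal sketch: it assumes each line holds the sentence, then a tab, then the label (the separator is an assumption, not stated in the assignment), and `parse_labelled_lines` is a hypothetical helper name. Lines without a 0 or 1 label are skipped, as the note on the Amazon and Yelp files requires.

```python
def parse_labelled_lines(lines):
    """Return (sentence, label) pairs, keeping only lines labelled 0 or 1."""
    examples = []
    for line in lines:
        line = line.rstrip("\n")
        # Assumption: the label is the last tab-separated field on the line.
        sentence, sep, label = line.rpartition("\t")
        if sep and label.strip() in ("0", "1"):
            examples.append((sentence.strip(), int(label)))
    return examples


# In-memory sample standing in for the contents of, e.g., imdb_labelled.txt.
sample = [
    "Wasted two hours.\t0\n",
    "Saw the movie today and thought it was a good effort, good messages for kids.\t1\n",
    "A line with no label\n",
]
examples = parse_labelled_lines(sample)
print(len(examples), "labelled sentences kept")
```

For a real file, the same function can be applied to an open file handle, for example `parse_labelled_lines(open("imdb_labelled.txt", encoding="utf-8"))`.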

You will perform data mining steps (data preprocessing, classification) on this dataset and write up your results in a project report in the form of an IEEE conference paper.

Steps:

a. Literature review: Read the following paper to learn what has been done before on this problem: https://www.cs.cornell.edu/home/llee/papers/sentiment.pdf. Summarize this work in your own words; this summary will go in the "Related Work" section of your paper.

b. Dataset characteristics: Data description, size, training and test sets, number of attributes, attribute list, attribute types, attribute ranges, etc. In this dataset, each distinct word should be considered an attribute/feature.

c. Data preprocessing: Normalization, missing values, outlier detection, smoothing, attribute reduction/attribute selection, sampling etc.

d. Data mining tasks (Classification): Use Weka (preferred) or any other data mining tool. Perform classification experiments using different algorithms, including at least decision trees, naïve Bayes, and rule learning. Analyze performance with the measures covered in the lecture, and discuss the results.
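Steps b through d can be illustrated with a small, self-contained Python sketch. It is not a substitute for Weka, which the assignment prefers, and it covers only one of the required algorithms: a multinomial naïve Bayes classifier with add-one smoothing. The tokenization rule, the `min_doc_freq` threshold, and all function names are assumptions made for illustration; the four training sentences are a toy sample, not the real dataset.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def select_attributes(sentences, min_doc_freq=1):
    """Steps b/c: one attribute per distinct word, dropping words that
    appear in fewer than min_doc_freq sentences (attribute reduction)."""
    doc_freq = Counter()
    for s in sentences:
        doc_freq.update(set(tokenize(s)))
    return {w for w, df in doc_freq.items() if df >= min_doc_freq}


def train_naive_bayes(examples, vocabulary):
    """Step d: multinomial naive Bayes with add-one (Laplace) smoothing."""
    class_counts = Counter(label for _, label in examples)
    word_counts = {label: Counter() for label in class_counts}
    for sentence, label in examples:
        word_counts[label].update(
            w for w in tokenize(sentence) if w in vocabulary
        )
    model = {}
    for label, n_docs in class_counts.items():
        denom = sum(word_counts[label].values()) + len(vocabulary)
        model[label] = {
            "log_prior": math.log(n_docs / len(examples)),
            "log_likelihood": {
                w: math.log((word_counts[label][w] + 1) / denom)
                for w in vocabulary
            },
        }
    return model


def predict(model, sentence):
    """Return the label with the highest log-posterior score."""
    scores = {}
    for label, params in model.items():
        score = params["log_prior"]
        for w in tokenize(sentence):
            # Words outside the vocabulary are ignored.
            if w in params["log_likelihood"]:
                score += params["log_likelihood"][w]
        scores[label] = score
    return max(scores, key=scores.get)


# Toy training data; the real experiments would use the three labelled files.
train = [
    ("Wasted two hours.", 0),
    ("A boring and bad movie.", 0),
    ("Saw the movie today and thought it was a good effort, good messages for kids.", 1),
    ("A good and fun movie.", 1),
]
vocabulary = select_attributes([s for s, _ in train])
model = train_naive_bayes(train, vocabulary)
print(predict(model, "good fun"), predict(model, "boring bad"))
```

Raising `min_doc_freq` shrinks the attribute set, which is one simple way to study the effect of attribute reduction on the classification results.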

Project Paper: Write your project report in the form of a conference paper. (January 22, 2016) Follow the IEEE template available here: https://www.ieee.org/publications_standards/publications/conferences/2014_04_msw_usltr_format.doc.

Your paper should contain the following sections:

1. Abstract: A one-paragraph summary of your paper.

2. Introduction: Describe the sentiment classification problem and why it is important to classify sentiments (give the motivation). Finally, state the contributions of your work in this paper.

3. Related work: You should write the summary of the paper from step a (https://www.cs.cornell.edu/home/llee/papers/sentiment.pdf).

4. Sentiment Classification: Describe the data mining steps that you performed in steps b, c, and d, except for the classification results.

5. Experimental Results: Report classification results with the measures covered in the lecture. You should also discuss the results in this section.

6. Conclusion: Briefly summarize the paper and state your opinions about what can be done to improve classification accuracy further.
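For the Experimental Results section, the assignment does not list which measures were covered in the lecture. The sketch below assumes they include accuracy, precision, recall, and F1 for the positive class (label 1); the function names and the example labels are hypothetical and are not results from the dataset.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp, fp, tn, fn


def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 as a dictionary."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels purely to exercise the functions.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

Weka reports the same quantities per class in its classifier output, so a sketch like this is mainly useful for cross-checking numbers or for tools that do not report them.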

Reference No:- TGS01243727