Solved: How does the algorithm pick the attributes for splitting, Database Management System

How does the algorithm pick the attributes for splitting

Assignment

1. A local retailer has a database that stores 10,000 transactions from last summer. After analyzing the data, a data science team has identified the following statistics:

• {battery} appears in 6,000 transactions.
• {sunscreen} appears in 5,000 transactions.
• {sandals} appears in 4,000 transactions.
• {bowls} appears in 2,000 transactions.
• {battery, sunscreen} appears in 1,500 transactions.
• {battery, sandals} appears in 1,000 transactions.
• {battery, bowls} appears in 250 transactions.
• {battery, sunscreen, sandals} appears in 600 transactions.

Answer the following questions:

a. What are the support values of the preceding itemsets?

b. Assuming the minimum support is 0.05, which itemsets are considered frequent?

c. What are the confidence values of {battery} -> {sunscreen} and {battery, sunscreen} -> {sandals}? Which of the two rules is more interesting?

d. List all the candidate rules that can be formed from the statistics. Which rules are considered interesting at the minimum confidence 0.25? Out of these interesting rules, which rule is considered the most useful (that is, least coincidental)?
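The arithmetic behind parts a through d can be sketched in a few lines of Python. This is only an illustration built from the counts given above: the helper names (`support`, `confidence`, `lift`) are assumptions chosen for this sketch, and lift is used here as one common way to judge how coincidental a rule is; the assignment itself does not prescribe a specific measure.

```python
# Counts copied from the problem statement (10,000 transactions in total).
TOTAL = 10_000
counts = {
    frozenset({"battery"}): 6000,
    frozenset({"sunscreen"}): 5000,
    frozenset({"sandals"}): 4000,
    frozenset({"bowls"}): 2000,
    frozenset({"battery", "sunscreen"}): 1500,
    frozenset({"battery", "sandals"}): 1000,
    frozenset({"battery", "bowls"}): 250,
    frozenset({"battery", "sunscreen", "sandals"}): 600,
}


def support(items):
    """Fraction of all transactions that contain every item in `items`."""
    return counts[frozenset(items)] / TOTAL


def confidence(lhs, rhs):
    """Share of transactions containing `lhs` that also contain `rhs`."""
    return counts[frozenset(lhs) | frozenset(rhs)] / counts[frozenset(lhs)]


def lift(lhs, rhs):
    """Confidence relative to the baseline support of `rhs`.

    A lift near 1 suggests the two sides are roughly independent;
    a lift above 1 suggests the rule is less likely to be coincidental.
    """
    return confidence(lhs, rhs) / support(rhs)


MIN_SUPPORT = 0.05
frequent = [set(s) for s in counts if support(s) >= MIN_SUPPORT]

for itemset in counts:
    print(sorted(itemset), support(itemset))

print(confidence({"battery"}, {"sunscreen"}))
print(confidence({"battery", "sunscreen"}, {"sandals"}))
```

Only rules whose left side, right side, and union all appear in the listed statistics can be evaluated this way; any other itemset raises a `KeyError` because its count is not given.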

2. Describe how logistic regression can be used as a classifier.
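As a rough illustration of the idea behind this question, a fitted logistic regression model maps a weighted sum of the inputs to a probability through the logistic (sigmoid) function, and a threshold on that probability gives the class label. The weights, bias, and the 0.5 threshold below are hypothetical values chosen for the sketch, not results from any dataset.

```python
from math import exp


def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + exp(-z))


def predict_proba(features, weights, bias):
    """Estimated probability that the example belongs to class 1."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def classify(features, weights, bias, threshold=0.5):
    """Turn the probability into a class label using a cutoff."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0


# Hypothetical coefficients, for illustration only.
weights = [1.0, -0.5]
bias = -0.5

print(predict_proba([2.0, 1.0], weights, bias))
print(classify([2.0, 1.0], weights, bias))
```

Raising or lowering `threshold` trades false positives against false negatives, which is why the probability output is often kept alongside the label.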

3. In a decision tree, how does the algorithm pick the attributes for splitting?
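One widely used criterion, assumed here for illustration, is information gain: the attribute whose split reduces the entropy of the class labels the most is chosen (this is the approach used by ID3; other algorithms use measures such as Gini impurity or gain ratio). The tiny dataset and the attribute names below are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def best_split(rows, attributes, target):
    """Attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical toy data: "outlook" separates the classes, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
]

print(best_split(rows, ["outlook", "windy"], "play"))
```

In a full tree builder the same selection would be repeated on each resulting subset until the nodes are pure or a stopping rule applies.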

4. A data science team is working on a classification problem in which the dataset contains many correlated variables, and most of them are continuous. The team wants the model to output the probabilities in addition to the class labels. Which classifier should the team consider using? Why?

5. Fit an appropriate ARIMA model on the following datasets included in R. Provide supporting evidence on why the fitted model was selected, and forecast the time series for 12 time periods ahead.

a. faithful: Waiting times (in minutes) between Old Faithful geyser eruptions
b. JohnsonJohnson: Quarterly earnings per J&J share
c. sunspot.month: Monthly sunspot activity from 1749 to 1997

6. Choose a topic of your interest, such as a movie, a celebrity, or any buzz word. Then collect 100 tweets related to this topic. Hand-tag them as positive, neutral, or negative. Next, split them into 80 tweets as the training set and the remaining 20 as the testing set. Run one or more classifiers over these tweets to perform sentiment analysis. What are the precision and recall of these classifiers? Which classifier performs better than the others?
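For question 6, precision and recall can be computed per sentiment class from the hand-tagged labels and the classifier's predictions. The sketch below uses made-up labels purely to show the calculation; the function name `precision_recall` and the label strings are assumptions for this example, and returning 0.0 when a denominator is empty is one convention among several.

```python
def precision_recall(y_true, y_pred, label):
    """Precision and recall for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical hand-tagged labels and classifier output for six tweets.
y_true = ["positive", "positive", "negative", "neutral", "negative", "positive"]
y_pred = ["positive", "negative", "negative", "neutral", "positive", "positive"]

for label in ("positive", "neutral", "negative"):
    print(label, precision_recall(y_true, y_pred, label))
```

Running the same function over the 20 test tweets for each classifier gives directly comparable per-class figures.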

Format your assignment according to the following formatting requirements:

1. The answer should be typed, double spaced, using Times New Roman font (size 12), with one-inch margins on all sides.

2. The response should also include a cover page containing the title of the assignment, the student's name, the course title, and the date. The cover page is not included in the required page length.

3. Also include a reference page. Citations and references should follow APA format. The reference page is not included in the required page length.


Reference No:- TGS02969733
