Stata data set


Assignment:

Logistics:

This project is due on the day of the “final”, by the start time of the “final”. Late submissions will be penalized. In a way, this project is a “take-home final”.

From now on, we will use a portion of class time to discuss the project. Coding details will be available in class and during office-hour consultations with me, but the rest of the material is your responsibility. You may consult classmates, but the answers you turn in must be your own and must not duplicate anyone else's; duplicate answers would be plagiarism.

Three versions of the data set have been posted on BB. Use whichever version is suitable for you. For the Excel version, you first have to copy it and “paste special” it into a Stata data set. The data consist of movie gross receipts and other characteristics of each movie. We are interested in investigating this data set to answer various questions.

Answer the following questions in the space provided. Be sure to attach the relevant computer printouts; they are required for full credit.

There are multiple years’ worth of data, but after a preliminary investigation of the entire data set we are going to confine ourselves mainly to the year 2009.

Questions: 1

(Use the summarize and describe commands with a year condition, e.g. summarize if year==2005)

a. How many movies do we have data for in each year? Why is that?

b. “runtime” refers to the length of the movie. What is the maximum runtime for any movie in 2008? The minimum runtime in 2006?

c. How many qualitative variables are in the data set? Which one(s)? List them.
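One possible way to carry out these checks is sketched below. It assumes the data set is already loaded in Stata and that the variables are named year and runtime, as in the questions above. The tabulate command is added here only as a convenient way to count movies per year; the assignment itself names summarize and describe.

```stata
* Overview of all variables and their storage types (string vs. numeric)
describe

* Number of movies (observations) in each year
tabulate year

* Summary statistics for one year at a time; the output includes Min and Max
summarize runtime if year == 2008
summarize runtime if year == 2006
```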

Now we are going to zero in on the 2009 data set. Issue the command:

keep if year==2009

To save this new data set, give it a name (go to File → Save As → xyz)
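The same step can also be done from the command line. This is a minimal sketch; the file name movies2009 is a hypothetical placeholder, not a name given in the assignment.

```stata
* Keep only the 2009 observations (all other years are dropped from memory)
keep if year == 2009

* Save under a new name so the master file is not overwritten
* "movies2009" is a hypothetical file name
save movies2009, replace
```

Saving under a new name matters because keep permanently removes the other years from the data in memory; the later questions on 2005 and 2008 need the original file.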

Questions: 2

Is a film's box-office revenue related to its budget? Answer in the space below after doing the following.

.  diagram with twoway

.  correlate with pwcorr

.  regress rdgmillions budgetmillions
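The three steps above might look as follows. This is a sketch that assumes the 2009 data are in memory; overlaying a fitted line with lfit is an optional addition, not something the assignment asks for.

```stata
* Scatter plot of gross against budget, with a fitted line overlaid
twoway (scatter rdgmillions budgetmillions) (lfit rdgmillions budgetmillions)

* Pairwise correlation together with its significance level
pwcorr rdgmillions budgetmillions, sig

* Simple linear regression of gross on budget
regress rdgmillions budgetmillions
```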

Questions: 3

We want to examine how the gross receipts are connected to the movie’s budget. So run a multiple linear regression of real domestic gross (rdgmillions) on budgetmillions and the following control variables: runtime, totaltheaters, theaterrun, majorstudio and star.

Explain the results below, but first write down the estimated equation in its proper format.

Be sure to include the following in your discussion: R2 and adjusted R2, and the interpretation of the F-value for the regression. Also interpret the sign, magnitude, and statistical significance of the estimated coefficients for budgetmillions and majorstudio. Discuss in words and numbers, not tables.

Use the .reg command but correct for any inherent problems from which this type of data suffers.
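Cross-sectional data of this kind often show heteroskedasticity. The assignment does not name the problem it has in mind, so treating it as heteroskedasticity is an assumption here. Under that assumption, a sketch of the regression with a diagnostic test and robust standard errors could be:

```stata
* Ordinary regression first, then a Breusch-Pagan test for heteroskedasticity
regress rdgmillions budgetmillions runtime totaltheaters theaterrun majorstudio star
estat hettest

* Same model with heteroskedasticity-robust standard errors
regress rdgmillions budgetmillions runtime totaltheaters theaterrun majorstudio star, vce(robust)
```

The coefficients are identical in the two regressions; only the standard errors, and therefore the t-statistics and p-values, change.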

Questions: 4

How much of an issue is multicollinearity in this model? Use pwcorr with the sig option to examine the pairwise correlations among the explanatory variables.
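A sketch of this check follows. The variance inflation factors from estat vif are an optional supplement that the assignment does not request.

```stata
* Pairwise correlations among the regressors, with significance levels
pwcorr budgetmillions runtime totaltheaters theaterrun majorstudio star, sig

* Optional supplement: variance inflation factors after the regression
regress rdgmillions budgetmillions runtime totaltheaters theaterrun majorstudio star
estat vif
```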

Questions: 5

Using logistic regression, estimate the effect that having a star in the film and the film's budget have on:

a.  whether or not a film receives a picture nomination.

b.  whether or not a film is given an “r” rating.

(These will be two separate logistic regressions using the logit command. In the space below, write down the two regressions in their proper format, including the correct format of the “y” variable. Then go on to Q.6)
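The two regressions could be set up along the following lines. The outcome names nominated and rrated are hypothetical placeholders, since the assignment does not give the actual variable names. The sketch also assumes that both outcomes and star are coded 0/1.

```stata
* "nominated" and "rrated" are hypothetical names for 0/1 outcome variables;
* replace them with the actual names shown by -describe-
* i.star treats star as a categorical (0/1) regressor
logit nominated i.star budgetmillions
logit rrated i.star budgetmillions
```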

Questions: 6

Based on the interpretation of the logistic regressions above, answer the questions below:

a. What happens to the odds of nomination when there is a star in the movie? How do you know? Explain briefly.

b. What happens to the odds of an “r” rating when the budget increases? How do you know? Explain briefly.
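One way to read the effect on the odds directly is to ask logit to report odds ratios instead of coefficients. As before, nominated and rrated are hypothetical variable names, assumed to be coded 0/1.

```stata
* The -or- option reports exponentiated coefficients (odds ratios)
* "nominated" and "rrated" are hypothetical outcome names
logit nominated i.star budgetmillions, or
logit rrated i.star budgetmillions, or
```

An odds ratio above 1 means the odds rise with that variable, and an odds ratio below 1 means they fall. Equivalently, a positive logit coefficient corresponds to an odds ratio above 1.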

Questions: 7

Using the margins command for the two estimated equations in (5) above, calculate the following:

a. the probability of an “r” rating, when there is a star in the movie, for a movie with an average budget.

b. the probability of an “r” rating, when there is no star in the movie, for a movie with a budget of $57 million.

c. the probability of getting a nomination, when there is a star in the movie, for a movie with a budget of $175 million.

d. the probability of getting a nomination, when there is no star in the movie, for a movie with a budget of $222 million.
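The four predicted probabilities can be sketched with margins as follows. The sketch assumes that budgetmillions is measured in millions of dollars and that star is coded 0/1. The names nominated and rrated remain hypothetical. Each margins call must follow the logit it refers to.

```stata
* Model for the "r" rating ("rrated" is a hypothetical variable name)
logit rrated i.star budgetmillions
margins, at((mean) budgetmillions star=1)   // (a) star, average budget
margins, at(star=0 budgetmillions=57)       // (b) no star, budget of 57

* Model for the nomination ("nominated" is a hypothetical variable name)
logit nominated i.star budgetmillions
margins, at(star=1 budgetmillions=175)      // (c) star, budget of 175
margins, at(star=0 budgetmillions=222)      // (d) no star, budget of 222
```

After logit, margins reports predicted probabilities by default, so the Margin column is the answer in each case.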

Assume that you have been asked to answer two questions using the movie data set:

1. How do the revenues made by a movie (rdgmillions) depend on the total number of theaters (totaltheaters) that show the movie? Use 2008 data.

2. Does this relationship differ between 2005 and 2008?
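A possible approach to both questions is sketched below. It needs the full data set in memory, not the 2009-only file. Pooling the two years with an interaction term is one choice among several, not a method prescribed by the assignment.

```stata
* (1) Revenue on number of theaters, 2008 only
regress rdgmillions totaltheaters if year == 2008

* (2) Pool 2005 and 2008 and let the intercept and slope differ by year
regress rdgmillions c.totaltheaters##i.year if inlist(year, 2005, 2008)
```

In the pooled regression, the coefficient on the interaction between year and totaltheaters estimates how much the slope in 2008 differs from the slope in 2005. Its t-test indicates whether that difference is statistically significant.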

Attachment:- copy_of_movie_data_master-1_3_2.zip
