This problem is intended to be more open-ended than previous assignments, so you can get a feel for what it's like to do an actual project. I am providing some basic guidance so you'll have some idea of what I want to see, but the specifics of what you choose to do will be up to you. As always, your write-up should be your own work. You may talk with other students about the project, but this is your problem to solve.

You will have two datasets to choose from to complete this part of the assignment. Use only one of these, not both.

- The dataset names.dta contains information regarding job postings and resumes of applicants. This dataset is the basis of a remarkable (and quite famous) paper by Bertrand and Mullainathan published in 2004. You will probably want to use the variable call_back as your dependent variable.

- The dataset seatbelts.dta is a panel dataset containing state-specific information about motor vehicle fatalities over a 15-year period. This dataset formed the basis of a paper by Cohen and Einav in 2003.

Each dataset has a description file detailing the variables it contains.

Presentation Rules:

All responses must be typed. 

STATA tables that are cut and pasted in are NOT acceptable. You need to create summary tables that present the information in a reader-friendly fashion. Imagine you are writing this up for your new boss, who doesn't have time to go through countless pages of STATA output. You need to distill the essential information you want to convey into a few easy-to-read tables. Here's an example I have fabricated out of thin air.

Dependent Variable: Goldfish Lifespan (days)

Independent Variables                       Model 1         Model 2          Model 3
Bowl Cleaned ≥1 per week (dummy variable)   25.12 (.005)    21.23 (.008)     15.11 (.001)
Food per day (grams)                        1.378 (.034)    1.455 (.044)     1.144 (.021)
Food per day ^2 (grams squared)                             -.00034 (.002)   -.00045 (.003)
Water Temperature (F)                                                        -.15 (.14)

n=435, p-values in parentheses, using robust standard errors.

I believe that doing a thoughtful job analyzing these datasets will take a minimum of 2-3 typed pages (not including any tables). I am not setting a page requirement, but if you are coming in low, you probably aren't thinking hard enough.

Answer the Following

a) Which dataset are you using?

b) What is your dependent variable?

c) What is the question you want to use the dataset to try to answer? What is (or are) your key independent variable(s)? [State these very clearly. It is important you know exactly what it is you are trying to accomplish before you try to accomplish it.]

Example: Does cleaning a fishbowl increase the lifespan of a goldfish? My wife says yes. Easy for her to say. When's the last time she cleaned Ludwig's bowl? Try never. That's right, never. But I digress.

d) What is your expected answer to the question? [This is your chance to develop your ideas as to what you think the relationship between the variables is.]

Example: [Yours should be much longer and more thoughtful.] I think a thick layer of organic compounds (aka "scum") on the inside of a fishbowl is a sign of a healthy biosphere. A fish will benefit from the rich broth of nutrients a non-cleaned bowl will provide. Those that provide a "sterile," or "habitable," environment are doing their fish a disservice. Thus, I expect there to be a negative relationship between cleaning a bowl and the lifespan of a goldfish.

e) What control variables do you think you should use? For each, what do you think would be the relationship between these control variables and the dependent/key independent variables? Alternatively, how would the omission of these variables impact your central analysis?

Example: [Yours should be much longer and more thoughtful.] I think I should control for food consumption. Fish need food to survive. Giving the fish food should increase its lifespan. It is also possible that people who feed their fish periodically will be more likely to clean the fishbowl. I thus expect omitting food consumption will lead to a positive bias in the coefficient on cleaning the bowl.
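
The direction of the bias claimed in this example can be checked against the standard omitted-variable-bias formula. The sketch below assumes the usual linear setup; the symbols (the betas, the deltas, and the variable labels clean and food) are chosen here for illustration and do not come from the assignment.

```latex
\begin{align*}
\text{True model:}\quad
  & \text{lifespan}_i = \beta_0 + \beta_1\,\text{clean}_i + \beta_2\,\text{food}_i + u_i \\
\text{Auxiliary regression:}\quad
  & \text{food}_i = \delta_0 + \delta_1\,\text{clean}_i + v_i \\
\text{Short regression (food omitted):}\quad
  & E[\tilde{\beta}_1] = \beta_1 + \beta_2\,\delta_1
\end{align*}
```

If food raises lifespan (\beta_2 > 0) and households that clean the bowl also feed more (\delta_1 > 0), the bias term \beta_2\delta_1 is positive, which matches the sign argued for in the example.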

f) What is the functional form of your regression? Why have you chosen this form?

Example: I have cross-sectional data from 435 randomly selected households. I have no reason to believe that any of the CLRM assumptions aren't being met, so I will use OLS to estimate the model. I will use robust standard errors in case there is heteroskedasticity of unknown form. The only novel functional form issue is that I will include a quadratic term for food, in case very high food consumption is bad for fish. I've heard rumors that Timmy Jorgenson gave his fish like a pound of food before vacation and when he got back it had swelled up to the size of a cantaloupe. For real. But I digress.
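
One way to read a quadratic term is to compute the marginal effect and the turning point it implies. The sketch below is in Python rather than STATA, purely for illustration; the function names are hypothetical, and the coefficients are the Model 2 values from the fabricated goldfish table above.

```python
def marginal_effect(b_linear, b_squared, x):
    """Change in the outcome per one-unit increase in x, evaluated at x."""
    return b_linear + 2 * b_squared * x


def turning_point(b_linear, b_squared):
    """Value of x where the marginal effect is zero (a peak if b_squared < 0)."""
    if b_squared == 0:
        raise ValueError("no turning point without a quadratic term")
    return -b_linear / (2 * b_squared)


# Model 2 coefficients from the fabricated goldfish example (not real estimates).
b_food, b_food_sq = 1.455, -0.00034

# Effect of one more gram of food per day when the fish already gets 10 grams.
print(marginal_effect(b_food, b_food_sq, 10))

# Daily food level at which extra food stops helping and starts hurting.
print(turning_point(b_food, b_food_sq))
```

With these made-up numbers the implied peak is roughly 2,140 grams per day, a reminder to check whether a turning point falls inside the range of the data before reading much into it.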

g) Run your regressions. You should have several different variations of the model, using different control variables and possibly slightly different functional forms. Compile the important numbers in a table, as I did above.
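
If you collect your estimates by hand, a few lines of code can lay them out in the summary format shown earlier. This is a minimal sketch in Python, not a STATA feature; the names results, format_cell, and build_table are hypothetical, and the numbers are copied from the fabricated goldfish example.

```python
# Coefficient and p-value per variable, one entry per model.
# None marks a variable that was left out of that model.
results = {
    "Bowl cleaned >=1 per week": [(25.12, 0.005), (21.23, 0.008), (15.11, 0.001)],
    "Food per day (grams)": [(1.378, 0.034), (1.455, 0.044), (1.144, 0.021)],
    "Water temperature (F)": [None, None, (-0.15, 0.14)],
}


def format_cell(entry):
    """Render one (coefficient, p-value) pair, or a blank for an omitted variable."""
    if entry is None:
        return ""
    coef, pval = entry
    return f"{coef:g} ({pval:g})"


def build_table(results, n_models):
    """Return an aligned plain-text table with one column per model."""
    header = ["Independent variable"] + [f"Model {i + 1}" for i in range(n_models)]
    rows = [header]
    for name, entries in results.items():
        rows.append([name] + [format_cell(e) for e in entries])
    # Pad every column to the width of its longest cell.
    widths = [max(len(row[c]) for row in rows) for c in range(n_models + 1)]
    lines = ["  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
             for row in rows]
    return "\n".join(lines)


print(build_table(results, 3))
```

Whatever tool you use, the table still needs a note stating the sample size, what the numbers in parentheses are, and the type of standard errors.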

h) Describe your results. Obviously, you should highlight the signs and significance levels of your key variables, and any interesting findings for the control variables. How do the findings change based on specification of the model? Are the results consistent with your initial guess?

i) Propose (at least one) interaction that includes your key independent variable. Why do you think this interaction is potentially important? You should present these results in a table, as well. Be sure to interpret these results carefully.
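
To make the interpretation concrete, here is a sketch of an interaction in the goldfish setting. The choice of water temperature as the interacting variable is a hypothetical made here for illustration, as are the symbols.

```latex
\begin{align*}
\text{lifespan}_i &= \beta_0 + \beta_1\,\text{clean}_i + \beta_2\,\text{temp}_i
  + \beta_3\,(\text{clean}_i \times \text{temp}_i) + u_i \\
E[\text{lifespan} \mid \text{clean}=1,\ \text{temp}] - E[\text{lifespan} \mid \text{clean}=0,\ \text{temp}]
  &= \beta_1 + \beta_3\,\text{temp}
\end{align*}
```

In this form \beta_1 alone is the effect of cleaning when temp equals zero, which may lie well outside the data, so the effect should be evaluated at meaningful temperatures (for example, the sample mean) rather than read directly off \beta_1.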

j) Conclude your paper by summing up your major findings. Suggest related questions you'd be interested in and ways you would want to try to investigate the ideas here more thoroughly.
