Lab: Reducing Crime

Introduction - Your team has been hired to provide research for a political campaign. They have obtained a dataset of crime statistics for a selection of counties in North Carolina.

Your task is to examine the data to help the campaign understand the determinants of crime and to generate policy suggestions that are applicable to local government.

You may work in a team of up to 3 students. This is not a requirement, but we strongly encourage you to form a group and believe it will add considerable value to the exercise.

When working in a group, do not use a "division-of-labor" approach to complete the lab. All students should participate in all aspects of the final report.

Timeline - The lab takes place over three weeks, with a deliverable due each week.

Stage 1: Draft Report. You will create an intermediary report focused on model building but without statistical inference (no standard errors).

Stage 2: Peer Feedback. Teams will exchange reports and provide each other with feedback.

Stage 3: Final Report. You will create a final report, which includes a complete assessment of the classical linear model assumptions, standard errors, and other elements of statistical inference.

The Data - The data is provided in a file, crime.csv. It was first used in a study by Cornwell and Trumbull, researchers from the University of Georgia and West Virginia University.

Stage 1: Draft Report -

In the first stage of the project, you will create a draft report that addresses the concerns of the political campaign. Your report will include a model building process, culminating in a well-formatted regression table that displays a minimum of three model specifications. Your draft report will be very similar in structure to your final report, but it won't include standard errors or a full assessment of the classical linear model assumptions, which we will cover in units 12 and 13.

Here are some things to keep in mind during your model building process:

1. What do you want to measure? Make sure you identify variables that will be relevant to the concerns of the political campaign.

2. What transformations should you apply to each variable? This is very important because transformations can reveal linearities in the data, make our results relevant, or help us meet model assumptions.

3. Are your choices supported by EDA? You will likely start with some general EDA to detect anomalies (missing values, top-coded variables, etc.). From then on, your EDA should be interspersed with your model building. Use visual tools to guide your decisions.

4. What covariates help you identify a causal effect? What covariates are problematic, either due to multicollinearity, or because they will absorb some of a causal effect you want to measure?
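The anomaly checks and transformations mentioned in the list above can be sketched in a few lines. This is a minimal illustration, not a prescribed workflow: the rows are made up, and the column names (`crime_rate`, `avg_wage`) are hypothetical stand-ins, not the actual columns of crime.csv.

```python
import csv
import io
import math

# Made-up rows standing in for crime.csv; column names are hypothetical.
SAMPLE = """county,crime_rate,avg_wage
1,0.035,310.5
2,0.015,
3,0.060,500.0
4,0.020,500.0
5,0.041,500.0
"""

def load_rows(text):
    """Parse CSV text into a list of dicts, storing blank cells as None."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({k: (float(v) if v.strip() else None) for k, v in raw.items()})
    return rows

def missing_counts(rows):
    """Count missing (None) entries per column."""
    return {col: sum(r[col] is None for r in rows) for col in rows[0]}

def share_at_max(rows, col):
    """Share of non-missing values equal to the column maximum.
    A large share piled up at the maximum can hint at top-coding."""
    vals = [r[col] for r in rows if r[col] is not None]
    top = max(vals)
    return sum(v == top for v in vals) / len(vals)

def log_transform(rows, col):
    """Natural log of a strictly positive column; missing values stay None."""
    return [math.log(r[col]) if r[col] is not None else None for r in rows]

rows = load_rows(SAMPLE)
print(missing_counts(rows))
print(share_at_max(rows, "avg_wage"))
```

A high share of observations sitting exactly at the maximum is only a hint of top-coding; a histogram and the variable's documentation should confirm it before you act on it.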

At the same time, remember that you are not trying to create one perfect model. You will create several specifications to give the reader a sense of how robust your results are (that is, how sensitive they are to modeling choices) and to show that you are not cherry-picking the specification that leads to the largest effects.

At a minimum, you should include the following three specifications:

  • One model with only the explanatory variables of key interest (possibly transformed, as determined by your EDA), and no other covariates.
  • One model that includes key explanatory variables and only covariates that you believe increase the accuracy of your results without introducing substantial bias (for example, you should not include outcome variables that will absorb some of the causal effect you are interested in). This model should strike a balance between accuracy and parsimony and reflect your best understanding of the determinants of crime.
  • One model that includes the previous covariates, and most, if not all, other covariates. A key purpose of this model is to demonstrate the robustness of your results to model specification.
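To make the three-specification idea concrete, here is a small sketch that fits nested models by ordinary least squares and compares the coefficient on the key variable across them. Everything in it is an assumption for illustration: the data are made up, the names `key_var`, `covariate` and `extra` are hypothetical, and the hand-rolled `ols` helper stands in for whatever regression routine you actually use.

```python
def ols(y, columns):
    """Least-squares coefficients (intercept first) via the normal equations."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for p in range(k):
        pivot = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[pivot] = A[pivot], A[p]
        for r in range(k):
            if r != p:
                f = A[r][p] / A[p][p]
                A[r] = [a - f * b for a, b in zip(A[r], A[p])]
    return [A[r][k] / A[r][r] for r in range(k)]

# Made-up data; the variable names are hypothetical, not columns of crime.csv.
key_var = [1, 2, 3, 4, 5, 6]
covariate = [2, 1, 4, 3, 6, 5]
extra = [1, 0, 0, 1, 1, 0]
# The outcome is built without noise, so the true coefficients are known: 1, 2 and 3.
outcome = [1 + 2 * k + 3 * c for k, c in zip(key_var, covariate)]

spec1 = ols(outcome, [key_var])                    # key variable only
spec2 = ols(outcome, [key_var, covariate])         # plus a chosen covariate
spec3 = ols(outcome, [key_var, covariate, extra])  # plus everything else

for name, coefs in [("spec1", spec1), ("spec2", spec2), ("spec3", spec3)]:
    print(name, [round(b, 3) for b in coefs])
```

In this toy setup the coefficient on `key_var` is overstated in the first specification because `covariate` is left out, and it settles at its true value once the covariate enters. That kind of movement across columns is what the reader of a regression table looks for.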

Guided by your background knowledge and your EDA, other specifications may make sense. You are trying to choose points that encircle the space of reasonable modeling choices, to give an overall understanding of how these choices impact results.

You will display all of your model specifications in a regression table, using a package like stargazer to format your output. It should be easy for the reader to find the coefficients that represent key effects near the top of the regression table, and scan horizontally to see how they change from specification to specification. Since we won't cover inference for linear regression until unit 12, you should not display any standard errors at this point. You should also avoid conducting statistical tests for now (but please do point out what tests you think would be valuable).
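The layout described here (key coefficients at the top, one column per specification, no standard errors) can be sketched as a plain-text formatter. This is an illustration of the layout only and not a substitute for a table package. The variable names are hypothetical and the coefficients are placeholder numbers, not estimates from crime.csv.

```python
def regression_table(specs, row_order):
    """Render coefficients side by side, one column per specification.
    Cells are left blank where a variable is not in that specification."""
    header = f"{'':<14}" + "".join(f"{'(' + str(i + 1) + ')':>10}" for i in range(len(specs)))
    lines = [header]
    for var in row_order:
        cells = "".join(
            f"{spec[var]:>10.3f}" if var in spec else f"{'':>10}" for spec in specs
        )
        lines.append(f"{var:<14}" + cells)
    return "\n".join(lines)

# Placeholder coefficients for three hypothetical specifications (not real estimates).
specs = [
    {"key_var": -0.512, "(Intercept)": 1.204},
    {"key_var": -0.431, "covariate": 0.220, "(Intercept)": 0.987},
    {"key_var": -0.425, "covariate": 0.198, "extra": 0.034, "(Intercept)": 0.950},
]
# Key effects go first so they are easy to find and to scan across columns.
row_order = ["key_var", "covariate", "extra", "(Intercept)"]

table = regression_table(specs, row_order)
print(table)
```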

After your model building process, you should include a substantial discussion of omitted variables. Identify what you think are the 5-10 most important omitted variables that bias results you care about. For each variable, you should estimate what direction the bias is in. If you can argue whether the bias is large or small, that is even better. State whether you have any variables available that may proxy (even imperfectly) for the omitted variable. Pay particular attention to whether each omitted variable bias is towards zero or away from zero. You will use this information to judge whether the effects you find are likely to be real, or whether they might be entirely an artifact of omitted variable bias.
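For reasoning about the direction of bias, the standard one-regressor case is a useful reference. The notation below is introduced here for illustration and is not from the assignment: x_1 is an included variable, x_2 is the omitted variable, and the tilde marks the slope estimated when x_2 is left out.

```latex
\begin{align*}
y &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u && \text{(true model)} \\
x_2 &= \delta_0 + \delta_1 x_1 + v && \text{(omitted variable regressed on the included one)} \\
E[\tilde{\beta}_1] &= \beta_1 + \beta_2 \delta_1 && \text{(slope on $x_1$ when $x_2$ is left out)}
\end{align*}
```

The bias is the product \beta_2 \delta_1. It is positive when the two terms share a sign and negative when they differ. The bias is away from zero when it has the same sign as \beta_1, and towards zero when it has the opposite sign and is smaller in magnitude than \beta_1. With several included regressors the exact expression is more involved, so treat this as a guide to direction, not a precise calculation.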

Do not use techniques that are not covered in this course, unless you have obtained prior approval.

Stage 2: Peer Feedback -

In Stage 2, you will provide feedback on another team's draft report. We will ask you to comment separately on different sections. The following list is very similar to the rubric we will use when grading your final report.

1. Introduction. As you understand it, what is the motivation for this team's report? Does the introduction as written make the motivation easy to understand? Is the analysis well-motivated? Note that we're not necessarily expecting a long introduction. Even a single paragraph is probably enough for most reports.

2. The Initial EDA. Is the EDA presented in a systematic and transparent way? Did the team notice any anomalous values? Is there sufficient justification for any data points that are removed? Did the report note any coding features that affect the meaning of variables (e.g. top-coding or bottom-coding)? Can you identify anything the team could do to improve its understanding or treatment of the data?

3. The Model Building Process. Overall, is each step in the model building process supported by EDA? Is the outcome variable (or variables) appropriate? Did the team consider available variable transformations and select them with an eye towards model plausibility and interpretability? Are transformations used to expose linear relationships in scatterplots? Is there enough explanation in the text to understand the meaning of each visualization?

4. The Regression Table. Are the model specifications properly chosen to outline the boundary of reasonable choices? Is it easy to find key coefficients in the regression table? Does the text include a discussion of practical significance for key effects?

5. The Omitted Variables Discussion. Did the report miss any important sources of omitted variable bias? For each omitted variable, is there a complete discussion of the direction of bias? Are the estimated directions of bias correct? Does the team consider possible proxy variables, and if so do you find these choices plausible? Is the discussion of omitted variables linked back to the presentation of main results? In other words, does the team adequately re-evaluate their estimated effects in light of the sources of bias?

6. Conclusion. Does the conclusion address the big-picture concerns that would be at the center of a political campaign? Does it raise interesting points beyond numerical estimates? Does it place relevant context around the results?

7. Throughout the report, do you find any errors, faulty logic, unclear or unpersuasive writing, or other elements that leave you less convinced by the conclusions?

Please be thorough and read the report critically, actively trying to find weaknesses. Your comments will directly help your peers get the most value out of the project.

Stage 3: Final Report -

In the final stage of the project, you will incorporate the feedback you receive, and use what you've learned about OLS inference to create a final report.

One of the most important tasks at this stage is to add valid standard errors to your regression table.
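As a rough illustration of why the choice of standard error matters, the sketch below computes the slope of a simple regression with both the classical standard error and a heteroskedasticity-robust (White, HC0) one. Reading "valid" as "heteroskedasticity-robust" is an assumption made here, not something the assignment states, and the four data points are made up.

```python
import math

def simple_ols_with_se(x, y):
    """Slope of y on x with classical and HC0 robust standard errors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Classical: assumes constant error variance, estimated by SSR / (n - 2).
    classical = math.sqrt(sum(e ** 2 for e in resid) / (n - 2) / sxx)
    # HC0 (White): weights each squared residual by its squared deviation in x.
    robust = math.sqrt(
        sum(((xi - x_bar) ** 2) * e ** 2 for xi, e in zip(x, resid)) / sxx ** 2
    )
    return slope, classical, robust

# Made-up data, for illustration only.
slope, classical, robust = simple_ols_with_se([0, 1, 2, 3], [0, 2, 1, 3])
print(slope, classical, robust)
```

The two standard errors differ even on this tiny sample, which is why the table should state which kind it reports.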

In a new section of the report, please choose one of your most important model specifications, and present a detailed assessment of all 6 classical linear model assumptions. Use plots and other diagnostic tools to assess whether the assumptions appear to be violated, and follow best practices in responding to any violations you find. Note that we only want to see this level of detail for one model specification.

For the other specifications, you should also conduct a full assessment of the CLM assumptions, but highlight in your text only the major surprises you notice.

Note that you may need to change your model specifications in response to violations of the CLM assumptions. At this point, you should also consider whether changes are appropriate to decrease standard errors for your estimates. These decisions involve tradeoffs, and you should strive to be transparent about them in your report.

Note also that you may need to adjust your conclusions in response to statistical significance. Make sure that you discuss both statistical and practical significance for your key effects of interest.

You may want to include statistical tests besides the standard t-tests for regression coefficients.

We will assess your final report using a rubric that includes the elements listed above. We will also consider whether you have correctly included elements of statistical inference in your report. In particular, we will look to see whether you have correctly assessed the CLM assumptions and whether you have responded appropriately to any violations.
