For this assignment you will download a data set from the


1. Constructing a Data Set

For this assignment, you will download a data set from the University of Michigan Panel Survey Of Income Dynamics. Note that during this process you will need to create an account. To access the PSID, go to https://psidonline.isr.umich.edu/. (If you are using Windows Explorer, make sure you are compatibility mode.) From the home page you will select Data, and then Data Center under the section labeled Public Use. Once you get to the Data Center, select File. From this page, expand PSID Family- Level and PSID Main Family Data. Then expand the 2005 survey from which you will select your variables.  Following the instructions given on the screen, select the following variables.

ER25017                                 AGE OF HEAD

ER25018                                 SEX OF HEAD

ER25020                                 # CHILDREN IN FU

ER25023                                 HEAD MARITAL STATUS

ER25128                                 BC21 MAIN IND FOR JOB 1: 2000 CODE (HD)

ER25160                                 BC41 YRS PRES EMP (H-E)

ER25161                                 BC41 MOS PRES EMP (H-E)

ER25910                                 G13 WAGES/SALARY OF HEAD

ER27393                                 L40 RACE OF HEAD-MENTION 1

ER27418                                 L55 HGHST COLLEGE DEGREE RECD-HD

ER27931                                 LABOR INCOME OF HEAD-2004

ER28003                                 HEAD WAGE RATE-2006

ER28047                                 COMPLETED ED-HD

After you add the data to your cart, go to your cart and complete the check-out process.  Select PDF for the Code Book type and Microsoft Excel Spreadsheet for the data output type.  Save both in a safe place.

The first thing you must do after downloading your data is to clean it. It will be easiest to do this by using the filter feature in Excel and deleting rows as appropriate. A separate handout will be posted for those of you who do not know how to use this.

a. We need to eliminate from our sample each household for which we are missing data. Typically, this will be a case where a respondent has refused to answer or does not know the answer. Eliminate observations as indicated in the table below:

Variable

Eliminate Observations with Code

AGE OF HEAD

999

HEAD MARITAL STATUS

8, 9

BC21 MAIN IND FOR JOB 1: 2000 CODE (HD)

0, 999

BC41 YRS PRES EMP (H-E)

98, 99*

BC41 MOS PRES EMP (H-E)

98, 99*

G13 WAGES/SALARY OF HEAD

0,  9,999,998,  9,999,999

L40 RACE OF HEAD-MENTION 1

0, 9

L55 HGHST COLLEGE DEGREE RECD-HD

8, 97, 98, 99

LABOR INCOME OF HEAD-2004

0

HEAD WAGE RATE-2004

0

COMPLETED ED-HD

99

*For these two variables you will need to filter twice so as to delete all observations for which both variables take the value of zero. These will be the people we will consider to have no time on their current job and hence not working. Note here that for the sake of simplicity we are assuming away some of the complexity of the actual data set. It is not strictly correct to eliminate some of these people from our data set.  Doing so will not significantly affect our results though.

Once you correctly complete this process you will have 4,256 observations. Before you do anything else, save your data set with a unique name. You should always save this version as it is.  Save the version you create in Part b under a different name.

b. A number of these variables are categorical variables. Convert the following into dummy variables as instructed:

i.  "Head Marital Status" into a single binary variable of married = 1 for married and zero otherwise.  Not married is the benchmark.

ii. "Race of Head-Mention 1" into a single binary variable of white or non-white with white as the benchmark.  White=1 if race is white and zero otherwise.

iii. "Sex of Head" into a single binary variable of female=1 for female and zero for male. In other words male is the benchmark.

iii. "Highest College Degree Recd-HD" into dummy variables for Associate Degree, Bachelor's Degree and Graduate Degree using no degree (code 0) as your benchmark.

iv. Convert the categorical variable INDUSTRY into a series of dummy variables as indicated in the table below.

Dummy Variable Category

Variable Name

Codes in Original Data

Resources, Utilities and Construction

RUC

17-77

Manufacturing

MFCTR

107-399

Wholesale and Retail Trade, Transport and Warehousing

TRDTRSP

407-639

Information, Finance and Real Estate

INFR

647-719

Professional Services

PFSRVCS

727-847

Hospitality and Entertainment

HSPENT

856-869

Other Services

OSRVCS

877-929

Public Admin and Military

PUBADMIN

937-987

Note that for convenience you are creating a dummy variable for every category. Clearly, you can't include all of them in a regression as they will then be perfectly multi-collinear. Whichever variable we leave out of a regression will be the benchmark for that particular regression.

c. You will also need to combine years and months of employment. To do this, you will need to divide months of employment by 12 and add the resulting decimal number to years of employment. You can complete this step either in Excel or in Eviews. Note that PSID also includes length of employment in days. To get an accurate number for length of employment we would combine years, months and days. For simplicity, we are only using years and months. Label this new variable "Tenure"

d. Name your variables in a logical and intuitive manner (you are welcome to use my names, but are not required to) and compute summary statistics for each. Report only the mean, median and standard deviation for each variable. Report the sample size for your overall data set. For each of your dummy variable, interpret the mean. What is it telling you?  The table below provides a check value for the mean some variables.

Variable

Mean

Variable

Mean

Variable

Mean

AGE

41.73167

INFR

 

WHITE

 

FEMALE

 

PFSRVCS

 

ASSOC

 

CHILDREN

 

HSPENT

 

BACHELORS

 

MARRIED

 

OSRVCS

 

GRADUATE

 

RUC

 

PUBADMIN

 

LABOR INCOME OF HD

44,256.72

MFCTR

 

TENURE

8.2883

WAGE_RATE

 

TRDTRSP

0.20113

WAGES

 

YEARS_EDUCATION

13.32

Save all three versions of your data set. You should have the original data from PSID, the data set that you have cleaned up in part a, and the data set with all the dummy variables as separate files.

Solution Preview :

Prepared by a verified Expert
Macroeconomics: For this assignment you will download a data set from the
Reference No:- TGS01142428

Now Priced at $50 (50% Discount)

Recommended (96%)

Rated (4.8/5)