Cmth 642 - Calculate the average amount of iron by high and low protein


Advanced Methods Assignment

1. Read the CSV files in the folder.

2. Merge the data frames using the variable "ID". Name the Merged Data Frame "USDA".

3. Prepare the dataset for analysis.

4. Remove records with missing values in 4 or more vectors.

5. How many records remain in the data frame?

6. For records with missing values for Sugar, Vitamin E and Vitamin D, replace the missing values with the mean value of the respective vector.

7. With a single line of code, remove all remaining records with missing values. Name the new Data Frame "USDAclean".

8. How many records remain in the data frame?

9. Which food has the highest sodium level?

10. Create a scatter plot using Protein and Fat, with the plot title "Fat vs Protein", labeling the axes "Fat" and "Protein", and making the data points red.

11. Create a histogram of Vitamin C distribution in foods, with a limit of 0 to 100 on the x-axis and breaks of 100.

12. Add a new variable to the data frame that takes value 1 if the food has higher sodium than average, 0 otherwise. Call this variable HighSodium.

13. Do the same for HighCalories, HighProtein, HighSugar, and HighFat.

14. How many foods have both high sodium and high fat?

15. Calculate the average amount of iron by high and low protein (i.e. average amount of iron in foods with high protein and average amount of iron in foods with low protein).
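
The logic behind steps 2, 4 to 9 and 12 to 15 can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: it uses plain Python (standard library only) rather than the data-frame tooling the assignment appears to expect, the two small tables and their values are invented toy data and not the assignment's CSV files, and every helper name (merge_on_id, drop_sparse, impute_mean, add_high_flag, mean_by_flag) is hypothetical. Missing values are represented as None.

```python
from statistics import mean

# Invented toy tables standing in for the assignment's CSV files (assumption).
macros = [
    {"ID": 1, "Description": "food_a", "Calories": 100, "Protein": 10.0, "Fat": 5.0},
    {"ID": 2, "Description": "food_b", "Calories": 300, "Protein": 2.0, "Fat": 20.0},
    {"ID": 3, "Description": "food_c", "Calories": None, "Protein": None, "Fat": None},
    {"ID": 4, "Description": "food_d", "Calories": 200, "Protein": 6.0, "Fat": None},
]
micros = [
    {"ID": 1, "Sodium": 50, "Sugar": None, "Iron": 1.0},
    {"ID": 2, "Sodium": 400, "Sugar": 4.0, "Iron": 3.0},
    {"ID": 3, "Sodium": None, "Sugar": 1.0, "Iron": 2.0},
    {"ID": 4, "Sodium": 100, "Sugar": 2.0, "Iron": 0.5},
]

def merge_on_id(left, right):
    """Step 2: inner join of two record lists on the "ID" key."""
    right_by_id = {r["ID"]: r for r in right}
    return [{**l, **right_by_id[l["ID"]]} for l in left if l["ID"] in right_by_id]

def count_missing(record):
    """Number of fields in a record that are None."""
    return sum(value is None for value in record.values())

def drop_sparse(records, threshold=4):
    """Step 4: drop records with `threshold` or more missing fields."""
    return [r for r in records if count_missing(r) < threshold]

def impute_mean(records, columns):
    """Step 6: replace None in `columns` with that column's observed mean."""
    filled = [dict(r) for r in records]  # copy so the input is not mutated
    for col in columns:
        observed = [r[col] for r in filled if r[col] is not None]
        if not observed:
            continue  # nothing to average, leave the gaps in place
        col_mean = mean(observed)
        for r in filled:
            if r[col] is None:
                r[col] = col_mean
    return filled

def add_high_flag(records, column, flag):
    """Steps 12-13: flag is 1 if the value is above the column average, else 0."""
    avg = mean([r[column] for r in records])
    return [{**r, flag: int(r[column] > avg)} for r in records]

def mean_by_flag(records, value, flag):
    """Step 15: average of `value` within each level of a 0/1 flag."""
    groups = {}
    for r in records:
        groups.setdefault(r[flag], []).append(r[value])
    return {level: mean(values) for level, values in groups.items()}

usda = merge_on_id(macros, micros)                              # step 2
usda_kept = drop_sparse(usda)                                   # steps 4-5
usda_filled = impute_mean(usda_kept, ["Sugar"])                 # step 6
usda_clean = [r for r in usda_filled if count_missing(r) == 0]  # steps 7-8

saltiest = max(usda_clean, key=lambda r: r["Sodium"])           # step 9

flagged = usda_clean
for col in ["Sodium", "Calories", "Protein", "Sugar", "Fat"]:   # steps 12-13
    flagged = add_high_flag(flagged, col, "High" + col)

both_high = sum(r["HighSodium"] == 1 and r["HighFat"] == 1 for r in flagged)  # step 14
iron_by_protein = mean_by_flag(flagged, "Iron", "HighProtein")                # step 15

print(len(usda_kept), len(usda_clean), saltiest["Description"])
print(both_high, iron_by_protein)
```

Note that in this sketch the column averages used for imputation and for the High flags are computed after the earlier filtering steps, so the order of operations affects the result; the real data would also need Vitamin E and Vitamin D columns in the imputation step.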
