Why are robust statistics such as the median or IQR


1. Why are robust statistics, such as the median or IQR, important in the analysis of modern data sets? Give a reason why (no need to give numeric values that explain what the median and MAD are).

2. If a random variable has a normal distribution with mean of 90 and standard deviation of 30 units, what is the probability that the variable:

(a) has a value less than 75?

(b) has a value greater than 120?
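One way to check both tail probabilities in question 2 is to standardize each value, z = (x - mean) / sd, and evaluate the normal cumulative distribution. The sketch below uses only the Python standard library; the helper name normal_cdf is an illustrative choice, not something from the question.

```python
import math

def normal_cdf(x, mean, sd):
    """P(X <= x) for X ~ Normal(mean, sd), computed via the error function."""
    z = (x - mean) / sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mean, sd = 90.0, 30.0

# (a) P(X < 75): z = (75 - 90) / 30 = -0.5
p_below_75 = normal_cdf(75.0, mean, sd)

# (b) P(X > 120): z = (120 - 90) / 30 = +1.0, so take the upper tail
p_above_120 = 1.0 - normal_cdf(120.0, mean, sd)

print(round(p_below_75, 4), round(p_above_120, 4))  # 0.3085 0.1587
```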

3. Why would a resolution III design ever be considered for experimentation, especially considering the high level of confounding that occurs with these designs? In your answer, also explain what a resolution III design means.

4. If you are a new employee at a company, e.g. a petrochemical corporation, give some characteristic features that will make you realize an EVOP strategy is being applied when you are looking at the company's historical data.

5. You will hear about 6-sigma processes frequently in your career. What does it mean exactly that a process is "6-sigma capable"? Draw a diagram to help illustrate your answer.

6. For any least squares model, does a low value of the correlation coefficient, r, imply that the input and output variables are unrelated? Explain.
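A small numerical illustration can help with question 6. The sketch below uses made-up data in which y is determined exactly by x through a symmetric quadratic, yet the linear correlation coefficient is zero. The data and the helper name pearson_r are assumptions chosen for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y depends on x exactly, but not linearly
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [xi ** 2 for xi in x]

r = pearson_r(x, y)
print(r)  # 0.0, even though y is a deterministic function of x
```

Here r measures only the strength of the linear relationship, so a low value by itself does not rule out a strong nonlinear one.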

7. Describe why a box plot is an effective univariate summary. Note: do not explain how the box plot is calculated; rather explain how you use it.

8. An exponentially weighted moving average (EWMA) chart allows one to develop a monitoring chart with either Shewhart chart characteristics, or CUSUM (cumulative sum) characteristics.


(a) In which general situation(s) would a more CUSUM-like behaviour be important to a monitoring system?

(b) Now describe a specific example to illustrate your prior answer.

(c) How would you change your EWMA chart to exhibit more CUSUM-like behaviour?
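The trade-off in question 8 can be explored with a minimal sketch of the usual EWMA recursion, z_t = lam * x_t + (1 - lam) * z_(t-1). With lam = 1 the statistic equals the latest observation, as on a Shewhart chart; as lam approaches 0 the statistic carries a long memory of past data, which is the CUSUM-like end of the range. The step-change data and the function name ewma are assumptions for illustration, and control limits are omitted.

```python
def ewma(values, lam, start):
    """Return the EWMA sequence z_t = lam * x_t + (1 - lam) * z_(t-1)."""
    z = start
    out = []
    for x in values:
        z = lam * x + (1.0 - lam) * z
        out.append(z)
    return out

# Hypothetical signal: on target for 5 samples, then a sustained shift of +1
data = [0.0] * 5 + [1.0] * 5

shewhart_like = ewma(data, lam=1.0, start=0.0)  # no memory: tracks raw data
cusum_like = ewma(data, lam=0.1, start=0.0)     # long memory: drifts slowly

print(shewhart_like[-1], round(cusum_like[-1], 5))  # 1.0 0.40951
```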

9. A method of fitting a least squares model, LTS, Least Trimmed Squares, takes the full set of n data points and trims out (and totally ignores) a subset of the outlier points so that they do not influence the objective function. This is done as a way to get robustness to outliers.

(a) Write out the regular least squares objective function.

(b) Draw an example to show how a robust least squares model would be beneficial.

(c) Describe an alternative modification to the objective function which would also be robust to outliers.
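To make the contrast in question 9 concrete, the sketch below compares the ordinary sum of squared residuals with a trimmed version that keeps only the h smallest squared residuals, which is the quantity LTS minimizes. The residual values, the choice h = 4, and the function names are hypothetical. The sketch evaluates the two objectives for a fixed set of residuals and does not fit a model.

```python
def sum_of_squares(residuals):
    """Ordinary least squares objective: sum of all squared residuals."""
    return sum(e * e for e in residuals)

def trimmed_sum_of_squares(residuals, h):
    """LTS-style objective: sum of only the h smallest squared residuals."""
    squared = sorted(e * e for e in residuals)
    return sum(squared[:h])

# Hypothetical residuals; the last point is a gross outlier
residuals = [0.5, -0.3, 0.2, -0.4, 8.0]

ols_objective = sum_of_squares(residuals)
lts_objective = trimmed_sum_of_squares(residuals, h=4)

print(round(ols_objective, 2), round(lts_objective, 2))  # 64.54 0.54
```

The single outlier dominates the ordinary objective but contributes nothing to the trimmed one.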

10. Name a reason why a company (or yourself) would run a set of saturated fractional factorials.

11. Why is the principle of minimizing "data ink" so important in an effective visualization? Give an engineering example of why this is important.

12. Why are latent variable methods effective for dealing with modern data sets? Your answer must also clearly describe the problem faced with these modern data sets.

13. Explain the intention of blocking in experimental designs.

14. You have two production lines in your company, producing the same product, which is sold to the same customers. Production line TL-419 has a Cpk = 0.90 and line TL-417 has Cp = 1.2 (notice that one is Cpk and the other is Cp).

1. When should one use Cpk and when should one use Cp to assess the process capability? [2]

2. Write a few bullet points to your manager to explain which production line should receive most of the $200,000 annual budget for process improvements.
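For reference when comparing the two indices in question 14, the sketch below uses the standard definitions Cp = (USL - LSL) / (6 * sd) and Cpk = min(USL - mean, mean - LSL) / (3 * sd). The specification limits, mean, and standard deviation are hypothetical numbers chosen for illustration; they are not data for lines TL-417 or TL-419.

```python
def cp(lsl, usl, sd):
    """Cp: specification width relative to the 6-sigma process spread."""
    return (usl - lsl) / (6.0 * sd)

def cpk(lsl, usl, mean, sd):
    """Cpk: distance from the mean to the nearest limit, in 3-sigma units."""
    return min(usl - mean, mean - lsl) / (3.0 * sd)

# Hypothetical process: limits 94 to 106, standard deviation 2
lsl, usl, sd = 94.0, 106.0, 2.0

centred_cp = cp(lsl, usl, sd)                    # 1.0
centred_cpk = cpk(lsl, usl, mean=100.0, sd=sd)   # 1.0 when the mean is centred
shifted_cpk = cpk(lsl, usl, mean=102.0, sd=sd)   # drops when the mean drifts

print(centred_cp, centred_cpk, round(shifted_cpk, 4))  # 1.0 1.0 0.6667
```

Cp ignores where the process mean sits, so it is unchanged by the shift, while Cpk falls as the mean moves toward one limit.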

15. Constraints only allow you to run 9 experiments. You must run two experiments per day to finish the experiments within 5 days. Each day there is a different crew of plant operators and staff; they are strongly expected to have an effect on the results.

Write out an experimental table that blocks for the effect of the operators. Your table must show the levels of the 4 factors and have an additional column that indicates which day the experiment should be run (1, 2, 3, 4 or 5). Give bullet point notes that outline the justification for your table.

Hint: blocking can be viewed as adding additional factor(s) to a fractional factorial, with the blocking levels given by the new factor(s).

16. Your new raw material supplier has a Cpk value of 1.2 for a critical quality variable, and your previous supplier's Cpk is 0.95. Your manager doesn't understand this terminology and wants to understand why you recommended the new supplier, even though their material is more expensive. Give a brief explanation, and an illustration (diagram) to help your manager.
