Manipulate the value field settings of the pivot table and


Problem 1) Our book defines a random variable (RV) as: A random variable is the numerical description of the outcome of an experiment. Formally, a random variable is a function that assigns a real number to each element of the sample set.  (Evans, p. 140). With this in mind, consider a query to a help desk that always triggers 3 related  activities--1) initially a phone discussion occurs regarding the nature of the problem, 2) a solution is generated to the problem by referring the problem to a specific subject matter expert (an SME), and 3) a review of the help query for quality assurance purposes.  There are 3 equally probable types of problem calls, 3 equally probable types of solutions, and 2 equally probable types of reviews.  They can be measured in terms of minutes that are consumed in the 3 parts of the process.  Also, 1), 2) and 3) above are independent events. Each of the possible values are shown below (obviously, this is a simplification of reality and many values would likely be possible).

Initial Contact (mins)

Resolution (mins.)

QA Review of event (mins.)

7

12

1

10.5

20.5

3

15.5

29

 

A RV is calculated that is the sum (time) of these 3 independent outcomes.

1) List the values of the RV in the row provided below.

2) Provide the probabilities of the RVs, as well as the expected value and the standard deviation.

Problem 2) Consider the 12 sample observations in the table below.  The sample data represents the opening of various accounts at a large American bank during a monthly period.  Each transaction records the account type, gender of the individual opening the account, the level of education of the individual, and the amount invested; for example, observation 4 opened an IRA, is female, has a college education, and invested 5000.  Create a pivot table with account in the columns area, education in the rows area, gender in the filter area, and investment in the values area.  If we consider this sample to be indicative of the population of all account openings in the month, we can make statements that relate to marginal and conditional probabilities.   For example, we can ask-- "Amongst Females that invested in CDs in our data, what is the probability of being College Educated?" This is equivalent to finding the P(Col|Female and CD).

1) Manipulate the Value Field Settings of the pivot table and use the pull down menus for filtering rows, columns, and filter areas to provide the conditional probability of P( HS|Female and CD ).

Problem 3) You travel with 2 business associates to Las Vegas for a professional conference on new fast-food franchise concepts.  Your success in business has brought you in contact with these individuals--an attorney and a CPA. They are what is referred to as "Smart Money": individuals that have special knowledge of a particular business and ready investment capital, in this case, to invest in fast-food franchise operations. After a number of meetings with franchisors pitching their concepts, your team settles on considering a very interesting idea for Indian Deli/Fast-Food. Upon your return to Williamsburg, VA. you contact your banker about the business trip. She is also intrigued by the idea, but acknowledges that the concept is best suited for large urban centers.  She calls a meeting with the group to discuss potential capital investment in the idea.  She directs several questions in advance to the group that she would like answered at the meeting to support the financing: (use the best available distribution from our readings)

1) The banker suggests that it is essential that at least 75% of the stores opened succeed 3 years of operation for the financing to be acceptable.  (Success is defined as a store is profitable and still open for business for 3 years.) You believe that the 3 year survival rate for a store is 0.50 and consider opening 16 test stores in the Charlotte, NC area.  What is the probability that between 12 and 16 stores will be successful after 3 years?

2) Related to question 1), what is the probability that fewer than 5 stores will be open after 3 years of operation?

3) The distribution of annual revenue is estimated to be between $2.5 M dollars and $6.5 M dollars.  Additionally, there is no way to know if any particular level of revenue is more or less likely than another. What is the probability of annual revenue being more than or equal to $3.0 M dollars.

4) In a discussion with the franchisor, he estimates that the average hourly number of customers arriving during the mid-day part of the day (12:00 a.m.--3:00 p.m.) is 220/hr.  What is probability of 300 or more arrivals occurring in the first hour of this day part (11:00-12:00)?

5) If the value of a customer purchase is estimated to be normally distributed with a mean of $4.50 and standard deviation of $3.50, what is the probability of a customer purchase being more than $8.00?

Problem 4) The manager of a project determined some time ago that the most critical task in the timely completion of a project is the use of a special instrument, a confusitron, that has an uncertain completion time (a random variable in terms of hours).  He asks you, a confusitron expert, to use your experience to specify a discrete triangular probability distribution of outcomes--most like time to complete, pessimistic time to complete, and optimistic time to complete.  Recently the project manager has learned that there is another equally critical task--framis-validation.  This task occurs immediately after the confusitron.  You are also a framis-validation expert.  The project manager asks you for a similar discrete triangular distribution of outcomes (see below). He then asks you to create a random sample of 100 observations from this distribution:

Confusitron Distribution of Hrs.

 

prob.

hrs.

optimistic

0.15

35

most likely

0.55

60

pessimistic

0.3

100

 

Framis-Validation Distribution of Hrs.

optimistic

0.15

40

most likely

0.45

50

pessimistic

0.40

105

1) Randomly sample 100 observations for both discrete triangular distributions of each task.  Create a RV that is the sum of the tasks for each observation.  Place the results in the designated area below.

2) Create a frequency distribution column graph of the 100 Sum observations below by determining the Sample Space for the RV; so, the bins for the column graph will be the unique sample space values of the graph. Make the first bin 0. (Hint: there should be 9 bin values, including 0)

3) What is the expected value of the Sum distribution? There are 2 theoretical ways to calculate it. Does it approximately match the average of your 100 observations (as it should)?

Problem 5) C-TPAT is an anti-terrorism program operated by US Customs and Border Protection (USCBP).   You are manager of a small, but vitally important port in Freeport, TX.  The port has seen a substantial increase in container traffic in recent years. USCBP has asked you to react to a proposed schedule for container inspection. You are particularly interested in the additional effort the C-TPAT inspections will generate.   You would like to simulate an inspection schedule and associated time needed to perform the inspections for the coming year.  The inspection plan will randomly select days of operation for inspection using a Bernoulli distribution with a p=0.25 for a 365 day schedule of port operation; thus, approximately 91 days will be selected.  The average arrival rate per day varies very little and has an average arrival rate of 4.8 port calls per day.  A port call occurs when a ship requests entry into port, and arrivals can be modeled with a poisson distribution.  Each inspection requires a thorough inspection of the ships manifest and associated documents. C-TPAT has provided a range of time to perform this task-- a low of 150 minutes and a high of 320 minutes. When you ask how these times might be distributed, the C-TPA representative says that she does not believe that any time is any more likely than any other time.  The actual container inspection, according to the C-TPAT representative, is normally distributed with mean of 210 minutes with a standard deviation of 40 minutes. This distribution comes from a large empirical data base of inspection times, and it can be assumed that every container inspected on a particular day will have the same time for inspection (a simplifying assumption).

1) Use the random number generation capability of the Data Analysis Tools in Excel to create a list of selected days and port calls for inspection, for a 365 day year and the related time required for each selected inspection. Finally, the number of containers to be inspected is a RV that is uniformly distributed from 9 to 23.

2) What is the average theoretical time expected for manifest and container inspection for the year given the information above?

Hint: The process should be as follows--1) determine if a day has inspection, 2) determine port calls for inspected day, 3) determine time to inspect each manifest (assume same for all port calls in a day) for inspected day, 4) determine the number of containers per port call (assume same for all port calls in a day) for inspected day, 5) determine total time for all containers for all port calls on an inspection day., and 6) add the inspection of manifest and containers for total time for the year.

This is a BONUS problem, and it is not an easy problem, but if you follow each step in the table provided, you should be able to make some good progress. You can receive partial credit for a good attempt, and you will not be penalized if you do not attempt.

Problem 6) You purchase access to important census data related to US citizens. The data is related to weight of the adult male population. You are designing a new high-tech compact folding chair and you are interested in the design issues related to weight handling capacity of your chair design.  Currently, your chair can handle up to 200 lbs.

The census data shows that weight of adult males is normally distributed with a mean of 185 lbs. and a standard deviation of 40 lbs.  A local minimum security prison for adult male offenders is interested in purchasing your chair.  You believe the prison population to be indicative of the general US population of adult males.

1) Calculate the probability of a randomly selected male in the population weighing less than or equal to 150 lbs.

2) You tour the facility with the warden and he points to a group of 25 inmates.  He asks you--What do you think of our inmates? They are white-collar criminals and, actually, a very interesting group?  They have formed a haute cuisine club and prepare some incredible meals.  But, I think they are looking a bit plump.  What do you think?  You volunteer to take a sample (the 25 inmates) and measure their average weight to determine if they are representative of the male population average of 185 lbs.  Your sample average is 204 lbs.  Create a confidence interval for the sample mean and determine if there is a significant difference in the average weight of the inmates and the regular population average of 185. Use alpha = 0.05 for the CI.

Problem 7) The adjacent list is a sample of 40 customers that have purchased a product offering through our customer loyalty program in the last 30 days.  These customer are categorized as millennials.  Your Chief Marketing Officer (CMO) claims that for those actually making an expenditure, there is little difference in the millennials' and baby boomers' average expenditure.  The average expenditure for baby-boomers that actually make an expenditure is $78.95.

1) Create a 95% CI (confidence interval) for the sample average of expenditures for millennials who actually make an expenditure. (Careful to consider only the observations that made an expenditure)

2) Does the CI in 1) include the baby-boomer rate?  Is your CMO correct?

Problem 8) After a National Championship season (2013) the W&M Ultimate Mixed Martial Arts (UMMA) team trainers, Lupe-heavy weight division, Abe-welterweight division, and Gene-flyweight division, were celebrating at the Blue Talon Bistro in Williamsburg, VA.   The conversation started as pleasant chatter, but in minutes a roaring argument was blazing!  The headwaiter finally asked the trainers if they could be quiet or leave.  Calm returned to the table and the headwaiter asked what seemed to be the problem.  Gene said that the group was arguing if there was a significant difference of performance by the fighters in the 3 weight divisions.  The headwaiter, a retired data analytics professor at W&M, said: "I have a laptop, and Excel and Minitab.  Why don't we do a test of hypothesis that at least one of the weight divisions is better than the others over the entire 3 meets?"  Lupe had a thumb drive of the points scored by 24 fighters at 3 meets in 3 UMMA weight divisions.  Use the data provided to perform the test of hypothesis and use a level of significance of 0.05.  You may use Excel or Minitab to test the hypothesis. If you use Minitab copy the output to this sheet.

1) Write the Null and Alternative Hypotheses below.

2) Is there was a significant difference in performance (average points) by the fighters in the 3 weight divisions.  (Give me the value of a measure that you use to either reject the null hypothesis or not to reject the null hypothesis.)

Attachment:- Assignment File.rar

Solution Preview :

Prepared by a verified Expert
Applied Statistics: Manipulate the value field settings of the pivot table and
Reference No:- TGS02300325

Now Priced at $75 (50% Discount)

Recommended (93%)

Rated (4.5/5)