# Bottling Company Case Study

Question

• Assignment 1: Bottling Company Case Study

Imagine you are a manager at a major bottling company. Customers have begun to complain that the bottles of the brand of soda produced in your company contain less than the advertised sixteen (16) ounces of product. Your boss wants to solve the problem at hand and has asked you to investigate. You have your employees pull thirty (30) bottles off the line at random from all the shifts at the bottling plant. You ask your employees to measure the amount of soda there is in each bottle. Note: Use the data set provided by your instructor to complete this assignment.

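| Bottle Number | Ounces | Bottle Number | Ounces | Bottle Number | Ounces |
|---|---|---|---|---|---|
| 1 | 14.23 | 11 | 15.77 | 21 | 16.23 |
| 2 | 14.32 | 12 | 15.80 | 22 | 16.25 |
| 3 | 14.98 | 13 | 15.82 | 23 | 16.31 |
| 4 | 15.00 | 14 | 15.87 | 24 | 16.32 |
| 5 | 15.11 | 15 | 15.98 | 25 | 16.34 |
| 6 | 15.21 | 16 | 16.00 | 26 | 16.46 |
| 7 | 15.42 | 17 | 16.02 | 27 | 16.47 |
| 8 | 15.47 | 18 | 16.05 | 28 | 16.51 |
| 9 | 15.65 | 19 | 16.21 | 29 | 16.91 |
| 10 | 15.74 | 20 | 16.21 | 30 | 16.96 |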

Write a two to three (2-3) page report in which you:

1. Calculate the mean, median, and standard deviation for ounces in the bottles.
2. Construct a 95% Confidence Interval for the ounces in the bottles.
3. Conduct a hypothesis test to verify if the claim that a bottle contains less than sixteen (16) ounces is supported. Clearly state the logic of your test, the calculations, and the conclusion of your test.
4. Provide the following discussion based on the conclusion of your test:
   a. If you conclude that there are less than sixteen (16) ounces in a bottle of soda, speculate on three (3) possible causes. Next, suggest the strategies to avoid the deficit in the future.

   Or

   b. If you conclude that the claim of less soda per bottle is not supported or justified, provide a detailed explanation to your boss about the situation. Include your speculation on the reason(s) behind the claim, and recommend one (1) strategy geared toward mitigating this issue in the future.

• Be typed, double spaced, using Times New Roman font (size 12), with one-inch margins on all sides.  No citations and references are required, but if you use them, they must follow APA format. Check with your professor for any additional instructions.
• Include a cover page containing the title of the assignment, the student’s name, the professor’s name, the course title, and the date. The cover page and the reference page are not included in the required assignment page length.

The specific course learning outcomes associated with this assignment are:

• Calculate measurements of central tendency and dispersal.
• Determine confidence intervals for data.
• Describe the vocabulary and principles of hypothesis testing.
• Discuss application of course content to professional contexts.
• Use technological tools to solve problems in statistics.
• Write clearly and concisely about statistics using proper writing mechanics.

Sample paper

## Hypothesis Testing

The mean for the data is 15.854 ounces. The mean is a measure of central tendency: it locates the center of a distribution. The median is also a measure of central tendency; it is the middle value of the ordered data. When the data have no extreme outliers, the mean and median tend to be close, because outliers pull the mean but hardly affect the median. Here the mean is 15.854 and the median is 15.99, which suggests that the data do not contain significant outliers. On average, a bottle contained 15.854 ounces of soda, slightly short of the advertised 16 ounces. Half of the sampled bottles (15 of 30) contained less than 16 ounces. The standard deviation indicates how widely values are spread around the mean. For approximately normal data, nearly all values fall within three standard deviations of the mean. The sample standard deviation of 0.661 ounces indicates that most values fall close to the mean.
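
The summary statistics quoted above can be rechecked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative, and the standard deviation is assumed to be the sample version (n - 1 denominator), which is what reproduces the 0.661 figure.

```python
import statistics

# Ounces measured in the 30 sampled bottles (from the assignment's data set).
ounces = [
    14.23, 14.32, 14.98, 15.00, 15.11, 15.21, 15.42, 15.47, 15.65, 15.74,
    15.77, 15.80, 15.82, 15.87, 15.98, 16.00, 16.02, 16.05, 16.21, 16.21,
    16.23, 16.25, 16.31, 16.32, 16.34, 16.46, 16.47, 16.51, 16.91, 16.96,
]

mean_oz = statistics.mean(ounces)        # arithmetic mean
median_oz = statistics.median(ounces)    # average of the 15th and 16th values
sample_sd = statistics.stdev(ounces)     # sample standard deviation (n - 1)
below_16 = sum(1 for x in ounces if x < 16)  # bottles under the advertised fill

print(f"mean={mean_oz:.3f}, median={median_oz:.2f}, sd={sample_sd:.3f}, below 16 oz={below_16}")
```

Counting the bottles below 16 ounces alongside the mean helps separate two questions: whether the average fill is short, and how many individual bottles are short.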

The following are the hypotheses applied in analyzing the data.

H0. The mean volume of soda in the bottles is equal to 16 ounces.

H1. The mean volume of soda in the bottles is less than 16 ounces.

The 95 percent confidence interval gives a range of plausible values for the population mean. From the calculations, we can be 95 percent confident that the population mean falls between 15.6172 and 16.0907 ounces, an interval that includes 16. For the hypothesis test, the significance level is (1 - 0.95) = 0.05. Because the test is left-tailed, the critical value from the tables is z = -1.65. This means that 5% of the area under the curve falls to the left of the critical value while 95% falls to the right. The test value is $z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{15.854 - 16}{0.661/\sqrt{30}} \approx -1.21$. Since -1.21 is greater than -1.65, the test value does not fall within the critical region. As such, the null hypothesis is not rejected. At the 5 percent level, the sample does not provide sufficient evidence that the mean volume of soda in the bottles is less than 16 ounces, even though the sample mean is below 16 and individual bottles vary considerably.
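
The interval and the test statistic can be recomputed with a short Python sketch. It assumes a z-based (normal) procedure in which the sample standard deviation stands in for the population value, as the sample paper does; the variable names are illustrative. The decision rule is to reject H0 only when the test statistic falls below the left-tail critical value.

```python
import math
from statistics import NormalDist, mean, stdev

ounces = [
    14.23, 14.32, 14.98, 15.00, 15.11, 15.21, 15.42, 15.47, 15.65, 15.74,
    15.77, 15.80, 15.82, 15.87, 15.98, 16.00, 16.02, 16.05, 16.21, 16.21,
    16.23, 16.25, 16.31, 16.32, 16.34, 16.46, 16.47, 16.51, 16.91, 16.96,
]

n = len(ounces)
x_bar = mean(ounces)
std_err = stdev(ounces) / math.sqrt(n)   # standard error of the mean

# Two-sided 95% confidence interval for the population mean.
z_975 = NormalDist().inv_cdf(0.975)      # about 1.96
ci = (x_bar - z_975 * std_err, x_bar + z_975 * std_err)

# Left-tailed z test of H0: mu = 16 against H1: mu < 16 at alpha = 0.05.
z_stat = (x_bar - 16) / std_err
z_crit = NormalDist().inv_cdf(0.05)      # about -1.645
p_value = NormalDist().cdf(z_stat)       # area to the left of the test statistic
reject_h0 = z_stat < z_crit

print(f"CI=({ci[0]:.4f}, {ci[1]:.4f}), z={z_stat:.3f}, crit={z_crit:.3f}, p={p_value:.3f}, reject={reject_h0}")
```

With only 30 observations, a t-based procedure with 29 degrees of freedom is a common alternative; it gives a slightly wider interval than the z-based one sketched here.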

There are a number of possible explanations for the underfilled bottles behind the customer complaints. The first consideration is statistical error. A type I error occurs when the null hypothesis is rejected yet it is true (Bluman, 2008), and a conclusion drawn from a single sample of 30 bottles can be wrong because of miscalculation or sampling variation. Another possible cause is a defective or poorly calibrated filling machine that leads to excess or insufficient soda in individual bottles. A further cause could be malicious activity by staff who put fewer ounces in each bottle so that extra soda is available for sale, although illegally.

The company should conduct a thorough analysis of the machines used in filling the soda bottles to ensure that defective machines are replaced. Alternatively, more accurate calibration of the machines will help in standardizing the volume discharged per machine. There is also a need to investigate whether some employees could be intentionally manipulating the process so that extra soda bottles are produced for sale. This will help in shedding light on the source of the anomaly. It will be possible to fix the problem once the cause is identified.

### Reference

Bluman, A. G. (2008). Elementary statistics: A brief version. Dubuque: McGraw-Hill Companies.

# Correlation and Regression

Question

“Correlation and Regression” (Note: Please respond to the following two [2] items):

1. Debate the following statement: “Correlation means Causation.” Determine whether this statement is true or false, and provide reasoning for your determination, using the Possible Relationships Between Variables table from your textbook. Elementary Statistics: A Brief Version, 6th edition
Author: Allan Bluman
ISBN: 1259211274

2. Biddle and Hamermesh (1990) built a multiple regression model to study the tradeoff between time spent in sleeping and working and to look at other factors affecting sleep:
$\text{sleep} = \beta_0 + \beta_1\,\text{totwrk} + \beta_2\,\text{educ} + \beta_3\,\text{age} + \varepsilon$
where sleep and totwrk (total work) are measured in minutes per week and educ and age are measured in years. Suppose the following equation is estimated:
$\text{sleep} = 3500 - 0.15\,\text{totwrk} - 11.20\,\text{educ} + 2.29\,\text{age} + \varepsilon$

-Discuss what would happen to someone’s sleep if they choose to work more.
-Analyze whether the factors of totwrk, educ, and age are enough factors to explain the variation in sleep.
-Explain which additional factors should be explored in order to explain the variation in sleep.

Sample paper

## Correlation and Regression

The statement “correlation means causation” is not true. Correlation indicates how closely two variables are related to one another. It does not necessarily mean causation because, when two things are correlated, it is not always the case that one causes the other. According to Bluman (2008), the observed relationship between the variables may be the result of the influence of a third variable. Correlation may also arise from interrelationships among several variables, so two things might be correlated because of the influence of multiple variables. Another possible reason for correlation between variables is chance: the researcher may find a correlation between two variables, but the finding could be a coincidence.

If someone chose to work more, sleep would decline. In the estimated equation the coefficient on totwrk is -0.15, so each additional minute of work per week is associated with 0.15 fewer minutes of sleep per week, holding educ and age constant. For example, one extra hour of work per week predicts about 9 fewer minutes of sleep per week. The factors totwrk, educ, and age are not enough to explain the variation in sleep. This is because the model explains about 11 percent of the variation. Some additional factors should be explored to explain the variation in sleep. One of these factors is the number of children in the family. Another might be the stress levels of the individuals.
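
The effect of extra work can be illustrated by plugging values into the estimated equation from the question. This is a minimal sketch: the function name and the example person (2,400 minutes of work per week, 12 years of education, age 40) are hypothetical and chosen only for illustration.

```python
def predicted_sleep(totwrk, educ, age):
    """Predicted sleep in minutes per week from the estimated equation in the question."""
    return 3500 - 0.15 * totwrk - 11.20 * educ + 2.29 * age

# Hypothetical person: 2,400 minutes of work per week, 12 years of education, age 40.
base = predicted_sleep(totwrk=2400, educ=12, age=40)

# Same person working one extra hour (60 minutes) per week.
more_work = predicted_sleep(totwrk=2460, educ=12, age=40)

change = more_work - base  # equals -0.15 * 60, whatever the other inputs are
print(f"base={base:.1f} min/week, with extra hour={more_work:.1f} min/week, change={change:.1f}")
```

Because the model is linear, the change in predicted sleep depends only on the change in totwrk, not on the starting values of educ or age.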

### Reference

Bluman, A. G. (2008). Elementary statistics: A brief version. Dubuque: McGraw-Hill Companies.

# Rejecting and Accepting the Null

Question

Debate if “failing to reject the null” is the same as “accepting the null.” Support your position with examples of acceptance or rejection of the null. Next, give your opinion on whether or not a failed t test “proves” the null hypothesis.
Take a position on this statement: In setting up a hypothesis test, the claim should always be written in the alternative hypothesis. Provide one (1) example to support your position.

Sample paper

## Rejecting and Accepting the Null

The null hypothesis states that there is no significant difference in the measured phenomena. Failing to reject the null is not the same as accepting the null. Failing to reject the null indicates that the researcher did not find conclusive evidence that a significant difference exists between the particular populations. On the other hand, accepting the null hypothesis would indicate that the researcher found conclusive evidence that there is no significant difference in the measured phenomena. For example, the null hypothesis may read “undertaking tutorial classes has no effect on students’ performance”. The evidence gathered by the researcher may not be enough to decide whether the statement is true or not, in which case the researcher fails to reject the null. On the other hand, the evidence may indicate that there is no relationship between tutorial classes and students’ performance, in which case the null would be accepted.

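A failed t test does not prove the null hypothesis. It indicates only that the sample did not provide enough evidence of a significant difference between the means of the specified populations; a real difference may still exist that the test was unable to detect.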

In setting up a hypothesis test, the claim should be written in the alternative hypothesis. This is because the null hypothesis is a statement indicating that there is no true difference between the means of the populations being observed. If differences are observed, they could be due to sampling error. On the other hand, the alternative hypothesis indicates that there is a real difference between the means of the populations under observation. Below is an example.

Null hypothesis: Following a vegetarian diet has no effect on cholesterol levels among individuals.

Alternative hypothesis: Following a vegetarian diet reduces cholesterol levels among individuals.

# Normal Distribution

Question

Go to the baseball reference Website, located at http://www.baseball-reference.com/teams/, select a baseball team from the list of teams, and analyze the team’s historical win percentage.

From the e-Activity, discuss whether or not the winning history of the team you selected follows a normal distribution. Provide a rationale to support your response.

Sample paper

## Normal Distribution

I have selected the Baltimore Orioles baseball team. The normal distribution curve is bell-shaped. Many data attributes take this pattern when the sample size is large enough. In a normal distribution, the average represents the typical value while the deviations from it can be treated as errors. Small errors are more likely to occur than big errors. As such, most scores will likely fall close to the mean. This gives rise to the bell-shaped distribution curve. The winning history of the team appears to follow the normal distribution curve approximately. This is because most seasons' win percentages fall close to the mean of the distribution, while very high or very low win percentages are rare. The basic tenet of the normal distribution is that extreme observations are rare in the long run, with most observations falling close to the mean.

On plotting the distribution of the scores, I was able to obtain a bell-shaped curve, which suggests a normal distribution. I selected the team’s historical win percentage from 1917 to 2017. The mean for this data is 0.48503, while the standard deviation is 0.08713. Taken together with the bell-shaped plot, this suggests that the winning history of the team approximately follows the normal distribution.
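
A bell-shaped plot can be backed up with a simple numerical check against the empirical rule, under which roughly 68% of normally distributed values fall within one standard deviation of the mean and roughly 95% within two. The sketch below is an assumption-laden illustration: `share_within` is a hypothetical helper, and because the actual season-by-season win percentages are not reproduced here, it runs on synthetic values drawn from a normal distribution with the mean and standard deviation reported above.

```python
import random
from statistics import mean, stdev

def share_within(values, k):
    """Fraction of values lying within k sample standard deviations of the mean."""
    m, s = mean(values), stdev(values)
    return sum(1 for v in values if abs(v - m) <= k * s) / len(values)

# Synthetic stand-in data (NOT the team's real record): 101 seasons drawn from
# a normal distribution with the mean and standard deviation quoted in the text.
rng = random.Random(0)
synthetic_win_pct = [rng.gauss(0.48503, 0.08713) for _ in range(101)]

within_1sd = share_within(synthetic_win_pct, 1)
within_2sd = share_within(synthetic_win_pct, 2)
print(f"within 1 sd: {within_1sd:.0%}, within 2 sd: {within_2sd:.0%}")
```

To apply the check to the real data, `synthetic_win_pct` would be replaced with the team's actual win percentages; shares far from 68% and 95% would cast doubt on the normality claim.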

# Probability of Winning Arkansas Lottery Game

Question

From the e-Activity, determine the probability of winning your state’s (Arkansas) lottery game.  Provide a rationale to support your determination.

Sample paper

## Probability of Winning Arkansas Lottery Game

In the Arkansas Natural State Jackpot, players are given a set of numbers, 1 to 39, from which they choose 5 numbers. If all 5 numbers match those drawn by the Natural State Jackpot moderators, the player wins the jackpot prize. There are also smaller prizes for four and three matches. To determine the probability of winning the lottery game, it is important to establish the total number of ways a player can pick the lottery numbers. The number of ways to pick r items from a set of n items is given by $\binom{n}{r} = \frac{n!}{r!\,(n-r)!}$, where r represents the five winning numbers and n the total set of numbers.

The number of ways of picking 5 numbers from a set of 39 numbers is $\binom{39}{5} = \frac{39!}{5!\,34!} = 575{,}757$.

As such, the probability of winning the jackpot is $\frac{1}{575{,}757} \approx 0.000001737$.

The probability of matching 4 out of the 5 winning numbers is as follows:

There are 5 winning numbers, of which exactly 4 must be matched. The player must select 4 of the 5 winning numbers and 1 of the 34 non-winning numbers. The probability is thus given by:

$\frac{\binom{5}{4}\binom{34}{1}}{\binom{39}{5}} = \frac{5 \times 34}{575{,}757} = \frac{170}{575{,}757} \approx 0.0002953$
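
The counting argument above can be verified with Python's `math.comb`. This is a small sketch; `p_match` is a hypothetical helper name, and it assumes the game as described (5 numbers drawn from 39, with a ticket also holding 5 numbers).

```python
from math import comb

TOTAL_NUMBERS = 39   # numbers available to choose from
PICKED = 5           # numbers on a ticket, and numbers drawn

total_tickets = comb(TOTAL_NUMBERS, PICKED)  # all possible 5-number tickets

def p_match(k):
    """Probability that a ticket matches exactly k of the 5 winning numbers."""
    ways = comb(PICKED, k) * comb(TOTAL_NUMBERS - PICKED, PICKED - k)
    return ways / total_tickets

p_jackpot = p_match(5)   # all five numbers match
p_four = p_match(4)      # four winning numbers plus one non-winning number

print(f"tickets={total_tickets:,}, P(jackpot)={p_jackpot:.9f}, P(match 4)={p_four:.7f}")
```

The same helper gives the chance of matching exactly three numbers via `p_match(3)`, which corresponds to the smallest prize tier mentioned above.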

# Auto Insurance Costs

Question

Read the article titled, “Auto Insurance Costs: Where Does Your State Rank?” located at http://www.cbsnews.com/8301-505145_162-40542496/auto-insurance-costs-where-does-your-state-rank/. Be prepared to discuss.

“Data Description” (Note: Please respond to one [1] of the following two [2] bulleted items)
• From the e-Activity, the table shows Average Insurance Costs by State. Select two (2) states that are of interest to you. Next, speculate on three (3) possible reasons why the states you have chosen would have a difference in average insurance costs.
• From the e-Activity, select ten (10) states and calculate the mean and standard deviation for average insurance costs. Next, calculate the mean and standard deviation for average insurance costs for all 51 states. Compare and contrast the means and standard deviations for the ten (10) states you selected and all fifty one (51) states.

## Auto Insurance Costs

State laws generally require car owners to insure their vehicles against accident-related risks. Insurance companies or government agencies arrange with car owners to guarantee compensation for particular losses, as defined in the policy bought by the client. In return, the client is obliged to pay monthly premiums as calculated by the auto insurance company (Kofman and Nini, 2012). Traffic tickets, accidents, and even credit score strongly influence a client's auto insurance rate. This study will attempt to identify the reasons for different average insurance costs in different states.

According to the article “Auto Insurance Costs: Where Does Your State Rank?”, published in 2015, different states have different average insurance costs owing to factors such as state laws and the judicial system in a state. According to this article, Michigan had the highest average insurance cost at \$2,541, followed by Louisiana at \$2,453, while Vermont had the lowest average cost at \$995 (Cbsnews.com, 2017). One of the major causes of this variation is the presence of uninsured drivers in a state, which is a violation of state law. According to the article, there is a direct correlation between the number of uninsured drivers and average costs. For example, Michigan and Louisiana have 17% and 12% uninsured drivers respectively. On the same note, the level of competition among auto insurers is another major factor that contributes to this variation. The article goes on to state that Vermont has higher competition among auto insurance companies than both Michigan and Louisiana, which reduces its average cost (Cbsnews.com, 2017). Finally, the level and degree of protection required by the state is a major contributor to this cost. For example, Michigan guarantees unlimited personal injury protection, unlike other states.

### References

Cbsnews.com. (2017). Auto Insurance Costs: Where Does Your State Rank? [online] Available at: http://www.cbsnews.com/news/auto-insurance-costs-where-does-your-state-rank/ [Accessed 24 Jul. 2017].

Kofman, P. and Nini, G. (2012). Do Insurance Companies Possess an Informational Monopoly? Empirical Evidence From Auto Insurance. Journal of Risk and Insurance, 80(4), pp. 1001-1026.
