Above is a graphical representation of the chi-square distribution. It shows the probability density curves of several chi-square distributions, each with a different number of degrees of freedom.

__Things to remember about the Chi-Squared Distribution:__

1. It is a continuous probability distribution.

2. It is related to the standard normal distribution: the sum of the squares of k independent standard normal variables follows a chi-square distribution with k degrees of freedom.

3. In a goodness-of-fit test, the degrees of freedom equal the number of categories minus one.
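The link to the standard normal distribution can be checked directly in R. The sketch below is illustrative only: the seed, the sample size, and all variable names are assumptions chosen for the demonstration. It squares and sums k independent standard normal draws and compares the result with a chi-square distribution with k degrees of freedom.

```r
# Illustrative check: the sum of k squared standard normal variables
# follows a chi-square distribution with k degrees of freedom.
set.seed(42)          # arbitrary seed, for reproducibility only
k <- 4                # number of standard normal variables (assumed)
n_sims <- 100000      # number of simulated sums (assumed)

# Each row holds k independent standard normal draws.
z <- matrix(rnorm(n_sims * k), nrow = n_sims, ncol = k)
sum_sq <- rowSums(z^2)

# The mean of a chi-square distribution equals its degrees of freedom,
# so sim_mean should be close to k.
sim_mean <- mean(sum_sq)

# Share of simulated sums below the theoretical 95th percentile;
# this should be close to 0.95.
share_below <- mean(sum_sq <= qchisq(0.95, df = k))

# Exact identity for k = 1: P(Z^2 <= x) = 2 * pnorm(sqrt(x)) - 1.
x <- 3.84
identity_gap <- abs(pchisq(x, df = 1) - (2 * pnorm(sqrt(x)) - 1))
```

The simulated values only approximate the theoretical ones, while the identity for a single squared normal variable holds up to floating-point error.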

The chi-squared distribution is used for goodness-of-fit tests, which compare a set of observed counts against the counts that a model predicts. The purpose is to determine whether the model is an accurate predictor. The degrees of freedom, the number of categories minus one, determine the shape of the probability density curve. Alpha, or 1 minus the confidence level, determines the size of the rejection region, which is the rightmost area beneath the distribution curve. The chi-square value is calculated from the observed and expected counts: for each category, the squared difference between the observed and expected count is divided by the expected count, and the results are summed. This value is then compared against the critical value from a chi-square distribution table. The chi-square value, together with the degrees of freedom and the alpha value, determines whether the null hypothesis is rejected.
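The table lookup and the statistic described above can also be sketched in R. The function and variable names here (chi_square_stat, p_value_for, deg_free) are hypothetical helpers introduced for illustration; qchisq() and pchisq() are the base R quantile and distribution functions.

```r
# Pearson's chi-square statistic: sum over categories of (O - E)^2 / E.
chi_square_stat <- function(observed, expected) {
  sum((observed - expected)^2 / expected)
}

alpha <- 0.05
categories <- 7                  # e.g. the seven days of the week
deg_free <- categories - 1

# Critical value: the point that leaves alpha of the area to its right.
critical_value <- qchisq(1 - alpha, df = deg_free)
round(critical_value, 2)         # 12.59

# The same comparison expressed as a p-value for a given statistic:
# the area to the right of the statistic under the density curve.
p_value_for <- function(stat, deg_free) {
  pchisq(stat, df = deg_free, lower.tail = FALSE)
}
```

A statistic larger than critical_value falls inside the rejection region, which is equivalent to p_value_for() returning a value below alpha.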

__Example:__

A small motel owner has created a model which he believes is an accurate predictor of the number of individuals who will stay at his establishment each day. He presents you with his findings:

Monday: 20

Tuesday: 28

Wednesday: 18

Thursday: 25

Friday: 16

Saturday: 22

Sunday: 26

The following week, you are tasked with keeping track of guests who rent rooms at the motel. Here are your findings:

Monday: 14

Tuesday: 25

Wednesday: 22

Thursday: 18

Friday: 16

Saturday: 24

Sunday: 30

__Given your findings, and assuming a 95% confidence level, can we assume that the motel owner's model is an accurate predictor?__

**Model <- c(20, 28, 18, 25, 16, 22, 26)**

**Results <- c(14, 25, 22, 18, 16, 24, 30)**

**chisq.test(Model, p = Results, rescale.p = TRUE)**

__Console Output:__

*Chi-squared test for given probabilities*

*data: Model*

*X-squared = 6.5746, df = 6, p-value = 0.362*

__Findings:__

Degrees of Freedom (df) - 6

Confidence Level (CL) - .95

Alpha (α) (1 - CL) - .05

Chi Square Test Statistic - 6.5746

This creates the hypothesis test parameters:

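H0: The model is a good fit (Null Hypothesis).

HA: The model is not a good fit (Alternative Hypothesis).

The critical value of 12.59 is found when consulting the chi-square distribution table. Since our chi-square value is less than this value (6.5746 < 12.59), we cannot reject the null hypothesis at the 5% significance level. The p-value of 0.362, which is greater than 0.05, leads to the same conclusion. The data are consistent with the owner's model, although this does not prove that the model is correct.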

Cannot Reject: Null Hypothesis.
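chisq.test() treats its first argument as the observed counts and p as the expected proportions. As a cross-check, the sketch below passes the tracked guest counts as the observed data and the owner's predictions as p. The names check, critical_value, and decision are introduced here for illustration only.

```r
Model <- c(20, 28, 18, 25, 16, 22, 26)     # owner's predicted guests per day
Results <- c(14, 25, 22, 18, 16, 24, 30)   # guests counted the following week

# Observed counts go first; rescale.p = TRUE converts the predictions
# into proportions that sum to one.
check <- chisq.test(Results, p = Model, rescale.p = TRUE)

# Compare the statistic with the critical value for alpha = 0.05.
critical_value <- qchisq(0.95, df = check$parameter)
decision <- if (check$statistic > critical_value) "Reject" else "Cannot Reject"
decision
```

With the roles assigned this way the statistic differs from the console output above, but it remains below the critical value, so the decision is unchanged.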

__Example:__

The same small motel owner also created an additional model which he believes is an accurate predictor of the individuals who will stay at his establishment. He presents you with his findings:

Monday: 10%

Tuesday: 5%

Wednesday: 20%

Thursday: 10%

Friday: 20%

Saturday: 30%

Sunday: 5%

(Predicted percentage of total individuals who will stay throughout the week)

*The following week, you are tasked with keeping track of guests who rent rooms at the motel. Here are your findings:*

Monday: 11

Tuesday: 25

Wednesday: 30

Thursday: 13

Friday: 23

Saturday: 17

Sunday: 8

*(Actual number of individuals who stayed throughout the week)*

__Given your findings, and assuming a 95% confidence level, can we assume that the motel owner's model is an accurate predictor?__

**Model <- c(.10, .05, .20, .10, .20, .30, .05)**

**Results <- c(11, 25, 30, 13, 23, 17, 8)**

**chisq.test(Results, p=Model, rescale.p= FALSE)**

__Console Output:__

*Chi-squared test for given probabilities*

*data: Results*

*X-squared = 68.184, df = 6, p-value = 9.634e-13*

__Findings:__

Degrees of Freedom (df) - 6

Confidence Level (CL) - .95

Alpha (α) (1 - CL) - .05

Chi-Square Test Statistic - 68.184

This creates the hypothesis test parameters:

H0: The model is a good fit (Null Hypothesis).

HA: The model is not a good fit (Alternative Hypothesis).

The critical value of 12.59 is found when consulting the chi-square distribution table. Since our chi-square value is greater than this value (68.184 > 12.59), we reject the null hypothesis at the 5% significance level and conclude that the owner's model is not a good fit for the observed data.

Reject: Null Hypothesis.
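To see where the large statistic comes from, the calculation can be reproduced by hand. This is a minimal sketch; the names days, expected, contributions, statistic, p_value, and largest are introduced here for illustration.

```r
Model <- c(.10, .05, .20, .10, .20, .30, .05)
Results <- c(11, 25, 30, 13, 23, 17, 8)
days <- c("Monday", "Tuesday", "Wednesday", "Thursday",
          "Friday", "Saturday", "Sunday")

# Expected guests per day: total observed guests times predicted share.
expected <- sum(Results) * Model

# Each day's contribution to the statistic: (O - E)^2 / E.
contributions <- (Results - expected)^2 / expected
names(contributions) <- days

statistic <- sum(contributions)              # matches X-squared = 68.184
p_value <- pchisq(statistic, df = length(Results) - 1, lower.tail = FALSE)

# The day on which the model misses by the widest margin.
largest <- names(which.max(contributions))   # "Tuesday"
round(contributions, 3)
```

Most of the statistic comes from Tuesday, where the model predicts 5% of the week's guests but 25 of the 127 guests actually stayed.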

__Example:__

While working as a statistician at a local university, you are tasked with evaluating, based on survey data, the level of job satisfaction that each member of the staff has in their current role. The data that you gather from the surveys is as follows:

__General Faculty__

130 Satisfied 20 Unsatisfied

__Professors__

30 Satisfied 20 Unsatisfied

__Adjunct Professors__

80 Satisfied 20 Unsatisfied

__Custodians__

20 Satisfied 10 Unsatisfied

The question remains, however, as to whether the role of each staff member has any bearing on the survey results. To decide this at a 95% confidence level, follow the steps below.

First, we will need to input this survey data into R as a matrix. This can be achieved with the code below:

**Model <- matrix(c(130, 30, 80, 20, 20, 20, 20, 10), nrow = 4, ncol=2)**

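The result should resemble a matrix with four rows and two columns: one row per job type, with the satisfied counts in the first column and the unsatisfied counts in the second.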
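Printed as created, the matrix shows only numeric row and column indices. A minimal sketch that adds labels makes the layout easier to verify. The dimension names below are assumptions for readability and are not part of the original code.

```r
Model <- matrix(c(130, 30, 80, 20, 20, 20, 20, 10), nrow = 4, ncol = 2)

# matrix() fills column by column, so the first four values form the
# "Satisfied" column and the last four form the "Unsatisfied" column.
# The labels are for readability only; they do not change the test.
dimnames(Model) <- list(
  Role = c("General Faculty", "Professors", "Adjunct Professors", "Custodians"),
  Response = c("Satisfied", "Unsatisfied")
)

Model
```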

Once this step has been completed, the next step is as simple as entering the code:

**chisq.test(Model)**

__Console Output:__

*Pearson's Chi-squared test*

*data: Model*

*X-squared = 18.857, df = 3, p-value = 0.0002926*

__Findings:__

Degrees of Freedom (df) - 3

Confidence Level (CL) - .95

Alpha (α) (1 - CL) - .05

Chi Square Test Statistic - 18.857

This creates the hypothesis test parameters:

H0: There is no association between job type and job satisfaction (Null Hypothesis). Job type and job satisfaction are independent variables.

HA: There is an association between job type and job satisfaction. Job type and job satisfaction are not independent variables.

The critical value of 7.815 is found when consulting the chi-square distribution table. Since our chi-square value is greater than this value (18.857 > 7.815), we reject the null hypothesis at the 5% significance level and conclude that there is evidence of an association between job type and overall satisfaction.

Reject: Null Hypothesis.
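The test of independence compares each observed count with the count expected if job type and satisfaction were unrelated. The sketch below reproduces the statistic by hand under that assumption. The names expected, statistic, deg_free, p_value, and critical_value are introduced here for illustration.

```r
Model <- matrix(c(130, 30, 80, 20, 20, 20, 20, 10), nrow = 4, ncol = 2)

# Expected count under independence:
# row total * column total / grand total.
expected <- outer(rowSums(Model), colSums(Model)) / sum(Model)

# Pearson's statistic: sum over all cells of (O - E)^2 / E.
statistic <- sum((Model - expected)^2 / expected)   # matches 18.857

# Degrees of freedom: (rows - 1) * (columns - 1).
deg_free <- (nrow(Model) - 1) * (ncol(Model) - 1)   # 3

p_value <- pchisq(statistic, df = deg_free, lower.tail = FALSE)
critical_value <- qchisq(0.95, df = deg_free)       # 7.815 after rounding
```

The same expected counts are available from the test object itself as chisq.test(Model)$expected.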

* Source for Chi Square Distribution Image - https://en.wikipedia.org/wiki/Chi-squared_distribution
