The chi-square test is a statistical test commonly used to determine whether the observed frequencies in a categorical data set differ significantly from the expected frequencies. In its most common form it is a goodness-of-fit test, which means that it assesses how well the observed data fit a particular theoretical distribution.
A Chi-square test typically involves the following steps:
- Specify the null and alternative hypotheses. The null hypothesis is usually that the observed frequencies follow the expected distribution, while the alternative hypothesis is that they do not.
- Collect and summarize the data. Calculate the observed frequencies and the expected frequencies for each category.
- Calculate the Chi-square statistic using the formula:

  $$\LARGE{\chi^2 = \sum_{i=1}^{n}\frac{(O_i - E_i)^2}{E_i}} $$

  Where:
  - \(O_i\) is the observed frequency in category \(i\)
  - \(E_i\) is the expected frequency in category \(i\)
  - \(n\) is the number of categories
- Determine the critical value of the Chi-square statistic based on the significance level (alpha) of the test and the degrees of freedom. The degrees of freedom are calculated as the number of categories minus 1 (df = n - 1).
- Compare the calculated Chi-square statistic to the critical value to determine whether to reject or fail to reject the null hypothesis. If the calculated Chi-square statistic exceeds the critical value, the null hypothesis is rejected in favor of the alternative hypothesis; otherwise, there is not enough evidence to reject it.
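The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the die-roll counts are made-up example data, the function name `chi_square_statistic` is hypothetical, and the critical value 11.070 is taken from a standard Chi-square table for alpha = 0.05 and df = 5 rather than computed.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: a die rolled 60 times, counts for faces 1 through 6.
observed = [8, 9, 12, 11, 6, 14]

# Null hypothesis: the die is fair, so every face is expected equally often.
total = sum(observed)
expected = [total / len(observed)] * len(observed)

chi2 = chi_square_statistic(observed, expected)
df = len(observed) - 1  # number of categories minus 1

# Assumed table lookup: critical value for alpha = 0.05 and df = 5.
CRITICAL_VALUE = 11.070

reject_null = chi2 > CRITICAL_VALUE
print(f"chi-square = {chi2:.3f}, df = {df}, reject null: {reject_null}")
```

For these counts the statistic works out to 42 / 10 = 4.2, which is below the critical value, so the null hypothesis of a fair die is not rejected.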
The Chi-square test assumes that the data are randomly sampled from a population, that all observations are independent, and that the expected frequency in each category is at least 5 (a common rule of thumb).
In addition to testing the goodness-of-fit of a categorical data set, the Chi-square test can also be used to test the independence of two categorical variables. The steps to perform a Chi-square independence test are similar to those outlined above, with three differences: the data are organized in a two-dimensional contingency table, the expected frequency of each cell is calculated as \((\text{row total} \times \text{column total}) / \text{grand total}\), and the degrees of freedom are \((r - 1)(c - 1)\) for a table with \(r\) rows and \(c\) columns.
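As a rough sketch of the independence test, the snippet below derives each cell's expected frequency from the row and column totals and then applies the same statistic. The 2x2 table is made-up example data, the function names are hypothetical, no continuity correction is applied, and the critical value 3.841 is taken from a standard Chi-square table for alpha = 0.05 and df = 1.

```python
def expected_frequencies(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return the chi-square statistic and degrees of freedom for a table."""
    expected = expected_frequencies(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    # Degrees of freedom: (rows - 1) * (columns - 1)
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical contingency table: rows are two groups, columns are two outcomes.
table = [
    [20, 30],
    [30, 20],
]

expected = expected_frequencies(table)
chi2, df = chi_square_independence(table)

# Assumed table lookup: critical value for alpha = 0.05 and df = 1.
CRITICAL_VALUE = 3.841

reject_null = chi2 > CRITICAL_VALUE
print(f"chi-square = {chi2:.3f}, df = {df}, reject independence: {reject_null}")
```

Here every expected count is 25, each cell contributes 25 / 25 = 1, and the statistic is 4.0, which exceeds 3.841, so independence is rejected at the 5% level for this illustrative table.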