(also called Two Sample Z Test for Proportions)

A two-proportions z-test is a statistical test used to compare the proportions of two independent samples. It tests a hypothesis about the difference between the two proportions and relies on the samples being large enough that the sampling distribution of the difference between the sample proportions is approximately normal.

## Steps in Two Proportions Z Test

To conduct a two-proportions z-test, the following steps are typically followed:

1. **Specify the null and alternative hypotheses.** The null hypothesis is usually that there is no difference between the proportions of the two samples, while the alternative hypothesis is that there is a difference between the proportions.
2. **Collect data** for the two samples and **calculate the sample proportions**.
3. **Calculate the test statistic**, which is the difference between the sample proportions, divided by the standard error of the difference.
4. **Determine the critical value** of the test statistic based on the test's significance level (alpha).
5. **Compare the calculated test statistic to the critical value** to determine whether to reject or fail to reject the null hypothesis. If the calculated test statistic falls beyond the critical value (in a two-tailed test, if its absolute value exceeds the critical value), the null hypothesis is rejected in favor of the alternative hypothesis.

## Conditions for Two Proportions Z-Test

To conduct a valid two-proportions z-test, the following conditions must be met:

- The samples must be drawn randomly and independently from the populations.
- The data contains only two categories: (for example) pass/fail or yes/no.
- The sample sizes must be large enough to ensure a normal distribution of the sample proportions. Specifically, both np and n(1−p) should be at least 10 for both samples.
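The sample-size condition can be checked directly from the counts. The following is a minimal sketch, assuming p is estimated by the observed sample proportion, so that np is simply the number of successes and n(1−p) the number of failures. The function name `large_sample_ok` and the counts are hypothetical, chosen only for illustration.

```python
def large_sample_ok(successes, n, threshold=10):
    """Check n*p >= threshold and n*(1-p) >= threshold using observed counts."""
    # With p estimated by successes / n, n*p is the success count
    # and n*(1-p) is the failure count.
    failures = n - successes
    return successes >= threshold and failures >= threshold


# Hypothetical samples: 45 successes out of 100, and 8 successes out of 90.
sample_1_ok = large_sample_ok(45, 100)  # 45 successes, 55 failures
sample_2_ok = large_sample_ok(8, 90)    # only 8 successes, below the threshold
print(sample_1_ok, sample_2_ok)  # True False
```

The condition has to hold for both samples, so in this illustration the second sample would make the z-test questionable.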

## Typical Null and Alternate Hypothesis in Two Proportions Z-Test

**a) Two-Tail Test:**

In a two-proportions z-test, the null hypothesis is that there is no difference between the proportions of the two samples. This can be expressed as:

H0: p1 = p2

where p1 is the proportion of the first sample and p2 is the proportion of the second sample.

The alternate hypothesis is the opposite of the null hypothesis and is that there is a difference between the proportions of the two samples. This can be expressed as:

Ha: p1 ≠ p2

**b) Left Tail Test:**

In a left-tailed test, the alternative hypothesis is that the proportion of the first sample is less than that of the second sample. This can be expressed as:

H0: p1 ≥ p2

Ha: p1 < p2

**c) Right Tail Test:**

In a right-tailed test, the alternative hypothesis is that the proportion of the first sample is greater than that of the second sample. This can be expressed as:

H0: p1 ≤ p2

Ha: p1 > p2

## Calculating Test Statistic

The z-score represents the number of standard errors that the difference between the sample proportions is from 0. It is used to determine whether the difference between the proportions of the two samples is statistically significant. There are two approaches to calculating the z value: the pooled approach and the unpooled approach.

### a) Two Proportions Z Test with Pooled Approach

The pooled proportion is the weighted average of the proportions of the two samples, weighted by sample size. The pooled approach uses it as a single estimate of the common population proportion, which is appropriate under the null hypothesis that the two population proportions are equal.

The formula for calculating the pooled proportion is as follows:

$$\LARGE{p_{pooled} = \frac{p_1 \cdot n_1 + p_2 \cdot n_2}{n_1 + n_2}}$$

Where p1 is the proportion of the first sample, p2 is the proportion of the second sample, n1 is the size of the first sample, and n2 is the size of the second sample.
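As a quick worked illustration, assume hypothetical values p1 = 0.60 with n1 = 100 and p2 = 0.45 with n2 = 100 (these numbers are not from any real data set). Because each product of a proportion and its sample size is just that sample's number of successes, the pooled proportion is the total number of successes divided by the total number of observations:

$$p_{pooled} = \frac{0.60 \cdot 100 + 0.45 \cdot 100}{100 + 100} = \frac{60 + 45}{200} = 0.525$$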

The formula for calculating the test statistic in a two-proportions z-test with a pooled approach is as follows:

$$\LARGE{z = \frac{(p_1 - p_2)}{\sqrt{\frac{p_{pooled}(1 - p_{pooled})}{n_1} + \frac{p_{pooled}(1 - p_{pooled})}{n_2}}}}$$

Where p1 is the proportion of the first sample, p2 is the proportion of the second sample, n1 is the size of the first sample, n2 is the size of the second sample, and p_pooled is the pooled proportion.
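The pooled calculation can be sketched in a few lines of Python. This is an illustrative sketch rather than a reference implementation: the function name `pooled_z` and the success counts (60 of 100 and 45 of 100) are hypothetical, and the function takes counts as input and derives the sample proportions from them.

```python
from math import sqrt


def pooled_z(x1, n1, x2, n2):
    """Two-proportions z statistic using the pooled standard error."""
    p1 = x1 / n1  # sample proportion of the first sample
    p2 = x2 / n2  # sample proportion of the second sample
    # Weighted average of the two sample proportions.
    p_pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    # Standard error of the difference, using the pooled proportion for both samples.
    se = sqrt(p_pooled * (1 - p_pooled) / n1 + p_pooled * (1 - p_pooled) / n2)
    return (p1 - p2) / se


# Hypothetical data: 60 successes in 100 trials vs. 45 successes in 100 trials.
z_pooled = pooled_z(60, 100, 45, 100)
print(round(z_pooled, 2))  # approximately 2.12
```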

### b) Two Proportions Z Test with Unpooled Approach

The formula for calculating the test statistic in a two-proportions z-test with an unpooled approach is as follows:

$$\LARGE{z = \frac{(p_1 - p_2)}{\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}}}$$

Where p1 is the proportion of the first sample, p2 is the proportion of the second sample, n1 is the size of the first sample, and n2 is the size of the second sample.
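For comparison, here is a minimal sketch of the unpooled calculation. The function name `unpooled_z` and the counts are again hypothetical. The only change from the pooled approach is that each sample contributes its own proportion to the standard error.

```python
from math import sqrt


def unpooled_z(x1, n1, x2, n2):
    """Two-proportions z statistic using the unpooled standard error."""
    p1 = x1 / n1  # sample proportion of the first sample
    p2 = x2 / n2  # sample proportion of the second sample
    # Each sample's variance term uses that sample's own proportion.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se


# Hypothetical data: 60 successes in 100 trials vs. 45 successes in 100 trials.
z_unpooled = unpooled_z(60, 100, 45, 100)
print(round(z_unpooled, 2))  # approximately 2.15
```

With these illustrative counts the unpooled statistic is slightly larger than the pooled one would be, because the two approaches estimate the standard error differently.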

## Calculating Critical Values

The critical values for the z-score in a two-proportions z-test depend on whether the test is a one-tail or two-tail test.

For a one-tail test, the critical value is determined by the significance level and the tail of the distribution in which the alternative hypothesis is located. For example, if the alternative hypothesis is that the proportion of the first sample is greater than the proportion of the second sample, the critical value is the z value that leaves an area equal to the significance level in the upper tail of the standard normal distribution (about 1.645 for a significance level of 0.05).

For a two-tail test, the significance level is split equally between the two tails of the distribution. For example, if the significance level is 0.05 and the test is two-tailed, each tail has an area of 0.025 (since the total area in both tails is 0.05), which gives critical values of about −1.96 and +1.96.
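Critical values like these can be obtained from the inverse cumulative distribution function of the standard normal distribution. The sketch below uses `NormalDist` from Python's standard `statistics` module (available in Python 3.8 and later). The function name `critical_value` and the tail labels `"two"`, `"left"`, and `"right"` are hypothetical conventions chosen for this example.

```python
from statistics import NormalDist


def critical_value(alpha, tail):
    """Critical z value for a given significance level and tail type."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        # Split alpha across both tails; return the positive critical value.
        return std_normal.inv_cdf(1 - alpha / 2)
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'left', or 'right'")


print(round(critical_value(0.05, "two"), 3))    # 1.96
print(round(critical_value(0.05, "right"), 3))  # 1.645
print(round(critical_value(0.05, "left"), 3))   # -1.645
```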

## Interpreting the Results

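Once you have calculated the test statistic and critical value, you can compare them to determine whether to reject or fail to reject the null hypothesis. The comparison depends on the type of test: in a right-tailed test, reject the null hypothesis if the test statistic is greater than the critical value; in a left-tailed test, reject it if the test statistic is less than the (negative) critical value; in a two-tailed test, reject it if the absolute value of the test statistic is greater than the critical value. Rejecting the null hypothesis indicates that there is a statistically significant difference between the proportions of the two samples in the direction stated by the alternative hypothesis. Otherwise, you fail to reject the null hypothesis.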
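The comparison can be written as a small helper. This is a sketch under stated assumptions: the function name `reject_null` and the tail labels are hypothetical, and `z_crit` is passed as a positive magnitude (for example 1.96), with the sign handled inside the function according to the tail.

```python
def reject_null(z, z_crit, tail):
    """Return True if the null hypothesis is rejected.

    z_crit is the positive magnitude of the critical value.
    """
    if z_crit < 0:
        raise ValueError("z_crit must be given as a positive magnitude")
    if tail == "two":
        return abs(z) > z_crit  # extreme in either direction
    if tail == "right":
        return z > z_crit       # extreme in the upper tail only
    if tail == "left":
        return z < -z_crit      # extreme in the lower tail only
    raise ValueError("tail must be 'two', 'left', or 'right'")


# Hypothetical test statistic of 2.12 at a 0.05 significance level.
print(reject_null(2.12, 1.96, "two"))     # True
print(reject_null(2.12, 1.645, "right"))  # True
print(reject_null(2.12, 1.645, "left"))   # False
```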