Hypothesis testing is the process of assessing whether a hypothesis, or supposition, about a statistical parameter is supported by data. Analysts use it to determine whether a hypothesis is reasonable. For example, hypothesis testing could be used to find out whether a certain drug is effective in treating headaches. It uses data from a sample to draw conclusions about a population parameter. Hypothesis testing is an important step because it checks claims about statistical parameters before they are used to draw conclusions or make inferences about a population or a large data set.
Types of Hypothesis
In data sampling, different types of hypotheses are used to examine whether or not a sample supports the hypothesis being tested.
- Alternative Hypothesis (H1) – This hypothesis states that there is a relationship between two variables (where one variable affects the value of the other variable). The relationship that exists between the variables is not due to chance or coincidence.
- Null Hypothesis (H0) – This hypothesis states that there is no relationship between two variables. It states that any apparent effect of one variable on another is entirely due to chance, with no empirical explanation.
- Non-Directional Hypothesis – It states that there is a relationship between two variables, but that the direction of influence is unknown.
- Directional Hypothesis – It states the direction of the effect in the relationship between two variables, for example that one variable increases as the other increases.
The alternative and null hypotheses are used to study data samples for a possible pattern, forming a statistical hypothesis that can be validated through hypothesis testing. The alternative hypothesis and the null hypothesis are mutually exclusive, so they cannot both be true at the same time. Similarly, the non-directional and directional hypotheses are mutually exclusive and cannot both be true at the same time.
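The difference between a directional and a non-directional hypothesis shows up in how a test's p-value is computed. The sketch below is illustrative only: it assumes a test statistic that follows a standard normal distribution under the null hypothesis, and the function name `p_value`, the `alternative` labels, and the value `z = 1.8` are hypothetical choices rather than anything defined in this article.

```python
from statistics import NormalDist


def p_value(z, alternative="two-sided"):
    """p-value of a statistic z that is standard normal under H0 (assumed)."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "two-sided":  # non-directional H1: "there is a difference"
        return 2 * (1 - std_normal.cdf(abs(z)))
    if alternative == "greater":    # directional H1: "the effect is positive"
        return 1 - std_normal.cdf(z)
    if alternative == "less":       # directional H1: "the effect is negative"
        return std_normal.cdf(z)
    raise ValueError(f"unknown alternative: {alternative}")


z = 1.8  # hypothetical test statistic
two_sided = p_value(z, "two-sided")
one_sided = p_value(z, "greater")
print(f"non-directional p = {two_sided:.4f}, directional p = {one_sided:.4f}")
```

For a positive statistic, the non-directional p-value is twice the directional one, so the same data can be significant at the 5% level under a directional hypothesis and not under a non-directional one.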
Methods of Hypothesis Testing
- Frequentist Hypothesis Testing – This is the traditional approach to hypothesis testing. It draws conclusions from the current data alone, without assigning prior probabilities to hypotheses: a hypothesis is assumed to be true, and the test asks how likely the observed data would be under that assumption. One of the subtypes of this approach is Null Hypothesis Significance Testing.
- Bayesian Hypothesis Testing – This is one of the more modern methods of hypothesis testing. In this method, the prior probability of a hypothesis, based on past data, is combined with the current data to find the posterior probability of the hypothesis.
The Bayes factor, a key component of this approach, is the ratio of the likelihood of the observed data under one hypothesis to its likelihood under the other (for example, alternative versus null). It indicates which of the two hypotheses formed for the test the data support more strongly.
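As a minimal sketch of the Bayesian approach, the code below computes a Bayes factor for two simple hypotheses about a mean and turns it into a posterior probability. Everything specific here is an assumption made for illustration: the data are made up, both hypotheses fix the mean exactly (0 under H0, 1 under H1), the observations are taken to be normal with a known standard deviation of 1, and the function names are hypothetical.

```python
from statistics import NormalDist


def bayes_factor(data, mean_h1, mean_h0, sigma):
    """Likelihood of the data under H1 divided by its likelihood under H0."""
    h1 = NormalDist(mean_h1, sigma)
    h0 = NormalDist(mean_h0, sigma)
    likelihood_h1 = 1.0
    likelihood_h0 = 1.0
    for x in data:
        likelihood_h1 *= h1.pdf(x)
        likelihood_h0 *= h0.pdf(x)
    return likelihood_h1 / likelihood_h0


def posterior_probability_h1(bf10, prior_h1=0.5):
    """Posterior odds = Bayes factor * prior odds, converted to a probability."""
    prior_odds = prior_h1 / (1 - prior_h1)
    posterior_odds = bf10 * prior_odds
    return posterior_odds / (1 + posterior_odds)


data = [0.8, 1.3, 0.2, 1.1, 0.9]  # made-up observations
bf10 = bayes_factor(data, mean_h1=1.0, mean_h0=0.0, sigma=1.0)
posterior = posterior_probability_h1(bf10, prior_h1=0.5)
print(f"Bayes factor (H1 vs H0) = {bf10:.2f}, posterior P(H1) = {posterior:.3f}")
```

A Bayes factor above 1 means the data are more likely under H1 than under H0. Real analyses usually compare hypotheses that place a prior distribution over the parameter, which requires integrating the likelihood rather than evaluating it at a single value.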
Techniques of Hypothesis Testing
A few tests are commonly used: the Z-test, T-test, chi-squared test and F-test.
- Z Test – A z-test is performed on a population with independent data points that follows a normal distribution, with a sample size greater than or equal to 30. When the population variance is known, it is used to determine whether a population mean equals a hypothesized value, or whether the means of two populations are equal. The z test statistic is compared to the critical value, and the null hypothesis is rejected if the z statistic is statistically significant. For a single sample, the statistic is:
z = (X̄ − µ) / (σ / √n)
X̄ = sample average
µ = hypothesized population mean
σ = population standard deviation
n = sample size
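The z-test procedure can be sketched in a few lines using only the Python standard library. This is an illustration under stated assumptions, not a reference implementation: it uses the usual one-sample statistic z = (X̄ − µ0) / (σ / √n), a two-sided test at the 5% level, and a made-up sample of 30 values with an assumed known population standard deviation of 5. The function name and all numbers are hypothetical.

```python
import math
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided one-sample z-test with known population standard deviation."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / math.sqrt(n))
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    reject_h0 = abs(z) > critical
    return z, critical, p, reject_h0


# Made-up sample of 30 observations (values 100..104, each six times)
sample = [100 + (i % 5) for i in range(30)]
z, critical, p, reject_h0 = one_sample_z_test(sample, mu0=100, sigma=5)
print(f"z = {z:.3f}, critical value = {critical:.3f}, p = {p:.4f}, reject H0: {reject_h0}")
```

Here the sample mean is 102, so z is about 2.19, which exceeds the critical value and leads to rejecting the null hypothesis that the population mean is 100.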
- T Test – A t-test is an inferential statistic used to see whether there is a significant difference between the means of two groups that are related in some way, or between a sample mean and an assumed mean. It is also called Student's t-test. It is used when the variables are continuous, the sample size is less than 30, and the population standard deviation is not known. The t statistic is used to decide whether to reject or fail to reject the null hypothesis.
t = (m − µ) / (s / √n)
t = Student's t statistic
m = mean of sample
µ = assumed mean
s = standard deviation of sample
n = number of observations
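Using the symbols listed above (m, µ, s, n), a one-sample t statistic can be computed as t = (m − µ) / (s / √n). The sketch below assumes that form, uses a made-up sample of ten values, and compares the result with 2.262, the tabulated two-sided 5% critical value for 9 degrees of freedom. The function name and data are hypothetical, and the critical value is hard-coded because the Python standard library has no t-distribution.

```python
import math
from statistics import mean, stdev


def one_sample_t_statistic(sample, mu):
    """t = (m - mu) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(sample)
    m = mean(sample)
    s = stdev(sample)  # uses the n - 1 denominator
    return (m - mu) / (s / math.sqrt(n))


# Made-up sample of n = 10 observations, so n - 1 = 9 degrees of freedom
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.5, 5.3, 5.7, 5.0, 5.4]
t = one_sample_t_statistic(sample, mu=5.0)

T_CRITICAL_DF9 = 2.262  # table value: two-sided, alpha = 0.05, 9 degrees of freedom
reject_h0 = abs(t) > T_CRITICAL_DF9
print(f"t = {t:.3f}, reject H0: {reject_h0}")
```

With a different sample size the degrees of freedom change, so the critical value has to be looked up again.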
- Chi squared Test – A chi-squared (χ²) statistic measures how well a model matches actual data. To use a chi-squared test, the data must be random, fall into mutually exclusive categories, be drawn from independent variables, and come from a sufficiently large sample.
χ²c = Σ (Oi − Ei)² / Ei
c = degrees of freedom
O = observed values
E = expected values
There are two types of χ² test: the test of independence and the goodness-of-fit test. Both compare the frequencies actually observed in the data with the frequencies expected in theory. A χ² test for independence, for instance, shows how likely it is that random chance alone explains any difference between the observed frequencies and the frequencies expected if the variables were unrelated.
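A test of independence can be sketched as follows, assuming the usual statistic χ² = Σ (O − E)² / E, where each expected count E is (row total × column total) / grand total. The 2×2 table is made up, the function name is hypothetical, and 3.841 is the tabulated 5% critical value for 1 degree of freedom.

```python
def chi_square_independence(table):
    """Return the chi-squared statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


# Made-up counts: rows = treatment / placebo, columns = improved / not improved
table = [[30, 20],
         [20, 30]]
chi2, dof = chi_square_independence(table)

CHI2_CRITICAL_DF1 = 3.841  # table value: alpha = 0.05, 1 degree of freedom
reject_h0 = chi2 > CHI2_CRITICAL_DF1
print(f"chi-squared = {chi2:.3f}, degrees of freedom = {dof}, reject H0: {reject_h0}")
```

Every expected count here is 25, so the statistic is 4 × (5² / 25) = 4.0, just above the critical value.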
- F Test – Any statistical test whose test statistic has an F-distribution under the null hypothesis is known as an F-test. It is generally used to compare statistical models that have been fitted to a data set, to find which model best fits the population from which the data were sampled. To perform an F-test, the populations should be approximately normally distributed, and the samples must be random and independent. If the F-test result is statistically significant, the null hypothesis is rejected; otherwise, it is not. F statistic for large samples:
F = σ1² / σ2²
σ1² = variance of the first population
σ2² = variance of the second population
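The variance-ratio form of the F statistic can be illustrated with a short sketch. It assumes F = σ1² / σ2² and estimates each population variance from a sample, since true population variances are rarely known. The two samples and the function name are made up, and 6.39 is the tabulated upper 5% critical value for (4, 4) degrees of freedom.

```python
from statistics import variance


def f_statistic(sample_1, sample_2):
    """Ratio of sample variances, with numerator and denominator degrees of freedom."""
    var_1 = variance(sample_1)  # sample variance, n - 1 denominator
    var_2 = variance(sample_2)
    return var_1 / var_2, len(sample_1) - 1, len(sample_2) - 1


# Made-up samples of five observations each
sample_1 = [2.0, 4.0, 6.0, 8.0, 10.0]  # sample variance 10.0
sample_2 = [4.0, 5.0, 6.0, 7.0, 8.0]   # sample variance 2.5
f, dof_1, dof_2 = f_statistic(sample_1, sample_2)

F_CRITICAL_4_4 = 6.39  # table value: upper-tail alpha = 0.05, (4, 4) degrees of freedom
reject_h0 = f > F_CRITICAL_4_4
print(f"F = {f:.2f} with ({dof_1}, {dof_2}) degrees of freedom, reject H0: {reject_h0}")
```

Although the first sample's variance is four times the second's, the samples are so small that F = 4.0 stays below the critical value, and the null hypothesis of equal variances is not rejected.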