Data analysis is one of the most important tools for research and any academic exercise. Here are some common statistical measures or tools widely used to interpret any data.
Mean: For any statistical data, the most commonly understood measure is the mean. The mean is simply the arithmetic average. Thus, if a data set records the temperature on each of the 7 days of a week, the mean temperature is found by adding up all the observed temperatures and dividing the total by 7.
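As a minimal sketch of the calculation just described, the snippet below averages one week of temperatures. The readings are made-up values used only for illustration.

```python
# Hypothetical daily temperatures (degrees Celsius) for the 7 days of a week.
temperatures = [21.0, 23.5, 22.0, 24.5, 20.0, 19.5, 23.5]

# The mean is the sum of all observations divided by how many there are.
mean_temperature = sum(temperatures) / len(temperatures)

print(f"Mean temperature: {mean_temperature:.1f}")  # Mean temperature: 22.0
```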
Limitation of mean: When you’re dealing with a large number of data points with high deviations or an uneven distribution of entries, the simple mean fails to give a complete sense of the data, because a few extreme values can pull it away from where most of the data lies. Some other measure is then required.
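To illustrate this limitation, the rough example below shows how a single extreme entry can drag the mean away from the bulk of the data. The income figures are invented for the purpose of the demonstration.

```python
# Hypothetical monthly incomes (in arbitrary units) for six people.
typical_incomes = [30, 32, 35, 31, 33, 34]

# The same group with one extreme value added.
with_outlier = typical_incomes + [500]

mean_typical = sum(typical_incomes) / len(typical_incomes)
mean_with_outlier = sum(with_outlier) / len(with_outlier)

print(f"Mean without the extreme value: {mean_typical:.1f}")
print(f"Mean with the extreme value:    {mean_with_outlier:.1f}")
```

With the extreme value included, the mean ends up higher than every one of the six typical incomes, so it no longer describes a typical member of the group.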
Standard deviation: The standard deviation, often represented by the Greek letter sigma (σ), measures the spread of data around the mean. It is calculated by subtracting the mean from each data point, squaring the result, calculating the mean of these squared differences (the variance), and then taking the square root.
A high standard deviation suggests the data is spread widely around the mean, and a low one suggests it is clustered close to the mean. The standard deviation thus gives an understanding of how well the mean represents the data.
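The calculation can be sketched step by step as follows. This example assumes the data is the whole population, so it divides by the number of points; for a sample, dividing by one less than that number is the usual convention. The temperature readings are hypothetical.

```python
import math

# Hypothetical daily temperatures (degrees Celsius) for one week.
temperatures = [21.0, 23.5, 22.0, 24.5, 20.0, 19.5, 23.5]

mean_temperature = sum(temperatures) / len(temperatures)

# Squared difference between each data point and the mean.
squared_deviations = [(t - mean_temperature) ** 2 for t in temperatures]

# The mean of the squared differences is the (population) variance.
variance = sum(squared_deviations) / len(temperatures)

# The standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

print(f"Variance: {variance:.3f}")
print(f"Standard deviation: {std_dev:.3f}")
```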
Limitations of standard deviation: While the standard deviation, read along with the mean, gives a better sense of the statistical data, it fails to capture skewness of the statistical data set under consideration.
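As a small illustration of this point, the two made-up data sets below share the same mean and the same standard deviation, yet one has a long tail to the right and the other to the left. The skewness figure used here is the mean of the cubed standardised deviations, which is one common definition among several, and the helper name `describe` is hypothetical.

```python
import math

def describe(data):
    """Return (mean, population standard deviation, skewness) for a list of numbers."""
    n = len(data)
    mean = sum(data) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    # Skewness as the average cubed z-score: positive means a longer right tail.
    skew = sum(((x - mean) / std) ** 3 for x in data) / n
    return mean, std, skew

# Hypothetical data with one large value on the right.
right_tailed = [1, 2, 3, 4, 10]
# The same data mirrored around its mean of 4, giving a long left tail.
left_tailed = [8 - x for x in right_tailed]

for name, data in [("right-tailed", right_tailed), ("left-tailed", left_tailed)]:
    mean, std, skew = describe(data)
    print(f"{name}: mean={mean:.2f}, std={std:.2f}, skewness={skew:.2f}")
```

The mean and standard deviation printed for the two sets are identical; only the sign of the skewness tells them apart.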
Regression: Regression is a statistical analysis that models the relationship between a dependent variable (the data you’re looking to measure) and an independent variable (the data used to predict the dependent variable). It describes how the two move together but does not by itself establish causality. Regression is critical for forecasting and for identifying trends in a statistical data set.
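One simple form of regression is a straight-line fit by ordinary least squares. The sketch below assumes that form, which is only one of many regression techniques. The data and the function name `fit_line` are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how x and y vary together, relative to the variation in x.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: independent variable x, dependent variable y.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

slope, intercept = fit_line(x, y)
print(f"Fitted line: y = {intercept:.2f} + {slope:.2f} * x")

# Use the fitted trend to forecast y for a new x value.
forecast = intercept + slope * 6
print(f"Forecast for x = 6: {forecast:.2f}")
```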
Limitations of regression: Regression focuses on the overall trend and can gloss over outliers that may be critical. It also fails to explain why certain values are outliers, which is often an important focus when studying statistical data.
Sample size determination: Sampling is a quick way to carry out a study without collecting data for the entire data universe. However, for sampling to be meaningful, determining the right sample size is critical. Given that sampling is done to save time and resources, care needs to be taken to ensure the exercise does not consider too few inputs to be meaningful.
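The text does not prescribe a particular formula, so the sketch below uses one widely used approach as an assumption: Cochran's formula for estimating a proportion, with an optional finite population correction. The function name `sample_size` and its default parameter values are illustrative only.

```python
import math
from statistics import NormalDist

def sample_size(confidence=0.95, margin_of_error=0.05, proportion=0.5, population=None):
    """Estimate the sample size needed to measure a proportion (Cochran's formula)."""
    # z-score for a two-sided interval at the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # proportion = 0.5 is the most conservative guess (it gives the largest size).
    n = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    if population is not None:
        # Finite population correction: a smaller universe needs fewer samples.
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)

print(sample_size())                  # very large or unknown population
print(sample_size(population=10_000)) # universe of 10,000 entries
```

Under these assumptions a 95% confidence level with a 5% margin of error calls for 385 respondents when the population is very large, and somewhat fewer when the universe is limited to 10,000.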
Limitations of sample size: Sampling is just an estimation and does not capture the full extent of the data. Thus, analysis of any sample gives at best a close estimate for the underlying statistical data.
Hypothesis testing: Hypothesis testing assesses whether a certain premise (or assumption) is supported by your statistical data set. A ‘statistically significant’ result indicates that the observed outcome is unlikely to be due to chance alone.
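There are many kinds of hypothesis test. As one hedged illustration, the sketch below runs an exact permutation test on two small made-up groups. It asks how often a difference in means at least as large as the observed one would appear if the group labels were assigned purely by chance. The data, the function name `permutation_p_value`, and the 0.05 threshold are assumptions for this example.

```python
from itertools import combinations

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test for a difference in means."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    total_sum = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    extreme = 0
    total = 0
    # Try every possible way of relabelling n_a of the pooled values as "group A".
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (total_sum - sum_a) / n_b)
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total

# Hypothetical measurements for a control group and a treated group.
control = [12, 15, 11, 14, 13]
treated = [18, 20, 17, 21, 19]

p_value = permutation_p_value(control, treated)
print(f"p-value: {p_value:.4f}")

# 0.05 is a common convention, not a universal rule.
print("Statistically significant" if p_value < 0.05 else "Not statistically significant")
```

A small p-value here means that a gap this large would rarely arise from random labelling alone. It does not prove that the premise is true.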
Limitations of hypothesis testing: The most common pitfalls are the placebo effect, where participants respond to their expectation of a treatment rather than to the treatment itself, and the Hawthorne effect, where participants change their behaviour or responses because they know they are being observed.
The 5 methods explained are the most basic and commonly used statistical tools. There exist many other measures for deeper analysis of statistical data.