Analytical Study Design in Medical Research: Cohort study

Cohort studies are observational analytical studies. As mentioned in my previous blog, the word ‘cohort’ is derived from the Latin word ‘cohors’, which means unit. In a cohort study, the study population is drawn from the general population and includes both people exposed to an agent suspected of causing a disease and people not exposed to it. The population is followed over a long period of time, and the incidence of disease in the exposed group is compared with that in the unexposed group. The objective of a cohort study is therefore to find out whether there is an association between a suspected cause (or causes) and a disease. If performed correctly, cohort studies can produce results comparable to those of experimental analytical studies. The following measures can be obtained from a cohort study design: absolute risk or incidence, relative risk (risk ratio or rate ratio), risk difference, and attributable proportion. Cohort studies are classified as prospective or retrospective based on the timing of enrollment of subjects relative to the disease outcome.

[Figure 1]

Prospective Cohort Study

As the name suggests, a prospective cohort study starts with a population of subjects who are free of the disease but all at risk of developing it, and the investigator waits for the disease to develop. The population is divided into two groups: one exposed to the agent or environment suspected to be associated with the disease, and the other unexposed but equally susceptible to the disease. The population is then followed for a defined period of time or until the condition or disease develops. At the end of follow-up, the incidence of disease in the exposed and unexposed groups is calculated (see the following schematic diagram). The incidence rate is therefore the measure of disease frequency in cohort studies, and the association between exposure and disease is measured by the relative risk (RR).

[Figure 2a: schematic diagram of a prospective cohort study. Figure 2b: table of disease status in the exposed and unexposed groups]

From the above table, the incidence of disease in the exposed and unexposed groups and the relative risk (RR) can be calculated.
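The calculation can be sketched in Python as follows. This is a minimal illustration, assuming the conventional 2×2 layout in which a and b are the exposed subjects with and without the disease, and c and d are the unexposed subjects with and without the disease. The function names and the counts are hypothetical and are not taken from any study.

```python
# Sketch: incidence and relative risk from a 2x2 cohort table.
# Assumed cell layout (an assumption, since the table itself is not shown):
#   a = exposed, diseased      b = exposed, not diseased
#   c = unexposed, diseased    d = unexposed, not diseased


def incidence(cases: int, non_cases: int) -> float:
    """Cumulative incidence: new cases divided by everyone at risk."""
    return cases / (cases + non_cases)


def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """Incidence in the exposed divided by incidence in the unexposed."""
    return incidence(a, b) / incidence(c, d)


# Hypothetical counts, for illustration only.
a, b, c, d = 30, 70, 10, 90

print(f"Incidence (exposed):   {incidence(a, b):.2f}")
print(f"Incidence (unexposed): {incidence(c, d):.2f}")
print(f"Relative risk:         {relative_risk(a, b, c, d):.2f}")
```

With these made-up counts, the incidence is 0.30 in the exposed and 0.10 in the unexposed, so the RR is 3: the exposed group develops the disease three times as often as the unexposed group.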


Alternatively, the odds ratio (OR) can be used as a measure of association. The odds of an event are the ratio of the probability that it happens to the probability that it does not, and the OR is the ratio of two such odds. In this case, the OR can be calculated as follows:


OR = (a/b) / (c/d) = ad/bc
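As a small worked sketch of the formula above, the following Python snippet computes the OR both as a ratio of two odds and as the cross-product. It assumes that a and b are the exposed subjects with and without the disease and that c and d are the unexposed subjects with and without the disease; the counts are hypothetical.

```python
# Sketch: odds ratio from a 2x2 table, computed two equivalent ways.
# Assumed cell layout: a, b = exposed with/without disease;
#                      c, d = unexposed with/without disease.


def odds(events: int, non_events: int) -> float:
    """Odds: how many times the event happened per time it did not."""
    return events / non_events


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds of disease in the exposed divided by odds in the unexposed."""
    return odds(a, b) / odds(c, d)


# Hypothetical counts, for illustration only.
a, b, c, d = 30, 70, 10, 90

ratio_of_odds = odds_ratio(a, b, c, d)
cross_product = (a * d) / (b * c)  # algebraically the same quantity

print(f"OR as ratio of odds: {ratio_of_odds:.2f}")
print(f"OR as cross-product: {cross_product:.2f}")
```

Both lines print 3.86 for these counts, which shows that the two forms of the formula agree.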


The attributable risk, i.e., the part of the disease risk that can be attributed to the exposure, can also be calculated. When it is expressed with respect to the total population, it is called the population attributable risk (PAR).
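The attributable measures can be sketched in the same way. The snippet below uses commonly taught definitions (risk difference, attributable proportion among the exposed, PAR, and the population attributable fraction); terminology varies between textbooks, so the names used here are assumptions, and the counts are hypothetical. It also assumes that the mix of exposed and unexposed subjects in the cohort reflects the population of interest.

```python
# Sketch: attributable measures from a 2x2 cohort table.
# Assumed cell layout: a, b = exposed with/without disease;
#                      c, d = unexposed with/without disease.


def attributable_measures(a: int, b: int, c: int, d: int) -> dict[str, float]:
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    risk_total = (a + c) / (a + b + c + d)
    return {
        # Excess risk among the exposed (attributable risk).
        "risk_difference": risk_exposed - risk_unexposed,
        # Share of the risk in the exposed that is due to the exposure.
        "attributable_proportion": (risk_exposed - risk_unexposed) / risk_exposed,
        # Excess risk in the whole population (PAR).
        "population_attributable_risk": risk_total - risk_unexposed,
        # Share of the population risk that is due to the exposure.
        "population_attributable_fraction": (risk_total - risk_unexposed) / risk_total,
    }


# Hypothetical counts, for illustration only.
measures = attributable_measures(a=30, b=70, c=10, d=90)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

For these counts the risk difference is 0.20 and the PAR is 0.10; the PAR is smaller because only half of this hypothetical population is exposed.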

The Framingham Heart Study is a good example of this type of prospective study. Started in 1948, the study is still ongoing. It was undertaken to determine the common factors contributing to cardiovascular disease. The Framingham risk score, which is based on the study, estimates the 10-year cardiovascular risk of an individual with no known cardiovascular disease.

Advantages of prospective cohort study include the following:

(i) Better for rare exposures

(ii) The disease incidence rate and relative risk can be determined

(iii) More than one disease associated with a single exposure can be studied

(iv) Exposure is recorded before the disease develops, which establishes the temporal sequence and strengthens the case for a cause-effect relationship

(v) Selection and information biases are minimized

However, the study has certain limitations as well. It requires a large population and a long time to complete. Loss of subjects during long follow-up adversely affects the outcome of the study. A prospective design is inefficient for rare diseases. Moreover, this type of study is expensive and can raise ethical issues.


Retrospective Cohort Study

A retrospective cohort study, also known as a historical cohort study, is a type of cohort study in which the data were collected in the past but are analyzed in the present (see the inset diagram). Here, the investigator retrospectively identifies the exposure and the outcome information. A retrospective study design is chosen for a rare or unusual exposure for which a prospective study design would not be appropriate. In addition, a retrospective study design can quickly estimate the effect of an exposure on a certain outcome as well as determine the disease association. This type of study is helpful in designing future studies and interventions. The data are collected from past medical records, administrative databases, patient interviews, etc. The odds ratio is used as the measure of association between the exposure or risk factor and the disease. The other measures are the same as in a prospective cohort study.

A classic example of a retrospective study is the one conducted by Case et al. (1954) to examine the excess risk of bladder cancer in men who worked in plants manufacturing certain dye intermediates. The authors first made a list of men who had worked for at least six months since 1920 in chemical plants manufacturing dyes in the UK. The investigators then searched retrospectively for cases of bladder cancer among those workers who had been employed in dye manufacturing chemical plants between 1921 and February 1952. The number of cases of bladder cancer among these workers was then compared with the number of bladder cancer cases in the general population to determine the excess risk of bladder cancer in men exposed to certain dye intermediates.
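One common way to compare a cohort with the general population is an observed-to-expected ratio, sketched below in Python. This is a generic illustration and is not presented as the exact method of Case et al.; the age bands, person-years, reference rates, and observed count are all hypothetical and are not data from their study.

```python
# Sketch: observed vs. expected cases in an occupational cohort.
# Every number below is hypothetical and chosen only to make the
# arithmetic easy to follow; none of it comes from Case et al. (1954).


def expected_cases(person_years: dict[str, float],
                   reference_rates: dict[str, float]) -> float:
    """Cases expected if the cohort had the general-population rates."""
    return sum(person_years[age] * reference_rates[age] for age in person_years)


# Follow-up time contributed by the cohort in each age band.
person_years = {"40-49": 5000.0, "50-59": 4000.0, "60-69": 1000.0}
# General-population rates, in cases per person-year.
reference_rates = {"40-49": 0.0001, "50-59": 0.0005, "60-69": 0.001}

observed = 14
expected = expected_cases(person_years, reference_rates)
ratio = observed / expected

print(f"Expected cases: {expected:.1f}")
print(f"Observed/expected ratio: {ratio:.1f}")
```

A ratio well above 1 would suggest an excess risk in the exposed workers compared with the general population.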

Advantages of retrospective cohort study include the following:

(i) Good for rare exposures

(ii) Unlike a prospective study, it takes a relatively short time to complete

(iii) Relatively inexpensive

(iv) Can be conducted for multiple cohorts

(v) Incidence can be estimated

(vi) Generally involves fewer ethical issues, since the exposure and the outcome have already occurred


However, the retrospective cohort study design has certain disadvantages. In particular, the chance of selection bias in sampling is relatively high. It may sometimes be difficult to identify an appropriate exposed group and a corresponding control or comparison group. Confounding is another issue in the historical study design, and loss to follow-up may also bias the study results. In addition, like a prospective cohort study, a retrospective cohort study is not appropriate for rare diseases. Available medical data that were not collected for the purpose of the study are often of poor quality and add error to the results.



1. Morabia, A (2004). A History of Epidemiologic Methods and Concepts. Birkhaeuser Verlag; Basel: p. 1-405.

2. Johns Hopkins OpenCourseWare.

3. Harris EL. Linking Exposures and Endpoints: Measures of Association and Risk.

4. Framingham Heart Study, a project of the National Heart, Lung, and Blood Institute and Boston University.


6. Case RAM, Hosker ME, McDonald DB, Pearson JT (1954). Tumors of the urinary bladder in workmen engaged in the manufacture and use of certain dyestuff intermediates in the British chemical industry. Part I. The role of aniline, benzidine, alpha-naphthylamine, and beta-naphthylamine. Br J Ind Med 11:75-104.

Analytical Study Designs in Medical Research

In medical research, it is important for a researcher to know the different analytical study designs. Each design has a different objective and aims to determine a different aspect of a disease, such as prevalence, incidence, cause, prognosis, or effect of treatment. It is therefore essential to identify the analytical study design appropriate to a given objective. Analytical studies are classified as experimental or observational. In an experimental study, the investigator examines the effect of the presence or absence of certain intervention(s); in an observational study, the investigator does not intervene but observes and assesses the relation between the exposure and the disease variable. Interventional studies, or clinical trials, fall under the category of experimental studies, in which the investigator assigns the exposure status. Observational studies are of four types: cohort studies, case-control studies, cross-sectional studies, and longitudinal studies.

[Figure: Classification of analytical studies]

When experimental studies are not indicated, not ethical to conduct, or very expensive, observational studies are probably the next best approach to answering certain investigative questions. Well-designed observational studies may also produce results similar to those of controlled trials, so observational studies need not always be regarded as a second-best option. To design an appropriate observational study, one should be able to distinguish between the four types of observational study and know which one suits the investigative question. The following is a brief discussion of the four types (each will be discussed in detail individually in my upcoming blogs):


Observational Analytical Study Designs

Cohort studies

Cohort methodology is one of the main tools of analytical epidemiological research. The word “cohort” is derived from the Latin word “cohors”, meaning unit. The word was adopted in epidemiology to refer to a set of people monitored for a period of time. In modern epidemiology, it is defined as a “group of people with defined characteristics who are followed up to determine incidence of, or mortality from, some specific disease, all causes of death, or some other outcome” (Morabia, 2004). In cohort studies, individuals who do not initially have the outcome of interest are identified and followed for a period of time. The group can be classified into subsets on the basis of exposure. For example, a group of people consisting of both smokers and non-smokers can be identified and followed for the incidence of lung cancer. At the beginning of the study, none of the individuals has lung cancer. The individuals are grouped into two subsets, smokers and non-smokers, and then followed for a period of time, during which characteristics such as smoking, BMI, eating habits, exercise habits, and family history of lung cancer or cardiovascular disease are recorded. Over time, some individuals develop the outcome of interest. The data collected over time can then be used to evaluate the hypothesis that smoking is related to the incidence of lung cancer. The following schematic shows the basic design of a cohort study. There are two types of cohort studies: prospective and retrospective. A prospective study starts in the present and follows the subjects into the future, i.e., the investigator waits for the disease to develop. A retrospective study, on the other hand, is carried out in the present on data collected in the past. This is also called a historical cohort study. In the next blog, I will discuss these in detail.

[Figure: Design of a cohort study]

Case-control studies

In terms of objective, case-control studies and cohort studies are the same. Both are observational analytical studies that aim to investigate the association between exposure and outcome. The difference lies in the sampling strategy. While cohort studies select subjects on the basis of exposure status, case-control studies select subjects on the basis of outcome status. Once the outcome status is identified, the subjects are divided into two sets: cases (who have the outcome) and controls (who do not). For example, consider a study designed to determine the relation between endometrial cancer and the use of conjugated estrogen. Subjects are chosen on the basis of outcome status (endometrial cancer), i.e., with the disease present (cases) or absent (controls), and the two subsets are then compared with respect to the exposure (use of conjugated estrogen). A case-control study is therefore retrospective in nature and cannot be used to calculate relative risk directly. However, the odds ratio can be calculated, and it approximates the relative risk when the outcome is rare. For rare outcomes, a case-control study is probably the only feasible analytical study approach.
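The claim that the odds ratio approximates the relative risk holds when the outcome is rare, and a short Python sketch can make this concrete. It uses two hypothetical full-cohort tables, one with a rare outcome and one with a common outcome, so that both measures can be computed side by side; in a real case-control study only the odds ratio would be available. The cell layout (a, b = exposed with/without disease; c, d = unexposed with/without disease) and all counts are assumptions for illustration.

```python
# Sketch: the odds ratio tracks the relative risk only for rare outcomes.
# Cell layout (assumed): a, b = exposed with/without disease;
#                        c, d = unexposed with/without disease.


def relative_risk(a: int, b: int, c: int, d: int) -> float:
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    return (a * d) / (b * c)


# Hypothetical tables: disease affects 0.1-0.2% vs. 30-60% of subjects.
rare_outcome = (20, 9980, 10, 9990)
common_outcome = (600, 400, 300, 700)

for label, cells in (("rare outcome", rare_outcome),
                     ("common outcome", common_outcome)):
    rr = relative_risk(*cells)
    or_ = odds_ratio(*cells)
    print(f"{label}: RR = {rr:.2f}, OR = {or_:.2f}")
```

In both tables the relative risk is 2, but the odds ratio stays close to 2 only in the rare-outcome table; with the common outcome it rises to 3.5 and overstates the relative risk.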

[Figure: Design of a case-control study]

Cross-sectional studies

A cross-sectional study is a type of observational analytical study used primarily to determine prevalence without manipulating the study environment. For example, a study can be designed to determine the cholesterol level in walkers and non-walkers without imposing any exercise regime on the non-walkers or modifying the activity of the walkers. Apart from cholesterol, other characteristics of interest, such as age, gender, food habits, educational level, occupation, and income, can also be measured. The data are collected at a single point in time, with no further follow-up. In a cross-sectional design, one can study a single population (only walkers) or more than one population (both walkers and non-walkers) at one point in time to see the association between cholesterol level and walking. However, this design does not allow the cause of a condition to be examined, since the subjects are not followed over time.
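As a rough sketch of how prevalence might be computed from such a survey, the Python snippet below counts how many walkers and non-walkers have high cholesterol at the single time of measurement. The records, the field names, and the idea of a yes/no "high cholesterol" flag are all hypothetical simplifications.

```python
# Sketch: prevalence from a cross-sectional survey (one measurement
# per person, no follow-up). All records below are hypothetical.

survey = [
    {"walker": True, "high_cholesterol": False},
    {"walker": True, "high_cholesterol": False},
    {"walker": True, "high_cholesterol": True},
    {"walker": True, "high_cholesterol": False},
    {"walker": False, "high_cholesterol": True},
    {"walker": False, "high_cholesterol": False},
    {"walker": False, "high_cholesterol": True},
    {"walker": False, "high_cholesterol": False},
]


def prevalence(records: list[dict], walker: bool) -> float:
    """Proportion of the chosen group with high cholesterol at survey time."""
    group = [r for r in records if r["walker"] is walker]
    return sum(r["high_cholesterol"] for r in group) / len(group)


prevalence_walkers = prevalence(survey, walker=True)
prevalence_non_walkers = prevalence(survey, walker=False)

print(f"Prevalence in walkers:     {prevalence_walkers:.2f}")
print(f"Prevalence in non-walkers: {prevalence_non_walkers:.2f}")
print(f"Prevalence ratio:          {prevalence_walkers / prevalence_non_walkers:.2f}")
```

The prevalence ratio describes an association at one point in time; because nobody is followed up, it says nothing about whether walking came before the difference in cholesterol.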

[Figure: Design of a cross-sectional study]

Longitudinal studies

Longitudinal studies, like cross-sectional studies, are a type of observational analytical study. The difference from a cross-sectional study is that the subjects are followed for a longer time; hence, a longitudinal study can contribute more to establishing an association between a causative factor and a condition. For example, a study may aim to determine the cholesterol level of a single population, say walkers, over a period of time, along with other characteristics of interest such as age, gender, food habits, educational level, occupation, and income. One may choose to examine the pattern of cholesterol level in men aged 35 years who walk daily for 10 years. The cholesterol level is measured at the onset of the activity (here, walking) and at follow-up throughout the defined time period, which makes it possible to detect any change or development in the characteristics of the population.
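The idea of detecting change over follow-up can be sketched in Python as a change-from-baseline calculation. The subject labels and cholesterol readings below are hypothetical, and a real analysis would normally use methods designed for repeated measurements rather than a simple difference.

```python
# Sketch: change from baseline in a longitudinal study.
# Each list holds one subject's readings at successive visits;
# all values are hypothetical.
from statistics import mean

cholesterol_by_visit = {
    "subject_1": [220, 210, 205],
    "subject_2": [240, 238, 230],
    "subject_3": [200, 202, 195],
}


def change_from_baseline(readings: list[int]) -> int:
    """Last reading minus the first (baseline) reading."""
    return readings[-1] - readings[0]


changes = {
    subject: change_from_baseline(readings)
    for subject, readings in cholesterol_by_visit.items()
}
mean_change = mean(changes.values())

for subject, change in changes.items():
    print(f"{subject}: {change:+d}")
print(f"Mean change from baseline: {mean_change:+.1f}")
```

A cross-sectional study would see only one of these columns of readings, which is why it cannot describe change within individuals.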

The following two tables summarize the different observational analytical studies with regard to their objectives and time frame.


In my upcoming blogs, I will define several terms related to study designs, such as risk factor, odds ratio, probability, and confounding factors, discuss each analytical study design in detail, and give tips on choosing the correct design for a research question. Visit the blog section of the website for more such informative and educational topics.


[1] Morabia, A (2004). A History of Epidemiologic Methods and Concepts. Birkhaeuser Verlag; Basel: p. 1-405.

[2] Hulley, S.B., Cummings, S.R., Browner, W.S., et al. (2001). Designing Clinical Research: An Epidemiologic Approach. 2nd Ed. Lippincott Williams & Wilkins; Philadelphia: p. 1-336.

[3] Merrill, R.M., Timmreck, T.C. (2006). Introduction to Epidemiology. 4th Ed. Jones and Bartlett Publishers; Mississauga, Ontario: p. 1-342.

[4] Lilienfeld, A.M., Lilienfeld, D.E. (1980). Foundations of Epidemiology. Oxford University Press; London.