Importance of Statistical Review of Manuscripts

Statistics: A branch of mathematics that deals with the collection, organization, analysis, interpretation, and presentation of data. In simple terms, it deals with the philosophy, logic, and expression of data.

Who does the statistical review?

Statistical review is usually done by expert statisticians, or by authors and journal editors with statistical knowledge. It comprises the statistical and methodological questions that the reviewer puts forward to be answered by the author or the journal editors.

Role of the statistical reviewers:

  • The statistical reviewers identify possible sources of statistical error in the manuscript, which improves the statistical accuracy of the paper and helps it reach publication sooner.
  • They carry out all forms of statistical checking: whether data are missing; whether the correct statistical methods were chosen and applied appropriately; whether there are statistical errors, such as an error in the level of significance used in the analysis; whether the name and version of the statistical package are reported; whether units of measurement are stated properly; whether the tables and figures carry self-explanatory footnotes; and so on.
  • They ensure that data are presented properly throughout the manuscript and that statistical language is used correctly in the data presentation section.
  • They check whether the conclusions of the manuscript are justified by the data presented.
  • They also cross-check whether the discussion section is supported by the results.

Significance of statistical review:

  • Major statistical errors in the data presentation section may lead to rejection of the research paper, so reviewing the statistical data and their presentation is of utmost importance for the author. The most frequent statistical problems in manuscripts are found in the study design, the analysis, and the interpretation and presentation of data.
  • Sound statistics is the foundation of high-quality quantitative research.

Gift authorship: A provocative issue

Assigning authorship to the real contributors can be a tricky business for scientific and academic writers.

What is gift authorship? In simple terms, it is co-authorship conferred on those who have made little or no intellectual contribution to the study. Gift authorship merely lends a stamp of authority to non-contributors.

Authorship criteria: A person credited under this type of authorship does not meet the authorship criteria defined by various international bodies. One of the major international bodies, the International Committee of Medical Journal Editors (ICMJE), states the authorship criteria as follows:

(i) Substantial contributions to conception and design, acquisition or collection, analysis and interpretation of data;

(ii) Drafting or revising the article critically for important intellectual content; and

(iii) Giving final approval of the version of the article to be published.

Ethical viewpoint of gift authorship: Gift authorship is regarded as highly unethical, because the person honored with authorship has contributed neither to the design nor to the data analysis and does not meet the authorship criteria. He/she may not even have expertise in the field of the study, which may embarrass the institute or the publishing journal. This kind of authorship has become widespread in medical writing and demands immediate action, as it dilutes the credit due to the authors who have done the real work on the study.

Reasons for researchers opting for gift authorship: Several reasons have been identified that push junior researchers toward gift authorship. Some of the most cited are:

(i) A superiority complex on the part of the guide or lead supervisor, who does not allow junior counterparts to publish research articles independently; and

(ii) The belief of some researchers that adding the name of their head supervisor or institutional head will increase readership and help their articles get published.

Researchers should avoid such authorship, which erodes the credibility of the real contributing authors and dilutes their work profile.

Is self-plagiarism ethical?

Research papers and journals are the medium for spreading knowledge and new ideas. An innovative and original piece of work is certainly more educative and admirable. Nevertheless, authors are often found reusing their old work, or extracts from their previously published papers, when writing a new research paper.

When questions are raised about this reuse, authors claim that the material is their own work, that they can therefore reuse it as they wish, and that it cannot be termed plagiarism because they have not stolen ideas from any other author or source.

Because the usual ethics of plagiarism do not obviously apply to such reuse, it has long been overlooked. While the debate over whether this reuse is ethical continues, publishers and journals have set guidelines for such work, terming it self-plagiarism.

What is self-plagiarism?

Self-plagiarism is a form of plagiarism in which the writer reuses his/her own previously published work, in part or in full, while creating a new paper. It can breach the publisher’s copyright on the published work when that work is reused in a new paper without appropriate citation. Let us now look at the ethical aspects of self-plagiarism.

Self-plagiarism can be detected when:

a)  A published paper is republished elsewhere without the consent of the co-authors and the publisher.

b)  A large study is published in small sections with the intention of increasing the number of publications.

c)  Previously written work, whether published or not, is reused in portions in a new paper.

Although there are no enforced laws against self-plagiarism, it reflects poorly on the honesty of the author. Moreover, journals and publishers reject such copy-paste work, as they seek writing based on original research findings with proper citation of all references.

Nowadays, journals also question the reuse of one’s own work. To avoid self-plagiarism, one should keep his/her work original and, where it is necessary to include a portion of previous work, cite it properly. I hope this article helps you detect possible self-plagiarism before submitting your paper to a publisher or journal.

Analytical Study Design in Medical Research: Cohort study

Cohort studies are observational analytical studies. As mentioned in my previous blog (http://blog.manuscriptedit.com/2014/02/overview-different-analytical-study-designs-medical-research/), the word ‘cohort’ is derived from the Latin word ‘cohors’, which means unit. In a cohort study, the study population is drawn from the general population and includes both subjects exposed to an agent suspected of causing disease and subjects not exposed to it. The population is followed for a long period of time, and the incidence of disease in the exposed group is compared with that in the unexposed group. The objective of a cohort study is therefore to find an association between a suspected cause(s) and disease. If performed correctly, cohort studies can yield results comparable to those of experimental analytical studies. The following can be measured in a cohort study design: absolute risk or incidence, relative risk (risk ratio or rate ratio), risk difference, and attributable proportion. Cohort studies are classified as prospective or retrospective based on the timing of enrollment of subjects relative to the disease outcome.



Prospective Cohort Study

As the name suggests, a prospective cohort study starts with a population of non-diseased subjects, all of whom are at risk of developing a certain disease, and the investigator waits for the disease to develop. The population is divided into two groups: one exposed to the agent or environment suspected to be associated with the disease, and the other unexposed but equally susceptible to the disease. The population is then followed up for a defined period of time or until the condition or disease develops, after which the incidence of disease in the exposed and unexposed populations is calculated (see the following schematic diagram). Incidence rate is therefore the measure of disease in cohort studies. The association with disease is measured by relative risk (RR).

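[Figures 2a and 2b: schematic diagram of a prospective cohort study and the corresponding table of exposure against disease]

From the above table, the incidence of disease in the exposed and unexposed populations, and hence the relative risk (RR), can be calculated:

RR = incidence in the exposed group / incidence in the unexposed group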

Alternatively, the odds ratio (OR), which is the ratio of two odds, can also be used as a measure of association. Odds are the ratio of the chance of something happening to the chance of it not happening. In this case, the OR can be calculated as follows:

 

OR = a/b : c/d
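Both measures of association can be computed directly from the four cells of a 2×2 table. The sketch below is illustrative only. It assumes the conventional layout, in which a and b are the exposed subjects with and without disease and c and d are the unexposed subjects with and without disease; this layout is an assumption, chosen to agree with OR = a/b : c/d. The function names are hypothetical.

```python
from fractions import Fraction


def relative_risk(a, b, c, d):
    """Risk of disease in the exposed divided by risk in the unexposed.

    Assumed 2x2 layout (an assumption, not taken from the original table):
      a = exposed, diseased      b = exposed, not diseased
      c = unexposed, diseased    d = unexposed, not diseased
    """
    risk_exposed = Fraction(a, a + b)
    risk_unexposed = Fraction(c, c + d)
    return risk_exposed / risk_unexposed


def odds_ratio(a, b, c, d):
    """Odds of disease in the exposed divided by odds in the unexposed."""
    return Fraction(a, b) / Fraction(c, d)
```

With hypothetical counts a = 30, b = 70, c = 10, d = 90, relative_risk returns 3 and odds_ratio returns 27/7 (about 3.86).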

 

The attributable risk, that is, the disease attributable to the exposure, can also be calculated with respect to the total population; this is known as the population attributable risk (PAR).

The Framingham Heart Study is a good example of this type of prospective study. Started in 1948, the study is still going on. It was undertaken to determine the common factors contributing to cardiovascular disease. The Framingham risk score, based on the study, can estimate the 10-year cardiovascular risk of an individual with no known cardiovascular disease.

Advantages of prospective cohort study include the following:

(i) Better for rare exposures

(ii) The disease incidence rate and relative risk can be determined

(iii) More than one disease associated with a single exposure can be studied

(iv) The study is able to establish the cause-effect relationship

(v) Selection and information biases are minimized

However, the study has certain limitations as well. It requires a large population and a long time to complete. Loss of subjects during long follow-up adversely affects the outcome of the study. A prospective study is inefficient for rare diseases. Moreover, this type of study is expensive and raises ethical issues too.

 

Retrospective Cohort Study

A retrospective cohort study, also known as a historical cohort study, is a type of cohort study in which the data were collected in the past but are analyzed in the present (see the inset diagram). Here, the investigator retrospectively identifies the exposure and the outcome information. A retrospective design is chosen for a rare or unusual exposure for which a prospective design would not be appropriate. In addition, a retrospective design can quickly estimate the effect of an exposure on a certain outcome and determine the disease association. This type of study is helpful in designing future studies and interventions. The data are collected from past medical records, administrative databases, patient interviews, etc. The odds ratio is used as the measure of association between the exposure or risk factor and the disease. The other measurements are the same as in a prospective cohort study.

A classic example of a retrospective study is the one conducted by Case et al (1954) to examine the excess risk of bladder cancer in men who worked in plants manufacturing certain dye intermediates. The authors first made a list of men who had worked in dye-manufacturing chemical plants in the UK for at least six months since 1920. They then searched retrospectively for cases of bladder cancer among those workers who had been employed between 1921 and February 1952. The number of cases of bladder cancer among these workers was compared with the number of bladder cancer cases in the general population to determine the excess risk of bladder cancer in men exposed to certain dye intermediates.

Advantages of retrospective cohort study include the following:

(i) Good for rare exposures

(ii) Unlike a prospective study, it takes a relatively short time to complete

(iii) Relatively inexpensive

(iv) Can be conducted for multiple cohorts

(v) Incidence can be estimated

(vi) No ethical issues are involved

 

However, the retrospective cohort design has certain disadvantages; in particular, the chance of selection bias in sampling is relatively high. It may sometimes be difficult to identify an appropriate exposed group and a corresponding control or comparison group. Confounding is another issue in the historical design, and loss to follow-up may also bias the results. In addition, like the prospective cohort study, the retrospective cohort study is not appropriate for rare diseases. Available medical data that were not collected for the purpose of the study are often of poor quality and add error to the results.

 

References

1. Morabia, A (2004). A History of Epidemiologic Methods and Concepts. Birkhaeuser Verlag; Basel: p. 1-405.

2. Johns Hopkins OpenCourseWare.

http://ocw.jhsph.edu/courses/fundepiii/lectureNotes.cfm

3. Harris EL. Linking Exposures and Endpoints: Measures of Association and Risk.

http://www.genome.gov/pages/about/od/opg/epidemiologyforresearchers/3_harris.pdf

4. Framingham Heart Study, a project of the National Heart, Lung, and Blood Institute and Boston University. http://www.framinghamheartstudy.org/fhs-bibliography/index.php

5. http://www.iarc.fr/en/publications/pdfs-online/epi/cancerepi/CancerEpi-8.pdf

6. Case RAM, Hosker ME, McDonald DB, Pearson JT (1954). Tumors of the urinary bladder in workmen engaged in the manufacture and use of certain dyestuff intermediates in the British chemical industry. Part I. The role of aniline, benzidine, alpha-naphthylamine, and beta-naphthylamine. Br J Ind Med 11:75-104.

Analytical Study Design in Medical Research: Measures of risk and disease association

While designing any analytical study in medical research, a researcher should be aware of a few basic terms in epidemiology needed to measure disease risk and association. This blog article defines the terms used for calculating disease risk and association. There are two types of measurement: measures of risk and measures of association.

Measures of Risk

Risk is defined as the probability of an individual developing a condition or disease over a period of time.

Risk = Chances of something happening / Chances of all things happening

Odds = Chances of something happening / Chances of it not happening

Therefore, “Risk” is a proportion, while “Odds” is a ratio.
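Because risk and odds are built from the same counts, one can be converted into the other. The following is a minimal sketch of that conversion; the function names are illustrative and do not come from any library.

```python
from fractions import Fraction


def odds_from_risk(risk):
    """Convert a risk (a proportion in [0, 1)) into odds."""
    if not 0 <= risk < 1:
        raise ValueError("risk must be at least 0 and less than 1")
    return risk / (1 - risk)


def risk_from_odds(odds):
    """Convert non-negative odds back into a risk."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1 + odds)


# A risk of 20 in 100 corresponds to odds of 20 to 80, i.e. 1/4.
example_odds = odds_from_risk(Fraction(20, 100))
```

For small risks the two numbers are close (a risk of 1/100 gives odds of 1/99), but they diverge as the event becomes more common.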

Incidence: Incidence is a measure of risk that describes the number of new cases of a condition that develop during a specified period of time. Another important term in this context is “incidence proportion”, defined as the number of new cases of the condition during a specified period divided by the total population at risk at the start of that period (those who go on to develop the condition plus those who do not).

For example, among 100 non-diseased persons initially at risk, 20 develop a disease/condition over a period of five years.

Incidence = 20 cases

Incidence proportion = 20 cases per 100 persons, i.e., 20%

Incidence rate = 20 cases in 100 persons over 5 years, i.e., 4 per 100 person-years (taking the follow-up as 100 × 5 = 500 person-years)
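The arithmetic of this example can be checked with a short sketch. It makes one simplifying assumption that the example does not state: every one of the 100 persons is taken to contribute the full five years of follow-up, giving 500 person-years. The function names are illustrative.

```python
def incidence_proportion(new_cases, population_at_risk):
    """New cases divided by the number of people initially at risk."""
    return new_cases / population_at_risk


def incidence_rate(new_cases, person_years, per=100):
    """New cases per `per` person-years of follow-up."""
    return new_cases * per / person_years


new_cases, at_risk, years = 20, 100, 5

# Simplifying assumption: each of the 100 persons is followed for 5 full years.
person_years = at_risk * years

proportion = incidence_proportion(new_cases, at_risk)
rate = incidence_rate(new_cases, person_years)

print(f"Incidence proportion: {proportion:.0%}")          # 20%
print(f"Incidence rate: {rate:g} per 100 person-years")   # 4 per 100 person-years
```

In a real cohort, people who develop the disease or are lost to follow-up stop contributing person-time, so the denominator would be smaller than 500 and the rate somewhat higher.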

Prevalence: Prevalence is the number of people having a condition at a specific point in time divided by the total population studied. This is specifically called point prevalence. For example, if five persons are found to have a condition among 100 people studied on a certain date, the point prevalence is 5%. Two more terms need to be defined in this regard: period prevalence and lifetime prevalence (LTP). The former is the number of people having the disease during a certain period, say a month or a year, divided by the total population studied during that period. LTP, on the other hand, is the number of people who have had the disease at some point in their life divided by the total population studied.

There is a subtle difference between incidence and prevalence: incidence is the frequency of new events, while prevalence is the frequency of existing events.

Cumulative Risk: Cumulative risk is defined as the probability of developing a condition over a period of time.

Measures of Association

Association is defined as a statistical relationship between two or more variables.

The following measurements are important for measuring the strength of association of a disease in etiological studies and hypothesis testing. The terms defined below are used to measure the association between exposure and disease.

Relative risk (RR): The relative risk is measured as a ratio of two risks.

For example, in a population of 100 people consisting of 50 men and 50 women, 20 men and 10 women develop tuberculosis.

Risk in men: 20/50

Risk in women: 10/50

Therefore, the relative risk (RR) of developing tuberculosis in men compared with women is

RR = 20/50 : 10/50 = 2.0

i.e., men are at double the risk of developing tuberculosis compared with women.

Odds ratio (OR): The odds ratio is the ratio of two odds (odds are defined above).

Continuing the previous example of tuberculosis in men and women in a total population of 100:

Odds in men: 20/30

Odds in women: 10/40

Odds ratio (OR) = 20/30 : 10/40 = 2.67

Therefore, the odds of men developing tuberculosis are 2.67 times as high as those of women.
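The tuberculosis example can be reproduced in a few lines. This is only a sketch that reuses the counts given in the example (50 men with 20 cases, 50 women with 10 cases); the variable names are illustrative.

```python
from fractions import Fraction

# Counts from the example: 50 men (20 cases) and 50 women (10 cases).
men_cases, men_total = 20, 50
women_cases, women_total = 10, 50

# Risk = cases / everyone in the group.
risk_men = Fraction(men_cases, men_total)        # 20/50
risk_women = Fraction(women_cases, women_total)  # 10/50
relative_risk = risk_men / risk_women

# Odds = cases / non-cases in the group.
odds_men = Fraction(men_cases, men_total - men_cases)          # 20/30
odds_women = Fraction(women_cases, women_total - women_cases)  # 10/40
odds_ratio = odds_men / odds_women

print(f"RR = {float(relative_risk):.2f}")  # RR = 2.00
print(f"OR = {float(odds_ratio):.2f}")     # OR = 2.67
```

The OR (8/3) is larger than the RR (2) here because the condition is common in this example; the two measures move closer together only when the outcome is rare.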

To measure the impact of the disease association on public health, the following measurements are important. All of them assume that the association between exposure and disease is causal.

Attributable risk (AR): Amount of disease attributed to the exposure i.e., the difference between the incidence of disease in the exposed group (Ie) and the incidence of disease in the unexposed group (Iue).

AR = Ie – Iue

Attributable (risk) fraction (ARF): ARF is the proportion of disease in the exposed population whose disease can be attributed to the exposure.

ARF = (Ie – Iue) / Ie

Population attributable risk (PAR): The part of the incidence of disease in the total population (Ip) that can be attributed to the exposure.

PAR = Ip – Iue

Population attributable (risk) fraction (PARF): PARF is the proportion of the disease in the total population whose disease can be attributed to the exposure.

PARF = (Ip – Iue) / Ip
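The four impact measures can be computed together from the three incidences Ie, Iue and Ip. The sketch below is illustrative: the function name is hypothetical and the incidences used are invented numbers, not data from any study. The fractions are taken as the difference divided by the reference incidence, i.e. (Ie – Iue) / Ie and (Ip – Iue) / Ip.

```python
from fractions import Fraction


def attributable_measures(i_exposed, i_unexposed, i_population):
    """Return AR, ARF, PAR and PARF from three incidences.

    i_exposed    = Ie,  incidence in the exposed group
    i_unexposed  = Iue, incidence in the unexposed group
    i_population = Ip,  incidence in the total population
    """
    ar = i_exposed - i_unexposed
    par = i_population - i_unexposed
    return {
        "AR": ar,
        "ARF": ar / i_exposed,
        "PAR": par,
        "PARF": par / i_population,
    }


# Hypothetical incidences, chosen only for illustration: 30% in the exposed,
# 10% in the unexposed, and 20% overall (half the population exposed).
result = attributable_measures(
    i_exposed=Fraction(30, 100),
    i_unexposed=Fraction(10, 100),
    i_population=Fraction(20, 100),
)
```

With these hypothetical inputs, AR = 1/5, ARF = 2/3, PAR = 1/10 and PARF = 1/2: two thirds of the disease among the exposed, and half of the disease in the whole population, would be attributed to the exposure if the association is causal.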

 

Bias and Confounding Factors

In an epidemiological study, when an association is found between exposure and disease, it is important to check first whether the association is real. One needs to ask whether the association arose by chance because of an inadequate sample size, or because of some kind of bias in the design or measurement.

Bias is a systematic error in design, conduct, or analysis that results in a spurious association of exposure with disease. Three types of bias are possible: (i) Selection bias, (ii) Information bias, and (iii) Confounding.

Selection bias occurs when the way participants are selected into one group differs from the way they are selected into the other group(s), so that the groups differ in ways that affect the outcome. Information bias occurs when information is collected differently from the groups being compared.

Confounding occurs when the observed association between exposure and disease differs from the truth because of the influence of a third variable that has not been considered in the analysis. For example, a person suffers from headache when he is under stress; however, he also eats a lot of junk food, especially when under stress. It is therefore hard to tell what actually causes the headache: the stress itself, or lack of sleep, anxiety, or gas formation due to indigestion. All these variables should therefore be adjusted for before associating mental stress with headache.
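One simple way to see the effect of a third variable is to compare the crude association with the association inside each level of that variable (stratification). The sketch below uses counts that are entirely hypothetical, invented only to illustrate the idea, with stress as the exposure, headache as the outcome, and junk food as the third variable; the function name is illustrative.

```python
from fractions import Fraction

# Hypothetical counts, invented purely for illustration. Each tuple is
# (stressed_total, stressed_with_headache,
#  unstressed_total, unstressed_with_headache).
strata = {
    "eats junk food": (80, 40, 20, 10),
    "no junk food": (20, 2, 80, 8),
}


def risk_ratio(exp_total, exp_cases, unexp_total, unexp_cases):
    """Risk in the exposed divided by risk in the unexposed."""
    return Fraction(exp_cases, exp_total) / Fraction(unexp_cases, unexp_total)


# Crude analysis: pool the strata and ignore junk food altogether.
crude_counts = [sum(column) for column in zip(*strata.values())]
crude_rr = risk_ratio(*crude_counts)

# Stratified analysis: compare stressed and unstressed people
# separately within each level of the third variable.
stratum_rr = {name: risk_ratio(*counts) for name, counts in strata.items()}

print(f"crude RR = {float(crude_rr):.2f}")  # crude RR = 2.33
for name, rr in stratum_rr.items():
    print(f"RR among '{name}' = {float(rr):.2f}")  # 1.00 in both strata
```

In these invented data the crude relative risk of 2.33 disappears once people are compared within the same junk-food group, which is what an association produced by a third variable looks like. Real data are rarely this clean, and adjustment is usually done with methods such as stratified or regression analysis.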

 


Analytical Study Designs in Medical Research

In medical research, it is important for a researcher to know the different analytical study designs. Each has different objectives and aims to determine a different aspect of a disease(s), such as prevalence, incidence, cause, prognosis, or effect of treatment. It is therefore essential to identify the analytical study appropriate to a given objective. Analytical studies are classified as experimental or observational. In an experimental study, the investigator examines the effect of the presence or absence of certain intervention(s); in an observational study, the investigator does not intervene but observes and assesses the relation between exposure and disease. Interventional studies, or clinical trials, fall under the category of experimental study, in which the investigator assigns the exposure status. Observational studies are of four types: cohort studies, case-control studies, cross-sectional studies, and longitudinal studies.

Classification of Analytical studies

Experimental studies are sometimes not indicated, unethical, or very expensive to conduct; in such cases observational studies are probably the next best approach to answering certain investigative questions. Well-designed observational studies may also produce results similar to those of controlled trials, so they need not be regarded as a second-best option. To design an appropriate observational study, one should be able to distinguish between the four observational designs and know when each applies, depending on the investigative question. Following is a brief discussion of the four observational designs (each will be discussed in detail in my upcoming blogs):

 

Observational Analytical Study Designs

Cohort studies

Cohort methodology is one of the main tools of analytical epidemiological research. The word “cohort” is derived from the Latin word “cohors”, meaning unit. The word was adopted in epidemiology to refer to a set of people monitored for a period of time. In modern epidemiology, it is defined as a “group of people with defined characteristics who are followed up to determine incidence of, or mortality from, some specific disease, all causes of death, or some other outcome” (Morabia, 2004). In cohort studies, individuals who do not initially have the outcome of interest are identified and followed for a period of time. The group can be divided into subsets on the basis of exposure. For example, a group consisting of both smokers and non-smokers can be identified and followed for the incidence of lung cancer. At the beginning of the study none of the individuals has lung cancer; the individuals are grouped into two subsets, smokers and non-smokers, and then followed for a period of time, with different characteristics recorded, such as smoking, BMI, eating habits, exercise habits, and family history of lung cancer or cardiovascular disease. Over time, some individuals develop the outcome of interest. From the data collected, the hypothesis that smoking is related to the incidence of lung cancer can be evaluated. The following schematic shows the basic design of a cohort study. There are two types of cohort study: prospective and retrospective. A prospective study begins in the present and follows subjects into the future, i.e., waiting for the disease to develop. A retrospective study, on the other hand, is carried out in the present on data collected in the past; it is also called a historical cohort study. In the next blog, I will discuss these in detail.

Design of a Cohort study

Case-control studies

In terms of objective, case-control studies and cohort studies are the same. Both are observational analytical studies that aim to investigate the association between exposure and outcome. The difference lies in the sampling strategy: while cohort studies identify subjects based on exposure status, case-control studies identify subjects based on outcome status. Once the outcome status is identified, the subjects are divided into two sets: cases (who have the outcome) and controls (who do not). Consider, for example, a study of the relation between endometrial cancer and the use of conjugated estrogen. Subjects are chosen on the basis of outcome status (endometrial cancer), i.e., with disease present (cases) or absent (controls), and the two subsets are then compared with respect to the exposure (use of conjugated estrogen). A case-control study is therefore retrospective in nature and cannot be used to calculate relative risk. However, the odds ratio can be measured, and it approximates the relative risk when the outcome is rare. For rare outcomes, a case-control study is probably the only feasible analytical approach.

Design of a Case-Control Study
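The reason the odds ratio from a case-control study can stand in for the relative risk is easiest to see numerically. The sketch below uses two hypothetical full populations (invented counts, where both measures can be computed) and compares them when the outcome is rare and when it is common. The function name and the cell layout (a, b = exposed with and without the outcome; c, d = unexposed with and without the outcome) are assumptions made for illustration.

```python
from fractions import Fraction


def rr_and_or(a, b, c, d):
    """Return (relative risk, odds ratio) for a 2x2 table.

    Assumed layout: a, b = exposed with / without the outcome;
                    c, d = unexposed with / without the outcome.
    """
    relative_risk = Fraction(a, a + b) / Fraction(c, c + d)
    odds_ratio = Fraction(a * d, b * c)
    return relative_risk, odds_ratio


# Hypothetical counts, chosen only to illustrate the point.
rare_rr, rare_or = rr_and_or(10, 990, 5, 995)           # outcome in 1% vs 0.5%
common_rr, common_or = rr_and_or(400, 600, 200, 800)    # outcome in 40% vs 20%

print(f"rare outcome:   RR = {float(rare_rr):.2f}, OR = {float(rare_or):.2f}")
print(f"common outcome: RR = {float(common_rr):.2f}, OR = {float(common_or):.2f}")
```

In both hypothetical populations the relative risk is 2, but the odds ratio is about 2.01 when the outcome is rare and about 2.67 when it is common, so the approximation holds only in the first case.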

Cross-sectional studies

A cross-sectional study is a type of observational analytical study used primarily to determine prevalence without manipulating the study environment. For example, a study can be designed to determine the cholesterol level in walkers and non-walkers without imposing any exercise regime on the non-walkers or modifying the activity of the walkers. Apart from cholesterol, other characteristics of interest, such as age, gender, food habits, educational level, occupation, and income, can also be measured. The data are collected at one point in time, with no further follow-up. In a cross-sectional design, one can study a single population (only walkers) or more than one population (both walkers and non-walkers) at one point in time to see the association between cholesterol level and walking. However, this design does not allow the cause of a condition to be examined, since the subjects are never followed over time.

Design of a Cross-Sectional Study

Longitudinal studies

Longitudinal studies, like cross-sectional studies, are a type of observational analytical study. The difference from the cross-sectional design is that the subjects are followed up over a longer time; hence, a longitudinal study can contribute more to establishing the cause of a condition. For example, a study may aim to determine the cholesterol level of a single population, say walkers, over a period of time, along with other characteristics of interest such as age, gender, food habits, educational level, occupation, and income. One may choose to examine the pattern of cholesterol level in men aged 35 years who walk daily for 10 years. The cholesterol level is measured at the onset of the activity (here, walking) and followed up throughout the defined period, which makes it possible to detect any change or development in the characteristics of the population.

The following two tables summarize the different observational analytical studies with regard to their objectives and time-frame.

[Figure 5: summary tables of observational analytical studies by objective and time-frame]

In my upcoming blogs, I will define several terms related to study designs, such as risk factor, odds ratio, probability, and confounding factors, discuss each analytical study design in detail, and give tips on choosing the correct design for a research question. Visit the blog section of the website (www.manuscriptedit.com) for more such informative and educative topics.

References

[1] Morabia, A (2004). A History of Epidemiologic Methods and Concepts. Birkhaeuser Verlag; Basel: p. 1-405.

[2] Hulley, S.B., Cummings, S.R., Browner, W.S., et al (2001). Designing Clinical Research: An Epidemiologic Approach. 2nd Ed. Lippincott Williams & Wilkins; Philadelphia: p. 1-336.

[3] Merrill, R.M., Timmreck, T.C. (2006). Introduction to Epidemiology. 4th Ed. Jones and Bartlett Publishers; Mississauga, Ontario: p. 1-342.

[4] Lilienfeld, A.M., and Lilienfeld, D.E. (1980): Foundations of Epidemiology. Oxford University Press, London.