A study is conducted to assess the relationship between smoking status and forced expiratory volume in one second (FEV1). Subjects are randomly selected and categorized based on smoking status. Group A consists of 200 nonsmokers, group B consists of 200 light smokers (1-7 cigarettes per day), group C consists of 200 moderate smokers (8-22 cigarettes per day), and group D consists of 200 heavy smokers (23+ cigarettes per day). FEV1 is quantitatively measured in all participants using properly calibrated office spirometers. Which of the following is the most appropriate statistical method to compare the mean FEV1 results among all 4 groups?
Show Explanatory Sources
Analysis of variance (ANOVA) is used to determine whether there are any significant differences between the means of several independent groups.
ANOVA compares the means between the groups relative to the variability within groups (F-test) and determines whether any of those means are significantly different from one another. Specifically, it tests the null hypothesis that all groups are simply random samples of the same population (ie, the means are the same). The null hypothesis is rejected when there are at least 2 group means that are significantly different from one another.
ANOVA can be used to compare ≥2 groups but is generally used to compare ≥3 groups (because other equivalent methods exist to compare 2 groups). For example, the 2 independent samples t-test is a special case of the F-test in ANOVA. The assumptions for both tests and their resulting p-values are the same.
(Choice B) Chi-square tests can be used to evaluate the association between 2 categorical variables. For example, if FEV1 is measured as a categorical variable (eg, normal or low), then a chi-square test could be used to determine if there is an association between FEV1 and smoking status. However, this study is specifically comparing the mean FEV1 results (a quantitative variable) between groups.
(Choice C) Meta-analysis involves the pooling of data from several studies to perform an analysis with greater statistical power than the individual studies alone. For example, individual studies assessing the effects of aspirin on certain cardiovascular events may be inconclusive. However, analysis of data compiled from multiple clinical trials may reveal a significant benefit.
(Choice D) Multiple logistic regression is a method used to predict the probability of a binary outcome (eg, presence or absence of gastric cancer) based on 1 or more independent variables that can be either continuous or categorical. For example, this test could be used to predict the probability of gastric cancer based on alcohol consumption, tobacco use, and charred food consumption.
(Choice E) The Pearson correlation coefficient is a measure of the strength and direction of a linear relationship between 2 quantitative (ie, continuous) variables. For example, a study may report a correlation coefficient describing the association between hemoglobin A1c levels and average blood glucose levels.
(Choice F) A two-sample t-test can be used when 2 group means are compared. This test could have been used for the example in the question if the study participants were divided into smoking and nonsmoking groups only (ie, 2 groups instead of 4).
Educational objective:
A t-test is used to compare the difference between the means of 2 groups. Analysis of variance (ANOVA) compares the difference between the means of 2 or more groups. Results from a t-test and ANOVA test will be equivalent when comparing the difference between the means of 2 groups.