A study was conducted to assess the association between hormone replacement therapy (HRT) in post-menopausal women and the level of serum C-reactive protein (CRP), categorized as "high" or "normal" based on predefined values. The data from the study are presented below:
| | CRP high | CRP normal | Total |
|---|---|---|---|
| HRT | 32 | 41 | 73 |
| No HRT | 28 | 49 | 77 |
| Total | 60 | 90 | 150 |
Which of the following is the best statistical method to assess the association between HRT and CRP levels?
| Independent variable | Dependent variable: qualitative (categorical) | Dependent variable: quantitative |
|---|---|---|
| Qualitative (categorical) | Chi-square, logistic regression* | t test, ANOVA, linear regression |
| Quantitative | Logistic regression* | Correlation, linear regression |

*Dependent variable must be dichotomous. ANOVA = analysis of variance.
Variables are broadly classified as qualitative (ie, categorical) or quantitative (ie, continuous) based on the scale of measurement. Qualitative variables (eg, type of treatment, blood type) represent categories or groups, whereas quantitative variables (eg, temperature, glucose level) represent numerical values. The scale of measurement of the dependent (eg, outcome) and independent (eg, exposure, risk factor) variables in a study determines the correct statistical test for any given situation.
The chi-square test is used to compare the proportions of a categorical outcome across the groups of a categorical exposure; that is, both the independent and dependent variables are qualitative. In this case, the outcome (CRP level) is categorized as either "high" or "normal" and cross-tabulated with the categorized exposure ("HRT" or "no HRT") in a 2 × 2 table. In one commonly used chi-square test, the observed value in each cell is compared with the value expected under the hypothesis of no association. If the differences between the observed and expected values are large, an association between the exposure and the outcome is inferred.
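The observed-versus-expected comparison can be sketched for the study's 2 × 2 table as follows. This is a minimal illustration, assuming the Pearson chi-square statistic without a continuity correction; the function name `chi_square_2x2` is hypothetical, and the p-value uses the identity that holds for 1 degree of freedom only.

```python
import math


def chi_square_2x2(observed):
    """Pearson chi-square for a 2 x 2 table (no continuity correction).

    `observed` is [[a, b], [c, d]] with exposure groups as rows and
    outcome categories as columns. Returns (statistic, expected, p_value).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    # Expected count under no association: row total * column total / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Sum of (observed - expected)^2 / expected over all four cells
    statistic = sum(
        (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )

    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, expected, p_value


# Rows: HRT, no HRT; columns: CRP high, CRP normal
statistic, expected, p_value = chi_square_2x2([[32, 41], [28, 49]])
print(f"expected counts: {expected}")
print(f"chi-square = {statistic:.3f}, p = {p_value:.3f}")
```

For these data the expected counts are 29.2, 43.8, 30.8, and 46.2, and the statistic is approximately 0.87. That is well below 3.84, the critical value for 1 degree of freedom at a significance level of 0.05, so the table gives no evidence of an association between HRT and CRP level.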
(Choice A) Correlation analysis measures the strength and direction of a linear relationship between 2 quantitative variables. In the given 2 × 2 table, the variables are categorized (ie, qualitative rather than quantitative).
(Choices B and D) The two-sample z-test and two-sample t-test are used to compare 2 means, not proportions. Analysis of variance (ANOVA) is used to compare the means of ≥2 groups.
(Choice E) Meta-analysis is an epidemiologic method that pools data from several studies to achieve greater statistical power than any single study.
Educational objective:
The chi-square test is used to compare proportions. A 2 × 2 table may be used to compare the observed values with the expected values.