1
Question:

A genome-wide association study (GWAS) of rheumatoid arthritis (RA) is performed in a cohort of Portuguese patients, including 907 cases with RA and 1,524 controls without RA.  Logistic regression is used to test the association between RA and hundreds of thousands of loci.  These association results are then compared with data from a European GWAS cohort of 4,036 patients with RA and 6,959 patients without RA.  Finally, the Portuguese and European study results are combined into a meta-analysis.  Based on these data, the investigators identify 3 new loci that are associated with RA based on a significance level threshold of 5 × 10−8:

          European study             Portuguese study           Meta-analysis
          Odds ratio   p-value       Odds ratio   p-value       Odds ratio   p-value
Locus 1   0.53         8.36 × 10−7   0.61         0.013         0.55         3.5 × 10−8
Locus 2   1.16         8.40 × 10−7   1.14         0.019         1.16         4.9 × 10−8
Locus 3   1.63         0.0000015     1.77         0.0074        1.66         4.1 × 10−8

Previous studies had identified 30 loci that accounted for <35% of disease heritability for RA.  Which of the following statements is correct regarding this study?

Explanation:

Genome-wide association studies (GWAS) are typically case-control studies that involve scanning thousands of genetic markers in subjects with and without a disease to identify associations between genetic variants (eg, loci) and that disease.  GWAS often use single-nucleotide polymorphisms (SNPs), with several million SNPs analyzed on 1 microscope slide.

In this case, GWAS are used to analyze hundreds of thousands of loci in Portuguese patients with and without rheumatoid arthritis (RA).  The results are compared to those from a larger European cohort; both studies are then combined in a meta-analysis.  A few loci have SNPs with an allele that is significantly more common among patients with RA than those without RA; these loci are therefore associated with RA.
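As a rough illustration of how combining two cohorts sharpens the evidence, the sketch below pools the Locus 1 results with an inverse-variance weighted (fixed-effect) meta-analysis.  The source does not state which pooling method the investigators used, so the method is an assumption; the standard errors are also back-calculated from the rounded odds ratios and p-values in the table, so the output should only approximate the reported meta-analysis column.  The function names are hypothetical.

```python
from math import erfc, exp, log, sqrt
from statistics import NormalDist


def log_or_and_se(odds_ratio, p_value):
    """Back-calculate the log odds ratio and its standard error
    from an odds ratio and its two-sided p-value (normal approximation)."""
    beta = log(odds_ratio)
    z = -NormalDist().inv_cdf(p_value / 2)  # |z| matching the two-sided p-value
    return beta, abs(beta) / z


def fixed_effect_meta(studies):
    """Pool (odds_ratio, p_value) pairs with inverse-variance weights.
    Assumption: fixed-effect model; not confirmed by the source."""
    estimates = [log_or_and_se(o, p) for o, p in studies]
    weights = [1 / se**2 for _, se in estimates]
    pooled_beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = sqrt(1 / sum(weights))
    z = abs(pooled_beta) / pooled_se
    pooled_p = erfc(z / sqrt(2))  # two-sided normal tail probability
    return exp(pooled_beta), pooled_p


# Locus 1: European study, then Portuguese study (values from the table).
pooled_or, pooled_p = fixed_effect_meta([(0.53, 8.36e-7), (0.61, 0.013)])
print(f"pooled OR = {pooled_or:.2f}, pooled p = {pooled_p:.1e}")
```

Under these assumptions, the pooled p-value is smaller than the p-value from either cohort alone, which is the gain in power that a meta-analysis provides.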

Whenever multiple tests are performed simultaneously (eg, analyzing thousands of loci), the probability of obtaining false-positive results increases.  Using a much smaller p-value threshold is a method to minimize this.  Traditionally, a p-value <0.05 is considered statistically significant.  However, if 100,000 loci with no true association are tested at that 0.05 (ie, 5%, 5 × 10−2) level, then 100,000 × 0.05 = 5,000 false positives would be expected by chance alone.  With a much smaller threshold, this number decreases substantially.  For this reason, a p-value threshold of 5 × 10−8 ("genome-wide p-value"), as seen here, is often chosen in GWAS.
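The arithmetic above can be sketched in a few lines.  This is a minimal illustration that assumes every tested locus is truly unassociated with the disease, so any "significant" result is a false positive.  The Bonferroni calculation reflects the conventional rationale for the 5 × 10−8 threshold (0.05 divided by roughly 1 million independent tests); that rationale is not stated in the text above, and the function names are hypothetical.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of false positives when all n_tests null hypotheses are true."""
    return n_tests * alpha


def bonferroni_threshold(family_alpha, n_tests):
    """Per-test threshold that keeps the overall false-positive rate near family_alpha."""
    return family_alpha / n_tests


# 100,000 loci tested at the traditional 0.05 level.
print(f"{expected_false_positives(100_000, 0.05):,.0f}")

# The same 100,000 loci tested at the genome-wide threshold.
print(f"{expected_false_positives(100_000, 5e-8):.3f}")

# Assumption: about 1 million independent tests across the genome.
print(f"{bonferroni_threshold(0.05, 1_000_000):.0e}")
```

At the genome-wide threshold, far fewer than 1 false positive is expected across all 100,000 tests, compared with about 5,000 at the 0.05 level.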

(Choice A)  A meta-analysis combines results from different studies, thereby increasing sample size, which increases power (eg, to detect statistical differences between groups).

(Choice B)  Logistic regression was used because it analyzes the association between exposures (eg, multiple loci) and binary outcomes (eg, yes = cases with RA; no = controls without RA).

(Choice C)  An odds ratio (OR) <1 does not imply a lack of statistically significant association.  Whereas an OR >1 means that patients with RA have higher odds of carrying the tested allele at a locus than those without RA (ie, risk factor), an OR <1 means that patients with RA have lower odds of carrying it (ie, protective factor).  The association between Locus 1 and RA is statistically significant because its meta-analysis p-value (3.5 × 10−8) is less than the genome-wide p-value (5 × 10−8).

(Choice E)  GWAS help explain disease heritability (ie, effect of genotypic differences on phenotypic differences).  Although this study identified 3 new loci, there is no indication that the full group of 33 loci will explain most RA disease heritability.  GWAS have identified many genes associated with specific conditions, but these genes only seem to explain a small amount of the variance ("missing heritability").
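The reasoning in Choice C, that the direction of an odds ratio and its statistical significance are separate questions, can be checked against the table with a short sketch.  The values are copied from the meta-analysis column; the `interpret` helper and its labels are hypothetical names chosen for illustration.

```python
GENOME_WIDE_ALPHA = 5e-8  # significance threshold stated in the question

# (odds ratio, p-value) from the meta-analysis column of the table.
META_ANALYSIS = {
    "Locus 1": (0.55, 3.5e-8),
    "Locus 2": (1.16, 4.9e-8),
    "Locus 3": (1.66, 4.1e-8),
}


def interpret(odds_ratio, p_value, alpha=GENOME_WIDE_ALPHA):
    """Return (is_significant, direction); the two are judged independently."""
    significant = p_value < alpha
    if odds_ratio > 1:
        direction = "risk"
    elif odds_ratio < 1:
        direction = "protective"
    else:
        direction = "none"
    return significant, direction


for locus, (odds_ratio, p_value) in META_ANALYSIS.items():
    significant, direction = interpret(odds_ratio, p_value)
    print(f"{locus}: OR={odds_ratio}, p={p_value:.1e}, "
          f"significant={significant}, direction={direction}")
```

Applying the same helper to the European results alone (eg, `interpret(0.53, 8.36e-7)`) shows a protective direction that does not reach genome-wide significance, which is why the combined analysis was needed to identify these loci.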

Educational objective:
Genome-wide association studies aim to identify associations between thousands of genetic variants and a disease.  Because of the increased risk of false-positive results when multiple tests are performed simultaneously, a smaller genome-wide p-value is typically used.