- Short Report
- Open Access
A note on the use of the generalized odds ratio in meta-analysis of association studies involving bi- and tri-allelic polymorphisms
- Tiago V Pereira^{1} and
- Regina C Mingroni-Netto^{1}
https://doi.org/10.1186/1756-0500-4-172
© Mingroni-Netto et al; licensee BioMed Central Ltd. 2011
- Received: 2 January 2011
- Accepted: 6 June 2011
- Published: 6 June 2011
Abstract
Background
The generalized odds ratio (GOR) was recently suggested as a genetic model-free measure for association studies. However, its properties were not extensively investigated. We used Monte Carlo simulations to investigate type-I error rates, power and bias in both effect size and between-study variance estimates of meta-analyses using the GOR as a summary effect, and compared these results to those obtained by usual approaches of model specification. We further applied the GOR in a real meta-analysis of three genome-wide association studies in Alzheimer's disease.
Findings
For bi-allelic polymorphisms, the GOR performs virtually identically to a standard multiplicative model of analysis (e.g. per-allele odds ratio) for variants acting multiplicatively, but slightly increases the power to detect variants with a dominant mode of action, while reducing the probability of detecting recessive variants. Although there were differences between the GOR and usual approaches in terms of bias and type-I error rates, both simulation- and real data-based results provided little indication that these differences will be substantial in practice for meta-analyses involving bi-allelic polymorphisms. However, the use of the GOR may be slightly more powerful for the synthesis of data from tri-allelic variants, particularly when susceptibility alleles are less common in the populations (≤10%). This gain in power may depend on knowledge of the direction of the effects.
Conclusions
For the synthesis of data from bi-allelic variants, the GOR may be regarded as a multiplicative-like model of analysis. The use of the GOR may be slightly more powerful in the tri-allelic case, particularly when susceptibility alleles are less common in the populations.
Keywords
- Multiplicative Model
- Summary Effect
- Binary Trait
- High Mutation Load
- Generalized Odds Ratio
Findings
The generalized odds ratio (GOR) was recently suggested as a model-free measure of effect that might overcome the problem of a genetic model misspecification in meta-analyses of association studies [1]. In the context of case-control genetic association studies for a binary trait and under assumption of random sampling, the GOR measures the probability that a case has a higher mutation load (i.e. a larger number of high-risk alleles) than a control divided by the probability that a control has a higher mutation load than a case.
In this note, we highlight advantages and limitations of the use of the GOR as a measure of effect in meta-analyses of bi- and tri-allelic polymorphisms through simulation. Our results are further complemented by a re-analysis of a real meta-analysis of three genome-wide association studies covering >311,000 bi-allelic markers in Alzheimer's disease.
Results
Performance of the GOR in the bi-allelic model
Type-I error rates
Type-I error rates (%) for the bi-allelic case according to different genetic models of analysis and heterogeneity (τ^{2}) for α = 5%. LAT, log-additive trend; Domi, dominant; Rece, recessive.
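No. of studies | Allelic (τ^{2} = 0) | LAT (τ^{2} = 0) | GOR (τ^{2} = 0) | Domi (τ^{2} = 0) | Rece (τ^{2} = 0) | Allelic (τ^{2} = 0.025) | LAT (τ^{2} = 0.025) | GOR (τ^{2} = 0.025) | Domi (τ^{2} = 0.025) | Rece (τ^{2} = 0.025) | Allelic (τ^{2} = 0.05) | LAT (τ^{2} = 0.05) | GOR (τ^{2} = 0.05) | Domi (τ^{2} = 0.05) | Rece (τ^{2} = 0.05) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|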
No. of Studies | Fixed-effects, MAF = 10% | ||||||||||||||
2 | 4.82 | 4.82 | 4.82 | 4.86 | 2.30 | 12.34 | 12.28 | 13.44 | 13.60 | 2.54 | 16.98 | 16.80 | 18.72 | 18.86 | 2.68 |
5 | 5.34 | 5.36 | 5.12 | 5.16 | 2.10 | 11.34 | 11.22 | 12.80 | 12.90 | 2.42 | 17.88 | 17.96 | 19.80 | 20.16 | 2.48 |
7 | 4.80 | 4.80 | 4.94 | 4.78 | 1.76 | 11.46 | 11.30 | 12.78 | 12.88 | 1.98 | 17.94 | 17.78 | 19.66 | 20.08 | 2.72 |
10 | 5.00 | 5.00 | 5.02 | 5.00 | 1.92 | 11.98 | 11.80 | 13.26 | 13.42 | 2.12 | 17.72 | 17.72 | 19.60 | 20.02 | 2.76 |
20 | 4.86 | 4.80 | 5.06 | 5.18 | 1.68 | 11.52 | 11.36 | 12.36 | 12.64 | 2.02 | 17.72 | 17.60 | 19.28 | 19.32 | 2.62 |
30 | 4.96 | 4.92 | 4.96 | 5.02 | 1.94 | 11.94 | 11.78 | 12.84 | 13.06 | 2.04 | 18.52 | 18.40 | 20.18 | 20.40 | 2.38 |
No. of Studies | Random-effects, MAF = 10% | ||||||||||||||
2 | 3.76 | 3.76 | 3.66 | 3.60 | 2.24 | 8.38 | 8.28 | 8.72 | 8.84 | 2.44 | 10.54 | 10.62 | 11.58 | 11.56 | 2.66 |
5 | 4.14 | 4.10 | 3.92 | 3.84 | 1.90 | 6.90 | 6.78 | 7.48 | 7.50 | 2.20 | 8.76 | 8.68 | 9.08 | 9.30 | 2.28 |
7 | 3.70 | 3.70 | 3.74 | 3.76 | 1.66 | 7.02 | 6.92 | 7.14 | 7.22 | 1.88 | 8.34 | 8.34 | 8.66 | 8.68 | 2.48 |
10 | 3.88 | 3.84 | 4.04 | 4.16 | 1.80 | 7.14 | 7.02 | 7.52 | 7.42 | 1.98 | 7.30 | 7.20 | 7.56 | 7.58 | 2.64 |
20 | 4.12 | 4.14 | 4.42 | 4.42 | 1.66 | 6.06 | 6.04 | 6.32 | 6.36 | 1.92 | 6.86 | 6.88 | 6.88 | 6.78 | 2.42 |
30 | 4.16 | 4.16 | 4.34 | 4.32 | 1.88 | 5.82 | 5.80 | 5.76 | 5.84 | 1.96 | 6.24 | 6.34 | 6.06 | 5.88 | 2.24 |
No. of Studies | Fixed-effects, MAF = 40% | ||||||||||||||
2 | 4.82 | 4.72 | 4.84 | 4.96 | 4.70 | 9.82 | 9.66 | 9.74 | 12.32 | 13.78 | 15.62 | 15.40 | 15.60 | 19.82 | 22.58 |
5 | 4.98 | 4.86 | 4.88 | 4.68 | 4.90 | 10.42 | 10.38 | 10.46 | 12.42 | 14.60 | 15.66 | 15.42 | 15.84 | 20.14 | 22.52 |
7 | 5.22 | 4.98 | 4.82 | 4.56 | 5.08 | 10.34 | 10.18 | 10.62 | 12.92 | 15.04 | 16.26 | 16.16 | 16.26 | 20.44 | 21.58 |
10 | 5.02 | 4.98 | 4.86 | 4.68 | 5.26 | 11.42 | 11.26 | 11.08 | 13.22 | 14.94 | 15.64 | 15.42 | 15.74 | 20.46 | 22.38 |
20 | 5.18 | 5.12 | 5.06 | 4.94 | 5.14 | 10.80 | 10.62 | 10.84 | 13.82 | 14.76 | 16.22 | 15.98 | 16.34 | 20.36 | 23.52 |
30 | 4.60 | 4.60 | 4.66 | 4.62 | 4.56 | 11.34 | 11.28 | 11.44 | 13.88 | 15.22 | 16.56 | 16.48 | 16.88 | 20.26 | 22.64 |
No. of Studies | Random-effects, MAF = 40% | ||||||||||||||
2 | 3.82 | 3.76 | 3.82 | 4.04 | 3.58 | 6.92 | 6.86 | 7.00 | 8.42 | 9.10 | 9.54 | 9.66 | 9.86 | 12.34 | 13.58 |
5 | 3.82 | 3.70 | 3.72 | 3.76 | 3.60 | 6.18 | 6.16 | 6.02 | 6.72 | 7.86 | 8.50 | 8.40 | 8.34 | 9.60 | 9.28 |
7 | 3.94 | 3.80 | 3.58 | 3.46 | 3.96 | 6.08 | 6.02 | 6.30 | 6.78 | 7.68 | 8.02 | 7.98 | 7.92 | 9.20 | 8.46 |
10 | 3.94 | 3.90 | 3.88 | 3.48 | 4.00 | 6.20 | 6.20 | 6.32 | 6.96 | 6.90 | 7.10 | 7.08 | 7.30 | 7.98 | 7.40 |
20 | 4.14 | 4.12 | 4.16 | 4.24 | 4.16 | 5.90 | 5.88 | 6.16 | 6.74 | 6.02 | 6.96 | 6.96 | 7.06 | 6.88 | 6.26 |
30 | 3.90 | 3.86 | 3.90 | 3.92 | 3.74 | 6.74 | 6.66 | 7.04 | 6.72 | 6.16 | 6.88 | 6.98 | 6.84 | 6.92 | 5.32 |
Power
Bias in the estimated statistical heterogeneity (τ^{2})
Bias in the estimated genetic effect size
The GOR provides nearly unbiased summary effects for less common variants (MAF = 10%) acting dominantly, regardless of the meta-analytical model and τ^{2}. Conversely, when the variant follows a multiplicative model of action and is common (MAF = 40%), GOR-based meta-analyses overestimate the true underlying increase in the effect size per additional copy of the risk allele (on average 20%) [Additional file 1: Supplementary tables S1-S2].
Performance of the GOR in the tri-allelic model
Type-I error rates
The performance of each model of analysis depends on the underlying between-study variability, allele frequencies and meta-analytical model, but type-I error rates for LAT- and GOR-based meta-analyses are comparable, whereas false discoveries tend to be higher for the per-allele approach when statistical heterogeneity is present (i.e. τ^{2} >0). However, the extent of these differences is smaller in random-effects calculations [Additional file 1: Supplementary tables S3-S4].
Power: two alleles acting in the same direction
Power: two alleles acting in opposite directions
When prior evidence on the direction of the effects of the susceptibility alleles is available, similar power is achieved with the per-allele, LAT and GOR approaches, regardless of the meta-analytical model, f and statistical heterogeneity [Additional file 1: Supplementary tables S5-S7].
Power: when only one allele displays a significant effect
Power is comparable among the GOR, LAT and per-allele odds ratio when only one allele displays a significant effect. This is especially true when the high-risk allele is less common in the populations (f = 10%), particularly when f(A_{2}) = f(A_{3}) = 10%. Overall, for common variants acting multiplicatively, the best performance is achieved with both GOR and LAT. When the risk allele is either recessive or dominant and is common, the best approach may depend on the frequency of the remaining alleles, but power is comparable among the three tested approaches whenever f(A_{2}) ≅ f(A_{3}) [Additional file 1: Supplementary tables S8-S10].
Real application
Summary results according to different models of analysis for the seven strongest association signals obtained by a meta-analysis of three independent genome-wide association studies in Alzheimer's disease (TGen data sets, Reiman et al. 2007)
| SNP | Chr | Gene | MAF^{a} | Model of analysis | OR (95% CI), fixed | OR (95% CI), random | P-value, fixed | P-value, random | P(Q) | I^{2} |
|---|---|---|---|---|---|---|---|---|---|---|
| rs41377151 | 19q13.32 | APOC1 (3' region) | 0.30 | Allelic | 3.00 (2.50-3.59) | 3.15 (2.20-4.53) | 2.14 × 10^{-32} | 4.65 × 10^{-10} | 0.05 | 67 |
| | | | | LAT | 2.94 (2.43-3.56) | 3.06 (2.12-4.43) | 1.62 × 10^{-28} | 2.79 × 10^{-9} | 0.05 | 66 |
| | | | | GOR | 3.44 (2.79-4.25) | 3.64 (2.38-5.57) | 1.73 × 10^{-30} | 2.73 × 10^{-9} | 0.04 | 68 |
| | | | | Domi | 3.51 (2.81-4.40) | 3.78 (2.35-6.08) | 3.33 × 10^{-28} | 4.19 × 10^{-8} | 0.04 | 70 |
| | | | | Rece | 5.51 (3.32-9.14) | 5.51 (3.32-9.14) | 4.31 × 10^{-11} | 4.31 × 10^{-11} | 0.59 | 0 |
| rs17330779^{b} | 7q31 | NRCAM (intron) | 0.10 | Allelic | 0.53 (0.41-0.69) | 0.53 (0.41-0.69) | 1.61 × 10^{-6} | 1.61 × 10^{-6} | 0.42 | 0 |
| | | | | LAT | 0.50 (0.38-0.65) | 0.50 (0.38-0.65) | 5.96 × 10^{-7} | 5.96 × 10^{-7} | 0.38 | 0 |
| | | | | GOR | 0.49 (0.37-0.65) | 0.49 (0.37-0.65) | 4.72 × 10^{-7} | 4.72 × 10^{-7} | 0.42 | 0 |
| | | | | Domi | 0.49 (0.37-0.65) | 0.49 (0.37-0.65) | 5.01 × 10^{-7} | 5.01 × 10^{-7} | 0.44 | 0 |
| | | | | Rece | - | - | - | - | - | - |
| rs10824310^{b} | 10q11.23 | PRKG1 (intron) | 0.06 | Allelic | 0.47 (0.35-0.64) | 0.47 (0.35-0.64) | 2.11 × 10^{-6} | 2.11 × 10^{-6} | 0.63 | 0 |
| | | | | LAT | 0.45 (0.33-0.62) | 0.45 (0.33-0.62) | 1.41 × 10^{-6} | 1.41 × 10^{-6} | 0.56 | 0 |
| | | | | GOR | 0.44 (0.31-0.61) | 0.44 (0.31-0.61) | 7.37 × 10^{-7} | 7.37 × 10^{-7} | 0.67 | 0 |
| | | | | Domi | 0.43 (0.31-0.60) | 0.43 (0.31-0.60) | 7.04 × 10^{-7} | 7.04 × 10^{-7} | 0.68 | 0 |
| | | | | Rece | - | - | - | - | - | - |
| rs12162084 | 16 | Unknown | 0.16 | Allelic | 0.61 (0.50-0.75) | 0.61 (0.50-0.75) | 2.28 × 10^{-6} | 2.28 × 10^{-6} | 1.00 | 0 |
| | | | | LAT | 0.59 (0.48-0.73) | 0.59 (0.48-0.73) | 1.56 × 10^{-6} | 1.56 × 10^{-6} | 0.99 | 0 |
| | | | | GOR | 0.56 (0.45-0.71) | 0.56 (0.45-0.71) | 9.50 × 10^{-7} | 9.50 × 10^{-7} | 0.94 | 0 |
| | | | | Domi | 0.56 (0.44-0.70) | 0.56 (0.44-0.70) | 1.15 × 10^{-6} | 1.15 × 10^{-6} | 0.84 | 0 |
| | | | | Rece | 0.55 (0.26-1.15) | 0.52 (0.17-1.60) | 1.13 × 10^{-1} | 2.55 × 10^{-1} | 0.22 | 34 |
| rs7077757 | 10q25.2 | RBM20 (intron) | 0.21 | Allelic | 0.64 (0.53-0.77) | 0.64 (0.53-0.77) | 2.45 × 10^{-6} | 2.45 × 10^{-6} | 0.92 | 0 |
| | | | | LAT | 0.62 (0.51-0.76) | 0.62 (0.51-0.76) | 2.10 × 10^{-6} | 2.10 × 10^{-6} | 0.99 | 0 |
| | | | | GOR | 0.59 (0.47-0.73) | 0.59 (0.47-0.73) | 1.34 × 10^{-6} | 1.34 × 10^{-6} | 0.99 | 0 |
| | | | | Domi | 0.57 (0.46-0.72) | 0.57 (0.46-0.72) | 1.85 × 10^{-6} | 1.85 × 10^{-6} | 0.94 | 0 |
| | | | | Rece | 0.58 (0.34-1.00) | 0.57 (0.30-1.10) | 4.81 × 10^{-2} | 9.25 × 10^{-2} | 0.29 | 20 |
| rs10747758 | 12q14.2 | OR6U2P | 0.37 | Allelic | 0.69 (0.58-0.81) | 0.69 (0.58-0.81) | 3.97 × 10^{-6} | 3.97 × 10^{-6} | 0.50 | 0 |
| | | | | LAT | 0.68 (0.57-0.80) | 0.68 (0.57-0.80) | 4.01 × 10^{-6} | 4.01 × 10^{-6} | 0.53 | 0 |
| | | | | GOR | 0.62 (0.51-0.76) | 0.62 (0.51-0.76) | 2.04 × 10^{-6} | 2.04 × 10^{-6} | 0.52 | 0 |
| | | | | Domi | 0.58 (0.46-0.73) | 0.58 (0.46-0.73) | 3.40 × 10^{-6} | 3.40 × 10^{-6} | 0.67 | 0 |
| | | | | Rece | 0.66 (0.48-0.90) | 0.66 (0.48-0.90) | 9.70 × 10^{-3} | 9.70 × 10^{-3} | 0.54 | 0 |
| rs2517509^{b} | 6p21.33 | HCG22 | 0.03 | Allelic | 3.22 (1.94-5.34) | 3.22 (1.94-5.34) | 5.77 × 10^{-6} | 5.77 × 10^{-6} | 0.45 | 0 |
| | | | | LAT | 3.27 (1.96-5.46) | 3.30 (1.93-5.63) | 5.85 × 10^{-6} | 1.27 × 10^{-5} | 0.35 | 4.7 |
| | | | | GOR | 3.31 (1.97-5.57) | 3.33 (1.95-5.70) | 6.58 × 10^{-6} | 1.14 × 10^{-5} | 0.36 | 3.3 |
| | | | | Domi | 3.31 (1.96-5.58) | 3.32 (1.94-5.69) | 7.53 × 10^{-6} | 1.19 × 10^{-5} | 0.36 | 2.8 |
| | | | | Rece | - | - | - | - | - | - |
Discussion
The GOR was suggested as a model-free approach for the synthesis of genetic association studies. The rationale is that the GOR provides more flexibility for the true underlying genetic effect to describe the difference between two cumulative distribution functions of the latent variables, particularly when the assumption of proportional odds is violated. An additional advantage is that this ordinal measure of association is easily interpretable in practice [1].
Recent meta-analyses have applied the GOR claiming that this might be considered a different genetic model or an independent approach compared to the specification of traditional genetic models of analysis [7, 8]. However, here we show that, since the GOR inherently assumes an ordinal mutation load (e.g. 1, 2 and 3 for genotypes A_{1}A_{1}, A_{1}A_{2}, and A_{2}A_{2}, respectively), this measure of association performs like a multiplicative model of analysis for bi-allelic polymorphisms. For bi-allelic variants, our simulations show that GOR-based results are highly correlated with those obtained by both LAT and per-allele ORs, resulting in similar type-I error rates and power compared to these traditional multiplicative models of analysis. In addition, a real meta-analysis of three GWAS in Alzheimer's disease indicates that the practical differences among these approaches are limited. For example, under a fixed-effects framework and assuming a threshold of P < 10^{-5} (probably realistic given the small sample sizes available), the total number of markers considered promising for further replication [9] would be 10, 13, 13, 14 and two for the per-allele, LAT, GOR, dominant and recessive approaches, respectively. Under a random-effects model, the corresponding numbers would be two for the recessive model and eight for the remaining approaches.
Nonetheless, other important considerations in meta-analyses of genetic association studies involving bi-allelic polymorphisms are biases in the estimated effect size [10] and heterogeneity [11]. In this respect, the main drawback of using the GOR as a measure of association in practice is that it provides inflated effects for bi-allelic variants following a multiplicative model of action. Although this inflation may be only mild for less common markers (i.e. median bias of ~5% for variants with MAF = 10%), the average upward bias in the observed genetic effect increases with MAF, reaching up to 20% for MAFs around 40%.
On the other hand, our data showed that the use of the GOR may be advantageous in meta-analyses involving tri-allelic polymorphisms as long as genotypes can be correctly ordered in terms of mutation load. In fact, a reasonable gain in power, in the order of 2 to 15%, may be achieved for the detection of association signals from variants with small frequencies (e.g. f ~10%) compared to the use of per-allele or LAT odds ratios. The observation that higher power might be obtained with the GOR in scenarios with a larger number of alleles of low frequency may serve as hypothesis-generating information to extend the use of the GOR to meta-analyses of different types of genetic variants. For example, a special case might be the use of the GOR in meta-analyses of structural variants such as copy-number variations (CNVs), which tend to exhibit a substantial number of alleles, yielding a correspondingly large number of possible genotype categories [12]. Since the GOR handles categories with zero counts [13], and a different number of genotypes may be considered per study (for instance, in the case of specific allele sizes in some populations), the properties of the GOR in meta-analyses of CNVs are a topic worthy of further investigation.
In summary, although there are differences in the statistical properties among the investigated approaches for bi-allelic variants, the absolute magnitude of these differences may actually be small and is likely to be of very limited practical significance. An exception might be the use of the GOR in meta-analyses involving tri-allelic polymorphisms with less common alleles, since the GOR uses the complete genotypic distribution (e.g. the GOR is less affected by zero cells). For these scenarios, the use of the GOR as a measure of effect may be slightly more powerful than traditional measures. However, the performance of GOR-based meta-analyses will depend on some knowledge about the direction of the effects when there are two alleles modulating the risk of disease in opposite directions.
Material and methods
Simulation procedures and scenarios
We simulated meta-analyses of association studies using approaches that rely on multinomial distributions described in detail elsewhere (autosomal markers) [9, 10]. Hardy-Weinberg equilibrium is assumed to hold for the whole population, whereas the susceptibility alleles are considered the causal variants or surrogate markers in tight linkage disequilibrium (r^{2} = 1.0). For the bi-allelic case, we simulated data assuming the susceptibility variant A_{2} (minor allele) and non-risk allele A_{1}.
Under a three-allele model, we denote A_{1}, A_{2} and A_{3} as the possible alleles with frequencies f(A_{1}), f(A_{2}) and f(A_{3}), respectively, yielding six possible distinct genotypes (A_{1}A_{1}, A_{1}A_{2}, A_{1}A_{3}, A_{2}A_{2}, A_{2}A_{3} and A_{3}A_{3}).
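As a minimal illustration of the genotype categories described above, the following Python sketch enumerates the unordered genotypes for a set of alleles and their expected population frequencies under Hardy-Weinberg equilibrium. The function name and the example allele frequencies are hypothetical; this is not the simulation code used in the study, which additionally involves multinomial sampling of cases and controls.

```python
from itertools import combinations_with_replacement


def hwe_genotype_frequencies(allele_freqs):
    """Expected genotype frequencies under Hardy-Weinberg equilibrium.

    allele_freqs maps allele labels to frequencies that sum to 1.
    Returns a dict keyed by unordered genotype (a, b) with a <= b.
    """
    total = sum(allele_freqs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    genotype_freqs = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        p, q = allele_freqs[a], allele_freqs[b]
        # Homozygotes have frequency p^2, heterozygotes 2pq.
        genotype_freqs[(a, b)] = p * p if a == b else 2.0 * p * q
    return genotype_freqs


# Illustrative allele frequencies only (an assumption for this example).
example = hwe_genotype_frequencies({"A1": 0.8, "A2": 0.1, "A3": 0.1})
for genotype, freq in example.items():
    print(genotype, round(freq, 4))
```

With three alleles the function returns six genotypes, matching the list in the text, and the frequencies sum to one.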
Parameters and simulated scenarios (trait prevalence = 10%).
Parameter | Values considered |
---|---|
True underlying genetic model | Null; dominant; multiplicative; recessive; over-dominant |
OR | 1.0 (null); 1.3 (modest effect) |
f | 0.10; 0.33; 0.40 |
τ^{2} | 0 (homogeneity); 0.025 (mild heterogeneity); 0.05 (strong heterogeneity) |
N per study | 500-1000 |
Genetic model of analysis | GOR; LAT; per-allele; dominant; recessive |
For the tri-allelic case, three possible scenarios were considered: (i) two of the alleles were susceptibility variants (e.g. both increase the susceptibility for the trait with the same magnitude), (ii) two alleles were associated with the trait, but in opposite directions (i.e. one increases, while the other decreases the risk for the trait by a similar magnitude) and (iii) only one of the three alleles displays a significant effect on the trait. We further assumed that the mechanism of action is similar for both alleles when there are two alleles with genuine effects on the trait (e.g. both act multiplicatively, or both act dominantly, and so forth). For scenarios with two alleles modulating the risk of disease, two additional situations of practical interest were investigated: (ii-a) the two alleles are associated with the susceptibility of disease in opposite directions and investigators have no prior evidence on the direction of these effects (e.g. initial agnostic screenings) and (ii-b) the two alleles are associated with the susceptibility of disease in opposite directions, but investigators possess prior evidence on the direction of the effects (e.g. meta-analyses from the literature). For consistency, allele A_{2} is designated the protective variant, whereas allele A_{3} is the susceptibility one in these scenarios.
Bi-allelic polymorphisms
Assessment of bias
The percentage bias was computed as $100 \times (\hat{\mu} - \mu)/\mu$ and $100 \times (\hat{\tau}^{2} - \tau^{2})/\tau^{2}$ for genetic effect sizes and between-study variance, respectively, where $\hat{\mu}$ is the (average) observed summary effect, μ is the true average genetic effect across population-specific genetic effects, τ^{2} is the true between-study variance and $\hat{\tau}^{2}$ is the method-of-moments-based estimate of τ^{2}. Both $\hat{\mu}$ and μ are captured as the natural logarithm of the odds ratio (Table 3). Use of alternative bias estimators (e.g. mean squared error) yielded qualitatively analogous results (data not shown).
Tri-allelic polymorphisms
Meta-analyses involving three-allele polymorphisms may rely on a diversity of approaches to summarize effects across studies. However, because the assumption of multiplicative effects yields, on average, the lowest rates of false-positive results in bi-allelic markers [9, 10], we compared the GOR to two approaches that assume a multiplicative mode of action: the per-allele OR, which yields three correlated odds ratios (OR[A_{3} vs A_{1}], OR[A_{3} vs A_{2}] and OR[A_{2} vs A_{1}]), and the log-additive trend approach.
The generalized odds ratio
For a binary trait (e.g. case-control studies), GOR measures the probability that a randomly sampled case has a genotype with a higher mutation load (i.e. a larger number of high-risk alleles) than a randomly sampled control divided by the probability that a randomly sampled case has a genotype with lower mutation load than a randomly sampled control [1].
$$\mathrm{GOR} = \frac{\sum_{j=2}^{J}\sum_{k=1}^{j-1} \pi_{1j}\,\pi_{0k}}{\sum_{j=2}^{J}\sum_{k=1}^{j-1} \pi_{0j}\,\pi_{1k}}$$
where J is the total number of genotypes (categories) given the number of alleles, i.e. J = m(m+1)/2, m is the number of alleles, and π_{ij} is the proportion of the subjects with genotype j (for j = 1,...,J, in which the higher the value of j, the higher the mutation load) in group i (i = 0 or 1 for controls and cases, respectively). In the present investigation, the large-sample variance for the GOR was computed from the asymptotic standard error of the Goodman-Kruskal γ [1]. Stata and R code to compute the GOR and its large-sample variance is available from the first author upon request.
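The verbal definition of the GOR given above can be sketched in a few lines of Python. This is an illustrative point-estimate calculation only, not the authors' Stata or R code: the function name and the example genotype counts are hypothetical, and the large-sample variance is not implemented.

```python
def generalized_odds_ratio(case_counts, control_counts):
    """Point estimate of the GOR from genotype counts.

    Both lists must be ordered by increasing mutation load
    (index 0 is the genotype with the lowest load).
    """
    if len(case_counts) != len(control_counts):
        raise ValueError("cases and controls need the same genotype categories")
    n_cases = sum(case_counts)
    n_controls = sum(control_counts)
    case_higher = 0.0  # P(random case has higher load than random control)
    case_lower = 0.0   # P(random case has lower load than random control)
    for j, n_case in enumerate(case_counts):
        for k, n_control in enumerate(control_counts):
            pair_prob = (n_case / n_cases) * (n_control / n_controls)
            if j > k:
                case_higher += pair_prob
            elif j < k:
                case_lower += pair_prob
    if case_lower == 0.0:
        raise ZeroDivisionError("GOR is undefined: no case-lower pairs")
    return case_higher / case_lower


# Hypothetical bi-allelic counts for genotypes A1A1, A1A2, A2A2.
cases = [30, 50, 20]
controls = [50, 40, 10]
gor = generalized_odds_ratio(cases, controls)
print(round(gor, 2))  # approximately 2.15
```

Ties (case and control with the same genotype) contribute to neither probability. Since the Goodman-Kruskal γ equals the difference of these two probabilities divided by their sum, the estimate satisfies GOR = (1 + γ)/(1 − γ).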
Mutational load order
The order of the j th genotypic category (i.e. mutational load) for the GOR and log-additive trend is anticipated to impact statistical power. Hence, for the situation ii-a (initial agnostic screenings), we set as genotypic order and for situation ii-b (meta-analyses from the literature with prior information on the direction of effects).
Assessment of power and type-I error
Empirical power and type-I error rates (i.e. false-positive discoveries) were computed as the proportion of simulations that gave a two-sided P-value < 5%. Because there are three correlated OR estimates for the tri-allelic case under the per-allele model, we corrected the α level using the Dunn-Šidák procedure. Specifically, power and type-I error rates for the per-allele model (tri-allelic case) were computed as the proportion of the simulations that gave one or more P-values < α_{corrected}, where α_{corrected} = 1 − (1 − α)^{1/3} ≈ 0.017 for α = 0.05.
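The Dunn-Šidák threshold for the three per-allele comparisons can be checked with a short Python sketch. The function names and the example P-values are hypothetical; the formula is the standard Šidák correction for a given number of tests.

```python
def sidak_alpha(alpha, n_tests):
    """Per-test significance level under the Dunn-Sidak correction."""
    return 1.0 - (1.0 - alpha) ** (1.0 / n_tests)


def any_significant(p_values, alpha=0.05):
    """True if at least one P-value falls below the corrected level."""
    threshold = sidak_alpha(alpha, len(p_values))
    return any(p < threshold for p in p_values)


alpha_corrected = sidak_alpha(0.05, 3)
print(round(alpha_corrected, 4))

# Hypothetical P-values for OR[A3 vs A1], OR[A3 vs A2] and OR[A2 vs A1].
print(any_significant([0.030, 0.200, 0.012]))
```

A simulated meta-analysis would then count as a discovery for the per-allele model when `any_significant` returns true for its three P-values.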
Real application
We compared results based on the GOR as a summary effect to those obtained by usual approaches of model specification in a real meta-analysis of three independent genome-wide studies in late-onset Alzheimer's disease. After standard quality control measures, a total of 311,915 bi-allelic polymorphisms were scored in 1411 participants (861 cases and 550 controls). Detailed descriptions of the samples, genotyping platforms and diagnostic criteria are available elsewhere [4]. Results from individual studies were corrected for residual inflation of the test statistic using genomic control methods [14].
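The text does not specify which variant of genomic control was applied, so the following Python sketch shows only one common form, given here as an assumption: the inflation factor λ is estimated as the median of the observed 1-df chi-square statistics divided by the median of the chi-square distribution with one degree of freedom (about 0.4549), and every statistic is divided by λ. The function name, the example statistics and the convention of not letting λ fall below 1 are all illustrative choices.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(chi2_stats):
    """Return (lambda, corrected statistics) for 1-df chi-square tests."""
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    # Assumed convention: never inflate statistics when lambda < 1.
    lam = max(lam, 1.0)
    return lam, [stat / lam for stat in chi2_stats]


# Hypothetical test statistics from one study.
observed = [0.2, 0.9098728, 3.0]
lam, corrected = genomic_control(observed)
print(round(lam, 3), [round(stat, 4) for stat in corrected])
```

In a genome-wide setting the median would be taken over all markers in a study before that study's results enter the meta-analysis.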
Meta-analysis methods
Meta-analyses were carried out under both fixed- and random-effects models, represented by the general inverse-variance and DerSimonian-Laird methods, respectively [15, 16]. For the real application, statistical heterogeneity was tested using Cochran's Q test [11] and quantified using the I^{2} index [17].
All simulations were performed in the Stata 11.1 package (Stata Corporation), whereas the meta-analyses of real data sets were carried out in PLINK [18].
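For readers unfamiliar with these estimators, the Python sketch below pools study-level log odds ratios with inverse-variance weights, computes Cochran's Q and I^{2}, and derives the DerSimonian-Laird method-of-moments estimate of τ^{2}. It is a textbook-style illustration, not the Stata or PLINK routines used in this work; the function name, the returned keys and the example inputs are hypothetical.

```python
from math import sqrt


def pool_log_odds_ratios(effects, variances):
    """Fixed-effect and DerSimonian-Laird random-effects pooling."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # Method-of-moments between-study variance, truncated at zero.
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I^2: percentage of variability beyond chance, truncated at zero.
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0

    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    random = sum(w * y for w, y in zip(re_weights, effects)) / sum_re

    return {
        "fixed": fixed, "fixed_se": sqrt(1.0 / sum_w),
        "random": random, "random_se": sqrt(1.0 / sum_re),
        "Q": q, "tau2": tau2, "I2": i2,
    }


# Hypothetical log odds ratios and variances from two studies.
result = pool_log_odds_ratios([0.0, 0.4], [0.04, 0.04])
print({key: round(value, 4) for key, value in result.items()})
```

When the estimated τ^{2} is zero, the random-effects weights reduce to the fixed-effect weights and the two summaries coincide, which matches the rows of the real-data table where I^{2} = 0.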
Declarations
Acknowledgements
TVP is funded by grants from the Fundação de Amparo à Pesquisa do Estado de São Paulo (State of São Paulo Research Foundation, FAPESP). The authors are deeply indebted to the two anonymous reviewers for their extensive and valuable comments on the manuscript.
Authors’ Affiliations
References
- Zintzaras E: The generalized odds ratio as a measure of genetic risk effect in the analysis and meta-analysis of association studies. Stat Appl Genet Mol Biol. 2010, 9: Article 21.
- Lee JH, Cheng R, Graff-Radford N, Foroud T, Mayeux R: Analyses of the National Institute on Aging Late-Onset Alzheimer's Disease Family Study: implication of additional loci. Arch Neurol. 2008, 65: 1518-1526. 10.1001/archneur.65.11.1518.
- Nyholt DR, Yu CE, Visscher PM: On Jim Watson's APOE status: genetic information is hard to hide. Eur J Hum Genet. 2009, 17: 147-149. 10.1038/ejhg.2008.198.
- Reiman EM, Webster JA, Myers AJ, Hardy J, Dunckley T, Zismann VL, Joshipura KD, Pearson JV, Hu-Lince D, Huentelman MJ: GAB2 alleles modify Alzheimer's risk in APOE epsilon4 carriers. Neuron. 2007, 54: 713-720. 10.1016/j.neuron.2007.05.022.
- Hu X, Pickering E, Liu YC, Hall S, Fournier H, Katz E, Dechairo B, John S, Van EP, Soares H: Meta-Analysis for Genome-Wide Association Study Identifies Multiple Variants at the BIN1 Locus Associated with Late-Onset Alzheimer's Disease. PLoS One. 2011, 6: e16616. 10.1371/journal.pone.0016616.
- Shi H, Medway C, Bullock J, Brown K, Kalsheker N, Morgan K: Analysis of Genome-Wide Association Study (GWAS) data looking for replicating signals in Alzheimer's disease (AD). Int J Mol Epidemiol Genet. 2010, 1: 53-66.
- Wang JL, Wang HG, Gao HQ, Zhai GX, Chang P, Chen YG: Endothelial nitric oxide synthase polymorphisms and erectile dysfunction: a meta-analysis. J Sex Med. 2010, 7: 3889-3898. 10.1111/j.1743-6109.2010.01968.x.
- Zintzaras E: Is catechol-O-methyl transferase 472G/A gene polymorphism a marker associated with alcoholism?. Psychiatr Genet. 2011, 21: 29-36. 10.1097/YPG.0b013e3283413615.
- Pereira TV, Patsopoulos NA, Pereira AC, Krieger JE: Strategies for genetic model specification in the screening of genome-wide meta-analysis signals for further replication. Int J Epidemiol. 2011, 40: 457-469. 10.1093/ije/dyq203.
- Pereira TV, Patsopoulos NA, Salanti G, Ioannidis JP: Discovery properties of genome-wide association signals from cumulatively combined data sets. Am J Epidemiol. 2009, 170: 1197-1206. 10.1093/aje/kwp262.
- Pereira TV, Patsopoulos NA, Salanti G, Ioannidis JP: Critical interpretation of Cochran's Q test depends on power and prior assumptions about heterogeneity. Research Synthesis Methods. 2010, 1: 149-161. 10.1002/jrsm.13.
- McCarroll SA: Extending genome-wide association studies to copy-number variation. Hum Mol Genet. 2008, 17: R135-R142. 10.1093/hmg/ddn282.
- Agresti A: Generalized odds ratios for ordinal data. Biometrics. 1980, 36: 59-67. 10.2307/2530495.
- Bacanu SA, Devlin B, Roeder K: The power of genomic control. Am J Hum Genet. 2000, 66: 1933-1944. 10.1086/302929.
- Borenstein M, Hedges L, Higgins JPT, Rothstein HR: A basic introduction to fixed-effect and random-effects models for meta-analysis. Research Synthesis Methods. 2010, 1: 97-111. 10.1002/jrsm.12.
- DerSimonian R, Laird N: Meta-analysis in clinical trials. Control Clin Trials. 1986, 7: 177-188. 10.1016/0197-2456(86)90046-2.
- Higgins JP, Thompson SG: Quantifying heterogeneity in a meta-analysis. Stat Med. 2002, 21: 1539-1558. 10.1002/sim.1186.
- Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ: PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007, 81: 559-575. 10.1086/519795.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.