Statistical Analyses and Power for Primary Aim: Detection Stage
The primary aim of this study is to detect variants of specific genes that predict the development of PTSD following trauma. We hypothesize inherited vulnerability to PTSD is mediated by genetic variation in three specific neurobiological systems (HPA axis, locus coeruleus/noradrenergic system, limbic-frontal neuro-circuitry of fear) whose alterations are implicated in PTSD etiology.
Basic association analysis will be performed using the PLINK[103] and Haploview[104] software packages. The initial step of analysis is to perform a rigorous quality control procedure: high missing genotype rates (both per individual and per SNP) and deviations from Hardy-Weinberg genotype proportions are indicative of problems. Individuals and/or SNPs will be removed as needed, as will very rare and monomorphic SNPs. The basic association test assumes an additive effect of SNP genotype (for dichotomous traits, additivity is on the log-odds scale) and is a regression of the phenotype on genotype. It is also possible to assume dominant and recessive gene action, and to perform likelihood ratio tests comparing these three models to a more general 2 degree of freedom model. We propose to take account of potential covariates (such as subpopulation membership in the case of population stratification) either by directly incorporating covariates in the model, or by use of a permutation-framework (i.e. permuting phenotype labels only within subpopulations).
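As a minimal illustration of the quality control step, the following Python sketch computes a per-SNP missing genotype rate and a 1 df chi-squared test of Hardy-Weinberg proportions. It is not the PLINK implementation: the function names, the genotype coding (0, 1 or 2 copies of the counted allele, None for a missing call) and the example counts are assumptions made for illustration, and exclusion thresholds would be chosen by the analyst.

```python
import math

def missing_rate(genotypes):
    """Fraction of missing calls (None) for one SNP."""
    return sum(g is None for g in genotypes) / len(genotypes)

def hwe_chi2(genotypes):
    """1 df chi-squared test of Hardy-Weinberg proportions for a biallelic SNP.

    Returns (chi2, p_value); a monomorphic SNP returns (0.0, 1.0).
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    obs = [called.count(k) for k in (0, 1, 2)]
    p = (obs[1] + 2 * obs[2]) / (2 * n)        # frequency of the counted allele
    if p in (0.0, 1.0):
        return 0.0, 1.0
    q = 1 - p
    exp = [n * q * q, 2 * n * p * q, n * p * p]  # expected counts under HWE
    chi2 = sum((o - e) ** 2 / e for o, e in zip(obs, exp))
    # Upper tail of a 1 df chi-squared: P(X > x) = erfc(sqrt(x / 2))
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# Hypothetical SNP: 360 / 480 / 160 genotype counts (allele frequency 0.4)
# plus 10 missing calls.
snp = [0] * 360 + [1] * 480 + [2] * 160 + [None] * 10
chi2, p_value = hwe_chi2(snp)
print(missing_rate(snp), chi2, p_value)
```

The example counts sit exactly at Hardy-Weinberg proportions, so the statistic is essentially zero; a SNP with a marked heterozygote deficit would give a large statistic and a small p-value.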
Information across multiple SNPs within a gene can be combined in two ways: via haplotype-based tests and gene-based tests. The PLINK package uses a weighted likelihood mixture of regressions model to account for the potential ambiguity in statistically inferred haplotypes, following the model of Schaid et al.[105] The posterior probabilities for each particular pair of haplotypes for each individual are calculated via the E-M algorithm; these posterior probabilities (of haplotype pair conditional on multilocus SNP genotype data) are used to weight the haplotype-PTSD association analysis. For H haplotypes, either an (H - 1) df omnibus test (looking for a joint effect of all haplotypes) or a series of H haplotype-specific tests, each with 1 df, can be conducted. The tests are likelihood ratio tests; both asymptotic and empirical significance values are available, as well as confidence intervals on parameter estimates. Based on the LD profile of each gene, haplotypes will either be formed across the entire gene, or restricted to regions of high LD/low haplotype diversity, e.g. using a haplotype block definition rule as implemented in the Haploview package[104]. In contrast, for S SNPs, a gene-based analysis simply considers the S cumulative sums of rank-ordered single SNP association chi-squared statistics (S_1, S_1 + S_2, S_1 + S_2 + S_3, ...) and evaluates significance via permutation, which also corrects for having tested S different ranked sum scores. This method is a gene-based implementation of Ott & Hoh's[106] method that utilizes sum-statistics. A gene-based test might potentially be more sensitive to genes with multiple, less common variants having individually small effects on the phenotype.
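The gene-based sum-statistic idea can be sketched as follows. This is an illustrative reading of the procedure described above, not the PLINK code: the single-SNP statistic is assumed to be a 1 df allelic chi-squared, the correction for the S ranked sums is implemented as a min-p step evaluated on the same permutations, and all function names and the simulated data are hypothetical.

```python
import random

def allelic_chi2(genos, is_case):
    """1 df allelic chi-squared for one SNP (genotypes coded 0/1/2)."""
    a = b = c = d = 0
    for g, case in zip(genos, is_case):
        if case:
            a, b = a + g, b + 2 - g      # minor / major alleles in cases
        else:
            c, d = c + g, d + 2 - g      # minor / major alleles in controls
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def cumulative_sums(snps, is_case):
    """Cumulative sums of the single-SNP statistics, largest statistic first."""
    stats = sorted((allelic_chi2(g, is_case) for g in snps), reverse=True)
    sums, total = [], 0.0
    for s in stats:
        total += s
        sums.append(total)
    return sums

def gene_based_p(snps, is_case, n_perm=200, seed=1):
    """Permutation p-value for a gene, corrected for testing S ranked sums."""
    rng = random.Random(seed)
    observed = cumulative_sums(snps, is_case)
    labels = list(is_case)
    perm_sums = []
    for _ in range(n_perm):
        rng.shuffle(labels)               # permute phenotype labels only
        perm_sums.append(cumulative_sums(snps, labels))
    all_sums = [observed] + perm_sums

    def min_p(row):
        # Smallest empirical p-value over the S sums for one data set
        return min(
            sum(other[k] >= row[k] for other in all_sums) / len(all_sums)
            for k in range(len(row))
        )

    obs_min = min_p(observed)
    # How often does a permuted data set reach a min-p at least this small?
    hits = sum(min_p(row) <= obs_min for row in perm_sums)
    return (1 + hits) / (n_perm + 1)

# Simulated null data: 60 cases, 60 controls, 4 SNPs
rng = random.Random(42)
is_case = [True] * 60 + [False] * 60
snps = [[rng.choice((0, 0, 1, 1, 2)) for _ in is_case] for _ in range(4)]
gene_p = gene_based_p(snps, is_case)
print(gene_p)
```

Because phenotype labels are permuted while genotypes are left intact, the LD between SNPs within the gene is preserved in every replicate.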
Table 1 presents power calculations for our study. The sample is well powered to perform a comprehensive screening of common variation in ~30 genes. We used the Genetic Power Calculator[107] online resource to calculate power, assuming either that the causal variant (CV) is directly typed (an upper bound on power) or is in incomplete linkage disequilibrium (LD) (r² = 0.8) with a typed marker (effectively a lower bound, as the tag SNP selection is designed to capture all common variants with an r² of at least 0.8). The calculations below are based on 1000 cases and 1000 controls, assuming a prevalence of PTSD of 13%, a multiplicative true mode of gene action and a 1 df allelic or haplotypic test. The calculations are parameterized in terms of a liability-threshold model: the CV explains either 1% or 2% of variation in a continuous, unobserved, normally distributed liability; the threshold is chosen to correspond to the known population prevalence of PTSD. That is, rather than a table ordered by fixed risk ratio (which would show that rare, low-penetrance alleles are undetectable by any practical study design given the sample-size requirements and that common, higher-penetrance alleles are detected easily), we fix the variance explained in the table to restrict the presentation to the lowest range of effects likely to be achievable, and find that genetic risk factors that explain considerably less than 2% of the variance in liability will indeed be detected by our study. The implied genotypic relative risks (GRR) for having one (het) or two (hom) copies of the risk allele relative to the baseline genotype are shown in Table 1.
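To make the liability-threshold parameterization concrete, the sketch below converts a risk allele frequency, a proportion of liability variance explained and a population prevalence into implied genotypic relative risks. It is not the Genetic Power Calculator itself: it assumes an effect that is additive on the liability scale and Hardy-Weinberg genotype frequencies, and the function name is hypothetical.

```python
from statistics import NormalDist

def genotypic_relative_risks(freq, var_explained, prevalence):
    """Implied GRRs (het, hom) under an additive liability-threshold model.

    freq: risk allele frequency; var_explained: share of liability variance
    due to the variant; prevalence: population prevalence of the disorder.
    """
    std_normal = NormalDist()
    p, q = freq, 1 - freq
    a = (var_explained / (2 * p * q)) ** 0.5       # additive effect per allele
    means = [-2 * p * a, (q - p) * a, 2 * q * a]    # 0, 1, 2 risk alleles (overall mean 0)
    weights = [q * q, 2 * p * q, p * p]             # Hardy-Weinberg frequencies
    resid_sd = (1 - var_explained) ** 0.5           # total liability variance is 1

    def penetrances(threshold):
        return [1 - std_normal.cdf((threshold - m) / resid_sd) for m in means]

    # Bisection on the threshold so that the genotype-weighted penetrance
    # equals the population prevalence.
    lo, hi = -10.0, 10.0
    for _ in range(200):
        mid = (lo + hi) / 2
        k = sum(w * f for w, f in zip(weights, penetrances(mid)))
        if k > prevalence:
            lo = mid      # too many affected: raise the threshold
        else:
            hi = mid
    f0, f1, f2 = penetrances((lo + hi) / 2)
    return f1 / f0, f2 / f0

het, hom = genotypic_relative_risks(freq=0.10, var_explained=0.01, prevalence=0.13)
print(round(het, 2), round(hom, 2))
```

Under these assumptions, the values for a risk allele frequency of 0.10 and 1% of variance explained should land close to the 1.45 and 2.02 reported in the corresponding row of Table 1.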
Table 1
Power calculations for genetic study of 1,000 cases and 1,000 controls
1% of variance in liability
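Risk allele frequency | GRR (het) | GRR (hom) | Gene-wide power (r² = 0.8) | Gene-wide power (CV typed) | Experiment-wide power (r² = 0.8) | Experiment-wide power (CV typed) |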
0.010 | 2.65 | 4.86 | 0.87 | 0.95 | 0.58 | 0.75 |
0.025 | 1.95 | 3.25 | 0.87 | 0.94 | 0.57 | 0.74 |
0.050 | 1.65 | 2.49 | 0.86 | 0.94 | 0.55 | 0.72 |
0.100 | 1.45 | 2.02 | 0.84 | 0.93 | 0.53 | 0.70 |
0.250 | 1.31 | 1.67 | 0.83 | 0.92 | 0.50 | 0.67 |
0.500 | 1.27 | 1.59 | 0.81 | 0.91 | 0.47 | 0.65 |
0.750 | 1.34 | 1.75 | 0.79 | 0.89 | 0.45 | 0.62 |
0.900 | 1.56 | 2.33 | 0.76 | 0.87 | 0.40 | 0.57 |
2% of variance in liability
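Risk allele frequency | GRR (het) | GRR (hom) | Gene-wide power (r² = 0.8) | Gene-wide power (CV typed) | Experiment-wide power (r² = 0.8) | Experiment-wide power (CV typed) |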
0.010 | 3.59 | 6.55 | 1.00 | 1.00 | 0.96 | 0.99 |
0.025 | 2.48 | 4.54 | 1.00 | 1.00 | 0.97 | 0.99 |
0.050 | 1.99 | 3.37 | 1.00 | 1.00 | 0.97 | 0.99 |
0.100 | 1.69 | 2.61 | 1.00 | 1.00 | 0.96 | 0.99 |
0.250 | 1.47 | 2.05 | 0.99 | 1.00 | 0.95 | 0.99 |
0.500 | 1.42 | 1.94 | 0.99 | 1.00 | 0.93 | 0.98 |
0.750 | 1.53 | 2.25 | 0.99 | 1.00 | 0.91 | 0.97 |
0.900 | 1.96 | 3.50 | 0.98 | 1.00 | 0.87 | 0.96 |
To address the issue of multiple testing, power is given for three type I error rates: 0.05, 0.05/12 (4.17e-3) and, based on 30 genes, 0.05/360 (1.39e-4), which correspond to (conservative) control of the family-wise error rate at the level of the SNP (i.e. a single test), the gene and the experiment, respectively. In practice, during analysis we will use a less conservative permutation-based procedure to control for multiple testing: the conservative Bonferroni assumption is used only to facilitate power calculation. The "lower bound" on power is based on the reasonable assumption that tag SNP selection will improve efficiency (i.e. r² ≥ 0.8). We performed a set of coalescent simulations to determine the expected maximum r² that would result from completely random selection of SNPs, to provide an even more stringent lower bound on power. Using the CoSi simulator, we generated 50 kb haplotypes (i.e. corresponding to a typically-sized gene) with SNP frequency and LD profiles similar to those observed in Caucasian samples (assuming uniform mutation and recombination rates). We randomly designated one variant (minor allele frequency, MAF > 0.02) as the "CV" and then selected 12 variants (MAF > 0.02) as the typed SNPs. Across 500 replicates, the average maximum r² between a typed SNP and the (possibly typed but most likely unobserved) CV varied depending on the allele frequency of the CV, from approximately 0.5 for less common SNPs (MAF < 0.1) to 0.7 for more common CVs. Even in the scenario that the CV is rare and the tag SNP selection performs no better than chance, the expected marker density should ensure reasonable to good coverage of common variation. Power is still good under most circumstances: for a 1% CV with MAF of 0.1, the "lower bound" drops from 0.84 (r² = 0.8) to 0.78 for r² = 0.7 (0.58 for r² = 0.5), although experiment-wide power is poor in this case, at 0.43 for r² = 0.7 (0.23 for r² = 0.5).
For CVs explaining 2% of the variation in liability, power at the gene-wide level is still greater than 0.90 in almost all cases; experiment-wide power ranges approximately between 0.80 and 0.90 for r² = 0.7 (0.60 and 0.70 for r² = 0.5). In summary, even under the unlikely assumption that tag SNP selection adds no value whatsoever, and with the conservative correction for all 360 single SNP tests assuming independence, the study is still adequately powered at the experiment-wide level for multiple tests.
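The way power degrades with weaker LD can be approximated with a short calculation. The sketch below rests on two assumptions that are not stated in the text: that the noncentrality parameter of the 1 df test scales linearly with r² between the typed marker and the CV (a standard approximation), and that the 0.84 entry of Table 1 is power at the 0.05/12 threshold with r² = 0.8. Function names are hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()

def power_1df(ncp, alpha):
    """Power of a 1 df chi-squared test with noncentrality parameter ncp."""
    z = STD.inv_cdf(1 - alpha / 2)
    root = ncp ** 0.5
    # A 1 df noncentral chi-squared is the square of a shifted standard normal
    return (1 - STD.cdf(z - root)) + STD.cdf(-z - root)

def ncp_for_power(power, alpha):
    """Noncentrality parameter giving the requested power (bisection)."""
    lo, hi = 0.0, 200.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if power_1df(mid, alpha) < power:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

gene_alpha = 0.05 / 12           # 12 SNPs per gene
experiment_alpha = 0.05 / 360    # 12 SNPs x 30 genes

# Back out the noncentrality implied by power 0.84 at r2 = 0.8 (gene-wide)
ncp_r2_08 = ncp_for_power(0.84, gene_alpha)

for r2 in (0.7, 0.5):
    ncp = ncp_r2_08 * r2 / 0.8   # assumed linear scaling with r2
    print(r2,
          round(power_1df(ncp, gene_alpha), 2),
          round(power_1df(ncp, experiment_alpha), 2))
```

Under these assumptions the results fall within a few hundredths of the figures quoted above (0.78 and 0.43 for r² = 0.7; 0.58 and 0.23 for r² = 0.5); small differences are expected because the starting value of 0.84 is itself rounded.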
Approach to multiple testing
As well as limiting the number of tests performed via specific hypotheses, we propose to use a permutation-based framework to control for multiple testing. Within a gene, we will control the family-wise type I error rate at 5%: case labels are randomly permuted (possibly within subgroups to control for potential confounders) against all genotypes. This procedure maintains the correlational structure of the tests under all permuted replicates, and so is not as conservative as the Bonferroni correction, which assumes tests are independent. By comparing each observed test statistic within a gene against the maximum permuted test statistic per replicate, the empirical p-value will naturally control for multiple testing. (A similar logic can be applied to multiple, potentially correlated, phenotypes also.) At the gene-based level of analysis, controlling for the chance of at least one false positive is appropriate, as we will conclude a significant gene-disease association if at least one test within the gene is significant after correction. In contrast, we may wish to use a less stringent control across genes, to obtain an experiment-wide significance value: here false discovery rate (FDR) procedures, which control the expected proportion of significant results that are false positives, may be more appropriate. We should note that, as in many areas of statistical genetics, this area is currently the subject of much methodological development and debate: as such, when the time comes to perform the analysis, the literature will be reviewed to formalize a specific analytic plan. Ultimately, replication in an independent sample will also be important to establish true associations.
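The within-gene permutation procedure can be sketched as below. This is a simplified illustration rather than the PLINK implementation: the test statistic is assumed to be a 1 df allelic chi-squared, the optional strata argument stands in for subgroup-restricted permutation, and the names and toy data are hypothetical.

```python
import random

def allelic_stat(genos, is_case):
    """1 df allelic chi-squared for one SNP (genotypes coded 0/1/2)."""
    a = b = c = d = 0
    for g, case in zip(genos, is_case):
        if case:
            a, b = a + g, b + 2 - g
        else:
            c, d = c + g, d + 2 - g
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def max_t_pvalues(snps, is_case, strata=None, n_perm=500, seed=7):
    """Family-wise corrected empirical p-values for the SNPs of one gene."""
    rng = random.Random(seed)
    strata = strata or [0] * len(is_case)
    groups = {}
    for i, s in enumerate(strata):
        groups.setdefault(s, []).append(i)
    observed = [allelic_stat(g, is_case) for g in snps]
    exceed = [0] * len(snps)
    labels = list(is_case)
    for _ in range(n_perm):
        for idx in groups.values():            # shuffle labels within each stratum only
            shuffled = [labels[i] for i in idx]
            rng.shuffle(shuffled)
            for i, lab in zip(idx, shuffled):
                labels[i] = lab
        # Genotypes stay fixed, so between-SNP correlation is preserved
        max_stat = max(allelic_stat(g, labels) for g in snps)
        for j, obs in enumerate(observed):
            exceed[j] += max_stat >= obs
    return [(1 + e) / (1 + n_perm) for e in exceed]

# Toy data: SNP 1 perfectly separates cases from controls, SNP 2 is unrelated
is_case = [True] * 100 + [False] * 100
snp1 = [2] * 100 + [0] * 100
snp2 = [i % 3 for i in range(200)]
pvals = max_t_pvalues([snp1, snp2], is_case)
print(pvals)
```

Each observed statistic is compared with the largest statistic across all SNPs in a replicate, which is what yields the family-wise control described above.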
Genetic overlap between PTSD and major depression
Based on epidemiologic studies, we estimate that the prevalence of lifetime major depression (MD) will be ~40% in PTSD cases and ~20% in trauma-exposed controls who never developed PTSD.[3,6,8,12,41,42] Given our sample of 1000 cases and 1000 controls, this will be the first PTSD candidate gene study to date with adequate numbers of PTSD cases with and without MD to conduct exploratory analyses examining the complex relation between these two disorders in trauma-exposed women. To empirically address this potentially complex genetic relationship, for PTSD-associated variants we will a) test whether genotype distribution differs between PTSD cases with and without MD, and b) establish whether the association with PTSD holds after controlling for MD. These tests will be performed using PLINK: the first analysis is a standard association analysis performed in the PTSD case subsample; the second analysis will use the Cochran-Mantel-Haenszel test for association in stratified tables (stratifying by MD status). In this way, we can ask whether the association is similar for PTSD cases with and without MD, or is specific to PTSD or to the comorbid PTSD+MD phenotype. Given that genetic influences on MD explain 57% of the genetic variance in PTSD,[52] we predict most gene-PTSD associations to be similar for PTSD cases with and without MD.[108]
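For reference, the Cochran-Mantel-Haenszel statistic for a set of 2x2 tables can be computed as in the sketch below. This is a generic textbook form without continuity correction, not a transcription of PLINK's code; the table layout (case/control by risk/other allele, one table per MD stratum) and the allele counts are hypothetical.

```python
import math

def cmh_test(tables):
    """Cochran-Mantel-Haenszel 1 df chi-squared over a list of 2x2 tables.

    Each table is (a, b, c, d): risk and other allele counts in cases (a, b)
    and in controls (c, d) for one stratum. Returns (chi2, p_value).
    """
    diff = var = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue                                # stratum carries no information
        row1, row2, col1, col2 = a + b, c + d, a + c, b + d
        diff += a - row1 * col1 / n                 # observed minus expected for cell a
        var += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    if var == 0:
        return 0.0, 1.0
    chi2 = diff * diff / var
    # Upper tail of a 1 df chi-squared: P(X > x) = erfc(sqrt(x / 2))
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# Hypothetical allele counts, stratified by MD status (with MD, without MD)
tables = [(260, 540, 110, 290), (330, 870, 400, 1200)]
chi2, p_value = cmh_test(tables)
print(chi2, p_value)
```

Pooling the observed-minus-expected counts across strata tests for an association with PTSD that is consistent across MD status, rather than one induced by the differing MD prevalence in cases and controls.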
Statistical Analyses for the Secondary Aim: Dissection Stage
The goal of the Detection stage is to screen all genes for association using a simple, powerful and rigorous analytic approach. In this second Dissection stage, we propose a more detailed examination of any genes that pass the first stage, both to refine the association signal and to explore it in its broader phenotypic, genetic and environmental context. In particular, we consider: 1) conditional tests to determine the causal variant among multiple correlated association signals; 2) analysis of trauma timing, type, and severity; 3) a gene-based approach to detecting epistasis.
We will capitalize on the large sample size and conduct conditional analyses to help fine-map the causal variant within a region showing multiple significant associations. The use of haplotype information can, to some extent, help to determine whether specific associated SNPs and/or haplotypes are more likely to be only indirectly associated (via LD) as opposed to being the causal variant. The PLINK package enables a flexible specification of nested hypotheses which allows tests to be constructed that ask questions such as: can this sole SNP or haplotype explain the entire association signal at a locus? does SNP A have an effect independent of SNP B or haplotype C? For example, for two SNPs, alleles A and B (as opposed to alleles a and b) may both be associated with PTSD as well as with each other (due to LD). The basic analysis would not inform us as to whether both A and B are contributing independently to risk for PTSD, however. A conditional analysis might proceed as follows: if, for example, three haplotypes are observed, AB, ab, and Ab, then we can test each SNP controlling for the other, e.g. for the A allele the test is [A b versus a b ] and for the B allele it is [A B versus A b]. The PLINK package (developed by Dr. Purcell) enables flexible specification of such conditional tests, for any number of SNPs and haplotypes. For example, testing the effect of A conditional on two other SNPs might entail fitting a model that equates the following haplotypes: [A BC = a BC; A bC = a bC; A bc = a bc ] and comparing the fit (via likelihood ratio test statistic) with the full model which does not impose these equality constraints. The above model can be easily specified in PLINK. In summary, given a strong initial association signal, these analyses can help to determine which variants are causal and which are only indirectly associated. 
This analysis can never prove that a variant is causal: it can, however, indicate which of a set of associated variants do not show simple independent causal effects and inform functional studies.
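The two-SNP example above can be illustrated with a deliberately simplified sketch in which haplotype phase is treated as known and each conditional test is a 2x2 chi-squared comparison of two haplotypes that differ at a single SNP. PLINK's conditional tests are likelihood ratio tests that weight by haplotype posterior probabilities, which this sketch does not attempt to reproduce; the haplotype counts and function names are hypothetical.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared (1 df) for a 2x2 table, with its p-value."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom
    return chi2, math.erfc(math.sqrt(chi2 / 2))

def conditional_test(case_counts, control_counts, hap1, hap2):
    """Compare two haplotypes that differ at one SNP, holding the other fixed."""
    return chi2_2x2(case_counts[hap1], case_counts[hap2],
                    control_counts[hap1], control_counts[hap2])

# Hypothetical phase-known haplotype counts (2000 chromosomes per group),
# constructed so that only allele B raises risk.
cases = {"AB": 900, "Ab": 300, "ab": 800}
controls = {"AB": 600, "Ab": 400, "ab": 1000}

chi2_A, p_A = conditional_test(cases, controls, "Ab", "ab")   # A on a b background
chi2_B, p_B = conditional_test(cases, controls, "AB", "Ab")   # B on an A background
print(round(chi2_A, 2), round(chi2_B, 2))
```

In this constructed example the marginal association of A reflects its LD with B: A shows little evidence of an effect once the background is fixed at b, whereas B remains strongly associated on the A background.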
We will test whether trauma timing, type, and severity modify the association between genetic risk variants and PTSD. For genetic variants that pass the Detection stage (p < .05 after correction for multiple testing), we will perform a focused set of analyses that test for heterogeneity in terms of the timing, type, and severity of the trauma. We hypothesize that the effect of PTSD genetic risk variants will be magnified among women whose first trauma occurred in childhood (rather than adolescence or adulthood), among those exposed to interpersonal violence versus other traumas, and among those with more severe (high versus low) trauma exposure. Heterogeneity analyses can be performed using PLINK, which allows allelic and haplotypic coefficients to vary as a linear function of a measured covariate, e.g. instead of simply g the coefficient is estimated as (g + bM_i), where M_i is the measured covariate for individual i. A likelihood ratio test is constructed by comparison against the nested submodel that fixes b to 0, which indicates whether the association depends on the covariate. For any environmental measures coded as multiple categories, we shall use the Breslow-Day test of homogeneous odds ratios as implemented in PLINK.
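The covariate-dependent coefficient (g + bM_i) and its likelihood ratio test can be sketched with a small logistic regression fitted by Newton-Raphson. This is an illustrative stand-in for the PLINK analysis, not its implementation: the inclusion of a main effect of M in both models, the simulated data and all parameter values are assumptions made for the example.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_logistic(X, y, n_iter=25):
    """Maximum-likelihood logistic regression; returns (beta, log-likelihood)."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for row, yi in zip(X, y):
            mu = 1 / (1 + math.exp(-sum(b * x for b, x in zip(beta, row))))
            for r in range(k):
                grad[r] += (yi - mu) * row[r]
                for c in range(k):
                    hess[r][c] += mu * (1 - mu) * row[r] * row[c]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    loglik = 0.0
    for row, yi in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, row))
        loglik += yi * eta - math.log1p(math.exp(eta))
    return beta, loglik

# Simulated data: allelic effect present only when the covariate M is 1
rng = random.Random(0)
rows_full, rows_null, y = [], [], []
for _ in range(1000):
    G = (rng.random() < 0.25) + (rng.random() < 0.25)   # 0/1/2 risk alleles
    M = int(rng.random() < 0.25)                          # hypothetical binary covariate
    eta = -1.0 + 0.5 * M + (0.0 + 0.6 * M) * G            # true g = 0, b = 0.6
    y.append(int(rng.random() < 1 / (1 + math.exp(-eta))))
    rows_null.append([1.0, G, M])                         # submodel with b fixed to 0
    rows_full.append([1.0, G, M, G * M])                  # coefficient on G is g + b*M

beta_full, ll_full = fit_logistic(rows_full, y)
_, ll_null = fit_logistic(rows_null, y)
lrt = max(0.0, 2 * (ll_full - ll_null))                   # 1 df likelihood ratio statistic
p_value = math.erfc(math.sqrt(lrt / 2))
print(beta_full, lrt, p_value)
```

The last element of beta_full estimates b, and the likelihood ratio statistic compares the full model with the nested submodel in which b is fixed to 0.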
Our sample of 1000 cases and 1000 controls is well powered to perform gene-trauma interaction analyses for genes associated with PTSD in detection analyses. To evaluate the statistical power to detect an interaction between genotype and trauma-exposure characteristics, we conducted a series of simulations considering a range of scenarios. Power was calculated as the proportion of simulated samples (out of 1000) that were significant at an alpha level of .01. The power to detect an interaction will depend on the minor allele frequency, the prevalence of the high-risk trauma-exposure characteristic (e.g. childhood trauma versus later, IPV versus OTS, high versus low exposure severity), and the effect size for the interaction. We chose minor allele frequencies of .10, .25, and .50 to be comparable to minor allele frequencies of variants included in our study, e.g. the s allele of SLC6A4 has ~50% frequency in Caucasian populations. In all cases, we assumed a main effect of exposure, an allelic effect only in the high-risk exposure group, and alpha = .01. In summary, if the prevalence of the high-risk trauma-exposure characteristic was .10 or .25, power to detect interaction ranged from .80 to ~1.00 for a minor allele frequency of .10 or greater and an interaction RR of 1.5 or greater. If the prevalence of the high-risk trauma-exposure characteristic was .50, power ranged from >.90 to 1.00 for a minor allele frequency of .10 or greater and an interaction RR of 1.5 or greater.
As a more exploratory, secondary goal, we plan to evaluate evidence for epistatic gene-gene interaction, using a novel method which considers all SNPs in a pair of genes simultaneously in the test for interaction. The method has been validated both in simulation studies and via application to several datasets, e.g. detecting interaction between dysbindin and the BLOC-1 genes in schizophrenia (Morris et al.[109]). The method, based on canonical correlation analysis, can be applied either as a case-only test for epistasis (more powerful, but applicable only to unlinked genes and reliant on a more stringent assumption regarding population homogeneity) or as the more traditional case-control approach. Simulations comparing this approach to the standard pairwise SNP-by-SNP approach (e.g. Marchini et al.[110]) have shown an increase in power. For example, using a dominant/complementary model of epistatic gene action, we simulated 5 genes (of which only two interacted) each with 10 SNPs, which leads to 250 SNP-by-SNP tests but only 25 gene-by-gene tests. We used permutation to generate empirical p-values and control for multiple testing: power is presented correcting for all tests in a particular gene-by-gene comparison, and also at the experiment-wide level. The standard SNP-by-SNP approach yielded powers of 24% and 6% respectively, whereas the new approach gave 70% and 58% power. This novel approach is therefore considerably more powerful and ideally suited to detecting epistasis between the 30 genes. The approach is also ideally suited to testing interaction between groups of genes: the neurobiological pathways to which the candidate genes belong will be used to specify intra- and inter-pathway interactions. Importantly, the large sample size available to us will ensure that the screen for epistasis is both comprehensive and rigorous (i.e. controlling for multiple testing).