Trynka G, Westra H-J, Slowikowski K, Hu X, Xu H, Stranger BE, Klein RJ, Han B, Raychaudhuri S. Disentangling the Effects of Colocalizing Genomic Annotations to Functionally Prioritize Non-coding Variants within Complex-Trait Loci [Internet]. Am J Hum Genet 2015;97(1):139-52.
Identifying genomic annotations that differentiate causal from trait-associated variants is essential to fine mapping disease loci. Although many studies have identified non-coding functional annotations that overlap disease-associated variants, these annotations often colocalize, complicating the ability to use these annotations for fine mapping causal variation. We developed a statistical approach (Genomic Annotation Shifter [GoShifter]) to assess whether enriched annotations are able to prioritize causal variation. GoShifter defines the null distribution of an annotation overlapping an allele by locally shifting annotations; this approach is less sensitive to biases arising from local genomic structure than commonly used enrichment methods that depend on SNP matching. Local shifting also allows GoShifter to identify independent causal effects from colocalizing annotations. Using GoShifter, we confirmed that variants in expression quantitative trait loci drive gene-expression changes through DNase-I hypersensitive sites (DHSs) near transcription start sites and independently through 3' UTR regulation. We also showed that (1) 15%-36% of trait-associated loci map to DHSs independently of other annotations; (2) loci associated with breast cancer and rheumatoid arthritis harbor potentially causal variants near the summits of histone marks rather than full peak bodies; (3) variants associated with height are highly enriched in embryonic stem cell DHSs; and (4) we can effectively prioritize causal variation at specific loci.
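The local-shifting idea in the abstract above can be illustrated with a small sketch. This is not the GoShifter implementation: the locus dictionary layout, the half-open interval convention, the circular wrap-around, and the function names (`overlaps`, `shift_annotations`, `local_shift_pvalue`) are all assumptions made for illustration. The statistic used here is simply the number of loci in which at least one SNP falls inside an annotation.

```python
import random

def overlaps(snps, annots):
    """True if any SNP position falls inside any half-open interval [start, end)."""
    return any(start <= s < end for s in snps for start, end in annots)

def shift_annotations(annots, lo, hi, offset):
    """Circularly shift intervals by `offset` inside the window [lo, hi).

    Assumes every interval lies inside the window and is shorter than it.
    """
    width = hi - lo
    shifted = []
    for start, end in annots:
        s = lo + (start - lo + offset) % width
        e = s + (end - start)
        if e <= hi:
            shifted.append((s, e))
        else:
            # The interval runs past the window edge: wrap the tail to the start.
            shifted.append((s, hi))
            shifted.append((lo, lo + (e - hi)))
    return shifted

def local_shift_pvalue(loci, n_iter=1000, seed=0):
    """Empirical p-value for the number of loci with a SNP-annotation overlap.

    Each locus is a dict (hypothetical layout):
    {"window": (lo, hi), "snps": [...], "annots": [(start, end), ...]}.
    """
    rng = random.Random(seed)
    observed = sum(overlaps(l["snps"], l["annots"]) for l in loci)
    hits = 0
    for _ in range(n_iter):
        null = 0
        for l in loci:
            lo, hi = l["window"]
            offset = rng.randrange(hi - lo)  # random shift within the locus
            null += overlaps(l["snps"], shift_annotations(l["annots"], lo, hi, offset))
        hits += null >= observed
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_iter + 1)
```

Because annotations are shifted only within each locus, the null keeps the local annotation density and SNP layout fixed, which is the property the abstract contrasts with SNP-matching approaches.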
Lenz TL#, Deutsch AJ#, Han B, Hu X, Okada Y, Eyre S, Knapp M, Zhernakova A, Huizinga TWJ, Abecasis G, Becker J, Boeckxstaens GE, Chen W-M, Franke A, Gladman DD, Gockel I, Gutierrez-Achury J, Martin J, Nair RP, Nöthen MM, Onengut-Gumuscu S, Rahman P, Rantapää-Dahlqvist S, Stuart PE, Tsoi LC, van Heel DA, Worthington J, Wouters MM, Klareskog L, Elder JT, Gregersen PK, Schumacher J, Rich SS, Wijmenga C, Sunyaev SR, de Bakker PIW*, Raychaudhuri S*. Widespread non-additive and interaction effects within HLA loci modulate the risk of autoimmune diseases [Internet]. Nat Genet 2015;47(9):1085-90.
Human leukocyte antigen (HLA) genes confer substantial risk for autoimmune diseases on a log-additive scale. Here we speculated that differences in autoantigen-binding repertoires between a heterozygote's two expressed HLA variants might result in additional non-additive risk effects. We tested the non-additive disease contributions of classical HLA alleles in patients and matched controls for five common autoimmune diseases: rheumatoid arthritis (n = 5,337 cases), type 1 diabetes (T1D; n = 5,567 cases), psoriasis vulgaris (n = 3,089 cases), idiopathic achalasia (n = 727 cases) and celiac disease (n = 11,115 cases). In four of the five diseases, we observed highly significant, non-additive dominance effects (rheumatoid arthritis, P = 2.5 × 10⁻¹²; T1D, P = 2.4 × 10⁻¹⁰; psoriasis, P = 5.9 × 10⁻⁶; celiac disease, P = 1.2 × 10⁻⁸⁷). In three of these diseases, the non-additive dominance effects were explained by interactions between specific classical HLA alleles (rheumatoid arthritis, P = 1.8 × 10⁻³; T1D, P = 8.6 × 10⁻²⁷; celiac disease, P = 6.0 × 10⁻¹⁰⁰). These interactions generally increased disease risk and explained moderate but significant fractions of phenotypic variance (rheumatoid arthritis, 1.4%; T1D, 4.0%; celiac disease, 4.1%) beyond a simple additive model.
van Steenbergen HW, Raychaudhuri S, Rodríguez-Rodríguez L, Rantapää-Dahlqvist S, Berglin E, Toes REM, Huizinga TWJ, Fernández-Gutiérrez B, Gregersen PK, van der Helm-van Mil AHM. Association of valine and leucine at HLA-DRB1 position 11 with radiographic progression in rheumatoid arthritis, independent of the shared epitope alleles but not independent of anti-citrullinated protein antibodies [Internet]. Arthritis Rheumatol 2015;67(4):877-86.
OBJECTIVE: For decades it has been known that the HLA-DRB1 shared epitope (SE) alleles are associated with an increased risk of development and progression of rheumatoid arthritis (RA). Recently, the following variations in the peptide-binding grooves of HLA molecules that predispose to RA development have been identified: Val and Leu at HLA-DRB1 position 11, Asp at HLA-B position 9, and Phe at HLA-DPB1 position 9. This study was undertaken to investigate whether these variants are also associated with radiographic progression in RA, independent of SE and anti-citrullinated protein antibody (ACPA) status. METHODS: A total of 4,911 radiograph sets from 1,878 RA patients included in the Leiden Early Arthritis Clinic (The Netherlands), Umeå (Sweden), Hospital Clinico San Carlos-Rheumatoid Arthritis (Spain), and National Data Bank for Rheumatic Diseases (US) cohorts were studied. HLA was imputed using single-nucleotide polymorphism data from an Immunochip, and the amino acids listed above were tested in relation to radiographic progression per cohort using an additive model. Results from the 4 cohorts were combined in inverse-variance weighted meta-analyses using a fixed-effects model. Analyses were conditioned on SE and ACPA status. RESULTS: Val and Leu at HLA-DRB1 position 11 were associated with more radiographic progression (meta-analysis P = 5.11 × 10⁻⁷); this effect was independent of SE status (meta-analysis P = 0.022) but not independent of ACPA status. Phe at HLA-DPB1 position 9 was associated with more severe radiographic progression (meta-analysis P = 0.024), though not independent of SE status. Asp at HLA-B position 9 was not associated with radiographic progression. CONCLUSION: Val and Leu at HLA-DRB1 position 11 conferred a risk of a higher rate of radiographic progression independent of SE status but not independent of ACPA status. These findings support the relevance of these amino acids at position 11.
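For readers unfamiliar with the pooling step mentioned in the METHODS above, the following is a minimal sketch of a standard inverse-variance weighted fixed-effects meta-analysis. It is a generic textbook formulation, not the study's code; the function name `fixed_effects_meta` and the inputs (per-cohort effect estimates and standard errors) are assumptions for illustration.

```python
import math

def fixed_effects_meta(betas, ses):
    """Pool per-cohort effect estimates with inverse-variance weights.

    betas: per-cohort effect estimates (e.g. regression coefficients)
    ses:   their standard errors
    Returns (pooled beta, pooled SE, z score, two-sided normal p-value).
    """
    weights = [1.0 / se ** 2 for se in ses]           # w_i = 1 / SE_i^2
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)                       # SE of the pooled estimate
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))            # two-sided p from N(0, 1)
    return beta, se, z, p
```

Cohorts with smaller standard errors receive proportionally more weight, so a large cohort dominates the pooled estimate unless the smaller cohorts agree with it.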
Liao KP, Cai T, Savova GK, Murphy SN, Karlson EW, Ananthakrishnan AN, Gainer VS, Shaw SY, Xia Z, Szolovits P, Churchill S, Kohane I. Development of phenotype algorithms using electronic medical records and incorporating natural language processing [Internet]. BMJ 2015;350:h1885.
Lee HS, Byrne EM, Hultman CM, Kähler A, Vinkhuyzen AAE, Ripke S, Andreassen OA, Frisell T, Gusev A, Hu X, Karlsson R, Mantzioris VX, McGrath JJ, Mehta D, Stahl EA, Zhao Q, Kendler KS, Sullivan PF, Price AL, O'Donovan M, Okada Y, Mowry BJ, Raychaudhuri S, Wray NR, of the and International SWGPGCRAC, of the Authors SWGPGC, Byerley W, Cahn W, Cantor RM, Cichon S, Cormican P, Curtis D, Djurovic S, Escott-Price V, Gejman PV, Georgieva L, Giegling I, Hansen TF, Ingason A, Kim Y, Konte B, Lee PH, McIntosh A, McQuillin A, Morris DW, Nöthen MM, O'Dushlaine C, Olincy A, Olsen L, Pato CN, Pato MT, Pickard BS, Posthuma D, Rasmussen HB, Rietschel M, Rujescu D, Schulze TG, Silverman JM, Thirumalai S, Werge T, of the Collaborators SWGPGC, Agartz I, Amin F, Azevedo MH, Bass N, Black DW, Blackwood DHR, Bruggeman R, Buccola NG, Choudhury K, Cloninger RC, Corvin A, Craddock N, Daly MJ, Datta S, Donohoe GJ, Duan J, Dudbridge F, Fanous A, Freedman R, Freimer NB, Friedl M, Gill M, Gurling H, De Haan L, Hamshere ML, Hartmann AM, Holmans PA, Kahn RS, Keller MC, Kenny E, Kirov GK, Krabbendam L, Krasucki R, Lawrence J, Lencz T, Levinson DF, Lieberman JA, Lin D-Y, Linszen DH, Magnusson PKE, Maier W, Malhotra AK, Mattheisen M, Mattingsdal M, McCarroll SA, Medeiros H, Melle I, Milanova V, Myin-Germeys I, Neale BM, Ophoff RA, Owen MJ, Pimm J, Purcell SM, Puri V, Quested DJ, Rossin L, Ruderfer D, Sanders AR, Shi J, Sklar P, St Clair D, Stroup ST, van Os J, Visscher PM, Wiersma D, Zammit S, Zammit S, Bridges LS, Choi HK, Coenen MJ, de Vries N, Dieud P, Greenberg JD, Huizinga TWJ, Padyukov L, Siminovitch KA, Tak PP, Worthington J, Worthington J, De Jager PL, Denny JC, Gregersen PK, Klareskog L, Mariette X, Plenge RM, van Laar M, van Riel P. New data and an old puzzle: the negative association between schizophrenia and rheumatoid arthritis [Internet]. Int J Epidemiol 2015;44(5):1706-21.
BACKGROUND: A long-standing epidemiological puzzle is the reduced rate of rheumatoid arthritis (RA) in those with schizophrenia (SZ) and vice versa. Traditional epidemiological approaches to determine if this negative association is underpinned by genetic factors would test for reduced rates of one disorder in relatives of the other, but sufficiently powered data sets are difficult to achieve. The genomics era presents an alternative paradigm for investigating the genetic relationship between two uncommon disorders. METHODS: We use genome-wide common single nucleotide polymorphism (SNP) data from independently collected SZ and RA case-control cohorts to estimate the SNP correlation between the disorders. We test a genotype X environment (GxE) hypothesis for SZ with environment defined as winter- vs summer-born. RESULTS: We estimate a small but significant negative SNP-genetic correlation between SZ and RA (-0.046, s.e. 0.026, P = 0.036). The negative correlation was stronger for the SNP set attributed to coding or regulatory regions (-0.174, s.e. 0.071, P = 0.0075). Our analyses led us to hypothesize a gene-environment interaction for SZ in the form of immune challenge. We used month of birth as a proxy for environmental immune challenge and estimated the genetic correlation between winter-born and non-winter born SZ to be significantly less than 1 for coding/regulatory region SNPs (0.56, s.e. 0.14, P = 0.00090). CONCLUSIONS: Our results are consistent with epidemiological observations of a negative relationship between SZ and RA reflecting, at least in part, genetic factors. Results of the month of birth analysis are consistent with pleiotropic effects of genetic variants dependent on environmental context.
Triebwasser MP, Roberson EDO, Yu Y, Schramm EC, Wagner EK, Raychaudhuri S, Seddon JM, Atkinson JP. Rare Variants in the Functional Domains of Complement Factor H Are Associated With Age-Related Macular Degeneration [Internet]. Invest Ophthalmol Vis Sci 2015;56(11):6873-8.
PURPOSE: Age-related macular degeneration (AMD) has a substantial genetic risk component, as evidenced by the risk from common genetic variants uncovered in the first genome-wide association studies. More recently, it has become apparent that rare genetic variants also play an independent role in AMD risk. We sought to determine if rare variants in complement factor H (CFH) played a role in AMD risk. METHODS: We had previously collected DNA from a large population of patients with advanced age-related macular degeneration (A-AMD) and controls for targeted deep sequencing of candidate AMD risk genes. In this analysis, we tested for an increased burden of rare variants in CFH in 1665 cases and 752 controls from this cohort. RESULTS: We identified 65 missense, nonsense, or splice-site mutations with a minor allele frequency ≤ 1%. Rare variants with minor allele frequency ≤ 1% (odds ratio [OR] = 1.5, P = 4.4 × 10⁻²), 0.5% (OR = 1.6, P = 2.6 × 10⁻²), and all singletons (OR = 2.3, P = 3.3 × 10⁻²) were enriched in A-AMD cases. Moreover, we observed loss-of-function rare variants (nonsense, splice-site, and loss of a conserved cysteine) in 10 cases and serum levels of FH were decreased in all 5 with an available sample (haploinsufficiency). Further, rare variants in the major functional domains of CFH were increased in cases (OR = 3.2; P = 1.4 × 10⁻³) and the magnitude of the effect correlated with the disruptive nature of the variant, location in an active site, and inversely with minor allele frequency. CONCLUSIONS: In this large A-AMD cohort, rare variants in the CFH gene were enriched and tended to be located in functional sites or led to low serum levels. These data, combined with those indicating a similar, but even more striking, increase in rare variants found in CFI, strongly implicate complement activation in A-AMD etiopathogenesis as CFH and CFI interact to inhibit the alternative pathway.
Yu S, Liao KP, Shaw SY, Gainer VS, Churchill SE, Szolovits P, Murphy SN, Kohane IS, Cai T. Toward high-throughput phenotyping: unbiased automated feature extraction and selection from knowledge sources [Internet]. J Am Med Inform Assoc 2015;22(5):993-1000.
OBJECTIVE: Analysis of narrative (text) data from electronic health records (EHRs) can improve population-scale phenotyping for clinical and genetic research. Currently, selection of text features for phenotyping algorithms is slow and laborious, requiring extensive and iterative involvement by domain experts. This paper introduces a method to develop phenotyping algorithms in an unbiased manner by automatically extracting and selecting informative features, which can be comparable to expert-curated ones in classification accuracy. MATERIALS AND METHODS: Comprehensive medical concepts were collected from publicly available knowledge sources in an automated, unbiased fashion. Natural language processing (NLP) revealed the occurrence patterns of these concepts in EHR narrative notes, which enabled selection of informative features for phenotype classification. When combined with additional codified features, a penalized logistic regression model was trained to classify the target phenotype. RESULTS: The authors applied the method to develop algorithms to identify patients with rheumatoid arthritis (RA) and coronary artery disease (CAD) cases among those with rheumatoid arthritis from a large multi-institutional EHR. The areas under the receiver operating characteristic curve (AUCs) for classifying RA and CAD using models trained with automated features were 0.951 and 0.929, respectively, compared to the AUCs of 0.938 and 0.929 by models trained with expert-curated features. DISCUSSION: Models trained with NLP text features selected through an unbiased, automated procedure achieved comparable or slightly higher accuracy than those trained with expert-curated features. The majority of the selected model features were interpretable. CONCLUSION: The proposed automated feature extraction method, generating highly accurate phenotyping algorithms with improved efficiency, is a significant step toward high-throughput phenotyping.
Yarwood A, Han B, Raychaudhuri S, Bowes J, Lunt M, Pappas DA, Kremer J, Greenberg JD, Plenge R, Worthington J, Barton A, Eyre S. A weighted genetic risk score using all known susceptibility variants to estimate rheumatoid arthritis risk [Internet]. Ann Rheum Dis 2015;74(1):170-6.
BACKGROUND: There is currently great interest in the incorporation of genetic susceptibility loci into screening models to identify individuals at high risk of disease. Here, we present the first risk prediction model including all 46 known genetic loci associated with rheumatoid arthritis (RA). METHODS: A weighted genetic risk score (wGRS) was created using 45 RA non-human leucocyte antigen (HLA) susceptibility loci, imputed amino acids at HLA-DRB1 (positions 11, 71 and 74), HLA-DPB1 (position 9) and HLA-B (position 9), and gender. The wGRS was tested in 11 366 RA cases and 15 489 healthy controls. The risk of developing RA was estimated using logistic regression by dividing the wGRS into quintiles. The ability of the wGRS to discriminate between cases and controls was assessed by receiver operator characteristic analysis and discrimination improvement tests. RESULTS: Individuals in the highest risk group showed significantly increased odds of developing anti-cyclic citrullinated peptide-positive RA compared to the lowest risk group (OR 27.13, 95% CI 23.70 to 31.05). The wGRS was validated in an independent cohort that showed similar results (area under the curve 0.78, OR 18.00, 95% CI 13.67 to 23.71). Comparison of the full wGRS with a wGRS in which HLA amino acids were replaced by a HLA tag single-nucleotide polymorphism showed a significant loss of sensitivity and specificity. CONCLUSIONS: Our study suggests that in RA, even when using all known genetic susceptibility variants, prediction performance remains modest; while this is insufficiently accurate for general population screening, it may prove of more use in targeted studies. Our study has also highlighted the importance of including HLA variation in risk prediction models.
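A weighted genetic risk score of the kind described above is commonly computed as a sum of risk-allele counts weighted by the log odds ratio of each variant. The abstract does not state the weighting scheme, so the log-OR weights, the rank-based quintile assignment, and the function names below are assumptions for illustration rather than the study's procedure.

```python
import math

def weighted_grs(dosages, odds_ratios):
    """Sum of risk-allele dosages (0, 1 or 2) weighted by log(odds ratio).

    Assumes one dosage and one published odds ratio per variant, in matching order.
    """
    return sum(d * math.log(odds) for d, odds in zip(dosages, odds_ratios))

def quintile_labels(scores):
    """Assign each score a quintile label 1..5 by rank (1 = lowest scores)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [0] * len(scores)
    for rank, i in enumerate(order):
        labels[i] = rank * 5 // len(scores) + 1
    return labels
```

In a study like the one summarized here, the quintile label would then enter a logistic regression as the predictor, with the lowest quintile as the reference group.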
Sul JH, Raj T, de Jong S, de Bakker PIW, Raychaudhuri S, Ophoff RA, Stranger BE, Eskin E, Han B. Accurate and fast multiple-testing correction in eQTL studies [Internet]. Am J Hum Genet 2015;96(6):857-68.
In studies of expression quantitative trait loci (eQTLs), it is of increasing interest to identify eGenes, the genes whose expression levels are associated with variation at a particular genetic variant. Detecting eGenes is important for follow-up analyses and prioritization because genes are the main entities in biological processes. To detect eGenes, one typically focuses on the genetic variant with the minimum p value among all variants in cis with a gene and corrects for multiple testing to obtain a gene-level p value. For performing multiple-testing correction, a permutation test is widely used. Because of growing sample sizes of eQTL studies, however, the permutation test has become a computational bottleneck in eQTL studies. In this paper, we propose an efficient approach for correcting for multiple testing and assess eGene p values by utilizing a multivariate normal distribution. Our approach properly takes into account the linkage-disequilibrium structure among variants, and its time complexity is independent of sample size. By applying our small-sample correction techniques, our method achieves high accuracy in both small and large studies. We have shown that our method consistently produces extremely accurate p values (accuracy > 98%) for three human eQTL datasets with different sample sizes and SNP densities: the Genotype-Tissue Expression pilot dataset, the multi-region brain dataset, and the HapMap 3 dataset.
Pers TH, Karjalainen JM, Chan Y, Westra H-J, Wood AR, Yang J, Lui JC, Vedantam S, Gustafsson S, Esko T, Frayling T, Speliotes EK, Genetic Investigation of ANthropometric Traits (GIANT) Consortium, Boehnke M, Raychaudhuri S, Fehrmann RSN, Hirschhorn JN, Franke L. Biological interpretation of genome-wide association studies using predicted gene functions [Internet]. Nat Commun 2015;6:5890.
The main challenge for gaining biological insights from genetic associations is identifying which genes and pathways explain the associations. Here we present DEPICT, an integrative tool that employs predicted gene functions to systematically prioritize the most likely causal genes at associated loci, highlight enriched pathways and identify tissues/cell types where genes from associated loci are highly expressed. DEPICT is not limited to genes with established functions and prioritizes relevant gene sets for many phenotypes.
Scott IC, Rijsdijk F, Walker J, Quist J, Spain SL, Tan R, Steer S, Okada Y, Raychaudhuri S, Cope AP, Lewis CM. Do Genetic Susceptibility Variants Associate with Disease Severity in Early Active Rheumatoid Arthritis? [Internet]. J Rheumatol 2015;42(7):1131-40.
OBJECTIVE: Genetic variants affect both the development and severity of rheumatoid arthritis (RA). Recent studies have expanded the number of RA susceptibility variants. We tested the hypothesis that these variants are associated with disease severity in a clinical trial cohort of patients with early, active RA. METHODS: We evaluated 524 patients with RA enrolled in the Combination Anti-Rheumatic Drugs in Early RA (CARDERA) trials. We tested validated susceptibility variants - 69 single-nucleotide polymorphisms (SNPs), 15 HLA-DRB1 alleles, and amino acid polymorphisms in 6 HLA molecule positions - for their associations with progression in Larsen scoring, 28-joint Disease Activity Scores, and Health Assessment Questionnaire (HAQ) scores over 2 years using linear mixed-effects and latent growth curve models. RESULTS: HLA variants were associated with joint destruction. The *04:01 SNP (rs660895, p = 0.0003), *04:01 allele (p = 0.0002), and HLA-DRβ1 amino acids histidine at position 13 (p = 0.0005) and valine at position 11 (p = 0.0012) were significantly associated with radiological progression. This association was only significant in anticitrullinated protein antibody (ACPA)-positive patients, suggesting that while their effects were not mediated by ACPA, they only predicted joint damage in ACPA-positive RA. Non-HLA variants were not associated with radiographic damage (assessed individually and cumulatively as a weighted genetic risk score). Two SNPs - rs11889341 (STAT4, p = 0.0001) and rs653178 (SH2B3-PTPN11, p = 0.0004) - were associated with HAQ scores over 6-24 months. CONCLUSION: HLA susceptibility variants play an important role in determining radiological progression in early, active ACPA-positive RA. Genome-wide and HLA-wide analyses across large populations are required to better characterize the genetic architecture of radiological progression in RA.
Gutierrez-Achury J, Zhernakova A, Pulit SL, Trynka G, Hunt KA, Romanos J, Raychaudhuri S, van Heel DA, Wijmenga C, de Bakker PIW. Fine mapping in the MHC region accounts for 18% additional genetic risk for celiac disease [Internet]. Nat Genet 2015;47(6):577-8.
Although dietary gluten is the trigger for celiac disease, risk is strongly influenced by genetic variation in the major histocompatibility complex (MHC) region. We fine mapped the MHC association signal to identify additional risk factors independent of the HLA-DQA1 and HLA-DQB1 alleles and observed five new associations that account for 18% of the genetic risk. Taking these new loci together with the 57 known non-MHC loci, genetic variation can now explain up to 48% of celiac disease heritability.
Kim K, Bang S-Y, Lee H-S, Cho S-K, Choi C-B, Sung Y-K, Kim T-H, Jun J-B, Yoo DH, Kang YM, Kim S-K, Suh C-H, Shim S-C, Lee S-S, Lee J, Chung WT, Choe J-Y, Shin HD, Lee J-Y, Han B-G, Nath SK, Eyre S, Bowes J, Pappas DA, Kremer JM, Gonzalez-Gay MA, Rodriguez-Rodriguez L, Ärlestig L, Okada Y, Diogo D, Liao KP, Karlson EW, Raychaudhuri S, Rantapää-Dahlqvist S, Martin J, Klareskog L, Padyukov L, Gregersen PK, Worthington J, Greenberg JD, Plenge RM, Bae S-C. High-density genotyping of immune loci in Koreans and Europeans identifies eight new rheumatoid arthritis risk loci [Internet]. Ann Rheum Dis 2015;74(3):e13.
OBJECTIVE: A highly polygenic aetiology and high degree of allele-sharing between ancestries have been well elucidated in genetic studies of rheumatoid arthritis. Recently, the high-density genotyping array Immunochip for immune disease loci identified 14 new rheumatoid arthritis risk loci among individuals of European ancestry. Here, we aimed to identify new rheumatoid arthritis risk loci using Korean-specific Immunochip data. METHODS: We analysed Korean rheumatoid arthritis case-control samples using the Immunochip and genome-wide association studies (GWAS) array to search for new risk alleles of rheumatoid arthritis with anticitrullinated peptide antibodies. To increase power, we performed a meta-analysis of Korean data with previously published European Immunochip and GWAS data for a total sample size of 9299 Korean and 45,790 European case-control samples. RESULTS: We identified eight new rheumatoid arthritis susceptibility loci (TNFSF4, LBH, EOMES, ETS1-FLI1, COG6, RAD51B, UBASH3A and SYNGR1) that passed a genome-wide significance threshold (p < 5 × 10⁻⁸), with evidence for three independent risk alleles at 1q25/TNFSF4. The risk alleles from the seven new loci except for the TNFSF4 locus (monomorphic in Koreans), together with risk alleles from previously established RA risk loci, exhibited a high correlation of effect sizes between ancestries. Further, we refined the number of single nucleotide polymorphisms (SNPs) that represent potentially causal variants through a trans-ethnic comparison of densely genotyped SNPs. CONCLUSIONS: This study demonstrates the advantage of dense-mapping and trans-ancestral analysis for identification of potentially causal SNPs. In addition, our findings support the importance of T cells in the pathogenesis and the fact of frequent overlap of risk loci among diverse autoimmune diseases.
Kim K, Jiang X, Cui J, Lu B, Costenbader KH, Sparks JA, Bang S-Y, Lee H-S, Okada Y, Raychaudhuri S, Alfredsson L, Bae S-C, Klareskog L, Karlson EW. Interactions between amino acid-defined major histocompatibility complex class II variants and smoking in seropositive rheumatoid arthritis [Internet]. Arthritis Rheumatol 2015;67(10):2611-23.
OBJECTIVE: To define the interaction between cigarette smoking and HLA polymorphisms in seropositive rheumatoid arthritis (RA), in the context of a recently identified amino acid-based HLA model for RA susceptibility. METHODS: We imputed Immunochip data on HLA amino acids and classical alleles from 3 case-control studies (the Swedish Epidemiological Investigation of Rheumatoid Arthritis [EIRA] study [1,654 cases and 1,934 controls], the Nurses' Health Study [NHS] [229 cases and 360 controls], and the Korean RA Cohort Study [1,390 cases and 735 controls]). We examined the interaction effects of heavy smoking (>10 pack-years) and the genetic risk score (GRS) of multiple RA-associated amino acid positions (positions 11, 13, 71, and 74 in HLA-DRβ1, position 9 in HLA-B, and position 9 in HLA-DPβ1), as well as the interaction effects of heavy smoking and the GRS of HLA-DRβ1 4-amino acid haplotypes (assessed via attributable proportion due to interaction [AP] using the additive interaction model). RESULTS: Heavy smoking and all investigated HLA amino acid positions and haplotypes were associated with RA susceptibility in the 3 populations. In the interaction analysis, we found a significant deviation from the expected additive joint effect between heavy smoking and the HLA-DRβ1 4-amino acid haplotype (AP 0.416, 0.467, and 0.796, in the EIRA, NHS, and Korean studies, respectively). We further identified the key interacting variants as being located at HLA-DRβ1 amino acid positions 11 and 13 but not at any of the other RA risk-associated amino acid positions. For residues in positions 11 and 13, there were similar patterns between RA risk effects and interaction effects. CONCLUSION: Our findings of significant gene-environment interaction effects indicate that a physical interaction between citrullinated autoantigens produced by smoking and HLA-DR molecules is characterized by the HLA-DRβ1 4-amino acid haplotype, primarily by positions 11 and 13.
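The attributable proportion due to interaction (AP) reported above has a standard definition on the additive scale: AP = RERI / RR11, where RERI = RR11 - RR10 - RR01 + 1 is the relative excess risk due to interaction. The sketch below applies that textbook formula; the function name and argument names are assumptions, and in a case-control setting odds ratios are typically used in place of relative risks.

```python
def attributable_proportion(rr11, rr10, rr01):
    """Attributable proportion due to interaction on the additive scale.

    rr11: relative risk with both exposures (e.g. heavy smoking and risk haplotype)
    rr10: relative risk with the first exposure only
    rr01: relative risk with the second exposure only
    All are relative to the doubly unexposed group (relative risk 1).
    """
    reri = rr11 - rr10 - rr01 + 1.0  # relative excess risk due to interaction
    return reri / rr11

# Example with made-up numbers: RR11 = 10, RR10 = 3, RR01 = 4
# gives RERI = 4 and AP = 0.4, i.e. 40% of the joint risk is
# attributed to the interaction itself.
```

An AP of 0 corresponds to exactly additive joint effects; positive values indicate that the joint effect exceeds the sum of the separate effects.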
Kavanagh D, Yu Y, Schramm EC, Triebwasser M, Wagner EK, Raychaudhuri S, Daly MJ, Atkinson JP, Seddon JM. Rare genetic variants in the CFI gene are associated with advanced age-related macular degeneration and commonly result in reduced serum factor I levels [Internet]. Hum Mol Genet 2015;24(13):3861-70.
To assess a potential diagnostic and therapeutic biomarker for age-related macular degeneration (AMD), we sequenced the complement factor I gene (CFI) in 2266 individuals with AMD and 1400 without, identifying 231 individuals with rare genetic variants. We evaluated the functional impact by measuring circulating serum factor I (FI) protein levels in individuals with and without rare CFI variants. The burden of very rare (frequency <1/1000) variants in CFI was strongly associated with disease (P = 1.1 × 10⁻⁸). In addition, we examined eight coding variants with counts ≥5 and saw evidence for association with AMD in three variants. Individuals with advanced AMD carrying a rare CFI variant had lower mean FI compared with non-AMD subjects carrying a variant (P < 0.001). Further new evidence that FI levels drive AMD risk comes from analyses showing individuals with a CFI rare variant and low FI were more likely to have advanced AMD (P = 5.6 × 10⁻⁵). Controlling for covariates, low FI increased the risk of advanced AMD among those with a variant compared with individuals without advanced AMD with a rare CFI variant (OR 13.6, P = 1.6 × 10⁻⁴), and also compared with control individuals without a rare CFI variant (OR 19.0, P = 1.1 × 10⁻⁵). Thus, low FI levels are strongly associated with rare CFI variants and AMD. Enhancing FI activity may be therapeutic and measuring FI provides a screening tool for identifying patients who are most likely to benefit from complement inhibitory therapy.
Hayes JE, Trynka G, Vijai J, Offit K, Raychaudhuri S, Klein RJ. Tissue-Specific Enrichment of Lymphoma Risk Loci in Regulatory Elements [Internet]. PLoS One 2015;10(9):e0139360.
Though numerous polymorphisms have been associated with risk of developing lymphoma, how these variants function to promote tumorigenesis is poorly understood. Here, we report that lymphoma risk SNPs, especially in the non-Hodgkin's lymphoma subtype chronic lymphocytic leukemia, are significantly enriched for co-localization with epigenetic marks of active gene regulation. These enrichments were seen in a lymphoid-specific manner for numerous ENCODE datasets, including DNase-hypersensitivity as well as multiple segmentation-defined enhancer regions. Furthermore, we identify putatively functional SNPs that are both in regulatory elements in lymphocytes and are associated with gene expression changes in blood. We developed an algorithm, UES, that uses a Monte Carlo simulation approach to calculate the enrichment of previously identified risk SNPs in various functional elements. This multiscale approach integrating multiple datasets helps disentangle the underlying biology of lymphoma, and more broadly, is generally applicable to GWAS results from other diseases as well.
Diogo D, Bastarache L, Liao KP, Graham RR, Fulton RS, Greenberg JD, Eyre S, Bowes J, Cui J, Lee A, Pappas DA, Kremer JM, Barton A, Coenen MJ, Franke B, Kiemeney LA, Mariette X, Richard-Miceli C, Canhão H, Fonseca JE, de Vries N, Tak PP, Crusius BJA, Nurmohamed MT, Kurreeman F, Mikuls TR, Okada Y, Stahl EA, Larson DE, Deluca TL, O'Laughlin M, Fronick CC, Fulton LL, Kosoy R, Ransom M, Bhangale TR, Ortmann W, Cagan A, Gainer V, Karlson EW, Kohane I, Murphy SN, Martin J, Zhernakova A, Klareskog L, Padyukov L, Worthington J, Mardis ER, Seldin MF, Gregersen PK, Behrens T, Raychaudhuri S, Denny JC, Plenge RM. TYK2 protein-coding variants protect against rheumatoid arthritis and autoimmunity, with no evidence of major pleiotropic effects on non-autoimmune complex traits [Internet]. PLoS One 2015;10(4):e0122271.
Despite the success of genome-wide association studies (GWAS) in detecting a large number of loci for complex phenotypes such as rheumatoid arthritis (RA) susceptibility, the lack of information on the causal genes leaves important challenges to interpret GWAS results in the context of the disease biology. Here, we genetically fine-map the RA risk locus at 19p13 to define causal variants, and explore the pleiotropic effects of these same variants in other complex traits. First, we combined Immunochip dense genotyping (n = 23,092 case/control samples), Exomechip genotyping (n = 18,409 case/control samples) and targeted exon-sequencing (n = 2,236 case/control samples) to demonstrate that three protein-coding variants in TYK2 (tyrosine kinase 2) independently protect against RA: P1104A (rs34536443, OR = 0.66, P = 2.3 × 10⁻²¹), A928V (rs35018800, OR = 0.53, P = 1.2 × 10⁻⁹), and I684S (rs12720356, OR = 0.86, P = 4.6 × 10⁻⁷). Second, we show that the same three TYK2 variants protect against systemic lupus erythematosus (SLE; P(omnibus) = 6 × 10⁻¹⁸), and provide suggestive evidence that two of the TYK2 variants (P1104A and A928V) may also protect against inflammatory bowel disease (IBD; P(omnibus) = 0.005). Finally, in a phenome-wide association study (PheWAS) assessing >500 phenotypes using electronic medical records (EMR) in >29,000 subjects, we found no convincing evidence for association of P1104A and A928V with complex phenotypes other than autoimmune diseases such as RA, SLE and IBD. Together, our results demonstrate the role of TYK2 in the pathogenesis of RA, SLE and IBD, and provide supporting evidence for TYK2 as a promising drug target for the treatment of autoimmune diseases.
Won H-H, Natarajan P, Dobbyn A, Jordan DM, Roussos P, Lage K, Raychaudhuri S, Stahl E, Do R. Disproportionate Contributions of Select Genomic Compartments and Cell Types to Genetic Risk for Coronary Artery Disease [Internet]. PLoS Genet 2015;11(10):e1005622.
Large genome-wide association studies (GWAS) have identified many genetic loci associated with risk for myocardial infarction (MI) and coronary artery disease (CAD). Concurrently, efforts such as the National Institutes of Health (NIH) Roadmap Epigenomics Project and the Encyclopedia of DNA Elements (ENCODE) Consortium have provided unprecedented data on functional elements of the human genome. In the present study, we systematically investigate the biological link between genetic variants associated with this complex disease and their impacts on gene function. First, we examined the heritability of MI/CAD according to genomic compartments. We observed that single nucleotide polymorphisms (SNPs) residing within nearby regulatory regions show significant polygenicity and contribute 59%-71% of the heritability for MI/CAD. Second, we showed that the polygenicity and heritability explained by these SNPs are enriched in histone modification marks in specific cell types. Third, we found that a significantly higher-than-expected number of the 45 MI/CAD-associated SNPs identified in large-scale GWAS reside within certain functional elements of the genome, particularly in active enhancer and promoter regions. Finally, we observed significant heterogeneity of this signal across cell types, with strong signals observed within adipose nuclei, as well as brain and spleen cell types. These results suggest that the genetic etiology of MI/CAD is largely explained by tissue-specific regulatory perturbation within the human genome.
Ombrello MJ, Remmers EF, Tachmazidou I, Grom A, Foell D, Haas J-P, Martini A, Gattorno M, Özen S, Prahalad S, Zeft AS, Bohnsack JF, Mellins ED, Ilowite NT, Russo R, Len C, Hilario MOE, Oliveira S, Yeung RSM, Rosenberg A, Wedderburn LR, Anton J, Schwarz T, Hinks A, Bilginer Y, Park J, Cobb J, Satorius CL, Han B, Baskin E, Signa S, Duerr RH, Achkar JP, Kamboh IM, Kaufman KM, Kottyan LC, Pinto D, Scherer SW, Alarcón-Riquelme ME, Docampo E, Estivill X, Gül A, British Society of Pediatric and Adolescent Rheumatology (BSPAR) Study Group, Randomized Placebo Phase Study of Rilonacept in sJIA (RAPPORT) Investigators, Sparks-Childhood Arthritis Response to Medication Study (CHARMS) Group, Biologically Based Outcome Predictors in JIA (BBOP) Group, de Bakker PIW, Raychaudhuri S, Langefeld CD, Thompson S, Zeggini E, Thomson W, Kastner DL, Woo P. HLA-DRB1*11 and variants of the MHC class II locus are strong risk factors for systemic juvenile idiopathic arthritis [Internet]. Proc Natl Acad Sci U S A 2015;112(52):15970-5.
Systemic juvenile idiopathic arthritis (sJIA) is an often severe, potentially life-threatening childhood inflammatory disease, the pathophysiology of which is poorly understood. To determine whether genetic variation within the MHC locus on chromosome 6 influences sJIA susceptibility, we performed an association study of 982 children with sJIA and 8,010 healthy control subjects from nine countries. Using meta-analysis of directly observed and imputed SNP genotypes and imputed classic HLA types, we identified the MHC locus as a bona fide susceptibility locus with effects on sJIA risk that transcended geographically defined strata. The strongest sJIA-associated SNP, rs151043342 [P = 2.8 × 10(-17), odds ratio (OR) 2.6 (2.1, 3.3)], was part of a cluster of 482 sJIA-associated SNPs that spanned a 400-kb region and included the class II HLA region. Conditional analysis controlling for the effect of rs151043342 found that rs12722051 independently influenced sJIA risk [P = 1.0 × 10(-5), OR 0.7 (0.6, 0.8)]. Meta-analysis of imputed classic HLA-type associations in six study populations of Western European ancestry revealed that HLA-DRB1*11 and its defining amino acid residue, glutamate 58, were strongly associated with sJIA [P = 2.7 × 10(-16), OR 2.3 (1.9, 2.8)], as was the HLA-DRB1*11-HLA-DQA1*05-HLA-DQB1*03 haplotype [P = 6.4 × 10(-17), OR 2.3 (1.9, 2.9)]. By examining the MHC locus in the largest collection of sJIA patients assembled to date, this study solidifies the relationship between the class II HLA region and sJIA, implicating adaptive immune molecules in the pathogenesis of sJIA.
Finucane HK, Bulik-Sullivan B, Gusev A, Trynka G, Reshef Y, Loh P-R, Anttila V, Xu H, Zang C, Farh K, Ripke S, Day FR, Schizophrenia Working Group of the Psychiatric Genomics Consortium, Purcell S, Stahl E, Lindstrom S, Perry JRB, Okada Y, Raychaudhuri S, Daly MJ, Patterson N, Neale BM, Price AL. Partitioning heritability by functional annotation using genome-wide association summary statistics [Internet]. Nat Genet 2015;47(11):1228-35.
Recent work has demonstrated that some functional categories of the genome contribute disproportionately to the heritability of complex diseases. Here we analyze a broad set of functional elements, including cell type-specific elements, to estimate their polygenic contributions to heritability in genome-wide association studies (GWAS) of 17 complex diseases and traits with an average sample size of 73,599. To enable this analysis, we introduce a new method, stratified LD score regression, for partitioning heritability from GWAS summary statistics while accounting for linked markers. This new method is computationally tractable at very large sample sizes and leverages genome-wide information. Our findings include a large enrichment of heritability in conserved regions across many traits, a very large immunological disease-specific enrichment of heritability in FANTOM5 enhancers and many cell type-specific enrichments, including significant enrichment of central nervous system cell types in the heritability of body mass index, age at menarche, educational attainment and smoking behavior.