New Research Building, 77 Avenue Louis Pasteur, Boston, MA 02215
Publications by Year: 2011
Yu Y, Bhangale TR, Fagerness J, Ripke S, Thorleifsson G, Tan PL, Souied EH, Richardson AJ, Merriam JE, Buitendijk GHS, Reynolds R, Raychaudhuri S, Chin KA, Sobrin L, Evangelou E, Lee PH, Lee AY, Leveziel N, Zack DJ, Campochiaro B, Campochiaro P, Smith TR, Barile GR, Guymer RH, Hogg R, Chakravarthy U, Robman LD, Gustafsson O, Sigurdsson H, Ortmann W, Behrens TW, Stefansson K, Uitterlinden AG, van Duijn CM, Vingerling JR, Klaver CCW, Allikmets R, Brantley MA, Baird PN, Katsanis N, Thorsteinsdottir U, Ioannidis JPA, Daly MJ, Graham RR, Seddon JM. Common variants near FRK/COL10A1 and VEGFA are associated with advanced age-related macular degeneration. Hum Mol Genet 2011;20(18):3699-709.
Despite significant progress in the identification of genetic loci for age-related macular degeneration (AMD), not all of the heritability has been explained. To identify variants which contribute to the remaining genetic susceptibility, we performed the largest meta-analysis of genome-wide association studies to date for advanced AMD. We imputed 6 036 699 single-nucleotide polymorphisms with the 1000 Genomes Project reference genotypes on 2594 cases and 4134 controls with follow-up replication of top signals in 5640 cases and 52 174 controls. We identified two new common susceptibility alleles, rs1999930 on 6q21-q22.3 near FRK/COL10A1 [odds ratio (OR) 0.87; P = 1.1 × 10(-8)] and rs4711751 on 6p12 near VEGFA (OR 1.15; P = 8.7 × 10(-9)). In addition to the two novel loci, 10 previously reported loci in ARMS2/HTRA1 (rs10490924), CFH (rs1061170 and rs1410996), CFB (rs641153), C3 (rs2230199), C2 (rs9332739), CFI (rs10033900), LIPC (rs10468017), TIMP3 (rs9621532) and CETP (rs3764261) were confirmed with genome-wide significant signals in this large study. Loci in the recently reported genes ABCA1 and COL8A1 were also detected with suggestive evidence of association with advanced AMD. The novel variants identified in this study suggest that angiogenesis (VEGFA) and extracellular collagen matrix (FRK/COL10A1) pathways contribute to the development of advanced AMD.
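As an illustration of how per-study association results can be pooled in a meta-analysis, the following is a minimal sketch of fixed-effect inverse-variance weighting of log odds ratios. The abstract does not state which pooling scheme was used, so the method, the function name `fixed_effect_meta`, and the per-study numbers are all assumptions for illustration only.

```python
import math


def fixed_effect_meta(odds_ratios, std_errors):
    """Pool per-study odds ratios by inverse-variance weighting.

    odds_ratios: per-study odds ratios for the same allele.
    std_errors:  standard errors of the corresponding log odds ratios.
    Returns (pooled OR, pooled SE of log OR, z score, two-sided p-value).
    """
    if len(odds_ratios) != len(std_errors):
        raise ValueError("need one standard error per odds ratio")
    weights = [1.0 / se ** 2 for se in std_errors]
    log_ors = [math.log(o) for o in odds_ratios]
    pooled_log_or = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled_log_or / pooled_se
    # Two-sided p-value under a standard normal null.
    p = math.erfc(abs(z) / math.sqrt(2))
    return math.exp(pooled_log_or), pooled_se, z, p


# Hypothetical per-study estimates (not taken from the paper).
pooled_or, pooled_se, z, p = fixed_effect_meta([1.12, 1.18, 1.15],
                                               [0.05, 0.04, 0.06])
print(f"pooled OR = {pooled_or:.3f}, SE = {pooled_se:.4f}, p = {p:.2e}")
```

Studies with smaller standard errors receive larger weights, so the pooled estimate always lies between the smallest and largest per-study estimates.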
Epidemiology and candidate gene studies indicate a shared genetic basis for celiac disease (CD) and rheumatoid arthritis (RA), but the extent of this sharing has not been systematically explored. Previous studies demonstrate that 6 of the established non-HLA CD and RA risk loci (out of 26 loci for each disease) are shared between both diseases. We hypothesized that there are additional shared risk alleles and that combining genome-wide association study (GWAS) data from each disease would increase power to identify these shared risk alleles. We performed a meta-analysis of two published GWAS on CD (4,533 cases and 10,750 controls) and RA (5,539 cases and 17,231 controls). After genotyping the top associated SNPs in 2,169 CD cases and 2,255 controls, and 2,845 RA cases and 4,944 controls, 8 additional SNPs demonstrated P<5 × 10(-8) in a combined analysis of all 50,266 samples, including four SNPs that have not been previously confirmed in either disease: rs10892279 near the DDX6 gene (P(combined) = 1.2 × 10(-12)), rs864537 near CD247 (P(combined) = 2.2 × 10(-11)), rs2298428 near UBE2L3 (P(combined) = 2.5 × 10(-10)), and rs11203203 near UBASH3A (P(combined) = 1.1 × 10(-8)). We also confirmed that 4 gene loci previously established in either CD or RA are associated with the other autoimmune disease at combined P<5 × 10(-8) (SH2B3, 8q24, STAT4, and TRAF1-C5). From the 14 shared gene loci, 7 SNPs showed a genome-wide significant effect on expression of one or more transcripts in the linkage disequilibrium (LD) block around the SNP. These associations implicate antigen presentation and T-cell activation as a shared mechanism of disease pathogenesis and underscore the utility of cross-disease meta-analysis for identification of genetic risk factors with pleiotropic effects between two clinically distinct diseases.
Genome-wide association studies (GWAS) have defined over 150 genomic regions unequivocally containing variation predisposing to immune-mediated disease. Inferring disease biology from these observations, however, hinges on our ability to discover the molecular processes being perturbed by these risk variants. It has previously been observed that different genes harboring causal mutations for the same Mendelian disease often physically interact. We sought to evaluate the degree to which this is true of genes within strongly associated loci in complex disease. Using sets of loci defined in rheumatoid arthritis (RA) and Crohn's disease (CD) GWAS, we build protein-protein interaction (PPI) networks for genes within associated loci and find abundant physical interactions between protein products of associated genes. We apply multiple permutation approaches to show that these networks are more densely connected than chance expectation. To confirm biological relevance, we show that the components of the networks tend to be expressed in similar tissues relevant to the phenotypes in question, suggesting the network indicates common underlying processes perturbed by risk loci. Furthermore, we show that the RA and CD networks have predictive power by demonstrating that proteins in these networks, not encoded in the confirmed list of disease associated loci, are significantly enriched for association to the phenotypes in question in extended GWAS analysis. Finally, we test our method in 3 non-immune traits to assess its applicability to complex traits in general. We find that genes in loci associated to height and lipid levels assemble into significantly connected networks but did not detect excess connectivity among Type 2 Diabetes (T2D) loci beyond chance. 
Taken together, our results constitute evidence that, for many of the complex diseases studied here, common genetic associations implicate regions encoding proteins that physically interact in a preferential manner, in line with observations in Mendelian disease.
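The idea of testing whether a network is "more densely connected than chance expectation" can be sketched with a simple permutation test. The version below compares the number of interactions among a candidate gene set with the number found in random gene sets of the same size. This is a simplified null chosen for illustration; the permutation schemes used in the study may differ, and the gene names, interaction list, and function names here are hypothetical.

```python
import itertools
import random


def count_edges(genes, interactions):
    """Count interactions whose two endpoints are both in `genes`."""
    gene_set = set(genes)
    return sum(1 for a, b in interactions if a in gene_set and b in gene_set)


def permutation_p_value(candidate_genes, all_genes, interactions,
                        n_perm=1000, seed=0):
    """Empirical p-value for the observed edge count among candidates.

    Null model (an assumption of this sketch): random gene sets of the
    same size drawn without replacement from `all_genes`.
    """
    rng = random.Random(seed)
    observed = count_edges(candidate_genes, interactions)
    k = len(candidate_genes)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(all_genes, k)
        if count_edges(sample, interactions) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data: 50 hypothetical genes; G0..G4 form a fully connected cluster,
# and the remaining genes are linked in a sparse chain.
all_genes = [f"G{i}" for i in range(50)]
interactions = list(itertools.combinations(all_genes[:5], 2))
interactions += [(f"G{i}", f"G{i + 1}") for i in range(5, 49)]

observed, p_value = permutation_p_value(all_genes[:5], all_genes, interactions)
print(observed, p_value)
```

A null that ignores how many interaction partners each protein has can overstate significance for well-studied proteins, which is one reason more careful permutation schemes are used in practice.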
Primary sclerosing cholangitis (PSC) is a chronic cholestatic liver disease characterized by inflammation and fibrosis of the bile ducts. Both environmental and genetic factors contribute to its pathogenesis. To further clarify its genetic background, we investigated susceptibility loci recently identified for ulcerative colitis (UC) in a large cohort of 1,186 PSC patients and 1,748 controls. Single nucleotide polymorphisms (SNPs) tagging 13 UC susceptibility loci were initially genotyped in 854 PSC patients and 1,491 controls from Benelux (331 cases, 735 controls), Germany (265 cases, 368 controls), and Scandinavia (258 cases, 388 controls). Subsequently, a joint analysis was performed with an independent second Scandinavian cohort (332 cases, 257 controls). SNPs at chromosomes 2p16 (P-value 4.12 × 10(-4)), 4q27 (P-value 4.10 × 10(-5)), and 9q34 (P-value 8.41 × 10(-4)) were associated with PSC in the joint analysis after correcting for multiple testing. In PSC patients without inflammatory bowel disease (IBD), SNPs at 4q27 and 9q34 were nominally associated (P < 0.05). We applied additional in silico analyses to identify likely candidate genes at PSC susceptibility loci. To identify nonrandom, evidence-based links, we used GRAIL (Gene Relationships Across Implicated Loci) analysis, which showed interconnectivity between genes in six of the nine PSC-associated regions. Expression quantitative trait analysis from 1,469 Dutch and UK individuals demonstrated that five of the nine SNPs had an effect on cis-gene expression. These analyses prioritized IL2, CARD9, and REL as novel candidates.
CONCLUSION: We have identified three UC susceptibility loci to be associated with PSC, harboring the putative candidate genes REL, IL2, and CARD9. These results add to the scarce knowledge on the genetic background of PSC and imply an important role for both innate and adaptive immunological factors.
A common allele at the TAGAP gene locus demonstrates a suggestive, but not conclusive association with risk of rheumatoid arthritis (RA). To fine map the locus, we conducted comprehensive imputation of CEU HapMap single-nucleotide polymorphisms (SNPs) in a genome-wide association study (GWAS) of 5,500 RA cases and 22,621 controls (all of European ancestry). After controlling for population stratification with principal components analysis, the strongest signal of association was to an imputed SNP, rs212389 (P=3.9 × 10(-8), odds ratio=0.87). This SNP remained highly significant upon conditioning on the previous RA risk variant (rs394581, P=2.2 × 10(-5)) or on a SNP previously associated with celiac disease and type I diabetes (rs1738074, P=1.7 × 10(-4)). Our study has refined the TAGAP signal of association to a single haplotype in RA, and in doing so provides conclusive statistical evidence that the TAGAP locus is associated with RA risk. Our study also underscores the utility of comprehensive imputation in large GWAS data sets to fine map disease risk alleles.
Discovering and following up on genetic associations with complex phenotypes require large patient cohorts. This is particularly true for patient cohorts of diverse ancestry and clinically relevant subsets of disease. The ability to mine the electronic health records (EHRs) of patients followed as part of routine clinical care provides a potential opportunity to efficiently identify affected cases and unaffected controls for appropriate-sized genetic studies. Here, we demonstrate proof-of-concept that it is possible to use EHR data linked with biospecimens to establish a multi-ethnic case-control cohort for genetic research of a complex disease, rheumatoid arthritis (RA). In 1,515 EHR-derived RA cases and 1,480 controls matched for both genetic ancestry and disease-specific autoantibodies (anti-citrullinated protein antibodies [ACPA]), we demonstrate that the odds ratios and aggregate genetic risk score (GRS) of known RA risk alleles measured in individuals of European ancestry within our EHR cohort are nearly identical to those derived from a genome-wide association study (GWAS) of 5,539 autoantibody-positive RA cases and 20,169 controls. We extend this approach to other ethnic groups and identify a large overlap in the GRS among individuals of European, African, East Asian, and Hispanic ancestry. We also demonstrate that the distribution of a GRS based on 28 non-HLA risk alleles in ACPA+ cases partially overlaps with that of the ACPA- subgroup of RA cases. Our study demonstrates that the genetic basis of rheumatoid arthritis risk is similar among cases of diverse ancestry divided into subsets based on ACPA status and emphasizes the utility of linking EHR clinical data with biospecimens for genetic studies.
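An aggregate genetic risk score is often computed as a weighted sum: each risk allele count is multiplied by the log of that allele's odds ratio, and the products are added up. The abstract does not spell out the weighting used in the study, so the formula below is an assumption, and the allele counts and odds ratios are hypothetical values for illustration.

```python
import math


def genetic_risk_score(allele_counts, odds_ratios):
    """Weighted genetic risk score (assumed form: sum of count * ln(OR)).

    allele_counts: number of risk alleles carried at each SNP (0, 1, or 2).
    odds_ratios:   per-allele odds ratio for each SNP, in the same order.
    """
    if len(allele_counts) != len(odds_ratios):
        raise ValueError("need one odds ratio per SNP")
    return sum(count * math.log(odds_ratio)
               for count, odds_ratio in zip(allele_counts, odds_ratios))


# Hypothetical individual genotyped at three risk SNPs.
score = genetic_risk_score([2, 1, 0], [1.20, 1.10, 1.35])
print(f"GRS = {score:.3f}")
```

Weighting by the log odds ratio lets alleles with larger effects contribute more to the score than alleles with modest effects; an unweighted score would simply count risk alleles.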
Two common variants in the gene encoding complement factor H (CFH), the Y402H substitution (rs1061170, c.1204C>T)(1-4) and the intronic rs1410996 SNP(5,6), explain 17% of age-related macular degeneration (AMD) liability. However, proof for the involvement of CFH, as opposed to a neighboring transcript, and knowledge of the potential mechanism of susceptibility alleles are lacking. Assuming that rare functional variants might provide mechanistic insights, we used genotype data and high-throughput sequencing to discover a rare, high-risk CFH haplotype with a c.3628C>T mutation that resulted in an R1210C substitution. This allele has been implicated previously in atypical hemolytic uremic syndrome, and it abrogates C-terminal ligand binding(7,8). Genotyping R1210C in 2,423 AMD cases and 1,122 controls demonstrated high penetrance (present in 40 cases versus 1 control, P = 7.0 × 10(-6)) and an association with a 6-year-earlier onset of disease (P = 2.3 × 10(-6)). This result suggests that loss-of-function alleles at CFH are likely to drive AMD risk. This finding represents one of the first instances in which a common complex disease variant has led to the discovery of a rare penetrant mutation.
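The carrier counts reported above (40 of 2,423 cases versus 1 of 1,122 controls) form a 2×2 table, and one common way to test such a table with a sparse cell is Fisher's exact test. The sketch below implements a two-sided version from the hypergeometric distribution. The abstract does not say which test produced the reported P value, so this is an assumed choice of test and its result need not match the published figure.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: carrier cases      b: non-carrier cases
    c: carrier controls   d: non-carrier controls
    Returns (sample odds ratio, p-value).
    """
    cases, controls, carriers = a + b, c + d, a + c
    total_tables = comb(cases + controls, carriers)

    def prob(x):
        # Hypergeometric probability of x carriers among the cases.
        return comb(cases, x) * comb(controls, carriers - x) / total_tables

    lowest = max(0, carriers - controls)
    highest = min(carriers, cases)
    p_observed = prob(a)
    # Sum every table that is no more probable than the observed one.
    p_value = sum(prob(x) for x in range(lowest, highest + 1)
                  if prob(x) <= p_observed * (1 + 1e-9))
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(1.0, p_value)


# Counts from the abstract: 40 carriers among 2,423 cases, 1 among 1,122 controls.
odds_ratio, p_value = fisher_exact_two_sided(40, 2423 - 40, 1, 1122 - 1)
print(f"odds ratio = {odds_ratio:.1f}, p = {p_value:.2e}")
```

Python integers have arbitrary precision, so the binomial coefficients are computed exactly before the final division.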
Although genome-wide association studies have implicated many individual loci in complex diseases, identifying the exact causal alleles and the cell types within which they act remains greatly challenging. To ultimately understand disease mechanism, researchers must carefully conceive functional studies in relevant pathogenic cell types to demonstrate the cellular impact of disease-associated genetic variants. This challenge is highlighted in autoimmune diseases, such as rheumatoid arthritis, where any of a broad range of immunological cell types might potentially be impacted by genetic variation to cause disease. To this end, we developed a statistical approach to identify potentially pathogenic cell types in autoimmune diseases by using a gene-expression data set of 223 murine-sorted immune cells from the Immunological Genome Consortium. We found enrichment of transitional B cell genes in systemic lupus erythematosus (p = 5.9 × 10(-6)) and epithelial-associated stimulated dendritic cell genes in Crohn disease (p = 1.6 × 10(-5)). Finally, we demonstrated enrichment of CD4+ effector memory T cell genes within rheumatoid arthritis loci (p < 10(-6)). To further validate the role of CD4+ effector memory T cells within rheumatoid arthritis, we identified 436 loci that were not yet known to be associated with the disease but that had a statistically suggestive association in a recent genome-wide association study (GWAS) meta-analysis (p(GWAS) < 0.001). Even among these putative loci, we noted a significant enrichment for genes specifically expressed in CD4+ effector memory T cells (p = 1.25 × 10(-4)). These cell types are primary candidates for future functional studies to reveal the role of risk alleles in autoimmunity. Our approach has application in other phenotypes, outside of autoimmunity, where many loci have been discovered and high-quality cell-type-specific gene expression is available.
Advances in genotyping and sequencing technologies have revolutionized the genetics of complex disease by locating rare and common variants that influence an individual's risk for diseases, such as diabetes, cancers, and psychiatric disorders. However, capitalizing on these data for prevention and therapies requires the identification of causal alleles and a mechanistic understanding of how these variants contribute to the disease. After discussing the strategies currently used to map variants for complex diseases, this Primer explores how variants may be prioritized for follow-up functional studies and the challenges and approaches for assessing the contributions of rare and common variants to disease phenotypes.
MOTIVATION: As disease loci are rapidly discovered, an emerging challenge is to identify common pathways and biological functionality across loci. Such pathways might point to potential disease mechanisms. One strategy is to look for functionally related or interacting genes across genetic loci. Previously, we defined a statistical strategy, Gene Relationships Across Implicated Loci (GRAIL), to identify whether pair-wise gene relationships defined using PubMed text similarity are enriched across loci. Here, we have implemented VIZ-GRAIL, a software tool to display those relationships and to depict the underlying biological patterns.
RESULTS: Our tool can seamlessly interact with the GRAIL web site to obtain the results of analyses and create easy-to-read visual displays. To display results most clearly, VIZ-GRAIL arranges genes and genetic loci to minimize intersecting pair-wise gene connections. VIZ-GRAIL can be easily applied to other types of functional connections, beyond those from GRAIL. This method should help investigators appreciate the presence of potentially important common functions across loci.
AVAILABILITY: The GRAIL algorithm is implemented online at http://www.broadinstitute.org/mpg/grail/grail.php. VIZ-GRAIL source code is available at http://www.broadinstitute.org/mpg/grail/vizgrail.html.