From GWAS to ENCODE and Beyond — Recognizing DNA Functional Elements with Direct Relevance to Rheumatic, Skin, and Musculoskeletal Diseases

Meeting Summary

March 5, 2014

Introduction

The overall goal of all NIAMS roundtables is to discuss scientific and clinical needs, and to listen to the concerns and challenges facing the scientific community. These sessions provide a valuable source of input for the NIAMS planning process. This specific roundtable explored the potential value of genome-wide data to define functional elements of the genome for research in NIAMS mission areas.

Background

A growing body of data, arising from genome-wide analysis of transcription, chromatin organization, and epigenetic marks, has been widely interpreted as defining functional elements in the human genome. The Encyclopedia of DNA Elements (ENCODE) Project, sponsored by the National Human Genome Research Institute (NHGRI), and the NIH Common Fund Epigenomics Program have both contributed to a growing catalogue of functional elements in the human genome. Such functional elements include transcribed regions of DNA (coding and non-coding), promoters, and enhancers. These data have the potential to help in explaining the biological mechanisms underlying the genetic associations identified in genome-wide association studies (GWAS), and to illuminate many fundamental aspects of gene expression. However, relatively few ENCODE datasets reflect the functional status of tissues and cell types with direct relevance to rheumatic, skin, and musculoskeletal diseases. Because functional genomic elements show a high degree of cell and tissue specificity, the utility of currently available data for the study of diseases in the NIAMS mission may be limited.

The purpose of the meeting was to:

assess the potential significance of genomic functional elements, as currently defined, for advancing future research in NIAMS mission areas; and
consider how the utility of this type of data can be improved to facilitate research in rheumatic, skin, and musculoskeletal biology and diseases.

Participants in the three-hour webinar included scientists with expertise in the study of specific diseases and tissues; those familiar with genome-wide analysis of chromatin organization; and bioinformatics experts.

Summary of ENCODE Resources

The meeting opened with a broad overview of current ENCODE resources and scientific applications for diseases within the NIAMS mission areas. NHGRI/ENCODE staff discussed the rationale behind the ENCODE project, the goals the current resources are meant to fulfill, and upcoming changes to data use policies. The discussion focused on the uses of ENCODE data for hypothesis generation, determining the relationship of regulatory elements with the corresponding target gene(s), and predicting which cell types are involved in disease phenotypes. A brief tutorial of the ENCODE portal introduced participants to the broad array of tools available to researchers.

Significance of the ENCODE Approach

All participants were enthusiastic about the potential of the ENCODE approach to translate genetic and epigenetic information into plausible functional roles for disease-associated genomic regions. The community is currently using ENCODE data to examine the genetics of rheumatic, skin, and musculoskeletal diseases, as well as to understand basic biology. Improved cell sorting technologies are allowing for collection of better-defined populations of cells, and fewer cells are needed to generate libraries for analysis. However, library preparation continues to be challenging and costly. As new methodologies come online, service labs that specialize in delivering ENCODE-ready materials may drive costs down, as has occurred with sequencing technologies. Participants noted that some of the more recent technologies in the ENCODE resources, such as chromatin interaction analyses for chromosomal DNA/DNA interactions, provide very useful information but require larger quantities of DNA, and therefore larger samples. In addition, challenges in data submission need to be addressed. Standardization of human subjects consent documents and open data access could facilitate increased usage of ENCODE data. Community data submission to the ENCODE databases is a work in progress. Further, data standards for community submissions, especially for metadata, need to be developed, along with benchmark standards to reduce variability. The ENCODE Data Coordination Center may require additional resources to meet these needs and enable community data submission.

The group also discussed the usability of the current ENCODE resources. While ENCODE is able to provide a variety of analyses, specialized bioinformatics and computational expertise is increasingly needed to process the complex genomic data. This impedes integration of ENCODE-type data into the ongoing experiments of laboratories with limited expertise, and requires significant investment by laboratories wishing to enter the field, either in staff time and training or in obtaining the expertise from outside individuals. A welcome addition would be improved tutorials to assist investigators in learning how to integrate ENCODE data into their projects.

Another barrier to the application of ENCODE data to NIAMS diseases arises from the fact that the majority of cell types analyzed to date are from either transformed cell lines or healthy subjects. Thus, genetic variants that are associated with specific diseases may not be well represented. In addition, the current collection has limited representation of primary cells from human tissues, especially those with relevance to NIAMS diseases. Participants discussed emerging technologies that allow the analysis of small numbers of cells and even single cells. Currently, there are no plans for generating ENCODE-type data at the single-cell level under the ENCODE umbrella. Furthermore, individual labs are generating relevant data outside the ENCODE project; this creates further challenges for standardization, reproducibility, and data sharing in a broader sense.

Cell and Tissue Targets

The group discussed the various sources of materials for generating ENCODE-type data, including the relative utility of cultured, sorted, scaffold-grown, in vivo, and grafted tissues. In general, all sources are valid starting points, but careful consideration should be given to factors that could affect the data. In particular, in vitro expansion of primary cells could greatly affect the transcriptional state of the cells, and data should be monitored to determine whether the culture conditions introduce artifacts. The age of the donor, differentiation state of the cell, sample site, disease stage, and relative heterogeneity of the tissue should all be defined before initiating a new study. Participants noted that certain types of cell lines, such as Epstein-Barr virus-transformed lines, were useful as a screening tool, but results would need to be confirmed in primary cells.

A noted need for NIAMS mission areas was generating data from better-defined cells, as opposed to expanded studies on poorly defined populations already in the ENCODE database. Several specific cell types were discussed as gaps in the current data sets that would have broad applicability to the NIAMS mission areas, including specialized cells from bone and joint (e.g., chondrocytes, synoviocytes, osteocytes), better-defined cells from skin (which includes more than 60 types of cells), and more data from non-malignant cells, as well as immune infiltrates from tissues targeted by skin and rheumatic diseases.

The relative utility of data obtained from mouse samples or grafts was also considered. Participants felt that studies in mice had value and should continue to be supported. Animal models allow exploration of carefully controlled disease states that may not be obtainable from human subjects. However, animal data must be verified in human tissues, because the correlation between animal and human findings is not always consistent.

Future Considerations

Participants noted several concepts that will be critical considerations as epigenomic research moves forward. Understanding the epigenetic changes of normal versus pathogenic states within tissues will enhance understanding of disease initiation and progression. Making data more accessible and practical to use will encourage integration of these types of data into existing research. In addition, the future uses of genomic and epigenomic data generated should be clarified to ensure that current research projects catalog data in such a way that subsequent projects can build upon them. NIAMS will continue to engage with the community to consider whether future endeavors in epigenomic research specific to NIAMS mission areas are needed.

Participants

CHRISTIANO, Angela M., Ph.D., FACMG, Columbia University
ELDER, James T., M.D., Ph.D., University of Michigan Medical School
FEGHALI-BOSTWICK, Carol A., Ph.D., Medical University of South Carolina
FEINGOLD, Elise, Ph.D., National Human Genome Research Institute, National Institutes of Health
FIRESTEIN, Gary S., M.D., University of California-San Diego
GAFFNEY, Patrick M., M.D., Oklahoma Medical Research Foundation
GREGERSEN, Peter K., M.D., Feinstein Institute for Medical Research
GRANT, Struan F.A., Ph.D., University of Pennsylvania, Children’s Hospital of Philadelphia Research Institute
HANKENSON, Kurt D., Ph.D., University of Pennsylvania
KHAVARI, Paul A., M.D., Ph.D., Stanford University School of Medicine
LEFEBVRE, Veronique M., Ph.D., Cleveland Clinic, Lerner Research Institute
LOTZ, Martin K., M.D., The Scripps Research Institute
PAZIN, Michael, Ph.D., National Human Genome Research Institute, National Institutes of Health
PIKE, J. Wesley, Ph.D., University of Wisconsin-Madison
RAYCHAUDHURI, Soumya, M.D., Ph.D., Broad Institute of MIT and Harvard, Harvard Medical School
TAPSCOTT, Stephen J., M.D., Ph.D., Fred Hutchinson Cancer Research Center
THOMPSON, Susan D., Ph.D., Cincinnati Children’s Hospital Medical Center

NIAMS

BAKER, Carl C., M.D., Ph.D. (Co-Chair)
CARTER, Robert H., M.D.
KATZ, Stephen I., M.D., Ph.D.
KESTER, Mary Beth, M.S.
LINDE, Anita M., M.P.P.
McGOWAN, Joan A., Ph.D.
REUSS, Andreé (Reaya) E., M.S.
SARTORELLI, Vittorio, M.D.
SHARROCK, William J., Ph.D. (Co-Chair)
SERRATE-SZTEIN, Susana A., M.D.
WANG, Yan, M.D., Ph.D. (Co-Chair)
