The integration and processing of increasing amounts of data is a major challenge for the biomedical industry. Scientists are often faced with complex problems such as large study populations, multiple environmental exposure mixtures, and numerous socioeconomic and geographic factors.
The National Institutes of Health (NIH) aims to ensure that researchers have access to that information and to encourage greater discovery. Its Office of Data Science Strategy created the Data and Technology Advancement (DATA) program. DATA scholars work at NIH institutes or centers for one to two years and tackle high-priority issues.
Lara Clark, Ph.D., recently joined NIEHS under the umbrella of a DATA project titled “Use geospatial data to protect environmental public health,” bringing knowledge from her background in civil and environmental engineering. Geospatial data refers to data that can be linked to a specific area.
Diverse populations and diverse exposures
The goal is to combine large datasets that include diverse populations and environmental exposures so that the scientific community can take a deeper dive and conduct research that ultimately improves public health worldwide.
“We have to try to bring together different types of geospatial data on a range of topics, from air pollution to green space to extreme temperature, all from different sources and on different spatial and temporal scales,” noted Clark.
Charles Schmitt, Ph.D., director of the NIEHS Office of Data Science, served on Clark’s hiring committee and will be her administrative supervisor. “We had a number of good applicants,” he said. “We were fortunate to get Lara for two years.”
Interactions between genes and the environment
Clark’s first order of business is to work on the NIEHS Personalized Environment and Genes Study (PEGS), an ongoing project that collects detailed health, exposure, genetic and other information from a diverse group of 20,000 North Carolinians. The study aims to increase knowledge about how gene-environment interactions affect health and eventually to provide personalized risk assessments to participants.
Clark will focus on this group because of the whole-genome sequencing of nearly 5,000 people. She will link de-identified participants’ location information with other geospatial data to improve exposure analysis for this and other studies.
“Lara is looking at how to build tools on top of what we have done to allow other studies to access the platforms and data that we have developed,” said Schmitt. “We want researchers everywhere to be able to scale up what we have built through PEGS. It is a kind of test bed for what we can do for the world in other areas.”
Tools to enhance research
Clark said that one of her major tasks is to improve data integration so scientists can answer current and future research questions.
“It is an ongoing challenge for researchers to make sense of what data are available and to integrate such information in ways that are useful,” she noted. “My goals are to streamline those efforts and to bring clarity to complex research questions.”
“We want to build tools and online data platforms that can be sustained throughout the years,” Schmitt added.
Global scientific discovery
Clark will also work to make non-NIH data easily accessible to NIH grantees and scientists. For example, federal agencies, universities and other countries collect valuable information about environmental exposures based on location and time, which researchers call geospatial-temporal data.
“There is a growing need in population-based environmental health science to incorporate data collected and maintained in geospatial-temporal frameworks,” said David Fargo, Ph.D. He is the director of the NIEHS Office of Environmental Science Cyberinfrastructure (OESC) and served on Clark’s hiring committee. “Lara is going to try to make very heterogeneous data interpretable and usable by public health experts.”
Learn more about geospatial information and its use in support of environmental health sciences research.
“At NIEHS, there is a lot of expertise on both the data science and the environmental health sides,” said Clark. “The good news is that with a lot of people working on these problems, there is an increasing number of tools and resources at our disposal.”
John Yewell is a contract writer at the NIEHS Office of Communications and Public Liaison.