Projects and Pilots

NDS sponsors pilot projects and engages in collaborative (funded) efforts to help build the NDS community and prototype the NDS infrastructure. Would you like to collaborate with NDS? Check out our piloting services.

Featured Projects

Bridging Big Data

In collaboration with the Midwest Big Data Hub (MBDH) and the Bridging Big Data initiative, NDS is hosting an active database of National Bridge Inventory (NBI) data along with sample Jupyter analysis environments to accelerate bridge health research. The NBI database, based on publicly available Federal Highway Administration (FHWA) data, and the sample notebooks are being developed by researchers at the University of Nebraska at Omaha.


Einstein Toolkit School

In cooperation with the Einstein Toolkit community, NDS is piloting an instance of the Labs Workbench platform to facilitate training and education of researchers using the Einstein Toolkit software for relativistic astrophysics. Using Labs Workbench, users are able to run through a set of tutorials. Labs Workbench was also used as part of the 2017 Einstein Toolkit School and Workshop.


PI4 Computational Bootcamp

In cooperation with the ARPA-E TERRA-REF project, Labs Workbench was used as the primary platform for the two-week Computational Bootcamp for the Program for Interdisciplinary and Industrial Internships at Illinois (PI4). Students used Workbench to learn a variety of data science techniques leveraging RStudio, Jupyter, and OpenRefine software. A central feature of the bootcamp was access to the TERRA-REF reference dataset.


ThinkChicago Civic Tech Challenge

The 2017 ThinkChicago Civic Tech Challenge engaged participants to propose innovative ways to address challenges related to urban sustainability, transportation, and civic engagement. The Labs Workbench platform was used to give participants quick and easy shared access to a number of City of Chicago datasets along with analysis and development software.


Active Projects

ARPA-E TERRA-REF

TERRA-REF uses advanced crop analytics to accelerate breeding and the commercial release of high-yield bioenergy sorghum hybrids. The project uses NDS Labs Workbench to launch analysis environments from within the dataset viewer. It also uses NDS Labs Workbench for training. Tutorial sessions provide participants with hands-on experience using specialized Jupyter Notebook and RStudio environments to analyze TERRA-REF data products.

For more information, see terraref.org.


Educational Workbench for Data Curation

The Data Curation Educational Workbench provides a platform for students to gain hands-on experience with data curation software and tools using the NDS Labs Workbench. The platform brings together a core set of tools to support data curation learning objectives and allows both on-campus and online students to gain experience and experiment with the tools without the overhead of setup and administration. The workbench will be piloted with students at the University of Washington Information School enrolled in the Master of Library and Information Science program and the Master of Information Management program, with regular offerings of online sections. After the initial pilot phase, including evaluation and iterative improvement, the workbench will be made available to the broader educational community, particularly Information Schools and other programs offering curricula in data curation, data management, and data science. A later phase of development would be required to extend the platform for use by practitioners as part of self-guided professional development in data curation.


iSEE Crops in silico

NDS is partnering with University of Illinois faculty to build a user-friendly platform for plant scientists around the globe who are working on the food security challenge.

As the Earth's population climbs toward 9 billion by 2050 and the world climate continues to change, affecting temperatures, weather patterns, water supply, and even the seasons, future food security has become a grand world challenge. Accurate prediction of how food crops react to climate change will play a critical role in ensuring food security. An ability to computationally mimic the growth, development, and response of plants to the environment will allow researchers to conduct many more experiments than can realistically be achieved in the field. Designing more sustainable crops to increase productivity depends on complex interactions between genetics, environment, and ecosystem. Therefore, creation of an in silico (computer simulation) platform that can link models across different biological scales, from cell to ecosystem level, has the potential to provide more accurate simulations of plant response to the environment than any single model could alone.

For more information, see cropsinsilico.org.


KnowEnG

KnowEnG (pronounced "knowing") is a National Institutes of Health-funded initiative that brings together researchers from the University of Illinois and the Mayo Clinic to create a Center of Excellence in Big Data Computing. It is part of the Big Data to Knowledge (BD2K) Initiative that NIH launched in 2012 to tap the wealth of information contained in biomedical Big Data. KnowEnG is one of 11 Centers of Excellence in Big Data Computing funded by NIH in 2014.

This four-year project is creating a platform where biomedical scientists, clinical researchers, and bioinformaticians can bring their own data and perform common as well as advanced analysis tasks, guided by the "knowledge network," a large compendium of public-domain data. The knowledge network embodies community data on genes, proteins, functions, species, and phenotypes, and relationships among them. Instead of analyzing their data set in an isolated fashion, researchers will be able to go straight to asking global questions. The infrastructure, capacity and tools will grow with the datasets.

For more information, see knoweng.org.


Materials Data Facility

The Materials Data Facility (MDF) is a collaboration between Globus at the University of Chicago, the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign, and the Center for Hierarchical Materials Design (CHiMaD)—a NIST-funded center of excellence. MDF is a scalable repository where materials scientists can publish, preserve, and share research data. The repository provides a focal point for the materials community, enabling publication and discovery of materials data of all sizes.

MDF is developing key data services for materials researchers with the goal of promoting open data sharing, simplifying data publication and curation workflows, encouraging data reuse, and providing powerful data discovery interfaces for data of all sizes and sources. Specifically, MDF services will allow individual researchers and institutions to 1) publish large research datasets with flexible policies; 2) publish data directly from local storage, institutional data stores, or cloud storage, without third-party publishers; 3) build extensible domain-specific metadata and automated metadata ingestion scripts for key data types; 4) develop publication workflows; 5) register a variety of resources for broader community discovery; and 6) access a discovery model that allows researchers to search, interrogate, and build upon existing published data.

For more information, see materialsdatafacility.org. MDF also runs an instance of the NIST Materials Resource Registry.


NASA ACCESS to Terra Data Fusion

Terra is the flagship of NASA's Earth Observing System. Launched in 1999, Terra's five instruments continue to gather data that enable scientists to address fundamental questions that are central to the six NASA Earth Science Research Focus Areas. Terra data are among the most popular NASA datasets, serving not only the scientific community, but also governmental, commercial, and educational communities.

The strength of the Terra mission has always been rooted in its five instruments and the ability to fuse their data, yielding higher-quality Earth science information than any individual instrument provides alone. As the data volume grows and the central Earth Science questions shift from process-oriented to climate-oriented questions, the need for data fusion and the ability for scientists to perform large-scale analytics with long records have never been greater. The challenge is particularly acute for Terra, given its growing volume of data (more than 1 petabyte), the storage of different instrument data at different archive centers, the different file formats and projection systems employed for different instrument data, and the inadequate cyberinfrastructure for scientists to access and process whole-mission fusion data (including Level 1 data). Sharing newly derived Terra products with the rest of the world also poses challenges. The ACCESS to Terra Data Fusion Products effort is developing data sharing and access protocols in step with the NDS Share vision.


NIST Materials Data Pilots

Using the NDS Labs environment, NIST developers deployed the following pilots, allowing for rapid prototyping and accurate requirements building for the production versions.


Renaissance Simulations

Using the powerful visualization and analysis package yt as an exemplar, this project is creating flexible and reusable recipes for building presentations of data customized for a particular community. Going beyond the simple splash page, this project leverages cloud technologies to put advanced interfaces in front of data. In particular, it enables scientists to safely apply custom analysis to remote data in the form of, for example, Python scripts.

This pilot effort uses NDS Labs resources to host its archive of simulations. NDS staff run a specialized server where an NDS-inspired set of tools allows users to view Jupyter Notebooks, run analyses in Dockerized containers, and add their own findings in additional Jupyter Notebooks.