It is widely believed that ubiquitous digital information will transform the very nature of research and education. The reasons for this excitement are clear: In essentially every field of science, simulations, experiments, instruments, observations, sensors, and surveys are generating exponentially growing data volumes. Information from different sources and fields can be combined to permit new modes of discovery. Data, including critical metadata and associated software models, can capture the precise scientific content of the processes that generated them, permitting analysis, reuse, and reproducibility. By digitizing communication among scientists and citizens, discoverable and shareable data can enable collaboration across communities and support repurposing for new discoveries and cross-disciplinary research. Open, shareable data also promise to transform education, society, and economic development.
The NDS Framework
While some communities are making progress in developing discipline-specific data services, the U.S. and international scientific communities lack a unified framework and supporting services for storing, sharing, and publishing data; for locating data; or for verifying data. More specifically, we are lacking standard means of accessing data, software, tools, metadata, and other project materials that can span across disciplines. These capability gaps make it difficult to build on prior research or to reproduce the results of a scientific publication. Hence, the promise of the data revolution—for rapid discovery, cross-disciplinary research, and increased reproducibility—remains largely unfulfilled. To break this logjam, the nation urgently needs an open framework that supports an integrated set of national-scale services to individually and collectively enable the efficient, convenient, and secure storage, sharing, publication, discovery, verification, and attribution of data by individuals, groups, and large collaborations. This framework and services will constitute a National Data Service (NDS). If these services are embedded within an extensible NDS architecture allowing numerous tools and community-specific services to enhance NDS over time, then we can realize a research environment where access to and citation of data is as useful and necessary as it is for published literature.
How will this help researchers?
- We can provide a generic data portal that allows one to work with data from across different disciplines
- Discipline-specific portals can enhance their capabilities by accessing data and services from related disciplines
- Researchers will have common practices for publishing data that engage interoperable and interchangeable repositories and tools
- Interoperable tools can be created to automate many of our current data-handling chores
For more details, see our description of envisioned NDS capabilities.
The NDS Consortium
To be successful, a National Data Service will need to build on the federations, the cyberinfrastructure, the institutional and community archiving, and the best practices already taking hold in the varied research communities today. To this end, a growing group of universities, academic federations, federally funded projects, and publishers are collaborating to form an NDS Consortium that can guide the development of a National Data Service, and we are inviting interested organizations to join. In particular, we envision an initial membership that samples a broad variety of stakeholders, including:
Important journals will partner with NDS to create links between publications, data, software, and associated digital products, raising the bar for information provided and reproducibility of scientific results.
Many academic institutions have strong, ongoing efforts to provide long-term archives for digital assets. Building a strong connection to Internet2 will be key to reaching researchers at a large number of universities.
National and regional computing centers can build deep and often missing links between the data and computing communities and enable new compute-intensive data services, accessible through high-speed network links.
Data-driven research typically knows no boundaries, and international collaborations are often a necessity. NDS will engage international partners, including the Research Data Alliance (RDA) and data infrastructure projects like EUDAT, to ensure interoperability of data services across global communities.
Focused projects and discipline-specific data federations provide capabilities and services tailored to their target communities. In aggregate, they address a broad set of data types, data volumes, and requirements. From large, highly organized projects, to MREFC projects, to various "long tail" community projects, the NDS seeks to provide functional and practical connections, share solutions, and enable multi-disciplinary research.
Both the not-for-profit and commercial sectors increasingly live off data; some are heavy consumers, and some offer resources to researchers. Improving the accessibility of data to industry will be key for economic growth in the digital era. We seek strategic partnerships with industry, including through private sector programs on university campuses.
We see this Consortium as a forum for coordinating the various community efforts and federally funded programs (such as NSF's DataNet and DIBBs) to connect a web of archives, repositories, services, and computing platforms that can work together.