Project Awards

Project Status

Project Title | Project Objective | Status | Progress | Informs NSDS Vision

A Network Scale-Up and Ad-Tech Approach Examining Workforce Participation and International Mobility in the FBSE Population
The foreign-born scientists and engineers (FBSE) projects aim to integrate data sources to fill knowledge gaps and better understand this subpopulation in the U.S. workforce. The projects test novel approaches to create data sources and demonstrate the feasibility of acquiring, analyzing, and disseminating data files to inform this and other topics within a future NSDS.
Status: Complete
Progress: Overview, Reports
Informs NSDS Vision: Access

AI-Ready Data Products to Facilitate Discovery and Use
This project explores how to make agencies' statistical data products more readily ingestible by AI technologies. It will produce an AI readiness assessment as a shared resource for any agency looking to test the machine understandability of its public data products and an AI readiness prototype tool to transform public data products into machine-understandable, AI-ready data. The project ends in April 2026.
Status: Active
Progress: Overview, Lessons
Informs NSDS Vision: Navigation, Resources

Artificial Intelligence for Enhancing Data Quality, Standardization, and Integration
This project aims to develop a set of data processing tools using AI to enhance data standardization and integration activities. The project will begin with interviews of key stakeholders in the federal statistical system to identify current best practices, data processing gaps, and confidentiality concerns. It will then prototype a user-friendly toolkit and user interface for a future NSDS, providing an accessible and unified system for agencies addressing data quality. The project ends in April 2026.
Status: Active
Progress: Overview, Lessons
Informs NSDS Vision: Resources

Building an Evidence-Based Foundation to Understand Foreign-Born Scientists and Engineers’ Participation in the US Workforce
The foreign-born scientists and engineers (FBSE) projects aim to integrate data sources to fill knowledge gaps and better understand this subpopulation in the U.S. workforce. The projects test novel approaches to create data sources and demonstrate the feasibility of acquiring, analyzing, and disseminating data files to inform this and other topics within a future NSDS.
Status: Active
Progress: Overview, Lessons
Informs NSDS Vision: Access

Building Capacity for State, Local, and Territorial Governments to Use Administrative Data for Evidence-Building
This project explores how nonfederal administrative databases could be used to produce new data products. It will prototype a tool to help jurisdictional governments ingest, visualize, and explore their own administrative data, and it will provide a report that can be used as a roadmap for other state, local, and territorial governments. The project ends in February 2026.
Status: Active
Progress: Overview, Lessons
Informs NSDS Vision: Resources

Clarivate and Stepping Blocks Proposal to Request for Solutions (RFS) Foreign Born Scientists and Engineers and the U.S. Workforce
The foreign-born scientists and engineers (FBSE) projects aim to integrate data sources to fill knowledge gaps and better understand this subpopulation in the U.S. workforce. The projects test novel approaches to create data sources and demonstrate the feasibility of acquiring, analyzing, and disseminating data files to inform this and other topics within a future NSDS.
Status: Complete
Progress: Overview, Reports
Informs NSDS Vision: Access

Creating a New Data Infrastructure for Foreign-Born Scientist and Engineers: Data, Analysis and Use
The foreign-born scientists and engineers (FBSE) projects aim to integrate data sources to fill knowledge gaps and better understand this subpopulation in the U.S. workforce. The projects test novel approaches to create data sources and demonstrate the feasibility of acquiring, analyzing, and disseminating data files to inform this and other topics within a future NSDS.
Status: Complete
Progress: Overview, Lessons, Reports
Informs NSDS Vision: Access

Creating and Validating Synthetic Data (NCSES/Census, Annual Business Survey)
This project explores two methods of producing synthetic versions of a large-scale restricted-use microdata file (NCSES’s Annual Business Survey). The two synthetic files will be compared for accuracy and quality, with one selected to undergo disclosure review for public release. This dataset will then be used in an evidence-building project and its accuracy tested using verification metrics. Lessons learned will inform future possibilities for creating synthetic data files to support a tiered access model for the NSDS. The project ends in March 2026.
Status: Active
Progress: Overview, Lessons
Informs NSDS Vision: Access

Creation of Synthetic Data and Development and Use of Verification Metrics (Survey of Earned Doctorates)
This project explores the creation of a synthetic data file, demonstrates examples of uses of synthetic data for evidence-building, and tests the use of verification metrics in validating estimates produced from synthetic data. NCSES’s Survey of Earned Doctorates, an annual census conducted since 1957 of all individuals receiving a research doctorate from an accredited U.S. institution in a given academic year, serves as the case study for this work. Lessons learned will inform future possibilities for creating synthetic data to support a tiered access model for the NSDS. The project ends in October 2025.
Status: Active
Progress: Overview, Lessons
Informs NSDS Vision: Access

Data Access Alternatives: Artificial Intelligence Supported Interfaces
This project seeks to develop and pilot an AI “chatbot” that answers natural language user questions based on public data products from federal statistical agencies. In the first part of the pilot, the team is building a Retrieval-Augmented Generation (RAG)-based system that is compatible with and builds on the open-source framework behind Google’s Data Commons. The chatbot will focus on three types of data products that represent how statistical agencies publish public data: (1) public use files, (2) data tables, and (3) analytical reports. These features are designed to make public data more accessible, useful, and relevant for a broad range of users, including those in science, policy, journalism, and more. In addition to a pilot tool, this project will record lessons learned about the size of input data tables, making statistical data “AI ready,” and engineering issues encountered while building the pilot tool. The project ends in August 2025.
Status: Active
Progress: Overview, Lessons
Informs NSDS Vision: Navigation

Data Integration to Estimate Science, Technology, Engineering and Mathematics (STEM) Attrition and Workforce Supply: A Pilot Approach Solution
This project seeks to develop an analytic approach that researchers, policymakers, and other interested parties can replicate when analyzing data from different sources (e.g., survey and administrative data, state and local data). This project uses an evidence-building question as a use case, seeking to understand the impact of STEM attrition on future STEM workforce supply. The project will result in a framework for replicating the study's approaches to using disparate data sources to answer a question. This project ends in March 2026.
Status: Active
Progress: Overview, Lessons
Informs NSDS Vision: Access, Resources

Data Protection Toolkit Use Case Analysis
This project conducted a use case analysis on the Federal Committee on Statistical Methodology's (FCSM’s) Data Protection Toolkit, holding interviews with 15 individuals working for federal agencies, state governments, and other institutions. The project resulted in feedback on the Data Protection Toolkit and recommendations for improvement. The project ended in January 2024.
Status: Complete
Progress: Overview, Lessons, Reports
Informs NSDS Vision: Resources

Development of a Prototype for the Standard Application Process Portal (Award 1 of 3)
These projects developed multiple prototypes of an online portal that allows users to search for confidential data held by federal statistical agencies and apply for access to that data, and that allows data providers to review those applications and render a decision. This portal supports the implementation of Section 3583 of the Foundations for Evidence-Based Policymaking Act of 2018. The projects ended in September 2024.
Status: Complete
Progress: Overview, Lessons
Informs NSDS Vision: Other Services

Development of a Prototype for the Standard Application Process Portal (Award 2 of 3)
These projects developed multiple prototypes of an online portal that allows users to search for confidential data held by federal statistical agencies and apply for access to that data, and that allows data providers to review those applications and render a decision. This portal supports the implementation of Section 3583 of the Foundations for Evidence-Based Policymaking Act of 2018. The projects ended in September 2024.
Status: Complete
Progress: Overview, Lessons
Informs NSDS Vision: Other Services

Development of a Prototype for the Standard Application Process Portal (Award 3 of 3)
These projects developed multiple prototypes of an online portal that allows users to search for confidential data held by federal statistical agencies and apply for access to that data, and that allows data providers to review those applications and render a decision. This portal supports the implementation of Section 3583 of the Foundations for Evidence-Based Policymaking Act of 2018. The projects ended in September 2024.
Status: Complete
Progress: Overview, Lessons
Informs NSDS Vision: Other Services

Engaging Policy Stakeholders to Inform a Future National Secure Data Service
This project seeks to identify the data needs of federal policy stakeholders as future users of an NSDS using a human-centered design approach. It will conduct a landscape analysis of the data needs within the federal policy ecosystem and conduct a detailed case study with the National Science Board. This project will result in recommendations for the navigation and data concierge services needed by policy stakeholders and a prototype service framework or policy toolkit. The project ends in November 2025.
Status: Active
Progress: Overview, Lessons
Informs NSDS Vision: Resources

Evaluation of Noise Infusion for Large-Scale Demographic Sample Survey (Survey of Doctorate Recipients)
This project seeks to evaluate noise infusion for a sample survey. It will investigate different methods for noise infusion to evaluate data quality with each method and explore public messaging surrounding noise infusion. The project will result in a noise-infused sample survey with documentation of methodology and data quality assessment. The project ends in August 2025.
Status: Active
Progress: Overview, Lessons
Informs NSDS Vision: Access

Expanding Equitable Access to Restricted-Use Data Through Federal Statistical Research Data Centers
This project explored strategies to expand access to the restricted-use data made available through Federal Statistical Research Data Centers (FSRDCs) beyond their traditional base of users at high research activity (R1) universities. The project conducted a national survey and focus groups to identify barriers to data access and potential strategies to promote data access within the FSRDCs. It resulted in a report informing future project phases, the FSRDCs, and the NSDS. The project ended in January 2025.
Status: Complete
Progress: Overview, Lessons
Informs NSDS Vision: Resources

Federated Data Usage Platform (Award 1 of 2)
This project seeks to prototype a data usage platform to illuminate instances of how federal data are being used across a wide variety of audiences and use cases. These prototypes will inform the development of a data usage platform dashboard that federal agencies can use as a shared service within the NSDS.
Status: Active
Progress: Overview, Lessons
Informs NSDS Vision: Resources

Federated Data Usage Platform (Award 2 of 2)
This project seeks to prototype a data usage platform to illuminate instances of how federal data are being used across a wide variety of audiences and use cases. These prototypes will inform the development of a data usage platform dashboard that federal agencies can use as a shared service within the NSDS.
Status: Active
Progress: Overview, Lessons
Informs NSDS Vision: Resources

Informing Evidence Building Capacity Among State, Local, Territorial, and Tribal Governments within a National Secure Data Service
This project explores how an NSDS could support capacity for evidence building among state, local, territorial, and tribal governments. "Capacity building" here refers to skill building for staff, continuous learning opportunities, and/or access to infrastructure and tools. This project will conduct a needs analysis with all 50 states as well as local, tribal, and territorial governments. The project will produce three reports: (1) a needs analysis by group; (2) a gap analysis by group; and (3) recommendations for a future NSDS. The project ends in August 2026.
Status: Active
Progress: Overview, Lessons
Informs NSDS Vision: Resources

Investigating Science, Technology, Engineering and Mathematics Retention Intervention Strategies in Education for All (I-STEM RISE 4All)
This project seeks to develop an analytic approach that researchers, policymakers, and other interested parties can replicate when analyzing data from different sources (e.g., survey and administrative data, state and local data). This project uses an evidence-building question as a use case, seeking to understand the impact of STEM attrition on future STEM workforce supply. The project will result in a framework for replicating the study's approaches to using disparate data sources to answer a question. This project ends in June 2026.
Status: Active
Progress: Overview, Lessons
Informs NSDS Vision: Access, Resources

Models for a Data Concierge Service for a National Secure Data Service
This project explored models for a data concierge service, conducting an environmental scan of service request types that federal agencies receive and interviews of federal data providers and data users to inform a data concierge service. It resulted in two or more models for a data concierge service as well as resource needs for each and potential staffing requirements. The project ended in March 2025.
Status: Complete
Progress: Overview, Lessons, Reports
Informs NSDS Vision: Navigation

National Vital Statistics System Modernization - New Opportunities for Interoperable Data
This project explored the National Vital Statistics System (NVSS) ecosystem as a way to inform shared services in a future NSDS because of the system’s experience with data interoperability, implementation of governance considerations and authorized roles and responsibilities, and tiered data access structure. The project ended in September 2024 and resulted in a final report outlining considerations for a future NSDS.
Status: Complete
Progress: Overview, Lessons, Reports
Informs NSDS Vision: Access

Opportunity #1 Foreign-Born Scientists and Engineers and the U.S. Workforce
The foreign-born scientists and engineers (FBSE) projects aim to integrate data sources to fill knowledge gaps and better understand this subpopulation in the U.S. workforce. The projects test novel approaches to create data sources and demonstrate the feasibility of acquiring, analyzing, and disseminating data files to inform this and other topics within a future NSDS.
Status: Active
Progress: Overview, Lessons
Informs NSDS Vision: Access

Privacy Preserving Technologies Phase 1: Environmental Scan
This project conducted an environmental scan to understand the current landscape of privacy-enhancing technologies, resulting in a report documenting the analysis. The results of this project have informed project testing and piloting using privacy-enhancing technologies (such as privacy-preserving record linkage and synthetic data generation), which inform the NSDS secure compute environment and Capacity Building Center. The project ended in January 2024.
Status: Complete
Progress: Overview, Reports
Informs NSDS Vision: Access

Providing Customer Service for a National Secure Data Service Through a Data Concierge
The objective of this project is to develop a data concierge service that provides high-quality technical assistance to help data users with data discovery, access, and linkage as well as navigating services available within a National Secure Data Service (NSDS).
Status: Open
Progress: Request
Informs NSDS Vision: Resources, Navigation, Access

Secure Compute Environment Scan
This project conducted an environmental scan of secure compute environments. Over 20 federal stakeholders were interviewed to share perspectives on benefits, challenges, and requirements for successful utilization of a secure compute environment within the federal space. It produced a final report detailing findings to inform the requirements needed for the NSDS secure compute environment build. The project ended in July 2024.
Status: Complete
Progress: Overview, Lessons, Reports
Informs NSDS Vision: Access

Secure Compute Environment Testbed for a National Secure Data Service
This project builds a secure compute environment, a core component of the NSDS. The secure compute environment allows approved researchers to access, link, and analyze data for approved projects and enables testing and use of state-of-the-art privacy-enhancing technologies. The secure compute environment will undergo operational testing in early 2025 with an operational testbed available in summer of 2025. The project ends in August 2026.
Status: Active
Progress: Overview, Lessons
Informs NSDS Vision: Access

Synthetic Data Generation with Large, Real-World Data
This project explores how synthetic data generation, a type of privacy-enhancing technology, works with large real-world data (that is, datasets with over 30 billion rows of data) in a secure super compute environment. It will produce a framework to inform a synthetic data toolkit that will include, but not be limited to, methods to assess privacy risk and data utility as well as open-source AI methods for generating synthetic data. This is a joint project between the National AI Research Resource (NAIRR) pilot and the NSDS demonstration project. These are independent initiatives with expected synergies, as reflected in the CHIPS and Science Act requirement that the NSDS demonstration project consult with the NAIRR Task Force in NSDS development. The project ends in August 2026.
Status: Active
Progress: Overview, Lessons
Informs NSDS Vision: Access, Resources

Utilizing Privacy Preserving Record Linkage to Link Data from Two Federal Statistical Agencies
This project explores the development of a data sharing agreement between two federal statistical agencies that have not previously developed data sharing relationships, deploys a commercial privacy-preserving record linkage (PPRL) tool to link data from these two agencies, and uses a secure environment to analyze the resulting linked data file. It will inform linkages across the federal government by developing agreements and deploying PPRL as a model to improve the availability, quality, accessibility, and interoperability of data sharing. The project ends in September 2025.
Status: Active
Progress: Overview, Lessons
Informs NSDS Vision: Access

Utilizing Privacy Preserving Record Linkage with Parent Agency Data and Statistical Agency to Inform Programs and Policies
This project explores the development of a data sharing agreement between a federal statistical agency and its parent agency, deploys an open-source privacy-preserving record linkage (PPRL) tool to perform the linkage, and uses a secure environment to analyze the resulting linked data file. This project will inform linkages across the federal government, including within-agency collaborations, by developing agreements and deploying PPRL as a model to improve the availability, quality, accessibility, and interoperability of data sharing. The project ends in September 2025.
Status: Active
Progress: Overview, Lessons
Informs NSDS Vision: Access