Project Name:

Models for a Data Concierge Service for a National Secure Data Service

Contractor: NORC at the University of Chicago

Lessons Learned

 Based on our interviews with federal statistical agencies, NORC has compiled three lessons learned for the period January – March 2024:

  1. Among statistical agencies that have adopted the SAP, many want expanded capabilities, including application tracking and improved metadata. Given agencies' existing use of the SAP, a data concierge service combined with the SAP could be an ideal entry point for data users looking to search available data assets within a future NSDS.
  2. Several statistical agencies referred to the SAP as their most complete inventory of available data assets. However, available agency metadata is not uniform, particularly across data types (survey vs. administrative data). To improve data user search and discovery, a core function of a data concierge service for a future NSDS is to provide a metadata standard that makes data search, discovery, and review consistent across statistical agency data assets.
  3. Most statistical agencies indicated strong interest in a tiered access approach to data asset sharing. However, tiered access has been enacted differently across agencies. A future data concierge service could facilitate improved data access by developing a commonly accepted tiered access standard across statistical agencies.

Based on our interviews with federal statistical agencies and federal data users, NORC has compiled the following lessons learned for the period April – June 2024:

  1. Many federal data users described working with data assets that lack proper documentation, which can delay their analysis. Most agreed that, at a minimum, available federal data should include a codebook, a data dictionary, or a README file. To improve data user search and discovery, a core function of a future NSDS, a data concierge service could work to provide metadata and documentation standards that will improve data users’ experience.
  2. Findings suggest that a future NSDS data concierge service should consist of six core components:
     a. Centralized help desk assistance for data access, including experts with general knowledge of the NSDS and federal data assets;
     b. A chatbot for general data inquiries that interfaces with FAQs and other public-facing documentation;
     c. Assistance navigating legal requirements for data access and disclosure requirements;
     d. Production of anonymized queries on restricted data that minimize disclosure and satisfy users’ requests;
     e. Statistical expert consultations; and
     f. A library of data use best practices, including use cases, documentation, video tutorials, publications, and code.

A fully integrated and successful data concierge service will require that staff have the requisite expertise and skills to help users navigate the process of data discovery, data access, and data use. More specifically, the skills fall into three broad categories:

1. Technical skills: Staff will need to assist data owners and users in preparing, archiving, and sharing data; review customer engagement analytics to improve service; and provide access to restricted data through secure compute environments.
2. Customer service skills: Staff will need to create guides and other materials for user consumption, gather user requirements and route data requests, and conduct data searches and distribute data files.
3. Policy-related skills: Staff will need to ensure data use agreements (DUAs) comply with regulations, review and update DUAs with data owners, and ensure data confidentiality and assist with disclosure review.

These skills will require that staff have a certain level of expertise. Based on our review of organizations that offer similar services, data concierge staff should have a background in areas such as social sciences (criminal justice, public health, education, etc.), data science, and library and information sciences.

Disclaimer: America’s DataHub Consortium (ADC), a public-private partnership, implements research opportunities that support the strategic objectives of the National Center for Science and Engineering Statistics (NCSES) within the U.S. National Science Foundation (NSF). These results document research funded through ADC and are being shared to inform interested parties of ongoing activities and to encourage further discussion. Any opinions, findings, conclusions, or recommendations expressed above do not necessarily reflect the views of NCSES or NSF. Please send questions to ncsesweb@nsf.gov.