The NIH endorses the sharing of final research data. If you are applying for direct costs of $500,000 or more in any single year, you need to include a data sharing plan in your application.
What do you include in your data management plan?
Your plan should be a brief paragraph immediately following the Research Plan Section of the PHS 398 application form (i.e., immediately after I. Letters of Support). You can write your own or use a web tool like the DMPTool to generate a data management plan. Essential information includes:
- What data will be shared?
- Who will have access to the data?
- Where will the data to be shared be located?
- When will the data be shared?
- How will researchers locate and access the data?
Which data do you need to share?
You need to share your final research data: not your summary statistics and tables, but the actual data on which they are based. You don’t need to share laboratory notebooks, partial datasets, preliminary analyses, drafts of scientific papers, plans for future research, peer review reports, communications with colleagues, or physical objects, such as gels or laboratory specimens.
How do you share?
The NIH does not specify data content, formatting, presentation, or transport mode, and it sets no required standards or best practices. The method you choose may depend on several factors, including the sensitivity of your data, its size and complexity, and the volume of requests anticipated.
There are four methods for sharing data:
- The PI may store the data wherever they like and share it in any manner they choose.
  - For example, you can share your data on a website or in a journal.
- Data Archive (open access database): for data that is likely to draw many requests, possibly including frivolous ones, or that researchers will need technical assistance to use. Most archives charge a fee, which you can include in your grant. You can find data archives at:
- Data Enclave (restricted access database): for data that cannot be distributed to the general public due to confidentiality concerns, third-party licensing agreements, or national security considerations.
  - One example is the CDC’s National Center for Health Statistics’ Research Data Center.
- Mixed Mode Sharing: allows for more than one version of the dataset and provides different levels of access depending on the version. For example, a redacted dataset could be made available for general use, while more sensitive data is available only through a restricted data enclave.
  - One example is the NIMH Repository and Genomics Resource.
Regardless of the method used to share data, datasets require documentation that describes the methodology and procedures used to collect the data and gives details about codes, definitions of variables, variable field locations, frequencies, etc. For more information see “Data Standards and Common Data Elements Resource Guide.”
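As a rough illustration of what such documentation can contain, the sketch below builds a simple codebook from a small CSV dataset: one entry per variable, with its field position, its definition, and the frequency of each value. It is only one possible approach, not an NIH requirement. The variable names, definitions, sample data, and the build_codebook function are all hypothetical and exist only for this example.

```python
import csv
import io
from collections import Counter

# Hypothetical variable definitions; in practice these come from the study protocol.
DEFINITIONS = {
    "subject_id": "De-identified participant number",
    "sex": "Self-reported sex (1 = female, 2 = male)",
    "age": "Age in years at enrollment",
}

# Hypothetical sample data standing in for a real dataset file.
SAMPLE = "subject_id,sex,age\n001,1,34\n002,2,41\n003,1,34\n"


def build_codebook(csv_text, definitions):
    """Return one entry per column: field position, definition, value frequencies."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    codebook = []
    for position, name in enumerate(reader.fieldnames, start=1):
        counts = Counter(row[name] for row in rows)
        codebook.append({
            "variable": name,
            "position": position,
            # Flag any variable that has no written definition yet.
            "definition": definitions.get(name, "UNDOCUMENTED"),
            "frequencies": dict(counts),
        })
    return codebook


for entry in build_codebook(SAMPLE, DEFINITIONS):
    print(entry)
```

Frequencies of an identifier such as subject_id are not informative, so a real codebook would likely list them only for coded or categorical variables.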
Data should be released upon acceptance for publication of the main findings. Data must be kept for three years after grant closeout (unless your contract or university specifies otherwise).
How do you report?
You need to note what steps you’ve taken to implement data sharing in your progress reports to the NIH.
The NIH recognizes that data sharing may be complicated or limited by organizational policies, local IRB rules, and local, State, and Federal laws and regulations, including the HIPAA Privacy Rule. For example, if your sample is so small and unique that it would be impossible to protect the identities of subjects, you may opt not to share the data.