Purdue University, Purdue Libraries

Research Data Management Overview

Includes best practices, resources, and tools for managing and sharing research data.

Research Data Management Introduction

Research Data Management (RDM) involves overseeing how data is gathered, processed, analyzed, stored, and shared to enhance its accessibility and reuse by both the original researcher and the wider community. It focuses on ensuring that research materials are easy to find, well-organized, properly documented, and securely maintained, all while streamlining the research workflow.

This guide presents best practices for managing research data, along with tips and tools that support those practices and give more context for your specific needs. Links to resources at the bottom of each page offer further guidance and in-depth information on each element of RDM.

The Research Data Lifecycle

RDM addresses the lifecycle of your research output, including its creation, organization, accessibility, archiving, and distribution. Proper data management helps maintain scientific rigor and research integrity. In discussions of research data management, “data” means any recorded factual material, such as datasets, images, and samples, that is collected or created to support and validate research findings.

Below is a detailed example of the elements of the research data lifecycle. This image, adopted from Harvard's Longwood Research Data Management guide, more closely reflects the lifecycle of biomedical data. Click on the image to see more information on this process and to find a data lifecycle checklist. The link at the bottom of this box will lead you to another widely used example of the research data lifecycle.

Circular data lifecycle diagram with arrows flowing clockwise. The center emphasizes storage and management, while the outer parts show plan & design, collect & create, analyze & collaborate, evaluate & archive, share & disseminate, and publish & reuse.

Research Data Lifecycle by LMA Research Data Management Working Group is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Reproducibility and Replicability

Reproducibility and replicability are two foundational concepts researchers should understand as they explore each element of the research data lifecycle. Effective RDM practices facilitate the sharing of data and workflows, strengthening both reproducibility and replicability, key pillars for maintaining research credibility and integrity.

  • Reproducibility - refers to the ability of a researcher to duplicate the results of a prior study using the same materials and procedures as were used by the original investigator.
    • Example: Using version control for code and analysis scripts ensures that the exact workflow used to generate results is preserved and accessible.
  • Replicability - refers to the ability of a researcher to duplicate the results of a prior study if the same procedures are followed but new data are collected.
    • Example: Providing rich metadata and clear documentation about data collection and processing helps others apply the same methodology to different datasets.
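
As a small illustration of the reproducibility example above, the sketch below records basic provenance alongside a result: a checksum of the input data, the random seed, and the Python version. It is only a minimal sketch under stated assumptions. The measurements, the toy bootstrap analysis, and the function names (`sha256_of_bytes`, `run_analysis`, `provenance_record`) are hypothetical and are not part of any Purdue tool or workflow.

```python
import hashlib
import json
import platform
import random
import statistics


def sha256_of_bytes(data: bytes) -> str:
    """Return the SHA-256 hex digest that identifies this exact input."""
    return hashlib.sha256(data).hexdigest()


def run_analysis(values, seed):
    """Toy analysis (hypothetical): bootstrap estimate of the mean."""
    rng = random.Random(seed)  # local generator, so the run is deterministic
    resampled_means = [
        statistics.mean(rng.choices(values, k=len(values)))
        for _ in range(200)
    ]
    return statistics.mean(resampled_means)


def provenance_record(raw: bytes, seed: int, result: float) -> dict:
    """Bundle the result with the details needed to rerun the analysis."""
    return {
        "input_sha256": sha256_of_bytes(raw),
        "seed": seed,
        "python_version": platform.python_version(),
        "result": result,
    }


# Hypothetical measurements standing in for a real data file.
raw = b"4.1,3.9,4.4,4.0,4.2"
values = [float(x) for x in raw.decode().split(",")]

# Same data, same procedure, same seed: the two records should match.
first = provenance_record(raw, 42, run_analysis(values, 42))
second = provenance_record(raw, 42, run_analysis(values, 42))

print(json.dumps(first, indent=2))
print("Reproduced:", first == second)
```

Saving a record like this next to the results, and keeping the script under version control, gives a later reader what they need to check that a rerun used the same inputs and procedure. Replicability would instead involve running the same procedure on newly collected data, where the input checksum is expected to differ.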

General Research Data Management Resources