Purdue University, Purdue Libraries

Data Management for Health and Human Sciences

Research Data Management overview and resources for students and faculty focusing on health and human sciences

Organizing Your Data

Data organization involves arranging project directories to make file storage and retrieval easier, naming files to allow for logical grouping or chronological sorting within directories, and structuring file contents to support analysis. Researchers should organize their folders to reflect how the records were created and to align with current or planned workflows. Consistent filing structures encourage transparency, consistency, and continuity. Team members should agree on the file organization structure and naming conventions before anyone begins collecting or working with data. This page provides best practices for organizing data.

Directory Structure Overview

Organizing your data is one of the most essential aspects of data management. It involves thinking through names, structures, and relationships.

  • Organize your data hierarchically, and identify ways to divide your data into categories (or attributes):
    • Project
    • Time
    • Location
    • File type
  • Within folders, files can be maintained chronologically, by classification or code, or alphabetically (depending on the types of files)
  • Folder and subfolder names should reflect the content of the folder, not the names of researchers or staff
  • Document your file directory structure and describe the kinds of records that should be maintained in those folders to ensure compliance
  • Include basic information, such as project titles, dates, and some type of unique identifier (such as a grant number)

The file structure below, created by Lane Medical Library at Stanford Medicine with reference to the TIER Protocol, shows one way to organize the files associated with a given project:

Graphic of file organization structure
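
As one way to put the directory guidance above into practice, the following Python sketch creates a hierarchical project directory and writes a README that documents what belongs in each folder. The folder names, descriptions, and project identifier are hypothetical placeholders chosen for illustration. They are not taken from the Stanford graphic or the TIER Protocol and should be adapted to your own project.

```python
from pathlib import Path
import tempfile

# Hypothetical layout for illustration only; adapt the names to your project.
# Folder names describe content, not the people who work on it.
LAYOUT = {
    "data/raw": "Original data files, kept unmodified.",
    "data/processed": "Cleaned or derived data files.",
    "analysis": "Scripts and notebooks used for analysis.",
    "documentation": "Protocols, codebooks, and meeting notes.",
    "output": "Figures, tables, and reports.",
}


def create_project(root: Path, project_id: str) -> Path:
    """Create the folders in LAYOUT under root/project_id and document them."""
    project_dir = root / project_id
    lines = [f"Project: {project_id}", "", "Directory structure:"]
    for relative, description in LAYOUT.items():
        (project_dir / relative).mkdir(parents=True, exist_ok=True)
        lines.append(f"  {relative}/ - {description}")
    # The README records the structure so that others can follow it.
    readme = project_dir / "README.txt"
    readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return project_dir


# Demonstration in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    # "GRANT0000" is a placeholder for a unique identifier such as a grant number.
    project = create_project(Path(tmp), "2013_NIHProject_GRANT0000")
    for path in sorted(project.rglob("*")):
        print(path.relative_to(project).as_posix())
```

Keeping the layout in a single dictionary means the README and the folders on disk are generated from the same description, so the documentation cannot drift from the actual structure when the script is rerun.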

File Naming Convention Overview

A File Naming Convention (FNC) is a framework for naming your files in a way that describes what they contain and how they relate to other files. To develop an FNC, identify the key elements of the project and the important differences and commonalities between your files. These elements could include:

  • Project name, experiment name or acronym 
  • Initials or name of researcher
  • Date or range of dates when data was collected
  • Location or spatial information
  • Type of data
  • Type of analysis
  • Conditions
  • Description of experiment
  • Unique identifier
  • Name or pseudonym of interviewee
  • Sample name
  • Version number of file (with leading zeroes)
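
A few of these elements can be combined into a file name programmatically. The Python sketch below is one possible approach, not a prescribed method: the function name `build_filename`, the choice and order of elements, the underscore separator, and the `v<major>-<minor>` version format are all assumptions made for illustration.

```python
from datetime import date


def build_filename(when: date, project: str, description: str,
                   researcher: str, major: int, minor: int,
                   extension: str) -> str:
    """Join naming elements with underscores (an assumed, illustrative order)."""
    elements = [
        when.strftime("%Y%m%d"),    # date first, so names sort chronologically
        project,
        description,
        researcher,
        f"v{major}-{minor:02d}",    # minor version padded with a leading zero
    ]
    for element in elements:
        if " " in element:
            raise ValueError(f"element contains a space: {element!r}")
    return "_".join(elements) + "." + extension


print(build_filename(date(2013, 5, 3), "NIHProject", "DesignDocument",
                     "Smith", 2, 1, "docx"))
# prints: 20130503_NIHProject_DesignDocument_Smith_v2-01.docx
```

Padding numbers with leading zeroes keeps version 10 from sorting ahead of version 2 when files are listed alphabetically.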

Creating File Naming Conventions

A file naming convention (FNC) can help you stay organized in two ways: it lets you identify the file(s) containing the information you are looking for from the file name alone, and it groups files with similar information close together. A good FNC can also help others better understand and navigate your work.

Consider the following examples:

Files without a naming convention:

  • Test_data_2013
  • Project_Data
  • Design for project.doc
  • Lab_work_Eric
  • Second_test
  • Meeting Notes Oct 23

Files with a naming convention:

  • 20130503_NIHProject_DesignDocument_Smith_v2-01.docx
  • 20130709_NIHProject_MasterData_Jones_v1-00.xlsx
  • 20130825_NIHProject_Ex1Test1_Data_Gonzalez_v3-03.xlsx
  • 20130825_NIHProject_Ex1Test1_Documentation_Gonzalez_v3-03.xlsx
  • 20131002_NIHProject_Ex1Test2_Data_Gonzalez_v1-01.xlsx
  • 20141023_NIHProject_ProjectMeetingNotes_Kramer_v1-00.docx

Unlike the files without a naming convention, the files with a naming convention provide a preview of the content, sort in a logical order (by date, in yyyymmdd format), identify the responsible party, and convey the work history through version numbers.
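
Because the convention is regular, file names can also be checked and sorted automatically. The Python sketch below assumes the pattern seen in the examples above (date, project, description, researcher, version, extension). The regular expression and the function name `parse_filename` are illustrative assumptions, not part of any published standard.

```python
import re
from datetime import datetime

# Assumed pattern, inferred from the example file names above.
PATTERN = re.compile(
    r"^(?P<date>\d{8})_(?P<project>[^_]+)_(?P<description>.+)"
    r"_(?P<researcher>[^_]+)_v(?P<major>\d+)-(?P<minor>\d{2})"
    r"\.(?P<extension>\w+)$"
)


def parse_filename(name: str):
    """Return the naming elements as a dict, or None if the name does not conform."""
    match = PATTERN.match(name)
    if match is None:
        return None
    try:
        # Reject strings that look like dates but are not (e.g. month 13).
        datetime.strptime(match.group("date"), "%Y%m%d")
    except ValueError:
        return None
    return match.groupdict()


names = [
    "20131002_NIHProject_Ex1Test2_Data_Gonzalez_v1-01.xlsx",
    "Meeting Notes Oct 23",
    "20130503_NIHProject_DesignDocument_Smith_v2-01.docx",
    "20130825_NIHProject_Ex1Test1_Data_Gonzalez_v3-03.xlsx",
]

conforming = sorted(n for n in names if parse_filename(n) is not None)
for name in conforming:
    print(name)  # a plain alphabetical sort is also chronological

print(parse_filename(names[3])["researcher"])
# prints: Gonzalez
```

Putting the date first in yyyymmdd form is what makes the ordinary alphabetical sort chronological, so no date parsing is needed just to list files in order.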

File Organization Resources