Research Guides: Data Management for Health and Human Sciences: File Organization

Organizing Your Data

Data organization involves arranging project directories so that files are easy to store and retrieve, naming files so that they group logically or sort chronologically within directories, and structuring file contents to support analysis. Researchers should organize their folders to reflect how the records were created and to align with current or planned workflows. A consistent filing structure encourages transparency, consistency, and continuity. Team members should agree on the folder structure and file naming conventions before anyone begins collecting or working with data. This page outlines best practices for organizing files.

Directory Structure Overview

Organizing your data is one of the most essential aspects of data management. It means thinking through names, structures, and relationships.

- Organize your data hierarchically, and identify ways to divide your data into categories (or attributes):
  - Project
  - Time
  - Location
  - File type
- Within folders, files can be maintained chronologically, by classification or code, or alphabetically (depending on the types of files).
- Folder and subfolder names should reflect the content of the folder, not the names of researchers or staff.
- Document your file directory structure and describe the kinds of records that should be maintained in those folders to ensure compliance.
- Include basic information, such as project titles, dates, and some type of unique identifier (such as a grant number).
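
One way to document a directory structure is to generate a plain-text listing of it and keep that listing in the project's documentation. The following is a minimal sketch, not a prescribed method; the function name `describe_tree` and the two-space indentation are assumptions made for illustration.

```python
from pathlib import Path


def describe_tree(folder: Path, depth: int = 0) -> list[str]:
    """Return an indented listing of a folder and everything beneath it."""
    indent = "  " * depth
    lines = [f"{indent}{folder.name}/"]
    # Sort entries so the listing is stable from one run to the next.
    for child in sorted(folder.iterdir()):
        if child.is_dir():
            lines.extend(describe_tree(child, depth + 1))
        else:
            lines.append(f"{indent}  {child.name}")
    return lines
```

Calling `"\n".join(describe_tree(Path("my_project")))` (where `my_project` is a hypothetical folder) produces text that can be pasted into a README next to a short description of what belongs in each folder.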

The file structure below, created by Lane Medical Library at Stanford Medicine with reference to TIER Protocol, shows one way you can consider organizing files associated with a given project:
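
As a rough illustration of a hierarchical project layout (this is not the Lane Medical Library structure itself), the sketch below creates an empty folder skeleton for a new project. Every folder name in `PROJECT_LAYOUT` is a hypothetical placeholder, and `create_project_skeleton` is a name chosen for this example; adapt both to your own project and workflow.

```python
import tempfile
from pathlib import Path

# Hypothetical folder names for illustration only; replace with your own.
PROJECT_LAYOUT = [
    "data/raw",
    "data/processed",
    "scripts",
    "output/figures",
    "output/tables",
    "docs",
]


def create_project_skeleton(root: Path, layout=PROJECT_LAYOUT) -> list[Path]:
    """Create each folder in `layout` under `root` and return the paths."""
    created = []
    for relative in layout:
        folder = root / relative
        # parents=True builds intermediate folders; exist_ok=True makes reruns safe.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        for folder in create_project_skeleton(Path(tmp) / "ExampleProject"):
            print(folder.relative_to(tmp))
```

Creating the skeleton from a shared script, rather than by hand, is one way to keep the structure consistent across team members and projects.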

File Naming Convention Overview

A File Naming Convention (FNC) is a framework for naming your files in a way that describes what they contain and how they relate to other files. To develop an FNC, identify the key elements of the project: the important differences and commonalities between your files. These elements could include:

- Project name, experiment name or acronym
- Initials or name of researcher
- Date or range of dates when data was collected
- Location or spatial information
- Type of data
- Type of analysis
- Conditions
- Description of experiment
- Unique identifier
- Name or pseudonym of interviewee
- Sample name
- Version number of file (with leading zeroes)
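
Once the elements are chosen, a small helper can assemble them the same way every time. The sketch below assumes one possible pattern (date, project, description, researcher, and a version with a zero-padded minor number, joined by underscores); the function name `build_filename` and the choice and order of elements are illustrative assumptions, not a required standard.

```python
from datetime import date


def build_filename(collected: date, project: str, description: str,
                   researcher: str, major: int, minor: int,
                   extension: str) -> str:
    """Join the chosen elements with underscores, date first."""
    for part in (project, description, researcher):
        # Spaces in file names cause trouble in many tools, so reject them early.
        if not part or any(ch.isspace() for ch in part):
            raise ValueError(f"element must be non-empty with no spaces: {part!r}")
    # Zero-pad the minor version so v1-02 sorts before v1-10.
    version = f"v{major}-{minor:02d}"
    stem = "_".join(
        [collected.strftime("%Y%m%d"), project, description, researcher, version]
    )
    return f"{stem}.{extension.lstrip('.')}"
```

For example, `build_filename(date(2013, 5, 3), "NIHProject", "DesignDocument", "Smith", 2, 1, "docx")` returns `20130503_NIHProject_DesignDocument_Smith_v2-01.docx`.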

Creating File Naming Conventions

A file naming convention (FNC) can help you stay organized by making it easy to identify, from the name alone, the files that contain the information you are looking for, and by grouping files that contain similar information close together. A good FNC can also help others better understand and navigate your work.

Consider the following examples:

Files without a naming convention:

Test_data_2013
Project_Data
Design for project.doc
Lab_work_Eric
Second_test
Meeting Notes Oct 23

Files with a naming convention:

20130503_NIHProject_DesignDocument_Smith_v2-01.docx
20130709_NIHProject_MasterData_Jones_v1-00.xlsx
20130825_NIHProject_Ex1Test1_Data_Gonzalez_v3-03.xlsx
20130825_NIHProject_Ex1Test1_Documentation_Gonzalez_v3-03.xlsx
20131002_NIHProject_Ex1Test2_Data_Gonzalez_v1-01.xlsx
20141023_NIHProject_ProjectMeetingNotes_Kramer_v1-00.docx

The files with a naming convention provide a preview of the content, sort in a logical order (by date, written as YYYYMMDD), identify the responsible party, and convey the work history through version numbers, unlike the files without a naming convention.
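
A convention is easier to keep when file names can be checked automatically. The following sketch parses names shaped like the examples above using a regular expression and returns `None` for names that do not fit. The pattern `FNC_PATTERN`, the field names, and the function `parse_filename` are assumptions inferred from the example names for illustration, not a published specification.

```python
import re
from datetime import datetime

# Assumed pattern: YYYYMMDD_Project_Description_Researcher_vMAJOR-MINOR.ext
FNC_PATTERN = re.compile(
    r"^(?P<date>\d{8})"
    r"_(?P<project>[A-Za-z0-9]+)"
    r"_(?P<description>[A-Za-z0-9_]+)"
    r"_(?P<researcher>[A-Za-z]+)"
    r"_v(?P<major>\d+)-(?P<minor>\d{2})"
    r"\.(?P<ext>[A-Za-z0-9]+)$"
)


def parse_filename(name: str):
    """Return the name's elements as a dict, or None if it breaks the convention."""
    match = FNC_PATTERN.match(name)
    if match is None:
        return None
    fields = match.groupdict()
    try:
        # Reject impossible dates such as month 13.
        fields["date"] = datetime.strptime(fields["date"], "%Y%m%d").date()
    except ValueError:
        return None
    fields["major"] = int(fields["major"])
    fields["minor"] = int(fields["minor"])
    return fields
```

Because the date comes first and uses a fixed-width YYYYMMDD form, an ordinary alphabetical sort of conforming names (for example, Python's `sorted(names)`) is also a chronological sort.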

File Organization Resources

- Longwood Research Data Management - Plan and Design: Harvard's Longwood Research Data Management resource offers guidance on planning and designing research, including directory structure and file naming conventions.
- File Naming Convention Worksheet: This worksheet walks researchers through the process of creating a file naming convention for a group of files. The process includes choosing metadata, encoding and ordering the metadata, adding version information, and properly formatting the file names.
- How to Name a File: Harvard Longwood Medical Area's checklist on file names in research data management.
- Data Management and Sharing - Get Organized: Stanford Medicine's Lane Medical Library provides a step-by-step guide to file organization, with do's and don'ts for naming and structuring files.
- Advanced Renamer: A batch file renaming utility for renaming multiple files and folders at once. (Windows and Mac)