University
Library

Research Data Management

File Naming Systems & Organization

  • Decide on a naming convention before data collection starts
  • Use consistent, descriptive file names. Make it easy to predict what a file contains.
  • Develop a file naming scheme that makes sense to you.
  • Consider including:
    • Project name or project number
    • Name of file creator
    • Sequence ID
    • Accession Number
    • Location or spatial coordinates
    • Date or date range of project
    • Version number of file
  • Consider how files sort when deciding which element of the file name goes first.
  • Establish a folder hierarchy that aligns with the project. Example: [Project] / [Experiment] / [Instrument or Type of File]
  • Include an explanation of your naming convention along with any abbreviations or codes in your readme.txt file. 
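
The folder hierarchy suggested above can be set up in one step before data collection starts. The following is a minimal Python sketch, assuming the [Project] / [Experiment] / [Instrument or Type of File] layout from the example; the function name and the folder names in the usage note are hypothetical placeholders.

```python
from pathlib import Path


def make_project_tree(root, project, experiments, file_types):
    """Create [Project]/[Experiment]/[Type of File] folders and return their paths."""
    created = []
    for experiment in experiments:
        for file_type in file_types:
            folder = Path(root) / project / experiment / file_type
            # parents=True builds the whole chain; exist_ok=True makes reruns harmless.
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return created
```

For example, make_project_tree(".", "kelp_survey", ["exp01", "exp02"], ["images", "sensor_csv"]) would create four folders under kelp_survey. Creating every folder up front keeps the layout identical across experiments.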

File Naming Conventions

  • Keep the file name short (aim for fewer than 25 characters)
  • Use underscores instead of spaces
  • Avoid special characters such as: " / \ : * ? < > [ ] & $ .
  • Use the date format YYYY-MM-DD or YYMMDD so that files sort chronologically
  • Use the 3-letter file extension to indicate the file format, such as .txt, .pdf, or .csv.
  • When using numbers, use leading zeros so that files sort in sequential order. Use 001, 002, … 020, 021 … instead of 1, 2, … 20, 21 …
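
Generating file names with a small script is one way to apply these conventions consistently. The sketch below is an illustration only: the function name, the choice of elements (project, site, date, sequence number), and their order are assumptions, not a required scheme.

```python
import re
from datetime import date


def build_filename(project, site, when, sequence, extension, width=3):
    """Join name elements with underscores, a YYYY-MM-DD date, and a zero-padded number."""
    parts = [project, site, when.isoformat(), f"{sequence:0{width}d}"]
    # Replace spaces with underscores, then drop anything that is not
    # a letter, digit, hyphen, or underscore.
    cleaned = [re.sub(r"[^A-Za-z0-9_-]", "", part.replace(" ", "_")) for part in parts]
    return "_".join(cleaned) + "." + extension.lstrip(".").lower()


print(build_filename("KS", "A1", date(2024, 3, 5), 7, "csv"))
# prints: KS_A1_2024-03-05_007.csv
```

Because the sequence number is zero-padded, a plain alphabetical sort lists 001, 002, 020, 021 in that order.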

Case Study: File Naming Done Well. An excellent example of a method for naming thousands of image files. File names include study site, water depth, date, and more.

File Formats

  • Whenever possible use open, uncompressed, non-proprietary formats
  • Convert to open or uncompressed formats
    • .doc to .txt
    • .xls to .csv
    • .jpg to .tif
    • .ppt to .pdf (exception due to ubiquity)
    • .mp3 to .aif or .wav
    • .prproj to .mxf or .mov
  • Keep raw data raw: save a copy in the original format in case you need to return to it
  • Unencrypted data are best, though encryption is appropriate for sensitive data
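
A short script can flag files that are candidates for conversion. The sketch below only looks up a file's extension in a table copied from part of the conversion list above; it does not perform the conversion. The names OPEN_ALTERNATIVES and suggest_open_format are hypothetical.

```python
from pathlib import Path

# A subset of the conversion list above; extend it for your own project.
OPEN_ALTERNATIVES = {
    ".doc": [".txt"],
    ".xls": [".csv"],
    ".jpg": [".tif"],
    ".ppt": [".pdf"],
    ".mp3": [".aif", ".wav"],
}


def suggest_open_format(filename):
    """Return suggested open-format extensions, or an empty list if none is listed."""
    return OPEN_ALTERNATIVES.get(Path(filename).suffix.lower(), [])
```

For example, suggest_open_format("Budget.XLS") returns [".csv"], while a file that is already in a listed open format, such as notes.txt, returns an empty list.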

Metadata & Documentation

  • Create a readme.txt file
  • In the readme, record WHO made the data, WHAT the reader is looking at, WHEN it was created, WHERE it was collected, and WHO can use it
  • Document variable names, codes, classification schemes, and algorithms
  • For applications and "playable" files, include the file format, software (including version), and OS used
  • Using a metadata standard helps with interoperability between data sets
    • The Digital Curation Centre maintains a comprehensive list of formal standards across many academic disciplines
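
The readme.txt contents described above can be produced from a simple template. The following Python sketch is one possible layout, not a metadata standard; the function name, field labels, and sample values are assumptions made for illustration.

```python
def readme_text(who_made, what, when_created, where_collected, who_can_use, codes=None):
    """Build readme.txt text covering the WHO/WHAT/WHEN/WHERE/WHO questions."""
    lines = [
        f"WHO made it: {who_made}",
        f"WHAT it is: {what}",
        f"WHEN it was created: {when_created}",
        f"WHERE it was collected: {where_collected}",
        f"WHO can use it: {who_can_use}",
    ]
    if codes:
        # Optional section documenting variable names, codes, or abbreviations.
        lines.append("")
        lines.append("Variable names and codes:")
        for name, meaning in codes.items():
            lines.append(f"  {name}: {meaning}")
    return "\n".join(lines) + "\n"
```

The returned text can be saved with pathlib, for example Path("readme.txt").write_text(text, encoding="utf-8"), and kept in the top-level project folder.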

Get Credit

  • Cite Your Data
    • Get a persistent identifier for your data, such as a DOI or ARK, through the EZID service.
    • Contact the library to obtain an identifier through EZID.
  • Disambiguate yourself 
    • ORCID provides a persistent identifier that distinguishes you from other researchers. Register for a free account.

Storage & Backups

  • Keep multiple copies of your data: Here, Near & Far
  • Automatic backup is better than manual
  • Periodically test your backup restore
  • Contact UCSC campus ITS for optimal data storage & backup options.
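
One way to test a backup restore is to restore the files to a separate folder and compare checksums against the originals. The sketch below assumes this checksum approach, which is a suggestion rather than a campus-recommended procedure; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_restore_mismatches(original_dir, restored_dir):
    """List relative paths whose restored copy is missing or differs from the original."""
    original_dir, restored_dir = Path(original_dir), Path(restored_dir)
    mismatches = []
    for source in sorted(original_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(original_dir)
        restored = restored_dir / relative
        if not restored.is_file() or sha256_of(source) != sha256_of(restored):
            mismatches.append(relative.as_posix())
    return mismatches
```

An empty list means every original file has an identical restored copy. Files that exist only in the restored folder are not reported by this sketch.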

Copyright and Intellectual Property

  • Data is not copyrightable. However, a presentation of data (such as a chart or table) may be.
  • Data can be licensed. Some data providers apply licenses that limit how the data can be used to protect the privacy of study participants or to guide downstream uses of the data (e.g., requiring attribution or forbidding for-profit use). Check license terms of use before republishing.
  • Most databases to which the UC Libraries subscribe are licensed and prohibit redistribution of data outside of UC. For more information on terms of use for databases licensed by the Libraries, contact us.
  • Publish your data under a Creative Commons license to make your wishes explicit.

Confidentiality and Privacy