
Research Data Management: Organising data

Why is organising your data important?

This video is part of MANTRA, the Research Data Management Creative Commons online course created by the University of Edinburgh Data Library.

Finding data that you or your collaborators have created can be challenging as your data and files increase over time.  

Organising your data well helps you identify, locate and use your research data files efficiently.

 

In this video, Professor Jeff Haywood talks about the importance of research data management.

 

Data format

 What do I need to consider?

 

When choosing the formats that you use to store your data, take into consideration:

  • what software and formats you or your colleagues have used in past projects
  • what software is compatible with hardware you already have
  • how you plan to analyse, sort, or store your data
  • what formats will be easiest to share with colleagues for future projects
  • what formats are at risk of obsolescence, because of new versions or their dependence on particular software
  • what formats will allow opening and reading your data in the future

You may use one format for data collection and analysis, and convert your data to another format for archiving. Also, check whether any disciplinary standards exist for archiving your data. 
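As a hedged illustration of converting working data to an open format for archiving, the sketch below writes in-memory records as plain CSV text. The choice of CSV, the function name records_to_csv and the field names are assumptions made for this example, not recommendations from this guide; your discipline may prescribe a different archival format.

```python
import csv
import io


def records_to_csv(records, fieldnames):
    """Serialise a list of dict records as CSV text with a header row."""
    buffer = io.StringIO()
    # An explicit line terminator keeps the output identical across platforms.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

The returned text can then be saved with UTF-8 encoding alongside the original working files.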

Organising files

Why is it important to organise your files?
Organising your files in a logical and consistent way will save time by helping you and others find the correct files, prevent errors and enable reproducibility.
 
Top tips
  • Use any existing conventions, for example, in your research group, department or faculty  
  • Agree on a consistent convention for naming files and folders.
  • Create a logical folder and sub-folder structure and file data accordingly
  • Name your folders so that they are meaningful to yourself and others
  • Create separate folders for current/ongoing and completed work
  • Don't keep everything. Consider what you need to retain and for how long
  • Review your files at regular intervals
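The folder tips above can be sketched in a few lines of Python. This is a minimal example only: the folder names in EXAMPLE_LAYOUT and the function create_project_folders are hypothetical, and any existing convention in your research group should take priority.

```python
from pathlib import Path

# Hypothetical layout: adapt the names to your group's existing conventions.
EXAMPLE_LAYOUT = [
    "current/raw_data",
    "current/analysis",
    "current/documentation",
    "completed",
]


def create_project_folders(root, layout=EXAMPLE_LAYOUT):
    """Create each folder in the layout under root and return the paths."""
    root = Path(root)
    created = []
    for relative in layout:
        folder = root / relative
        # parents=True builds intermediate folders; exist_ok=True makes reruns safe.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

Because existing folders are left untouched, the same function can be run again when a new sub-folder is added to the layout.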

File naming

The importance of filenames

  • Good practice dictates that all information (files, datasets, documents, or records) should be identifiable and traceable.

How should I format a filename?

  • A filename is the chief identifier for a research data file. Applying agreed and consistent naming conventions can help prevent confusion, particularly when multiple people are working with the data.
  • The filename should include as much descriptive information as possible to assist identification.

Key elements to include in your file name are:

  • File name, or full file path 
  • Name/role of file author(s) or originator(s)
  • Date of creation, edit or event which is the subject of the document/file
  • Version number if applicable
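One possible way to combine these elements consistently is to build the name with a small helper. The sketch below is an assumption-laden example: the element order, the underscore separator, the ISO date format and the function name build_filename are illustrative choices, not a convention prescribed by this guide.

```python
import re
from datetime import date


def build_filename(project, description, author_initials, created, version, extension):
    """Join the naming elements with underscores into a single file name."""
    def clean(text):
        # Keep only letters and digits so names stay portable across systems.
        return re.sub(r"[^A-Za-z0-9]+", "", text)

    parts = [
        clean(project),
        clean(description),
        clean(author_initials),
        created.isoformat(),   # YYYY-MM-DD sorts chronologically
        f"v{version:02d}",     # zero-padded so v02 sorts before v10
    ]
    return "_".join(parts) + "." + extension.lstrip(".")
```

For example, build_filename("Wellbeing Survey", "interview notes", "JS", date(2024, 3, 15), 2, "csv") gives WellbeingSurvey_interviewnotes_JS_2024-03-15_v02.csv.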

Version control

Why is version control important?

Version control involves a process of naming and distinguishing between a series of draft documents which lead to a final (or approved) version.

It is important for:

  • Documents that undergo a lot of revisions
  • Collaborative documents that are being changed by a number of different users
  • Tracing and auditing the development of a document
  • Avoiding conflicting document versions

Top tips

  • Follow naming conventions (see file naming)
  • Include version numbers in the file name
  • Record the author, filename, page number and date of creation/revision on the document itself (e.g. in the header or footer)
  • Give read-only status to definitive versions to allow changes to be controlled
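Two of these tips, version numbers in the file name and read-only definitive versions, can be sketched as follows. The _vNN suffix pattern and the function names next_version_name and make_read_only are assumptions for illustration. Note that removing write permission in this way behaves differently across operating systems and does not replace access controls on a shared drive.

```python
import re
import stat
from pathlib import Path

# Matches a suffix such as _v02 directly before the file extension.
VERSION_PATTERN = re.compile(r"_v(\d+)(?=\.[^.]+$)")


def next_version_name(filename):
    """Return the filename with its _vNN suffix increased by one."""
    match = VERSION_PATTERN.search(filename)
    if match is None:
        raise ValueError(f"no version suffix like _v01 in {filename!r}")
    digits = match.group(1)
    # Preserve zero padding so v09 becomes v10 rather than v010.
    bumped = str(int(digits) + 1).zfill(len(digits))
    return filename[:match.start(1)] + bumped + filename[match.end(1):]


def make_read_only(path):
    """Remove write permission for owner, group and others."""
    path = Path(path)
    mode = path.stat().st_mode
    path.chmod(mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))
```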

Describing data

What is metadata?

  • You can describe your work by assigning metadata.
  • Metadata is "data about data". It is used to summarise basic information about data and describe or contextualise the data. 

Why is metadata important?

It is important to clearly describe your data to ensure it can be:

  • Searched for and found
  • Understood by any user now and in the future
  • Properly interpreted

Many research funders require researchers to create metadata and make it openly available.

The key question to ask yourself is:

“what information would I need to understand and use this data in twenty years?”
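As a rough sketch of what assigning metadata can look like in practice, the example below saves a small descriptive record as a JSON file next to the data file. The field names, the example values and the ".metadata.json" suffix are illustrative assumptions and do not follow any particular descriptive standard; check which standard applies in your discipline.

```python
import json
from pathlib import Path


def write_metadata_sidecar(data_file, metadata):
    """Write metadata as JSON next to data_file and return the new path."""
    data_file = Path(data_file)
    sidecar = data_file.with_name(data_file.name + ".metadata.json")
    sidecar.write_text(
        json.dumps(metadata, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return sidecar


# Illustrative field names and placeholder values only.
example_metadata = {
    "title": "Interview transcripts, wave 1",
    "creator": "J. Smith",
    "date_created": "2024-03-15",
    "description": "Anonymised transcripts of semi-structured interviews.",
    "file_format": "text/csv",
    "licence": "to be confirmed",
}
```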

Further information

Metadata examples and links to descriptive standards can be found on the University of Leicester metadata and documentation page

Documentation

What is meant by documentation?

Documentation may sit alongside metadata. This refers to all the information necessary to interpret, understand and use a given dataset, a set of files or a single document.

How should research data be documented?

Research data need to be documented at various levels:

  • Project level: e.g. what the study set out to do, contribution to new knowledge, questions/hypotheses, methodologies, sampling, instruments and measures etc.
  • File or database level: e.g. how all the files that make up the dataset relate to each other, format, superseding or superseded etc.
  • Variable or item level: A full label explaining the meaning of a variable in terms of how it was operationalised is recommended.
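Variable-level documentation is often kept as a codebook. The sketch below, which assumes a simple dictionary-based codebook with hypothetical variable names, lists the variables in a dataset that still lack a full label.

```python
def undocumented_variables(column_names, codebook):
    """Return the columns that have no label in the codebook, in order."""
    return [
        name for name in column_names
        if not codebook.get(name, {}).get("label", "").strip()
    ]


# Hypothetical codebook: one entry per variable, with a full label.
example_codebook = {
    "participant_id": {"label": "Anonymised participant identifier"},
    "age": {"label": "Age in whole years at time of interview", "units": "years"},
    "q1": {"label": ""},  # placeholder that still needs a proper label
}
```

Running such a check before archiving is one way to catch variables that a future user could not interpret.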

Further information

For more information on documenting your data, see the UK Data Archive.

Where can I store data?

  • Where possible, store data on your university network drive. Shared drives can also be used for collaborative projects.
  • If you are working with private or confidential data, please ensure that it is encrypted.

Media and hardware that are not recommended for storing research data, because they are vulnerable to loss and damage, include:

  • Portable storage media such as CDs, DVDs and memory sticks (USBs, flash drives).  
  • Personal computers and laptops. 

The University is currently developing an RDM infrastructure.

Further information

The Mantra Research Data Management Online course includes a useful section on storage and security issues.

Backing up

What does backing up mean?

  • Backing up means storing copies of your data in more than one place.  This ensures that if one of your copies fails or is lost, there are other copies available. 
  • It is recommended that you keep at least 3 copies of your data on at least 2 different media and that you back up on a regular basis.
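The recommendation above can be checked with a short script. This is a sketch under stated assumptions: the function names are hypothetical, the list passed to meets_copy_rule holds one medium label per copy (counting the working copy as one of the copies), and SHA-256 checksums are used here as one common way to confirm that a backup still matches the original.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_copies(original, copies):
    """Return the copies whose contents differ from the original."""
    expected = sha256_of(original)
    return [copy for copy in copies if sha256_of(copy) != expected]


def meets_copy_rule(media_of_each_copy, min_copies=3, min_media=2):
    """Check for at least min_copies copies on at least min_media media."""
    return (len(media_of_each_copy) >= min_copies
            and len(set(media_of_each_copy)) >= min_media)
```

An empty list from mismatched_copies means every backup is byte-for-byte identical to the original.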

Storage

 What do I need to consider?

 

Storing your data securely is important. When deciding where to store your data, take into consideration:

  • Will data be securely stored over time – can integrity be preserved?
  • Is the storage reliable – can data be lost?
  • Can the data be accessed and reused?
  • Is it appropriate for both immediate and long-term needs?
  • Does storage meet relevant standards and requirements of the university, funder and legislation?
  • What is appropriate storage for sensitive or anonymised data?
  • How much storage do I need now and will I need in five years?
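For the last question, a rough projection can help with planning. The sketch below assumes that your data grows by the same percentage every year, which is a simplification introduced for this example; the function name projected_storage_gb and the figures used are hypothetical.

```python
def projected_storage_gb(current_gb, annual_growth_rate, years):
    """Project the storage need, assuming the same growth rate every year."""
    if current_gb < 0 or years < 0:
        raise ValueError("current_gb and years must be non-negative")
    # Compound growth: each year multiplies the total by (1 + rate).
    return current_gb * (1 + annual_growth_rate) ** years
```

For instance, 100 GB growing by 50% a year would need 225 GB after two years under this assumption.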

Saving email correspondence

  • You may need to save emails and associated attachments to describe your data journey.  
  • Remember that email correspondence may have to be disclosed in response to requests made under Data Protection and Freedom of Information legislation.

The organising email page at the University of Leicester provides more information. 

Useful links

UK Data Archive: Collection of digital research data in the social sciences and humanities and help for creating and managing your data.

Data Documentation Initiative (DDI): An international standard for describing the data produced by surveys and other observational methods in the social, behavioral, economic, and health sciences.

MANTRA Research Data Management training: Free online course produced by the University of Edinburgh

The Digital Curation Centre (DCC): Centre of expertise in digital curation. The DCC provides expert advice and practical help in storing, managing, protecting and sharing digital research data.

Acknowledgements