
Research data management

Organise your data

The format in which your data files are stored will be determined by the type of data you collect, e.g. textual, numerical, geospatial, image, etc.

Specialised collection methods or software needed to interpret these data can determine what format your data will take, but it is still important that formats promote long-term preservation and facilitate access. In some cases, it might be necessary to convert data to more sustainable formats.

 

Tips for best practice

Consider formats that are:

  • Standardised: Standardised formats are more widely supported and less likely to become obsolete. Using standard formats also supports the future readability of the data, because the files can be opened by a wide range of software.
     
  • Interoperable: Standardised formats also promote interoperability, meaning that data can be exchanged and read across different systems and software. This promotes usability.
     
  • Open: Open formats make their specifications openly available to everyone. This promotes development and makes it easier to re-engineer software once the original is no longer available. Proprietary formats, on the other hand, may have higher costs and are controlled by companies which may restrict use.

  • Uncompressed and original: Compressed formats can result in data loss. The same is true if the data is converted to another format. Make sure to keep a copy of the data in the uncompressed or original format if conversion is required.
     

The UK Data Service provides a list of recommended file formats for various data types.

A file name is the principal identifier of the data file. Maintaining a clear, meaningful, and consistent naming practice will help you find the right data quickly.

 
File name elements

Choose elements that are meaningful and relevant to your data. Remember to be consistent, and agree on conventions if working in a group. Elements include:

  • Project name or number
  • Date of creation
  • Name of creator
  • Description of content
  • Time, place, or source
  • Version number

The UK Data Service provides more guidance on file names.

 

Tips for best practice
  • Use hyphens [-] or underscores [ _ ] to separate elements. Avoid spaces.
  • Avoid non-alphanumeric characters such as & % ?
  • Use sequential numbering systems: 001, 002...010, 011...100. This will keep large numbers of files in sequential order.
  • Format dates YYYY-MM-DD. This will keep files in chronological order.
  • Avoid very long file names. Keep descriptions short but meaningful.
  • Include versioning
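
As an illustration of the tips above, here is a minimal Python sketch that assembles a file name from a few elements. The function name build_file_name, the order of the elements, and the example project are assumptions made for this illustration; choose the elements and order that suit your own data.

```python
import re
from datetime import date


def build_file_name(project, description, created, sequence, version, extension):
    """Join file name elements with underscores (hypothetical helper)."""

    def clean(text):
        # Replace spaces and other non-alphanumeric characters with hyphens.
        return re.sub(r"[^A-Za-z0-9]+", "-", text).strip("-")

    parts = [
        clean(project),
        clean(description),
        created.isoformat(),  # YYYY-MM-DD keeps files in chronological order
        f"{sequence:03d}",    # 001, 002 ... keeps files in sequential order
        f"v{version:02d}",    # version number
    ]
    return "_".join(parts) + "." + extension.lstrip(".")


name = build_file_name("Soil Survey", "pH readings & notes", date(2024, 3, 5), 7, 2, "csv")
print(name)  # Soil-Survey_pH-readings-notes_2024-03-05_007_v02.csv
```

Hyphens separate words within an element and underscores separate the elements themselves, so each part of the name stays easy to pick out.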
 
Versioning

It is important to keep track of the changes you make when processing and analysing data. Recording changes and saving these different versions can save you time if you make a mistake and need to go back or want a record of your work. Remember to always save a master file of the raw or unprocessed data!
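
One possible way to keep versions apart is to save each change under a new version number rather than overwriting the previous file. The sketch below assumes a file name ending in _vNN (for example _v02); that suffix and the function name next_version_path are illustrative choices, not a required convention.

```python
import re
from pathlib import Path


def next_version_path(path):
    """Return the same path with its trailing _vNN version increased by one."""
    path = Path(path)
    match = re.search(r"_v(\d+)$", path.stem)
    if match is None:
        # No version suffix yet: start numbering at v01.
        return path.with_name(f"{path.stem}_v01{path.suffix}")
    digits = match.group(1)
    # Keep the same zero padding as the existing version number.
    new_number = str(int(digits) + 1).zfill(len(digits))
    new_stem = path.stem[: match.start()] + f"_v{new_number}"
    return path.with_name(new_stem + path.suffix)


print(next_version_path("survey_2024-03-05_v02.csv"))  # survey_2024-03-05_v03.csv
```

Saving processed data to the returned path leaves earlier versions, and the raw master file, untouched.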

The UK Data Service provides more guidance on version control and authenticity.

Structuring your files will make it easier to find and keep track of your data. As with file names, use what is meaningful to your data and be consistent.

 

Tips for best practice
  • Organise folders in a hierarchy and name them appropriately
  • Separate ongoing and completed work
  • Keep data, documentation, and research activities in separate folders.
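
The tips above could be put into practice with a short script that creates the same folder hierarchy at the start of every project. This is a sketch only: the folder names in LAYOUT and the function create_project_folders are hypothetical, so adapt them to what is meaningful for your research.

```python
import tempfile
from pathlib import Path

# Hypothetical layout: data, documentation, and analysis are kept apart,
# and ongoing work is separated from completed work.
LAYOUT = [
    "data/raw",
    "data/processed",
    "documentation",
    "analysis/ongoing",
    "analysis/completed",
]


def create_project_folders(root, layout=LAYOUT):
    """Create each folder in the layout under root and return the paths."""
    root = Path(root)
    created = []
    for relative in layout:
        folder = root / relative
        # parents=True builds the hierarchy; exist_ok=True makes reruns safe.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


if __name__ == "__main__":
    # Demonstrate in a temporary directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        for folder in create_project_folders(Path(tmp) / "soil-survey"):
            print(folder.relative_to(tmp))
```

Creating the structure from a script means every project in a group starts with the same folder names.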

The UK Data Service provides more guidance on file structuring.

Data needs a context if it is to be understood. Throughout your research project, you will need to comprehensively document and describe your data. This gives the data a context, making it understandable and usable for others. It will also save you time if you have to go back to your data in a month, a year, or five years.

Documenting and describing data is not as difficult as it sounds. In fact, you are probably already doing this! It can be as easy as keeping a lab notebook or making a record of what you are doing.

 

What do I document?

You understand your research better than anyone else, so you are in the best position to answer this. Ask yourself, 'what is needed for someone else to understand my data?'

Three levels of data documentation that can help create a context are:

  • Project-level: Describes the project including aims, design, methods, investigators, specialised instruments and settings, data collection protocols, etc.
     
  • Data-level: Describes the data itself such as variable names, units of measure, codes, contextual information, etc. Data-level documentation typically takes the form of a code book which can be provided alongside the datasets.
     
  • Metadata: Describes the content, context, and provenance of the data and includes the project title, funder, creators, subject keywords, IP rights, licenses, technical formats, etc. Metadata is typically subject-specific, structured, and adheres to a standard. Much of the metadata can be found in the Data Management Plan and is used to create the record of a dataset and make it discoverable.
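
To show what data-level documentation might look like, the sketch below writes a small code book as CSV text that can be saved alongside a dataset. The variables, units, and codes are invented for illustration, and the column names are an assumption rather than a required standard.

```python
import csv
import io

# Hypothetical variables for illustration only.
CODE_BOOK = [
    {"variable": "site_id", "description": "Sampling site identifier",
     "unit": "", "codes": ""},
    {"variable": "ph", "description": "Soil pH at 10 cm depth",
     "unit": "pH units", "codes": "-9 = missing"},
    {"variable": "land_use", "description": "Dominant land use",
     "unit": "", "codes": "1 = arable; 2 = pasture; 3 = woodland"},
]

FIELDS = ["variable", "description", "unit", "codes"]


def code_book_to_csv(entries):
    """Serialise code book entries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


print(code_book_to_csv(CODE_BOOK))
```

A plain CSV code book is an open, standardised format, so it remains readable without specialised software.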

 

When do I document?

Do not wait until the end of the project to start documenting! Keep notes and a record throughout the research project.

 

Documentation tools

For project-level documentation, use your Data Management Plan to guide you. Fill in the information as it comes.

For data-level documentation, it can be as simple as a notebook, but there are other tools available, many of which are free. Some include:

  • Nesstar: free web-based software for data and metadata creation with the ability to create a code book.
  • Colectica: used with Microsoft Excel, there is a free standard version that helps you document data and metadata.
  • DdiEditor: produces standardised metadata documentation

 

The UK Data Service provides more guidance on data documentation and resources.
