
Organising & describing data

 


When planning how to organise your data, you need to think about two questions:

  • If I want to re-analyse a piece of data, what will I need to know?
  • If I want to find a particular piece of data, will I know where to look?
 

It is advisable to establish good practices at the start of a project, before the task becomes unmanageable. It is easy for data to become disorganised, so consider regularly setting aside a small amount of time to tidy up your files. These guidelines apply to both digital and non-digital data.

 

Describing data

You need to ensure that you keep enough information to interpret the data. If it's never happened to you, you'd be amazed how quickly data becomes unusable because key details of the context have been forgotten.

 

Whatever you need to make sense of your data should be kept with the data files themselves (e.g. reactant concentrations and temperatures, details of how a sample was chosen).

 

For lab-based research, this is often recorded in the lab notebook, so ensure this is kept safe. Record the lab notebook page number with the data files, and if possible, scan the page(s) and keep them with the data too. Even if you're not based in a lab, you may have a research journal that you can use in the same way.

 

Ownership and credit: This information also helps when deciding ownership and assigning credit, so make sure you keep a note of who collected the data and when, especially if it's not you.

 

Legal documentation: If it's relevant for your research, you should also keep safe copies of any legal documentation, such as consent forms or COSHH forms. Your research group, lab or department may have an existing filing system for this, so ask your supervisor.

 

All of this extra information is collectively known as metadata. A number of metadata standards are in use in different disciplines, and more generic standards are also available. Keeping metadata up to date is a requirement of the Edinburgh Napier Data Management Policy.
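
As one possible way of keeping this information with the data files themselves, the sketch below writes a small JSON description next to a data file. The field names (collected_by, notebook_page and so on), the ".metadata.json" suffix and the example values are illustrative assumptions; they are not taken from the Edinburgh Napier metadata form or from any formal metadata standard.

```python
import json
from pathlib import Path


def build_sidecar(data_file, collected_by, collected_on, notebook_page, conditions):
    """Return a dict describing one data file (hypothetical field names)."""
    return {
        "data_file": Path(data_file).name,
        "collected_by": collected_by,    # who collected the data
        "collected_on": collected_on,    # date of collection, YYYY-MM-DD
        "notebook_page": notebook_page,  # lab notebook page number
        "conditions": conditions,        # e.g. concentrations, temperatures
    }


def sidecar_path(data_file):
    """Put the description beside the data: run01.csv -> run01.metadata.json."""
    return Path(data_file).with_suffix(".metadata.json")


def write_sidecar(data_file, **details):
    """Write the description next to the data file and return its path."""
    target = sidecar_path(data_file)
    target.write_text(json.dumps(build_sidecar(data_file, **details), indent=2))
    return target
```

For example, write_sidecar("results/run01.csv", collected_by="Smith J", collected_on="2012-03-07", notebook_page=42, conditions={"temperature_c": 25}) would create results/run01.metadata.json, provided the results folder already exists.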

 

If you have research data to deposit in the Edinburgh Napier Research Repository, please complete the Edinburgh Napier metadata form. Specific guidance on how to complete the form can be found in the guidance notes.

 


Organising files

Organising your data should include the use of naming conventions, whereby files or folders are named in a consistent and meaningful way.

 

Index File

However you choose to arrange your files, make sure you write down what you've decided in an index file (a Word or text document is great for this) that you keep with the files. This only takes a few minutes, but can save hours of searching later.

 

Don't be afraid to change this later as your research develops: just ensure that you reorganise the files so that the index still represents reality.
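
If you would like a quick check that the index still represents reality, something like the following sketch could help. It assumes you can produce a simple list of the names recorded in your index; the folder names used in the example are hypothetical.

```python
from pathlib import Path


def index_mismatches(indexed_names, actual_names):
    """Compare the names listed in the index with the names actually present."""
    indexed, actual = set(indexed_names), set(actual_names)
    return {
        "listed_but_missing": sorted(indexed - actual),
        "present_but_unlisted": sorted(actual - indexed),
    }


def folder_contents(folder):
    """Names of the files and folders directly inside `folder`."""
    return [entry.name for entry in Path(folder).iterdir()]


# Hypothetical example: the index mentions "docs", but the folder has since
# been replaced by "figures" without the index being updated.
report = index_mismatches(
    ["raw", "processed", "docs"],
    ["raw", "processed", "figures"],
)
print(report)
```

In practice the second argument would come from folder_contents() called on your own project folder.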

 

File Structure

Good file and folder organisation will help you to manage your data, enabling data to be quickly and correctly located, identified, and retrieved.

 

There are many "right" ways of organising your files so think about what makes sense for your research. If you're doing experimental work, for example, you might want to organise the results into folders by the date you did the experiment, or by a key experimental condition.

 

The following suggestions will help you to organise your data:

  • Use folders
    • When organising your data, consider using folders to group related files in one location.
    • The number of files or folders per group may vary depending on the nature of your data.
  • Apply meaningful folder names
    • Use clear and appropriate folder names that relate to the area of work or study rather than the individual responsible.
    • This avoids confusion if group members leave and makes the structure easier for new researchers to use.
  • Structure folders hierarchically
    • Design a folder structure with broad topics at the highest level and specific folders within these.
    • Avoid nesting folders too deeply as this may cause problems with path lengths.
  • Separate current and completed work
    • You may find it helpful to move temporary drafts or completed work into separate folders.
    • This will also make it easier to review what you need to keep as you go along.
  • Control access at the highest level
    • It is easier to set access permissions near the top of your folder structure rather than trying to control permissions for deeply nested folders.
    • This is particularly important if you need to grant someone access to only a subset of your data, in which case you could move these data to a new, higher-level folder.
  • Agree a consistent organisation within your group
    • If working within a research group it is essential that you discuss and agree on consistent and meaningful folder structures and file names, so that everyone can find data within your shared storage area.
    • Ensure you document what you've agreed and store this where the whole group can access it.
    • If you're new to a group or working with a research facility, check whether there is an established procedure to follow.
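
The advice on nesting and path lengths can be checked mechanically. The sketch below flags paths that are very long or deeply nested. The 260-character default mirrors the traditional Windows path limit; the depth limit of 6 is an arbitrary assumption, so adjust both to suit your own storage.

```python
from pathlib import PurePath


def flag_problem_paths(paths, max_length=260, max_depth=6):
    """Return (path, reason) pairs for paths that are too long or too deep.

    max_length and max_depth are assumed thresholds, not fixed rules.
    """
    problems = []
    for path in paths:
        text = str(path)
        depth = len(PurePath(text).parts)  # number of folders plus the file
        if len(text) > max_length:
            problems.append((text, f"{len(text)} characters"))
        elif depth > max_depth:
            problems.append((text, f"{depth} levels deep"))
    return problems


# Hypothetical paths for illustration only.
ok = "project/raw/2012-03-07/run01.csv"
deep = "a/b/c/d/e/f/g/h.csv"
long_path = "project/" + "x" * 300 + ".csv"

for path, reason in flag_problem_paths([ok, deep, long_path]):
    print(reason, "->", path[:40])
```

To scan a real folder, you could pass in the results of pathlib.Path("your-folder").rglob("*") instead of the example strings.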
 

Naming files and folders

Naming conventions are rules that allow electronic and physical records to be named in a consistent and logical way. Good naming practice will enable you to identify and distinguish between similar records, making data retrieval easier.

 

If you create large numbers of data files that would be difficult to name individually, apply your naming convention at the folder level instead.

 

When you agree your naming convention, consider the following suggestions:

  • Keep names short but meaningful
    • if you use abbreviations, keep a record of what these are with the data, so that others can understand and use them
    • for a small number of files, longer, more descriptive titles can be useful (but don't go crazy: try to keep the whole name on the screen)
  • Include dates in YYYY-MM-DD format, according to the international ISO 8601 standard
    • allows files to be sorted into chronological order
    • avoids confusion when national conventions vary.
  • Try to avoid using spaces - use punctuation such as hyphens or underscores to separate words
    • this is particularly important for files that will be available online.
  • Avoid using dots (other than the dot before the file extension) and special characters
    • characters such as \ / : * ? " < > | may be reserved by the operating system.
  • Capture relevant information in file names rather than relying on basic file properties such as date of creation
    • this will allow processed data relating to a single experiment or study to be grouped together
  • Avoid repeating information in names
    • if you are repeatedly capturing the same information in a file name, consider grouping the files in a folder named with that information.
  • Use family name followed by initials when personal names are used in file or folder names
  • Include version information in the name if needed
    • consider how different versions of a file will be identified (version control).

If you have a large number of files to manage, see the guide to file naming from JISC Digital Media.
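
Several of the suggestions above can be applied automatically. The following is a minimal sketch, not a complete validator: build_name and check_name are hypothetical helper names, and the checks cover only spaces, extra dots and the reserved characters listed above.

```python
from datetime import date

RESERVED = set('\\/:*?"<>|')


def build_name(day, *parts, extension):
    """Join an ISO 8601 date and descriptive parts with underscores."""
    pieces = [day.isoformat(), *(p.strip().replace(" ", "-") for p in parts)]
    return "_".join(pieces) + "." + extension.lstrip(".")


def check_name(name):
    """Return a list of ways `name` departs from the suggestions above."""
    stem, dot, _extension = name.rpartition(".")
    if not dot:  # no extension at all, so the whole name is the stem
        stem = name
    issues = []
    if " " in name:
        issues.append("contains spaces")
    if "." in stem:
        issues.append("contains dots other than the extension separator")
    if RESERVED & set(name):
        issues.append("contains reserved characters")
    return issues


name = build_name(date(2012, 3, 7), "Subject A", "Audio", extension="mp3")
print(name)                                # 2012-03-07_Subject-A_Audio.mp3
print(check_name("my results v1.2.xlsx"))  # two issues: spaces and an extra dot
```

Building names from a date object rather than typing them by hand also guarantees the YYYY-MM-DD format.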

 

Naming Examples

 

Files in a folder are usually shown sorted by name. You can take advantage of this to have your files appear in a consistent order. In many file browsers, filenames starting with special characters such as @ appear first, followed by numbers, then the letters A to Z, although the exact order can vary between systems.

 

For example, you might arrange your files as follows:

 

2012-03-07_Subject-A_Audio.mp3

2012-03-07_Subject-A_Transcript-anonymised.docx

2012-03-07_Subject-A_Transcript-raw.docx

2012-04-22_Subject-B_Audio.mp3 

2012-04-22_Subject-B_Transcript-raw.docx 

Interview-plan.docx 

Readme.rtf 

Summary.docx
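
To see why dates written as YYYY-MM-DD keep a listing in chronological order, the sketch below sorts two sets of hypothetical filenames, one with year-first dates and one with day-first dates. The case-insensitive sort used here only approximates what a file browser does, so treat it as an illustration rather than a description of any particular system.

```python
def listing_order(names):
    """Approximate a name-sorted file listing: case-insensitive, by character."""
    return sorted(names, key=str.lower)


# Hypothetical recordings made in April 2012, December 2012 and March 2013.
iso_names = [
    "2013-03-07_Subject-C_Audio.mp3",
    "2012-04-22_Subject-A_Audio.mp3",
    "2012-12-01_Subject-B_Audio.mp3",
]
day_first_names = [
    "07-03-2013_Subject-C_Audio.mp3",
    "22-04-2012_Subject-A_Audio.mp3",
    "01-12-2012_Subject-B_Audio.mp3",
]

print(listing_order(iso_names))        # Subject A, B, C: chronological
print(listing_order(day_first_names))  # Subject B, C, A: not chronological
```

With the year first, sorting by name and sorting by date give the same result; with the day first, the December 2012 file is listed ahead of the April 2012 one.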

 


Version Control

As you work with your data it is important to distinguish between different versions or drafts of your files. Version control can help you to easily identify the current version of your data so that you avoid working on older or outdated copies. If you are working with others it can also help to link versions of the data to the time and author of the change.

 

There are a number of ways that different versions of data can be managed:

 

File naming - a simple method of version control is to create a duplicate copy and then update version information to create a unique file or folder name.

  • Successive versions can be numbered sequentially, with whole numbers used for major revisions and point changes indicating minor edits
    • e.g. 1-0, 1-1, 1-2, 2-0, 2-1
  • If you are working as part of a group it may help to include the initials of the person who made the change
    • e.g. v1-0jm, v1-1ke, v2-0gb
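
If you create new versions often, working out the next label can be automated. The sketch below assumes labels of the form shown above (an optional "v", a major number, a hyphen, a minor number and optional lower-case initials); next_version is a hypothetical helper name, and it always writes the "v" prefix.

```python
import re

# Optional "v", major number, hyphen, minor number, optional initials.
VERSION = re.compile(r"v?(\d+)-(\d+)([a-z]*)")


def next_version(label, initials="", major=False):
    """Return the label that follows `label`, e.g. 'v1-1ke' -> 'v1-2gb'."""
    match = VERSION.fullmatch(label)
    if match is None:
        raise ValueError(f"unrecognised version label: {label!r}")
    major_no, minor_no = int(match.group(1)), int(match.group(2))
    if major:
        major_no, minor_no = major_no + 1, 0  # major revision resets the minor number
    else:
        minor_no += 1                         # minor edit
    return f"v{major_no}-{minor_no}{initials}"


print(next_version("v1-1ke", initials="gb"))  # v1-2gb
print(next_version("1-2", major=True))        # v2-0
```

The new label can then be substituted into the duplicated file's name, leaving the earlier version untouched.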

Version control tables - these are included within documents and can capture more information than using file naming conventions. Version control tables typically include the new version number, date of the change, person who made the change and the nature or purpose of the change.

 

Version control systems - there are many automated systems available that can store a repository of files and monitor access to them, logging who made what change and when. Version control systems are particularly useful for collaborative development of code or software.

 
