
Organising research

How can I organise my research?

  • Read the information presented here alongside LSE Guidance on How to Organise Your Research Data.
  • Balance the breadth and depth of a hierarchical system of organising research.
  • Consider file naming conventions as part of research organisation.
  • LSE Library also provides support for some reference management software platforms.

Collection level

Structuring research is about making it easy to locate files within an organised framework. This not only helps during the research, but should also make it easier to locate files in the future, or when the data is being used by someone else.

Consider a hierarchical system of folders within folders if suitable. Otherwise, try to keep to a flatter structure, but remember that most systems arrange folders and files alphabetically by default:

Collection level examples

Do not use too many levels within a hierarchical system. Three or four levels offer the best balance between breadth and depth.

Organise the structure around research activity, type of data (original, adjusted, or analysis), or kind of material (data or documentation).
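As a minimal sketch of the two points above, the following Python script creates a shallow, three-level folder tree organised by kind of material and then by type. The project name and folder names are hypothetical examples chosen for illustration, not a recommended LSE layout.

```python
from pathlib import Path
import tempfile

# Hypothetical layout: level 1 = project, level 2 = kind of material,
# level 3 = type. All names are illustrative assumptions.
LAYOUT = {
    "data": ["original", "adjusted", "analysis"],
    "documentation": ["codebooks", "consent_forms"],
}

def create_structure(root: Path, layout: dict) -> list:
    """Create the folders under root and return them in alphabetical order."""
    created = []
    for kind, subfolders in layout.items():
        for sub in subfolders:
            folder = root / kind / sub
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return sorted(created)

def max_depth(root: Path) -> int:
    """Return the number of folder levels, counting root itself as level 1."""
    depths = [len(p.relative_to(root).parts) for p in root.rglob("*") if p.is_dir()]
    return 1 + max(depths, default=0)

# Demonstration in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "my_project"
    for folder in create_structure(project, LAYOUT):
        print(folder.relative_to(project))
    print("levels:", max_depth(project))
```

A check such as max_depth can be run from time to time to confirm the structure has not grown beyond three or four levels.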

Consider date formats. YYYY-MM-DD at the start of a file title will arrange documents in date order more reliably than other formats. If numbering files sequentially, be sure to include enough digits in the file naming convention. For example, a collection with 100 or more files should begin at 001.
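The effect of both conventions can be sketched in a few lines of Python. The naming pattern used here (date, sequence number, description) and the function names are assumptions made for illustration only.

```python
from datetime import date

def padded_width(total_files: int) -> int:
    """Number of digits needed so every sequence number has the same width."""
    return len(str(total_files))

def build_name(day: date, seq: int, total_files: int, description: str, ext: str) -> str:
    """Build a name such as '2024-03-05_007_interview.txt' (hypothetical convention)."""
    width = padded_width(total_files)
    return f"{day.isoformat()}_{seq:0{width}d}_{description}.{ext}"

names = [
    build_name(date(2024, 11, 2), 12, 150, "interview", "txt"),
    build_name(date(2024, 3, 5), 7, 150, "interview", "txt"),
    build_name(date(2023, 12, 31), 101, 150, "interview", "txt"),
]

# Plain alphabetical sorting now matches chronological order.
print(sorted(names))

# Without zero padding, alphabetical order breaks the numeric sequence.
print(sorted(["file2", "file10", "file1"]))  # ['file1', 'file10', 'file2']
```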

Define what is to be backed up: selected data and files (original files, master files, data files, etc.) or the entire data collection. Also define access privileges (passwords, firewall, read and write permissions) and protection against overwriting a backup set (read-only).

Take advantage where possible of tags to organise and search for information.

Organise periodic reviews of folder structures to clear out folders no longer in use. The review period could be shorter or longer, depending on the project. Establish rules and procedures for retaining and deleting files.
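A periodic review can be supported by a small script that lists candidate folders without deleting anything. The sketch below assumes that "no longer in use" can be approximated by the most recent file modification time, which will not suit every project; the function name and the age threshold are hypothetical.

```python
import os
import time
from pathlib import Path
from typing import Optional

def stale_folders(root: Path, max_age_days: float, now: Optional[float] = None) -> list:
    """List folders under root whose newest file is older than max_age_days.

    Folders containing no files directly are skipped. Nothing is deleted:
    the result is only a list of candidates for a manual review.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds in a day
    stale = []
    for folder, _subdirs, files in os.walk(root):
        mtimes = [os.path.getmtime(os.path.join(folder, name)) for name in files]
        if mtimes and max(mtimes) < cutoff:
            stale.append(Path(folder))
    return sorted(stale)
```

For example, stale_folders(Path("my_project"), 365) would return the folders in a hypothetical my_project directory that have had no file changes for a year. Whether those folders are archived or deleted should follow the retention rules agreed for the project.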

For reference management, LSE Library offers EndNote and Mendeley, which can be used to manage references to articles, books and other reference material used in your research. Other reference management software is also available.

Further reading:

Nikola Vukovic (2014). Setting up an Organised Folder Structure for Research Projects.

J.W. Crowder, J.S. Marion, M. Reilly (2015). File Naming in Digital Media Research: Examples from the Humanities and Social Sciences. Journal of Librarianship and Scholarly Communication, 3(3), eP1260.

File level

Quantitative

Useful types of variable to manage data may include:

  • Project ID: useful for collections with multiple surveys
  • Data file ID: use with groups of people, waves, and/or country data
  • Version ID: for identifying versions of the data set
  • Respondent ID: suitable for corrections, additions, and follow-up interviews
  • Questionnaire ID: to identify questionnaire splits or different languages
  • Interviewer ID: to identify interviewers plus additional interview protocol variables (date, start/end time)

Constructed (such as aggregated income), harmonised, or derived variables should be inserted after their respective original variable(s).
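The ordering rule can be illustrated with a short Python sketch that works on a list of column names. The variable names are invented for the example, and the same reordering could be done in any statistics package.

```python
def insert_after(columns: list, original: str, derived: str) -> list:
    """Return a new column order with `derived` placed right after `original`."""
    if original not in columns:
        raise ValueError(f"{original!r} is not in the data set")
    # Drop the derived variable from wherever it currently sits.
    ordered = [c for c in columns if c != derived]
    position = ordered.index(original) + 1
    return ordered[:position] + [derived] + ordered[position:]

# Hypothetical data set where the derived total was appended at the end.
columns = ["resp_id", "income_wage", "income_other", "age", "income_total"]
print(insert_after(columns, "income_other", "income_total"))
# ['resp_id', 'income_wage', 'income_other', 'income_total', 'age']
```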

It is also useful to group variables around formal criteria:

  • Administrative and technical
  • Content variables
  • Data collection and interviewer variables

Variable names

Defining variable names is one of the most important decisions in managing a data set. Variable names should be simple, so that variables are easy to identify and understand, and should convey an element of meaning. The convention can vary according to the nature of the research, but whatever convention is adopted, be clear and consistent.

The first column and row are both headers. The first column contains the variable name type with the intersecting row giving a description of each type along with advantages and disadvantages of each one.

| Variable name type | Description | Advantage | Disadvantage |
| --- | --- | --- | --- |
| Question number | Named after the question number in the survey: q01, q02, q03… | Direct reference to the original question and the order shown in the questionnaire. | Names do not refer to the contents, so variables are not easy to identify without the survey or codebook to hand. |
| Ascending order | Sequential numbering of variables: v01, v02, v03… | Simple linear sequence in the data set, so the order of variables is clear. | Names do not refer to the contents, so variables are not easy to identify without the survey or codebook to hand. |
| Mnemonic | Essence of the variable acts as a simple memory aid: ICOM for the income of the respondents. | Easier to remember and recall. Useful in longitudinal analysis or where question modules are repeated over time. | Mnemonic names are not always clear to others (inside and outside of a research project). |
| Combination of elements | Combining different elements in the variable name. For example, a country-specific variable for party affiliation: UK_LAB for UK Labour Party voters. | Good for variables found in comparative surveys, or demographic variables. Also for complex surveys with variables systematically linked with thematic categories and multiple waves. | Can lead to lengthy variable names. Subsequent waves of the survey, surveys of subpopulations, or different variables in different countries can require additions to the original variable name. |
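Whichever convention is chosen, consistency can be checked automatically. The Python sketch below generates question-number names and builds and validates combination names. The pattern (two-letter country code, short topic code, optional wave suffix such as _W2) is an assumption made for illustration, not a standard.

```python
import re
from typing import Optional

# Hypothetical convention: COUNTRY_TOPIC with an optional wave suffix, e.g. UK_LAB_W2.
NAME_PATTERN = re.compile(r"^[A-Z]{2}_[A-Z]{2,5}(_W\d+)?$")

def question_names(count: int) -> list:
    """Question-number names with zero padding: q01, q02, q03, ..."""
    width = max(2, len(str(count)))
    return [f"q{i:0{width}d}" for i in range(1, count + 1)]

def combined_name(country: str, topic: str, wave: Optional[int] = None) -> str:
    """Combine elements into one variable name, e.g. UK_LAB or UK_LAB_W2."""
    name = f"{country.upper()}_{topic.upper()}"
    if wave is not None:
        name += f"_W{wave}"
    return name

def inconsistent_names(names: list) -> list:
    """Return the names that do not follow the combination convention."""
    return [n for n in names if not NAME_PATTERN.match(n)]

print(question_names(3))                  # ['q01', 'q02', 'q03']
print(combined_name("uk", "lab"))         # UK_LAB
print(combined_name("uk", "lab", wave=2)) # UK_LAB_W2
print(inconsistent_names(["UK_LAB", "UK_LAB_W2", "de_spd", "q01"]))  # ['de_spd', 'q01']
```

The wave suffix shows how later additions lengthen a combination name, which is the disadvantage noted in the table.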

Variable labels

Each variable must be supplemented by a label, an informative and brief description of its contents that provides context or distinguishes it from other variables within a data set:

Variable labels examples

The example shown is from the 1999 European Values Study. It also highlights how variable names and labels can manage information on the number of response categories. In this case, there are two measures of what people consider more important: one has three categories and the other has two.

Qualitative

  • Use a coversheet for transcripts of qualitative interviews.
  • Standardise the editing and layout of qualitative data.
  • Use a non-disclosure agreement for administrative staff and transcribers accessing confidential data.

Produce a coversheet for qualitative interviews that provides contextual information about the interview in a form that is consistent within and, where possible, across projects. Depending on the sensitivity of the data, coversheets should provide space for recording the data collection event: date, place, interviewer name, and interviewee details.

Guidance for researchers and transcribers should include a unique identifier, a participant ID, a specified uniform layout across the research project (font, spacing, page numbers, and header information), and a way of marking sensitive text. These characteristics can also be held in a spreadsheet and mail merged into the documents.
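A mail merge of this kind can be sketched in Python using only the standard library. The coversheet fields, the CSV columns, and every value below are made up for illustration. A real project would export the spreadsheet to CSV and use its own template.

```python
import csv
import io
from string import Template

# Hypothetical coversheet template; the field names are assumptions.
COVERSHEET = Template(
    "Interview ID: $interview_id\n"
    "Participant ID: $participant_id\n"
    "Date: $date\n"
    "Place: $place\n"
    "Interviewer: $interviewer\n"
)

# Stand-in for a spreadsheet exported as CSV; all values are invented.
SPREADSHEET = """interview_id,participant_id,date,place,interviewer
INT001,P01,2024-03-05,London,Interviewer A
INT002,P02,2024-03-06,Leeds,Interviewer B
"""

def merge_coversheets(csv_text: str) -> dict:
    """Return one filled-in coversheet per row, keyed by interview ID."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return {row["interview_id"]: COVERSHEET.substitute(row) for row in rows}

sheets = merge_coversheets(SPREADSHEET)
print(sheets["INT001"])
```

Template.substitute raises a KeyError if a column is missing from the spreadsheet, which helps catch incomplete coversheets early.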

It is useful to create guidance for researchers and transcribers on procedures and uniform conventions to apply to transcriptions. Aim for consistency in both the layout of transcripts and the conventions used in them, particularly when multiple people are carrying out transcriptions in the same project.

Transcripts should include speaker tags to identify the question and response sequence. For other textual material, use standardised elements from an early stage to avoid the stresses of retrospective formatting when pressure on time and resources is at its highest.
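Consistent speaker tags also make transcripts easier to process later. The Python sketch below assumes a hypothetical convention in which each turn starts with an upper-case tag followed by a colon, such as "I:" for the interviewer and "R:" for the respondent. Untagged lines are treated as a continuation of the previous turn.

```python
import re

# Hypothetical tag convention: upper-case letters/digits followed by a colon.
TAG = re.compile(r"^(?P<speaker>[A-Z][A-Z0-9]*):\s*(?P<text>.*)$")

def parse_turns(transcript: str) -> list:
    """Split a transcript into (speaker, text) turns."""
    turns = []
    for line in transcript.splitlines():
        line = line.strip()
        if not line:
            continue
        match = TAG.match(line)
        if match:
            turns.append((match["speaker"], match["text"]))
        elif turns:
            # Untagged line: append it to the previous speaker's turn.
            speaker, text = turns[-1]
            turns[-1] = (speaker, f"{text} {line}")
        else:
            raise ValueError(f"Transcript starts with an untagged line: {line!r}")
    return turns

# Invented example transcript.
sample = """I: How long have you lived here?
R: About ten years.
We moved in 2014.
I: And before that?
"""
for speaker, text in parse_turns(sample):
    print(speaker, "|", text)
```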

Additional resources:

 UK Data Service: Transcription