Finding existing research data for reuse
Collecting data can be time-consuming and expensive, so reusing existing data is often more efficient and cost-effective. It can also encourage collaboration and speed up discovery by building on research that others have already carried out.
Before collecting or creating new data, you should consider whether there is any existing data you could use or build upon to answer your research questions. Some research funders (e.g. ESRC) expect you to have searched for existing data before proposing new data collection.
Sources of data
To identify sources of existing data, you will need to think about what kind of data you are looking for:
- What format might it be in?
- Are you looking for current or historical data?
- Are you only interested in certain geographical locations?
- Will there be GDPR issues to consider?
- Who might have produced the data you want to find? For example, governments, public bodies, organisations, or research institutions.
Some potential sources of existing data to consider and explore:
- re3data.org: Registry of Research Data Repositories
- FAIRsharing: a catalogue of repositories and databases, as well as standards and ontologies
- European Data: the official portal for European data
- University of Manchester Figshare: institutional data repository
- Funder-specific data centres, e.g. the UK Data Service and NERC data centres
- DataCite Commons: search for works, people, organisations or repositories
- SafePod: secure access to data centres such as the ONS and the UK Data Service
Reusing data
If you are reusing existing data you should consider the following:
- Is there sufficient metadata or documentation with the data to enable you to understand and interpret it correctly?
- Are there any limitations to the data?
- What licence or terms has the data been shared under, and does this allow you to use, process, and re-share the data as you intend?
It is good practice to cite any data sources you use, in the same way that you would cite other research sources or publications. Data citation enables others to find and reuse the data, and helps to track its impact.
Full guidance is available from the Digital Curation Centre on How to cite datasets.
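As an illustration of what a data citation can look like, the sketch below assembles one from basic descriptive metadata. It assumes a commonly used element order (creator, year, title, version, publisher, identifier); the `DatasetRecord` structure, the `format_citation` helper and all example values are hypothetical and are not taken from the guidance above. Always check the citation style required by your publisher or the repository holding the data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetRecord:
    """Minimal descriptive metadata for a dataset (hypothetical structure)."""
    creators: list[str]
    year: int
    title: str
    publisher: str                 # usually the repository or data centre
    identifier: str                # ideally a persistent identifier such as a DOI
    version: Optional[str] = None  # include when the repository versions the data


def format_citation(record: DatasetRecord) -> str:
    """Build 'Creator (Year). Title. Version. Publisher. Identifier'."""
    parts = [f"{'; '.join(record.creators)} ({record.year})", record.title]
    if record.version:
        parts.append(f"Version {record.version}")
    parts.append(record.publisher)
    # The identifier goes last, with no trailing full stop, so it stays usable as a link.
    return ". ".join(parts) + ". " + record.identifier


# Placeholder values for illustration only; this is not a real dataset.
example = DatasetRecord(
    creators=["Surname, A.", "Surname, B."],
    year=2024,
    title="Example survey dataset",
    publisher="Example Repository",
    identifier="https://doi.org/10.xxxx/example",
    version="1.0",
)
print(format_citation(example))
```

Keeping these elements together when you first download a dataset makes it easier to cite the exact version you used later on.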