File formats
Choosing the right file format helps ensure that research data remains usable and accessible, and makes the data easier to share.
Choosing file formats
The file format you choose might depend on the stage of your research: short-term use (functionality) versus long-term preservation (longevity).
While working with your data, you may choose certain file formats, including proprietary ones, because of the hardware or software you use in your research, or because disciplinary norms make data collection and analysis easier in those formats.
Examples of proprietary formats:
- Microsoft Excel
- Microsoft Word
- Adobe PDF
- SPSS
- MP3
- WAV
To ensure that your research data remains accessible in the medium and longer term, ‘non-proprietary’ or ‘open’ formats are recommended.
Examples of non-proprietary and open formats:
- Comma-separated values (CSV)
- Rich Text Format
- JPEG
- FLAC
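As a rough illustration of how the two example lists could be applied to a project folder, the Python sketch below flags files that may need an open-format copy before long-term storage. The file extensions (for example ".sav" for SPSS) and the function names classify_format and files_to_convert are assumptions made for this example; they do not come from any official registry, and the categories simply mirror the lists on this page.

```python
from pathlib import Path

# Extension sets that mirror the example lists on this page.
# The extensions themselves are assumptions (e.g. ".sav" for SPSS).
PROPRIETARY = {".xlsx", ".xls", ".docx", ".doc", ".pdf", ".sav", ".mp3", ".wav"}
OPEN = {".csv", ".rtf", ".jpg", ".jpeg", ".flac"}


def classify_format(filename: str) -> str:
    """Return 'proprietary', 'open', or 'unknown' for a file name."""
    suffix = Path(filename).suffix.lower()
    if suffix in PROPRIETARY:
        return "proprietary"
    if suffix in OPEN:
        return "open"
    return "unknown"


def files_to_convert(filenames):
    """List the files that may need an open-format copy before deposit."""
    return [name for name in filenames if classify_format(name) == "proprietary"]
```

A check like this only looks at file names, so it is best treated as a prompt for manual review rather than a reliable identification of what a file actually contains.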
For long-term preservation and reuse, file formats should ideally have the following features:
- Common/popular usage by the relevant research community
- Unencrypted
- Uncompressed
- Open documented standard/publicly available technical specifications
- Suitable for extracting as well as viewing data
- Easy to annotate with metadata
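One way to meet several of these features at once is to store tabular data as plain CSV with a separate, human-readable metadata file next to it. The Python sketch below uses only the standard library. The function name save_with_metadata, the ".metadata.json" naming scheme and the choice of JSON for the metadata are assumptions made for illustration, not a recommended standard.

```python
import csv
import json
from pathlib import Path


def save_with_metadata(rows, fieldnames, metadata, csv_path):
    """Write rows to an uncompressed, unencrypted CSV file plus a JSON sidecar.

    Returns the path of the metadata file.
    """
    csv_path = Path(csv_path)
    # newline="" lets the csv module control line endings itself.
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    # Hypothetical naming scheme: data.csv -> data.metadata.json
    sidecar = csv_path.with_name(csv_path.stem + ".metadata.json")
    with sidecar.open("w", encoding="utf-8") as handle:
        json.dump(metadata, handle, indent=2, ensure_ascii=False)
    return sidecar
```

For example, calling save_with_metadata(rows, ["site", "temp_c"], {"title": "Field readings"}, "readings.csv") would produce readings.csv and readings.metadata.json in the same folder. Both files can be opened in any text editor.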
The UK Data Service maintain a list of recommended and acceptable file formats for the most common types of data.
Technical information about file formats
PRONOM, The National Archives' online registry of technical information, provides impartial and definitive information about the file formats, software products and other technical components required to support long-term access to electronic records and other digital objects of cultural, historical or business value.
Currently there is no authoritative, comprehensive catalogue of formats, but there is a dynamic listing of scientific data formats which can be consulted.
More information
For further guidance on working with your data, please see our page about organising data.