12- Storing all data may not be necessary if easier to re-measure


This is one of thirteen recommendations for Data Stewardship as formulated by the Netherlands E-Science Centre.

The E-Science Centre writes:

In some disciplines, storing data will prove more difficult and less efficient than its re-measurement. This approach still requires provision for storing the detailed parameters under which the measurements are traceable and reproducible.

What DTL recommends for the Data Stewardship plan

Answer the following questions:

  • How difficult/expensive is it to generate the data with the same or better quality later?
  • How well can samples be kept? How unique are the samples likely to be?
  • Will the technique have evolved so much by then that few people are likely to refer back to your measured data? Would you refer to your own data from last year? And from five years ago?
  • Based on the answers to the questions above: Is it better to keep the data or the samples? Or both?
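
As an illustration only, the questions above can be turned into a rough expected-cost comparison. The sketch below is not a DTL method: the field names, the cost model and the numbers in the usage example are all assumptions made for this example, and a real plan would weigh more factors (including the option to keep both data and samples).

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    # All fields are illustrative assumptions, not quantities defined by DTL.
    storage_cost_per_year: float  # cost of keeping the data, per year
    remeasure_cost: float         # cost of regenerating data of equal or better quality
    years: int                    # planning horizon in years
    reuse_probability: float      # estimated chance (0..1) that the data is needed again
    samples_can_be_kept: bool     # can the samples be preserved for re-measurement?


def recommend(d: Dataset) -> str:
    """Return 'keep data' or 'keep samples' from a simple expected-cost comparison."""
    if not d.samples_can_be_kept:
        # Without samples there is nothing to re-measure, so the data must be kept.
        return "keep data"
    cost_of_keeping = d.storage_cost_per_year * d.years
    expected_cost_of_remeasuring = d.remeasure_cost * d.reuse_probability
    if cost_of_keeping <= expected_cost_of_remeasuring:
        return "keep data"
    return "keep samples"


# Hypothetical numbers: cheap storage, expensive measurement, likely reuse.
print(recommend(Dataset(100.0, 10000.0, 5, 0.5, True)))
```

Even when the outcome is to keep only the samples, the measurement parameters still need to be stored so that the re-measurement is traceable and reproducible.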

Experience from DTL

  • In many cases, combining many comparable existing data sets can make unique scientific research possible in the future. The larger quantity of data can improve the sensitivity. This is a reason to consider keeping the data available.
  • In a new, fast evolving field, it is unlikely that many comparable data sets will be built up, and it is likely that better or larger data sets will be acquired later. This negates the previous argument for keeping data around.
  • In a fast evolving field, it may be difficult to reproduce your exact measurements later. Keep your (intermediate) data around for reference if this is a concern.

Sector specific

Specific per technology