The future management of data

Jason Codrington

Thursday, 20 Aug 2009 10:12 UTC

There was a general preference amongst participants in our Bath workshop for data to be made freely available to all. Is this the general consensus? In London, participants considered this and asked how data protection and availability can be balanced. In Liverpool, they wanted to know whether it would be possible to set up one smart database for everything.

Do you think all data should be made available to everyone? Let us know your views and experiences of sharing data so that we can find out more about which models work and how developments in technology and the internet have changed the way scientists communicate and share their data.

  • Replies

    • In Nottingham, I don’t think our discussions on data got very far because everyone had such different viewpoints. Personally, I think data sharing is great in principle but fiendishly difficult to do well in practice. Critical questions include –
      The ‘level’ problem – what level of data should be shared? Raw data? Semi-processed data? Finished results (in more detail than in the journal article)?
      The ‘translation’ problem – how can we have a complete description system for datasets (and the experiments and methods which created them) which is understandable by human or machine AND which is general enough to apply to all manner of future experiments no one has thought of yet? This general description is essential for comparing experiments between labs. At the moment, the ‘methods section’ of publications is the best we’ve got, but we will have to do better to make real data sharing work.
      The ‘sociology’ problem – how can we convince scientists that it is important and worthwhile to share more data and to invest in data-sharing?

    • In my East Midlands group in Nottingham, it was generally agreed that the TODO list for managing metadata, raw data, processed or science-ready data, datasets attached to publications and so on is pretty daunting. It should be noted, though, that some communities such as astronomy – see AstroGrid – and biosciences have made very real and significant progress in these areas. Studies by groups such as the Joint Information Systems Committee, UK Research Data Services (UKRDS) and the Digital Curation Centre (DCC) data management plan are also attempting to address common issues across research areas at the institutional and national level.

      Therefore, in our particular subgroup, we tried to focus more on the issue of how to deal with information overload in the context of filtering the ever-growing literature for papers and datasets of interest. This seemed worth pursuing in the limited time available and perhaps achievable on a smaller scale using e.g. text mining…

      More on this soon I hope.
