Data Continuity, Data Accessibility

In an age when the creation of data is growing exponentially and the conversation about big data analytics nears hype proportions, I think the question of data continuity and accessibility becomes increasingly important.

Missing Data

A post in Nature by Elizabeth Gibney and Richard Van Noorden regarding the loss of raw data associated with published research articles caught my attention when it was published.

The authors summarized the work of a group of scientists who wanted to understand how accessible research data remains as a function of the date when results were published. They endeavoured to access the raw data associated with 516 published research articles ranging from 2 to 22 years old, considering factors such as active author email addresses and access to the data. In the original work, the investigators found that the odds of associated data being accessible fell by 17% per year and that within 20 years of an article’s publication up to 80% of the associated raw data can be lost, along with any possibility of future researchers using it. The authors conclude by advocating for public archiving of data at the time of publication to ensure future accessibility.
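To get a feel for what a 17% yearly drop in odds means, here is a rough sketch in Python. It assumes the decline compounds at a constant rate every year, and the starting probability of 0.9 is a made-up figure for illustration, not a number from the study. The function names are my own. Note that odds and probability are not the same thing, so the sketch converts between the two.

```python
def odds_multiplier(years: float, annual_decline: float = 0.17) -> float:
    """Factor by which the odds shrink after `years` of a constant yearly decline."""
    return (1.0 - annual_decline) ** years


def probability_after(years: float, baseline_probability: float,
                      annual_decline: float = 0.17) -> float:
    """Probability that data is still accessible after `years`.

    Converts the baseline probability to odds, applies the compounded
    decline, and converts the result back to a probability.
    """
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    odds = baseline_odds * odds_multiplier(years, annual_decline)
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # 0.9 is a hypothetical starting probability, chosen only for illustration.
    for years in (0, 5, 10, 20):
        p = probability_after(years, baseline_probability=0.9)
        print(f"{years:>2} years: odds x{odds_multiplier(years):.3f}, probability {p:.2f}")
```

Changing the baseline shifts the resulting probabilities, but the odds multiplier depends only on the yearly decline and the number of years.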

First Nimbus Image

This somewhat disconcerting article was offset by a more positive story by Sid Perkins, published on Science’s website, describing work being undertaken at the National Snow and Ice Data Center at the University of Colorado to make available archived Nimbus satellite imagery dating back to the mid-1960s. This involves digitizing analogue data, mosaicking the resulting digital images and then adding them to an accessible data archive. To date more than 250,000 images have been made available, adding considerably to a time series record of value in assessing issues such as high-latitude sea ice variability and tropical and mid-latitude weather variability. The extension of the data record to more than 50 years is truly impressive.

While it seems there are many questions around data compatibility (for another discussion), efforts to establish processes to ensure research data continuity and accessibility are to be commended and should be valued, even as new data is being generated at tremendous rates.


Gibney, E. and R. Van Noorden. 2013. Scientists losing data at a rapid rate. Nature. doi:10.1038/nature.2013.14416

Vines, T. H. et al. 2014. The availability of research data declines rapidly with article age. Curr. Biol. (2013)

Nimbus data rescue: recovering the past to understand the future. 2014.

Perkins, Sid. 2014. Long lost data reveals new insights to climate change.

Ever Been Hit By an Iceberg?

In a previous part of my career I spent a lot of time trying to understand icebergs and their behavior (is that what they consider anthropomorphism?). A lot of years have gone by since then but I am still curious about these massive pieces of ice. So I had to post this picture of a massive Antarctic iceberg about to collide with the Mertz Glacier Ice Tongue.

Antarctic Iceberg About to Collide With Ice Tongue

Even more impressive is that NASA captured a time sequence of satellite images showing the collision and the fracture that calved a new iceberg. The first iceberg, B-09B, is approximately the size of Rhode Island, and the new iceberg is estimated to have a mass of 700-800 billion tons! Talk about a major collision.

Satellite imagery has a wide range of applications, not the least of which is its use in monitoring massive events like this in remote parts of the world.