Great Lakes Bioinformatics Conference 2012

Industry Tracks

Updated May 07, 2012

Industry Track 2
5:00 pm - 5:20 pm
Mendelssohn Theatre
...................................



Presenter:  Chris Robertson, Solution Architect, Cambridge Computer

Topic:  Reining in Research Data: Adding Intelligence to Conventional File Systems

Description: There is a growing need to impose structure on unstructured research data, driven by several factors:

  • Granting agencies are requiring investigators to specify and conform to data management plans that preserve data and make it available to other researchers
  • Research institutions are seeking to charge storage consumption back to individual grants and projects
  • Data storage administrators need better content classification in order to manage data protection and the data life cycle
  • As capacity grows and time progresses, it becomes futile to rely on any given individual's memory as the only means for content classification

Unfortunately, research scientists are fundamentally resistant to traditional efforts to impose structure on their unstructured data. Commercial content management systems are seen as cumbersome to operate, and they may add latency to high-performance processing jobs. As a result, many scientists have no tools other than directory names to identify their data. Others rely on spreadsheets and home-grown software applications to track their files, but these links often break as files are moved and renamed.

Cambridge Computer has embarked on a project to define best practices for content classification of research data. We are collaborating with a number of leading institutions and have completed development of a working prototype. The purpose of this talk is to share the highlights of our work, stimulate discussion, and make contact with potential collaborators.
