MIGO GSC project proposal April 2012

Project Title The “Minimum Information about a Genomic Observatory” (MIGO) checklist

Project Lead Dawn Field (CEH) and Neil Davies (Berkeley)

Initial Team members: being formed; participation is open to all who are interested

Elevator Pitch The “Minimum Information about a Genomic Observatory” (MIGO) checklist aims to capture uniform information about Genomic Observatories: sites of special scientific interest because of the wealth of contextualized genomic data they generate.

Project Summary The MIGO project is part of the work of the GOs Network. Technological advances in molecular biology and genomics are transforming our understanding of biological systems and the integration of genetic information into future Earth Observing Systems. As genomics, the ability to ‘read’ the DNA of an organism, continues to revolutionize biology, great advances are taking place in the sequencing of huge numbers of genomes and metagenomes and in the DNA barcoding of eukaryotic biotas. Yet these efforts are only a ‘drop in the ocean’ of what is required to characterize biodiversity on Earth. To apply the power of the genomics revolution to large-scale biodiversity and ecological studies, we, as a community, should begin to focus on sites of long-term research interest. The primary benefit of an increased focus on ‘DNA centric’, ‘place-based’ research is the potential to quantify the complete set of interactions between living organisms in a particular environment (ecosystem), from microbes to macrobes. We therefore propose to launch an international network of ‘Genomic Observatories’ (GOs), a collaboration of research sites that pioneer genomic research and that are embedded in, and cut across, several ecological and biodiversity science networks. GOs will be selected based on their rich histories of data collection and their long-term commitments to future research across a broad range of disciplines.

As part of building this network we propose to build a GOs Portal. In the first instance this will include a registry of sites, their key features and an indication of the types of data they generate, including all genomic data. We wish to make these descriptions uniform to maximize their usefulness, and therefore wish to develop the “Minimum Information about a Genomic Observatory” (MIGO) specification. It would dovetail with descriptions of sites from other networks while ensuring that genomic data can also be described adequately.
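As a purely illustrative sketch of what a uniform registry entry could look like, the Python snippet below models the three items named above (a site, its key features and the types of data it generates) and reports which of them are still empty. The class name, field names and example values are hypothetical placeholders and are not part of any agreed MIGO specification.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class SiteRecord:
    """Hypothetical registry entry; field names are illustrative only."""
    name: str = ""
    key_features: List[str] = field(default_factory=list)
    data_types: List[str] = field(default_factory=list)


def missing_fields(record: SiteRecord) -> List[str]:
    """Return the names of the fields that are still empty."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Placeholder values, not a description of any real site.
example = SiteRecord(name="Example site", key_features=["long-term monitoring"])
print(missing_fields(example))  # ['data_types']
```

A check of this kind is one possible way a portal might flag incomplete site descriptions; the actual content of the checklist would be decided by the working group.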

Which existing projects, if any, does this one replace/complement/subsume? Explain briefly why an extra project is needed/justified (what gap does it fill?). This is a new checklist that will require MIxS and will build on existing efforts to describe sites of scientific study (ILTER). MIGO will be registered at MIBBI.org and will use MIBBI modules wherever possible to speed up the process of consensus-building.

How does this project fit into GSC’s mission statement? Genomic Observatories of the future are perhaps our best chance to capture highly contextualized, long-term genomic data. The data from GOs should be GSC-compliant. Describing GOs and their data in a consistent manner will help close the loop between sampling, contextual information and well-annotated genomic information in the public domain.

Have you spoken about the project already within GSC? (on a call, at a formal GSC meeting, would like to request time to present at a future meeting). We (Neil and Dawn) formally presented this idea at GSC 12, along with the first presentations of the Genomic Observatories idea, and ran a breakout group in which we looked at the ILTER spreadsheet for describing sites as a basis for this project. We organized a session on GOs at GSC 13 and made an open call for additional GO sites to come forward. We plan to follow up at GSC 14, which will focus on GOs.

Will you start a GSC working group? If not, why not (i.e. subgroup within developers group, existing external community, etc). Yes, we plan to start a GSC working group in time.

How do you wish to further engage the GSC? We wish to recruit relevant members to the working group from inside and outside the GSC to work within the GSC on this project.

Do you already have a website or do you wish to create a home page for the project in the GSC website? We wish to develop this project inside the GSC as a core project and use GSC resources.

What other resources might you like from what the GSC can offer? All available – likely a mailing list to start with.

What kind of timeline are you working to for building consensus, releasing a first version etc? We are looking for help and funding to get this project off the ground. GSC 14 will be pivotal in taking MIGO to the next step.

What resources will be required for completion? Developing the checklist requires at least one technical lead to push the project forward and a balanced working group to make sure the content is appropriate and usable.

What are your current plans for publishing/promoting the project? Discussions within the GSC; we are still at a very early stage.

References or relevant websites (for further reading)? GOs Network: http://www.genomicobservatories.org and citations listed.