towards a richer set of information to describe our complete genome collection

5th GSC Workshop




The 5th GSC Workshop


This workshop is being funded by NERC with organisational support provided by NIEeS and the EBI.


Dates

Date: 12 - 14 December 2007


Venue

The venue for this 5th workshop will be the European Bioinformatics Institute (EBI), Cambridge, UK. The local host is Peter Sterk, and the workshop will take place in the EBI's new, state-of-the-art IT training suite.


Background

This 5th workshop will focus on finalizing a stable version of the MIGS checklist for publication. It will also focus heavily on hands-on work with the newly created "Genomic Contextual Data Markup Language" (GCDML). GCDML is a much richer XML schema: it implements MIGS/MIMS and also contains a wider range of additional elements for describing genomes and metagenomes.
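
To give a feel for what an XML-encoded contextual-data report involves, here is a minimal Python sketch. It builds a small report and checks it against a list of required fields. The element names (`genomeReport`, `investigationType`, `projectName`, `geographicLocation`, `habitat`), the required-field list, and the sample values are all hypothetical illustrations; they are not taken from the GCDML schema or the MIGS checklist, which should be consulted for the real names and structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical field names for illustration only;
# they are NOT taken from the real GCDML schema or MIGS checklist.
REQUIRED_FIELDS = ("investigationType", "projectName", "geographicLocation", "habitat")


def build_report(fields: dict) -> ET.Element:
    """Build a flat XML report from a dict of field name -> value."""
    root = ET.Element("genomeReport")
    for name, value in fields.items():
        child = ET.SubElement(root, name)
        child.text = str(value)
    return root


def missing_fields(report: ET.Element) -> list:
    """Return the required fields that are absent or empty in the report."""
    missing = []
    for name in REQUIRED_FIELDS:
        node = report.find(name)
        if node is None or not (node.text or "").strip():
            missing.append(name)
    return missing


# Invented sample values; geographicLocation is deliberately left out.
example = build_report({
    "investigationType": "bacteria_archaea",
    "projectName": "Example genome project",
    "habitat": "marine sediment",
})

print(ET.tostring(example, encoding="unicode"))
print("Missing required fields:", missing_fields(example))
```

A check of this kind is one way a "minimum information" checklist could be enforced on submitted reports; a real implementation would more likely validate against the XML schema itself.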


Agenda and Presentations

The agenda is packed with great talks, and yet we need to make sure we have time for group discussions. There are many 10-minute talks because we want people to be able to provide a diversity of views, ideas, and material for seeking group consensus on several key issues.


All speakers should aim for no more than 1 slide for each 2 minutes of their talk. For example, a 10-minute talk should have only 4-5 slides of significant content. Chairs will be strict when it comes to sticking to the timetable. These talks are designed to introduce a topic/project and specific progress towards GSC goals.


If you have any questions about the content or scope of your talk/session, or would like to suggest a change to the agenda, please write to the chair of your session (or to dfield 'at' ceh.ac.uk to pass any query along to the right person/people).


Agenda (Word Document)


Day 1 - Wednesday 12th December 2007

7:30-8:30

Breakfast at College

9:30

Bus Departs from College for EBI

10:00-10:30

Registration/coffee

Introduction to 5th GSC workshop

Dawn Field, CEH Oxford
presentation

Pre-Workshop Session: MIGS/MIMS and GCDML Technical Session – Active work on final version of MIGS checklist for publication and GCDML
Session Chair: Tanya Gray

10:30

Review of the current MIGS/MIMS checklist – setting the stage and finalizing scope
George Garrity, Michigan State

11:00

Review of feedback thus far on the development of the Genomic Contextual Data Markup Language (GCDML). Technical Review of the status of GCDML – Towards a Roadmap
Renzo Kottmann, MPI Bremen

workshop presentation

Short talks on the use of GCDML

11:30

The use of GCDML (MIGS/MIMS) by CAMERA
Leonid Kagan and Sean Murphy, JCVI
presentation

11:40

Describing all Sanger genomes using MIGS and GCDML

Nick Thomson, Sanger Institute

MIGS

11:50

Extending GCDML (MIGS/MIMS) reporting for biodiversity measures: 16S environmental data and the AMO database

Philip Goldstein, University of Colorado


presentation
xml example 1
xml example 2


12:00

Get the most out of your metagenome: computational analysis of environmental sequence data - Extending MIGS/MIMS to Minimess

Jeroen Raes, EMBL

Minimess - extending MIMS for comparative metagenomics (Jeroen Raes)

12:10

The Ecological Metadata Language (EML) and its relationship to GCDML
Inigo San Gil, Long Term Ecological Research (LTER) Network Office

PowerPoint Slides

12:20

Open discussion

12:50

Wrap up and Actions for session

13:00-14:00

Lunch

Formal Workshop Opens

14:00

Welcome, background and goals of workshop: Scope, Syntax and Semantics
Workshop Organizers

Session I: Setting the Stage and finalizing MIGS
Session Chair: Dawn Field

14:15

The Genomic Contextual Data Markup Language (GCDML) – A Summary and completion of support for MIGS/MIMS as it will be published
Renzo Kottmann, MPI Bremen

14:30

Outcomes of NCBI/ASM workshops: Gene nomenclature and the consensus CDS project

Tatiana Tatusova, NCBI


presentation

15:00

INSDC Update – genome project database and genome project ids are now mapped
Guy Cochrane, EMBL presentation & Ilene Mizrachi, NCBI presentation

15:30

Coffee

16:00

The Environment Ontology (EnvO) project – describing the environmental context of biological samples
Norman Morrison, NERC Environmental Bioinformatics Centre (NEBC)

workshop presentation 8.3MB

16:10

Gaz – an open source community-developed Gazetteer
Neil Sarkar, The Encyclopedia of Life (EoL), Woods Hole

workshop presentation

16:20

An introduction to the Barcode of Life project – towards the description of barcodes
Sujeevan Ratnasingham, University of Guelph

16:30

Finalizing MIGS for publication
George Garrity, Michigan State

17:30

Close of Day 1

18:00

Bus departs for College

19:30

Dinner at College

Day 2 - Thursday 13th December 2007

Session II: Extending GCDML from curated metadata to derived calculations: (gene calling and genomic annotation)
Session Chair: Nikos Kyrpides, Joint Genome Institute

7:30-8:30

Breakfast at College

8:30

Bus departs College

9:00

Introduction to the goals of this session
Nikos Kyrpides

Brief talks: Current practice and perspectives within the major sequencing centers

9:10

Nikos Kyrpides, Joint Genome Institute

9:20

Nick Thomson, Sanger Institute


GSC_Annotation

9:40

Chinnappa Kodira, Broad Institute


presentation

9:50

Jian Xu, Washington University

10:00

Ramana Madupu, JCVI

10:10

Eric Pelletier, Genoscope

workshop presentation PDF (822 KB)

10:20

Owen White, University of Maryland

PPT presentation

10:30

Coffee

11:00

Discussion – Linkages, use of GCDML, can we standardize?

13:00

Lunch

Session III: Towards Controlled Vocabularies and Ontologies for describing genomes and metagenomes: a focus on EnvO
Chair: Lynette Hirschman, MITRE


14:00

The GEMINA database: contribution to, and uptake of the EnvO/Gaz ontologies

Lynn Schriml, University of Maryland


Image:Schriml.pdf

14:15

The use of CVs and Ontologies in the Genomes Online Database (GOLD): Grouping genomes by habitat for comparative analysis
Nikos Kyrpides, Joint Genome Institute

14:30

The Future of MIGS/MIMS text-mining: Working with NCBI

Lynette Hirschman, MITRE


presentation

14:45

Discussion CVs and Ontologies needed for GCDML compliance

TOPICS:

Experiences using ontologies with GCDML

How to link out to other ontologies: food, taxon, anatomy, ...

EnvO-Lite: definition, use cases, granularity?

Establish Term Vetting Group

15:30

Coffee

 

 

Session IV: Towards a single, global list of genomes and metagenomes: A Genomic Rosetta Stone
Chair: Peter Sterk, European Bioinformatics Institute

16:00

Mapping StrainInfo.net, SILVA, GOLD, and GCat: A working foundation for the Genomic Rosetta Stone
Peter Dawyndt

16:10

Mapping the Ribosomal Database contents to PIDs and into the Genomic Rosetta Stone
James Cole, Michigan State


presentation

16:15

Panel Discussion: Tatiana Tatusova, Guy Cochrane, Dave Ussery, James Cole, Tanya Gray, Frank Oliver Glöckner, Nikos Kyrpides

Who has mapped to PIDs?

What should the GSC’s Genomic Rosetta Stone look like?

Actions and Timelines

17:30

Close

18:00

Bus departs for College

19:30

Dinner at College

Day 3 - Friday 14th December 2007

7:30-8:30

Breakfast at College

8:30

Bus departs College

Session V: GSC Roadmap
Chairs: George Garrity and Frank Oliver Glöckner

9:00

Strategy Session
Funding Opportunities, future publications, Essential Linkages, workshops (ISMB BoFs etc.)

MIGS-MIMS/GCDML Technical Session
GCDML working session

10:30

Coffee

11:00

Reviews of morning session and setting actions for the future
George Garrity, Michigan State, and Frank Oliver Glöckner, MPI Bremen

13:00

Lunch

14:00

Wrap up: Review of Actions

Workshop Co-organizers


5th GSC workshop summary presentation, Dawn Field

15:00

Formal close of workshop (Organizers) and Coffee

15:30

Departures

 

WHAT YOU CAN DO AT THE 5th GSC WORKSHOP TO HELP THE GSC REACH ITS GOALS

The purpose of these GSC workshops is to involve the wider, international community in the goal of describing our collection of genomes/metagenomes in more detail. This means extending the type of information captured through a new specification (MIGS/MIMS), setting up a way to capture data (GCDML, the Genome Catalogue), generating a unified list of genomes/metagenomes (the Genomic Rosetta Stone), and developing appropriate ontologies (e.g. EnvO).


The workshop will be a mix of updates since the last workshop (e.g. GCDML, the Environment Ontology) and introductory talks from new communities that are also interested in describing/analyzing genomes/metagenomes (e.g. Sequencing Centres) or that have broad and relevant experience capturing metadata (e.g. Encyclopedia of Life, the Barcode of Life project, the Ecological Metadata Language (EML), the Minimess proposal).


Add to the Wiki

The GSC Wiki is open for anyone to edit and many of you have. Please feel free to make an account and contribute at any time.


Changes to MIGS checklist

This page in the wiki has been set up to hold feedback on the MIGS checklist; the feedback will be resolved at the workshop. The proofs will be resubmitted to Nat Biotech on Monday the 17th. Please feel free to add comments!


MIGS Checklist Proofs


OUTCOME OF WORKSHOP: Final MIGS checklist

Here is the online outline of the checklist built by George Garrity, which was used at the workshop to discuss the checklist and add examples. This is the final version of the checklist: http://gensc.sourceforge.net/docs/migsmims/



Create MIGS Compliant Genome Reports

Our main goal will be to finalize the checklist for publication. Hopefully we can do this in the time allotted on the first day (to be led by George Garrity) and then spend the rest of the meeting getting to grips with the Genomic Contextual Data Markup Language (GCDML).


If you are willing, you can print out the checklist and try to describe your own genome or metagenome. This would be invaluable as it would guide which elements you think work/don't work. Any data you provide can then be put into the Genome Catalogue and you will get credit for being among the first to fill out a report!


As soon as MIGS is stabilized and we can finish the corresponding XML, we have promises of batch uploads of all Sanger microbial genomes and a large set of phage genomes among other reports. We are looking for more contributions.


GCDML

Renzo Kottmann, with input from many, has released a first version of GCDML, and we are hoping it will be considered by any genomic/metagenomic database for the capture of curated and calculated information. A key part of the future development of GCDML will be its extension beyond the capture of MIGS/MIMS to additional sources of curated data and to calculated data. Nikos Kyrpides has put together an excellent session with representatives of the major sequencing centres to discuss the standardization of gene calling and genomic annotation.


More information is available here

Work on the Genomic Rosetta Stone

Now that EMBL genomes are mapped to Genome Project Identifiers (PIDs), we will be looking to proceed with the mapping of local identifiers across genomic databases. As a start, we have a mapped set of the first 500 published genomes from GOLD, as collated by Paul Swift and mapped to PIDs by Peter Sterk. This file has the PIDs in green on the worksheet "Micro_genomes_first_500", along with RefSeq IDs and accession numbers for all chromosomes. This is a real milestone and a great start for the bigger project. Several groups have already agreed to join the effort and are working to map their local identifiers to PIDs (and have been doing so for a long time now). We need to update information on who wants to be involved, whether web services are available for harvesting the mappings, and how we intend to build a central 'resolver' system. An online prototype is available from Tanya Gray.


More information available here: http://gensc.org/gc_wiki/index.php/Genomic_Rosetta_Stone#A_Genomic_Rosetta_Stone
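
As a rough illustration of the central 'resolver' idea, the following Python sketch links local identifiers from different databases through a shared project identifier. It is a toy in-memory model, not the design of the online prototype: the class name, method names, database labels, and all identifier values are invented for this example, and it assumes at most one local identifier per database for each PID.

```python
from collections import defaultdict
from typing import Optional


class IdentifierResolver:
    """Toy resolver linking local database IDs through a shared project ID (PID).

    Hypothetical sketch: assumes one local ID per database for each PID.
    """

    def __init__(self) -> None:
        # (database, local_id) -> PID
        self._to_pid = {}
        # PID -> {database: local_id}
        self._from_pid = defaultdict(dict)

    def add_mapping(self, database: str, local_id: str, pid: int) -> None:
        """Record that local_id in the given database refers to project pid."""
        self._to_pid[(database, local_id)] = pid
        self._from_pid[pid][database] = local_id

    def pid_for(self, database: str, local_id: str) -> Optional[int]:
        """Return the PID for a local ID, or None if it is not mapped."""
        return self._to_pid.get((database, local_id))

    def resolve(self, database: str, local_id: str, target: str) -> Optional[str]:
        """Translate a local ID in one database into the target database's ID."""
        pid = self.pid_for(database, local_id)
        if pid is None:
            return None
        return self._from_pid[pid].get(target)


# All identifiers below are made up for illustration.
resolver = IdentifierResolver()
resolver.add_mapping("GOLD", "gold-0001", 12345)
resolver.add_mapping("RefSeq", "refseq-0001", 12345)

print(resolver.resolve("GOLD", "gold-0001", "RefSeq"))
print(resolver.resolve("GOLD", "gold-9999", "RefSeq"))
```

Routing every lookup through the PID means each database only has to maintain one mapping (local identifier to PID) rather than a separate mapping to every other database.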



Catalogue any sequence datasets not found in the INSDC

There are an increasing number of large sequence-based datasets that are not found in the INSDC. This page aims to provide a comprehensive list of such datasets to highlight the growing need for a unified list of all genomes and metagenomes.



REASON: No flowgrams, unassembled. The first published pyrosequencing data sets (4 pooled samples) from natural viral communities. Unique identifiers were given for CAMERA, SCUMS, and GCat, but the unassembled pyrosequencing data could not be submitted to GenBank and are not in the Trace Archive because flowgrams are not available.


  • Roesch et al. (2007) Pyrosequencing enumerates and contrasts soil microbial diversity. The ISME Journal 1, 283 – 290


REASON: Sequences too small. "As Genbank does not provide accession numbers for sequences of this length, these 562 sequences are provided in the supplementary material (Table S1)."


Note: Small read archives under development


Contribute to the upcoming special issue of OMICS from the 5th GSC Workshop

Dawn Field and George Garrity have been asked by the OMICS Editor-in-Chief, Eugene Kolker, to produce a special issue of OMICS based on the 5th GSC Workshop.


Background and how to contribute: GSC Special issues of OMICS

Travel to the EBI (workshop venue)

A coach/bus will be provided each day to transport delegates from the accommodation at Jesus College, Cambridge, to the EBI and back again in the evenings. Pick-up location will be from Victoria Avenue within the college, and times will be shown in the agenda that can be downloaded from this wiki page, but are likely to be 9.30am on Wed 12th and 8.30am on Thurs 13th and Fri 14th. Return times from the EBI to Cambridge will be 6.00pm on the 12th and 13th, and 3.30pm on the 14th.

For those delegates travelling directly to the EBI: Maps and directions are available from the EBI website

If you are travelling directly to the EBI, please be aware that you will need to check in at the main gate reception. Your names have already been registered with security; please tell them you are visiting the EBI and that your contact person is Peter Sterk. Security passes and workshop badges can be collected from the EBI reception desk.

Additional information on the Wellcome Trust Genome Campus, facilities and local weather can be found at http://www.ebi.ac.uk/Information/Site_Info/site_info.html.

Accommodation

11, 12 and 13 December

Accommodation will be paid for by the GSC and provided for participants on the nights of Wednesday 12th and Thursday 13th December at Jesus College, Cambridge. Please indicate on your registration the nights that you require accommodation.
Accommodation can also be provided on Tuesday 11th December for those participants travelling long distance. Please indicate on your registration if you require accommodation for Tuesday 11th December.

Breakfast is included at the college and will be available from 7.30am - 8.30am.

Jesus College is situated in the heart of Cambridge and is easily accessible by bus, rail or car. A map of Jesus College and travel directions can be found on their college website at http://www.jesus.cam.ac.uk/contacts/travel.html.

On arrival at Jesus College, please contact the Porters Lodge to obtain a key to your room. A map of the college is available here.

Transport from Jesus College to the workshop venue at the EBI is provided as stated above.

NOTE: Internet Access

If you require Internet access in your room at the college, we will need your laptop's "Machine Address" 3 weeks before the event to register access. Please email the Machine Address to Ms Caroline Wills-Wright (CARLIS@wpo.nerc.ac.uk) before 20th November. This is not wireless access, so you will need to bring your own cable to connect to the College ethernet.


14 December

Accommodation in Cambridge for the night of 14th December can also be arranged, but you will need to request it by email from Ms Caroline Wills-Wright (CARLIS@wpo.nerc.ac.uk). Accommodation for the 14th will require a move from Jesus College, as rooms are not available at the college.

If you have an early flight booked on Saturday and require accommodation closer to the airport for the night of Friday 14th, then please book this yourself and the expense will be reimbursed to you.


Travel to Cambridge

The Jesus College travel page linked above has details on how to get to Cambridge from most airports and local cities. A summary for Heathrow to Cambridge is:

Rail

Take the Heathrow Express rail service to Paddington rail station (departs every 15 mins), then the underground tube to King's Cross rail station (departs every 30 mins), then a train to Cambridge (departs every 30 mins). Then take a taxi from Cambridge rail station to Jesus College. The total cost is about £35.00 and the journey takes just over 2 hours if you get your timing right.

Coach

Take the National Express Jetlink coach from the Central Bus Station at Heathrow airport direct to Cambridge Drummer Street bus station. Coaches run every hour - tickets from the coach driver. Then from the Cambridge bus station walk 10 mins to Jesus College. Total cost about £25.00 and takes about 2.5 - 3.0 hours. A timetable with pick-up points is available here Image:Bus heathrow.pdf.


Workshop Lunches and Dinners

During the workshop, we will have a buffet lunch outside the meeting room on Wednesday. On Thursday and Friday, lunch will be at the site restaurant. Lunch vouchers will be provided.

There are two scheduled dinners for the workshop delegates, both held in the Upper Hall at Jesus College. Dinner is at 7.30pm on Wednesday 12th and 7.30pm on Thursday 13th December. Please indicate any special dietary requirements during the registration process.


We are also arranging dinner for the 11th at a local Cambridge seafood restaurant (like last time!).


Dinner on the 11th will be at the Loch Fyne Restaurant in central Cambridge (http://www.lochfyne.com/Restaurants/Locations.aspx). The reservation (~20 people!) is for 8:30 pm, and we will meet at the Porter's Lodge at Jesus College at 8 pm to walk into town. It is a set meal with a Christmas theme and is sponsored by the GSC.

Reimbursement of Expenses

Reasonable travel costs will be reimbursed to the delegates on submission of a signed NERC T&S form along with all original receipts for the expenditure being claimed. Travel can include all travel to and from the venue. A NERC T&S form can be downloaded from here.


Completed forms with receipts must be posted to Ms Caroline Wills-Wright, CEH Oxford, Mansfield Road, Oxford, OX1 3SR (email: CARLIS@wpo.nerc.ac.uk).

Two Page Summary

You can download and print a two page summary of the venue, accommodation, travel and T&S claim form from here.


MIGS/MIMS and GCDML - Outstanding Issues

MIGS/MIMS and GCDML - Outstanding Issues

MIGS/MIMS and GCDML - Actions

MIGS/MIMS and GCDML - Actions
