towards a richer set of information to describe our complete genome collection

GSC Roadmap

From Genomic Standards Consortium

The Roadmap developed at the 3rd GSC workshop

The draft report of the 3rd GSC workshop is now available from curator 'at' ceh.ac.uk. It contains a high-level 10-point Roadmap for the GSC for the next year, to Oct 2007.


More specific aspects of implementing the Roadmap will be added here as we develop them.


Populate the Genome Catalogue with information imported from other databases:

Based on offers by NCBI, EBI and GOLD to supply their data:


1) We will import key identifiers to create 'stubs' for all the bacterial / archaeal genomes. We will do this by:

  • extending the Genome Catalogue XML schema, which implements MIGS, to include the information listed below
  • parsing the spreadsheets listed below into corresponding XML documents
  • mapping GOLD stamps onto Genome Project ids
  • resolving any conflicts within the lists of genomes
  • developing a new view that allows users to scroll to their genome and link directly to the relevant MIGS input form
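The parsing step above could be sketched roughly as follows. This is only an illustration: it assumes a tab-delimited input whose first three columns are Genome Project id, taxonomy id and organism name, and the element names (`genome_stub`, `genome_project_id`, `taxonomy_id`, `organism_name`) are hypothetical, not the actual Genome Catalogue schema.

```python
import csv
import io
import xml.etree.ElementTree as ET


def make_stub(project_id, taxonomy_id, organism_name):
    """Build one minimal XML stub (element names are hypothetical)."""
    stub = ET.Element("genome_stub")
    ET.SubElement(stub, "genome_project_id").text = project_id
    ET.SubElement(stub, "taxonomy_id").text = taxonomy_id
    ET.SubElement(stub, "organism_name").text = organism_name
    return stub


def stubs_from_table(handle):
    """Turn tab-delimited rows into stubs, skipping header or malformed rows."""
    stubs = []
    for row in csv.reader(handle, delimiter="\t"):
        # Assumption: a data row has at least three columns and a numeric id first.
        if len(row) < 3 or not row[0].strip().isdigit():
            continue
        stubs.append(make_stub(row[0].strip(), row[1].strip(), row[2].strip()))
    return stubs


# Placeholder row: the ids are the ones used in the example links on this page,
# the organism name is made up.
sample = "## header line\tx\ty\n12997\t155978\tExample organism\n"
for stub in stubs_from_table(io.StringIO(sample)):
    print(ET.tostring(stub, encoding="unicode"))
```

Keeping each stub as its own small document means later steps (adding a GOLD stamp, adding MIGS fields) can extend it without re-parsing the source spreadsheet.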


Status of data contributions:

  • Tatiana has provided a spreadsheet of bacterial and archaeal genomes from the Genome Project database.


ftp://ftp.ncbi.nih.gov/genomes/Bacteria/lproks_0.txt


From this file we will import:

Genome Project id (first column) Linked to: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=genomeprj&cmd=Retrieve&dopt=Overview&list_uids=12997

Taxonomy id (second column) Linked to: http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=155978

Organism name (no link)
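The two link patterns above can be generated from the imported ids. The sketch below simply rebuilds the example URLs shown on this page; the function names are made up for illustration.

```python
from urllib.parse import urlencode

ENTREZ_BASE = "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi"
TAXONOMY_BASE = "http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi"


def genome_project_url(project_id):
    """Link for a Genome Project id, following the example URL on this page."""
    params = {
        "db": "genomeprj",
        "cmd": "Retrieve",
        "dopt": "Overview",
        "list_uids": project_id,
    }
    return ENTREZ_BASE + "?" + urlencode(params)


def taxonomy_url(taxonomy_id):
    """Link for a taxonomy id, following the example URL on this page."""
    return TAXONOMY_BASE + "?" + urlencode({"id": taxonomy_id})


print(genome_project_url(12997))
print(taxonomy_url(155978))
```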


  • Further, GOLD data are available in downloadable format from GOLD, and Nikos has confirmed that he is happy to see links to GCat developed. We will therefore also add:

GOLD stamp, linked to:

http://genomesonline.org/GOLD_CARDS/Gc00457.html
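One possible way to map GOLD stamps onto Genome Project ids, and to flag conflicts for manual resolution, is sketched below. It assumes the GOLD download can be reduced to (GOLD stamp, Genome Project id) pairs, which is not confirmed here; the pairs in the example are invented, and only the card URL pattern is taken from the link above.

```python
from collections import defaultdict

GOLD_CARD_URL = "http://genomesonline.org/GOLD_CARDS/{stamp}.html"


def gold_card_url(stamp):
    """Link for a GOLD stamp, following the example URL on this page."""
    return GOLD_CARD_URL.format(stamp=stamp)


def map_stamps(pairs):
    """Map project ids to GOLD stamps; ids with several stamps are conflicts."""
    by_project = defaultdict(set)
    for stamp, project_id in pairs:
        by_project[project_id].add(stamp)
    mapped = {}
    conflicts = {}
    for project_id, stamps in by_project.items():
        if len(stamps) == 1:
            mapped[project_id] = next(iter(stamps))
        else:
            # Leave these for a curator to resolve rather than guessing.
            conflicts[project_id] = sorted(stamps)
    return mapped, conflicts


# Hypothetical pairs: these stamp/id combinations are illustrative only.
pairs = [("Gc00457", "12997"), ("Gc00001", "1"), ("Gc00002", "1")]
mapped, conflicts = map_stamps(pairs)
print(mapped)
print(conflicts)
print(gold_card_url("Gc00457"))
```

Conflicts are reported rather than resolved automatically, since deciding which record is right is a curation decision.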


2) We will work to extract MIGS-compliant information from these sources (for example, the NCBI data already include information on oxygen requirement).
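As a rough illustration of this step, the sketch below pulls one descriptive field out of a tab-delimited table with a header row. The column names (`project_id`, `oxygen_requirement`) and the sample values are hypothetical; the real column headings in the NCBI file would need to be checked.

```python
import csv
import io


def extract_field(handle, id_column, field_column):
    """Return {id: value} for rows where the chosen field is not empty."""
    values = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        value = (row.get(field_column) or "").strip()
        if value:
            values[row[id_column].strip()] = value
    return values


# Hypothetical header and rows, for illustration only.
sample = "project_id\toxygen_requirement\n12997\taerobic\n99999\t\n"
print(extract_field(io.StringIO(sample), "project_id", "oxygen_requirement"))
```

Rows with an empty value are skipped so that a stub is only updated when the source actually supplies the information.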


3) We will extend this approach to other types of genomes (metagenomes next).
