GSC Roadmap
The Roadmap developed at the 3rd GSC workshop
The draft 3rd GSC workshop report is now available from curator 'at' ceh.ac.uk. It contains a high-level 10-point Roadmap for the GSC for the coming year, to October 2007.
More specific aspects of implementing the roadmap can be found here as we develop them.
Populate the Genome Catalogue with information imported from other databases:
Based on offers by NCBI, EBI and GOLD to supply their data:
1) We will import key identifiers to create 'stubs' for all the bacterial / archaeal genomes. We will do this by:
- extending the Genome Catalogue XML schema, which implements MIGS, to include the following information
- parsing the spreadsheets below into corresponding XML documents
- mapping GOLD stamps onto Genome Project ids
- resolving any conflicts within the lists of genomes
- developing a new view that allows users to scroll to their genome and link directly to the relevant MIGS input form
Status of data contributions:
- Tatiana has provided a spreadsheet of bacterial and archaeal genomes from the Genome Project database.
ftp://ftp.ncbi.nih.gov/genomes/Bacteria/lproks_0.txt
From this file we will import:
Genome Project id (first column), linked to: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=genomeprj&cmd=Retrieve&dopt=Overview&list_uids=12997
Taxonomy id (second column), linked to: http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=155978
Organism name (no link)
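The import of these three fields could be sketched roughly as follows. This is only an illustration, not the Genome Catalogue code: it assumes the file is tab-delimited, that the organism name sits in the third column (the source only fixes the first two), and the XML element names (genome_stub, genome_project_id, taxonomy_id, organism_name) are hypothetical placeholders rather than the real Genome Catalogue schema.

```python
import csv
import io
import xml.etree.ElementTree as ET


def parse_stubs(text):
    """Yield one dict per genome from tab-delimited text.

    Assumed layout: column 1 = Genome Project id, column 2 = Taxonomy id,
    column 3 = organism name (the third position is an assumption).
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip blank lines, header/comment lines, and any row whose
        # first column is not a numeric project id.
        if len(row) < 3 or not row[0].strip().isdigit():
            continue
        yield {
            "project_id": row[0].strip(),
            "taxonomy_id": row[1].strip(),
            "organism": row[2].strip(),
        }


def stub_to_xml(stub):
    """Serialise one stub as XML; element names are hypothetical."""
    root = ET.Element("genome_stub")
    ET.SubElement(root, "genome_project_id").text = stub["project_id"]
    ET.SubElement(root, "taxonomy_id").text = stub["taxonomy_id"]
    ET.SubElement(root, "organism_name").text = stub["organism"]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Made-up sample row; the ids are the ones used in the example links.
    sample = "## header line\n12997\t155978\tExample organism\n"
    for stub in parse_stubs(sample):
        print(stub_to_xml(stub))
```

Keeping the parsing step separate from the XML step means the same stub dictionaries can later be checked for conflicts before any documents are written.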
- Further, GOLD data are available in downloadable format from GOLD, and Nikos has confirmed he is happy to see links to GCat developed. We will therefore also add:
GOLD stamp, linked to:
http://genomesonline.org/GOLD_CARDS/Gc00457.html
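A rough sketch of how the links and the GOLD stamp mapping might be handled is given here. The URL patterns are taken from the example links on this page. Everything else is an assumption: in particular, we assume the GOLD download can be reduced to (GOLD stamp, Genome Project id) pairs, which has not been confirmed, and the function names are hypothetical.

```python
GENOME_PROJECT_URL = ("http://www.ncbi.nlm.nih.gov/entrez/query.fcgi"
                      "?db=genomeprj&cmd=Retrieve&dopt=Overview&list_uids={}")
TAXONOMY_URL = "http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id={}"
GOLD_CARD_URL = "http://genomesonline.org/GOLD_CARDS/{}.html"


def build_links(project_id, taxonomy_id, gold_stamp=None):
    """Return the outgoing links for one genome stub."""
    links = {
        "genome_project": GENOME_PROJECT_URL.format(project_id),
        "taxonomy": TAXONOMY_URL.format(taxonomy_id),
    }
    if gold_stamp:
        links["gold"] = GOLD_CARD_URL.format(gold_stamp)
    return links


def map_gold_stamps(pairs):
    """Map Genome Project ids to GOLD stamps.

    pairs: iterable of (gold_stamp, project_id) tuples (assumed input shape).
    Returns (mapping, conflicts): ids with exactly one stamp go into
    mapping; ids with several different stamps go into conflicts so they
    can be resolved by hand.
    """
    seen = {}
    for stamp, project_id in pairs:
        seen.setdefault(project_id, set()).add(stamp)
    mapping = {pid: next(iter(s)) for pid, s in seen.items() if len(s) == 1}
    conflicts = {pid: sorted(s) for pid, s in seen.items() if len(s) > 1}
    return mapping, conflicts


if __name__ == "__main__":
    # Illustrative pairing only; it does not assert that this stamp
    # really belongs to this project id.
    mapping, conflicts = map_gold_stamps([("Gc00457", "12997")])
    print(build_links("12997", "155978", mapping.get("12997")))
    print(conflicts)
```

Reporting conflicts separately, rather than silently picking one stamp, keeps the "resolve any conflicts" step explicit and reviewable.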
2) Work to extract MIGS-compliant information from these sources (for example, the NCBI data already includes information on oxygen requirement)
3) Extend this approach to other types of genomes (metagenomes next)