GSC Change Log
From Genomic Standards Consortium
These updates were originally posted to the gensc-wg-all mailing list.
Planned updates:
- Meeting report drafted and in circulation to co-authors - available upon request
- Implemented MIGS XIDs for tracking evolution of the schema - XIDs written to the current schema give a new ID to every element in the schema
- XIDs now linked into the sourceforge tracker for future discussion
- GSC Toolkit containing generic version of the XID writer, google maps, lat/long converter
- INSDC<->MIGS mapping posted
- OBI (ontology) <->MIGS(ontology) document posted
- Basic validation of reports against current schema
- Ability to edit genome reports after schema changes
- Batch Upload feature (e.g. from spreadsheets)
- link to MIBBI site using GCat
Feb 02, 2007
Proofs of 3rd GSC workshop circulated.
Feb 02, 2007
Dear All
A preview of the new Genome Catalogue web site is available: http://darwin.nox.ac.uk:8080/sandbox/gcat
The database has been populated with existing reports plus a batch upload of reports auto-generated from the NCBI resource listing completed microbial genomes: http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi
A function will be added to allow people to 'claim' these reports and take ownership of the report.
The next release of the Genome Catalogue is due to be published in the next 2-3 weeks. A full description of the new features in the Genome Catalogue will be provided then - if you would like further information beforehand please let me know.
Tanya
Jan 27, 2007
Dear All,
A few updates:
1. The MIGS paper continues to be part of the Nat Biotech community consultation process - please advertise it as appropriate. Sorry for the delay in getting this message out, but I know many of you are already aware of it. To get MIGS accepted for publication we have to successfully make it through the consultation. There is an email address on the site for posting comments. Thus far, we haven't received any from the wider community. There are also a variety of other proposals for consideration so please consider contributing. Many thanks to HUPO for leading this issue (see HUPO announcement at end of this email):
Nat Biotech: http://www.nature.com/nbt/consult/index.html
2. The workshop report from Sept has been accepted and is in press at Comparative and Functional Genomics.
3. We are soon to set the date for the 4th workshop in Cambridge. As discussed at the 3rd workshop, it will hopefully be in early June. We are just waiting to hear back from NIEeS on exact dates they can host.
4. The Genomes Online Database has added links to GCat identifiers - they can be seen on the 'gold_card' page for each genome. For example:
GOLD: http://genomesonline.org/GOLD_CARDS/Gc00141.html
5. We have selected the GSC logo and it will appear shortly on the updated GCat site. Many thanks to all who contributed and who voted! Many thanks to Dave Hancock for putting his artistic touch to good use for us! As Dan Haft eloquently stated in his email vote, he selected it because "The letters GSC are intertwined with the DNA, because DNA sequence reports and GSC reporting standards should be that closely tied." Many of you felt the same.
6. Links pointing to the GSC website from community websites will help advertise its existence. Please think about linking to it from the websites you hold - many thanks to Nikos and GOLD for being perhaps the first to do so in the "Links" page:
GOLD: http://www.genomesonline.org/links.htm
7. Norman Morrison and I will be attending the OBI workshop this week in San Diego to represent the GSC - we'll post an update following the workshop.
OBI: http://obi.sf.net
8. The MICheck website cataloguing checklist has had a name change to MIBBI and the website is starting to accumulate content. MIGS has been officially registered.
MIBBI: http://mibbi.sf.net
all the best for 2007
Dawn
=========================
Dear Colleague,
The HUPO Proteomics Standards Initiative (PSI) defines community standards for data representation in proteomics to facilitate data comparison, exchange and verification. As of today, four manuscripts from the HUPO PSI, as well as three more manuscripts with participation from PSI, are offered for community consultation by Nature Biotechnology.
The manuscripts are under consideration for publication by Nature Biotechnology and represent the results of open community standards initiatives. To broaden the basis of the standards process, the Nature Biotechnology website is making the manuscripts available for public comment and participation in the pre-publication stage.
Please review the manuscripts at
http://www.nature.com/nbt/consult/index.html
and participate in the development of the standards by sending your comments to biotech[at]natureny.com.
--
Happy New Year,
Henning Hermjakob for all participants of the HUPO Proteomics Standards Initiative
Jan 18, 2007
Update from Tanya: http://darwin.nerc-oxford.ac.uk/gc_wiki/index.php/News_Update
Jan 12, 2007
The 'one-stop-shop' for MIxxx checklist is starting to take shape and MIGS is registered. Thanks to Peter for being the official GSC co-ordinator within the project.
http://micheck.sourceforge.net/
MICheck is soon to change its name as there is already a MICheck project - all thoughts welcome! Current front runner is "Minimum Information about a Biological Investigation" (MIBBI)...(http://mibbi.sf.net)
We soon hope to see more MIxxx checklists register.
Oct 2006
We will soon delete all entries that don't have formal GCat identifiers from the live version of the catalogue found here.
http://darwin.nox.ac.uk/gsc/gcat/intro-view-reports
If you would like to convert test reports into real reports, GCat identifiers are now formally implemented and anyone can request or reserve one just by asking (send an email to curator@ce...). We are still looking for more examples, especially if soon going into press. Otherwise, please delete any test entries from the workshop.
Here is a list of reserved GCat identifiers - we don't have all the reports yet but they are promised.
http://darwin.nerc-oxford.ac.uk/gc_wiki/index.php/Reserved_GCat_identifiers
Special thanks to Frank Oliver and Mike Allen!
Oct 2006
Hello all,
please find below a list of updates for the GSC and the Genome Catalogue.
If you would like to contribute to discussions on the development of the Genome Catalogue and implementation of the MIGS specification, please subscribe to the gensc-devel@li... project mailing list: https://lists.sourceforge.net/lists/listinfo/gensc-devel
Tanya
GSC Change Log
- latest version of XML schema and code deposited in sourceforge: http://gensc.cvs.sourceforge.net/gensc/
- agreement that the main site on http://gensc.sf.net is now the production version of GCat and will undergo only periodic code updates from the sandbox
- >10 new pages added to the GSC Wiki - in particular, extensive pages on feature requirements, added ability to upload files
- updated MIGS to 1.1 and posted to website, updated MIGS XML schema to 1.1, all changes in new MIGS Change Log, most fields are now captured using terms from a CV
- GCat identifiers implemented as discussed at workshop
- speaker slides posted to web (thanks to NIEeS!) and final version of agenda available on Wiki - see GSC Meetings
- Web site visitor statistics are available - please email curator@ce...
Genome Catalogue Change Log
Updates to the Genome Catalogue:
...............................................................
Genome Map: interactive mapping of genomes on a global map, with markers linked to genome reports (for those with GCat identifiers) and report authors. Uses Google Maps and requires latitude and longitude values in the database: http://darwin.nox.ac.uk/gsc/gcat/map/genomemap.html
Genome Geographical Location: an output of the Genome Map function. The latitude and longitude of genomes in the Genome Catalogue are available as an XML file at http://darwin.nox.ac.uk/gsc/gcat/genomemap
Web services: REST-style web services are supported for an increasing number of resources, including:
-- retrieve the XML schema that has been uploaded and selected to generate report input forms: http://darwin.nox.ac.uk/gsc/gcat/xsd
-- retrieve latitude/longitude coordinates for all reports in the Genome Catalogue: http://darwin.nox.ac.uk/gsc/gcat/genomemap
-- retrieve user contact details and activity in HTML, e.g. http://darwin.nox.ac.uk/gsc/gcat/user/dfield
-- retrieve genome report in XML or HTML using the GCat identifier, e.g. http://darwin.nox.ac.uk/gsc/gcat/report/021_GCAT and http://darwin.nox.ac.uk/gsc/gcat/report/021_GCAT/xml
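As an illustration of the report URLs listed above, here is a minimal Python sketch that builds them from a GCat identifier. The helper name report_url is hypothetical, and the assumption that the URL without a suffix returns HTML is inferred from the two examples rather than stated anywhere; the service itself may no longer be online.

```python
from urllib.parse import quote

# Base URL copied from the change log entries above.
GCAT_BASE = "http://darwin.nox.ac.uk/gsc/gcat"


def report_url(gcat_id: str, fmt: str = "html") -> str:
    """Build a genome report URL from a GCat identifier.

    Hypothetical helper. Assumes /report/<id> serves HTML and
    /report/<id>/xml serves XML, as the two example URLs suggest.
    """
    if fmt not in ("html", "xml"):
        raise ValueError("fmt must be 'html' or 'xml'")
    url = f"{GCAT_BASE}/report/{quote(gcat_id, safe='')}"
    return url if fmt == "html" else f"{url}/xml"


if __name__ == "__main__":
    print(report_url("021_GCAT"))
    print(report_url("021_GCAT", "xml"))
```

Keeping the identifier as the only variable part of the URL is what lets other sites (GOLD, for example) link to a report knowing nothing but its GCat identifier.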
Updates to the Genome Catalogue sandbox:
...............................................................................
A sandbox version of GCat has been created for testing purposes. The location of the sandbox is http://darwin.nox.ac.uk/sandbox/gcat. The sandbox can be used freely. It requires a user account separate from the Genome Catalogue account.
New features included in the sandbox are listed below:
Function to transform XML schema and insert XID attribute for each element in the schema, with a fixed value to uniquely identify that element http://darwin.nox.ac.uk/sandbox/gcat/xid
Function to transform XML schema and insert XCOM and XPUB attributes for each element in the schema, to allow capture of comments and a PubMed ID for each MIGS input field in the genome report XML instance: http://darwin.nox.ac.uk/sandbox/gcat/xcomxpub
Web services
-- query genome reports by element name and value e.g. http://darwin.nox.ac.uk/sandbox/gcat/view-reports/taxonomic_group/Metagenome
-- retrieve the XML schema that has been uploaded and selected to generate report input forms, transformed to CSV file including links to element definition pages http://darwin.nox.ac.uk/sandbox/gcat/xsd2csv
-- retrieve information on a selected XML instance element using an XID, e.g. http://darwin.nox.ac.uk/sandbox/gcat/element/6000034_xid
-- download genome report as CSV, e.g. http://darwin.nox.ac.uk/sandbox/gcat/report/0054_GCAT/csv
Active hyperlinks in HTML format genome reports for email addresses, URLs prefixed by http://, and taxids
Genome report validation against the active XML schema in the Edit Report Page
Browse page lists reports with the most recent changes, and also provides a summary of reports in the database with links to a page listing reports selected by taxon group
Genome report XML instance elements no longer contain attributes that are not defined in the XML schema, e.g. @help, @id. This has removed attributes that would prevent XML instance validation against the XML schema.
A more accurate search function has resulted from removal of attributes from the genome report XML instances that contained text describing the MIGS element.
GSC Toolbox added with generic function to transform latitude and longitude in degrees-minutes-seconds format to decimal representation.
Re-introduced calendar for date field, to standardise date input format
GSC blog: trial blog, looking at how useful it might be to convey news on GSC and the Genome Catalogue
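To make the XID function in the feature list above more concrete, here is a minimal Python sketch of the idea: walk an XML schema and give every xs:element a unique identifier. This is not the GCat implementation. The function name add_xids, the sequential counter and the plain xid attribute are assumptions for illustration; only the "_xid" suffix is taken from the example URL above. The log does not say how the real XID writer stores the value, and a strict XSD processor would expect such an extra attribute to be namespace-qualified.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"


def add_xids(xsd_text: str, start: int = 1) -> str:
    """Return the schema text with an 'xid' attribute on every xs:element.

    Illustrative only: identifiers are assigned sequentially in document
    order, and elements that already carry an xid are left unchanged.
    """
    ET.register_namespace("xs", XS)  # keep the familiar xs: prefix on output
    root = ET.fromstring(xsd_text)
    counter = start
    for elem in root.iter(f"{{{XS}}}element"):
        if "xid" not in elem.attrib:
            elem.set("xid", f"{counter}_xid")
            counter += 1
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    schema = (
        '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">'
        '<xs:element name="genome_report"><xs:complexType><xs:sequence>'
        '<xs:element name="taxonomic_group"/>'
        "</xs:sequence></xs:complexType></xs:element></xs:schema>"
    )
    print(add_xids(schema))
```

Because an identifier is attached to the element rather than derived from its name or position, it can in principle stay stable when an element is renamed or moved between schema versions, which is the point of tracking schema evolution this way.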
In Development
............................
capture of comments and PubMed IDs for each input form field
batch upload function
advanced search functions
Feature requests can be posted on the sourceforge project web site : http://sourceforge.net/projects/gensc
Oct 11, 2006
-- REST-style URLs, e.g. search by element name/element value, CSV format for report
-- relevance fields now textarea
-- hyperlinked email addresses, http:// URLs and taxids in HTML report
-- report validation on edit report page
-- CSV format for individual reports
-- transformation of the live xml schema to a csv representation /gsc/gcat/xsd2csv
-- browse page, list last ten modified reports in descending order
-- provides summary for each taxonomic group, and link to page listing reports for a selected taxon group
-- removed @help and @id attributes from genome reports – as a result, the search function is more accurate, and attributes that would prevent the instance validating against the schema are gone
-- function to add xcom and xpub attributes to schema to allow capture of comments and PubMed ID in the genome report instance – transforms live schema via URL, i.e. gsc/gcat/xcomxpub
-- xid function to insert xid attributes into xml schema
-- toolbox – added function to calculate decimal lat/long from degrees-minutes-seconds representation
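The toolbox conversion mentioned in the list above is simple arithmetic: decimal degrees = degrees + minutes/60 + seconds/3600, negated for the southern and western hemispheres. A minimal Python sketch follows; the function name and signature are hypothetical and are not taken from the GSC Toolbox code.

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float, hemisphere: str) -> float:
    """Convert degrees-minutes-seconds to signed decimal degrees.

    hemisphere is one of N, S, E, W; S and W give negative values.
    Hypothetical helper, not the GSC Toolbox implementation.
    """
    hemisphere = hemisphere.upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    if degrees < 0 or not (0 <= minutes < 60) or not (0 <= seconds < 60):
        raise ValueError("expected degrees >= 0 and minutes/seconds in [0, 60)")
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


if __name__ == "__main__":
    # 51 degrees 30 minutes North, 1 degree 15 minutes West
    print(dms_to_decimal(51, 30, 0, "N"), dms_to_decimal(1, 15, 0, "W"))
```

Signed decimal values are the form that mapping tools such as Google Maps accept directly, which is why the Genome Map depends on coordinates stored this way.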
Sept 2006
Dear All,
Many thanks again to everyone contributing to the GSC (including those not at the workshop).
1) Thanks to Ian Frame, the slides from the workshop can already be found online at
http://archive.niees.ac.uk/cgi-bin/event_list.cgi
2) If anyone has notes from the workshop and would like to be involved in the write up of the meeting report please let me know!
3) Tatiana has just presented a slide announcing the GSC at the Genome Informatics meeting in Hinxton (will post to Wiki). The content also largely summarizes the outcome of the workshop in brief. Comments and feedback welcome. Let us know if you re-use it in any form.
Towards richer descriptions of our collection of genomes and metagenomes: The Genomic Standards Consortium
The goal of this international community is to promote mechanisms that standardize the description of genomes and exchange and integration of data
- Currently working on a "Minimum Information about a Genome Sequence" (MIGS) specification
- As a result of the 3rd GSC workshop held Sept 11-13 2006 soon to release new resources:
- Version 1.0 of MIGS for wider public consultation (0.9 currently posted)
- Alpha stage data capture software implemented online called the Genome Catalogue (GCat)
- First GCat ID reserved for accepted genome paper (000001_GCAT) and agreement from EMBL/Genbank/DDBJ curators to work with submitters to produce MIGS compliant genome reports in addition to complete INSDC submission
- MIGS / INSDC mapping
- Dump of EMBL and Genbank genomes into GCat for the sake of quantifying the usage of optional qualifiers (like lat_lon for geographic location) and comparison as a first step towards a single, global list of genomes
Sept 2006
- google maps feature added - any entries with valid latitude and longitude can be viewed on a world map with hyperlinks to genome reports with gcat identifiers: http://darwin.nox.ac.uk/gsc/gcat/map/genomemap.html
- latest version of XML schema and code deposited in sourceforge: http://gensc.cvs.sourceforge.net/gensc/
- 'sandbox' version of GCat created for testing the latest code: http://darwin.nox.ac.uk/sandbox/gcat
- agreement that the main site on http://gensc.sf.net is now the production version of GCat and will undergo only periodic code updates from the sandbox
- >10 new pages added to the GSC Wiki - in particular, extensive pages on feature requirements, added ability to upload files
- updated MIGS to 1.1 and posted to website, updated MIGS XML schema to 1.1, all changes in new MIGS Change Log, most fields are now captured using terms from a CV
- REST-style web services defined for accessing an increased number of resources in the Genome Catalogue (e.g. GCat identifiers, the latitude and longitude of all genomes in the database, etc.)
- GCat identifier resolution service implemented (supports access by URL containing identifier)
- GCat identifiers implemented as discussed at workshop
- speaker slides posted to web (thanks to NIEeS!) and final version of agenda available on Wiki - see GSC Meetings