GCDML
From Genomic Standards Consortium
The Genomic Contextual Data Markup Language (GCDML) is a core project of the Genomic Standards Consortium (GSC) that implements the “Minimum Information about a Genome Sequence” (MIGS) specification and its extension, the “Minimum Information about a Metagenome Sequence” (MIMS).
Aims of GCDML
In overview, MIGS/MIMS will be central to GCDML, and GCDML will provide the GSC’s official implementation of the checklist. Beyond the minimum descriptors of MIGS/MIMS, GCDML will be open and extensible so that it can evolve with the needs of the community.
From Minimum Reporting...
The first step taken by this international community has been to define the “Minimum Information about a Genome Sequence” (MIGS) and “Minimum Information about a Metagenome Sequence” (MIMS) specifications. Use of MIGS/MIMS will provide a mechanism for capturing a consensus-driven minimum set of metadata describing aspects of genomes and metagenomes, such as the geographic location and habitat type from which the sample was taken, as well as the details of the sequencing method used.
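As an illustration of what a minimum checklist means in practice, the sketch below checks a record against a set of required descriptors. The three field names are loosely based on the examples in the paragraph above and are hypothetical; they are not the official MIGS/MIMS field identifiers.

```python
# Hypothetical descriptor names, chosen for illustration only;
# the real MIGS/MIMS checklist defines its own fields.
REQUIRED_DESCRIPTORS = ("geographic_location", "habitat", "sequencing_method")


def missing_descriptors(record):
    """Return the required descriptors that are absent or empty in record."""
    missing = []
    for name in REQUIRED_DESCRIPTORS:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# A made-up record that lacks the sequencing method.
sample = {"geographic_location": "North Sea", "habitat": "marine"}
print(missing_descriptors(sample))
```

A record passes this kind of check only when every required descriptor is present and non-empty; anything beyond the minimum is left unconstrained.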
... To Maximum Reporting
It is the aim of the GSC to provide support for the richer capture of contextual data describing genomes and metagenomes by developing the Genomic Contextual Data Markup Language (GCDML). The support of maximum reporting of such projects, though, will require a much richer set of descriptors. Such descriptors must cover both the origin and processing of a sample, from the time of sampling up to sequencing, and the subsequent analysis.
GCDML seeks to specifically support ‘maximal’ reporting of contextual data and the desire of groups in the GSC to include more descriptors beyond the minimal MIGS/MIMS.
What is Contextual Data?
GCDML focuses on the metadata that describe genomes and metagenomes: the geographic location and habitat type from which the sample was taken, the processing of the sample from the time of sampling up to sequencing, and the subsequent analysis.
This suite of metadata is collectively referred to here as contextual data.
Using XML for Modeling Contextual Data
GCDML is implemented using XML Schema. GCDML aims to take full advantage of the benefits of an XML representation of genomic contextual data. XML provides a machine-readable representation of metadata that facilitates the capture, exchange and comparison of large amounts of data. XML is widely used to build data capture and exchange formats.
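A minimal sketch of the capture, exchange and comparison cycle, using Python's standard xml.etree.ElementTree module. The element names used here (contextualData, geographicLocation, habitat) are made up for this example and do not come from the GCDML schema.

```python
import xml.etree.ElementTree as ET


def to_xml(record):
    """Serialize a flat dict of descriptors into an XML string."""
    root = ET.Element("contextualData")  # hypothetical root element
    for name, value in sorted(record.items()):
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Parse the XML string back into a flat dict of descriptors."""
    return {child.tag: child.text for child in ET.fromstring(text)}


def differing_descriptors(a, b):
    """Return the names of descriptors whose values differ between a and b."""
    return sorted(k for k in set(a) | set(b) if a.get(k) != b.get(k))


# Two made-up records with hypothetical descriptor names.
sample_a = {"geographicLocation": "North Sea", "habitat": "marine"}
sample_b = {"geographicLocation": "North Sea", "habitat": "soil"}

exchanged = from_xml(to_xml(sample_a))  # round trip through XML
print(differing_descriptors(exchanged, sample_b))
```

Because both records share one machine-readable structure, the comparison reduces to a lookup by element name rather than free-text matching.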
GCDML Satellite Meeting 2008
A technical meeting on GCDML was held on October 13 and 14, 2008, prior to the main GSC 6 meeting. The co-organizers were Renzo Kottmann, Peter Sterk and Dawn Field.
A lively discussion led to a list of changes to GCDML 1.6.0, all of which are incorporated in GCDML version 1.7.0.
See GCDML Satellite Meeting 2008 for details.
Download the agenda (PDF version) and presentation slides.
GCDML Publication
A paper published in the OMICS special issue gives further details on the scope and general design decisions of GCDML.
- You can download the paper from OMICS and the corresponding PubMed entry.
To cite this paper:
Renzo Kottmann, Tanya Gray, Sean Murphy, Leonid Kagan, Saul Kravitz, Thierry Lombardot, Dawn Field, Frank Oliver Glockner. OMICS: A Journal of Integrative Biology. June 1, 2008, 12(2): 115-121. doi:10.1089/omi.2008.0A10.
Any feedback is welcome.
Contact
- For specific GCDML topics, mail gensc-gcdml at lists.sourceforge.net.
- To follow the discussions, subscribe at https://lists.sourceforge.net/lists/listinfo/gensc-gcdml
- Mail archives of the gensc-gcdml mailing list
- Participation in telecons is open to everybody interested.
Documentation
The most up-to-date information on GCDML is on the SourceForge web pages: http://gensc.sourceforge.net/gcdml/
Link to import the XSD into, e.g., Oxygen XML: http://gensc.sf.net/ns/gcdml/1.7.0/base/gcdml.xsd
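For readers without an XML editor, one way to get a first look at a schema file is to list its top-level element declarations. The sketch below does this with the Python standard library. It assumes the XSD text is already available locally; the tiny inline schema is a stand-in for illustration, not an excerpt from GCDML, and its element names are hypothetical.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"  # W3C XML Schema namespace


def global_element_names(xsd_text):
    """Return the names of elements declared directly under xs:schema."""
    root = ET.fromstring(xsd_text)
    return [el.get("name") for el in root.findall(f"{{{XSD_NS}}}element")]


# Stand-in schema for illustration; element names are hypothetical.
EXAMPLE_XSD = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="report" type="xs:string"/>
  <xs:element name="sample" type="xs:string"/>
</xs:schema>"""

print(global_element_names(EXAMPLE_XSD))
```

Only declarations that are direct children of the schema root are listed; elements nested inside complex types, or defined in imported schema files, are not followed by this sketch.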
GCDML Development
GCDML is actively developed by members of the GSC. Further information can be found here.
Releases
Several releases have been made; 1.7.0 is the most recent.
Release 1.7.0
This release includes:
- all changes identified and discussed during the 6th GSC Meeting
- an XSLT update file to transform 1.6.0 report files to 1.7.0
- JAXB binding files and Ant targets to auto-generate a Java API for use in other software projects
Release 1.6.0
This release marks several major improvements since the last workshop. New features include:
- Two kinds of reports are available now
- Implementation of Habitat-Lite
- Detailed use of controlled vocabulary for units of measurement
- Improved documentation within schema and in docbook
- Bugfixes
Documentation is being written; feedback and additions are welcome.
Documentation is available as:
HTML: http://gensc.sourceforge.net/gcdml/1.6.0/doc
PDF: http://gensc.sourceforge.net/gcdml/1.6.0/doc/gcdml_single_doc.pdf
Docbook5: http://gensc.sourceforge.net/gcdml/1.6.0/doc/gcdml_single_doc.xml
The documentation is a work in progress and is not yet up to date.
It is recommended to upgrade to this schema version.
GCDML Examples
Phage genomes in GCDML can be found at the MegX database: http://www.megx.net/gcdml/gcdml.html