Habitat-Lite
From Genomic Standards Consortium
NEWS: The RDP is running a user survey on the habitat terms that matter most to users. Visit the RDP website and the quick survey will launch automatically in your browser: http://rdp.cme.msu.edu
NEWS: Habitat-Lite paper available: PubMed
Towards a consensus-driven Habitat-Lite: a short list of high-level terms for describing habitat
Introduction
The GSC is interested in the description of samples, including their habitat. We are therefore exploring the creation of a limited list of terms for describing habitat.
Increasingly, short lists of habitat terms are being used to annotate databases and undertake a variety of analyses. These lists are continually being developed anew because there is not yet a central place to deposit and compare them.
We are:
1. collecting habitat lists in use within the GSC community
2. determining the overlaps in these lists
3. assessing whether it is feasible to 'merge' (unify) these lists into a single, short list that might be widely adopted for its broad scope and suitable coverage of a wide range of samples (genomes, metagenomes, 16S, etc.).
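The overlap step above can be sketched in a few lines of Python. This is only an illustration: the three lists and their names (`database_a`, etc.) are hypothetical placeholders, not lists collected from the GSC community, and the normalization rule is an assumption about how spelling variants might be reconciled.

```python
from collections import Counter


def normalize(term):
    """Lower-case and unify hyphens/whitespace so 'Hot-Spring' matches 'hot spring'."""
    return " ".join(term.lower().replace("-", " ").split())


def term_usage(lists):
    """Count in how many source lists each normalized term appears."""
    usage = Counter()
    for terms in lists.values():
        # Use a set so a term repeated within one list is counted once.
        usage.update({normalize(t) for t in terms})
    return usage


# Hypothetical habitat lists, for illustration only.
example_lists = {
    "database_a": ["Soil", "Marine", "Hot spring", "Host"],
    "database_b": ["soil", "marine", "freshwater", "hot-spring"],
    "database_c": ["Soil", "Sediment", "Freshwater"],
}

usage = term_usage(example_lists)
n_lists = len(example_lists)

shared_by_all = sorted(t for t, n in usage.items() if n == n_lists)
shared_by_some = sorted(t for t, n in usage.items() if 1 < n < n_lists)
unique_to_one = sorted(t for t, n in usage.items() if n == 1)
```

Terms shared by every list are natural candidates for a merged list; terms unique to one list are the ones that would need discussion.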
List of terms describing habitat
A list of illustrative sources of habitat information is included in the OMICS paper and in this document: http://gensc.org/gc_wiki/index.php/Image:Table_of_habitat_terms.doc
The actual terms will be added to the wiki in the future.
Habitat-Lite Version 0.1
A first-pass version of Habitat-Lite is below. All terms were selected from the Environment Ontology (EnvO), in which the GSC participates.
Biome
1   freshwater            ENVO:00000873
2   marine                ENVO:00000447
3   terrestrial           ENVO:00000446

Environment of sample (basic descriptors, tailored to current genome and metagenome data sets)
4   soil                  ENVO:00001998
5   water                 ENVO:00002006
6   air                   ENVO:00002005
7   sediment              ENVO:00002007
8   sludge                ENVO:00002044
9   waste water           ENVO:00002007
10  hot spring            ENVO:00000051
11  hydrothermal vent     ENVO:00000215
12  organism-associated   ENVO:00002032
13  extreme environment   ENVO:00002020
14  food                  ENVO:00002002
15  biofilm               ENVO:00002034
16  microbial mat         ENVO:01000008
17  fossil                ENVO:00002164
There is also a .obo file (as a .txt file): Habitat-Lite version 0.1 in .obo format
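For readers unfamiliar with the format, the sketch below writes a few of the terms above as minimal OBO `[Term]` stanzas. It is not a copy of the linked file, whose contents are not shown on this page; the function name `to_obo` is hypothetical, and the assumption here is that each term needs only an `id` and a `name` tag. A real .obo file would typically carry more (definitions, `is_a` links, and so on).

```python
# The three biome terms from Habitat-Lite version 0.1, as (id, name) pairs.
BIOME_TERMS = [
    ("ENVO:00000873", "freshwater"),
    ("ENVO:00000447", "marine"),
    ("ENVO:00000446", "terrestrial"),
]


def to_obo(terms):
    """Render (id, name) pairs as minimal OBO 1.2 text: a header, then one stanza per term."""
    lines = ["format-version: 1.2", ""]
    for term_id, name in terms:
        lines += ["[Term]", f"id: {term_id}", f"name: {name}", ""]
    return "\n".join(lines)


obo_text = to_obo(BIOME_TERMS)
print(obo_text)
```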
Suggest Improvements and issues
Notes following Lynette's first-pass search of the "isolation_field" of GenBank documents:
1. We probably need to add "Aquatic" (it lumps marine and freshwater together, but is commonly used).
2. "Organism-associated" will appear as 'host' or 'host-associated' in current records. EnvO chose the broader term because 'host-associated' suggests a pathogen/host relationship (i.e. it would exclude symbionts).
3. Conclusions related to this version of Habitat-Lite are as follows (based on Habitat-Lite OMICS paper):
• The set of terms should support certain inferences useful for search; for example, that a sample labeled soil is also terrestrial, or that a sample from a hydrothermal vent is also extreme.
• Consistent annotation requires guidelines for general terms such as terrestrial and aquatic (currently not present in Habitat-Lite), to instruct annotators to annotate to the most specific term possible.
• The notion of extreme environment is problematic in that it should be annotated in addition to a more specific term, such as hot spring – thus requiring that certain entries be associated with two Habitat-Lite terms.
• Organism-associated needs to be sub-divided by linking out to other ontologies or controlled vocabularies (specifically, a taxon hierarchy and perhaps a high level anatomy ontology).
• Fossil is an example of a currently infrequently used term, but a candidate “exceptional importance” term that could be useful in the future for searching.
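The search inferences described in the conclusions above could be supported by a simple "implies" table, sketched here in Python. The three links shown are only the examples mentioned in the text (soil implies terrestrial; hydrothermal vent and hot spring imply extreme environment); they are not taken from the EnvO hierarchy, and the names `IMPLIES`, `expand` and `matches` are hypothetical.

```python
# Illustrative links only, based on the examples given in the conclusions.
IMPLIES = {
    "soil": {"terrestrial"},
    "hydrothermal vent": {"extreme environment"},
    "hot spring": {"extreme environment"},
}


def expand(labels):
    """Return the given labels plus every term they imply, directly or indirectly."""
    result = set()
    stack = list(labels)
    while stack:
        term = stack.pop()
        if term not in result:
            result.add(term)
            stack.extend(IMPLIES.get(term, ()))
    return result


def matches(sample_labels, query):
    """True if a sample annotated with sample_labels should be found by a search for query."""
    return query in expand(sample_labels)


soil_terms = expand(["soil"])
vent_is_extreme = matches(["hydrothermal vent"], "extreme environment")
soil_is_marine = matches(["soil"], "marine")
```

With such a table an annotator records only the most specific term (for example, hot spring), and the second, more general term (extreme environment) is derived at search time rather than entered by hand.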
Open Call for Participation
We are making an open call for evaluation of this list of habitat terms so that we can make a consensus-driven version of the list that best suits community needs. This list of terms would then be implemented in GCDML [ref] and used in the first instance to fill the “Habitat” field of the MIGS-compliant Genome Catalogue database (http://gensc.org).
Please post comments into the wiki or write to the lead author of this project: Lynette Hirschman, lynette@mitre.org