The Environment Ontology (EnvO) Project

The Environment Ontology (EnvO) Project Proposal April 2012

Project Title: The Environment Ontology (EnvO) Project

Project Lead: Pier Luigi Buttigieg (Max Planck Institute for Marine Microbiology, Jacobs University Bremen gGmbH)

Team members:

  • Norman Morrison (NERC Environmental Bioinformatics Centre, The University of Manchester)
  • Dawn Field (NERC Centre for Ecology and Hydrology, The University of Oxford)
  • Suzanna Lewis (Lawrence Berkeley National Laboratory)
  • Barry Smith (National Center for Biomedical Ontology / University at Buffalo)
  • Michael Ashburner (Department of Genetics, University of Cambridge)

See also list of attendees at the Inaugural EnvO Workshop (co-sponsored by the GSC and the NERC Environmental Bioinformatics Centre) and EnvO Workshops II (CSHL), III (Manchester) and IV (Dagstuhl)

Elevator pitch: EnvO is a community-developed ontology for the standardised description of environments associated with specimens, samples and observations (SSOs). This is its primary use. A secondary use is to provide a controlled vocabulary for the description of environments of all sorts and at all scales, including environments that are themselves the subject of metagenomic studies (whether such studies are equivalent to metagenomic studies of samples remains an open question). The more communities that are involved, the more we can profit from lessons learned.

EnvO provides researchers with a tool to concisely contextualise their data, even when measurement of environmental parameters was not possible. EnvO may be applied to both contemporary and legacy SSO data.

Project Summary: Data associated with biological specimens, samples and observations (SSOs) can be accessed through a number of different routes. Examples include an SSO’s molecular signature (e.g. its DNA, RNA or protein sequence), geographic coordinates, phenotypic characteristics, timestamp, or the characteristics of its environment.

Descriptions of an SSO’s environment are rarely well controlled and are therefore difficult to rely upon for data access or comparison. However, these descriptions are a vital source of contextual information, particularly when exact measurements of environmental parameters are not available. EnvO addresses this issue by providing an ontology for the standardised description of environments associated with an SSO. The ontology comprises syntactically and semantically controlled terms maintained in a logically coherent structure. Researchers may use these terms to annotate their SSOs with robust and consistent descriptors. Once this key field is standardised, other aspects of computational knowledge organisation can follow, such as consistent access to, retrieval, integration and linking of, and reasoning over data.

Which existing projects, if any, does this one replace/complement/subsume? When EnvO was conceived in 2007 at ISMB Vienna, there was no formal ontology for contextualising the environment of SSOs. To date, a number of controlled vocabularies of environmental descriptors are available, but none has the specific scope of describing the environment of SSOs, and none is as comprehensive as EnvO.

Explain briefly why an extra project is needed/justified: As noted above, EnvO is already a component of a number of GSC projects; however, we believe that a dedicated project would better coordinate EnvO’s existing working group with other GSC projects.

How does this project fit into GSC’s mission statement? Part of the GSC’s mission statement is to provide highly contextualised genomic data to the research community. There has been a long history of, and demand for, environmental contextualisation as part of the drive towards GSC MIxS compliance. Describing environments in a consistent manner will provide new scientific insights by placing well-annotated genomic information in the public domain. EnvO may also be applied retroactively: legacy SSO data may be annotated with EnvO terms based on full-text descriptions (when available) of the SSO’s environment. Thus, EnvO may be used to standardise and re-mobilise both contemporary and legacy genomic data.

Have you spoken about the project already within GSC? EnvO has been a core part of GSC meetings and MIxS development since 2007. There have been a number of EnvO presentations at GSC meetings, and EnvO has also spawned other projects such as EnvO-lite. After Barry Smith attended GSC 12, we initiated discussions about whether EnvO could be formally brought under the GSC. Dawn Field brought this option to the GSC Board’s attention at the GSC 12 Board meeting, and the Board voted to accept EnvO in principle if this was the wish of the wider EnvO community. An action was placed on Barry Smith and Norman Morrison to consult the community and reach a decision.

Will you start a GSC working group? The EnvO project has been a working group within the GSC for the past 5 years. We have had 4 official working group meetings in that time. We are now requesting EnvO be officially brought within the GSC umbrella.

How do you wish to further engage the GSC? We wish to continue developing the ontology in collaboration with members of the GSC, in particular the content and developers groups and also the GSC Biodiversity Working Group. A large number of database groups in the GSC could use EnvO, and a number already do so.

Do you already have a website or do you wish to create a home page for the project in the GSC website? We have a website (http://www.environmentontology.org) and have content on the GSC Wiki (http://wiki.gensc.org/index.php?title=EnvO_Project). We are in the process of updating this content on a new website expected to go live in the next few months.

What other resources might you like from what the GSC can offer? EnvO has a mailing list: Obo-envo@lists.sourceforge.net

What kind of timeline are you working to for building consensus, releasing a first version etc? EnvO Version 1 is currently in use and has a number of adopters.

What resources will be required for completion? We hope to seek further funding for EnvO as a part of larger grants and/or as a separate grant.

What are your current plans for publishing/promoting the project? The EnvO paper has been published in the Journal of Biomedical Semantics, see: www.jbiomedsem.com/content/4/1/43

References or relevant websites (for further reading):