Bioinformatics. 2015 Jun 1;31(11):1872-4.
doi: 10.1093/bioinformatics/btv045. Epub 2015 Jan 24.

ENVIRONMENTS and EOL: identification of Environment Ontology terms in text and the annotation of the Encyclopedia of Life

Evangelos Pafilis et al. Bioinformatics.

Abstract

The association of organisms to their environments is a key issue in exploring biodiversity patterns. This knowledge has traditionally been scattered, but textual descriptions of taxa and their habitats are now being consolidated in centralized resources. However, structured annotations are needed to facilitate large-scale analyses. Therefore, we developed ENVIRONMENTS, a fast dictionary-based tagger capable of identifying Environment Ontology (ENVO) terms in text. We evaluate the accuracy of the tagger on a new manually curated corpus of 600 Encyclopedia of Life (EOL) species pages. We use the tagger to associate taxa with environments by tagging EOL text content monthly, and integrate the results into the EOL to disseminate them to a broad audience of users.

Availability and implementation: The software and the corpus are available under the open-source BSD and the CC-BY-NC-SA 3.0 licenses, respectively, at http://environments.hcmr.gr.
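
To illustrate what "dictionary-based tagging" means in general, the following is a minimal sketch and not the ENVIRONMENTS implementation, whose internals the abstract does not describe. It assumes a toy dictionary with placeholder identifiers (not real ENVO accessions) and a greedy, case-insensitive, longest-match strategy; the names `TOY_DICTIONARY` and `tag_text` are hypothetical.

```python
import re

# Toy dictionary: lowercase surface form -> identifier.
# The identifiers are placeholders, NOT real ENVO accession numbers.
TOY_DICTIONARY = {
    "coral reef": "ENVO:EXAMPLE-1",
    "reef": "ENVO:EXAMPLE-2",
    "forest": "ENVO:EXAMPLE-3",
    "freshwater lake": "ENVO:EXAMPLE-4",
}


def tag_text(text, dictionary=TOY_DICTIONARY):
    """Return (start, end, matched_text, identifier) for each dictionary hit.

    Scans left to right and, at each token, prefers the longest
    dictionary entry that starts there (case-insensitive).
    """
    # Longest entry length in words bounds the lookup window.
    max_words = max((len(name.split()) for name in dictionary), default=0)
    # Word tokens with their character offsets in the original text.
    tokens = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\w+", text)]

    hits = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate window first, then shrink it.
        for n in range(min(max_words, len(tokens) - i), 0, -1):
            candidate = " ".join(tok.lower() for tok, _, _ in tokens[i:i + n])
            if candidate in dictionary:
                start, end = tokens[i][1], tokens[i + n - 1][2]
                hits.append((start, end, text[start:end], dictionary[candidate]))
                i += n  # skip past the matched span so hits never overlap
                matched = True
                break
        if not matched:
            i += 1
    return hits


if __name__ == "__main__":
    for hit in tag_text("This species lives on a coral reef near the forest."):
        print(hit)
```

Preferring the longest match means "coral reef" is reported as one term rather than as the shorter entry "reef". A production tagger would also need to handle plurals, stop words, and ambiguous names, which this sketch does not attempt.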

Figures

Fig. 1.
Top: The “Overview” tab of the EOL taxon pages shows a subset of the ENVO terms obtained through text mining; an extended list of such terms is available in the “Data” tab. Parts of the page have been resized to improve readability. Bottom: The latter list provides links to the EOL text sections where each term was found (highlighted in bold).

