iPhylo

This is a very early demo of iPhylo, a database of people, publications, sequences, specimens, images, taxa, and (ultimately) trees. iPhylo is a descendant of my bioGUID and SemAnt projects. iPhylo shares much with these projects, but drops the use of a triple store in favour of an entity-attribute-value model. Like bioGUID, iPhylo relies on a suite of web services (most external, some I've developed locally) to locate and resolve identifiers.
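To make the entity-attribute-value idea concrete, here is a minimal sketch using SQLite from Python. It is not iPhylo's actual schema: the table name `eav`, the helper functions, and the `sequence:EXAMPLE1` identifier are all hypothetical, chosen only to show how objects with different attributes can share one three-column table.

```python
import sqlite3


def make_store():
    """Create an in-memory EAV table (hypothetical schema, not iPhylo's own)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE eav ("
        "entity TEXT NOT NULL, attribute TEXT NOT NULL, value TEXT NOT NULL)"
    )
    conn.execute("CREATE INDEX eav_entity ON eav (entity)")
    return conn


def add_object(conn, entity, attributes):
    """Store one row per attribute; values are kept as text."""
    conn.executemany(
        "INSERT INTO eav (entity, attribute, value) VALUES (?, ?, ?)",
        [(entity, name, str(value)) for name, value in attributes.items()],
    )


def get_object(conn, entity):
    """Reassemble an object as {attribute: [values, ...]}."""
    rows = conn.execute(
        "SELECT attribute, value FROM eav WHERE entity = ? ORDER BY rowid",
        (entity,),
    )
    obj = {}
    for attribute, value in rows:
        obj.setdefault(attribute, []).append(value)
    return obj


def find_entities(conn, attribute, value):
    """Find every entity that carries a given attribute/value pair."""
    rows = conn.execute(
        "SELECT entity FROM eav WHERE attribute = ? AND value = ? ORDER BY rowid",
        (attribute, value),
    )
    return [row[0] for row in rows]


conn = make_store()
add_object(conn, "doi:10.1073/pnas.0605858103", {"type": "publication"})
# "sequence:EXAMPLE1" is a made-up identifier used only for illustration.
add_object(
    conn,
    "sequence:EXAMPLE1",
    {"type": "sequence", "cited_in": "doi:10.1073/pnas.0605858103"},
)
print(find_entities(conn, "cited_in", "doi:10.1073/pnas.0605858103"))
```

The point of the design is that a new kind of object, or a new attribute, needs no schema change: it is just more rows. The cost is that every attribute is stored as text and has to be interpreted by the application.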

Goals

The goal of iPhylo is to treat biodiversity objects as equal citizens. Each object has a unique identifier and associated metadata, and is linked to other objects (for example, a specimen is linked to sequences, sequences are linked to publications, etc.). By following the links it is possible to generate new views on existing information, such as a map for a study that doesn't have any maps. Below is a map generated for Brady et al. (doi:10.1073/pnas.0605858103), based on links between sequences and specimens. You can start to browse these links in iPhylo by clicking here.

(If you can't see the map you'll need a different browser, such as Firefox 2 or Safari 3)
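The idea of building a map by following links can be sketched in a few lines. This is only an illustration, not iPhylo's code: the link table, the `sequence:` and `specimen:` identifiers, and the coordinates are invented, and the real links would come from the database rather than from a dictionary.

```python
from collections import deque

# Hypothetical links: publication -> sequences -> specimens.
EXAMPLE_LINKS = {
    "doi:10.1073/pnas.0605858103": ["sequence:EX1", "sequence:EX2"],
    "sequence:EX1": ["specimen:EX-A"],
    "sequence:EX2": ["specimen:EX-B"],
}

# Made-up (latitude, longitude) pairs for the made-up specimens.
EXAMPLE_COORDS = {
    "specimen:EX-A": (-1.5, 36.8),
    "specimen:EX-B": (10.0, -84.0),
}


def map_points(start, links, coords):
    """Walk outwards from `start` and collect every georeferenced object."""
    seen = {start}
    queue = deque([start])
    points = []
    while queue:
        node = queue.popleft()
        if node in coords:
            points.append((node, coords[node]))
        for linked in links.get(node, []):
            if linked not in seen:  # guard against cycles in the links
                seen.add(linked)
                queue.append(linked)
    return points


for identifier, (lat, lon) in map_points(
    "doi:10.1073/pnas.0605858103", EXAMPLE_LINKS, EXAMPLE_COORDS
):
    print(identifier, lat, lon)
```

The publication itself has no coordinates, but the specimens two links away do, and that is enough to draw a map for the paper.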

By linking objects together we can also track the provenance of data, and ultimately build "citation networks" of specimens, sequences, etc. For background see my paper "Biodiversity informatics: the challenge of linking data and the role of shared identifiers" (doi:10.1093/bib/bbn022, preprint at hdl:10101/npre.2008.1760.1).

How does it work?

More on this later, but basically iPhylo resolves identifiers for PubMed records, GenBank sequences, museum specimens, publications, etc. and adds the associated metadata to a local database. Wherever possible it resolves any links in the metadata (e.g., if a GenBank record mentions a specimen, iPhylo will try to retrieve information on that specimen). When you view an object in iPhylo, these links are displayed. If a bibliographic record has no identifier, iPhylo will try to find one for it (such as a DOI). It also extracts georeferences for specimens and sequences, either from the original records or by using a georeferencing service. Taxonomic names are resolved using uBio, and are treated as "tags."
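As a rough sketch of the resolve-and-follow-links loop described above (and not the code iPhylo actually runs), the example below classifies an identifier, hands it to a resolver, stores the result, and then resolves whatever that record links to. Everything here is an assumption made for illustration: the prefixes `pmid:` and `genbank:`, the regular expressions (which cover only the simplest forms of each identifier), the `harvest` function, and the stub resolvers, which stand in for calls to real web services.

```python
import re

# Rough, illustrative patterns; real identifier syntaxes have more variants.
PATTERNS = [
    ("doi", re.compile(r"^(?:doi:)?(10\.\d{4,9}/\S+)$", re.IGNORECASE)),
    ("handle", re.compile(r"^hdl:(\S+)$", re.IGNORECASE)),
    ("pubmed", re.compile(r"^pmid:(\d+)$", re.IGNORECASE)),
    ("genbank", re.compile(r"^genbank:([A-Z]{1,2}\d{5,6})$", re.IGNORECASE)),
]


def classify(identifier):
    """Return (kind, local_id), or (None, text) if nothing matches."""
    text = identifier.strip()
    for kind, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return kind, match.group(1)
    return None, text


def harvest(identifier, resolvers, store, max_depth=2):
    """Resolve an identifier, store its record, then follow its links."""
    kind, local_id = classify(identifier)
    if kind is None or max_depth < 0:
        return
    key = f"{kind}:{local_id}"
    if key in store:  # already harvested; also stops cycles
        return
    resolver = resolvers.get(kind)
    record = resolver(local_id) if resolver else None
    if record is None:
        return
    store[key] = record
    for linked in record.get("links", []):
        harvest(linked, resolvers, store, max_depth - 1)


# Stub resolvers returning canned records; a real one would call a web service.
# "AB123456" is a made-up accession used only to exercise the pattern.
stub_resolvers = {
    "doi": lambda local_id: {"links": ["genbank:AB123456"]},
    "genbank": lambda local_id: {"links": ["doi:10.1073/pnas.0605858103"]},
}

store = {}
harvest("doi:10.1073/pnas.0605858103", stub_resolvers, store)
print(sorted(store))
```

The depth limit and the "already harvested" check matter in practice: records link back to each other (a sequence cites the paper that cites the sequence), so an unguarded crawl would never stop.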

Examples

To do

Much too much. This demo doesn't include images or geospatial queries, both of which are implemented, nor does it enable users to add their own data (easy to do) or edit the data (I have experimental code to do this).