Automated species identification

The automated identification of biological objects, whether individuals (such as insects) or groups (e.g., species, guilds, characters), has been a dream among systematists for centuries. The goal of some of the first multivariate biometric methods was to address the perennial problem of group discrimination and inter-group characterization. Despite much preliminary work in the 1950s and '60s, progress in designing and implementing practical systems for fully automated identification of biological objects has proven frustratingly slow. As recently as 2004, Dan Janzen updated the dream for a new audience:

"The spaceship lands. He steps out. He points it around. It says ‘friendly–unfriendly—edible–poisonous—safe–dangerous—living–inanimate’. On the next sweep it says ‘Quercus oleoides—Homo sapiens—Spondias mombin—Solanum nigrum—Crotalus durissus—Morpho peleides—serpentine’. This has been in my head since reading science fiction in ninth grade half a century ago."

The species identification problem

Janzen’s preferred solution to this classic problem involved building machines to identify species from their DNA. His predicted budget and proposed research team were “US$1 million and five bright people.” However, recent developments in computer architectures, as well as innovations in software design, have placed the tools needed to realize Janzen’s vision in the hands of the systematics community not several years hence, but now; and not just for DNA barcodes, but for digital images of organisms too. A recent survey of accuracy results for small-scale trials (<50 taxa) obtained by such systems (Gaston and O’Neill 2004) shows an average reproducible accuracy of over 85 percent, with no significant correlation between accuracy and the number of included taxa or the type of group being assessed (e.g., butterflies, moths, bees, pollen, spores, foraminifera, dinoflagellates, vertebrates). Moreover, these identifications, often involving thousands of individual specimens, can be made in a fraction of the time required by human experts and can be done on site, on demand, anywhere in the world.

These developments could not have come at a better time. As the taxonomic community already knows, the world is running out of specialists who can identify the very biodiversity whose preservation has become a global concern. In commenting on this problem in palaeontology as long ago as 1993, Roger Kaesler recognized:

"… we are running out of systematic paleontologists who have anything approaching synoptic knowledge of a major group of organisms ... Paleontologists of the next century are unlikely to have the luxury of dealing at length with taxonomic problems … [Paleontology] will have to sustain its level of excitement without the aid of systematists, who have contributed so much to its success."

This expertise deficiency cuts as deeply into those commercial industries that rely on accurate identifications (e.g., agriculture, biostratigraphy) as it does into a wide range of pure and applied research programmes (e.g., conservation, biological oceanography, climatology, ecology). It is also commonly, though informally, acknowledged that the technical, taxonomic literature of all organismal groups is littered with examples of inconsistent and incorrect identifications. This is due to a variety of factors, including taxonomists being insufficiently trained and skilled in making identifications (e.g., using different rules-of-thumb in recognizing the boundaries between similar groups), insufficiently detailed original group descriptions and/or illustrations, inadequate access to current monographs and well-curated collections and, of course, taxonomists having different opinions regarding group concepts. Peer review only weeds out the most obvious errors of commission or omission in this area, and then only when an author provides adequate representations (e.g., illustrations, recordings, gene sequences) of the specimens in question.

Systematics too has much to gain, both practically and theoretically, from the further development and use of automated identification systems. It is now widely recognized that the days of systematics as a field populated by mildly eccentric individuals pursuing knowledge in splendid isolation from funding priorities and economic imperatives are rapidly drawing to a close. In order to attract both personnel and resources, systematics must transform itself into a “large, coordinated, international scientific enterprise” (Wheeler, 2003). Many have identified use of the Internet, especially via the World Wide Web, as the medium through which this transformation can be made. While establishment of a virtual, GenBank-like system for accessing morphological data, audio clips, video files and so forth would be a significant step in the right direction, improved access to observational information and/or text-based descriptions alone will not successfully address either the taxonomic impediment or the problem of low identification reproducibility. Instead, the inevitable subjectivity associated with making critical decisions on the basis of qualitative criteria must be reduced or, at the very least, embedded within a more formally analytic context.

Properly designed, flexible, and robust automated identification systems, organized around distributed computing architectures and referenced to authoritatively identified collections of training set data (e.g., images, gene sequences), can, in principle, provide all systematists with access to the electronic data archives and the necessary analytic tools to handle routine identifications of common taxa. Properly designed systems can also recognize when their algorithms cannot make a reliable identification and refer that image to a specialist (whose address can be accessed from another database). Such systems can also include elements of artificial intelligence and so improve their performance the more they are used. Most tantalizingly, once morphological (or molecular) models of a species have been developed and demonstrated to be accurate, these models can be queried to determine which aspects of the observed patterns of variation and variation limits are being used to achieve the identification, thus opening the way for the discovery of new and (potentially) more reliable taxonomic characters.
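The idea of a system that declines to identify a specimen and refers it to a specialist can be illustrated with a minimal sketch. It does not reproduce the method of any system named in this article. It assumes a simple nearest-centroid classifier over numeric feature vectors, and the names `train`, `identify` and `max_ratio`, as well as the taxon labels and measurements, are hypothetical.

```python
import math

def centroid(vectors):
    """Mean feature vector of a list of equal-length numeric vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def train(training_set):
    """Build one centroid per taxon from authoritatively identified specimens.

    training_set maps a taxon name to a list of feature vectors.
    """
    return {taxon: centroid(vectors) for taxon, vectors in training_set.items()}

def identify(model, specimen, max_ratio=0.5):
    """Return the nearest taxon, or None to refer the specimen to a specialist.

    The identification is rejected when the nearest centroid is not clearly
    closer than the second nearest (distance ratio above max_ratio).
    Assumes the model holds at least two taxa; max_ratio is an illustrative
    threshold, not a recommended value.
    """
    ranked = sorted((math.dist(specimen, c), taxon) for taxon, c in model.items())
    (d_best, best), (d_second, _) = ranked[0], ranked[1]
    if d_second == 0 or d_best / d_second > max_ratio:
        return None  # ambiguous: hand over to a human expert
    return best

# Hypothetical two-feature measurements for two made-up taxa.
model = train({
    "Taxon A": [[1.0, 1.0], [1.2, 0.8]],
    "Taxon B": [[5.0, 5.0], [4.8, 5.2]],
})
clear_case = identify(model, [1.0, 1.0])      # close to Taxon A only
ambiguous_case = identify(model, [3.0, 3.0])  # roughly midway between both
```

Here `clear_case` is assigned to "Taxon A", while `ambiguous_case` comes back as `None` and would be passed to a specialist. A practical system would use richer features and a calibrated rejection rule.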

References cited

* Gaston, K. J., and M. A. O'Neill. 2004. "Automated species identification—why not?" Philosophical Transactions of the Royal Society of London, Series B 359:655–667 (see [http://www.journals.royalsoc.ac.uk/(kqmnfs2d54r4owreisnhkg45)/app/home/contribution.asp?referrer=parent&backto=issue,8,19;journal,37,229;linkingpublicationresults,1:102022,1] ).
* Janzen, D. H. 2004. "Now is the time". Philosophical Transactions of the Royal Society of London, Series B 359:731–732 (see [http://www.ucalgary.ca/~dsikes/zool575/readings/Janzen%20(2004).pdf] ).
* Kaesler, R. L. 1993. "A window of opportunity: peering into a new century of paleontology". Journal of Paleontology 67:329–333.
* Wheeler, Q. D. 2003. "Transforming taxonomy". The Systematist No. 22:3–5.

External links

Here are some links to the home pages of three mature species identification systems. While all were initially designed to identify speciose invertebrate groups, the SPIDA and DAISY systems are essentially generic and capable of classifying any image material presented. The ABIS system is restricted to insects with membranous wings, as it operates by matching a specific set of characters based on wing venation.

* [http://research.amnh.org/invertzoo/spida/common/index.htm The SPIDA system]
* [http://www.informatik.uni-bonn.de/projects/ABIS ABIS]
* [http://www.tumblingdice.co.uk/daisy DAISY]


Wikimedia Foundation. 2010.
