The problem with DNA barcoding
Issue 10 of Biodiversity Science introduced DNA barcoding as a method of biodiversity assessment, giving the example of Malaise trapping in the tropical forests of Honduras.
This technology has been revolutionary in enabling species-level biodiversity data to be generated in the absence of taxonomic expertise. However, the barcoding method described in Issue 10 remains costly and time-consuming because each specimen is processed in a separate reaction, which means that data cannot be produced at very large scales.
In the Honduran project, for instance, it took two months to identify 5355 specimens from 16 Malaise trap samples. A well-designed study investigating the biodiversity response to a given change in the environment (while controlling appropriately for additional factors) is likely to require much greater sampling effort than this.
Thankfully, there is now a way to accelerate this process so that data on invertebrate biodiversity can be obtained at very large scales.
Instead of processing each specimen individually, the contents of each trap sample are blended together to form a multi-species ‘soup’. DNA is then extracted en masse from the soup, amplified using universal primers, and sequenced on a ‘next-generation’ sequencing machine. This approach is known as metabarcoding. Next-generation sequencers differ from standard sequencers in that they can process very large numbers of specimens and species in a single reaction. Moreover, many samples can be pooled together in a single run to further reduce time and cost.
Species identifications are made by comparing sequences against a reference database (eg www.boldsystems.org) within an automated bioinformatics pipeline, and the result is a table giving the number of sequences assigned to each species in each sample. This table can be used for any standard ecological analysis. Species that do not occur in the reference database cannot be named, but they can still contribute to diversity metrics, since they are identified as species-level entities (similar to the concept of morphospecies). Furthermore, they can almost always be identified to a higher taxonomic level.
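As a rough illustration of the table described above, the following Python sketch tallies sequence reads into a sample-by-taxon table. The input format, sample names, taxon labels and read counts are all invented for illustration and do not come from either study. The "OTU_17" label is a hypothetical stand-in for a species-level entity that could not be named but could be placed in a higher group.

```python
from collections import Counter, defaultdict

# Hypothetical pipeline output: one (sample, assigned taxon) pair per sequence.
# All names and counts here are made up for illustration.
assignments = [
    ("trap_01", "Musca domestica"),
    ("trap_01", "Musca domestica"),
    ("trap_01", "OTU_17 (Diptera)"),   # unnamed species-level entity
    ("trap_02", "Musca domestica"),
    ("trap_02", "Apis mellifera"),
    ("trap_02", "OTU_17 (Diptera)"),
    ("trap_02", "OTU_17 (Diptera)"),
]


def build_table(pairs):
    """Count the sequences assigned to each taxon within each sample."""
    table = defaultdict(Counter)
    for sample, taxon in pairs:
        table[sample][taxon] += 1
    return table


def as_rows(table):
    """Return (header, rows) with one row per sample and one column per taxon."""
    taxa = sorted({taxon for counts in table.values() for taxon in counts})
    rows = [[sample] + [table[sample][t] for t in taxa] for sample in sorted(table)]
    return ["sample"] + taxa, rows


table = build_table(assignments)
header, rows = as_rows(table)
print("\t".join(header))
for row in rows:
    print("\t".join(str(cell) for cell in row))
```

Unnamed entities sit in the table alongside named species, which is why they can still contribute to diversity metrics.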
Much research effort has been focused on validating metabarcoding by applying it to samples of known species composition (ie asking “does what came out match what we know went in?”). These studies have found that metabarcoding accurately recovers alpha-diversity (species richness) and beta-diversity (species turnover) information, in addition to generating the same management recommendations as morphological biodiversity datasets.
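To make the two metrics concrete, here is a minimal Python sketch that computes species richness per sample and a pairwise dissimilarity between samples from a small count table. The table is invented for illustration, and Jaccard dissimilarity is used here as just one common presence/absence measure of turnover; it is an assumption of this sketch, not necessarily the measure used in the validation studies.

```python
# Hypothetical sample-by-species table of sequence counts (invented values).
table = {
    "trap_01": {"sp_A": 12, "sp_B": 3, "sp_C": 0},
    "trap_02": {"sp_A": 5, "sp_B": 0, "sp_C": 9},
    "trap_03": {"sp_A": 0, "sp_B": 0, "sp_C": 4},
}


def present(counts):
    """Set of species with at least one sequence in a sample."""
    return {species for species, n in counts.items() if n > 0}


def richness(counts):
    """Alpha-diversity as species richness: the number of species present."""
    return len(present(counts))


def jaccard_dissimilarity(a, b):
    """1 - (shared species / total species); 0 = identical, 1 = no overlap."""
    sa, sb = present(a), present(b)
    union = sa | sb
    if not union:
        return 0.0  # two empty samples are treated as identical
    return 1 - len(sa & sb) / len(union)


for name, counts in table.items():
    print(name, "richness:", richness(counts))

samples = sorted(table)
for i, first in enumerate(samples):
    for second in samples[i + 1:]:
        d = jaccard_dissimilarity(table[first], table[second])
        print(first, "vs", second, "dissimilarity:", round(d, 3))
```

Both measures use presence or absence only, so they do not depend on treating sequence counts as a measure of abundance.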
In short, metabarcoding is reliable and ready for use.
Time and cost: an example from the UK
In a recent collaboration between Forest Research and our research group at the University of East Anglia, 120 Malaise trap samples were collected from a UK plantation forest to investigate the relationship between forest stand characteristics and arthropod biodiversity. It would have taken years to identify all the specimens in every trap using either morphology or standard barcoding, so the traditional approach would have been to pick out a few easy-to-identify indicator groups, ignoring the rest of the specimens. Instead, we metabarcoded the entire contents of all 120 trap samples. This was done in a single sequencing run and actually used less than half of its total capacity, meaning that we could have sequenced 250 trap samples in one go!
Overall, the time and cost involved in processing 120 samples to the point of ecological analysis were almost identical to those reported for processing the 16 Honduran samples using standard barcoding, which illustrates the efficiency and cost-effectiveness of this method.
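The comparison can be put into numbers with a few lines of Python, using only the figures quoted in this article. The one simplification, flagged in the comments, is that the two efforts are treated as exactly equal, whereas they were only "almost identical".

```python
# Figures quoted in the article.
honduras_samples = 16
honduras_specimens = 5355
uk_samples = 120

# Assumption: time and cost are treated as exactly equal for the two projects
# (the article says "almost identical"), so this ratio is approximate.
throughput_ratio = uk_samples / honduras_samples

# Average number of specimens per Honduran trap sample.
specimens_per_honduran_sample = honduras_specimens / honduras_samples

# The UK samples used less than half of one sequencing run, so a full run
# could hold more than twice as many samples.
min_full_run_samples = 2 * uk_samples

print("Samples processed per unit effort, metabarcoding vs barcoding:", throughput_ratio)
print("Mean specimens per Honduran sample:", round(specimens_per_honduran_sample, 1))
print("A full run could hold more than", min_full_run_samples, "samples")
```

On these figures, metabarcoding processed about 7.5 times as many trap samples for roughly the same effort.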
Comparing results from the UK and Honduras
It is interesting to compare the results of the two studies. As would be expected, fewer species were detected in the UK study, despite the greater sampling effort (1141 species from 120 traps, compared with 1720 from 16). However, the figure shows that the relative contribution of different taxonomic groups is remarkably consistent. In both cases, insects account for over 90% of arthropod species, and are themselves dominated by Diptera, while Coleoptera, Lepidoptera and Hemiptera each account for less than 10% of insect species.
Other research groups have shown that the same methods can be used to describe aquatic communities from DNA that organisms (including vertebrates) deposit in water; similarly, plant communities can be sequenced from DNA in soil, and elusive mammals have been detected in remote rainforest areas by sequencing the blood inside invertebrate parasites such as leeches.
Altogether, these advances mean that, for the first time, it is possible and practical to measure something approaching complete species diversity across large spatial and temporal scales. This has huge implications for our ability to understand how our actions affect biodiversity, and to manage natural environments accordingly. It is an extremely exciting prospect, and we hope that these methods will soon be taken up by environmental managers as well as by academic researchers.