Sliding window analyses for optimal selection of mini-barcodes, and application to 454-pyrosequencing for specimen identification from degraded DNA
Date
2012
Type
Journal Article
Abstract
DNA barcoding remains a challenge when applied to diet analyses, ancient DNA studies, environmental DNA samples and,
more generally, in any case where DNA samples have not been adequately preserved. Because the size of the commonly
used barcoding marker (COI) is over 600 base pairs (bp), amplification fails when the DNA molecule is degraded into smaller
fragments. However, relevant information for specimen identification may not be evenly distributed along the barcoding
region, and a shorter target can be sufficient for identification purposes. This study proposes a new, widely applicable
method to compare the performance of all potential ‘mini-barcodes’ for a given molecular marker and to objectively select
the shortest and most informative one. Our method is based on a sliding window analysis implemented in the new R
package SPIDER (Species IDentity and Evolution in R). This method is applicable to any taxon and any molecular marker.
Here, it was tested on earthworm DNA that had been degraded through digestion by carnivorous landsnails. A 100 bp
region of 16S rDNA was selected as the shortest informative fragment (mini-barcode) required for accurate specimen
identification. Corresponding primers were designed and used to amplify degraded earthworm (prey) DNA from 46
landsnail (predator) faeces using 454-pyrosequencing. This led to the detection of 18 earthworm species in the diet of the
snail. We encourage molecular ecologists to use this method to objectively select the most informative region of the gene
they aim to amplify from degraded DNA. The method and tools provided here can be particularly useful (1) when dealing
with degraded DNA for which only small fragments can be amplified, (2) for cases where no consensus has yet been
reached on the appropriate barcode gene, or (3) to allow direct analysis of short reads derived from massively parallel
sequencing without the need for bioinformatic consolidation.
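
The sliding window idea described in the abstract can be illustrated with a minimal Python sketch. This is not the SPIDER package or its API; the function names (`p_distance`, `window_success`, `scan`), the toy alignment, and the scoring rule are all assumptions made for illustration. Here a window is scored by the fraction of sequences whose closest match within that window belongs to the same species, which is only one of several ways a fragment's informativeness could be measured.

```python
from typing import List, Tuple

Record = Tuple[str, str]  # (species label, aligned sequence); hypothetical format


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, skipping gaps and ambiguity codes."""
    pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    if not pairs:
        return 1.0  # no comparable sites: treat as maximally distant
    return sum(x != y for x, y in pairs) / len(pairs)


def window_success(records: List[Record], start: int, width: int) -> float:
    """Fraction of sequences strictly closer to a conspecific than to any
    other species, using only the sites inside the window."""
    frags = [(sp, seq[start:start + width]) for sp, seq in records]
    hits = 0
    for i, (sp_i, frag_i) in enumerate(frags):
        same = [p_distance(frag_i, f)
                for j, (sp, f) in enumerate(frags) if j != i and sp == sp_i]
        other = [p_distance(frag_i, f) for sp, f in frags if sp != sp_i]
        # Ties count as failures: the window cannot separate the species.
        if same and (not other or min(same) < min(other)):
            hits += 1
    return hits / len(frags)


def scan(records: List[Record], width: int) -> List[Tuple[int, float]]:
    """Slide a window of the given width along the alignment, one site at a time."""
    length = min(len(seq) for _, seq in records)
    return [(s, window_success(records, s, width))
            for s in range(length - width + 1)]


# Toy alignment (invented data): the two species differ only in the last four sites.
records = [
    ("species_A", "AAAAAAAACCCC"),
    ("species_A", "AAAAAAAACCCC"),
    ("species_B", "AAAAAAAAGGGG"),
    ("species_B", "AAAAAAAAGGGG"),
]

scores = scan(records, width=4)
# max() returns the first window that reaches the highest score.
best_start, best_score = max(scores, key=lambda item: item[1])
print(best_start, best_score)
```

Repeating the scan for several window widths and keeping the narrowest width that still reaches an acceptable score would correspond to the abstract's goal of selecting the shortest informative fragment.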
Rights
Copyright: © 2012 Boyer et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Creative Commons Rights
Attribution