DNA barcoding provides a rapid, accurate, and standardised method for species-level identification using short DNA sequences. Such a standardised identification method is useful for mapping all the species on Earth, particularly when DNA sequencing technology is cheaply available. There are many nations in Asia with many biodiversity resources that need to be mapped and registered in databases.
DNA barcoding helps researchers to understand evolutionary and genetic relationships by assembling molecular, morphological, and distributional data. Species-level identification through DNA barcoding is usually accomplished by the retrieval of a short DNA sequence from a standard part of the genome (i.e., 650-base fragment of the 5' end of the mitochondrial cytochrome c oxidase I (COI) gene for animal species) from the specimen under investigation. The barcode sequence from each unknown specimen is then compared with a library of reference barcode sequences derived from individuals of known identity.
The accurate identification of animal species depends on the availability of reference sequence data, which are currently deposited in many public libraries, including GenBank, BOLD, Medicinal Materials DNA Barcode Database (MMDBD), International Nucleotide Sequence Database Collaboration (INSDC), and Barcode Index Number System (BIN). These databases contain a variety of sequences assigned to corresponding taxa, which is useful for comparative analysis of sequence variations. For example, GenBank, a commonly used database in barcoding studies, has included more than 1 Terabase sequence data with relatively broad taxon coverage. Another database BOLD has already collected more than 2 million COI sequences from about 170,000 species, and INSDC has recorded extensive Cytb and ITS sequence information. In addition, MMDBD focuses on the barcode information of medicinal plants and animals (over 1,700 species) listed in the Chinese Pharmacopoeia, American Herbal Pharmacopoeia, and other related references. Interestingly, MMDBD also includes sequence information of common adulterants and substitutes.