Despite being powerful for a wide range of applications, metabarcoding can recover only limited information about species abundance or biomass within an individual sample (frequency of occurrence across multiple samples can be a useful proxy for abundance where the sampling scheme allows). Metabarcoding can also fail to detect some taxa (‘dropouts’) because the universal primers used to amplify the barcode gene are inevitably not optimal for every species.

This presents a problem where high-confidence data on the distribution and/or abundance of particular species are required. These might be indicator species, or species of ecological, cultural or economic importance that form the focus of a project.

In this scenario, we can use an alternative technique that avoids the amplification step and uses information from the whole mitochondrial genome instead of just the short barcode region. To do this, we first create a reference database by sequencing the mitochondrial genomes of the target species, which can now be done quickly and cost-effectively (Gillet et al., 2014; Crampton-Platt et al., 2014).

[Figure: simplified representation of a metagenomics pipeline]

We then sequence the bulk samples and assess species presence by matching sequence fragments from across the mitochondrial genome against the references, in a process known as read matching. This gives high-confidence species identifications and more reliable quantitative data.
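The idea behind read matching can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the pipeline described here: reads are assigned to a species when nearly all of their k-mers occur in exactly one reference genome, and the function names, the k-mer length and the `min_fraction` threshold are all hypothetical choices made for illustration. A production workflow would typically use a dedicated read mapper and would also handle reverse-complement reads and sequencing errors, which this sketch ignores.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of all k-length substrings of seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references, k):
    """Map each k-mer to the set of species whose reference contains it."""
    index = {}
    for species, genome in references.items():
        for kmer in kmers(genome, k):
            index.setdefault(kmer, set()).add(species)
    return index


def match_read(read, index, k, min_fraction=0.9):
    """Return the single species a read matches, or None.

    A read matches a species when at least min_fraction of its k-mers
    are found in that species' reference. Reads that pass for no
    species, or for more than one, are left unassigned.
    """
    read_kmers = kmers(read, k)
    if not read_kmers:  # read shorter than k
        return None
    hits = Counter()
    for kmer in read_kmers:
        for species in index.get(kmer, ()):
            hits[species] += 1
    passing = [s for s, n in hits.items()
               if n / len(read_kmers) >= min_fraction]
    return passing[0] if len(passing) == 1 else None


def count_matches(reads, references, k=8, min_fraction=0.9):
    """Count the reads assigned to each reference species."""
    index = build_index(references, k)
    counts = Counter()
    for read in reads:
        species = match_read(read, index, k, min_fraction)
        if species is not None:
            counts[species] += 1
    return counts
```

Calling `count_matches(reads, references)` with a dictionary of reference sequences keyed by species name returns the number of reads assigned to each species. In practice such counts would usually be normalised, for example by reference length and sequencing depth, before being interpreted quantitatively.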

[Figure: simplified representation of the read matching pipeline]


This can be a particularly useful approach for long-term monitoring programmes because the reference database need be created only once at the outset of the project. Subsequent monitoring can then be carried out quickly and cheaply.

Contact us to discuss your metagenomics requirements.