eDNA Glossary

Overcoming the jargon

As you discover more about the scope and applications of DNA-based monitoring you may come across some unfamiliar terminology. Here, you can find all the terms and technologies associated with our processes and methods.

We are continuously reviewing our DNA glossary in line with industry developments.

If you can’t find what you’re looking for, or if you have a query, please get in touch.

eDNA

Short for ‘environmental DNA’. Refers to DNA deposited in the environment through excretion, shedding, mucous secretions, saliva etc. This can be collected in environmental samples (e.g. water, sediment) and used to identify the organisms that it originated from. eDNA in water is broken down by environmental processes over a period of days to weeks. It can travel some distance from the point at which it was released from the organism, particularly in running water. eDNA in soil can bind to organic particles and persist for a very long time (sometimes hundreds or thousands of years). eDNA is usually present in samples at low concentrations and is often degraded (i.e. broken into short fragments), which limits the analysis options.

Organismal DNA

Refers to DNA sampled directly from the organism through whole organism collection (e.g. invertebrates), swabbing, blood sampling, clipping etc. Usually high concentration and non-degraded. The location of the organism at the time of sampling is definitively known. Overall there are fewer uncertainties than for eDNA.

Community DNA

Refers to DNA extracted from a mixture of different organisms. Could be eDNA (environmental samples almost always contain DNA from a mixture of species) or organismal DNA (e.g. homogenised insect trap samples).

PCR / amplification

Polymerase chain reaction. A process by which millions of copies of a particular DNA segment are produced through a series of heating and cooling steps. Known as an ‘amplification’ process. One of the most common processes in molecular biology and a precursor to most sequencing-based analyses.
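To see how a few starting molecules become millions of copies, here is a minimal sketch that assumes every cycle multiplies the number of copies by a fixed factor. The function name `copies_after_cycles` and the `efficiency` parameter are illustrative assumptions rather than part of any standard tool, and real reactions eventually plateau, which this sketch ignores.

```python
def copies_after_cycles(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealised PCR yield: each cycle multiplies the copies by (1 + efficiency).

    efficiency=1.0 means perfect doubling per cycle (an assumption); real
    reactions are less efficient and level off in later cycles.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# Ten starting copies after 20 perfect cycles: 10 * 2**20 copies.
print(copies_after_cycles(10, 20))
```

Under these assumptions, ten starting copies pass ten million copies after 20 cycles.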

Primers

Short sections of synthesised DNA that bind to either end of the DNA segment to be amplified by PCR. Can be designed to be totally specific to a particular species (so that only that species’ DNA will be amplified from a community DNA sample), or to be very general so that a wide range of species’ DNA will be amplified. Good design of primers is one of the critical factors in DNA-based monitoring.
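The way a primer pair brackets the segment to be amplified can be sketched with simple string matching. This is an illustration only: the function names, the toy template and the primer sequences are invented, and the sketch requires exact matches, whereas real primers tolerate some mismatches and are checked with dedicated design software.

```python
from typing import Optional


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(complement[base] for base in reversed(seq))


def in_silico_amplicon(template: str, forward: str, reverse: str) -> Optional[str]:
    """Return the region spanned by the primer pair, or None if either fails to bind."""
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer binds the opposite strand, so on the template strand
    # its binding site appears as the primer's reverse complement.
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Toy template and primers, invented for illustration.
template = "TTGACCGTAAGGCTTCGATCCA"
print(in_silico_amplicon(template, forward="ACCGT", reverse="GATCG"))
print(in_silico_amplicon(template, forward="GGGGG", reverse="GATCG"))
```

A species-specific primer pair would return an amplicon only for templates from the target species, while a general pair would match templates from many species.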

qPCR

Stands for ‘quantitative PCR’, sometimes also known as ‘real-time PCR’. A PCR reaction incorporating a dye that fluoresces during amplification, allowing a machine to track the progress of the reaction. Often used with species-specific primers, where detection of amplification is used to infer the presence of the target species’ DNA in the sample. If the target species’ DNA is not present in the sample, no fluorescence will be detected. The high specificity of the qPCR method makes it ideal for situations where a single target species is of interest. The most common use of qPCR testing is for detection of Great Crested Newts from water samples.
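The detection logic can be sketched as finding the first cycle at which the fluorescence signal rises above a chosen threshold. The function name, threshold and readings below are invented for illustration. In practice, interpretation also relies on controls and replicate reactions, which this sketch leaves out.

```python
from typing import Optional, Sequence


def first_cycle_above_threshold(fluorescence: Sequence[float], threshold: float) -> Optional[int]:
    """Return the 1-based cycle at which the signal first exceeds the threshold.

    Returns None if the signal never crosses it, i.e. no amplification detected.
    """
    for cycle, signal in enumerate(fluorescence, start=1):
        if signal > threshold:
            return cycle
    return None


# Hypothetical per-cycle fluorescence readings.
with_target = [0.01, 0.01, 0.02, 0.05, 0.20, 0.80, 1.50]
without_target = [0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01]

print(first_cycle_above_threshold(with_target, threshold=0.1))
print(first_cycle_above_threshold(without_target, threshold=0.1))
```

In this sketch, a sample containing target DNA returns a cycle number and a sample without it returns `None`.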

Sanger Sequencing

Traditional DNA sequencing. Each reaction produces a single sequence so it only works on amplified DNA of a single species. A sequence is a series of nucleotide bases represented by the letters A, T, C & G. Here is the sequence of part of the 12S gene for a minnow (Phoxinus phoxinus): CACCGCGGTTAAACGAGAGGCCCTAGTTAATAATTGACGGCGTAAAGGGTGGTTAGGGGGTGTAATGTAATAAAGCCGAATGGCCCTTTGGCTGTCATACGCTTCTAGGTGTCCGAAGCCCAACATACGAAAGTAGCTTTAAGAAAGTCCACCTGACGCCACGAAAACTGAGAAA
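Because a sequence is just a string of the four letters, it can be handled like ordinary text. As a minimal sketch, the snippet below counts the bases in the first 20 letters of the minnow sequence above. The function name `base_counts` is an illustrative assumption.

```python
from collections import Counter
from typing import Dict


def base_counts(sequence: str) -> Dict[str, int]:
    """Count each nucleotide, rejecting any letter other than A, C, G or T."""
    unexpected = set(sequence) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    counts = Counter(sequence)
    return {base: counts[base] for base in "ACGT"}


# First 20 bases of the minnow 12S sequence quoted above.
fragment = "CACCGCGGTTAAACGAGAGG"
print(base_counts(fragment))
```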

High-Throughput Sequencing

Technology developed in the 2000s that produces millions of sequences in parallel. Enables DNA from thousands of different organisms in a mixture of species to be sequenced at once, so community DNA can be sequenced. Several technologies exist to do this, but the most commonly used platform is Illumina’s MiSeq. Also known as Next-Generation Sequencing (NGS) or parallel sequencing.

Barcode Genes

Refers to genes that can be used for species identification. Different regions of DNA mutate at different speeds. Fast-changing regions are useful for population studies and paternity testing, while the most stable regions can be used for assessing deep evolutionary relationships between groups of organisms. Certain regions change at just the right rate to be stable within a species but different between species. These are known as barcode genes. The official barcode gene for animals is Cytochrome Oxidase 1 (COI or cox-1). Other genes used as animal barcodes include 12S, 16S, 18S and Cytochrome-b (cytb). For plants, the most commonly used genes are matK, rbcL, trnL and ITS.

Metabarcoding

Refers to identification of species assemblages from community DNA using barcode genes. PCR is carried out with non-specific primers, followed by high-throughput sequencing and bioinformatics processing. Can identify hundreds of species in each sample, and 100+ different samples can be processed in parallel to reduce sequencing cost.

Reference Databases

Refers to libraries of DNA sequences (usually from barcode genes) that have been generated from species of known identity. Sequences from unidentified organisms – obtained either by Sanger sequencing or high-throughput sequencing – are compared against a reference database to make species identifications. Databases can be curated (e.g. the Barcode of Life Database – BOLD – www.boldsystems.org) or uncurated (e.g. GenBank – www.ncbi.nlm.nih.gov). In curated databases, identifications are scrutinised and verified; in uncurated databases they are not. GenBank is therefore far more extensive than BOLD, but contains many errors.
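Comparing an unknown sequence against a reference database can be sketched as scoring it against every reference and keeping the closest match. Everything here is a simplifying assumption: the function names, the species labels, the toy sequences, the 97% cut-off, and the requirement that sequences are already aligned and of equal length. Real comparisons use alignment-based search tools.

```python
from typing import Dict, Optional, Tuple


def percent_identity(a: str, b: str) -> float:
    """Share of matching positions between two equal-length, pre-aligned sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_match(query: str, references: Dict[str, str],
               min_identity: float = 97.0) -> Optional[Tuple[str, float]]:
    """Return (name, identity) of the closest reference, or None below the cut-off."""
    if not references:
        return None
    scored = ((name, percent_identity(query, seq)) for name, seq in references.items())
    best = max(scored, key=lambda pair: pair[1])
    return best if best[1] >= min_identity else None


# Hypothetical reference library with invented sequences.
references = {"Species A": "ACGTACGTAC", "Species B": "ACGTTCGTAA"}
print(best_match("ACGTACGTAC", references))
print(best_match("TTTTTTTTTT", references))
```

Returning `None` when nothing clears the cut-off reflects the case where the species is missing from the database. A wrongly labelled reference would still produce a confident but incorrect match, which is why curation matters.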

Bioinformatics

Refers to a data processing pipeline that takes the raw sequence data from high-throughput sequencing (often 20 million sequences or more) and transforms it into usable ecological data. Key steps for metabarcoding pipelines include quality filtering, trimming, merging paired ends, removal of erroneous sequences such as chimeras, clustering of similar sequences into molecular taxonomic units (each of which approximately represents a species), and matching one sequence from each cluster against a reference database. The output is a species-by-sample table showing how many sequences from each sample were identified as each species.
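The final step, building the species-by-sample table, can be sketched as a simple tally. The snippet assumes the earlier steps have already assigned each sequence to a sample and a species. The function name, sample labels and the second species are invented for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def species_by_sample(assignments: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, int]]:
    """Tally sequences per species and sample.

    `assignments` holds one (sample, species) pair per identified sequence.
    """
    table: Dict[str, Counter] = defaultdict(Counter)
    for sample, species in assignments:
        table[species][sample] += 1
    return {species: dict(counts) for species, counts in table.items()}


# Hypothetical identified sequences from two samples.
assignments = [
    ("Sample 1", "Phoxinus phoxinus"),
    ("Sample 1", "Phoxinus phoxinus"),
    ("Sample 2", "Phoxinus phoxinus"),
    ("Sample 2", "Species B"),
]
for species, counts in species_by_sample(assignments).items():
    print(species, counts)
```

Each row of the result is a species and each entry is the number of sequences from one sample identified as that species.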
