Frequently Asked Questions
Read some of the questions we frequently get asked about our services below.
Environmental DNA applications
Yes, eDNA is a great surveillance tool for invasive species because it can often detect them at much lower population levels than conventional surveys would. If the species in question is in the reference library, we will be able to identify it in metabarcoding datasets. Single-species qPCR tests can also be used to screen for the presence of particular species. There are good qPCR tests for species such as zebra and quagga mussels, and signal crayfish, and in the US eDNA has been used extensively for tracking the invasion of Asian carp in waterways around the Great Lakes (e.g. Jerde et al., 2013).
Is there any update on regulatory acceptance of eDNA methods for fish by the Environment Agency in England?
The Environment Agency is itself using eDNA for monitoring fish communities, and is working on a tool that would generate a Water Framework Directive (WFD)-compliant index score for lake fish communities, based on the work carried out in collaboration with the University of Hull (e.g. Lawson-Handley et al., 2019). It is best to check directly with the agencies with regard to specific projects.
Metabarcoding can be used to generate high-resolution datasets on the meiofaunal invertebrates (nematodes etc.) and microorganisms living within the ocean floor sediments of areas earmarked for impact or restoration. These small organisms are numerous and respond quickly to impacts. As such, metabarcoding can be used to track species biodiversity and community composition over time in relation to, for example, drilling impacts and/or restoration efforts. eDNA can also provide data on fish and marine mammal communities, and collecting water from different depths in the water column can reveal the different communities at each level.
Are reference libraries shared between institutes, i.e. is there a large shared global database (publicly) available so conservation/research all over the world can be shared and help other studies in remote areas?
Large publicly available reference libraries do exist. These include the National Center for Biotechnology Information (NCBI) database, also known as GenBank, and the Barcode of Life Database (BOLD), and these are used as the basis for our species identification pipelines. However, GenBank in particular (which is the most extensive database) is known to contain many errors, so we have applied our own careful curation and quality control measures to a downloaded version and it is this that is used in our pipelines. Although these databases are often incomplete for poorly studied areas, they can be augmented with data from local or private databases and also through barcoding studies (where tissue or swabs from animals identified in the field are sequenced).
DNA in the environment
How long does DNA last for in the environment? Is there a risk of finding something that is no longer there?
The average half-life of eDNA is about 48 hours, but this varies depending on environmental conditions, and small amounts of DNA have been known to last for weeks. Degradation is slowest when it is cold or dark, or when the DNA is bound to sediment, and faster in more acidic environments. Collins et al. (2018) provide a good overview of eDNA persistence in marine environments, and Li et al. (2019) showed that there was no detectable eDNA signal 48 hours after removal of fish from small lakes. Studies typically find that eDNA analysis gives a good snapshot of contemporary communities rather than a historical record.
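As a rough illustration of what a 48-hour half-life implies, the sketch below assumes simple first-order (exponential) decay. That model is an assumption made for illustration only: real decay rates vary with temperature, light, pH and sediment binding, as described above, and the function name is hypothetical.

```python
def remaining_fraction(hours_elapsed: float, half_life_hours: float = 48.0) -> float:
    """Fraction of the starting eDNA signal left after `hours_elapsed`.

    Assumes first-order decay with a constant half-life. The 48-hour
    default is the average quoted in the text, not a universal constant.
    """
    if hours_elapsed < 0:
        raise ValueError("hours_elapsed must be non-negative")
    if half_life_hours <= 0:
        raise ValueError("half_life_hours must be positive")
    return 0.5 ** (hours_elapsed / half_life_hours)


# Print the remaining signal after 1, 2, 7 and 14 days.
for days in (1, 2, 7, 14):
    fraction = remaining_fraction(days * 24)
    print(f"{days:>2} days: {fraction:.1%} of the original signal")
```

Under these assumptions roughly 9% of the signal is left after a week and under 1% after two weeks, which is consistent with eDNA reflecting recent rather than historical presence.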
How do water currents affect the results? How does it work with movement of water downstream in rivers or in currents in the sea?
While eDNA can in theory travel many kilometres in rivers, its constant deposition and decay make the probability of detection increasingly small over larger distances, depending on the size of the river and its flow rate. In marine environments it was originally thought that water and DNA would be so well mixed that there would be limited spatial resolution. However, this was found not to be the case. In fact, DNA from animals in specific habitats can be detected in marine environments with surprisingly good spatial resolution, at least in shallow to moderately deep waters (see Port et al., 2016 for an example). In deep water, thermoclines, haloclines and strong currents could affect eDNA, so multiple samples at different depths are recommended for best results.
Do you have examples of where you can demonstrate that eDNA can differentiate between different points in a fast flowing river?
In our Amazon baseline study we found that samples from consecutive sampling locations (c. 10 km apart from one another) were quite independent of one another, even in a large river that is several hundred metres wide and flowing quickly. Shoaling species were useful here because you would see a large amount of DNA from them at one point and then no detection at the next location downriver. We did see some transfer of DNA where a natural barrier (a steep gorge) caused a significant change in species composition, and the sample taken just 1 km or so downstream of the gorge still contained the DNA of the species in the upriver section. The river was flowing very fast here. In smaller, lowland rivers eDNA might integrate information over an area of up to around 1 km upstream (there is some really nice work being carried out on this by researchers in Belgium at the moment using cage experiments).
Yes, this is definitely recommended. We suggest that triplicate or at minimum duplicate samples are taken at sites in rivers and marine environments for best results. This will maximise detection of rare species while also building confidence in the replicability of the approach for recovering the more common species. In still water (ponds and lakes) DNA does not always mix well so a subsampling approach should be adopted whereby subsamples are collected from a section of the shoreline and mixed before filtering. In a lake typically one kit will suffice for 400m of shoreline, where subsamples are collected every 20m. Where budgets may constrain the number of samples that can be collected, we work with our clients to help design a survey that will maximise the amount of information given the constrained number of samples.
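The arithmetic behind replicate and shoreline sampling can be sketched as follows. This is an illustrative calculation only: it assumes replicate samples detect a species independently with the same per-sample probability, and it extrapolates the "one kit per 400 m, subsamples every 20 m" guideline linearly to longer shorelines. The function names and the example probability are hypothetical.

```python
import math


def cumulative_detection_probability(p_single: float, n_replicates: int) -> float:
    """Chance that at least one of `n_replicates` samples detects a species.

    Assumes each replicate detects the species independently with
    probability `p_single` (a simplification of real field conditions).
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_replicates < 0:
        raise ValueError("n_replicates must be non-negative")
    return 1.0 - (1.0 - p_single) ** n_replicates


def lake_sampling_plan(shoreline_m: float,
                       metres_per_kit: float = 400.0,
                       subsample_spacing_m: float = 20.0) -> tuple:
    """Return (kits, subsamples) for a stretch of lake shoreline.

    Defaults follow the guideline in the text; scaling them linearly
    to longer shorelines is an assumption of this sketch.
    """
    kits = math.ceil(shoreline_m / metres_per_kit)
    subsamples = math.ceil(shoreline_m / subsample_spacing_m)
    return kits, subsamples


# A rare species with a hypothetical 50% per-sample detection probability.
for n in (1, 2, 3):
    print(n, "replicate(s):", cumulative_detection_probability(0.5, n))

# Kits and subsamples for a hypothetical 1 km stretch of shoreline.
print(lake_sampling_plan(1000))
```

The first calculation shows why triplicates help most with rare species: the gain from each extra replicate is largest when the per-sample detection probability is low.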
The best way to avoid contamination is to use our sampling kits and follow the instructions provided. NatureMetrics filters are fairly robust to contamination because the filter membrane is enclosed within a plastic housing. However, you need to take care not to introduce contamination via the vessel you use to collect the water from the waterbody or to hold it while you filter. For this reason, the NatureMetrics sampling kits contain a sterile single-use collecting bag as well as gloves to prevent the introduction of your own DNA into the sample. If you are using a bucket to collect and/or hold the water, it will need to be cleaned with a 10% solution of household bleach to remove any traces of DNA and then rinsed thoroughly with clean, distilled water before sampling. The used bleach should be disposed of responsibly.
A very small amount! So be careful about eating fish for lunch if you’re going sampling. We recently managed to tell one of our clients what he had eaten for lunch before taking the sample! Possible environmental sources of contamination include fishing bait and waste water from kitchens, which need to be taken into account when choosing sampling locations - especially in populated areas.
Are there any potential issues with only collecting samples from the boundaries of a large water body? For GCN it's okay of course, but what about fish or other species that don't frequent that part of the water?
This will depend on the level of mixing in the water body, which may vary seasonally. In water bodies where there is a lot of mixing (e.g. rivers), eDNA is more homogeneously distributed. In still water (e.g. ponds) there is much more spatial heterogeneity, so the probability of detection is lower if water is taken from a single point, and an appropriate (sub)sampling design is key to covering all microhabitats. See Lawson-Handley et al. (2019) for a comprehensive study of the spatial dynamics of eDNA in large lakes. This study concluded that shoreline sampling was sufficient to detect all species in Lake Windermere during the winter, when more mixing occurred, and only missed one species (Arctic Charr, which lives deep in the middle of the lake) during summer, when there was less mixing.
For the marine research, how can we be convinced that the sampling point can represent the wide range of study area? Do we have to take a lot of samples in proportion to the size of area?
eDNA in the marine environment is much more dilute than in freshwater systems, so the detection probability for any species in a given sample will be lower (note that other survey methods are also less sensitive in the open ocean). This means it is important to filter more water and collect a greater number of samples in the marine environment. Generally, the more samples you can collect, the more representative and comprehensive your dataset will be. At the moment it’s very difficult to say how many samples are needed for a comprehensive survey in the ocean, or what depths these should be collected from, and spatial interpretation is also difficult because of the complexity of currents and other aspects of oceanography - there is definitely the opportunity for lots of large-scale research here! However, eDNA does still provide a lot of data in the marine environment and compares very favourably with alternative tools in this regard. In one pilot study we did in the North Sea, just three eDNA samples detected ⅔ of the species that had been recorded in a 2-year netting survey that had cost £150k.
This depends on multiple site factors and on the study question. In river systems, for example, confluence points are where water and DNA from two potentially distinct fish communities mix. A sample collected within the tributary, as well as samples upstream and downstream of its confluence with the main stem, may therefore be required to determine whether the fish communities in the different parts of the river system are equivalent. The same logic can be applied to barriers or dams in rivers. Other site factors to consider are pollution sources, major land use or habitat differences, and differences in riparian vegetation.
eDNA provides replicable and meaningful data on relative abundance of aquatic organisms, but not absolute abundance (except in some very specific cases where extensive calibration has taken place - see Levi et al., 2018 for an example of using eDNA to count salmon in Alaska). Some behavioural factors affect the amount of DNA given off by a particular species at a particular time (e.g. spikes of DNA associated with breeding or high levels of activity), and there are some interspecific differences in DNA shedding - for instance, small active fish tend to give off more DNA than large, slow ones. In rivers, if you detect a small trace of a species it is difficult to tell whether this means there are a small number of individuals close to the sampling point or a larger number some distance upstream. That said, overall the rank abundance of species based on eDNA data tends to be a good reflection of the community.
Does each DNA sequence represent 1 individual? Like you might have 12 DNA sequences of the same animal species, does it mean 12 different individuals of that same species were recorded?
A single sequence does not represent a single individual. This question is linked to the wider discussion on whether sequence count is correlated with abundance. See above.
Is there potential, now or in the future, to identify individuals within a population and calculate/estimate the size of a species population?
There is potential. We currently use relatively conserved and high copy regions of the genome to identify taxa, but the inherent properties of these DNA regions, which make them well suited for species identification, also make them less than ideal for individual/population level assessment. You would need to look at a more quickly evolving region of the genome, but these tend to be much more difficult to work with for various reasons. We’ve made some initial forays into this and some studies suggest that it may be possible to identify different haplotypes within the same species, but essentially this is still very much in the research phase. See Sigsgaard et al., 2020 for a recent review and synthesis of progress in this area.
NatureMetrics filter kits don’t have a shelf-life, so they can be kept until you need to use them. Once a sample has been taken and the preservative solution added as described in the kit instructions, it is stable at ambient temperature for several weeks. Samples should not be frozen or left in direct sunlight.
How much of a risk is there that the eDNA collection can be compromised, e.g. by traces of DNA held in sediment?
The persistence of eDNA is typically short lived and new eDNA will typically overwhelm remnant eDNA. That said, traces of eDNA can theoretically be detected from older sources, but these will likely be trace amounts, present only in very small fragments and screened out following quality control. A bigger risk is environmental contamination from fishing bait or wastewater, which should be taken into account when designing sampling campaigns, especially in populated areas.
In some respects, yes, but this will depend on the target groups being studied. Studies we have conducted have indicated that more species are typically detected in the wet season in the tropics. This is likely a result of more DNA being washed from riparian areas into rivers in the wet season. However, increased water volumes following heavy rainfall may also slightly reduce sensitivity by diluting the DNA signal. These fine-scale spatial dynamics are still being investigated across a number of different projects, although eDNA gives good data in all seasons!
DNA in the lab
Contamination in the lab is a risk that we take extremely seriously. This is one of the advantages of being a commercial laboratory where we have full control over the use of our space and the movement of people and equipment within the system, but it is also one of the reasons that our costs are higher than those sometimes reported by academic research institutions. We operate a unidirectional workflow from kit preparation → DNA extraction → pre-PCR → post-PCR. DNA extractions from filters are carried out in a dedicated cleanroom facility where tissue samples are never handled, and which features positive air pressure with HEPA filters. Regular disinfection schedules are in operation across all our labs, which include a minimum of two daily cleans of all surfaces using chemicals that remove DNA, before operations start and after they end. Surfaces are also cleaned between procedures to avoid cross-sample contamination. For equipment (e.g. laminar flow hoods and pipettes), additional cleaning is carried out using DNA removal wipes. High intensity UV lights provide overnight irradiation in our laboratories, and UV light is also used to irradiate the flow hoods for 30 minutes prior to every PCR set-up. In addition to these steps, we operate a robust quality control system where negative controls are integrated throughout the workflow to check for contamination. If any of these negative controls show signs of contamination, the analysis is repeated. As a result of these measures we very rarely experience issues related to contamination in the laboratory.
If you're testing whether a single species is present or not, what is the advantage of using qPCR over PCR? And qPCR versus Metabarcoding?
A probe-based qPCR assay is often more reliable than PCR (end-point PCR visualised on a gel) because the binding of the probe represents a third point (in addition to the two primer sequences) at which the target sequence must be matched, thereby reducing the risk of non-target amplification causing false positive results. qPCR is much faster to carry out in the lab than metabarcoding, which requires substantially more lab work and computational processing. However, because qPCR and other types of single-species screening assay are ‘blind’ tests which give a positive or negative result without providing a sequence to confirm species identity, the assays need to be extremely rigorously validated before you can rely heavily on the results for decision-making in management contexts (Thalinger et al., 2020 gives a comprehensive overview of this). This can be a long and expensive process. Metabarcoding can provide data on many more species than qPCR, and because it generates the DNA sequences that are used for determining species identity, high confidence can be ascribed to detections even early on in the assay validation process.
For our eDNA filter kits we use a salt and detergent-based lysis solution. This solution is non-hazardous, stable at room temperature, and doesn’t have the logistical difficulties associated with ethanol, which is another preservative that can be used. Ethanol is currently required for Great Crested Newt tests in the UK. We have a range of preservatives that we use for other types of sample, e.g. soil and insects, which we choose depending on the project, location, speed from sampling to processing etc.
DNA data processing - bioinformatics
Species not yet in the reference database cannot be definitively identified without obtaining a reference sequence. However, in many cases there are sufficient references from congeneric species to make a confident identification at genus-level. In a scenario where only one representative of that genus is expected in the sampled community, a putative species label can be associated with the metabarcoding sequence we generate, pending confirmation from reference material. Species unknown to science would similarly be identified as best we can given their similarity with available reference data, but we cannot distinguish between gaps in the reference database and gaps in taxonomic knowledge.
This is closely related to the previous question. We frequently encounter taxa where we are unable to make a definitive species-level assignment. This can happen where several species have identical sequences in the region targeted by the assay, or where there are gaps in the reference data. In these cases we make a taxonomic assignment at the lowest level at which we are confident, given the available reference data.
Can you give more details about the probabilistic species identification algorithm you use and are your methods public?
Our identification pipeline uses a published probabilistic algorithm. The algorithm accounts for gaps in the sequence reference databases by comparing the content of that database with expected taxonomic diversity, allowing the probability of a query sequence arising from an unreferenced species to be calculated correctly. This reduces the likelihood of overconfident assignments to species level due to database gaps. As the taxonomy is a key input to the algorithm, the probability of assignment can be estimated at each level and an acceptance threshold can be applied to ensure only high confidence assignments are retained.
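To illustrate how an acceptance threshold can be applied at each taxonomic level, here is a minimal sketch. It does not reproduce the published algorithm or our pipeline: the per-rank probabilities are made-up inputs, the 0.9 threshold is a hypothetical value, and the function name is invented for this example.

```python
from typing import Dict, Optional

# Taxonomic ranks ordered from broadest to finest.
RANKS = ["phylum", "class", "order", "family", "genus", "species"]


def lowest_confident_rank(rank_probabilities: Dict[str, float],
                          threshold: float = 0.9) -> Optional[str]:
    """Return the finest rank whose assignment probability meets `threshold`.

    Walks from broad to fine ranks and stops at the first rank that falls
    below the threshold, so an assignment is never reported at a finer
    level than the data support. Returns None if no rank qualifies.
    """
    accepted = None
    for rank in RANKS:
        if rank_probabilities.get(rank, 0.0) >= threshold:
            accepted = rank
        else:
            break
    return accepted


# Hypothetical probabilities for one query sequence.
example = {
    "phylum": 0.99, "class": 0.99, "order": 0.98,
    "family": 0.97, "genus": 0.95, "species": 0.62,
}
print(lowest_confident_rank(example))
```

With these illustrative numbers the sequence would be reported at genus level, because the species-level probability falls below the threshold.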
We do sometimes see different sequences assigned to the same species. However, our assays are chosen to target species-level variation across a broad taxonomic group. These are therefore relatively slow-evolving markers that do not coincide with regions targeted for conservation genetics. Population genetics is theoretically possible with this sequencing technology but requires significant R&D in the selection and testing of the marker for each target species and is not something we currently offer as a service.
We generate Operational Taxonomic Units (OTUs) as part of our data processing pipeline. This is a taxonomy-free clustering approach based purely on how similar the sequences generated are to one another. The threshold at which the clusters are defined varies between assays because of the properties of different gene regions that are used. We treat each OTU as a species-level entity and attempt to make a taxonomic assignment for each but this will not always be possible at species level. All OTUs are included in estimates of diversity, regardless of whether we are able to assign a species label.
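The idea of taxonomy-free clustering can be sketched with a simple greedy approach. This is a toy illustration, not our pipeline: it assumes equal-length, pre-aligned sequences, uses position-by-position identity instead of a proper alignment, and the thresholds shown are hypothetical, since the real threshold varies between assays.

```python
from typing import List


def identity(seq_a: str, seq_b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches / len(seq_a)


def greedy_otu_clustering(sequences: List[str],
                          threshold: float = 0.97) -> List[List[str]]:
    """Group sequences into OTUs using greedy seed-based clustering.

    Each sequence joins the first OTU whose seed (first member) it matches
    at or above `threshold` identity; otherwise it founds a new OTU.
    """
    otus: List[List[str]] = []
    for seq in sequences:
        for otu in otus:
            if identity(seq, otu[0]) >= threshold:
                otu.append(seq)
                break
        else:
            # No existing seed was similar enough, so start a new OTU.
            otus.append([seq])
    return otus


# Toy 10-base sequences, clustered at a hypothetical 90% threshold.
reads = ["ACGTACGTAC", "ACGTACGTAT", "TTGGCCAATT", "TTGGCCAATA", "ACGTACGTAC"]
otus = greedy_otu_clustering(reads, threshold=0.9)
for number, members in enumerate(otus, start=1):
    print(f"OTU {number}: {len(members)} sequences, seed {members[0]}")
```

No taxonomy is consulted at any point: the clusters are defined purely by sequence similarity, and assigning a name to each OTU is a separate step.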
Environmental DNA for Great Crested Newt and other single species detection
We have assessed the efficacy of filtration versus the conventional sampling method and we’ve shown (in common with other researchers, e.g. Spens et al., 2016) that the detection probability of filter kits is higher. This is likely because the standard GCN eDNA kits only actually sample 90 ml of water, while our filter kits, which have been designed to deal with the high turbidity of ponds, typically process an order of magnitude more water. The filters are easier to deal with logistically because they don’t involve shipping ethanol, are easier to process in the laboratory, and are much more resistant to contamination. Unfortunately, a move to filter-based kits is not up to us, but the evidence has been made available to Natural England along with contact details of independent scientists whom they can consult, and we hope that results derived from these kits may be accepted in coming years.
What about the risk of false positives e.g. from a Heron eating a GCN and pooping in the pond or ducks moving between ponds and carrying DNA from one to the other?
The technology is very sensitive, so theoretically there is potential for this type of natural contamination. We would expect that this is very rare though. When clients have speculated to us that they believe this may explain a surprising positive result, subsequent surveys have usually confirmed GCN presence.
For ponds, a single kit is usually enough as long as subsamples have been merged from around the perimeter of the pond. As the size of the waterbody increases, so should your sampling effort. The precise number of samples will usually be a trade-off between your budget and the amount of spatial resolution you need or the importance of detecting rare species. For most lakes, 5-10 samples are sufficient, given appropriate merging of subsamples but you may want to take more if you are targeting rare species. In rivers and streams, it depends on the size and flow rate of the waterbody and the area that you need to survey. We are always happy to advise on sampling design for specific projects.
DNA for different taxonomic groups
Yes, we offer qPCR tests for both Batrachochytrium dendrobatidis (Bd) and Batrachochytrium salamandrivorans (B-sal) chytrid species.
Yes. We’ve detected otters (Lutra lutra) with both single-species qPCR assays and metabarcoding assays. We’ve even detected Neotropical otters (Lontra longicaudis) and giant river otters (Pteronura brasiliensis) in Peru. Otters do seem to be slightly underrepresented in metabarcoding datasets given how much time they spend in the water. This may be because a lot of eDNA originates from faeces, and otter latrines are on land rather than in the water. We can also ID otters from their spraints (and even use metabarcoding of the spraints to understand their diets).
We have made some initial forays into analysing algal eDNA, but algae are such a diverse and informally defined group, spanning many different evolutionary lineages, that it is difficult to cover them with a single assay. Nevertheless, we have managed to amplify and sequence algal eDNA. This pipeline is still in its infancy and the reference databases are very incomplete, so we would regard this as being in the R&D phase.
We have not yet worked on coral eDNA and it is a complex group. However we do think this is an exciting application with great potential and would love to develop it as part of a collaborative research project.
We do analyse DNA from soil, from which we typically generate data on soil fauna, bacteria and fungi. These groups are incredibly diverse, which gives them great power to indicate even fine scale ecological changes. This is a very active area of research for us.
This is classic invertebrate-derived DNA (iDNA). While we’ve never directly handled leeches within our immediate lab team, we’ve analysed leech iDNA data before and our co-founder Prof. Douglas Yu has processed over 30,000 leeches in his lab in China in the hunt for a possibly-extinct antelope (Ji et al., 2020)!
Currently, in addition to vertebrates, we can also analyse insects, bacteria, fungi, crustaceans and mussels. We have made some forays into detecting algae and diatoms, but these assays are still being developed and optimised. Diet analyses can also be performed on bird and bat faeces.
DNA based methods and traditional taxonomy
Do you think that traditional taxonomic skills and natural history research are still important? Should we be concerned that the use of eDNA could lead to a loss of taxonomic skills in the scientific community?
There are many reasons why morphological taxonomy is still important. For a start, molecular taxonomy in the sense that we use it relies on a reference database underpinned by traditional taxonomy - and this will always be the case. Molecular taxonomy can’t on its own discover and describe new species - that will always rely on traditional taxonomic skills. There are simply not enough taxonomists in the world to be able to generate the monitoring data that we need at the scales we need it to underpin decision-making - especially in the tropics - so there is a need for new tools and approaches. We believe that widespread adoption of molecular tools will actually highlight the need for good taxonomists.
NatureMetrics kits, products and business
NatureMetrics will send you a report that summarises your results primarily in a species by sample table. We also provide quality control tests and checks that have been carried out. We can also send you the species-by-sample table in Excel format. This will give taxonomic identification at multiple levels and tell you how many sequences each species was represented by in each sample. For large and complex projects, we can also offer more extensive reporting and ecological analysis.
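For readers unfamiliar with the layout, a species-by-sample table can be sketched as below. The sample names and sequence counts are made up for illustration, and the function name is hypothetical; the actual report format may differ.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def species_by_sample(records: Iterable[Tuple[str, str, int]]
                      ) -> Tuple[List[str], Dict[str, List[int]]]:
    """Pivot (sample, species, sequence_count) records into a table.

    Returns the sorted sample names (the columns) and a dict mapping each
    species (the rows) to its sequence counts in column order. Species not
    detected in a sample get a count of 0.
    """
    counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    sample_names = set()
    for sample, species, n_sequences in records:
        counts[species][sample] += n_sequences
        sample_names.add(sample)
    columns = sorted(sample_names)
    table = {
        species: [per_sample.get(sample, 0) for sample in columns]
        for species, per_sample in sorted(counts.items())
    }
    return columns, table


# Made-up detections from two hypothetical samples.
records = [
    ("S1", "Lutra lutra", 120),
    ("S1", "Pteronura brasiliensis", 45),
    ("S2", "Pteronura brasiliensis", 300),
]
columns, table = species_by_sample(records)
print("species", *columns, sep="\t")
for species, row in table.items():
    print(species, *row, sep="\t")
```

Each row is a species, each column is a sample, and each cell is the number of sequences assigned to that species in that sample.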
For non-EU samples, how do you cope with Nagoya Protocol obligations on access to genetic resources? Are they a barrier?
They are an important consideration in many projects, and one of the reasons that we have begun to form partnerships with in-country labs in many cases where we have opportunities for large projects in non-EU countries. Every country implements the Nagoya Protocol in a different way, so some places are easier to work in than others and we have to take it on a case-by-case basis. We invest in building up strong networks of governmental and non-governmental stakeholders in the regions where we work, and in supporting the development of key resources such as national reference libraries.
Interesting to hear there is a Peruvian lab that you collaborate with and that has the capacity/equipment to do these types of analysis. Do you also work with labs in other parts of the world?
Yes, we are in discussions with labs in Liberia, Ghana, Mozambique, South Africa, Singapore, Malaysia and Indonesia. We also have contacts in labs in Brazil and Colombia. Quality control is really important when working with partner labs so we have to make sure that we have sufficient resources to cover this when we enter into new partnerships.
You can find these in the Publications section of our website.
What costs are associated with the lab processing and how do overall costs of these types of surveys compare with conventional surveys, for example fish netting?
Analysis of an eDNA sample costs £275 + VAT and we provide discounts for conservation NGOs and researchers. This includes the sampling kit and full laboratory analysis with QC testing and reporting. Electrofishing is an order of magnitude more expensive and often detects fewer species than a single eDNA sample (small, bottom-dwelling fish such as bullheads and sticklebacks are routinely missed by electrofishing). Cost comparison with netting depends very much on what kind of boat you would need to use for the netting and whether you were in an environment where you could collect eDNA samples from the shoreline (e.g. in a lake). Because eDNA is more sensitive for fish surveys than netting is, even if you have to use a boat to collect the eDNA samples, you will need less field effort to capture the same amount of data (see Hänfling et al., 2019 for comparison of catch per unit effort in Lake Windermere).
Future Research and Development
Are there options being tested that skip the PCR step that may bias some results due to primer bias?
PCR, the stage at which you amplify the target region of the genome, relies on designing accurate primers that can bind to a complementary part of the target genome. If there are any mismatches between the primers and the target region then the binding is less efficient and these inefficiencies are carried through the process. These binding inefficiencies differ among taxa, and this is what’s called primer bias. At best, primer bias will result in slightly fewer sequences being detected for that taxon, but at worst it could result in false negatives. PCR-free methods (i.e. those that don’t have that amplification step) avoid the need for primers and are based around the sequencing of genomic DNA and the subsequent stitching together of that information for the purposes of identification (among other things). Our co-founder, Prof. Douglas Yu, recently published a paper on PCR-free metagenomics of pollen (Peel et al., 2019), but for the moment this remains too expensive to be a commercially viable option for routine monitoring. Moreover, it is not a good solution for eDNA samples, as the concentration of target DNA in these samples is just too low for PCR-free methods to provide useful data. We are keeping an eye on this exciting area of research and are sure it will progress quickly.