- Research article
- Open Access
A systematic survey of regional multi-taxon biodiversity: evaluating strategies and coverage
BMC Ecology volume 19, Article number: 43 (2019)
In light of the biodiversity crisis and our limited ability to explain variation in biodiversity, tools to quantify spatial and temporal variation in biodiversity and its underlying drivers are critically needed. Inspired by the recently published ecospace framework, we developed and tested a sampling design for environmental and biotic mapping. We selected 130 study sites (40 × 40 m) across Denmark using stratified random sampling along the major environmental gradients underlying biotic variation. Using standardized methods, we collected site species data on vascular plants, bryophytes, macrofungi, lichens, gastropods and arthropods. To evaluate sampling efficiency, we calculated regional coverage (relative to the known species number per taxonomic group), and site scale coverage (i.e., sample completeness per taxonomic group at each site). To extend taxonomic coverage to organisms that are difficult to sample by classical inventories (e.g., nematodes and non-fruiting fungi), we collected soil for metabarcoding. Finally, to assess site conditions, we mapped abiotic conditions, biotic resources and habitat continuity.
Despite the 130 study sites only covering a minute fraction (0.0005%) of the total Danish terrestrial area, we found 1774 species of macrofungi (54% of the Danish fungal species pool), 663 vascular plant species (42%), 254 bryophyte species (41%) and 200 lichen species (19%). For arthropods, we observed 330 spider species (58%), 123 carabid beetle species (37%) and 99 hoverfly species (33%). Overall, sample coverage was remarkably high across taxonomic groups and sufficient to capture substantial spatial variation in biodiversity across Denmark. This inventory is nationally unprecedented in detail and resulted in the discovery of 143 species with no previous record for Denmark. Comparison between plant OTUs detected in soil DNA and observed plant species confirmed the usefulness of carefully curated environmental DNA-data. Correlations among species richness for taxonomic groups were predominantly positive, but did not correlate well among all taxa suggesting differential and complex biotic responses to environmental variation.
We successfully and adequately sampled a wide range of diverse taxa along key environmental gradients across Denmark using an approach that includes multi-taxon biodiversity assessment and ecospace mapping. Our approach is applicable to assessments of biodiversity in other regions and biomes where species are structured along environmental gradient.
The vast number of species on Earth have yet to be described, challenging our understanding of biodiversity . For a deeper understanding of what determines the distribution of species across the planet, comprehensive data on species occurrence and environmental conditions are required. While some progress has been made in understanding the distribution of biodiversity at coarse spatial resolution, our knowledge of biodiversity at high spatial resolution is deficient . In this study, we consider biodiversity as the richness and spatial turnover of taxonomic units, whether species or operational taxonomic units (OTUs) derived by eDNA (environmental DNA) metabarcoding. While progress has been made in the interpretation and prediction of richness and turnover of vascular plants and vertebrates, various types of bias, e.g. temporal, spatial, and taxonomic bias , have constrained similar advances for less well-known, but diverse groups such as fungi and insects . As a result, conservation management is typically based on biodiversity data from a non-random subset of taxa .
Recent developments in molecular techniques—in particular the extraction and sequencing of eDNA—hold the promise of more time-efficient sampling and identification of species [5, 6]. Further, eDNA enables the exploration of communities and organisms not easily recorded by traditional biodiversity assessment, such as soil-dwelling nematodes . In fact, PCR-based methods combined with DNA sequencing have already provided valuable insight into the taxonomic diversity within complex environmental samples, such as soil [8,9,10] and water [e.g. 11, 12]. Due to the ongoing rapid development in DNA sequencing technologies, with the emergence of next generation sequencing (NGS) techniques—generating billions of DNA sequences —an environmental sample could now be analyzed to a molecular depth that gives an almost exhaustive picture of the species composition at the site of collection. Despite this potential, rigorous assessments with complete taxonomic coverage from eDNA samples are still missing [6, 12, 14]. To assess the suitability and potential of eDNA data in complementing—or even replacing—traditional field survey data, tests on comprehensive data sets are needed.
Undertaking an ambitious biodiversity field study across a wide geographical space comes with major logistical and methodological challenges. It is not clear what environmental gradients structure biodiversity across the tree of life and for most taxa standardized field protocols to sample species occurrences are non-existent. The recently developed ecospace framework suggests that biodiversity varies in relation to its position along environmental gradients (position), the availability of biotic resources, such as organic matter and structures e.g. trees for epiphytes (expansion), and spatio-temporal extent of biotopes (continuity) . Environmental conditions and local processes can be a template shaping local biodiversity (e.g. through environmental filtering) [16, 17]. This template is highlighted by the ecospace position of sampled biotopes in abiotic environmental space. In addition to the physico-chemical conditions shaping abiotic gradients—particularly important to autotrophic organisms—the presence and abundance of specific biological resources, crucial to heterotrophic organisms, such as specialist herbivores, detritivores and saproxylic species are likely important and thus should be considered . The quantification of biotic resources and structures, e.g. dead wood, dung and carcasses, is not often included in community studies, despite the limited knowledge in the area [15, 19] which speaks for further studies. Spatial and temporal processes at regional extent, such as extinction, speciation and migration, shape species pools and thereby set the limits to local richness and species composition [16, 17, 20]. In order to improve our understanding of biodiversity patterns, local and regional factors should be considered concurrently [17, 21].
In this study, we used the ecospace framework as guideline to develop a comprehensive sampling design for large-scale mapping of variation in biodiversity and environmental variation across Denmark. Data collection and analysis was carried out as part of a research project (called Biowide).
The fundamental and radical claim of ecospace is that a low-dimensional environmental hyperspace can be used to predict and forecast variation in multi-taxon species richness. Testing this claim demands a dataset covering the variation of the terrestrial environment in a region of significant spatial coverage, a representative sample of the major taxonomic groups contributing to α-diversity and a mapping of the most important environmental factors defining the conditions for the terrestrial biota.
The project aimed to cover all of the major environmental gradients, including variation in soil moisture, soil fertility and succession, as well as habitats under cultivation. Within this environmental space spanned by 130 40 × 40 m sites, we performed a systematic and comprehensive sampling of the environment and biodiversity. We combined traditional species observation and identification with modern methods of biodiversity mapping in the form of massive parallel sequencing of eDNA extracted from soil samples.
In this paper we present the inventory and evaluate whether we achieved the comprehensive and representative data collection needed to test the claimed generality of the ecospace framework.
Study area and site selection
We aimed to characterize biodiversity across the country of Denmark (Fig. 1a)—a lowland area of 42,934 km2 and an elevational range of 0–200 m above sea level. While there are some limestone and chalk outcrops, there is no exposed bedrock in the investigated area. Soil texture ranges from coarse sands to heavy clay and organic soils of various origins . Land use is dominated by arable land (61%), most of which is in annual rotation, while forests are mostly plantations established during the 19th and 20th centuries. Scrubs cover approximately 17%, natural and semi-natural terrestrial habitats some 10%, and freshwater lakes and streams 2%. The remaining 10% is made up of urban areas and infrastructure [23, 24].
When selecting sites, we considered major environmental gradients, the potential size of the sampling units (sites), as well as practicalities of sampling across the large geographical space within the same season. The sites were 40 × 40 m which was a compromise between within-site homogeneity and the representativeness of a particular habitat type. We stratified site selection according to the identified major environmental gradients, including the intensity of human land use. We measured 30 sites that were cultivated habitats and 100 sites that were natural and semi-natural habitats. This balance between natural and cultivated habitat was chosen, because we expected cultivated habitats to have shorter environmental gradients. The cultivated subset represented major land-use categories and the natural subset was stratified across natural gradients in soil fertility, soil moisture, and successional stage from sparsely vegetated to closed canopy forest, (Additional file 1: Appendix A). We deliberately excluded linear features, such as hedgerows and road verges, urban areas with predominantly exotic plants as well as saline and aquatic habitats, but included temporarily inundated heath, dune depressions and wet mires.
The final set of 25 sampling classes consisted of six cultivated habitat types; three types of fields (rotational, leys, and oldfield) and three types of plantations (beech, oak, and spruce). 18 natural classes consisted of all factorial combinations of natural soil fertility (fertile or infertile), moisture (dry, moist, or wet), and successional stage (low vegetation with bare soil, closed herb/scrub, or forest) (Additional file 1: Appendix A). Finally, we included a class of perceived areas of high species richness  in Denmark. These sites were selected subjectively by performing a public poll among active natural history volunteers in the Danish nature conservation and nature management societies. The 25 classes were replicated in each of five geographical regions within Denmark (Fig. 1a). The result was 130 sites with 18 natural, 6 cultivated, and two perceived areas of high species richness evenly distributed across each of five geographic regions of Denmark (Table 1). For logistical reasons, we did not place any sites on Bornholm although we acknowledge that this island is geologically different than the rest of Denmark.
For the 18 natural habitat classes, site selection through stratified random sampling was guided by a large nation-wide dataset of vegetation plots in semi-natural habitats distributed across the entire country (n = 96,400 plots of 78.5 m2 each, http://www.naturdata.dk) from a national monitoring and mapping project  and in accordance with the EU Habitats Directive . We used environmental conditions computed from plant indicator values to select candidate sites for each class. First, we calculated plot mean values for Ellenberg indicator values based on vascular plants species lists  and Grime CSR-strategy allocations of recorded plants , the latter were recoded to numeric values following Ejrnæs and Bruun . We excluded saline and artificially fertilized habitats by excluding plots with Ellenberg S > 1 or Ellenberg N > 6. We then defined stratification categories as: fertile (Ellenberg N 3.5–6.0), infertile (Ellenberg N < 3.5), dry (Ellenberg F < 5.5), moist (Ellenberg F 5.5–7.0), wet (Ellenberg F > 7.0), early succession (Grime R > 4 and Ellenberg L > 7 or > 10% of annual plants), late succession (mapped as forest), mid succession (remaining sites).
To reduce transport time and costs, all 26 sites within each region were grouped into three geographic clusters (Fig. 1a). The nested sampling design allowed us to take spatially structured species distributions into account . The procedure for site selection involved the following steps:
Designation of three geographic clusters within each region with the aim to cover all natural classes while (a) keeping the cluster area below 200 km2 and (b) ensuring high between-cluster dispersion in order to represent the geographic range of the region. In practice, perceived areas of high species richness were chosen first, then clusters were placed with reference to the highest ranking areas of high species richness and in areas with a wide range of classes represented in the national vegetation plot data .
Representing the remaining 24 classes in each region by selecting 8–9 potential sites in each cluster. Sites representing natural classes were selected from vegetation plot data. Cultivated classes were assumed omnipresent and used as buffers in the process of completing the non-trivial task of finding all classes within each of three cluster areas of < 200 km2 in each region.
Negotiating with land owners and, in case of disagreement, replacing the preferred site with an alternative site from the same class.
After each of the 130 sites were selected using available data, we established each 40 × 40 m site in a subjectively selected homogenous area that accounted for topography and vegetation structure. Each site was divided into four 20 × 20 m quadrants, and from the center of each quadrant a 5 m radius circle (called a plot) was used as a sub-unit for data collection to supplement the data collected at site level (40 × 40 m) (Fig. 1b).
Collection of biodiversity data
For each of the 130 sites, we aimed at making an unbiased and representative assessment of multi-taxon species richness. Data on vascular plants, bryophytes, lichens, macrofungi, arthropods and gastropods were collected using standard field inventory methods (Additional file 2: Appendix B). For vascular plants, bryophytes and gastropods, we collected exhaustive species lists. For the remaining taxonomic groups that are more demanding to find, catch, and identify, we aimed at collecting a reproducible and unbiased sample through a standardized level of effort (typically 1 h). Multiple substrates (soil, herbaceous debris, wood, stone surfaces and bark of trees up to 2 m) were carefully searched for lichens and macrofungi at each site. For fungi, we visited each site twice during the main fruiting season in 2014—in August and early November—and once during the main fruiting season in 2015—between late August and early October. Specimens that were not possible to identify with certainty in the field were sampled and, when possible, identified in the laboratory. For arthropod sampling, a standard set of pitfall traps (including meat-baited and dung-baited traps), yellow Möricke pan traps and Malaise traps were operated during a fixed period of the year. In addition, we used active search and collection methods, including sweep netting and beating as well as expert searches for plant gallers, miners and gastropods. Finally, we heat-extracted collembolas and oribatid mites from soil cores. Due to the limited size of the sites relative to the mobility of mammals, birds, reptiles and amphibians, data on these groups were not recorded. Records of arthropods were entered in https://www.naturbasen.dk. Records of fungi were entered in https://svampe.databasen.org/. All species occurrence data and environmental data has been made available at the project home page http://bios.au.dk/om-instituttet/organisation/biodiversitet/projekter/biowide/. Species data will be made available for GBIF (http://www.gbif.org) through the above-mentioned web portals. Specimens are stored at the Natural History Museum Aarhus (fungal specimens at the fungarium at the Natural History Museum of Denmark). For further details on the methods used for collection of biodiversity data see Additional file 2: Appendix B.
Collection of eDNA data
We used soil samples collected from all 130 sites for the eDNA inventory. At each site, we sampled 81 soil cores in a 9 × 9 grid covering the entire 40 × 40 m plot and pooled the collected samples after removal of coarse litter. We homogenized the soil by mixing with a mixing paddle mounted on a drilling machine. A subsample of soil was sampled from the homogenized sample and DNA was extracted for marker gene amplification and sequencing . We chose the MiSeq platform by Illumina for DNA sequencing. MiSeq is adapted to amplicon sequencing . For further details on methods for eDNA data generation and considerations on eDNA species richness and community composition measures see Additional file 2: Appendix B.
Data from the fungal eDNA community matrix was mapped to the Darwin Core data standard (http://rs.tdwg.org/dwc/) and wrapped in a DwC archive for publication to the Global Biodiversity Information Facility. The ‘dataGeneralizations’ field was used to indicate the identity of OTUs towards the UNITE species hypothesis concept , Sampling sites were included as WKT polygons in the ‘footprintWKT’ field and sampling site names were included in the ‘eventID’ field. The representative sequences (OTUs) were included using the GGBN amplification extension. The dataset is available from gbif.org (https://0-doi-org.brum.beds.ac.uk/10.15468/nesbvx).
Site environmental data
We have followed the suggestion in Brunbjerg et al.  to describe the fundamental requirements for biodiversity in terms of the ecospace (position, expansion and spatio-temporal continuity of the biotope).
To assess the environmental variation across the 130 sites, we measured a core set of site factors that described the abiotic conditions at each site. Environmental recordings and estimates included soil pH, total soil carbon (C, g/m2), total soil nitrogen (N, g/m2) and total soil phosphorus (P, g/m2), soil moisture (% volumetric water content), leaf CNP (%), soil surface temperature (°C) and humidity (vapour pressure deficit), air temperature (°C), light intensity (Lux), and boulder density. For further details on methods used to collect abiotic data see Additional file 2: Appendix B.
We collected measurements that represent the expansion or biotic resources which some species consume and the organic and inorganic structures which some species use as habitat. Although many invertebrates are associated with other animals, for practical reasons, we restricted our quantification of biotic resources to the variation in live and dead plant tissue, including dung. We measured litter mass (g/m2), plant species richness, vegetation height (of herb layer, cm), cover of bare soil (%), bryophyte cover (%) and lichen cover (%), dead wood volume (m3/site), dominant herbs, the abundance of woody species, the number of woody plant individuals, flower density (basic distance abundance estimate, ), density of dung (basic distance abundance estimate), number of carcasses, fine woody debris density (basic distance abundance estimate), ant nest density (basic distance abundance estimate), and water puddle density (basic distance abundance estimate). For further details on methods used to collect expansion data see Additional file 2: Appendix B.
Mapping of temporal and spatial continuity
For each site, we inspected a temporal sequence of aerial photos (from 1945 to 2014) and historical maps (1842–1945) starting with the most recent photo taken. We defined temporal continuity as the number of years since the most recent major documented land use change. The year in which a change was identified was recorded as a ‘break in continuity’. To estimate spatial continuity, we used ArcGIS to construct four buffers for each site (500 m, 1000 m, 2000 m, 5000 m). Within each buffer we estimated the amount of habitat similar to the site focal habitat by visual inspection of aerial photos with overlays representing nation-wide mapping of semi-natural habitat. For further details on methods for collection of continuity data see Additional file 2: Appendix B.
To illustrate the coverage of the three main gradients (moisture, fertility, and successional stage) spanned by the 130 sites, Ellenberg mean site values (mean of mean Ellenberg values for the four 5 m radius quadrats within each site) for soil moisture (Ellenberg F), soil nutrients (Ellenberg N) and light conditions (Ellenberg L) were plotted relative to Ellenberg F, N and L values for a reference data set of 5 m radius vegetation quadrats (47,202 from agricultural, semi-natural and natural open vegetation and 12,014 from forests (http://www.naturdata.dk) . Mean Ellenberg values were only calculated for quadrats with more than five species and 95 percentile convex hull polygons where drawn for the reference data set as well as the Biowide data set.
We assessed the coverage for each taxonomic group across sites as well as within each site for spiders, harvestmen, and insect orders represented by at minimum of 75 species, for which we had abundance data by comparing the number of species found to the estimated species richness of the sample using rarefaction in the iNEXT R-package . Coverage regarding habitat types was assessed by constructing species accumulation curves for arable sites, plantations and natural sites. To visualize the habitat type related differences in expansion (biotic resources) we created a radar chart illustrating flower density, dead wood volume, plant richness, litter mass and dung density for natural habitats (early, mid, late succession), plantations and arable sites.
To further evaluate the turnover component of biodiversity and how well we covered the environmental gradient for our inventory, we related community composition to the measured environmental variables (abiotic and biotic) based on a Nonmetric Multidimensional Scaling (NMDS) analyses in R v. 3.2.3  using the vegan R-package  and the plant species × site matrix as well as the macrofungi species × site matrix. Abiotic and biotic variables were correlated with ordination axes to facilitate interpretation. In order to ensure that geographical variation in species composition and diversity was adequately assessed, we calculated, for each geographical region, the relative proportion of major species groups, and the component of beta diversity nestedness and turnover. The latter was done using the betapart R-package . To illustrate and substantiate the adequacy of the eDNA sampling design and subsequent laboratory protocols, we correlated basic biodiversity measures of community composition (NMDS axes) and richness for plant eDNA (ITS2 marker region) with the same measures for our observed plant data (see Additional file 2: Appendix B for detailed methods). To illustrate the cross correlation among the main taxonomic groups spearman rank correlations for vascular plants, mosses, lichens, macrofungi, gastropods, gallers/miners and arthropods at order level were calculated.
The 130 sites were distributed in 15 clusters nested within five regions across Denmark (Fig. 1a). The measured variables differed according to the initial stratification of sites based on simple indicators (Table 1, Fig. 2a, b, ranges of measured variables in Additional file 3: Appendix C). Managed sites (plantations and agricultural fields) revealed little variation in soil moisture (Fig. 2b). The perceived areas of high species richness spanned the full variation of natural sites regarding fertility, moisture and successional stage (Fig. 2b).
The selected 130 sites covered the main gradients reflected by a huge reference dataset from a national monitoring program (Fig. 3) as judged from a vegetation-based calibration of site conditions regarding moisture, fertility and succession (light intensity). Biowide data seemed to increase the upper range of the fertility gradient, which can be explained by the inclusion in Biowide of rotational fields that were not included in reference data (Figs. 2b, 3).
The environmental expansion of ecospace, which was measured as the amount and differentiation of organic carbon sources, varied among habitat types with high litter mass in tree plantations and late successional habitats, high plant species richness in early and mid-successional habitats, high dung density in open habitats (early successional and fields) and high amounts of dead wood in late successional habitats (Fig. 4). Spatial and temporal continuity varied for the 130 sites with less spatial continuity at larger buffer sizes (Additional file 4: Appendix D). The number of species found per site differed with taxonomic group with the highest number for arthropods and macrofungi and lowest for gastropods and lichens (Additional file 5: Appendix E). There was no clear difference in relative richness, nestedness and turnover of taxonomic groups across geographic region.
We collected 1774 species of macrofungi (corresponding to 54% of the number of macrofungi recorded in Denmark), 200 lichens (19%), 663 vascular plants (42%) and 254 bryophytes (41%) during the study period. We collected 75 species of gastropods (75%), 330 spiders (58%), 99 hoverflies (33%), 123 carabid beetles (37%) and 203 gallers and miners species (21%). For all groups, the number of species found was higher in natural (n = 90) than in cultivated (n = 30) sites, but across taxonomic groups, plantations and agricultural fields harbored species not found in other habitat types—plantations were particularly important in harboring unique species of macrofungi (Table 2, Additional file 6: Appendix F). The taxonomic sample coverage calculated by rarefaction within the 130 sites was high overall (range: 0.86–0.99), but highest for gastropods and spiders and lowest for gallers and miners (Table 2). Species accumulation curves for three habitat categories (arable, plantation and natural) saturated at approximately the same high level (Additional file 7: Appendix G).
The inventory was unprecedented in detail for Denmark and resulted in a total of 110 new macrofungi, 1 new lichen and 32 new invertebrate species (of which 12 were gallers and miners and 3 spiders) that had not previously been documented in Denmark (Table 2).
Turnover of plant communities among sites was adequately described by the NMDS ordination, which accounted for 81% of the variation in plant species composition (when correlating the original distance matrix with distances in ordination space, 3-dimensional, final stress = 0.102) of which 26%, 26%, and 11% could be attributed to axis 1, 2 and 3, respectively. Likewise for macrofungal communities the NMDS ordination accounted for 72% of the variation in species composition (3-dimensional, final stress = 0.146) of which 35%, 21% and 14% could be attributed to axis 1, 2, and 3, respectively. The major gradients in plant species composition of the 130 sites correlated strongly with soil fertility (NMDS axis 1 strong correlation with soil N, P and pH), successional stage (NMDS axis 2 strong correlation with light intensity and opposite correlation with litter mass and number of large trees) and soil moisture (NMDS axis 3 strong correlation with measured soil moisture), reflecting the gradients that the sites were selected to cover (Fig. 5, see correlation matrix for the rest of the environmental variables in Additional file 8: Appendix H). Macrofungal species composition showed the same gradients, however succession and fertility swapped with succession as primary gradient (NMDS1) and fertility as secondary gradient (NMDS2). NMDS axis 3 reproduced a strong correlation with soil moisture.
Spearman Rho correlations between observational plant species richness and eDNA OTU ‘richness’ as well as observational plant community composition (as represented by NMDS axes 1–3) and eDNA OTU composition were both strong and confirmative for a recovery of plant diversity by metabarcoding of soil-derived DNA (R2richness = 0.652, R2composition = 0.577–697, Fig. 6). Plant diversity (richness and composition) inferred from soil derived DNA thus resembled similar metrics derived from direct observation of plant communities, which has also been investigated in more detail in . We found cross-correlations among species richness of different taxonomic groups to be predominantly positive or non-significant (Fig. 7). Negative correlations typically involved insect taxa like Diptera, Lepidoptera, and Orthoptera and e.g. Fungi.
Using ecospace as a conceptual framework , we developed a sampling design for mapping terrestrial biodiversity across Denmark represented by numerous, diverse taxa. Across the 130 surveyed sites, covering a tiny fraction (0.0005%) of the total land area of Denmark, we observed approximately 5500 species, of which 143 represented new species records for the country. Our stratification procedures allowed us to cover the local and national environmental variation across Denmark using only 130 sites of 40 × 40 m each and provided a good across and within site coverage of diverse groups of invertebrates and fungi. Finally, the study demonstrates that eDNA data, once properly curated , can be used as an important supplement to classical biodiversity surveys.
Since, environmental filtering is an important process in community assembly , the most obvious design principle for a biodiversity inventory is to stratify sampling according to major abiotic and biotic environmental gradients [e.g. 42]. In strongly human-dominated landscapes, such stratification should incorporate both cultivated and non-cultivated areas and since environmental gradients are often narrower in cultivated areas, this needs to be taken into account. We found a close correspondence between the variation in average Ellenberg values at our sites and those extracted from a very large vegetation database comprising vascular plant species lists from a national monitoring program. This indicates that we managed to cover the main environmental gradients found across Denmark. Turnover of plant and macrofungi communities was significantly linked to moisture, light and fertility and allows us to generalize relationships between environment and biodiversity derived from local measurements to a large spatial extent. We note that the use of stratified random sampling implies a biased representation of rare and common environmental conditions. On the other hand, a completely random sampling would have led to limited representation of natural biotopes and their disproportionate contribution to the total biodiversity may have been missed. Our results indicate that our sampling design and site selection was successful both regarding unbiased taxonomic coverage in geographic regions and habitat types.
While the ecospace framework helped structure our sampling, it also proved challenging with respect to trade-offs between site size and homogeneity (related to ecospace position), methods to quantify biotic resources (assessing ecospace expansion) and definitions of temporal and spatial continuity. Ideally, abiotic and biotic conditions should be homogenous across a site in order to ensure that site measurements reflect the abiotic position and biotic expansion . A smaller area would be more likely to be homogenous, but would be less representative. Across long environmental gradients, homogeneity and representativeness may also vary among for example, grassland, heathland, and forest. Similarly, while counting the number of different plant species is easy, accounting for the relative contribution of each species to total biomass and measuring the availability of different biotic resources such as dead wood, woody debris, litter, dung, flowers and seeds is much harder but likely to be highly important as indicated by the differences across habitat types. Finally, spatial and temporal continuity is hard to quantify due to data limitations and because past soil tillage, fertilization, or other land management or disturbance regimes have not been recorded and must be inferred indirectly. In addition, an unambiguous definition of continuity breaks is impossible given that most land use changes and derived community turnover occur gradually over time. We estimated spatial continuity using broad habitat classes at a range of scales (500 m, 1000 m, 2000 m, 5000 m) acknowledging that the dependency on spatio-temporal continuity depend on the mobility, life history and habitat specificity of different species. Our estimate of temporal continuity were also limited by the availability of aerial photographs and maps, which while not perfect, is good relative to other parts of the world. Despite these constraints, our estimates of spatial and temporal continuity varied among sites and were uncorrelated, which allowed us to statistically test for their relative roles.
We aimed at equal sampling effort per site in terms of trapping and searching time. However, this was challenged by an array of practicalities. The preferred species sampling methods varied among taxonomic groups [43, 44] and despite our application of a suite of methods, including passive sampling in pitfall traps and Malaise traps, baited traps, soil core sampling and active search, our taxonomic coverage was still incomplete (e.g. aphids, phorid flies and other species-rich groups living in the canopy are inevitably under-sampled). Our budget also forced us to be selective with the morphology-based identification of the most difficult species groups, in particular within Hymenoptera and Diptera. Among identified groups, across-site sample coverage was consistently high (> 0.86) and typically close to 1, which indicates that very few unseen species remain to be recorded in each community. Invertebrate sampling and identification is extremely time consuming and relies on rare taxonomic expertise. The within site sample coverage could only be calculated for spiders and insect orders for which abundance data were available. Median values of within site sample coverage were also consistently above 0.5, which we consider adequate for cross-site comparisons. We spent more than half of the inventory budget on invertebrate sampling and identification. Invertebrates constitute by far the largest fraction of the total biota and, for many species, the adult life stage is short-lived, highly mobile, and the range of active species varies with season [45, 46]. Trapping also implies a certain risk of suboptimal placement or vandalism by visiting humans, domestic livestock or wild scavengers. The resulting number of invertebrate species per site is relatively high and revealed a considerable variation, which gives ample opportunity for comparative analyses. Although, we did not obtain full coverage of all species in every habitat category, the relative distribution of sites in arable habitats, plantations and natural sites seemed sufficient as reflected by comparable saturations of the three species accumulation curves. The high number of new species for Denmark, particularly macrofungi, can most likely be attributed to the effort, but also to the inclusion of habitat types that would otherwise have been avoided or overlooked during opportunistic field surveys . Limited budgets in biodiversity studies may justify monitoring of a smaller number of taxonomic groups representing the overall biodiversity as indicated by the positive cross correlations among most taxonomic groups.
Although methods for DNA extraction, amplification, sequencing and bioinformatics processing are continuously improved and may lead to better biodiversity metrics from environmental samples, collecting representative samples from larger areas with unevenly distributed species remains a challenge. We pooled and homogenized large amounts of soil, followed by extraction of intracellular as well as extracellular DNA, from a large subsample, to maximize diversity coverage within a manageable manual workload. Biodiversity metrics based on plant DNA were correlated to the same metrics for observational plant data. This indicates that the procedure for sampling, DNA extraction and amplification can be assumed to be adequate for achieving amplicon data to quantify variation in biodiversity across wide ecological and environmental gradients for plants, but most likely also for other organisms present in the soil. These methods are promising for biodiversity studies of many organism groups that are otherwise difficult to sample and identify (e.g. nematodes, fungi, protists, and arthropods). High throughput sequencing (HTS) methods produce numerous errors [e.g. 47, 48] and it has been suggested that richness measures should be avoided altogether for HTS studies . Despite the remaining challenge of relating genetic units to well-known taxonomic entities, our results along with those presented in  indicate that reliable metrics of α-diversity and community composition are achievable. With respect to taxonomic annotation, reference databases are far from complete and the taxonomic annotation of reference sequences are often erroneous. Furthermore, for many groups of organisms, we have still only described and named a fraction of the actual species diversity, and the underlying genetic diversity within and between species is largely unknown for most taxa, leading to uncertainties in OTU/species delimitation and taxonomic assignment of sequence data. This also means that ecological interpretation of OTU/species assemblages assessed by eDNA is largely impossible as there is little ecological knowledge that can be linked to OTUs. Thus, for eDNA-based biodiversity assessment to further mature, molecular biologists, ecologists, and taxonomists need to work closely together to produce well-annotated reference databases. Our environmental samples for eDNA, including soil and litter samples as well as extracted DNA will be preserved for the future. This material represents a unique resource for the further development of methods within ecology and eDNA. As more efficient technologies become available in the future, it will be possible to process this material at an affordable cost and derive further insights on the relationship between traditional species occurrence, OTU data and environmental variation.
We have presented a comprehensive sampling design to obtain a representative, unbiased sample of multi-taxon biodiversity stratified with respect to the major abiotic gradients. By testing and evaluating the sampling design, we conclude that it is operational and that observed biodiversity variation may be attributed to measured abiotic and biotic variables. We developed our sampling design based on the ecospace concept, and with this study, we took the first step towards general models and model inferences with transferability to terrestrial ecosystems and biotas in other parts of the world. Given the overall extent of the environmental gradients remains constant through time we believe the sampling design is also useful for monitoring biodiversity i.e. tracking changes in biodiversity through time. Meta-barcoding of environmental DNA offers a promising supplement to traditional inventories (economically and logistically), but barcode reference libraries are still far from complete. Thus, combining classical taxonomic identification with metabarcoding of environmental DNA currently appears to offer a promising approach to biodiversity research.
Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Mora C, Tittensor DP, Adl S, Simpson AGB, Worm B. How many species are there on earth and in the ocean? PLoS Biol. 2011;9(8):e1001127.
Jetz W, McPherson JM, Guralnick RP. Integrating biodiversity distribution knowledge: toward a global map of life. Trends Ecol Evol. 2012;27(3):151–9.
Isaac NJB, van Strien AJ, August TA, de Zeeuw MP, Roy DB. Statistics for citizen science: extracting signals of change from noisy ecological data. Methods Ecol Evol. 2014;5(10):1052–60.
Nichols JD, Williams BK. Monitoring for conservation. Trends Ecol Evol. 2006;21(12):668–73.
Taberlet P, Coissac E, Pompanon F, Brochmann C, Willerslev E. Towards next-generation biodiversity assessment using DNA metabarcoding. Mol Ecol. 2012;21(8):2045–50.
Thomsen PF, Willerslev E. Environmental DNA—an emerging tool in conservation for monitoring past and present biodiversity. Biol Conserv. 2015;183:4–18.
Porazinska DL, Giblin-Davis RM, Esquivel A, Powers TO, Sung WAY, Thomas WK. Ecometagenetics confirm high tropical rainforest nematode diversity. Mol Ecol. 2010;19(24):5521–30.
Andersen K, Bird KL, Rasmussen M, Haile J, Breuning-Madsen H, Kjær KH, Orlando L, Gilbert MTP, Willerslev E. Meta-barcoding of ‘dirt’ DNA from soil reflects vertebrate biodiversity. Mol Ecol. 2012;21(8):1966–79.
Taberlet P, Prud’Homme SM, Campione E, Roy J, Miquel C, Shehzad W, Gielly L, Rioux D, Choler P, ClÉMent J-C, et al. Soil sampling and isolation of extracellular DNA from large amount of starting material suitable for metabarcoding studies. Mol Ecol. 2012;21(8):1816–20.
Yoccoz NG, Brathen KA, Gielly L, Haile J, Edwards ME, Goslar T, von Stedingk H, Brysting AK, Coissac E, Pompanon F, et al. DNA from soil mirrors plant taxonomic and growth form diversity. Mol Ecol. 2012;21(15):3647–55.
Ficetola GF, Miaud C, Pompanon F, Taberlet P. Species detection using environmental DNA from water samples. Biol Lett. 2008;4(4):423–5.
Thomsen PF, Kielgast JOS, Iversen LL, Wiuf C, Rasmussen M, Gilbert MTP, Orlando L, Willerslev E. Monitoring endangered freshwater biodiversity using environmental DNA. Mol Ecol. 2012;21(11):2565–73.
Shokralla S, Spall JL, Gibson JF, Hajibabaei M. Next-generation sequencing technologies for environmental DNA research. Mol Ecol. 2012;21(8):1794–805.
Taberlet P, Coissac E, Hajibabaei M, Rieseberg LH. Environmental DNA. Mol Ecol. 2012;21(8):1789–93.
Brunbjerg AK, Bruun HH, Moeslund JE, Sadler JP, Svenning J-C, Ejrnæs R. Ecospace: a unified framework for understanding variation in terrestrial biodiversity. Basic Appl Ecol. 2017;18:86–94.
Belyea LR, Lancaster J. Assembly rules within a contingent ecology. Oikos. 1999;86(3):402–16.
Ricklefs RE. Community diversity—relative roles of local and regional processes. Science. 1987;235(4785):167–71.
Horák J, Kout J, Vodka Š, Donato DC. Dead wood dependent organisms in one of the oldest protected forests of Europe: investigating the contrasting effects of within-stand variation in a highly diversified environment. For Ecol Manag. 2016;363:229–36.
Seibold S, Bässler C, Brandl R, Gossner MM, Thorn S, Ulyshen MD, Müller J. Experimental studies of dead-wood biodiversity—a review identifying global gaps in knowledge. Biol Conserv. 2015;191:139–49.
Laliberté E, Zemunik G, Turner BL. Environmental filtering explains variation in plant diversity along resource gradients. Science. 2014;345(6204):1602–5.
Houseman GR, Gross KL. Linking grassland plant diversity to species pools, sorting and plant traits. J Ecol. 2011;99(2):464–72.
Rubæk GH, Kristensen K, Olesen SE, Østergaard HS, Heckrath G. Phosphorus accumulation and spatial distribution in agricultural soils in Denmark. Geoderma. 2013;209–210:241–50.
Arler F, Jørgensen MS, Galland D, Sørensen EM. Kampen om m2 - Prioritering af fremtidens arealanvendelse i Danmark. Fonden Teknologirådet. 2015. http://www.tekno.dk/wp-content/uploads/2015/08/Prioritering-af-fremtidens-arealanvendelse-i-Danmark.pdf. Accessed 14 Oct 2019.
Nygaard B, Juel A, Fredshavn JR. Ændringer i det § 3-beskyttede naturareal 1995–2014. Resultater fra Naturstyrelsens opdateringsprojekt. In: Edited by Aarhus Universitet DNCfMoE, 106 s. - Teknisk rapport fra DCE - Nationalt Center for Miljø og Energi nr. 79; 2016.
Williams P, Gibbons D, Margules C, Rebelo A, Humphries C, Pressey R. A comparison of richness hotspots, rarity hotspots, and complementary areas for conserving diversity of British birds. Conserv Biol. 1996;10(1):155–74.
Terrestriske Naturtyper 2004–2015. NOVANA. Aarhus Universitet, DCE–Nationalt Center for Miljø og Energi. http://www.novana.au.dk. Accessed 14 Oct 2019.
Council Directive 92/43/EEC: on the conservation of natural habitats and of wild fauna and flora. European Commission. 1992.
Hill MO, Mountford JO, Roy DB, Bunce RGH. Ellenberg’s indicator values for British plants: ECOFACT volume 2: Technical Annex. Natural Environment Research Council: Centre for Ecology and Hydrology; 1999.
Grime JP, Hodgson JG, Hunt R. Comparative plant ecology: a functional approach to common British species. London: Unwin Hyman; 1989.
Ejrnæs R, Bruun HH. Gradient analysis of dry grassland vegetation in Denmark. J Veg Sci. 2000;11(4):573–84.
Lomolino MV, Riddle BR, Brown JH, Brown JH. Biogeography. Sunderland: Sinauer Associates; 2006. p. 65–96.
Bijl Lvd, Boutrup S, Jensen PN. NOVANA. Det nationale program for overvågning af vandmiljøet og naturen. Programbeskrivelse 2007-2009 - del 2. Danmarks Miljøundersøgelser, Aarhus Universitet. Faglig rapport fra DMU nr 615 2007.
Schmidt P-A, Bálint M, Greshake B, Bandow C, Römbke J, Schmitt I. Illumina metabarcoding of a soil fungal community. Soil Biol Biochem. 2013;65:128–32.
Koljalg U, Nilsson RH, Abarenkov K, Tedersoo L, Taylor AFS, Bahram M, Bates ST, Bruns TD, Bengtsson-Palme J, Callaghan TM, et al. Towards a unified paradigm for sequence-based identification of fungi. Mol Ecol. 2013;22(21):5271–7.
White N, Engeman R, Sugihara R, Krupa H. A comparison of plotless density estimators using Monte Carlo simulation on totally enumerated field data sets. BMC Ecol. 2008;8(1):6.
Hsieh TC, Ma KH, Chao A. iNEXT: an R package for rarefaction and extrapolation of species diversity (Hill numbers). Methods Ecol Evol. 2016;7(12):1451–6.
R Core team. R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing; 2015.
Oksanen J, Blanchet FG, Kindt R, Legendre P, O’Hara RB, Simpson GL, Solymos P, Stevens MHH, Wagner H. Package ‘vegan’: Community Ecology Package. Version 1.17-2. 2010. http://cran.r-project.org/web/packages/vegan/vegan.pdf. Accessed 14 Oct 2019.
Baselga A, Orme D, Villeger S, De Bortoli J, Leprieur F. betapart: partitioning beta diversity into turnover and nestedness components. R package version 1.5.1. 2018. https://cran.r-project.org/package=betapart. Accessed 14 Oct 2019.
Frøslev TG, Kjoller R, Bruun HH, Ejrnæs R, Brunbjerg AK, Pietroni C, Hansen AJ. Algorithm for post-clustering curation of DNA amplicon data yields reliable biodiversity estimates. Nat Commun. 2017;8:1188.
Kraft NJB, Adler PB, Godoy O, James EC, Fuller S, Levine JM. Community assembly, coexistence and the environmental filtering metaphor. Funct Ecol. 2015;29(5):592–9.
Gillison AN, Brewer KRW. The use of gradient directed transects or gradsects in natural resource surveys. J Environ Manag. 1985;20:103–27.
Popic TJ, Davila YC, Wardle GM. Evaluation of common methods for sampling invertebrate pollinator assemblages: net sampling out-perform pan traps. PLoS ONE. 2013;8(6):e66665.
Standen V. The adequacy of collecting techniques for estimating species richness of grassland invertebrates. J Appl Ecol. 2000;37(5):884–93.
CaraDonna PJ, Petry WK, Brennan RM, Cunningham JL, Bronstein JL, Waser NM, Sanders NJ. Interaction rewiring and the rapid turnover of plant–pollinator networks. Ecol Lett. 2017;20(3):385–94.
Valverde J, Gómez JM, Perfectti F. The temporal dimension in individual-based plant pollination networks. Oikos. 2016;125(4):468–79.
Brown SP, Veach AM, Rigdon-Huss AR, Grond K, Lickteig SK, Lothamer K, Oliver AK, Jumpponen A. Scraping the bottom of the barrel: are rare high throughput sequences artifacts? Fungal Ecol. 2015;13:221–5.
Kunin V, Engelbrektson A, Ochman H, Hugenholtz P. Wrinkles in the rare biosphere: pyrosequencing errors can lead to artificial inflation of diversity estimates. Environ Microbiol. 2010;12(1):118–23.
Bálint M, Bahram M, Eren AM, Faust K, Fuhrman JA, Lindahl B, O’Hara RB, Öpik M, Sogin ML, Unterseher M, et al. Millions of reads, thousands of taxa: microbial community structure and associations analyzed via marker genes. FEMS Microbiol Rev. 2016;40(5):686–700.
Ejrnæs R, Frøslev TG, Høye TT, Kjøller R, Oddershede A, Brunbjerg AK, Hansen AJ, Bruun HH. Uniquity: a general metric for biotic uniqueness of sites. Biol Conserv. 2018;225:98–105.
We thank Ako O. Mirza for plant and soil lab work, Vagn Alstrup and Roar Skovlund Poulsen for lichen surveys, volunteers that have helped in data collection, land owners, Karl-Henrik Larsson for aid in identifying critical corticioid fungi, Leif Örstadius for identifying Psathyrella collections. In regard to identification of invertebrate specimens, we would like to thank Henning Petersen for identifying Springtails (Collembola), Hjalte Kjærby for identifying Grasshoppers (Orthoptera) and Harvestmen (Opilliones), Kåre Fog for identifying snails (Gastropoda), Kåre Würtz Sørensen for identifying various Wasps (Symphyta, Spechidae, Crabronidae, and Vespidae), Lars Dyhrberg Bruun for identifying spiders (Araneae), Lars Brøndum for identifying Hoverflies (Syrphidae), Carrion Beetles (Silphidae) as well as Scarabs (Scarabidae), Lars Skipper for identifying True bugs (Heteroptera), Maja Møholt for identifying Dung beetles (Aphodius, Onthophagus and Geotrupidae) and Cantharidae, Marianne Graversen for identifying Longhorn beetles (Cerambycidae) and Ladybugs (Coccinellidae), Peter Wiberg-Larsen for identifying Caddisflies (Trichoptera), Mathias Holm for identifying True weevils and Seed weevils (Curculionidae and Apionidae), Monica Aimeé Harlund Oyre for identifying various Dipterans (Syrphida, Tachinadae, Stratiomyidae, Acroceridae, Rhagionidae, Tephritidae, Plastytomatidae, Asilidae) as well as Strepsipterans (Strepsiptera) and Book- and Barklice (Psocoptera), Morten D. D. Hansen for identifying Bees (Apoidea), Carrion beetles (Silphidae), Click beetles (Elateridae), Scarabs (Scarabaeidea) and Dung beetles (Aphodius, Onthophagus and Geotrupidae), Ole Fogh Nielsen for identifying net-winged insects (Neuroptera) and Strepsipterans (Strepsiptera), Oskar Liset Pryds Hansen and Emil Skovgaard Brandtoft for identifying Ground beetles (Carabidae), Sofie Amund Kjeldgaard and Steffen Kjeldgaard for identifying Owlet moths (Noctuidae), Mathias G. Skytte for identifying Rove beetles (Staphyllinidae), Simon Haarder for identifying galling and mining arthropods, and Ulrik Hasle Nielsen for identifying Cicadas (Cicadoidea) as well as numerous other volunteers. In regard to carrying out the eDNA lab work we would like to thank Anne Aagaard Lauridsen, Sarah Mak, Stine Raith Richter, Carlotta Pietroni and Ida Broman Nielsen. Thomas Stjernegaard Jeppesen (GBIF) is thanked for making the fungal eDNA dataset publicly available through the GBIF portal.
RE received a grant from VILLUM foundation (VKR-023343) for the Biowide project. VILLUM foundation did not play a role in the design of the study, data collection, analyses, interpretation of data or writing of the manuscript.
Ethics approval and consent to participate
No permission was required to carry out the sampling.
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Site characteristics for each of the 130 40 × 40 m sites.
Sampling design for data collection.
Ranges of environmental (abiotic and biotic) variables measured within the 130 sites as well as species richness of various taxonomic groups.
Temporal and spatial continuity for the 130 Biowide sites.
Relative richness of arthropods, bryophytes, gastropods, lichens, macrofungi and vascular plants across all 130 sites.
Number of species in each arthropod family for natural habitats, perceived areas of high species richness, arable land and plantations.
Species accumulation curves for arable sites, plantations and natural sites.
Correlation matrix for NMDS axes 1, 2 and 3 and environmental variables.
About this article
Cite this article
Brunbjerg, A.K., Bruun, H.H., Brøndum, L. et al. A systematic survey of regional multi-taxon biodiversity: evaluating strategies and coverage. BMC Ecol 19, 43 (2019). https://0-doi-org.brum.beds.ac.uk/10.1186/s12898-019-0260-x
- Abiotic gradients
- Biotic factors