Global perspective of environmental distribution and diversity of Perkinsea (Alveolata) explored by a meta-analysis of eDNA surveys



Phylogenetic classification of environmental Perkinsea sequences

Using a phylogenetic approach, we investigated the diversity and community structure of Perkinsea ASV sequences retrieved from the published EukBank dataset37.

We first conducted a phylogenetic analysis using a reference tree derived from an alignment of 1790 characters, which included representative alveolate groups and environmental sequences. This analysis successfully retrieved all major clusters described in the literature, including NAG01, Perkinsidae, Xcellidae, Parviluciferaceae, and environmental clusters (referred to as ‘Perkinsea environmental cluster 01–04’). These clusters exhibited robust node support in both maximum-likelihood and Bayesian inferences (TBE > 0.8 and PP > 0.90, Fig. S1). However, a few sequences, such as M. nigrum (MN721813.1 and MN721814.1) and P. dinoexitiosum (MZ663823.1 and MZ663830.1), displayed low branch support, underscoring the uncertainty in their phylogenetic placement as described in previous analyses12,13.

From the EukBank dataset, we extracted 1647 ASVs preliminarily classified as Perkinsea using a pipeline developed by the EukBank consortium. The taxonomic affiliation was checked by phylogenetic analysis against our reference dataset. Of the 1647 ASVs, the phylogenetic analysis confirmed that 1568 were not chimeras and branched with known Perkinsea. These Perkinsea ASVs represented 1,075,904 reads and were found across 4034 samples. They were placed throughout the phylogenetic tree, illustrating the extensive diversity retrieved within the Perkinsea lineage (Fig. 1). The likelihood weight ratio (LWR) analysis provided high-confidence assignments for 67% of the ASVs, which were placed within their respective branches with an LWR greater than 0.50 (Fig. S2). In total, 318 ASVs (~ 20.3%) were associated with Parviluciferaceae, 214 ASVs (~ 13.6%) with NAG01, 52 ASVs (~ 3.3%) with Xcellidae, 39 ASVs (~ 2.5%) with Perkinsidae and 5 ASVs (~ 0.3%) with P. dinoexitiosum. Additionally, some ASVs were classified into environmental clusters: ‘Perkinsea cluster 01’ (200 ASVs, ~ 12.7%), ‘Perkinsea cluster 02’ (172 ASVs, ~ 11.0%), ‘Perkinsea cluster 03’ (5 ASVs, ~ 0.3%) and ‘Perkinsea cluster 04’ (12 ASVs, ~ 0.8%). The remaining 551 ASVs (~ 35.2% of the ASVs) could not be assigned to any of the nine defined clades and were designated as ‘unclassified Perkinsea’ ASVs. No ASVs related to M. nigrum were detected in our analysis.
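The clade tallies above can be cross-checked with a few lines of Python. This is only an illustrative sketch: the counts are copied from the text, while the names `CLADE_COUNTS` and `clade_percentages` are hypothetical and not part of the EukBank pipeline.

```python
# ASV counts per clade, copied from the paragraph above.
CLADE_COUNTS = {
    "Parviluciferaceae": 318,
    "NAG01": 214,
    "Xcellidae": 52,
    "Perkinsidae": 39,
    "P. dinoexitiosum": 5,
    "Perkinsea cluster 01": 200,
    "Perkinsea cluster 02": 172,
    "Perkinsea cluster 03": 5,
    "Perkinsea cluster 04": 12,
    "unclassified Perkinsea": 551,
}


def clade_percentages(counts):
    """Return each clade's share of the total ASV count, in percent."""
    total = sum(counts.values())
    return {clade: 100.0 * n / total for clade, n in counts.items()}


if __name__ == "__main__":
    print("total ASVs:", sum(CLADE_COUNTS.values()))
    for clade, pct in clade_percentages(CLADE_COUNTS).items():
        print(f"{clade}: {pct:.2f}%")
```

The counts sum to the 1568 confirmed ASVs, and the recomputed shares agree with the approximate percentages quoted in the text to within about 0.1 percentage point.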

Figure 1

Phylogenetic placement of ASVs into the Perkinsea reference tree. The groups of ASVs were collapsed and highlighted in pink. The red numbers correspond to the number of ASVs in each collapsed cluster.

Diversity and distribution of Perkinsea sequences

To investigate the diversity and distribution patterns of the identified Perkinsea, we classified the 4034 samples into three main environmental types: ‘Marine’ (2567 samples), ‘Land water’ (410 samples), and ‘Soil’ (1057 samples). Within these categories, we detected 1001 Perkinsea ASVs in ‘Marine’ samples, 601 ASVs in ‘Land water’, and 269 ASVs in ‘Soil’. The sequencing effort for each environment, represented by the percentage of Perkinsea ASVs out of the total number of reads, ranged from approximately 0.1% in both ‘Marine’ and ‘Soil’ environments to 2.4% in ‘Land water’. These findings are consistent with the results of Jobard et al.25.

Most ASVs exhibited habitat specificity, with 827 ASVs found exclusively in ‘Marine’ environments, 348 ASVs recovered exclusively from ‘Land water’ and 129 ASVs specific to ‘Soil’ (Fig. 2A). Only 39 ASVs were retrieved across all three categories. These shared ASVs were associated with the NAG01 cluster (16 ASVs), ‘unclassified Perkinsea’ (13 ASVs), ‘Perkinsea cluster 02’ (8 ASVs) and ‘Perkinsea cluster 01’ (2 ASVs). ‘Marine’ environments shared 163 ASVs with ‘Land water’ and 50 ASVs with ‘Soil’, while ‘Land water’ and ‘Soil’ shared 129 ASVs.
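The Venn counts can be reconciled with the per-environment totals by simple inclusion-exclusion. The sketch below assumes that each pairwise count (163, 50, 129) includes the 39 ASVs found in all three environments; that reading is an inference from the arithmetic, not something stated in the text, and the function name `venn_totals` is hypothetical.

```python
def venn_totals(only_a, only_b, only_c, ab, ac, bc, abc):
    """Per-set totals and union size for a three-set Venn diagram.

    only_a, only_b, only_c: members exclusive to one set
    ab, ac, bc: pairwise intersections, each assumed to INCLUDE the triple overlap
    abc: members present in all three sets
    """
    # Strip the triple overlap out of each pairwise intersection.
    ab_only, ac_only, bc_only = ab - abc, ac - abc, bc - abc
    total_a = only_a + ab_only + ac_only + abc
    total_b = only_b + ab_only + bc_only + abc
    total_c = only_c + ac_only + bc_only + abc
    union = only_a + only_b + only_c + ab_only + ac_only + bc_only + abc
    return total_a, total_b, total_c, union


if __name__ == "__main__":
    # a = Marine, b = Land water, c = Soil; counts copied from the text.
    marine, land_water, soil, union = venn_totals(827, 348, 129, 163, 50, 129, 39)
    print(marine, land_water, soil, union)
```

Under that assumption the sketch returns 1001, 601 and 269 ASVs for ‘Marine’, ‘Land water’ and ‘Soil’, and 1568 ASVs overall, which matches the totals reported earlier.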

Figure 2

(A) Venn diagram of ASVs shared across environments. (B) Non-metric multi-dimensional scaling ordination plot (NMDS) based on Bray–Curtis distances, with 87 samples from ‘Marine’ environments, 53 from ‘Land water’ and 40 from ‘Soil’. (C) PCoA based on the phylogenetic distance of the samples.

The NMDS ordination plot based on the Bray–Curtis distance revealed a separation among the ‘Marine’, ‘Land water’ and ‘Soil’ categories (Fig. 2B). According to the ANOSIM analysis based on the Bray–Curtis distances (R: 0.14, significance: 0.001, permutations: 999), the communities differed significantly between categories. A similar pattern was observed for the PCoA analysis based on the weighted UniFrac distance (Fig. 2C, R: 0.29, significance: 0.001, permutations: 999). Our analysis also revealed a continuum between these ecological niches. ‘Soil’ and ‘Land water’ were more similar to each other than to ‘Marine’ samples, with the exception of ‘Estuarine’ and ‘Epipelagic zone’ samples, which presented similarities with ‘Lake’ and ‘River’ samples (Figs. 2B, 3). Indeed, ‘Temperate forests’ shared 95 (78%) ASVs with ‘Lake’, while the ‘Epipelagic zone’ and ‘Estuarine’ shared 103 (20%) and 43 (78%) ASVs with the ‘Lake’ environment, respectively (Table S1). Nevertheless, pairwise ANOSIM showed that these communities differed significantly for both the Bray–Curtis and weighted UniFrac distances (p value < 0.05), with the exception of the weighted UniFrac distance between ‘River’ and ‘Estuarine’ samples (p value = 0.123, Table S2). The remaining ‘Marine’ samples presented very dissimilar communities and could be separated into distinct groups: ‘Coastal zone’ samples, positioned close to ‘Marine sediments’ samples (which shared 76% of the ‘Coastal zone’ ASVs), were separated from ‘Bathypelagic’, ‘Mesopelagic’ and ‘Abyssal zone’ samples. These last three environments shared ~ 30% of their ASVs (Table S1). However, despite the relatively high number of shared ASVs, the communities differed significantly, including between ‘Epipelagic zone’ and ‘Marine sediment’, which shared 42% of the ASVs (p value < 0.05, Table S2).
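For readers unfamiliar with the statistics quoted above, the following from-scratch sketch shows how a Bray–Curtis dissimilarity and the ANOSIM R statistic are computed. It is not the software used in the study (which is not named in this section); the abundance table is invented toy data, all function names are hypothetical, and the permutation test behind the reported significance values is omitted.

```python
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def anosim_r(samples, groups):
    """R = (mean between-group rank - mean within-group rank) / (M / 2),
    where M is the number of sample pairs. Needs at least one pair of each kind."""
    pairs = list(combinations(range(len(samples)), 2))
    ranks = average_ranks([bray_curtis(samples[i], samples[j]) for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (len(pairs) / 2)


if __name__ == "__main__":
    # Toy ASV abundance table: two 'Marine' and two 'Soil' samples (invented).
    toy_samples = [[10, 0, 0], [8, 2, 0], [0, 5, 5], [0, 4, 6]]
    toy_groups = ["Marine", "Marine", "Soil", "Soil"]
    print(anosim_r(toy_samples, toy_groups))
```

R approaches 1 when between-group dissimilarities all outrank within-group ones, as in this toy table, and is close to 0 when group membership explains little. The values of 0.14 and 0.29 reported above therefore indicate significant but modest separation, consistent with the continuum described in the text.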

Figure 3

Distribution of Perkinsea ASVs into the different sub-environment categories. Each plot represents a specific main environment, such as Marine (A), Land water (B), and Soil (C). The description of each sub-category per environment is next to each plot. The width of the connectors is the number of ASVs corresponding to the specific taxonomic group. The taxonomic group description is in the lower boxed panel and applies to all plots.

Alpha diversity indices, including the Shannon index and phylogenetic diversity (PD), indicated that ‘Land water’ exhibited the highest diversity, with mean values of 1.1 and 1.3, respectively. In contrast, ‘Marine’ and ‘Soil’ samples showed lower mean Shannon and PD values, averaging around 0.4 (Fig. S4). Additionally, ‘Land water’ had the highest mean number of observed ASVs per sample (11 ASVs), followed by ‘Marine’ (2.3 ASVs per sample) and ‘Soil’ (2 ASVs per sample) (Fig. S4).
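As a reminder of what these per-sample values measure, the sketch below computes the Shannon index and the observed richness from a vector of read counts. It assumes the natural logarithm, which this section does not specify, and uses invented counts; the function names are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln(p_i)) over ASVs with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def observed_richness(counts):
    """Number of ASVs detected in the sample."""
    return sum(1 for c in counts if c > 0)


if __name__ == "__main__":
    toy_sample = [40, 30, 20, 10, 0]  # invented read counts for five ASVs
    print(shannon_index(toy_sample), observed_richness(toy_sample))
```

A sample dominated by a single ASV has an index near 0, and a sample with two equally abundant ASVs has an index of ln 2 (about 0.69), so mean values around 0.4 correspond to samples containing very few Perkinsea ASVs.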

Regarding the Net Relatedness Index (NRI) analysis, higher indices were observed in marine water, particularly in the sub-category of ‘deep’ samples from the Abyssal zone, with a mean NRI of 1.18 and a PD value of 0.69 (Fig. S4). This result suggests the presence of phylogenetically clustered ASVs in the ‘deep’ samples. However, rarefaction curves indicated that only the diversity of land water had been adequately sampled, as only that curve reached a plateau, revealing that the Perkinsea diversity retrieved in ‘Marine’ and ‘Soil’ samples is still under-sampled (Fig. S5).
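The NRI compares the mean pairwise phylogenetic distance (MPD) observed in a sample with the MPD expected under a null model, as NRI = -(MPD_obs - mean(MPD_null)) / sd(MPD_null). The sketch below illustrates the calculation on an invented four-taxon distance matrix. As a simplifying assumption, the null distribution is built by enumerating every same-size subset of the taxon pool rather than by random draws, and the population standard deviation is used; the null model actually applied in the study is not described in this section.

```python
from itertools import combinations
from statistics import mean, pstdev


def mpd(dist, taxa):
    """Mean pairwise phylogenetic distance among the given taxa (needs >= 2 taxa)."""
    return mean(dist[a][b] for a, b in combinations(taxa, 2))


def nri(dist, community):
    """Net Relatedness Index of `community` against all same-size subsets of the pool."""
    pool = range(len(dist))
    null = [mpd(dist, subset) for subset in combinations(pool, len(community))]
    sd = pstdev(null)
    if sd == 0:
        return 0.0
    return -(mpd(dist, community) - mean(null)) / sd


if __name__ == "__main__":
    # Invented distances: taxa 0 and 1 are close relatives, as are taxa 2 and 3.
    toy_dist = [
        [0, 1, 10, 10],
        [1, 0, 10, 10],
        [10, 10, 0, 1],
        [10, 10, 1, 0],
    ]
    print(nri(toy_dist, [0, 1]))  # close relatives: positive NRI (clustered)
    print(nri(toy_dist, [0, 2]))  # distant relatives: negative NRI (overdispersed)
```

Positive values, such as the mean of 1.18 reported for the Abyssal zone, mean that co-occurring ASVs are more closely related than expected by chance.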

The Perkinsea ASVs revealed distinct affiliations for each taxonomic group. Indeed, phylogenetic analysis showed that ASVs from ‘Marine’ samples branched within all the identified groups, especially ‘unclassified Perkinsea’ (362 ASVs, 36.2%) and Parviluciferaceae (316 ASVs, 31.6%). Additionally, 52 ASVs (5.2%) were affiliated with Xcellidae, and 39 ASVs (3.9%) were related to Perkinsidae clusters. A small number of ASVs were associated with ‘Perkinsea cluster 04’ (12 ASVs, 1.2%), ‘Perkinsea cluster 03’ (2 ASVs, 0.2%) and P. dinoexitiosum (5 ASVs, 0.5%). Surprisingly, some ASVs from ‘Marine’ samples branched within clusters that are typically associated with freshwater environments, such as NAG01 (90 ASVs, 8.9%), ‘Perkinsea cluster 01’ (72 ASVs, 7.2%), and ‘Perkinsea cluster 02’ (51 ASVs, 5.1%).

To explore the ecological preferences of the phylogenetic groups, particularly in marine samples, the environmental categories were subdivided into sub-categories that represent different environments (Figs. 3 and S6). ASVs related to marine ‘unclassified Perkinsea’ were predominantly detected in marine sediments (228 ASVs) and the epipelagic zone (170 ASVs). A similar distribution was observed for Parviluciferaceae and Perkinsidae, with 182 and 12 ASVs detected in sediments and 151 and 23 ASVs in the epipelagic zone, respectively. Xcellidae ASVs were predominantly detected in the open ocean’s mesopelagic zone (42 ASVs) and bathypelagic zone (16 ASVs). ASVs related to ‘Perkinsea cluster 04’ were mainly observed in sediments (11 ASVs). Surprisingly, Perkinsidae (21 ASVs) and Parviluciferaceae (122 ASVs) were also detected in open ocean samples at considerable depths, ranging from the ‘mesopelagic’ zone (200–1000 m depth) to the ‘abyssal’ zone (below 4000 m depth). The remaining groups detected in marine samples, namely NAG01 and ‘Perkinsea cluster 01–03’, were mainly found in the epipelagic zone, estuarine areas, and sediments. ASVs related to P. dinoexitiosum were retrieved mainly from the epipelagic zone (4 ASVs), sediments (2 ASVs), and the mesopelagic zone (1 ASV).

‘Land water’ samples were mainly represented by ASVs related to NAG01 (181 ASVs, 30.1% of the total ‘Land water’ ASVs), ‘Perkinsea cluster 01’ (174 ASVs, 29.3%), ‘Perkinsea cluster 02’ (92 ASVs, 15.3%), and ‘unclassified Perkinsea’ (147 ASVs, 24.5%). ‘Perkinsea cluster 03’ (3 ASVs, 0.5%) and Parviluciferaceae (2 ASVs, 0.3%) were also detected in ‘Land water’ samples. ASVs related to NAG01 (268 ASVs), ‘Perkinsea cluster 01’ (267 ASVs), ‘Perkinsea cluster 02’ (63 ASVs) and ‘unclassified Perkinsea’ (157 ASVs) were mostly detected in lake, river and sediment samples. Brackish water and bromeliad tank samples were represented mainly by ASVs related to ‘unclassified Perkinsea’ (5 and 7 ASVs, respectively) and ‘Perkinsea cluster 02’ (3 ASVs). In High Arctic water samples, ASVs related to ‘unclassified Perkinsea’ (20 ASVs) and ‘Perkinsea cluster 02’ (10 ASVs) were also retrieved.

Surprisingly, ‘Soil’ samples were represented by ASVs branching within ‘unclassified Perkinsea’ (141 ASVs, 52.4% of the total ASVs observed in ‘Soil’ samples), ‘Perkinsea cluster 02’ (79 ASVs, 29.4%), NAG01 (43 ASVs, 16.0%) and ‘Perkinsea cluster 01’ (6 ASVs, 2.2%). ASVs related to ‘unclassified Perkinsea’ were detected in all ‘Soil’ subcategories. The next most abundant group in ‘Soil’ samples, in terms of ASVs, was ‘Perkinsea cluster 02’. NAG01 ASVs were retrieved in ‘Soil’ samples categorized as Temperate forests (33 ASVs), Tropical forests (10 ASVs), Temperate grassland (2 ASVs) and Temperate Land soils (5 ASVs). ‘Perkinsea cluster 01’ ASVs were detected in Temperate Land soil samples (4 ASVs) and in Temperate and Tropical Forest samples (1 ASV each).

A large number of soil ASVs were retrieved exclusively from Temperate forest samples (96 ASVs), followed by Tropical forest samples (25 ASVs) and Temperate Land soils (5 ASVs). However, these results may be biased by the variable number of samples available for each environment. Nonetheless, we observed many ASVs shared between soil environments, even between contrasting ones such as Temperate and Tropical forests (14 ASVs). This suggests that putative soil-dwelling Perkinsea can adapt to diverse habitats.

Potential novel Perkinsea taxa

By applying specific criteria based on the LWR and the percentage of similarity (mean LWR < 0.69 and mean % similarity < 94.8%), we identified 428 ASVs potentially related to novel groups (Fig. S7). These ‘novel’ ASVs branched widely across the reference tree (Fig. 4). Among them, 320 ASVs were detected in ‘Marine’ environments, 95 in ‘Land water’ and 50 in ‘Soil’ environments. The novel ASVs closely related to Parviluciferaceae were detected in marine sediments. ASVs related to Perkinsidae were mostly detected in the epipelagic and bathypelagic zones, except for one ASV detected in Abyssal zone samples (Fig. 4). Novel ASVs related to NAG01 were mainly retrieved from lakes and land water sediments (22 ASVs). Still, several of these novel ASVs were also recovered from the marine epipelagic zone (7 ASVs) and from soil samples (3 ASVs in Temperate and 2 ASVs in Tropical forest soil samples). ‘Perkinsea cluster 01’ novel ASVs were mostly detected in ‘Land water’ samples, with a few observations in the marine ‘epipelagic zone’ (3 ASVs) and sediment samples (3 ASVs). ‘Perkinsea cluster 02’ novel ASVs were retrieved from a wide range of samples, including 36 ASVs in soil and land water samples, 10 ASVs in marine samples, four ASVs in the Bathypelagic zone, and two ASVs in sediment samples. Two groups can be distinguished among the ‘unclassified Perkinsea’ novel ASVs. Those branching close to marine Perkinsea lineages were mostly detected in the Epipelagic zone and marine sediments, following the same patterns as the closely related defined lineages. However, those placed basal to NAG01 and ‘Perkinsea cluster 01 to 03’ showed a different distribution, with some ASVs retrieved solely from marine samples (17 ASVs) and others detected in both land water and soil samples (18 ASVs). In total, 13 novel ASVs were detected in all three contrasting environments (‘Marine’, ‘Land water’ and ‘Soil’).
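The two-threshold criterion for flagging potentially novel ASVs can be expressed as a simple filter. The sketch below uses the thresholds quoted in the text; the `Placement` record, the function names and the example ASV identifiers are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass

LWR_MAX = 0.69         # mean LWR threshold quoted in the text
SIMILARITY_MAX = 94.8  # mean % similarity threshold quoted in the text


@dataclass
class Placement:
    asv_id: str
    mean_lwr: float
    mean_similarity: float  # percent similarity to reference sequences


def is_novel(p):
    """An ASV is flagged only if BOTH values fall below their thresholds."""
    return p.mean_lwr < LWR_MAX and p.mean_similarity < SIMILARITY_MAX


def novel_asvs(placements):
    return [p.asv_id for p in placements if is_novel(p)]


if __name__ == "__main__":
    toy = [
        Placement("asv_a", 0.40, 91.0),  # uncertain placement, divergent: flagged
        Placement("asv_b", 0.95, 90.0),  # divergent but confidently placed
        Placement("asv_c", 0.30, 99.0),  # uncertain placement but near-identical
    ]
    print(novel_asvs(toy))
```

Requiring both conditions keeps confidently placed or near-identical sequences out of the ‘novel’ set, so only ASVs that are both poorly resolved on the reference tree and divergent from known sequences are retained.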

Figure 4

Phylogenetic placement of ‘Novel ASVs’ into the Perkinsea reference tree. The ASVs are highlighted in pink. The outer heat map circles show the sampling provenance of each ASV. Each sub-circle is identified by a letter which corresponds to a sub-category: (a) ‘Coastal zone’, (b) ‘Epipelagic zone’ (0–200 m depth), (c) ‘Mesopelagic zone’ (200–1000 m depth), (d) ‘Bathypelagic zone’ (1000–4000 m depth), (e) ‘Abyssal zone’ (below 4000 m depth), (f) ‘Pelagic’ (includes marine water samples from unspecified depth), (g) ‘Sediment’, (h) ‘Estuarine’, (i) ‘Arctic mixed water’, (j) ‘Marine others’, (k) ‘Lakes’, (l) ‘Rivers’, (m) ‘Sediments’, (n) ‘Brackish’, (o) ‘Saline spring sediment’, (p) ‘High Arctic water’, (q) ‘Bromeliads tank water’, (r) ‘Forest soil (Temperate)’, (s) ‘Forest soil (Tropical)’, (t) ‘Cropland soil (Temperate)’, (u) ‘Cropland soil (Tropical)’, (v) ‘Grassland soil (Temperate)’ and (w) ‘Land soil (Temperate)’. Color scales are detailed in the legend (left) and indicate the number of times the ASV was retrieved from the environment compared with the total number of observations of the ASV. The main clusters described are indicated on the right of the legend box.

Xcellidae distribution in open oceans

One noteworthy finding was that the ASVs related to Xcellidae were mostly detected in the Mesopelagic zone (between 200 and 1000 m depth) and exhibited a global distribution across all oceans (Fig. S6). However, it is important to consider that the detected eDNA could originate from various sources, including free-living organisms, potentially infected host organisms, metabolically inactive or dead cells, or free DNA28. To investigate this question, we conducted a case study using the published Malaspina Expedition circumnavigation dataset. The expedition took place from 2010 to 2011 and involved sampling the tropical and subtropical regions of the Atlantic, Indian, and Pacific Oceans. This expedition is unique because (i) it covered vertical water column profiles spanning seven depths, ranging from the surface to 4000 m depth, and (ii) both rDNA and rRNA were used as templates for environmental sequencing. Using this dataset, we investigated the ‘relative’ ribosomal activity of Xcellidae in marine samples using DNA- and RNA-derived sequencing58. We identified 90 ASVs related to Xcellidae. These ASVs contributed less than 0.1% of the total reads in both DNA and RNA samples. We analyzed the rRNA/rDNA ratio, which can serve as a proxy for the ‘relative’ ribosomal activity. Our results show that, in most mesopelagic samples, the Xcellidae ASVs exhibited higher relative rRNA contributions than rDNA contributions. This indicates potential ribosomal activity, suggesting the presence of putatively living organisms related to Xcellidae in the mesopelagic zone of the marine water column (Fig. S8).
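The rRNA/rDNA proxy can be illustrated with a short calculation. The sketch below assumes the ratio is taken between the relative abundances of a taxon in the RNA- and DNA-derived libraries of the same sample; the exact normalization used in the study is not given in this section, and the read counts and function names are invented for illustration.

```python
def relative_abundance(reads, total_reads):
    """Share of a library's reads assigned to the taxon of interest."""
    return reads / total_reads if total_reads else 0.0


def rrna_rdna_ratio(rna_reads, rna_total, dna_reads, dna_total):
    """Ratio of relative abundances in the RNA and DNA libraries.

    Returns None when the taxon is absent from the DNA library,
    because the ratio is then undefined.
    """
    dna_rel = relative_abundance(dna_reads, dna_total)
    if dna_rel == 0:
        return None
    return relative_abundance(rna_reads, rna_total) / dna_rel


if __name__ == "__main__":
    # Invented counts for one mesopelagic sample.
    ratio = rrna_rdna_ratio(rna_reads=30, rna_total=100_000,
                            dna_reads=10, dna_total=100_000)
    print(ratio)
```

A ratio above 1 means the taxon contributes proportionally more to the rRNA pool than to the rDNA pool, which is the pattern reported for Xcellidae in most mesopelagic samples.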
