Metagenomic analysis of viromes in tissues of wild Qinghai vole from the eastern Tibetan Plateau | Scientific Reports – Nature.com

Posted: October 15, 2022 at 5:22 pm

Overview of the viromes

In all, 41 wild Qinghai voles were collected from pasture habitats on the eastern Tibetan Plateau, China (Fig. 1). Tissue samples from liver, lung, spleen, small intestine (with content), and feces (large intestinal content) of each vole were disrupted, and viral RNA was extracted. The RNA samples were combined in equal quantities into 20 pools according to sample type (Supplementary Table S1). Overall, 729,234,124 paired-end reads with an average length of 150 bp were obtained from the 20 libraries, an average of 36.5M (95% CI: 35.2 to 37.8M) reads per pool. After filtering with fastp, 98.3 to 99.5% of raw reads were retained, and 722,035,886 clean reads were used for further analyses, of which 67.5% mapped to the host genome. Reads classified as cellular organisms (including eukaryotes, bacteria, and archaea) and those with no significant similarity to any amino acid sequence were discarded, leaving 1,472,071 reads that best matched viral sequences, accounting for 0.31% of total clean reads. Because transcripts from the host and other organisms were abundant, most pools had low levels of viral RNA. The percentage of virus-associated reads per pool ranged from 0.05 to 2.47% (Supplementary Table S1).
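The per-pool average and the overall retention rate follow directly from the read counts quoted above. The short sketch below redoes that arithmetic; the function and constant names are illustrative and not part of the study's pipeline.

```python
# Read counts as reported in the text above.
RAW_READS = 729_234_124      # paired-end reads across all libraries
CLEAN_READS = 722_035_886    # reads retained after fastp filtering
N_POOLS = 20                 # number of sequencing libraries (pools)


def mean_reads_per_pool_millions(total_reads: int, n_pools: int) -> float:
    """Average number of reads per pool, expressed in millions."""
    return total_reads / n_pools / 1e6


def percent(part: int, whole: int) -> float:
    """Share of `whole` represented by `part`, as a percentage."""
    return 100.0 * part / whole


mean_m = mean_reads_per_pool_millions(RAW_READS, N_POOLS)
retained = percent(CLEAN_READS, RAW_READS)

print(f"mean reads per pool: {mean_m:.1f}M")
print(f"overall retention after filtering: {retained:.2f}%")
```

The overall retention computed this way is a single pooled figure, so it should fall inside the per-pool range given in the text rather than reproduce either end of it.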

Fig. 1. Satellite map (left) and topographic map (right) of the rodent collection area on the eastern Tibetan Plateau of China. Shiqu county is highlighted in yellow, and Sichuan province is marked in light gray. The geographic coordinate of the collection site (E97443E.67", N331040.40") is marked on the topographic map. The map was generated by SuperMap (http://www.supermapol.com/).

The sequence reads covered a wide range of DNA and RNA virus groups. Virus-associated reads were assigned to 46 families of double-stranded (ds)DNA viruses, dsRNA viruses, retro-transcribing viruses, single-stranded (ss)DNA viruses, and ssRNA viruses (positive- and negative-strand) in the virus root. Based on the natural host of each virus, we classified 13 of these families as vertebrate-associated viruses (6 zoonotic and 7 non-zoonotic rodent-associated), 11 as bacteriophages, 10 as plant viruses, 6 as fungal viruses, 1 as an insect virus, and 5 as eukaryotic microorganism (protozoa and algae)-related viruses, along with a group of unclassified viruses (Supplementary Fig. S2 and Table S2). An overview of the classified and unclassified viral reads is shown in Fig. 2A.

Fig. 2. Proportion of viral sequence reads with BLASTX hits to the specified virus families. (A) Proportion in each library. The y-axis shows the percentage of viral reads assigned to each classification or left unclassified. The sample ID is shown on the x-axis. Percentages were calculated from the raw number of virus-related reads. (B) Proportion in total viral reads.

Vertebrate viruses accounted for the largest proportion of the classified sequences, 81.93% of total viral reads, comprising zoonotic viruses (1.4%) and non-zoonotic rodent-associated viruses (80.53%). Among them, sequences related to ssRNA positive-strand viruses of the Picornaviridae were the most abundant, making up 65.6% of total virus-like sequence reads. The dsDNA viruses were predominantly bacteriophages, such as Ackermannviridae, Autographiviridae, unclassified bacteriophages, and nine other families, accounting for 12.8% of total viral reads. In addition, 4.3% of the viral sequences were related to insect viruses (0.06%), plant viruses (2.16%), fungal viruses (0.99%), and eukaryotic microorganism viruses (1.08%) (Fig. 2B). These sequences may derive from food consumed by the voles. Beyond the family-assigned reads, 0.94% of total viral reads were identified as unclassified RNA viruses, including diverse Bunyavirales, Picornavirales, Riboviria, and environment-related viruses. Excluding unclassified viruses and bacteriophages (11 families), the 10 most widely distributed virus families were Picornaviridae, Flaviviridae, Retroviridae, Picobirnaviridae, Solemoviridae, Arteriviridae, Mitoviridae, Mimiviridae, Phycodnaviridae, and Reoviridae. Overall, the wild Qinghai vole samples showed marked virus diversity.

Venn analyses revealed that 21 viral families, including Picornaviridae, Flaviviridae, Retroviridae, and Picobirnaviridae, were present in all five tissues (Fig. 3, Supplementary Table S3). In contrast, nine viral families (Coronaviridae, Parvoviridae, Hypoviridae, Autographiviridae, and five plant virus families) were unique to feces, which suggests that these viruses have compartment specificity. In addition, six viral families were shared between intestine and feces. The other 12 viral families were found in at least two tissues, one of which was feces.

Fig. 3. Venn diagram of viral families shared among the five tissues. The numbers represent the viral families found in each tissue. A total of 48 viral taxa were analyzed and displayed: 46 viral families, one group of unclassified viruses, and one group of unclassified bacteriophages.
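The Venn analysis described above amounts to set operations on a presence table: the core is the intersection of all tissues, and a tissue-specific taxon is one found in no other tissue. The sketch below illustrates this with a toy table. Only the placement of Picornaviridae (all five tissues) and Coronaviridae (feces only) follows the text; `taxon_X` and `taxon_Y` are hypothetical placeholders, and the function names are not from the study.

```python
from typing import Dict, Set


def core_taxa(presence: Dict[str, Set[str]]) -> Set[str]:
    """Taxa detected in every sample type (the centre of the Venn diagram)."""
    sets = list(presence.values())
    return set.intersection(*sets) if sets else set()


def unique_taxa(presence: Dict[str, Set[str]], sample: str) -> Set[str]:
    """Taxa detected in `sample` and in no other sample type."""
    others: Set[str] = set()
    for name, taxa in presence.items():
        if name != sample:
            others |= taxa
    return presence[sample] - others


# Toy presence table (hypothetical, NOT the study's data).
presence = {
    "liver": {"Picornaviridae", "taxon_X"},
    "lung": {"Picornaviridae"},
    "spleen": {"Picornaviridae", "taxon_X"},
    "intestine": {"Picornaviridae", "taxon_Y"},
    "feces": {"Picornaviridae", "Coronaviridae", "taxon_Y"},
}

print("core:", sorted(core_taxa(presence)))
print("feces only:", sorted(unique_taxa(presence, "feces")))
```

Counting the members of each such set for the real presence table would give the numbers printed in the diagram's regions.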

Liver and feces accounted for 55.3% and 26.1% of total viral reads, respectively, which suggests that they act as major reservoirs for diverse viruses in wild Qinghai voles. To detect differences in virome structure among the samples, taxonomic heatmap and hierarchical cluster analyses were conducted on the normalized viral read counts. A heatmap of the reads assigned to the 46 viral families, unclassified viruses, and unclassified bacteriophages identified in this study is shown in Fig. 4. In liver, Picornaviridae, Flaviviridae, Iridoviridae, and Poxviridae were abundant. In lung, Herpesviridae and Arteriviridae were the most abundant virus families. The most abundant viral family in spleen was Retroviridae. In intestine, Ackermannviridae and Circoviridae were abundant. In feces, by contrast, 37 viral families and the unclassified viruses were abundant. Liver and feces samples clustered together and apart from the other tissues, which indicates more closely related virome structures. Overall, our results revealed marked differences in virus composition and abundance among tissues.

Fig. 4. Heatmap and hierarchical clustering of the 20 pools, based on a Euclidean distance matrix calculated from the normalized number of reads belonging to each viral family. The x-axis shows sample names, and the y-axis shows the names of viral families. Colors run from red (highest abundance of viral reads) to blue (lowest abundance). A total of 48 viral taxa were analyzed and displayed: 46 viral families, one group of unclassified viruses, and one group of unclassified bacteriophages. The heatmap was generated by Hiplot (v0.2.0, https://hiplot.com.cn).
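The clustering behind the heatmap starts from pairwise Euclidean distances between the pools' abundance profiles. The source does not state the normalization or the linkage rule, so the sketch below covers only the distance matrix and the first agglomerative step (finding the closest pair). The pool names and values are hypothetical and assumed to be already normalized.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def euclidean_distances(profiles: Dict[str, List[float]]) -> Dict[Pair, float]:
    """Pairwise Euclidean distances between per-pool abundance profiles."""
    return {
        (a, b): math.dist(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


def closest_pair(distances: Dict[Pair, float]) -> Pair:
    """The two pools that an agglomerative clustering would merge first."""
    return min(distances, key=distances.get)


# Toy normalized abundances for three viral families (hypothetical values).
profiles = {
    "pool_A": [3.0, 0.0, 1.0],
    "pool_B": [3.0, 4.0, 1.0],
    "pool_C": [0.0, 0.0, 1.0],
}

dist = euclidean_distances(profiles)
for pair, d in dist.items():
    print(pair, round(d, 2))
print("merged first:", closest_pair(dist))
```

A full dendrogram repeats this step after each merge, with the linkage rule deciding how distances to a merged cluster are recomputed.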

Based on host traits and transmission routes, the non-vertebrate-associated viral reads, bacteriophage reads, and unclassified virus reads described above were removed. The remaining 1,206,124 viral reads (approximately 81.93% of total viral reads) were assigned to 13 vertebrate-related viral families. Reads from the families Picornaviridae, Flaviviridae, Retroviridae, Picobirnaviridae, Arteriviridae, Poxviridae, and Herpesviridae were widely distributed across tissues at different abundances, whereas the families Reoviridae, Adenoviridae, Astroviridae, Coronaviridae, Circoviridae, and Parvoviridae were found in few tissue types. In total, 965,703 reads (65.6% of total viral reads) showed sequence similarity to Picornaviridae, the largest share of the viral reads (Supplementary Table S2). The other mammalian virus sequences, in order of read abundance, were Flaviviridae (8.27%), Retroviridae (3.36%), Picobirnaviridae (3.27%), and the remaining families, which together accounted for 1.43% of viral reads. Viruses belonging to a genus or family known to cause human or animal infection were confirmed by PCR amplification using specific primers. All of these viral reads were extracted from each dataset and assembled de novo with SPAdes; the length and depth of the assembled contigs are shown in Supplementary Table S4. BLAST results indicated that these genomes had low nucleotide (nt) or amino acid (aa) similarity to known genome sequences in the GenBank database. We characterized some of these full or near-full genome sequences and compared them with their closest relatives by phylogenetic analyses.
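As a quick consistency check on the figures in this paragraph, the sketch below recomputes the Picornaviridae and vertebrate-associated shares from the quoted read counts and confirms that the per-family percentages add up to the vertebrate total. The variable names are illustrative, and the percentages for families other than Picornaviridae are taken as reported.

```python
# Read counts quoted in the text.
TOTAL_VIRAL_READS = 1_472_071
VERTEBRATE_READS = 1_206_124
PICORNAVIRIDAE_READS = 965_703


def share(reads: int, total: int = TOTAL_VIRAL_READS) -> float:
    """Percentage of total viral reads."""
    return 100.0 * reads / total


vertebrate_pct = share(VERTEBRATE_READS)
picorna_pct = share(PICORNAVIRIDAE_READS)

# Percentages reported for the remaining vertebrate-associated families.
reported_other = {
    "Flaviviridae": 8.27,
    "Retroviridae": 3.36,
    "Picobirnaviridae": 3.27,
    "other families": 1.43,
}
summed_pct = picorna_pct + sum(reported_other.values())

print(f"vertebrate-associated: {vertebrate_pct:.2f}%")
print(f"Picornaviridae:        {picorna_pct:.2f}%")
print(f"sum over families:     {summed_pct:.2f}%")
```

The two totals agree to within rounding of the reported percentages.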

Eleven near-complete genomic sequences of picorna-like viruses were identified in all tissues except lung. Reads related to the family Picornaviridae made up the largest proportion of viral reads, particularly in liver (85.9%), small intestine (51.5%), and feces (45.7%) samples. The tissue distribution of these picorna-like viruses was similar to that of picornaviruses, which infect the liver and are transmitted by the fecal-oral or blood route [34,35]. The 11 genome sequences retrieved from the pools were 7448 to 7640 bp long. NCBI's ORF finder predicted that each genome had a single ORF encoding a polyprotein, similar to the genome structure of Hepatovirus [13,36]. The nt identity between contigs ranged from 99.0 to 99.9%. Moreover, sequence similarity and phylogenetic analyses indicated that all contigs clustered with rodent hepatovirus, so these genomes were classified into the genus Hepatovirus (Fig. 5). A BLASTn search showed that the sequences were most closely related to rodent hepatovirus (KT452641.1, Myodes glareolus, collected in Germany in 2011), with nt sequence identities of 82.74 to 82.76% (Supplementary Table S4). BLASTx analyses showed that the contigs were 91.83 to 91.88% similar at the aa level to the polyprotein of their closest relative, rodent hepatovirus (YP_009179213.1, Microtus arvalis, collected in Germany in 2010). According to the ICTV criteria, the divergence of members of hepatovirus species ranges from 0.18 to 0.40 for the P1 region and from 0.19 to 0.49 for the 3CD region [37]. The distance between these contigs and rodent hepatovirus was 0.03 to 0.04 for the P1 region and 0.07 for the 3CD region, below both ranges. Therefore, these contigs were proposed to be novel variants of rodent hepatovirus.
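The classification argument above is a threshold comparison: the observed distances are set against the lower bound of the divergence ranges quoted for the P1 and 3CD regions. The sketch below spells that out with the numbers from the text. Reading "below the lower bound" as "same species" is an interpretation of the passage, not a statement of the ICTV procedure, and the names are illustrative.

```python
from typing import Dict, Tuple

# Divergence ranges quoted in the text, as (lower, upper) per region.
SPECIES_DIVERGENCE: Dict[str, Tuple[float, float]] = {
    "P1": (0.18, 0.40),
    "3CD": (0.19, 0.49),
}

# Distances reported between the vole contigs and rodent hepatovirus.
# For P1 the upper end of the reported 0.03 to 0.04 range is used.
OBSERVED: Dict[str, float] = {"P1": 0.04, "3CD": 0.07}


def below_species_range(region: str, distance: float) -> bool:
    """True if `distance` is smaller than the lower bound for `region`."""
    lower, _upper = SPECIES_DIVERGENCE[region]
    return distance < lower


is_variant = all(below_species_range(r, d) for r, d in OBSERVED.items())
print("within rodent hepatovirus species:", is_variant)
```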

Fig. 5. Phylogenetic relationships of hepatovirus variants based on analyses of the P1 protein (A) and 3CD protein (B). Branch lengths are drawn to a scale of aa substitutions per site. Numbers above individual branches indicate bootstrap support; only values >80% are shown. Vole hepatovirus variants are marked by a black dot, with sample IDs in parentheses.

In all, 121,679 reads were assigned to the family Flaviviridae (Supplementary Table S2) and were found in almost all tissues. Such a broad distribution suggests diverse potential modes of transmission, such as vertical and fecal-oral. Seven near-complete genomic sequences, 8617 to 8625 bp in length, were identified by de novo assembly (three in liver, one in lung, one in spleen, and two in feces). These sequences shared 99.2 to 99.9% identity with each other. Sequence analyses using NCBI ORF finder revealed a single ORF translated into a polyprotein, with a genome structure similar to that of typical Flaviviridae [16,38,39]. These contigs were subjected to PCR confirmation and whole-genome phylogenetic analyses. All contigs were assigned to a clade in the genus Hepacivirus, with varying sequence similarity to rodent hepaciviruses collected from Neodon clarkei in Tibet, China, in 2014. The contigs showed 75.63 to 75.73% nt identity and 82.83 to 88.90% aa identity with rodent hepacivirus (Fig. 6 and Supplementary Table S4). According to the ICTV guidelines, hepaciviruses with aa p-distances of <0.25 in the conserved region of NS3 and <0.3 in the NS5B region belong to the same species [40]. Because the NS5B and NS3 region p-distances between these contigs and rodent hepacivirus were 0.16 and 0.15, respectively, they were identified as variants of rodent hepacivirus.
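An uncorrected p-distance is the proportion of aligned sites at which two sequences differ. The sketch below computes it for a pair of toy aligned peptides and then applies the same-species cut-offs quoted above to the reported NS3 and NS5B values. It assumes the NS5B cut-off reads as <0.3 and skips gap columns (pairwise deletion); the sequences and function names are hypothetical, and the study's own alignment settings are not given in the text.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned, non-gap sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # pairwise deletion: ignore columns containing a gap
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Same-species cut-offs as quoted in the text (aa p-distance).
NS3_MAX = 0.25
NS5B_MAX = 0.30  # assumption: the cut-off is read as "<0.3"


def same_species(ns3_dist: float, ns5b_dist: float) -> bool:
    """True if both regions fall under their same-species cut-offs."""
    return ns3_dist < NS3_MAX and ns5b_dist < NS5B_MAX


# Toy aligned peptides (hypothetical): one substitution, one gap column.
toy_ref = "MKTLLVAGGA"
toy_qry = "MKTLIVAG-A"
print("toy p-distance:", round(p_distance(toy_ref, toy_qry), 3))

# Values reported in the text for the vole contigs vs rodent hepacivirus.
print("same species:", same_species(ns3_dist=0.15, ns5b_dist=0.16))
```

Requiring both regions to pass keeps the check conservative: a contig that diverges strongly in either region would not be called a variant of the same species.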

Fig. 6. Phylogenetic analyses of hepacivirus variants based on the NS5B (A) and NS3 (B) proteins. Branch lengths are drawn to a scale of aa substitutions per site. Numbers above individual branches indicate bootstrap support; only values >80% are shown. Hepacivirus variants are marked by a black dot, with sample IDs in parentheses.

In the liver, spleen, intestine, and fecal pools, 16 near-complete or partial genome sequences (0.2 to 3.3k nt) of viruses of the family Reoviridae, genus Rotavirus, were characterized. Analyses using NCBI ORF finder revealed a genome structure similar to that of Reoviridae, including the VP1, VP2, VP3, VP4, VP6, VP7, NSP2, and NSP3 segments [38,41,42]. BLASTn analyses of seven PCR-amplified segments (two of VP1, two of VP2, and three of VP3) showed that vole rotavirus was related to viruses from a range of host species, including Lama guanicoe, chicken, Rhinolophus blasii, Microtus agrestis, and human, with nt similarities of 70.81 to 76.78% and aa identities of 67.67 to 86.85% to the closest relatives in the VP1, VP2, and VP3 segments (Supplementary Table S4). These findings were confirmed by phylogenetic analyses of the VP1 and VP6 segments, in which the contigs clustered with the species rotavirus A (Fig. 7). According to the aa sequence identities of the RdRp (VP1) and VP6 regions, these contigs were proposed to be novel variants or genotypes of rotavirus A [43].

Fig. 7. Phylogenetic relationships of vole rotavirus A based on the VP1 (RdRp) protein (A) and VP6 protein (B). Branch lengths are drawn to a scale of aa substitutions per site. Numbers above individual branches indicate bootstrap support; only values >80% are shown. Novel rotavirus A variants are marked by a black dot, with sample IDs in parentheses.

In this study, 90% of picobirnavirus (PBV) sequence reads were detected in fecal samples. Two PBV contigs, 1685 and 1684 bp in length, were obtained from two fecal pools and confirmed by PCR. This distribution is consistent with that of other PBVs, which have been detected in the feces of humans, rabbits, dogs, pigs, rats, and birds [5,41]. Further analyses showed that both segments contained the RdRp region of PBV. These two segments showed low similarity to PBV sequences in GenBank. Based on the best RdRp matches from BLASTn and BLASTx searches, together with several related strains from GenBank, nucleotide and protein phylogenetic trees were constructed separately. The two segments clustered with PBVs detected in fecal samples of rats collected in China, with 81.4% nt identity and 81.2% aa identity (Fig. 8 and Supplementary Table S4). According to the ICTV guidance, the high similarity between their RdRp and that of rat PBV indicates that these segments are new variants of PBV [44].

Fig. 8. Phylogenetic analyses of picobirnavirus genomes based on the segment 2 (RdRp) aa sequence. Branch lengths are drawn to a scale of aa substitutions per site. Numbers above individual branches indicate bootstrap support; only values >80% are shown. The novel picobirnavirus variants are marked with a black dot, with sample IDs in parentheses.

Other sequence reads or contigs related to mammalian viruses showed low nucleotide and amino acid sequence identities (<80%) with known viruses. Of the 13 vertebrate-associated viruses identified, 9 were selected (Supplementary Table S4) for confirmation by PCR screening and Sanger sequencing. In addition to hepatovirus, hepacivirus, rotavirus, and PBV, astroviruses were verified in fecal samples. The assembled astrovirus contigs, 242 to 343 bp in length, showed 69 to 78.9% nt identity and 64.6 to 76.3% aa identity to diverse astroviruses.

Moreover, some sequence reads related to the families Coronaviridae, Circoviridae, Parvoviridae, and Arteriviridae were occasionally detected and confirmed by RT-PCR. However, these segments were too short to identify genotypes, which suggests that these viruses were present at low viral load. Among them, coronavirus contigs (275 and 249 bp) were detected only in the fecal library and showed similarity to a known rodent coronavirus strain, Lucheng Rn rat coronavirus (MT820627.1), belonging to the genus Alphacoronavirus, with 87.94% nt identity and 91.21% aa identity. The circovirus contigs from the intestine and fecal libraries (367 bp) showed 78.11% nt identity to a feline cyclovirus (KM017740.1). Some contigs related to the family Parvoviridae were also identified, showing 74.86% similarity at the nt level and 73.75% at the aa level to a murine bocavirus (NC_055487.1). Sequence reads of Arteriviridae were identified in lung and spleen. One contig retrieved from spleen (350 bp) showed 70.6% nt identity and 73.9% aa identity to Arteriviridae sp. detected in Mus pahari in Thailand (MT085142) (Supplementary Table S4).
