{"id":14111,"date":"2013-05-22T21:49:29","date_gmt":"2013-05-23T01:49:29","guid":{"rendered":"http:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/the-norway-spruce-genome-sequence-and-conifer-genome-evolution\/"},"modified":"2013-05-22T21:49:29","modified_gmt":"2013-05-23T01:49:29","slug":"the-norway-spruce-genome-sequence-and-conifer-genome-evolution","status":"publish","type":"post","link":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/transhuman-news-blog\/genome\/the-norway-spruce-genome-sequence-and-conifer-genome-evolution\/","title":{"rendered":"The Norway spruce genome sequence and conifer genome evolution"},"content":{"rendered":"<p><p> We generated >1 billion RNA-Seq reads and used transcript assemblies of these, in combination with public expressed sequence tags (ESTs) and transcripts, to perform ab initio prediction of protein-coding genes. This identified a high-confidence set of 28,354 loci with >70% coverage by supporting evidence from the total set of 70,968 predicted loci. A notable characteristic of the predicted gene structures was the presence of numerous long introns (Fig. 1b), with mean intron length being higher than in most available plant genomes, although similar to the repeat-rich genomes of Vitis vinifera and Zea mays<sup>17, 18<\/sup>. The longest intron in the high-confidence genes was 68 kb (Supplementary Table 2.6), and 2,384 high-confidence genes contained 2,880 introns longer than 5 kb (20 of which we confirmed by PCR amplification; Supplementary Information 2.14), 2,679 of which contained a repeat, suggesting that repeat insertions account for intron expansion. By contrast, exon size was consistent among the species considered (Supplementary Information 2.6.3). Numerous genes (~30%) remained split across scaffolds owing to assembly fragmentation, and as such, the longest introns were not represented in the P. abies 1.0 assembly. 
Long introns (either individual or cumulative intron length) did not influence expression levels (Fig. 1c) and introns containing repeats have not contracted despite a lack of recent repeat activity (see below). <\/p>\n<p> Figure 1. a, Gene family loss and gain in eight sequenced plant genomes (Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera, Oryza sativa, Zea mays, Picea abies, Selaginella moellendorffii and Physcomitrella patens). Gene families were identified using TribeMCL (inflation value 4), and the DOLLOP program from the PHYLIP package was used to determine the minimum gene set for ancestral nodes of the phylogenetic tree. We used plant genome annotations filtered to remove transposable elements. 'Orphans' refers to gene families containing only a single gene. Blue numbers indicate the number of gene families. b, Boxplot representation of length distribution for the 10% longest introns in the same eight genomes. c, Scatter plots of cumulative intron length against log<sub>10<\/sub> expression calculated as fragments per kilobase per million mapped reads (FPKM) for high-confidence gene loci (top, coloured orange) and lncRNA loci (middle, shaded green). The bottom panel shows a histogram of cumulative intron size in the two sets of loci. d, Distribution of small (18–24-nucleotide (nt)) RNAs and their co-alignment-based colocation to genomic features (repeats, high-confidence genes and their promoter\/UTRs). CDS, coding sequence. 
<\/p>\n<p> Analysis of gene families in the high-confidence gene set and seven sequenced plant genomes (five angiosperms: Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera, Oryza sativa and Zea mays, and two basal plants: Selaginella moellendorffii and Physcomitrella patens) identified 1,021 P. abies-specific gene families (Fig. 1a and Supplementary Information 2.8). P. abies-specific families showed over-representation of Gene Ontology categories involved in DNA repair and methylation of DNA and chromatin (Supplementary Information 2.8). As for most draft genomes, these results probably overestimate gene numbers<sup>19<\/sup> and will be refined as we further improve the genome assembly. <\/p>\n<p> A common mechanism leading to genome size expansion is the occurrence of a whole genome duplication (WGD) event. We calculated the number of synonymous substitutions per synonymous site (Ks) of paralogues within the high-confidence genes but found no evidence for any recent WGD; there was a clear, exponential decay in the number of retained paralogues with increasing Ks values (Supplementary Information 2.9 and Supplementary Fig. 2.6). However, a population dynamics model that takes into account both small- and large-scale modes of gene duplication<sup>20<\/sup> suggested the presence of a small peak (around Ks of 1.1), which, considering the slow substitution rate of conifers, might represent the ancient WGD predating the divergence of angiosperms and gymnosperms (350 Myr ago<sup>21<\/sup>). <\/p>\n<p> Previous examinations of small genomic subsets indicated that conifer genomes contain numerous pseudogenes<sup>5, 6, 22, 23<\/sup>. The gene-like fraction of the P. abies 1.0 assembly was identified by alignment of RNA-Seq reads and de novo assembled transcripts (Supplementary Information 2.10). 
Within this subset of the genome, loci with valid spliced alignments of de novo assembled transcripts or the presence of a high-confidence gene were also identified. The high-confidence gene set represented 27 Mb of protein-coding sequence, whereas 72 Mb of regions were identified with a valid spliced alignment or a high-confidence gene. In stark contrast, 524 Mb of gene-like regions were identified by less stringent alignments. The presence of such a large gene-like fraction lacking predicted gene structures supports the presence of numerous pseudogenes. <\/p>\n<p> Recent ENCODE publications<sup>24, 25<\/sup> characterized numerous long non-coding RNA (lncRNA) loci in the human genome, but this class of RNA remains largely uncharacterized in plants. Using short-read de novo transcript assemblies, 13,031 spruce-specific and 9,686 conserved intergenic lncRNAs were identified (Supplementary Information 2.4.3). In common with the ENCODE results, P. abies lncRNA loci contained fewer exons, were shorter (Fig. 1c), and had more tissue-specific expression than protein-coding loci (Supplementary Fig. 2.8). <\/p>\n<p> There has been conflicting evidence about the presence of 24-nucleotide short RNAs (sRNAs) in gymnosperms<sup>26, 27, 28, 29<\/sup>, a class of sRNA that silences transposable elements by the establishment of DNA methylation<sup>30<\/sup>. Across 22 samples, we identified numerous 24-nucleotide sRNAs, but these were highly specific to reproductive tissues, largely associated with repeats but present at substantially lower levels than in angiosperms (Fig. 1d and Supplementary Fig. 2.10). By contrast, 21-nucleotide sRNAs were associated with genes, repeats and promoters\/untranslated regions (UTRs) (Fig. 1d). 
De novo microRNA (miRNA) prediction identified    2,719 loci, including 20 known miRNA families, with target    sites predicted within the high-confidence gene set for 1,378    of these (Supplementary    Information 2.13). Furthermore, 55 known miRNA families had    >5 aligned sRNA reads and mature miRNAs, representing 49    known families aligned to the genome (Supplementary    Information 2.13).  <\/p>\n<p><!-- Auto Generated --><\/p>\n<p>Read the original:<br \/>\n<a target=\"_blank\" href=\"http:\/\/dx.doi.org\/10.1038\/nature12211\" title=\"The Norway spruce genome sequence and conifer genome evolution\">The Norway spruce genome sequence and conifer genome evolution<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p> We generated > 1 billion RNA-Seq reads and used transcript assemblies of these in combination with public expressed sequence tags (ESTs) and transcripts to perform ab initio prediction of protein-coding genes, which identified a high confidence set of 28,354 loci with > 70% coverage by supporting evidence from the total set of 70,968 predicted loci. A notable characteristic of the predicted gene structures was the presence of numerous long introns (Fig.  
<a href=\"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/transhuman-news-blog\/genome\/the-norway-spruce-genome-sequence-and-conifer-genome-evolution\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[25],"tags":[],"class_list":["post-14111","post","type-post","status-publish","format-standard","hentry","category-genome"],"_links":{"self":[{"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/posts\/14111"}],"collection":[{"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/comments?post=14111"}],"version-history":[{"count":0,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/posts\/14111\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/media?parent=14111"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/categories?post=14111"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/tags?post=14111"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}