{"id":224937,"date":"2017-07-02T00:47:17","date_gmt":"2017-07-02T04:47:17","guid":{"rendered":"http:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/uncategorized\/researchers-build-seqspark-to-analyze-massive-genetic-data-sets-medical-xpress.php"},"modified":"2017-07-02T00:47:17","modified_gmt":"2017-07-02T04:47:17","slug":"researchers-build-seqspark-to-analyze-massive-genetic-data-sets-medical-xpress","status":"publish","type":"post","link":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/genetic-medicine\/researchers-build-seqspark-to-analyze-massive-genetic-data-sets-medical-xpress.php","title":{"rendered":"Researchers build SEQSpark to analyze massive genetic data sets &#8211; Medical Xpress"},"content":{"rendered":"<p><p>June 30, 2017          <\/p>\n<p>      Uncovering rare susceptibility variants that contribute to      the causes of complex diseases requires large sample sizes      and massively parallel sequencing technologies. The resulting      data sets, often made up of exome and genome data from tens to      hundreds of thousands of individuals, are frequently too large for      current analytical tools to process. A team at Baylor College      of Medicine, led by Dr. Suzanne Leal, professor of molecular      and human genetics, has developed new software called      SEQSpark to overcome this processing obstacle. A study on the      new technology appears in The American Journal of Human      Genetics.    <\/p>\n<p>    \"To handle these large data sets, we built the SEQSpark tool based on    the commonly used Spark program, which allows SEQSpark to    utilize multiple processing platforms to increase the speed and    efficiency of performing data quality control, annotation and    rare variant association analysis,\" Leal    said.  
<\/p>\n<p>    To test and validate the versatility and speed of SEQSpark,    Leal and her team benchmarked it on the whole genome sequence data from the UK10K,    testing specifically for waist-to-hip ratios.  <\/p>\n<p>    \"The analysis and related tasks took about one and a half hours    to complete, in total. This includes loading the data,    annotation, principal components analysis and single and rare    variant aggregate association analysis for the more than 9    million variants present in this sample set,\" explained Di    Zhang, a postdoctoral associate in the Leal lab at Baylor and    first author on the paper.  <\/p>\n<p>    To evaluate SEQSpark's performance in a larger data set, Leal    and the research team generated 50,000 simulated exomes. The    SEQSpark program ran the analysis for a quantitative trait    using several variant aggregate association methods in an hour    and forty-five minutes.  <\/p>\n<p>    When compared to other variant association tools, SEQSpark was    consistently faster, reducing computation to a hundredth of the    time in some cases.  <\/p>\n<p>    \"What is unique about SEQSpark is that it is scalable, and    smaller labs can run it without super specific hardware, and it    can also be run in a multi-server environment to increase its    speed and capacity for large genetic data sets,\" Zhang said.    \"It is ideal for large-scale genetic epidemiological studies    and is highly efficient from a computational standpoint.\"  <\/p>\n<p>    \"We see this software as being very useful as the demand for    the analysis of massively parallel    sequence data grows. SEQSpark is highly versatile, and as we    analyze increasingly large sets of rare variant data, it has    the potential to play a key role in furthering personalized    medicine,\" Leal said.  
<\/p>\n<p>    In the future, Leal and her team will continue to test and    increase SEQSpark's capabilities and will soon be analyzing    data sets that have 500,000 samples or more.  <\/p>\n<p>    More information: Di Zhang et al. SEQSpark: A Complete    Analysis Tool for Large-Scale Rare Variant Association Studies    using Whole-Genome and Exome Sequence Data, The American    Journal of Human Genetics (2017). DOI: 10.1016\/j.ajhg.2017.05.017
<\/p>\n<p>      
Read more    <\/p>\n<p><!-- Auto Generated --><\/p>\n<p>Read more: <\/p>\n<p><a target=\"_blank\" href=\"https:\/\/medicalxpress.com\/news\/2017-06-seqspark-massive-genetic.html\" title=\"Researchers build SEQSpark to analyze massive genetic data sets - Medical Xpress\">Researchers build SEQSpark to analyze massive genetic data sets - Medical Xpress<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p> June 30, 2017 Uncovering rare susceptibility variants that contribute to the causes of complex diseases requires large sample sizes and massively parallel sequencing technologies. These sample sizes, often made up of exome and genome data from tens to hundreds of thousands of individuals, are often too large for current analytical tools to process. A team at Baylor College of Medicine, led by Dr <a href=\"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/genetic-medicine\/researchers-build-seqspark-to-analyze-massive-genetic-data-sets-medical-xpress.php\">Continue reading <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"limit_modified_date":"","last_modified_date":"","_lmt_disableupdate":"","_lmt_disable":"","footnotes":""},"categories":[5],"tags":[],"class_list":["post-224937","post","type-post","status-publish","format-standard","hentry","category-genetic-medicine"],"modified_by":null,"_links":{"self":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/224937"}],"collection":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/comments?post=224937"}],"version-history":[{"count":0,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/224937\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/media?parent=224937"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/categories?post=224937"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/tags?post=224937"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}