{"id":1117951,"date":"2023-09-21T10:16:34","date_gmt":"2023-09-21T14:16:34","guid":{"rendered":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/uncategorized\/genetic-diversity-and-ancestry-of-the-khmuic-speaking-ethnic-groups-nature-com\/"},"modified":"2023-09-21T10:16:34","modified_gmt":"2023-09-21T14:16:34","slug":"genetic-diversity-and-ancestry-of-the-khmuic-speaking-ethnic-groups-nature-com","status":"publish","type":"post","link":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/transhuman-news-blog\/genome\/genetic-diversity-and-ancestry-of-the-khmuic-speaking-ethnic-groups-nature-com\/","title":{"rendered":"Genetic diversity and ancestry of the Khmuic-speaking ethnic groups &#8230; &#8211; Nature.com"},"content":{"rendered":"<p><p>Ethical statement<\/p>\n<p>Ethical approval for this study was granted by the Human Experimentation Committee of the Research Institute for Health Sciences, Chiang Mai University, Thailand (Certificate of Ethical Clearance No. 31\/2022). Throughout the research, we protected the rights and identities of participants, and we confirm that all experiments were performed in accordance with relevant guidelines and regulations based on the experimental protocol on human subjects under the Declaration of Helsinki. Written informed consent was obtained from all volunteers prior to the interview and sample collection.<\/p>\n<p>A total of 95 unrelated subjects residing in five villages of Nan province, Thailand, were enrolled with written informed consent. Volunteers were healthy subjects over 20 years old, of Khmuic-speaking ethnicity, with no ancestors known to be from other recognized ethnic groups for at least three generations. We collected personal data on self-reported unrelated lineages, linguistics, and migration histories using form-based oral interviews. 
Following the manufacturer's instructions, we collected buccal or saliva samples and extracted DNA using the Gentra Puregene Buccal Cell Kit (Qiagen, Germany).<\/p>\n<p>Genotyping was carried out using the Affymetrix Axiom Genome-Wide Human Origins array<sup>10<\/sup>. Primary screening with Affymetrix Genotyping Console v4.2 produced a total of 93 samples genotyped for 622,834 loci on the hg19 version of the human reference genome coordinates (genotype call rate &ge; 97%). We used PLINK version 1.90b5.2<sup>24<\/sup> to exclude loci and individuals with more than 5% missing data and to exclude mtDNA and sex-chromosome loci from our analysis. We further excluded loci that did not pass the Hardy&ndash;Weinberg equilibrium test (P value &lt; 0.00005) or had more than 5% missing data within any population. We used KING 2.3<sup>25<\/sup> to determine individual relatedness, and we removed one person from each pair with first-degree kinship. After these quality control measures, 81 Khmuic-speaking people (Fig. 1) with 612,614 loci overall remained.<\/p>\n<p>We next used PLINK version 1.90b5.2 to merge our newly obtained genotyping results with a set of genome-wide SNP data<sup>8<\/sup>, which included populations from East\/Southeast Asia, South Asia, African Mbuti, European French, and Southeast Asian ancient samples<sup>9,10,11,12,13<\/sup>. Note that in this collection, allelic data from ancient samples were gathered using pseudo-haploid techniques, and samples with fewer than 15,000 informative loci were eliminated. After filtering for SNP positions that could be jointly analyzed within this dataset, we excluded SNPs that had more than 5% missing data, had a minor allele frequency (MAF) below 3.3 &times; 10<sup>&minus;4<\/sup>, or were not in Hardy&ndash;Weinberg equilibrium at a significance level of P &lt; 0.00005. 
As a result, 353,505 positions in a dataset consisting of 979 individuals from 90 populations (Supplementary Tables 1 and 2) were used for subsequent analysis.<\/p>\n<p>To investigate the genetic structure and relationships of the analyzed samples, we used PLINK version 1.90b5.2 to prune for linkage disequilibrium, excluding one variant from each pair with r<sup>2<\/sup> &gt; 0.4 within windows of 200 variants and a step size of 25 variants. A total of 959 individuals from the sample set, excluding the Mbuti and French populations, were included. There were 149,384 SNP positions available for this analysis. Principal component analysis (PCA) was carried out using smartpca from EIGENSOFT with the \"lsqproject\" and \"autoshrink\" options.<\/p>\n<p>To infer population structure, we ran ADMIXTURE on 155,709 SNP positions from the sample set of 979 individuals, which encompassed both the Asian samples and the Mbuti and French outgroups. The clustering tool ADMIXTURE version 1.3.0<sup>14<\/sup> was run from K=2 to K=10 with 100 replicates for each K, using random seeds and the -P option. For each K, the top 20 ADMIXTURE replicates with the highest likelihood for the major mode were displayed using PONG version 1.4.7<sup>26<\/sup>. For these PCA and ADMIXTURE analyses, the ancient samples and highly drifted modern populations (Mlabri, Onge, Mamanwa, Khamu, and Lua) were projected.<\/p>\n<p>To test admixture and excess ancestry sharing, we used admixr version 0.7.1<sup>27<\/sup>, an R interface to ADMIXTOOLS version 5.1<sup>10<\/sup>, to calculate the f3- and f4-statistics, with significance assessed through block jackknife resampling across the genome and using Mbuti as the outgroup. A total of 353,505 SNPs from 979 samples were used in these analyses. 
Additional f4-statistics were computed when ancient samples were involved, using French as the outgroup to avoid deep attraction to Africans and only transversions (2,947&ndash;51,452 SNPs depending on the quality of samples) to avoid potential noise from ancient DNA damage patterns<sup>28<\/sup>. We used the pheatmap package in R version 3.6.0 to visualize heatmaps of the f3 and f4 profiles.<\/p>\n<p>To examine haplotype sharing between different groups, we used SHAPEIT version 4.1.3<sup>29<\/sup> to phase the modern samples. We used South Asian and East Asian populations (excluding the Kinh Vietnamese) as a reference panel, together with the recombination map from the 1000 Genomes Phase 3<sup>30<\/sup>. This analysis focused specifically on modern population data, consisting of 359,539 SNPs. To prepare the reference panel, we used bcftools version 1.4 to extract, for each chromosome of the 1000 Genomes Phase 3 data, the individuals of East and South Asian descent and the sites overlapping with our data. The phasing accuracy of SHAPEIT4 can be improved by increasing the number of conditioning neighbors in the Positional Burrows-Wheeler Transform (PBWT) on which haplotype estimation is based<sup>29<\/sup>. We conducted phasing with the option --pbwt-depth 8 for 8 conditioning neighbors, while keeping other parameters at their defaults. Subsequently, we ran ChromoPainter version 2<sup>31<\/sup> on the phased dataset to investigate haplotype sharing, with the sample size for each population randomly down-sampled to 4 and to 8. The former was used for 10 iterations of the EM (expectation maximization) process to estimate the switch rate and global mutation probability. The latter was employed for the chromosomal painting process with the estimated switch and global mutation rates. The output of this process was then used for downstream analyses. 
We then painted the chromosomes of each individual, with all the modern Asian samples serving as donors and recipients via the -a argument. The EM estimation yielded a switch rate of approximately 251.21 and a global mutation probability of approximately 0.00001, which were subsequently used as starting values for these parameters for all donors in the painting process. The heatmaps were generated using the pheatmap package in R.<\/p>\n<p>To construct the admixture graph, our initial step was to select backbone populations from different language families in Southeast Asia. Specifically, we used f4-statistics to choose representative ethnic groups that speak Austronesian, Tai-Kadai, Austroasiatic, Hmong-Mien, and Sino-Tibetan languages: Atayal, Dai, Cambodian, Miao, and Naxi, respectively. As outgroups, we employed the African Mbuti and North Indian populations that speak Indo-European languages (Gujarati, Brahmin Tiwari, and Lodhi). Our focus was on constructing the admixture graph for the Austroasiatic language family in Thailand. Thus, we categorized these populations according to their linguistic branches: Katuic (Bru and Soa), Monic (Mon), Palaungic (Lawa_Eastern, Lawa_Western, Palaung, Blang), and Mlabri. The Khmuic-speaking people of interest were divided into the Khamu (consisting of four Khamu populations) and the Lua (consisting of two Lua populations together with HtinMal and HtinPray).<\/p>\n<p>For modeling the admixture graph, we used a dataset of 359,539 SNPs from modern populations as the input for ADMIXTOOLS 2<sup>32<\/sup>. Initially, we computed pairwise f2 statistics between the groups using the extract_f2 function with specific parameters: maxmiss=0 (no missing SNPs to calculate), useallsnp: NO (no missing data to allow), and blg=0.05 (SNP block size set to 0.05 morgans). 
Then, we extracted allele frequency products from the computed f2 blocks using f2_from_precomp. Next, for each scenario, we searched for the best-fitting admixture graph by running ten independent runs of find_graphs. From the 100 independent runs, we selected the one with the lowest score (computed based on residuals between the expected and observed f-statistics given the data) using random_admixturegraph. To confirm the fit, we tested the graph with the lowest score using qpgraph with parameters numstart=100, diag=0.0001, return_fstats=TRUE. This allowed us to check whether the absolute value of the worst-fitting Z score was below 3. Starting with no migrations (numadmix=0), we gradually added migrations until we found a fitting graph, which we considered the best-fitting graph for that particular scenario.<\/p>\n<p><!-- Auto Generated --><\/p>\n<p>Go here to see the original:<br \/>\n<a target=\"_blank\" href=\"https:\/\/www.nature.com\/articles\/s41598-023-43060-7\" title=\"Genetic diversity and ancestry of the Khmuic-speaking ethnic groups ... - Nature.com\" rel=\"noopener\">Genetic diversity and ancestry of the Khmuic-speaking ethnic groups ... - Nature.com<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p> Ethical statement Ethical approval for this study was granted by the Human Experimentation Committee of the Research Institute for Health Sciences, Chiang Mai University, Thailand (Certificate of Ethical Clearance No. 31\/2022).  
<a href=\"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/transhuman-news-blog\/genome\/genetic-diversity-and-ancestry-of-the-khmuic-speaking-ethnic-groups-nature-com\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[25],"tags":[],"class_list":["post-1117951","post","type-post","status-publish","format-standard","hentry","category-genome"],"_links":{"self":[{"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/posts\/1117951"}],"collection":[{"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/comments?post=1117951"}],"version-history":[{"count":0,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/posts\/1117951\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/media?parent=1117951"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/categories?post=1117951"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/tags?post=1117951"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}