Ethical statement
Ethical approval for this study was granted by the Human Experimentation Committee of the Research Institute for Health Sciences, Chiang Mai University, Thailand (Certificate of Ethical Clearance No. 31/2022). Throughout the research, we protected the rights and identities of participants, and we confirm that all experiments were performed in accordance with relevant guidelines and regulations based on the experimental protocol on human subjects under the Declaration of Helsinki. Written informed consent was obtained from all volunteers prior to the interview and sample collection.
A total of 95 unrelated subjects residing in five villages of Nan province, Thailand, were enrolled with written informed consent. Volunteers were healthy subjects over 20 years old, of Khmuic-speaking ethnicity, with no known ancestors from other recognized ethnic groups for at least three generations. We collected personal data using form-based oral interviews covering self-reported unrelated lineages, linguistics, and migration histories. We collected buccal or saliva samples and extracted DNA using the Gentra Puregene Buccal Cell Kit (Qiagen, Germany), following the manufacturer's instructions.
Genotyping was carried out using the Affymetrix Axiom Genome-Wide Human Origins array [10]. Primary screening in Affymetrix Genotyping Console v4.2 yielded 93 samples genotyped for 622,834 loci on hg19 human reference genome coordinates (genotype call rate ≥ 97%). We used PLINK version 1.90b5.2 [24] to exclude loci and individuals with more than 5% missing data, and to exclude mtDNA and sex-chromosome loci from our analysis. We further excluded loci that failed the Hardy-Weinberg equilibrium test (P < 0.00005) or had more than 5% missing data within any population. We used KING 2.3 [25] to determine individual relatedness and removed one person from each pair with first-degree kinship. After these quality control measures, 81 Khmuic-speaking individuals (Fig. 1) and 612,614 loci remained.
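As a rough illustration of the missingness, Hardy-Weinberg, and autosome filters described above, the sketch below assembles a PLINK 1.9 command line without running it. The binary name `plink`, the file prefixes, and the helper name `plink_qc_command` are assumptions made for illustration. Note that this sketch applies the Hardy-Weinberg filter to the whole dataset, whereas the text applies it within each population, and the KING relatedness step is not shown.

```python
from typing import List


def plink_qc_command(bfile: str, out: str,
                     max_missing: float = 0.05,
                     hwe_p: float = 0.00005) -> List[str]:
    """Build (but do not execute) a PLINK 1.9 quality-control command.

    bfile and out are hypothetical file prefixes chosen by the caller.
    """
    return [
        "plink",
        "--bfile", bfile,
        "--geno", str(max_missing),   # drop loci with > 5% missing calls
        "--mind", str(max_missing),   # drop individuals with > 5% missing calls
        "--hwe", str(hwe_p),          # drop loci failing the HWE test (dataset-wide here)
        "--autosome",                 # keep autosomes only (removes mtDNA and sex chromosomes)
        "--make-bed",
        "--out", out,
    ]


if __name__ == "__main__":
    print(" ".join(plink_qc_command("khmuic_raw", "khmuic_qc")))
```

The list form can be passed directly to `subprocess.run` once the input files exist.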
We next used PLINK version 1.90b5.2 to merge our newly obtained genotyping results with a set of genome-wide SNP data [8], which included populations from East/Southeast Asia, South Asia, African Mbuti, European French, and Southeast Asian ancient samples [9,10,11,12,13]. In this collection, allelic data from ancient samples were called as pseudo-haploid genotypes, and samples with fewer than 15,000 informative loci were removed. After restricting to SNP positions that could be jointly analyzed within this dataset, we excluded SNPs that had more than 5% missing data, had a minor allele frequency (MAF) below 3.3 × 10⁻⁴, or deviated from Hardy-Weinberg equilibrium at a significance level of P < 0.00005. As a result, 353,505 positions in a dataset of 979 individuals from 90 populations (Supplementary Tables 1 and 2) were used for subsequent analysis.
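To make the per-SNP missingness and MAF criteria concrete, here is a minimal sketch in plain Python. It assumes diploid genotypes coded as alternate-allele counts (0, 1, 2) with `None` for a missing call; the function name `snp_passes` and this encoding are illustrative assumptions, not the authors' pipeline, which used PLINK. Pseudo-haploid ancient samples and the Hardy-Weinberg test are not handled here.

```python
from typing import Optional, Sequence


def snp_passes(genotypes: Sequence[Optional[int]],
               min_maf: float,
               max_missing: float = 0.05) -> bool:
    """Return True if one SNP passes the missingness and MAF filters.

    genotypes: alternate-allele counts (0/1/2) per individual, None if missing.
    min_maf:   minimum minor allele frequency, supplied by the caller.
    """
    n_total = len(genotypes)
    called = [g for g in genotypes if g is not None]
    if n_total == 0 or not called:
        return False
    missing_rate = (n_total - len(called)) / n_total
    alt_freq = sum(called) / (2 * len(called))   # two alleles per diploid call
    maf = min(alt_freq, 1.0 - alt_freq)
    return missing_rate <= max_missing and maf >= min_maf


if __name__ == "__main__":
    # Illustrative threshold only; a well-called, polymorphic SNP passes.
    print(snp_passes([0, 1, 2, 1] * 10, min_maf=0.01))
```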
To investigate the genetic structure and relationships of the analyzed samples, we used PLINK version 1.90b5.2 to prune for linkage disequilibrium, excluding one variant from each pair with r² > 0.4 within windows of 200 variants and a step size of 25 variants. A total of 959 individuals, excluding the Mbuti and French populations, were included, and 149,384 SNP positions were available for this analysis. Principal Component Analysis (PCA) was carried out using smartpca from EIGENSOFT with the "lsqproject" and "autoshrink" options.
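The pruning and PCA settings above could be expressed roughly as follows. This sketch only builds a PLINK `--indep-pairwise` command and the text of a smartpca parameter file; it does not run either tool. File prefixes, the population-list file name, and both helper names are assumptions for illustration.

```python
from typing import List


def ld_prune_command(bfile: str, out: str, window: int = 200,
                     step: int = 25, r2: float = 0.4) -> List[str]:
    """Build a PLINK 1.9 LD-pruning command (window, step, r^2 threshold)."""
    return [
        "plink", "--bfile", bfile,
        "--indep-pairwise", str(window), str(step), str(r2),
        "--out", out,
    ]


def smartpca_parfile(prefix: str, poplist: str) -> str:
    """Return smartpca parameter-file text with projection and shrinkage on.

    poplist names a (hypothetical) file listing the populations used to
    compute the axes; all other samples are projected onto them.
    """
    lines = [
        f"genotypename: {prefix}.geno",
        f"snpname: {prefix}.snp",
        f"indivname: {prefix}.ind",
        f"evecoutname: {prefix}.evec",
        f"evaloutname: {prefix}.eval",
        f"poplistname: {poplist}",
        "lsqproject: YES",
        "autoshrink: YES",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(" ".join(ld_prune_command("merged", "merged_pruned")))
    print(smartpca_parfile("merged_pruned", "pca_populations.txt"))
```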
To infer population structure, we ran ADMIXTURE on 155,709 SNP positions from the full set of 979 individuals, which comprised the Asian samples and the Mbuti and French outgroups. ADMIXTURE version 1.3.0 [14] was run from K=2 to K=10 with 100 replicates for each K, using random seeds and the -P option. For each K, the 20 replicates with the highest likelihood in the major mode were displayed using PONG version 1.4.7 [26]. For both the PCA and ADMIXTURE analyses, the ancient samples and highly drifted modern populations (Mlabri, Onge, Mamanwa, Khamu, and Lua) were projected.
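The replicate design (K=2 to K=10, 100 randomly seeded runs per K) can be sketched as a job list. The helper name `admixture_jobs`, the binary name `admixture`, and the input file name are assumptions; only the `-s` seed option is included, and the projection step is left out because it needs additional input files not described in the text.

```python
import random
from typing import List, Sequence, Tuple


def admixture_jobs(bed: str,
                   k_values: Sequence[int] = tuple(range(2, 11)),
                   replicates: int = 100,
                   rng_seed: int = 0) -> List[Tuple[int, int, List[str]]]:
    """Return (K, replicate, command) tuples, one per ADMIXTURE run.

    Each run gets its own random seed via ADMIXTURE's -s option.
    rng_seed only makes this job list reproducible.
    """
    rng = random.Random(rng_seed)
    jobs = []
    for k in k_values:
        for rep in range(1, replicates + 1):
            seed = rng.randrange(1, 2**31 - 1)
            jobs.append((k, rep, ["admixture", "-s", str(seed), bed, str(k)]))
    return jobs


if __name__ == "__main__":
    jobs = admixture_jobs("merged_pruned.bed")
    print(len(jobs), "runs; first:", " ".join(jobs[0][2]))
```

Because ADMIXTURE names its output after the input file and K, each replicate would need its own working directory to avoid overwriting earlier runs.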
To test for admixture and excess ancestry sharing, we calculated f3- and f4-statistics using admixr version 0.7.1 [27], an R interface to ADMIXTOOLS version 5.1 [10], with significance assessed through block jackknife resampling across the genome and Mbuti as the outgroup. A total of 353,505 SNPs from 979 samples were used in these analyses. Additional f4-statistics were computed when ancient samples were involved, using French as the outgroup to avoid deep attraction to Africans and using only transversions (2,947–51,452 SNPs, depending on the quality of samples) to avoid potential noise from ancient DNA damage patterns [28]. We used the pheatmap package in R version 3.6.0 to visualize heatmaps of the f3 and f4 profiles.
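For readers unfamiliar with these statistics, the sketch below computes f4(A, B; C, D) as the average over SNPs of (a − b)(c − d), where a, b, c, d are population allele frequencies, and attaches a delete-one-block jackknife standard error and Z-score. It is a simplified stand-in, not the ADMIXTOOLS implementation: blocks here are contiguous runs of roughly equal SNP count with no weighting, whereas ADMIXTOOLS defines blocks by genetic distance. All function names are illustrative.

```python
import math
from typing import List, Sequence, Tuple


def f4_per_snp(a: Sequence[float], b: Sequence[float],
               c: Sequence[float], d: Sequence[float]) -> List[float]:
    """Per-SNP contributions (a - b) * (c - d) from allele frequencies."""
    return [(pa - pb) * (pc - pd) for pa, pb, pc, pd in zip(a, b, c, d)]


def block_jackknife(values: Sequence[float],
                    n_blocks: int) -> Tuple[float, float, float]:
    """Mean of values with a delete-one-block jackknife SE and Z-score."""
    n = len(values)
    if n_blocks < 2 or n_blocks > n:
        raise ValueError("need 2 <= n_blocks <= number of SNPs")
    total = sum(values)
    estimate = total / n
    # Contiguous block boundaries; every block holds at least one SNP.
    bounds = [i * n // n_blocks for i in range(n_blocks + 1)]
    leave_one_out = []
    for i in range(n_blocks):
        block = values[bounds[i]:bounds[i + 1]]
        leave_one_out.append((total - sum(block)) / (n - len(block)))
    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (x - mean_loo) ** 2 for x in leave_one_out)
    se = math.sqrt(variance)
    z = estimate / se if se > 0 else float("nan")
    return estimate, se, z


if __name__ == "__main__":
    # Made-up allele frequencies for four populations at six SNPs.
    a = [0.10, 0.40, 0.80, 0.30, 0.55, 0.20]
    b = [0.20, 0.35, 0.70, 0.50, 0.45, 0.25]
    c = [0.15, 0.60, 0.90, 0.10, 0.50, 0.30]
    d = [0.25, 0.50, 0.85, 0.40, 0.40, 0.35]
    print(block_jackknife(f4_per_snp(a, b, c, d), n_blocks=3))
```

An absolute Z-score above a chosen cutoff is then read as evidence that the tested tree-like relationship does not hold.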
To examine haplotype sharing between groups, we phased the modern samples using SHAPEIT version 4.1.3 [29], with South Asian and East Asian populations (excluding the Kinh Vietnamese) as a reference panel and the recombination map from the 1000 Genomes Phase 3 [30]. This analysis focused on modern population data only, consisting of 359,539 SNPs. To prepare the reference panel, we used bcftools version 1.4 to extract, for each chromosome, individuals of East and South Asian descent and the sites overlapping our data from the 1000 Genomes Phase 3 data. The phasing accuracy of SHAPEIT4 can be improved by increasing the number of conditioning neighbors in the Positional Burrows-Wheeler Transform (PBWT) on which haplotype estimation is based [29]. We therefore phased with the option --pbwt-depth 8 (8 conditioning neighbors), keeping all other parameters at their defaults.
We then ran ChromoPainter version 2 [31] on the phased dataset to investigate haplotype sharing, with each population randomly down-sampled to 4 and to 8 individuals. The 4-individual set was used for 10 iterations of the expectation-maximization (EM) process to estimate the switch rate and global mutation probability. The EM estimation yielded a switch rate of approximately 251.21 and a global mutation probability of approximately 0.00001, which were used as starting values for these parameters for all donors in the painting process. The 8-individual set was then used for chromosome painting, in which we painted the chromosomes of each individual with all modern Asian samples serving as both donors and recipients via the -a argument. The output was used for downstream analyses, and heatmaps were generated using the pheatmap package in R.
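The per-chromosome phasing step could be scripted along these lines. The sketch builds one SHAPEIT4 command per autosome using the tool's `--input`, `--map`, `--region`, `--reference`, `--pbwt-depth`, and `--output` options (SHAPEIT4 spells its long options with two dashes). The binary name `shapeit4`, the file-name templates, and the helper name are assumptions; nothing is executed.

```python
from typing import List


def shapeit4_commands(input_tpl: str, reference_tpl: str, map_tpl: str,
                      output_tpl: str, pbwt_depth: int = 8) -> List[List[str]]:
    """Build one SHAPEIT4 command per autosome (chromosomes 1 to 22).

    Each template must contain '{chrom}', which is replaced by the
    chromosome number. All file names are hypothetical.
    """
    commands = []
    for chrom in range(1, 23):
        commands.append([
            "shapeit4",
            "--input", input_tpl.format(chrom=chrom),
            "--map", map_tpl.format(chrom=chrom),
            "--region", str(chrom),
            "--reference", reference_tpl.format(chrom=chrom),
            "--pbwt-depth", str(pbwt_depth),  # 8 conditioning neighbors
            "--output", output_tpl.format(chrom=chrom),
        ])
    return commands


if __name__ == "__main__":
    cmds = shapeit4_commands(
        "modern.chr{chrom}.vcf.gz",
        "ref_panel.chr{chrom}.vcf.gz",
        "chr{chrom}.gmap.gz",
        "modern.chr{chrom}.phased.vcf.gz",
    )
    print(" ".join(cmds[0]))
```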
To construct the admixture graph, we first selected backbone populations from different language families in Southeast Asia. Specifically, we used f4-statistics to choose representative ethnic groups speaking Austronesian, Tai-Kadai, Austroasiatic, Hmong-Mien, and Sino-Tibetan languages: Atayal, Dai, Cambodian, Miao, and Naxi, respectively. We used the African Mbuti and North Indian populations (Gujarati, Brahmin Tiwari, and Lodhi), who speak Indo-European languages, as outgroups. Our focus was on constructing the admixture graph for the Austroasiatic language family in Thailand. We therefore grouped these populations according to their linguistic branches: Katuic (Bru and Soa), Monic (Mon), Palaungic (Lawa_Eastern, Lawa_Western, Palaung, Blang), and Mlabri. The Khmuic-speaking groups of interest were divided into the Khamu (consisting of four Khamu populations) and the Lua (consisting of two Lua populations together with HtinMal and HtinPray).
For modeling the admixture graph, we used a dataset of 359,539 SNPs from modern populations as the input for ADMIXTOOLS 2 [32]. Initially, we computed pairwise f2 statistics between the groups using the extract_f2 function with the following parameters: maxmiss=0 (no missing SNPs to calculate), useallsnp: NO (no missing data to allow), and blg=0.05 (SNP block size of 0.05 Morgans). Then, we extracted allele frequency products from the computed f2 blocks using f2_from_precomp. Next, for each scenario, we searched for the best-fitting admixture graph by running ten independent runs of find_graphs. From the 100 independent runs, we selected the one with the lowest score (computed from the residuals between the expected and observed f-statistics given the data) using random_admixturegraph. To confirm the fit, we tested the graph with the lowest score using qpgraph with parameters numstart=100, diag=0.0001, and return_fstats=TRUE. This allowed us to check whether the absolute value of the worst-fitting Z score was below 3. Starting with no migrations (numadmix=0), we gradually added migrations until we found a fitting graph, which we considered the best-fitting graph for that particular scenario.
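The stopping rule in the last step (add migrations one at a time until the worst-fitting |Z| drops below 3) is sketched below as plain control flow. ADMIXTOOLS 2 is an R package, so the fitting itself is represented here by a caller-supplied function `fit`, which is hypothetical and stands in for the graph search plus qpgraph check; the cap `max_admix` is also an assumption, since the text states no upper limit.

```python
from typing import Callable, Optional, Tuple


def first_fitting_graph(fit: Callable[[int], Tuple[float, float]],
                        max_admix: int = 5,
                        z_threshold: float = 3.0
                        ) -> Optional[Tuple[int, float, float]]:
    """Return (numadmix, score, worst_z) for the first graph that fits.

    fit(numadmix) must return the best graph's score and its worst-fitting
    Z score for that number of admixture events. Returns None if no graph
    fits within max_admix events.
    """
    for numadmix in range(max_admix + 1):  # start with no migrations
        score, worst_z = fit(numadmix)
        if abs(worst_z) < z_threshold:
            return numadmix, score, worst_z
    return None


if __name__ == "__main__":
    # Made-up fitting results, used only to exercise the loop.
    fake_results = {0: (50.0, 6.2), 1: (20.0, -3.5), 2: (5.0, 2.1)}
    print(first_fitting_graph(lambda n: fake_results[n], max_admix=2))
```

Stopping at the first fitting graph favors the model with the fewest admixture events over more complex graphs that might fit marginally better.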
Go here to see the original:
Genetic diversity and ancestry of the Khmuic-speaking ethnic groups ... - Nature.com