Pont C, Leroy T, Seidel M, et al. Tracing the ancestry of modern bread wheats[J]. Nature genetics, 2019, 51(5): 905-911.

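1. Conclusions of the paper

1.1 Wheat genome diversity

To explore the origin and patterns of the diversity currently accessible in the wheat gene pool, the authors assembled a worldwide panel of 487 genotypes comprising wild diploid and tetraploid relatives, domesticated tetraploid and hexaploid landraces, old cultivars and modern elite cultivars. They used exome capture sequencing, with the Chinese Spring sequence as the reference genome.
In total, 620,158 high-confidence genetic variants were identified.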

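The correlation between genes and structural variation supports the absence of bias in the variants detected at the chromosome level, from the distal gene-rich regions to the pericentromeric gene-poor regions. The abundance of structural variation differs both between subgenomes (B > A >> D) and between chromosomal regions. Overall, the data provide a detailed, multiscale overview of wheat genome diversity.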

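From this analysis, the authors identify three main factors driving the differences in diversity within the panel: (1) differing vernalization requirements; (2) historical group; (3) geographic origin. A permutation test on clade-wise monophyly conservation confirmed the strong grouping effect of all three factors. The deep structure of the phylogeny separates by continent, and was later affected by changes in growth habit that resulted from strong selection in modern breeding.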

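Overlaying continent and country of origin on the phylogenetic clusters reveals a west-east axis in the observed genetic diversity, consistent with the routes of human migration out of the Fertile Crescent.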

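1.2 Footprints of selection in wheat

The authors detected local reductions in diversity with a sliding-window approach that takes geographic structure into account. To detect domestication signals, they computed the mean per-site nucleotide diversity in non-overlapping 1 Mb windows.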
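The windowed diversity calculation can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: it assumes biallelic sites given as (position, alternate allele count, number of sampled chromosomes) on a single chromosome, uses the standard unbiased per-site estimator, and the `callable_sites` argument (number of sequenced sites per window) is a hypothetical input introduced here because exome capture does not cover every base of a window.

```python
from collections import defaultdict


def site_pi(alt_count, n_chrom):
    """Unbiased per-site nucleotide diversity for one biallelic site."""
    if n_chrom < 2:
        return 0.0
    return 2.0 * alt_count * (n_chrom - alt_count) / (n_chrom * (n_chrom - 1))


def windowed_pi(sites, window_size=1_000_000, callable_sites=None):
    """Mean per-site pi in non-overlapping windows of one chromosome.

    sites: iterable of (position, alt_allele_count, n_chromosomes), 1-based.
    callable_sites: optional {window_index: number of sequenced sites}
        (hypothetical input); falls back to window_size when absent.
    """
    totals = defaultdict(float)
    for pos, alt, n in sites:
        totals[(pos - 1) // window_size] += site_pi(alt, n)
    result = {}
    for window, total in totals.items():
        denominator = (callable_sites or {}).get(window, window_size)
        result[window] = total / denominator
    return result
```

Dividing by the number of callable sites rather than by the number of variant sites is what makes the value a per-site average; windows without any variant are simply absent from the result in this sketch.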

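The comparison between wild wheat progenitors and hexaploid landraces supports a massive but heterogeneous reduction of diversity (ROD) across the genome during domestication. Comparing historical groups i, ii, iii and iv shows two main rounds of decline: the first from i to ii, reflecting early breeding selection, and the second from iii to iv, corresponding to the Green Revolution.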
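These notes do not give the ROD formula. A commonly used definition, assumed here, is ROD = 1 - pi(derived) / pi(reference), computed per window. The sketch below takes two hypothetical dictionaries of windowed diversity (window index to pi), for example wild progenitors as the reference and landraces as the derived group.

```python
def reduction_of_diversity(pi_reference, pi_derived):
    """Per-window ROD = 1 - pi_derived / pi_reference (assumed definition).

    Windows with zero reference diversity, or missing from the derived
    group, are skipped because the ratio is undefined there.
    """
    rod = {}
    for window, reference in pi_reference.items():
        if reference > 0 and window in pi_derived:
            rod[window] = 1.0 - pi_derived[window] / reference
    return rod
```

A value near 1 means almost all diversity was lost in that window, a value near 0 means none was lost, and negative values indicate higher diversity in the derived group.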

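To identify the markers and segments selected by breeders, the authors ran a genome-wide scan over all samples with PCAdapt and identified 5,089 polymorphic sites with elevated signals. Some known genes lie close to these sites (<5 Mb distance). Large regions (>10 Mb) that arose and became fixed over the last two centuries are especially frequent on chromosome 1A, and are also common on 4A and 7B, two chromosomes that have undergone structural rearrangements.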

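The 8,308 and 9,948 significant footprint sites found in European and Asian genotypes, respectively, were extended to 2 Mb overlapping intervals, defining cumulative genomic regions of 950 Mb and 1.3 Gb that carry the selection signature of each region. Only 168 Mb of these regions is shared between the two, indicating that the two geographic origins had different selection targets.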
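One way to picture this step is as interval arithmetic. The sketch below is an interpretation, not the authors' code: it assumes each site is padded by 1 Mb on both sides (a 2 Mb interval), that overlapping intervals are merged, and that everything lies on one chromosome. Real data would need the same logic applied per chromosome.

```python
def extend_and_merge(positions, flank=1_000_000):
    """Pad each site by `flank` bp on both sides and merge overlaps.

    Returns sorted, non-overlapping half-open intervals (start, end).
    """
    merged = []
    for pos in sorted(positions):
        start, end = max(0, pos - flank), pos + flank
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(interval) for interval in merged]


def total_length(intervals):
    """Cumulative length covered by non-overlapping intervals."""
    return sum(end - start for start, end in intervals)


def shared_length(a, b):
    """Length of the intersection of two sorted, non-overlapping lists."""
    i = j = shared = 0
    while i < len(a) and j < len(b):
        low = max(a[i][0], b[j][0])
        high = min(a[i][1], b[j][1])
        if low < high:
            shared += high - low
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared
```

With the European and Asian site lists as input, `total_length` would correspond to the cumulative regions and `shared_length` to the portion found in both.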

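The authors then used multi-environment GWAS to test whether the observed allelic diversity is associated with two key life-history traits, heading date (HD) and plant height (PH). They found 48 and 40 genomic loci significantly associated with HD and PH, respectively, including regions that contain known genes as well as regions with no known gene.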

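The authors argue that the dataset provides a basis for identifying candidate genes at loci that were detected previously but remain uncharacterized. In particular, the preceding density, selection-footprint and GWAS analyses clearly show that only a small fraction of homoeologous loci carry shared signals, supporting the view that modern hexaploid bread wheats behave genetically as diploids. The rarity of convergent selection patterns between homoeologous regions points the same way.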

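1.3 Origin of wheat

The authors used a network-based phylogenetic approach. It infers 1,000 maximum-likelihood trees from repeated random haplotype samples (RRHS). Subsequent graph reconstruction and community clustering recover the reticulate evolutionary history of modern hexaploid bread wheat and its diploid and tetraploid ancestors. The robustness of the approach was demonstrated by comparing the parental relationships recovered in the network with known pedigrees.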

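The authors' integrative model of wheat evolution combines three lines of evidence: (1) a thorough analysis of the edges and edge weights of the network; (2) an evaluation of tree topologies; (3) tests of gene flow using Patterson’s D statistic.

In the proposed model, the evolutionary path is AA + SS -> AABB -(+DD)-> AABBDD.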

**Original text describing the evolutionary model:** Our proposed model (Fig. 4b) largely refines the widely accepted evolutionary path leading to modern bread wheat with the hybridization of wild diploid AA and SS (close to BB) genotypes leading to wild tetraploid AABB progenitors, which subsequently hybridized with a wild diploid DD genotype resulting in the hexaploid T. aestivum (AADDBB) lineage. In our analysis, the wheat B genome is confirmed to be derived from the Aegilops section Sitopsis lineage, which gave rise to A. speltoides (SS), while the progenitors of A. tauschii and T. urartu represent the established origins of the D and A genome lineages, respectively. T. araraticum (also referenced as T. araraticum Jakubz) represents the closest wild descendant of the AAGG tetraploid ancestor. It appears to have been subsequently domesticated to form T. timopheevii (Zhuk.) Zhuk while also hybridizing with T. boeoticum leading to the hexaploid T. zhukovskyi (Menabde & Ericzjan) lineage (AAAAGG).

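The model confirms that wild emmer (T. dicoccoides) is the closest descendant of the ancestor of the A and B subgenomes in modern tetraploid (AABB) and hexaploid (AABBDD) wheats. The data indicate that, during the early stages of domestication and cultivation, wild emmer gave rise to at least two distinct domesticated tetraploid lineages, T. dicoccum Schrank ex Schübl. and T. durum Desf.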

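Finally, the model supports the hypothesis that common wheat (T. aestivum) most likely arose from hybridization between the durum wheat lineage and a D-genome lineage close to wild A. tauschii. Subsequently, hybridization between hexaploid common wheat and cultivated emmer (T. dicoccum) produced T. spelta, which still harbors evidence of T. dicoccum introgressions today.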

**Original text describing the hypothesis:** Finally, the model supports the hypothesis that T. aestivum is most likely to be derived from an ancestral hybridization event between the previous T. durum lineage and a D lineage close to wild A. tauschii (Fig. 4b and Supplementary Fig. 11). Subsequently, T. spelta emerged from the hybridization between the hexaploid T. aestivum and the tetraploid T. dicoccum, and still harbors evidence of T. dicoccum introgressions today (Supplementary Fig. 12).

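The authors then looked for the founders of the gene pool that was fixed during the domestication of hexaploid wheat. These founders were probably among ancient landraces originating from the Fertile Crescent, and gave rise to two distinct hexaploid groups (β and γ). Group γ is more common in western Europe, whereas β is more common in eastern Europe. This evolutionary divergence may illustrate how human history and social activity shaped the genetic makeup of wheat germplasm.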

2. Methods

2.1 Phylogenetic analysis

The phylogeny of the 435 hexaploid bread wheat genotypes was inferred from 91,554 SNPs located in triplets (2,855) of orthologous genes present on all three subgenomes (A, B and D).

The data were first analyzed with iqtreeX (GTR+GAMMA(4) model) with 1,000 ultrafast bootstraps.

The geographic region of each ancestral node was reconstructed as follows: 10,000 simulations were performed using the stochastic mapping algorithm of the R phytools package (ref. 32) (using the equal rates model), the region of a node was then chosen as the one with maximum sampled frequency.
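The final selection rule, picking the region sampled most often for a node, is simple enough to illustrate. The sketch below does not reproduce the phytools stochastic mapping itself; it assumes the simulations have already produced one sampled region per simulation for a given node, and the region labels in the usage note are made up.

```python
from collections import Counter


def most_frequent_region(sampled_regions):
    """Return (region, frequency) for the most often sampled region.

    sampled_regions: one region label per simulation for a single node.
    Ties are resolved in favor of the label seen first.
    """
    counts = Counter(sampled_regions)
    region, count = counts.most_common(1)[0]
    return region, count / len(sampled_regions)
```

For instance, a node sampled as "Europe" in 6 of 10 simulations, "Asia" in 3 and "Africa" in 1 would be assigned to "Europe" with a sampled frequency of 0.6.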

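The 11 major clades of the tree were defined on the basis of size, representativeness and statistical support, so as to cover the tree well.

The world map was drawn with the R packages countrycode, geosphere and maps.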

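The phylogenetic analysis of the 487 diploid, tetraploid and hexaploid wheats accounts for the ambiguities and possible biases that affect SNP-based phylogenetic inference because of varying levels of heterozygosity, linkage disequilibrium, incomplete lineage sorting and reticulate evolution. To address this, the authors used a network-based approach to reconstruct the species history and community structure of the sampled Triticeae genotypes. To this end, they applied stringent SNP filtering and used RRHS to infer 1,000 maximum-likelihood tree topologies.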

Original text: The analysis of phylogeny for the 487 di- tetra- and hexaploid wheats was inferred in accounting for ambiguities and possible biases in phylogenetic inference from SNP data arising from varying levels of heterozygosity, linkage disequilibrium, incomplete lineage sorting and reticulate evolution. In that regard, we implemented a network-based approach to reconstruct the species history and community structure in the sampled Triticeae genotypes. To this end, we stringently filtered biallelic, polymorphic SNPs present in >90% of the genotypes from non-imputed data accounting for linkage disequilibrium (delivering 15,490 filtered SNPs), and implemented a repeated random haplotype sampling procedure including heterozygous sites (RRHS) to infer 1,000 maximum likelihood tree topologies with the ASC_GTRGAMMA model and JC69 distances in RAxML (asc-corr=felsenstein).
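The filtering and resampling steps described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: genotypes are assumed to be strings of IUPAC codes (one character per SNP, `N` for missing), the function names are hypothetical, and the linkage-disequilibrium pruning and the RAxML tree inference on each replicate are not shown.

```python
import random

# Two-allele IUPAC ambiguity codes used to encode heterozygous calls.
IUPAC_HET = {
    "R": "AG", "Y": "CT", "S": "CG",
    "W": "AT", "K": "GT", "M": "AC",
}


def keep_site(calls, min_presence=0.9, missing="N"):
    """Keep a site if it is biallelic, polymorphic and called in >90%.

    calls: one IUPAC character per genotype at a single SNP.
    """
    present = [c for c in calls if c != missing]
    if len(present) <= min_presence * len(calls):
        return False
    alleles = set()
    for c in present:
        alleles.update(IUPAC_HET.get(c, c))
    return len(alleles) == 2


def sample_haplotype(genotype, rng):
    """Resolve every heterozygous code to one of its two alleles at random."""
    return "".join(
        rng.choice(IUPAC_HET[c]) if c in IUPAC_HET else c for c in genotype
    )


def rrhs_replicates(alignment, n_replicates=1000, seed=0):
    """Yield haploid alignments, one per random haplotype sample.

    alignment: {genotype_name: IUPAC string}.
    """
    rng = random.Random(seed)
    for _ in range(n_replicates):
        yield {name: sample_haplotype(seq, rng) for name, seq in alignment.items()}
```

Each yielded alignment would then be passed to a tree-inference program, so that 1,000 replicates give 1,000 tree topologies whose differences reflect the uncertainty introduced by heterozygous sites.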

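For the 1,000 trees, the minimum spanning tree (MST) algorithm was applied to the evolutionary distances among the tips of each tree. The resulting MST graphs were then combined into a weighted phylogenetic consensus network, whose nodes were clustered into clades in Cytoscape 3.6.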

**Original text:** While these RRHS trees were also analyzed in the form of conventional consensus topologies and densitree visualizations to infer taxonomic clades, we analyzed the evolutionary distances among the tips of the 1,000 trees using the minimum spanning tree (MST) algorithm in Python. The MST graphs were subsequently combined into a weighted, phylogenetic consensus network whose nodes were clustered into clades using the Girvan–Newman EdgeBetweenness algorithm in Cytoscape 3.6 (ref. 33). The clustered network topology was plotted considering edge-betweenness in Cytoscape and taxonomic clades were inferred by intersection of community clusters with taxon information that was annotated using the AutoAnnotate plugin.

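The relative number of RRHS trees in which a given edge was selected by the MST algorithm was used as that edge's weight, and was interpreted in a similar way to bootstrap support values in consensus tree topologies.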

Original text: The relative number of RRHS trees for which a respective edge was selected by the MST algorithm were used as edge weights and were interpreted similar to bootstrap support values in the consensus tree topologies.
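The MST and edge-weight steps can be sketched as below. The paper states that the MST analysis was done in Python but these notes do not say with which library, so this sketch uses a plain Kruskal implementation from the standard library. The input format (one dictionary of pairwise tip distances per RRHS tree, keyed by `frozenset` pairs) and the function names are assumptions, and the Girvan-Newman clustering done in Cytoscape is not shown.

```python
from collections import Counter
from itertools import combinations


def minimum_spanning_tree(tips, dist):
    """Kruskal's algorithm; dist maps frozenset({a, b}) to a distance."""
    parent = {tip: tip for tip in tips}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = sorted(
        combinations(sorted(tips), 2), key=lambda e: dist[frozenset(e)]
    )
    mst = set()
    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            mst.add(frozenset((a, b)))
    return mst


def consensus_edge_weights(tips, distance_matrices):
    """Fraction of trees whose MST contains each edge."""
    counts = Counter()
    for dist in distance_matrices:
        counts.update(minimum_spanning_tree(tips, dist))
    n_trees = len(distance_matrices)
    return {edge: count / n_trees for edge, count in counts.items()}
```

An edge with weight 1.0 is part of the MST in every RRHS tree, while an edge with weight 0.5 appears in only half of them, which is why the weights read much like bootstrap support.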

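The composition and the geographic and historical origins of the inferred wheat communities were analyzed with chi-squared tests and visualized as bar plots.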

Original text: The composition, geographical and historical origins of the identified wheat communities were analyzed using chi squared tests and barplots in R.

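Gene flow in the A and B subgenomes was tested with Patterson’s D statistic (ABBA-BABA) in ANGSD, using a threshold of Z > 4.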

Original text: Gene flow in subgenomes A and B was investigated with the Patterson’s D statistic (or ABBA-BABA statistic) using ANGSD with a threshold of Z>4 (ref. 35).
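For readers unfamiliar with the statistic, D compares the number of ABBA and BABA site patterns: D = (nABBA - nBABA) / (nABBA + nBABA). The sketch below is not ANGSD's implementation. It assumes the pattern counts are already available per genomic block and uses a simple unweighted delete-one block jackknife to obtain a Z score, whereas production tools typically weight blocks by size.

```python
import math


def patterson_d(abba, baba):
    """Patterson's D from ABBA and BABA pattern counts."""
    total = abba + baba
    return (abba - baba) / total if total else 0.0


def d_with_jackknife(blocks):
    """Return (D, Z) from per-block (n_abba, n_baba) counts.

    Uses an unweighted delete-one block jackknife, which assumes blocks
    of comparable size. Requires at least two blocks.
    """
    n = len(blocks)
    sum_abba = sum(a for a, _ in blocks)
    sum_baba = sum(b for _, b in blocks)
    d_all = patterson_d(sum_abba, sum_baba)
    # D recomputed with each block left out in turn.
    leave_one_out = [
        patterson_d(sum_abba - a, sum_baba - b) for a, b in blocks
    ]
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((d - mean_loo) ** 2 for d in leave_one_out)
    standard_error = math.sqrt(variance)
    if standard_error == 0:
        return d_all, (0.0 if d_all == 0 else math.copysign(math.inf, d_all))
    return d_all, d_all / standard_error
```

Under the threshold quoted above, a test would count as evidence of gene flow when the Z score exceeds 4; whether the cutoff was applied to the absolute value is not stated in these notes.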

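The integrative model of wheat evolution (Fig. 4b) was built by manually consolidating three lines of evidence, together with the literature: (1) the support values of the edges in the phylogenetic consensus network; (2) the topologies of the various consensus and IUPAC trees; (3) the ABBA-BABA results.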

Original text: An integrative model (Fig. 4b) of wheat evolution was built by manual consolidation of the support values of the edges in the phylogenetic consensus network (Supplementary Fig. 10 and Supplementary Table 7), the various consensus and IUPAC tree topologies (Supplementary Fig. 11), the ABBA-BABA results (Supplementary Fig. 12) as well as the literature.

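Where species relationships could not be resolved by the network approach alone, the ABBA-BABA results and the existing literature were also taken into account.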

Original text: Where species relationships remained ambiguous on the sole basis of the network approach (that is, when similar phylogenetic relatedness between groups of genotypes defines several possible evolutionary paths between putative progenitors and descendants), we then considered the results of the ABBA-BABA statistical test (Supplementary Fig. 12 and Supplementary Table 6) and the existing literature when available.

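Fig. 4b reports only the reticulation events that were identified in the phylogenetic consensus networks and supported by the ABBA-BABA analysis in both the A and B subgenomes.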

Original text: Fig. 4b reports only the reticulation events identified on the basis of phylogenetic consensus networks supported by the ABBA-BABA analysis in both the A and B subgenomes.
