Paper Title
Detectability of Varied Hybridization Scenarios using Genome-Scale Hybrid Detection Methods
Paper Authors
Paper Abstract
Hybridization events complicate the accurate reconstruction of phylogenies because they produce patterns of genetic inheritance that are unexpected under traditional, bifurcating species-tree models. This has led to the development of methods for inferring these varied hybridization events, including methods that reconstruct networks directly and summary methods that predict individual hybridization events. However, a lack of empirical comparisons between methods, especially for large networks with varied hybridization scenarios, hinders their practical use. Here, we provide a comprehensive review of popular summary methods: TICR, MSCquartets, HyDe, Patterson's D-Statistic (ABBA-BABA), D3, and Dp. TICR and MSCquartets are based on quartet concordance factors gathered from gene tree topologies, whereas Patterson's D-Statistic, D3, and Dp use site pattern frequencies to identify hybridization events. We then use simulated data to assess method accuracy and ideal use scenarios, testing the methods against complex networks depicting gene flow events that differ in depth (timing), quantity (single vs. multiple, overlapping hybridizations), and rate of gene flow. We find that deeper or multiple hybridization events may introduce noise and weaken the signal of hybridization, leading to higher false negative rates across methods. Although some forms of hybridization elude quartet-based detection methods, MSCquartets displays high precision in most scenarios. HyDe yields high false negative rates when tested on hybridizations involving ghost lineages, but it is the only method able to distinguish hybrid from parent signals. Lastly, we test the methods on ultraconserved elements from the bee subfamily Nomiinae and find possible hybridization events between clades that correspond to regions of poor support in the species tree estimated in the original study.
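To make the idea of site pattern frequencies concrete, the following is a minimal sketch of how Patterson's D-Statistic can be computed for four aligned sequences arranged as (((P1, P2), P3), O). It uses the standard definition D = (ABBA - BABA) / (ABBA + BABA), where the outgroup allele is treated as ancestral ("A") and any differing allele as derived ("B"). The function names `abba_baba_counts` and `patterson_d`, the toy sequences, and the choice to skip sites containing gaps or ambiguity codes are illustrative assumptions, not taken from the paper or from any of the tools it reviews.

```python
from typing import Tuple


def abba_baba_counts(p1: str, p2: str, p3: str, outgroup: str) -> Tuple[int, int]:
    """Count ABBA and BABA site patterns across four aligned sequences.

    Assumes the topology (((P1, P2), P3), O) and treats the outgroup
    allele as ancestral. Sites with gaps or ambiguity codes are skipped
    (an illustrative simplification).
    """
    if not (len(p1) == len(p2) == len(p3) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")

    valid = set("ACGT")
    abba = 0
    baba = 0
    for a, b, c, o in zip(p1.upper(), p2.upper(), p3.upper(), outgroup.upper()):
        if not {a, b, c, o} <= valid:
            continue  # skip gaps and ambiguous bases
        if a == o and b == c and b != o:
            abba += 1  # P1 ancestral, P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            baba += 1  # P2 ancestral, P1 and P3 share the derived allele
    return abba, baba


def patterson_d(abba: int, baba: int) -> float:
    """Return D = (ABBA - BABA) / (ABBA + BABA)."""
    total = abba + baba
    if total == 0:
        raise ValueError("no ABBA or BABA sites; D is undefined")
    return (abba - baba) / total


if __name__ == "__main__":
    # Toy alignment (hypothetical data): two ABBA sites and one BABA site.
    counts = abba_baba_counts("AAAAC", "ACCAA", "ACCAC", "AAAAA")
    print(counts, patterson_d(*counts))
```

Under a strictly bifurcating tree with only incomplete lineage sorting, ABBA and BABA patterns are expected in equal numbers, so D is close to zero; a consistent excess of one pattern suggests gene flow between P3 and either P2 or P1. In practice, significance is usually assessed with a resampling scheme such as a block jackknife, which this sketch omits.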