论文标题

可视化多物种合并树:在物种树中绘制基因树

Visualizing Multispecies Coalescent Trees: Drawing Gene Trees Inside Species Trees

论文作者

Klawitter, Jonathan, Klesen, Felix, Niederer, Moritz, Wolff, Alexander

论文摘要

我们考虑在单个物种树中绘制多个基因树的问题,以可视化多种物种合并树。具体而言,物种树的图填充了一个矩形,在该矩形中,其每个边的边缘由较小的矩形表示,并且在物种树的图中,将基因树绘制为矩形cladogram(即,正交和向下,一个弯曲,每个边缘的弯曲,一个弯曲,每个边缘的弯曲)。作为替代方案,我们还考虑了一种风格,即物种树的边缘的宽度与给定有效的人口大小成正比。 为了获得可读的可视化,我们的目的是最大程度地减少此类图中基因树边缘之间的交叉数量。我们表明,平面实例可以在线性时间内识别,并且一般问题是NP-HARD。因此,我们介绍了两种启发式方法,并提供了一个整数线性编程(ILP)公式,该公式为我们提供了指数时间的精确解决方案。我们使用ILP来衡量现实世界实例的启发式方法的质量。启发式方法产生了令人惊讶的良好解决方案,而ILP的运行速度出乎意料。

We consider the problem of drawing multiple gene trees inside a single species tree in order to visualize multispecies coalescent trees. Specifically, the drawing of the species tree fills a rectangle in which each of its edges is represented by a smaller rectangle, and the gene trees are drawn as rectangular cladograms (that is, orthogonally and downward, with one bend per edge) inside the drawing of the species tree. As an alternative, we also consider a style where the widths of the edges of the species tree are proportional to given effective population sizes. In order to obtain readable visualizations, our aim is to minimize the number of crossings between edges of the gene trees in such drawings. We show that planar instances can be recognized in linear time and that the general problem is NP-hard. Therefore, we introduce two heuristics and give an integer linear programming (ILP) formulation that provides us with exact solutions in exponential time. We use the ILP to measure the quality of the heuristics on real-world instances. The heuristics yield surprisingly good solutions, and the ILP runs surprisingly fast.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源