论文标题
在加权编辑距离约束下,在道路网络中的快速子区域相似性搜索
Fast Subtrajectory Similarity Search in Road Networks under Weighted Edit Distance Constraints
论文作者
论文摘要
在本文中,我们解决了道路网络中空间轨迹的相似性搜索问题。特别是,我们专注于子区域的相似性搜索问题,该问题涉及在数据库中查找类似于查询轨迹的子区域。我们方法的关键特征是我们不关注特定的相似性函数。取而代之的是,我们考虑加权编辑距离(WED),这是一类相似性功能,该功能允许用户定义的成本功能,因此包含了几个重要的相似性功能,例如EDR和ERP。我们将轨迹建模为字符串,并提出一个通用解决方案,该解决方案能够处理属于WED类的任何相似性函数。通过采用过滤器和验证策略,我们将子序列过滤引入有效修剪轨迹并找到候选者。为了选择一个适当的子序列以优化候选号码,我们将选择模型为离散优化问题(NP-HARD),并使用2-辅助算法对其进行计算。为了验证候选人,我们设计了双向尝试,验证从有希望的位置开始,并利用轨迹的共同段和道路网络的稀疏性来加速。实验是在大型数据集上进行的,以证明WED的有效性以及我们方法在WED下对各种相似性函数的效率。
In this paper, we address a similarity search problem for spatial trajectories in road networks. In particular, we focus on the subtrajectory similarity search problem, which involves finding in a database the subtrajectories similar to a query trajectory. A key feature of our approach is that we do not focus on a specific similarity function; instead, we consider weighted edit distance (WED), a class of similarity functions which allows user-defined cost functions and hence includes several important similarity functions such as EDR and ERP. We model trajectories as strings, and propose a generic solution which is able to deal with any similarity function belonging to the class of WED. By employing the filter-and-verify strategy, we introduce subsequence filtering to efficiently prunes trajectories and find candidates. In order to choose a proper subsequence to optimize the candidate number, we model the choice as a discrete optimization problem (NP-hard) and compute it using a 2-approximation algorithm. To verify candidates, we design bidirectional tries, with which the verification starts from promising positions and leverage the shared segments of trajectories and the sparsity of road networks for speed-up. Experiments are conducted on large datasets to demonstrate the effectiveness of WED and the efficiency of our method for various similarity functions under WED.