Paper Title

GeoTree: a data structure for constant time geospatial search enabling a real-time mix-adjusted median property price index

Authors

Robert Miller, Phil Maguire

Abstract

A common problem across the field of data science is $k$-NN ($k$-nearest neighbours) search, particularly within the context of Geographic Information Systems. In this article, we present a novel data structure, the GeoTree, which holds a collection of geohashes (string encodings of GPS co-ordinates). This enables a constant-time, $O\left(1\right)$, search algorithm that returns a set of geohashes surrounding a given geohash in the GeoTree, representing the approximate $k$-nearest neighbours of that geohash. Furthermore, the GeoTree data structure retains an $O\left(n\right)$ memory requirement. We apply the data structure to a property price index algorithm focused on price comparison with historical neighbouring sales, demonstrating improved performance. The results show that this data structure allows for the development of a real-time property price index, and can be scaled to larger datasets with ease.
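
The abstract does not spell out the internals of the GeoTree, so the following is only a minimal sketch of how a structure with these properties might look, not the authors' implementation. It assumes a prefix tree over fixed-length geohash strings in which every node caches the geohashes stored beneath it. The names `GeoTreeSketch`, `Node`, `insert`, `approx_neighbours` and the `precision` parameter are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical node layout: one child per geohash character, plus a
    # cached list of every geohash stored at or below this node.
    children: dict = field(default_factory=dict)
    items: list = field(default_factory=list)


class GeoTreeSketch:
    """Illustrative prefix tree over fixed-length geohashes (an assumption,
    not the structure defined in the paper)."""

    def __init__(self, precision=8):
        self.precision = precision  # assumed fixed geohash length
        self.root = Node()

    def insert(self, geohash):
        if len(geohash) != self.precision:
            raise ValueError("geohash length must equal the tree precision")
        node = self.root
        node.items.append(geohash)
        # Each geohash is cached once per level, so memory is
        # (precision + 1) * n entries, i.e. O(n) for a fixed precision.
        for ch in geohash:
            node = node.children.setdefault(ch, Node())
            node.items.append(geohash)

    def approx_neighbours(self, geohash, k):
        # Walk down at most `precision` levels: a bound independent of n.
        path = [self.root]
        node = self.root
        for ch in geohash[: self.precision]:
            node = node.children.get(ch)
            if node is None:
                break
            path.append(node)
        # Return the cached list of the deepest (most local) node that
        # holds at least k entries; the list is returned without copying.
        for node in reversed(path):
            if len(node.items) >= k:
                return node.items
        return self.root.items  # fewer than k entries in the whole tree


tree = GeoTreeSketch(precision=5)
for g in ["gc7x3", "gc7x9", "gc7z1", "u10hb"]:
    tree.insert(g)

nearby = tree.approx_neighbours("gc7x3", 2)
```

Because geohashes that share a longer prefix lie in the same, smaller cell, the returned list approximates the nearest neighbours rather than ranking them exactly. In this sketch the list includes the query geohash itself when it is stored in the tree, and points just across a cell boundary are missed; both are simplifications of the sketch.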
