论文标题

用于文件包装缓存和概括分布式缓存的最佳在线算法

Optimal Online Algorithms for File-Bundle Caching and Generalization to Distributed Caching

论文作者

Qin, Tiancheng, Etesami, S. Rasoul

论文摘要

我们考虑对称为文件包装缓存的标准缓存问题的概括,其中不同的查询(任务)包含$ l \ ge 1 $文件,依次到达。一种不知道提前查询顺序的在线算法必须自适应地决定要保留哪些文件以征收最小数量的高速缓存误差。在这里,缓存失误是指在缓存文件中缺少查询中至少一个文件的情况。在$ L = 1 $的特殊情况下,此问题减少了标准缓存问题。我们首先在此环境中分析了最近使用的最少使用的经典(LRU)算法的性能,并表明LRU是针对竞争比率的文件捆绑缓存的近乎最佳的在线确定性算法。然后,我们将结果扩展到此文件集合设置中的广义$(h,k)$ - 分页问题,​​在该设置中,将其具有缓存尺寸$ k $的在线算法的性能与较小的高速缓存尺寸$ h <k $的最佳离线基准进行了比较。在后一种情况下,我们为我们的广义$(h,k)$ - 分页问题提供了随机$ O(l \ ln \ frac {k} {k-h})$ - 可以看作是经典标记算法的扩展。我们通过为竞争比率提供匹配的下限来完成此结果,这表明该修改后的标记算法的性能在任何随机在线算法中的两个因素之内。最后,我们查看文件捆绑缓存问题的分布式版本,其中系统中有$ M \ ge 1 $相同的缓存。在这种情况下,我们表明,对于$ M = L+1 $缓存,有一个确定性分布式缓存算法,该算法为$(l^2+l)$ - 竞争性和一个随机分布式的缓存算法,该算法为$ O(l \ ln(2l+1))$ - 当$ ln(2l+1)$ - 竞争$ l \ ge ge 2 $。

We consider a generalization of the standard cache problem called file-bundle caching, where different queries (tasks), each containing $l\ge 1$ files, sequentially arrive. An online algorithm that does not know the sequence of queries ahead of time must adaptively decide on what files to keep in the cache to incur the minimum number of cache misses. Here a cache miss refers to the case where at least one file in a query is missing among the cache files. In the special case where $l=1$, this problem reduces to the standard cache problem. We first analyze the performance of the classic least recently used (LRU) algorithm in this setting and show that LRU is a near-optimal online deterministic algorithm for file-bundle caching with regard to competitive ratio. We then extend our results to a generalized $(h,k)$-paging problem in this file-bundle setting, where the performance of the online algorithm with a cache size $k$ is compared to an optimal offline benchmark of a smaller cache size $h<k$. In this latter case, we provide a randomized $O(l \ln \frac{k}{k-h})$-competitive algorithm for our generalized $(h,k)$-paging problem, which can be viewed as an extension of the classic marking algorithm. We complete this result by providing a matching lower bound for the competitive ratio, indicating that the performance of this modified marking algorithm is within a factor of two of any randomized online algorithm. Finally, we look at the distributed version of the file-bundle caching problem where there are $m\ge 1$ identical caches in the system. In this case we show that for $m=l+1$ caches, there is a deterministic distributed caching algorithm which is $(l^2+l)$-competitive and a randomized distributed caching algorithm which is $O(l\ln(2l+1))$-competitive when $l\ge 2$.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源