Paper Title

Kernel Operations on the GPU, with Autodiff, without Memory Overflows

Authors

Benjamin Charlier, Jean Feydy, Joan Alexis Glaunès, François-David Collin, Ghislain Durif

Abstract

The KeOps library provides fast, memory-efficient GPU support for tensors whose entries are given by a mathematical formula, such as kernel and distance matrices. KeOps alleviates the major bottleneck of tensor-centric libraries for kernel and geometric applications: memory consumption. It also supports automatic differentiation and outperforms standard GPU baselines, including PyTorch CUDA tensors and the Halide and TVM libraries. KeOps combines optimized C++/CUDA schemes with bindings for high-level languages: Python (NumPy and PyTorch), Matlab and GNU R. As a result, high-level "quadratic" codes can now scale up to large data sets, with millions of samples processed in seconds. KeOps brings graphics-like performance to kernel methods and is freely available on standard repositories (PyPI, CRAN). To showcase its versatility, we provide tutorials in a wide range of settings online at www.kernel-operations.io.
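
To make the memory argument concrete, here is a minimal pure-Python sketch of the general idea behind formula-defined tensors: a Gaussian kernel matrix-vector product is accumulated block by block, so the full N-by-M kernel matrix is never stored. This is not the KeOps API or its CUDA scheme; the function names, the Gaussian kernel, and the `block` parameter are assumptions chosen for illustration only.

```python
import math


def gaussian_kernel_matvec(x, y, b, sigma=1.0, block=2):
    """Streaming reduction: a_i = sum_j exp(-|x_i - y_j|^2 / (2 sigma^2)) * b_j.

    Kernel entries are recomputed from the formula one block of y at a time,
    so only O(N) accumulators are kept instead of an N-by-M matrix.
    (Hypothetical helper, not part of any library.)
    """
    a = [0.0] * len(x)
    for start in range(0, len(y), block):
        y_blk = y[start:start + block]
        b_blk = b[start:start + block]
        for i, xi in enumerate(x):
            for yj, bj in zip(y_blk, b_blk):
                sq_dist = sum((u - v) ** 2 for u, v in zip(xi, yj))
                a[i] += math.exp(-sq_dist / (2 * sigma ** 2)) * bj
    return a


def dense_kernel_matvec(x, y, b, sigma=1.0):
    """Reference version that materializes the full kernel matrix first."""
    K = [
        [
            math.exp(-sum((u - v) ** 2 for u, v in zip(xi, yj)) / (2 * sigma ** 2))
            for yj in y
        ]
        for xi in x
    ]
    return [sum(k * bj for k, bj in zip(row, b)) for row in K]


if __name__ == "__main__":
    x = [(0.0, 0.0), (1.0, 0.0)]
    y = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    b = [1.0, 2.0, 3.0]
    print(gaussian_kernel_matvec(x, y, b))
    print(dense_kernel_matvec(x, y, b))
```

Both functions return the same values up to floating-point rounding; the difference is that the dense version needs memory proportional to N times M, which is the bottleneck the abstract refers to.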
