Paper Title
Anonymizing Sensor Data on the Edge: A Representation Learning and Transformation Approach
Authors
Abstract
The abundance of data collected by sensors in Internet of Things (IoT) devices and the success of deep neural networks in uncovering hidden patterns in time series data have led to mounting privacy concerns, because applications with access to sensor data can potentially learn private and sensitive information from it. In this paper, we examine the tradeoff between utility and privacy loss by learning low-dimensional representations that are useful for data obfuscation. We propose deterministic and probabilistic transformations in the latent space of a variational autoencoder to synthesize time series data such that intrusive inferences are prevented while desired inferences can still be made with sufficient accuracy. In the deterministic case, we use a linear transformation to move the representation of input data in the latent space such that the reconstructed data is likely to have the same public attribute as the original input data but a different private attribute. In the probabilistic case, we apply the linear transformation to the latent representation of input data with some probability. We compare our technique with autoencoder-based anonymization techniques and additionally show that it can anonymize data in real time on resource-constrained edge devices.
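The deterministic and probabilistic latent-space transformations described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the form of the linear transformation, so this sketch assumes a simple translation by the difference between the mean latent vectors of two private-attribute classes. The encoder and decoder of the variational autoencoder are omitted, and all function names (`class_mean_shift`, `deterministic_transform`, `probabilistic_transform`) are hypothetical and chosen for illustration only.

```python
import random


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def class_mean_shift(latents, private_labels, source, target):
    """Translation vector from the mean latent of `source` to that of `target`.

    Assumption: the linear transformation is a translation between the
    class means of the private attribute in latent space.
    """
    src = [z for z, y in zip(latents, private_labels) if y == source]
    tgt = [z for z, y in zip(latents, private_labels) if y == target]
    return [t - s for s, t in zip(mean_vector(src), mean_vector(tgt))]


def deterministic_transform(z, shift):
    """Always move the latent representation by `shift`."""
    return [zi + si for zi, si in zip(z, shift)]


def probabilistic_transform(z, shift, p, rng=random):
    """Move the latent representation by `shift` with probability `p`."""
    if rng.random() < p:
        return deterministic_transform(z, shift)
    return list(z)


# Toy 2-D latent codes standing in for encoder outputs (made-up values).
latents = [[0.0, 0.0], [2.0, 0.0], [0.0, 4.0], [2.0, 4.0]]
private_labels = [0, 0, 1, 1]

shift = class_mean_shift(latents, private_labels, source=0, target=1)
moved = deterministic_transform([1.0, 1.0], shift)
maybe_moved = probabilistic_transform([1.0, 1.0], shift, p=0.5)
```

In a full pipeline, the transformed latent vector would be passed to the decoder to synthesize the anonymized time series. Under this assumption, the probability `p` controls how often the private attribute is altered, which is one way to trade privacy against utility.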