Paper Title

Discovering the Hidden Vocabulary of DALLE-2

Authors

Giannis Daras, Alexandros G. Dimakis

Abstract

We discover that DALLE-2 seems to have a hidden vocabulary that can be used to generate images with absurd prompts. For example, it seems that "Apoploe vesrreaitais" means birds and "Contarra ccetnxniams luryca tanniounons" (sometimes) means bugs or pests. We find that these prompts are often consistent in isolation but also sometimes in combinations. We present our black-box method to discover words that seem random but have some correspondence to visual concepts. This creates important security and interpretability challenges.
