

Embedding Inversion Attack of Documents with Topic-aware Semantic Decoder Model

Advisor: Shou-De Lin (林守德)
Co-advisor: Mi-Yen Yeh (葉彌妍)

Abstract


Document representation learning has become an important technique for embedding rich document content into low-dimensional vectors. The embeddings preserve the complete semantics of a document and have led to great success in various NLP applications. Nonetheless, this success has also attracted the attention of many malicious adversaries. In one branch of attacks, an adversary tries to reverse-engineer an embedding back to its content words or sensitive keywords in order to pry into the information behind it. In this work, we extend the previous setting to a more general one and provide two types of information that both increase the interpretability of the embeddings. First, we assume the adversary assigns a preference score to each word in a document, and our goal is to retrieve a sequence of words ordered consistently with that preference. Second, even if we could precisely retrieve a sequence of words representing a document, it would still be hard for a human to grasp the idea behind them; we therefore borrow the strengths of topic models to extract coherent semantics from the targets. To achieve these goals, we combine a neural topic model with ranking optimization. Through comprehensive experiments, our design shows promising results both in recovering the sequence of the adversary's preferred words and in providing coherent and diverse topics, meaning that an adversary can easily understand the characteristics of an unknown document embedding across various datasets and off-the-shelf embedding models.
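To make the combination of a neural topic model and ranking optimization concrete, below is a minimal PyTorch sketch. It is an illustration under assumptions, not the thesis's actual architecture: the names TopicAwareDecoder and pairwise_ranking_loss, the layer sizes, and the toy data are all hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TopicAwareDecoder(nn.Module):
    # Hypothetical sketch: maps an intercepted document embedding to
    # (i) topic proportions and (ii) per-word preference scores.
    def __init__(self, embed_dim, num_topics, vocab_size):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(embed_dim, 256), nn.Softplus(),
            nn.Linear(256, num_topics),
        )
        # Topic-word weights: column k scores every vocabulary word for topic k.
        self.topic_word = nn.Linear(num_topics, vocab_size, bias=False)

    def forward(self, doc_emb):
        theta = F.softmax(self.encoder(doc_emb), dim=-1)  # topic proportions
        word_scores = self.topic_word(theta)              # one score per word
        return theta, word_scores

def pairwise_ranking_loss(scores, pos_idx, neg_idx, margin=1.0):
    # Hinge loss: each preferred word should outscore its paired
    # less-preferred word by at least `margin`.
    pos = scores.gather(1, pos_idx)
    neg = scores.gather(1, neg_idx)
    return F.relu(margin - (pos - neg)).mean()

# Toy usage with random data: 4 intercepted 768-dimensional embeddings.
model = TopicAwareDecoder(embed_dim=768, num_topics=20, vocab_size=5000)
theta, scores = model(torch.randn(4, 768))
pos_idx = torch.randint(0, 5000, (4, 8))  # stand-ins for preferred words
neg_idx = torch.randint(0, 5000, (4, 8))  # stand-ins for less-preferred words
pairwise_ranking_loss(scores, pos_idx, neg_idx).backward()

Under these assumptions, sorting the vocabulary by word_scores yields the preference-ordered word sequence described above, while the top-weighted words in each column of topic_word.weight give a human-readable topic; the ranking loss and the topic head can be trained jointly so that both views explain the same unknown embedding.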
