Marketing Bot: From Open-Domain to Task-Oriented Dialogue Systems

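Dialogue systems generally fall into two categories. The first is open-domain dialogue, whose goal is for users to enjoy the conversation and engage actively, so choosing a suitable topic that connects to the dialogue history is essential. The second is task-oriented dialogue, in which the system is built from the outset to complete a specific task, such as "find me a movie suitable for Friday night" or "play me a random song". Because the two kinds of systems serve different purposes, they are usually studied separately. However, a system that can move unobtrusively from casual chat toward a specific task has commercial value, and no public data of this kind currently exists. We therefore investigate how to guide a conversation gradually from open-domain chat to task-oriented dialogue, and we release an annotated dataset to support future research. To this end, we propose a pipeline that generates dialogues automatically and produces the required data with little human assistance, relying mainly on today's powerful pre-trained open-domain dialogue models. In our evaluation of the automatically generated dialogues, human judges found the conversation flow relatively reasonable and natural, which suggests the data will be useful for future work in this direction. In addition, the pipeline can generate dialogues in large quantities, which can also be used for unsupervised or semi-supervised learning.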

Keywords

Dialogue Systems; Natural Language Understanding; Natural Language Processing

Parallel Abstract

Dialogue systems are usually categorized into two types: open-domain and task-oriented. The first focuses on chatting with users and keeping them engaged in the conversation, where selecting a topic that fits the dialogue context is essential for a successful dialogue. The second focuses on a specific task rather than casual talk, e.g., finding a movie for Friday night or playing a song. These two directions have been studied separately because of their different purposes. However, transitioning smoothly from social chatting to task-oriented dialogue is important for triggering business opportunities, and no public data focuses on such scenarios. Hence, this paper investigates conversations that start with open-domain social chatting and gradually transition to task-oriented purposes, and it releases a large-scale dataset with detailed annotations to encourage this research direction. To achieve this goal, the paper proposes a framework that automatically generates many dialogues without human involvement and in which any powerful open-domain dialogue generation model can be easily leveraged. Human evaluation shows that the generated dialogue data has a natural flow at a reasonable quality, indicating that the released data has great potential to guide future research directions and commercial activities. Furthermore, the released models allow researchers to automatically generate unlimited dialogues in the target scenarios, which can greatly benefit semi-supervised and unsupervised approaches.

Parallel Keywords

Dialogue Systems; Natural Language Understanding; Natural Language Processing

