Title

實用對話系統之強健性研究

Translated Titles

On the Robustness of Real-world Spoken Dialogue System

Authors

王獻章

Key Words

error recovery ; multi-speaker dialogue system ; dialogue manager ; virtual speech recognizer ; dialogue system ; reading comprehension system

PublicationName

Degree thesis, Department of Computer Science and Information Engineering, National Cheng Kung University

Volume or Term/Year and Month of Publication

2004

Academic Degree Category

Doctoral

Advisor

王駿發

Content Language

English

Chinese Abstract

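Enabling humans and machines to communicate in a natural way has been a long-standing dream. Recent research on spoken dialogue systems for enhancing human-machine interaction has produced a considerable number of technical results. However, many problems still await further study before these technologies can be used in systems deployed in practice. The purpose of this dissertation is to explore how to improve the performance of spoken dialogue systems so that they work more effectively in real-world applications. First, we survey the current state of spoken dialogue system development around the world and discuss some of the problems encountered when developing such systems. To address these problems and improve system performance, we propose techniques including recovery from speech recognition errors, incorporation of semantic knowledge into the parsing module, and a sophisticated dialogue manager. We also introduce a new research topic concerning multi-speaker dialogue. A multi-speaker dialogue system removes the limitation of the traditional single-speaker dialogue system, in which only one user can interact with the system. This dissertation covers the topic in detail, from theoretical analysis to the practical management of multi-speaker dialogue. To evaluate the proposed methods, we designed and built a dialogue-based call routing system and obtained good results in experiments with it. In addition, we built an intelligent in-car information system to test human-machine communication in a multi-speaker environment. Testers gave positive opinions on the robustness and friendliness of our systems and indicated that they would be very interested in using the technology if it were commercialized. In summary, the experimental results give us confidence in the performance of the proposed methods when applied to practical spoken dialogue systems.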

English Abstract

It has long been a dream that people could one day interact with machines in a natural manner. Recent research has produced many technologies for developing spoken dialogue systems (SDS), which aim to facilitate interaction between humans and machines. However, many issues remain to be addressed before these technologies can be applied successfully in robust real-world applications. The purpose of this dissertation is to investigate methods for improving the performance of spoken dialogue systems. The dissertation first surveys the current development of spoken dialogue systems and discusses several robustness issues in developing an SDS. To improve SDS performance, it proposes approaches such as speech recognition error recovery, incorporation of semantic knowledge, and sophisticated dialogue management. We also introduce a new research topic, the multi-speaker dialogue system (MSDS). An MSDS removes the limitation of a traditional SDS that only one speaker can communicate with the system. A fundamental analysis of the MSDS is presented, and an enhanced multi-speaker dialogue manager algorithm is proposed to handle the interactions between multiple speakers and the system. To evaluate our approaches, an auto-attendant telephone routing system was set up, and remarkable experimental results were achieved. Furthermore, a multi-speaker dialogue system for the in-car environment was built. Testers in our experiments expressed positive opinions on both the robustness and the user-friendliness of the systems. The experimental results encourage us to believe that the proposed approaches are suitable for real-world SDS applications.

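Topic Category Basic and Applied Sciences > Information Science
College of Electrical Engineering and Computer Science > Department of Computer Science and Information Engineering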