  • Degree thesis

聲源定位和語音互動系統應用於智慧型服務機器人

Sound Source Localization and Speech Interaction System for Intelligent Mobile Robots

Advisor: Ren C. Luo (羅仁權)

Abstract


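In advanced countries, population aging is giving rise to growing demands for social welfare, medical care, and various public services. From a technological standpoint, using intelligent robots to help the elderly live healthily, comfortably, and safely is an issue that many countries take seriously. In the twenty-first century, countries around the world also list the intelligent robotics industry as a forward-looking, high-priority new technology industry.

Our research focuses on how robots interact with sound in the surrounding environment, and covers a speech-based human-machine interface and a sound source localization system. To integrate the speech interface and the sound source localization system with the robot's other applications, we studied robot software architectures and open-source code published in many domestic and international papers. We found that most such open-source software targets research-oriented robot platforms and is not suitable for product-oriented intelligent service robots. We therefore designed our own software framework, the SRUT framework (Socket based Robot User interface and Task framework), built on the Microsoft Windows operating system and the C++ programming language. With SRUT as the backbone, we integrated the application functions commonly found in intelligent service robots to demonstrate that this architecture can organize complex basic robot functions into systematic robot subsystems while providing a simple and extensible user interface.

This thesis has three main topics: (1) a human-machine interface based on speech interaction, (2) a sound source localization system, and (3) SRUT, a robot software framework suited to product-oriented robots.

In addition, this thesis proposes algorithmic innovations and improvements for several robot functions, including localization, path planning, and obstacle avoidance. The robot functions integrated in the thesis include interactive person following, searching for sound-emitting targets, and indoor localization and patrol.

The speech human-machine interface in this thesis is built on the Microsoft Windows operating system environment and SAPI (Speech Application Programming Interface), while the sound source localization system implements time-domain cross-correlation and voice activity detection (VAD) algorithms. To meet the needs of system integration, the proposed software architecture, SRUT, gives diverse robot functions a uniform form as robot subsystems, which reduces the complexity of system integration.

SRUT makes full use of the object-oriented features of C++. It combines the graphical user interface (Graphical Interface) and the speech interface (Speech Interface) in a single abstract class, so that developers can quickly inherit from this class and design their own user interfaces and robot functions.

The software architecture and design patterns proposed in this thesis are implemented in standard C++ with the Win32 API on RenQ, a wheeled robot developed in-house by our laboratory.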
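The idea of one abstract class that serves both the graphical and the speech interface can be illustrated with a minimal C++ sketch. This is not the SRUT source: the names `RobotTask`, `PatrolTask`, `TaskRegistry`, and the command strings are all hypothetical, and the socket layer and Win32 GUI are left out entirely. It only shows how a subsystem might inherit a common interface so that every robot function is integrated in the same form.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical base class: every robot function exposes the same
// entry points for GUI input and for recognized speech.
class RobotTask {
public:
    virtual ~RobotTask() = default;
    virtual std::string name() const = 0;
    // Called when a GUI control belonging to this task is triggered.
    virtual void onGuiCommand(const std::string& controlId) = 0;
    // Called when the speech recognizer reports a phrase for this task.
    virtual void onSpeechCommand(const std::string& phrase) = 0;
};

// Hypothetical subsystem: an indoor patrol task that can be started
// or stopped from either interface.
class PatrolTask : public RobotTask {
public:
    std::string name() const override { return "patrol"; }
    void onGuiCommand(const std::string& controlId) override {
        if (controlId == "start") running_ = true;
        if (controlId == "stop") running_ = false;
    }
    void onSpeechCommand(const std::string& phrase) override {
        if (phrase == "start patrol") running_ = true;
        if (phrase == "stop patrol") running_ = false;
    }
    bool running() const { return running_; }
private:
    bool running_ = false;
};

// Hypothetical registry: the integration layer only sees RobotTask,
// so adding a new robot function does not change the integration code.
class TaskRegistry {
public:
    void add(std::unique_ptr<RobotTask> task) {
        assert(task != nullptr);
        const std::string key = task->name();  // read the key before moving
        tasks_[key] = std::move(task);
    }
    // Returns nullptr when no task with that name is registered.
    RobotTask* find(const std::string& taskName) const {
        auto it = tasks_.find(taskName);
        return it == tasks_.end() ? nullptr : it->second.get();
    }
private:
    std::map<std::string, std::unique_ptr<RobotTask>> tasks_;
};
```

In this sketch a caller would register a `PatrolTask`, look it up by name, and forward either a GUI event or a recognized phrase to it. Both paths end in the same object, which is the property the abstract describes.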

Parallel Abstract


The growing elderly population is raising demand for medical care, social welfare, and a range of public services. From a technological standpoint, helping the elderly live healthy, safe, and comfortable lives with the assistance of intelligent service robots is an important issue. In the 21st century, intelligent robotics is a high-priority industry for development. Human-robot interaction (HRI) plays an important role in intelligent robotics, and this thesis focuses on how robots interact with speech and sound. It covers three main topics: (1) a speech-based human-robot interface, (2) a sound source localization system, and (3) a software framework, the "Socket-based Robot, User interface and Task framework" (SRUT), designed for product-oriented robot applications. In addition, the thesis proposes several robotic algorithms for localization, path planning, and obstacle avoidance. The speech interface is implemented with the Microsoft Speech API (SAPI), and the sound source localization system is based on time-domain cross-correlation and voice activity detection (VAD) algorithms. The SRUT framework is designed to reduce the complexity of software integration, especially for product-oriented robots that need friendly user interfaces. SRUT targets the Microsoft Windows environment and native C++. The applications demonstrated in this thesis are interactive human following, searching for sound-emitting targets, and indoor patrol. All systems, user interfaces, software frameworks, and applications proposed in this thesis are implemented in native C++ with the Win32 API. The experiments are conducted on RenQ, an intelligent mobile service robot developed by the Intelligent Robotics and Automation (IRA) Laboratory at National Taiwan University.
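As a rough illustration of time-domain cross-correlation combined with voice activity detection, the following C++ sketch estimates the delay between two microphone channels and converts it to a bearing. It is not the thesis implementation: the two-microphone far-field geometry, the energy-threshold detector, and the names `isActiveFrame`, `estimateDelaySamples`, and `bearingRadians` are all assumptions made for this example, as is the default speed of sound of 343 m/s.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Assumed VAD stand-in: a frame is "active" when its mean energy
// exceeds a threshold. The thesis may use a different detector.
bool isActiveFrame(const std::vector<double>& frame, double energyThreshold) {
    if (frame.empty()) return false;
    double energy = 0.0;
    for (double s : frame) energy += s * s;
    return energy / static_cast<double>(frame.size()) > energyThreshold;
}

// Returns the lag (in samples) that maximizes the cross-correlation
// r(lag) = sum_i left[i] * right[i + lag] over [-maxLag, +maxLag].
// A positive result means the right channel lags the left channel.
int estimateDelaySamples(const std::vector<double>& left,
                         const std::vector<double>& right,
                         int maxLag) {
    assert(left.size() == right.size());
    assert(maxLag >= 0);
    const int n = static_cast<int>(left.size());
    int bestLag = 0;
    double bestScore = -std::numeric_limits<double>::infinity();
    for (int lag = -maxLag; lag <= maxLag; ++lag) {
        double score = 0.0;
        for (int i = 0; i < n; ++i) {
            const int j = i + lag;
            if (j >= 0 && j < n) score += left[i] * right[j];
        }
        if (score > bestScore) {
            bestScore = score;
            bestLag = lag;
        }
    }
    return bestLag;
}

// Far-field bearing for one microphone pair: sin(theta) = c * tau / d,
// where tau is the delay in seconds and d the microphone spacing.
// The ratio is clamped to [-1, 1] so noisy delays cannot produce NaN.
double bearingRadians(int delaySamples, double sampleRateHz,
                      double micSpacingM, double speedOfSoundMps = 343.0) {
    const double tau = static_cast<double>(delaySamples) / sampleRateHz;
    double ratio = speedOfSoundMps * tau / micSpacingM;
    if (ratio > 1.0) ratio = 1.0;
    if (ratio < -1.0) ratio = -1.0;
    return std::asin(ratio);
}
```

A typical use would run `isActiveFrame` on each incoming frame first and call `estimateDelaySamples` only on active frames, so that silence and low-level noise do not produce spurious direction estimates. The sign convention of the bearing depends on how the microphones are mounted and is left to the caller.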

