
基於場景語義認知的意圖驅動導航居家服務型機器人

User Intent-driven Navigation of Home Service Robot based on Semantic Scene Cognition

Advisor: Li-Chen Fu (傅立成)
This thesis will be available for download on 2025/08/25.

Abstract


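The world is facing declining birth rates and an aging population. Labor shortages and long-term care for the elderly create direct demand for service robots, which in turn raises many issues such as home assistance and caregiving. For a robot to adapt to different settings and provide a variety of services, its capabilities for autonomous environment cognition, reasoning and decision-making, and navigation are especially important. A service robot should also understand human language so that it can make appropriate movement decisions during human-robot interaction. At the same time, such robots usually carry limited computing resources and sensors, so the overall system architecture must balance accuracy against computational efficiency.

In this study, we propose a reasoning-based navigation architecture built on an autonomously acquired cognitive map of the environment, using the most common sensors, a 2D laser and an RGB-D camera, for environment recognition. When the robot enters a new environment, it uses autonomous exploration and environment recognition to learn the scene layout of the indoor space. Collisions must be avoided during autonomous exploration, so in addition to scanning the planar environment with the 2D laser, we fuse the depth camera point cloud directly in front of the robot to identify the true boundaries of obstacles. A hierarchical path planning algorithm takes the fused point cloud information into account in the local planner's decisions, so that the robot can also avoid hollow furniture such as tables and chairs. Along the exploration path we continuously construct topological nodes, and after exploration ends we use a scene semantic grid map to assign each node its most representative scene semantic label.

Meanwhile, we use natural language datasets to build a semantic knowledge graph of the objects commonly found in scenes and of the functions those scenes serve. Once the robot has completed autonomous environment cognition, it can converse with the user, use sentence embedding vectors to compare the human intent with the entities in the knowledge graph, and infer the location of the target scene in the cognitive map. Finally, using the topological map annotated with scene semantic information and the A* path planning algorithm, it plans the shortest topological node path to the target scene in order to provide the service.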
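The intent-matching step described in this abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the three-dimensional vectors stand in for real sentence embeddings, and the entity names, the entity-to-scene table, and the function names are all hypothetical. The sketch only shows the general idea of picking the knowledge-graph entity whose embedding is closest to the embedding of the user's utterance, then reading off the scene linked to that entity.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def infer_target_scene(intent_vec, entity_vecs, entity_to_scene):
    """Return (scene, entity) for the entity most similar to the intent."""
    best_entity = max(
        entity_vecs,
        key=lambda name: cosine_similarity(intent_vec, entity_vecs[name]),
    )
    return entity_to_scene[best_entity], best_entity


# Toy data (assumption): a real system would obtain these vectors
# from a sentence embedding model rather than hard-coding them.
ENTITY_VECS = {
    "refrigerator": [0.9, 0.1, 0.0],
    "sofa": [0.1, 0.9, 0.2],
    "bed": [0.0, 0.2, 0.9],
}
ENTITY_TO_SCENE = {
    "refrigerator": "kitchen",
    "sofa": "living room",
    "bed": "bedroom",
}

# Stand-in for the embedding of an utterance such as "I am thirsty".
intent_vec = [0.8, 0.2, 0.1]
scene, entity = infer_target_scene(intent_vec, ENTITY_VECS, ENTITY_TO_SCENE)
print(scene, entity)
```

With these toy vectors the intent is closest to the "refrigerator" entity, so the inferred target scene is the kitchen.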

Parallel Abstract


The world is facing declining birth rates and an aging population. The shortage of caregivers and the growing need for long-term care of the elderly have created direct demand for service robots, and have raised many issues such as home assistance and care. To enable robots to adapt to different environments and provide various services, their abilities in autonomous environment cognition, reasoning, decision-making, and navigation are particularly important. In addition, a service robot should be able to understand human speech so that it can make appropriate decisions during human-robot interaction. On the other hand, this type of robot usually carries only limited computing resources and common sensors, so the overall system architecture must balance the accuracy and the computing efficiency of inference. In this research, we propose a user-intent-driven navigation architecture based on a cognitive map of the indoor scene, and we use an RGB-D camera for scene recognition. When the robot enters a new indoor environment, it autonomously explores the environment and recognizes the scene configuration while avoiding collisions with obstacles. For collision avoidance, the 2D laser sensor scans for obstacles on a fixed plane slightly above the ground, and we additionally integrate the depth camera point cloud in front of the robot to reconstruct the real boundaries of objects. We then design a hierarchical path planning framework in which the depth point cloud information is used in the local planner, enabling the robot to safely avoid tables, chairs, and other furniture with a hollow bottom and thin legs. Along the exploration trajectory, we continuously construct topological nodes and build a topological map.
After the exploration, we use the grid map and scene recognition to assign the most representative scene label to each node on the topological map. Besides collecting the geometric information and associated semantic meanings of the environment, we also use natural language datasets to construct a semantic knowledge graph that aggregates object information and the major functions of different scenes. Once the robot has acquired sufficient knowledge of the physical environment, it can infer the target scene and its location by examining the correlation between the human intent and the entities in the knowledge graph. Finally, given the inferred target scene, the robot identifies the relevant semantic topological nodes on the cognitive map and applies the A* path planning algorithm to find the shortest topological node path to the target scene, so that it can carry out the intended service.
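The final planning step, A* search over a topological map whose nodes carry scene labels, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the thesis implementation: the node coordinates, scene labels, and adjacency lists are invented toy data, edge costs are taken to be straight-line distances between nodes, and the function name `astar_to_scene` is hypothetical. Any node labeled with the target scene counts as a goal, and the heuristic is the straight-line distance to the nearest such node.

```python
import heapq
import math


def astar_to_scene(nodes, edges, start, target_scene):
    """A* over a labeled topological graph.

    nodes: {node_id: (x, y, scene_label)}
    edges: {node_id: [neighbor_id, ...]}
    Returns the node path from start to the cheapest-to-reach node
    labeled target_scene, or None if no such node is reachable.
    """
    goals = [n for n, (_, _, label) in nodes.items() if label == target_scene]
    if not goals:
        return None

    def dist(a, b):
        return math.hypot(nodes[a][0] - nodes[b][0], nodes[a][1] - nodes[b][1])

    def heuristic(n):
        # Straight-line distance to the closest goal node (admissible
        # because edge costs are straight-line distances too).
        return min(dist(n, g) for g in goals)

    open_heap = [(heuristic(start), 0.0, start)]
    best_cost = {start: 0.0}
    parent = {start: None}
    while open_heap:
        _, cost, current = heapq.heappop(open_heap)
        if cost > best_cost[current]:
            continue  # stale heap entry; a cheaper route was found later
        if nodes[current][2] == target_scene:
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        for nxt in edges.get(current, []):
            new_cost = cost + dist(current, nxt)
            if new_cost < best_cost.get(nxt, math.inf):
                best_cost[nxt] = new_cost
                parent[nxt] = current
                heapq.heappush(open_heap, (new_cost + heuristic(nxt), new_cost, nxt))
    return None


# Toy topological map (assumption): coordinates in meters, invented labels.
NODES = {
    "n0": (0.0, 0.0, "living room"),
    "n1": (2.0, 0.0, "corridor"),
    "n2": (2.0, 2.0, "kitchen"),
    "n3": (0.0, 3.0, "bedroom"),
    "n4": (4.0, 0.0, "corridor"),
}
EDGES = {
    "n0": ["n1", "n3"],
    "n1": ["n0", "n2", "n4"],
    "n2": ["n1"],
    "n3": ["n0"],
    "n4": ["n1"],
}

path = astar_to_scene(NODES, EDGES, "n0", "kitchen")
print(path)
```

On this toy map the planner goes from the living room node through the corridor node to the kitchen node. A request for a scene label that no node carries returns `None`, which a caller could treat as "target scene not found in the cognitive map".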
