  • Thesis


Algorithm and Implementation of Vision-Based 3D Mouse System

Advisor: Liang-Gee Chen (陳良基)


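With the development of 3D displays and 3D digital content, consumer electronics are about to move from the traditional 2D plane into the 3D era. 3D consumer products already on the market, such as stereoscopic screens, 3D movies, mobile phones with stereoscopic displays, and 3D game consoles, give consumers a more vivid, intuitive, and enjoyable experience. The shift of personal electronics to 3D is an irreversible trend: in Vista, the operating system Microsoft released this year, stacked folders are presented with a 3D concept to give users a more intuitive and convenient experience. As 3D display technology matures, and given the convenience and realism that 3D presentation offers, we expect 3D personal computers to enter everyday life in the near future. A 3D computer equipped with a 3D display and operating system also needs a 3D mouse for 3D interaction. Unlike a traditional desktop mouse, which supports only 2D interaction, a 3D mouse must report motion in three dimensions; that is, it adds the dimension perpendicular to the screen.

Existing 3D mouse systems (3D pointing devices) are mostly used in specialized settings such as medicine or industrial modeling, and some academic work has proposed designs for general personal use. They fall broadly into three categories: hand-held, desktop, and mechanical-arm. These devices can detect displacement in three dimensions, and some can also detect acceleration and 3D rotation. A common problem, however, is that they are not comfortable or intuitive for ordinary users, and some systems suffer from long setup and calibration times, bulky size, or poor precision.

The most distinctive feature of Nintendo's new game console, the Wii, is its controller. It is rod-shaped, much like a TV remote control, can be operated with one hand, and provides 2D pointing and 3D acceleration sensing. It can be regarded as a kind of 3D mouse that gives consumers an enjoyable experience. However, the Wii controller must be used with a sensor bar fitted with an infrared sensing array, placed above or below the TV screen. The sensor bar limits the user's operating range: once the controller points outside the range the sensor can detect, it stops working.

We aim to implement a 3D mouse that solves the problems described above. Its features include an unlimited sensing range (no sensor bar), simple and convenient operation, a small and light body, and the ability to detect the mouse's motion vector in three dimensions. To achieve this, we embed a webcam in the mouse body and use our digital signal processing unit to analyze the frames it captures in real time and derive the mouse's instantaneous 3D motion vector. The mouse works as long as the webcam captures images that are bright enough and suitable for analysis. Users can therefore hold the mouse in whatever posture they prefer and operate it in any direction, which makes for an enjoyable experience.

The core algorithm of the system detects the camera's 3D translation vector from two adjacent captured frames. The novelty of the proposed algorithm is that it avoids problems that traditional algorithms find hard to solve: it applies global motion estimation to the dominant plane in the frame to obtain an accurate planar transformation, and performs feature vector extraction and feature vector detection only for feature points that do not lie on that plane. Because the set of candidate feature points is reduced, and only a few feature vectors are needed to obtain the key information from which translation and rotation vectors can be computed, the algorithm is more accurate on real images than traditional algorithms and has lower computational cost.

We verify the correctness and usability of the algorithm with multiple sets of experimental data. When the captured frame contains a dominant plane and a few objects at depths different from that plane, the algorithm achieves good accuracy (98%~99%). Future work mainly includes:
(1) handling captured images that contain independently moving objects (the current system assumes all captured images are static);
(2) further deriving the camera's rotation about the three axes (the current system recovers only the 3D unit translation vector);
(3) generating a virtual plane from suitable features for global motion estimation when the frame contains no dominant plane.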


The rapid growth of 3D digital content and 3D display devices is driving the development of the 3D industry. As the traditional 2D interface becomes less able to meet consumer demand, the market for 3D applications looks promising. For example, Vista, which uses 3D-style windows, and the Wii, which provides semi-3D motion sensing, have both been well received by today's consumers. Besides enjoying 3D content on a 3D display, users may want to interact in 3D with entertainment systems. We assume that 3D computers will enter daily life in the near future and drive demand for a 3D mouse, a device that lets users interact with a computer system or other applications in one more dimension than a conventional 2D mouse.

Existing 3D mouse devices can be broadly divided into three categories: hand-held, desktop, and mechanical-arm. These solutions have problems such as inconvenient operation and a limited active sensing range. The central idea of our proposed 3D mouse system is to give the mouse an unlimited active sensing range. This allows users to interact with the host system from any place and in any direction, which makes operation more convenient, and to perform arbitrarily large movements, which is not possible with a 3D mouse of limited sensing range such as the Wii Remote. To achieve the unlimited active sensing range, we embed the sensing device, a webcam, in the mouse and use the captured video frames as cues to detect the translational displacement of the mouse.

Camera motion estimation has been studied for decades. Some prior work uses correspondences between image features to recover the camera motion, while another class of methods uses optical flow. Both run into the problem of global feature matching, for which high accuracy is usually hard to achieve. In our proposed algorithm, we first perform global motion estimation (GME) and compensation between two captured frames, and then perform local feature matching on the mismatched regions to detect the epipolar lines and derive the epipole, from which the translational motion parameters can be obtained. Since only local feature matching is required, the algorithm achieves a very high accuracy rate (98%~99%) when the scene contains a dominant plane (for GME) and a few foreground objects (for feature matching). Future work includes removing moving objects, calculating 3D rotational parameters, and generating a virtual plane when no dominant plane exists.
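The last step described above, deriving the epipole from the detected epipolar lines and converting it into a translation direction, can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes that GME has already been applied, so each off-plane feature gives a pair of points (its plane-compensated position and its actual matched position) that lie on one epipolar line; that the epipole is finite; and that the camera intrinsics are known. The function names, the least-squares line intersection, and the parameters `fx`, `fy`, `cx`, `cy` are assumptions made for this example.

```python
import math

def line_through(p, q):
    """Return (a, b, c), normalized so a^2 + b^2 = 1, with a*x + b*y + c = 0
    passing through image points p and q."""
    (x1, y1), (x2, y2) = p, q
    a, b = y1 - y2, x2 - x1
    n = math.hypot(a, b)
    if n == 0.0:
        raise ValueError("points coincide; no parallax line is defined")
    c = -(a * x1 + b * y1)
    return a / n, b / n, c / n

def estimate_epipole(compensated_pts, matched_pts):
    """Least-squares intersection of the lines joining each plane-compensated
    point to its matched point (hypothetical helper, finite epipole assumed)."""
    saa = sab = sbb = sac = sbc = 0.0
    for p, q in zip(compensated_pts, matched_pts):
        a, b, c = line_through(p, q)
        saa += a * a
        sab += a * b
        sbb += b * b
        sac += a * c
        sbc += b * c
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("lines are nearly parallel; epipole is at infinity")
    # Solve the 2x2 normal equations [[saa, sab], [sab, sbb]] [x, y]^T = [-sac, -sbc]^T.
    x = (-sac * sbb + sab * sbc) / det
    y = (-saa * sbc + sab * sac) / det
    return x, y

def translation_direction(epipole, fx, fy, cx, cy):
    """Back-project the epipole through assumed pinhole intrinsics and
    normalize. The result is a unit vector, defined only up to sign."""
    ex, ey = epipole
    v = ((ex - cx) / fx, (ey - cy) / fy, 1.0)
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

if __name__ == "__main__":
    # Synthetic parallax vectors that all point away from (400, 300).
    compensated = [(100.0, 100.0), (500.0, 50.0), (200.0, 400.0)]
    matched = [(70.0, 80.0), (510.0, 25.0), (180.0, 410.0)]
    e = estimate_epipole(compensated, matched)
    print(e, translation_direction(e, 800.0, 800.0, 320.0, 240.0))
```

Only the direction of translation is recovered this way, which matches the statement that the current system yields a unit translation vector; the metric scale and the sign need additional information.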


3D mouse, 3D pointer, 3D controller, camera motion, ego motion




