
A Multi-Mode Vision Sensor with Temporal Contrast Pixel and Column-Parallel Local Binary Pattern Extraction for Dynamic Depth Sensing Using Stereo Vision

Advisor: 謝志成
Full text will be available for download from 2026/12/13.

Abstract


This thesis presents a frame-based motion detection (MD) vision sensor built on a newly proposed temporal contrast pixel (TCP) structure and an exposure compensation scheme (ECS). Using global shutter and frame-difference pulse-width-modulation (PWM) operation, the pixel performs in-pixel temporal contrast calculation and event reporting with only 6 transistors and 1 capacitor (6T1C), making it the simplest in-pixel frame-difference architecture reported to date. Column-parallel local binary pattern (LBP) extraction provides spatial feature extraction without a prior analog-to-digital conversion, which saves a large amount of power. Combining frame difference with LBP yields temporal-spatial feature information. For obstacle detection and avoidance, this temporal-spatial information can be used to compute dynamic stereo vision; this is the first proposed method that computes the depth of moving objects while filtering out the depth of static objects. Finally, region of interest (ROI) extraction is implemented on-chip for data windowing and localization. The ROI extraction supports not only raw image (IM) mode but also frame difference (FD) mode and event report (ER) mode, so that motion regions can be located.

A 0.56 V/0.8 V multi-mode vision sensor with a 126 x 126 6T1C TCP PWM array has been fabricated in a 0.18 µm 1P6M standard CMOS process with a chip area of 2.35 x 3.19 mm². The chip supports five main operation modes: 10-bit IM mode, 10-bit FD mode, 1.5-bit ER mode, 8-bit LBP mode, and ROI mode. All operation modes can be combined with one another to support complex application scenarios. Measurement results show maximum frame rates of 540/819/540 fps in IM/ER/LBP mode, with power consumption of 390/162.6/151.9 µW and a normalized figure of merit (iFoM) of 45.5/12.5/17.7 pJ/pixel∙frame, respectively.
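
The event reporting, LBP extraction, and figure of merit described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the chip's circuit behaviour: the function names, the threshold value, the `>=` comparison, and the clockwise neighbour ordering are all hypothetical choices made for illustration. The 1.5-bit ER output is modelled here as three levels (-1, 0, +1), and the energy figure is assumed to be power divided by frame rate times pixel count, a definition that is consistent with the reported numbers.

```python
# Software sketch only: the chip performs these steps in mixed-signal hardware.
# Function names, threshold, and neighbour ordering are illustrative assumptions.

def ternary_events(prev, curr, threshold):
    """Map each pixel's frame difference to -1, 0 or +1 (three levels)."""
    events = []
    for row_prev, row_curr in zip(prev, curr):
        row = []
        for p, c in zip(row_prev, row_curr):
            diff = c - p
            if diff > threshold:
                row.append(1)       # brightness increased
            elif diff < -threshold:
                row.append(-1)      # brightness decreased
            else:
                row.append(0)       # no significant change
        events.append(row)
    return events

# Clockwise neighbour offsets starting at the top-left (assumed ordering).
_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp8(img, r, c):
    """8-bit local binary pattern of the interior pixel at (r, c)."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(_OFFSETS):
        # Set the bit when the neighbour is at least as bright as the centre.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def energy_per_pixel_frame_pj(power_uw, fps, rows=126, cols=126):
    """Power / (frame rate x pixel count), expressed in pJ per pixel per frame."""
    return power_uw * 1e-6 / (fps * rows * cols) * 1e12

prev = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
curr = [[10, 30, 10], [10, 10, 10], [0, 10, 10]]
events = ternary_events(prev, curr, threshold=5)
print(events)               # [[0, 1, 0], [0, 0, 0], [-1, 0, 0]]
print(lbp8(curr, 1, 1))     # 191

# IM / ER / LBP modes: (power in µW, frame rate in fps) from the abstract.
for power_uw, fps in [(390, 540), (162.6, 819), (151.9, 540)]:
    print(round(energy_per_pixel_frame_pj(power_uw, fps), 1))  # 45.5, 12.5, 17.7
```

The last loop reproduces the 45.5/12.5/17.7 pJ/pixel∙frame values from the measured power and frame rate for the 126 x 126 array.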

