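Graduate student: Chen, Ching-Yun (陳靖允)
Thesis title: CNN-Based Virtual Fence System with PTZ Camera and LiDAR (結合PTZ攝影機與光學雷達之CNN虛擬圍籬系統)
Advisors: Chen, Sei-Wang (陳世旺); Fang, Chiung-Yao
Degree: Master's
Department: Department of Computer Science and Information Engineering (資訊工程學系)
Year of publication: 2018
Academic year of graduation: 106 (ROC calendar)
Language: Chinese
Number of pages: 96
Chinese keywords: 虛擬圍籬系統 (virtual fence system), 卷積神經網路 (convolutional neural network), 光學雷達 (LiDAR), PTZ攝影機 (PTZ camera), 連續影像相減法 (temporal differencing method), 形態學 (morphology), 資料集前處理 (dataset preprocessing)
English keywords: virtual fence system, temporal differencing method, dataset preprocessing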
DOI URL: http://doi.org/10.6345/THE.NTNU.DCSIE.028.2018.B02
Thesis type: Academic thesis
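  • This study develops a CNN (Convolutional Neural Network) virtual fence system that combines a PTZ (Pan-Tilt-Zoom) camera with a LiDAR (Light Detection and Ranging) sensor.
    Unlike traditional means of isolation, a virtual fence does not require building a physical wall or guardrail. Instead, it combines electronic devices and software to establish a virtual line of defense that is imperceptible to the human eye. A virtual fence has the following advantages: (A) little human involvement, with round-the-clock, wide-area surveillance; (B) mobility and extensibility; (C) no damage to the original landscape; (D) real-time alerts that can feed into follow-up handling. In practice, however, virtual fences have not yet gained general acceptance, because of factors such as excessively high false alarm rates and slow processing and notification.
    For a virtual fence system, an intruding object carries motion information that changes consecutive frames, so the system can find the intruding object and its bounding box with a moving object localization method. It does not need to work as a traditional object detection system does, taking a single static image as input, generating and scoring every possible object bounding box, and wasting resources and time on detecting and recognizing irrelevant background objects. The moving object localization method adopted in this study is temporal differencing over three consecutive frames, which uses the very fast bitwise_and function to take the intersection of the difference images and so obtains a more accurate moving foreground and bounding box. In addition, the binarized foreground image, after its holes are filled by dynamic morphological processing, can serve as a mask that is combined with the original image or the bounding box image to achieve a rough background removal (matting) effect. The Matting & Rescaled-Grey version of the dataset also achieves a high mAP (95.3%) with VGG-16.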
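    The masking step described above can be illustrated with a minimal sketch. It assumes the binary foreground mask has already been computed and hole-filled; the function name `matte` and the grey fill value 128 are illustrative choices, not taken from the thesis.

```python
import numpy as np

def matte(image, mask, fill=0):
    """Keep pixels where `mask` is nonzero; paint the rest with a flat fill value.

    image: H x W x 3 colour image; mask: H x W binary foreground mask.
    fill=0 gives a black background; fill=128 is an assumed grey level.
    """
    out = np.full_like(image, fill)
    keep = mask > 0
    out[keep] = image[keep]  # copy foreground pixels only
    return out

# Tiny 2 x 2 example: the diagonal pixels are foreground.
image = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)
mask = np.array([[255, 0], [0, 255]], dtype=np.uint8)
grey = matte(image, mask, fill=128)
```

    Here `grey` keeps the two diagonal pixels of `image` and replaces the other two with grey.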

    This study proposes a CNN (Convolutional Neural Network)-based virtual fence system equipped with a LiDAR (Light Detection and Ranging) sensor and a PTZ (Pan-Tilt-Zoom) camera. The proposed system detects and classifies invaders with high mAP (mean average precision) and short processing time.
    A virtual fence, as opposed to a real physical fence, plays an important role in intelligent surveillance systems. It requires fewer human resources to build, has no physical impact on the surrounding area, and is easily extendable and portable. However, due to high false alarm rates and high time complexity, it is still challenging for a virtual fence system to provide satisfactory performance.
    The virtual fence system proposed in this study improves both the detection rate and the speed. First, a LiDAR sensor is used to detect invaders. Once an invader is sensed, the sensor triggers the PTZ camera. LiDAR's tolerance of variations in weather and lighting enhances the robustness of the system. In addition, since small objects in images easily cause detection and classification errors, the object distance measured by the LiDAR sensor is passed to the PTZ camera to control its zoom-in and zoom-out operations, ensuring that objects appear at a proper size in the images.
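    One way to picture the distance-to-zoom step is a pinhole-camera sketch: an object of height H at distance d appears about f·H/d pixels tall, so the zoom factor needed to reach a target pixel height grows linearly with distance. The function below is only an illustration of that idea. Its name and every default (object height, target pixel height, focal length at 1x, maximum zoom) are hypothetical values, not parameters reported in the thesis.

```python
def zoom_for_distance(distance_m, object_height_m=1.7, target_px=200.0,
                      focal_px_at_1x=1000.0, max_zoom=30.0):
    """Zoom factor that makes an object at `distance_m` appear ~`target_px` tall.

    Assumes a pinhole model in which zooming scales the focal length linearly.
    All default values are illustrative assumptions.
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    zoom = target_px * distance_m / (focal_px_at_1x * object_height_m)
    # Clamp to the range the (hypothetical) camera supports.
    return min(max(zoom, 1.0), max_zoom)

near = zoom_for_distance(1.0)     # clamped to the minimum zoom
mid = zoom_for_distance(17.0)
far = zoom_for_distance(1000.0)   # clamped to the maximum zoom
```

    With these assumed defaults, a person 17 m away would call for roughly 2x zoom, while very near and very far targets hit the clamp limits.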
    Then, a three-frame temporal differencing algorithm is applied to locate the moving objects in video frames. Through a bitwise-and operation and dynamic morphological processing applied to the difference frames, the contours and bounding boxes of the moving objects can be determined quickly. Compared with existing object detection systems, such as the R-CNN and YOLO series, which generate and evaluate many bounding boxes at multiple locations and scales, the proposed object localization method is less complicated. Moreover, those object detection systems try to locate and classify all the objects appearing in an image, while a virtual fence system is only interested in detecting invading moving objects. Thus, the proposed moving object localization method avoids unnecessary processing of irrelevant background objects.
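    The three-frame differencing idea can be sketched in a few lines of NumPy. This is a simplified illustration, not the thesis implementation: `np.bitwise_and` stands in for the bitwise-and function mentioned in the abstract, the threshold of 25 is an assumed value, the morphological processing is omitted, and a single box is drawn around all foreground pixels instead of one box per contour.

```python
import numpy as np

def moving_foreground(prev, curr, nxt, thresh=25):
    """Binary mask of pixels that changed in both consecutive frame pairs."""
    # Use a signed type so the subtraction cannot wrap around.
    p, c, n = (f.astype(np.int16) for f in (prev, curr, nxt))
    d1 = (np.abs(c - p) > thresh).astype(np.uint8) * 255  # prev -> curr change
    d2 = (np.abs(n - c) > thresh).astype(np.uint8) * 255  # curr -> next change
    # Intersection of the two difference masks.
    return np.bitwise_and(d1, d2)

def bounding_box(mask):
    """Tight (x, y, w, h) box around all foreground pixels, or None if empty."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    x0, x1 = xs.min(), xs.max()
    y0, y1 = ys.min(), ys.max()
    return int(x0), int(y0), int(x1 - x0 + 1), int(y1 - y0 + 1)

def frame_with_square(x, size=20):
    """Synthetic greyscale frame with a bright 4 x 4 square at column x."""
    frame = np.zeros((size, size), dtype=np.uint8)
    frame[8:12, x:x + 4] = 200
    return frame

mask = moving_foreground(frame_with_square(2), frame_with_square(7),
                         frame_with_square(12))
box = bounding_box(mask)
print(box)  # (7, 8, 4, 4)
```

    In this synthetic example each pairwise difference contains the square at two positions, and the intersection keeps only its position in the middle frame, which is why the box matches the object in the current frame.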
    Finally, a CNN is used to classify the objects in the bounding box images into 3 classes, namely pedestrian, animal, and others. The CNN architectures evaluated in this study are VGG-16 and Darknet-19 (the CNN architecture used in YOLOv2). Different training modes and dataset preprocessing methods for the CNN are investigated. For training modes, the VGG-16 experiments demonstrate that starting from ImageNet-pretrained parameters and fine-tuning on bounding box datasets achieves the best performance. For dataset preprocessing, there are 4 main types, namely Original, Rescaled (isotropically rescaling an image onto a predefined fixed-size black or grey underlay), Matting (coloring the background black or grey), and Rescaled&Matting. Experimental results indicate that using Rescaled preprocessing for both the training and testing datasets outperforms the other combinations. VGG-16 with ImageNet-pretrained parameters, fine-tuned on a bounding box dataset with Rescaled-Grey preprocessing, achieves 96.3% mAP.
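    The Rescaled preprocessing can be sketched as follows. This is an illustration under stated assumptions: the 224-pixel canvas (the usual VGG-16 input size), the grey level 128, the centred placement, and the nearest-neighbour resampling are all choices made here to keep the example dependency-free, not details given in the abstract.

```python
import numpy as np

def rescale_onto_underlay(image, size=224, fill=128):
    """Isotropically resize `image` and centre it on a size x size flat canvas."""
    h, w = image.shape[:2]
    scale = size / max(h, w)  # same factor on both axes keeps the aspect ratio
    new_h = min(size, max(1, round(h * scale)))
    new_w = min(size, max(1, round(w * scale)))
    # Nearest-neighbour sampling: map each output row/column to a source index.
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    resized = image[rows][:, cols]
    # Black underlay: fill=0; grey underlay: fill=128 (assumed value).
    canvas = np.full((size, size) + image.shape[2:], fill, dtype=image.dtype)
    top = (size - new_h) // 2
    left = (size - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas

# A tall 100 x 50 crop ends up 224 x 112, centred between two grey bands.
crop = np.full((100, 50, 3), 10, dtype=np.uint8)
canvas = rescale_onto_underlay(crop)
print(canvas.shape)  # (224, 224, 3)
```

    Padding instead of stretching keeps the object's proportions unchanged, which is the point of rescaling isotropically.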
    The integration test of the proposed virtual fence system demonstrates that the best-performing configuration described above achieves an mAP higher than 95%, and that the average processing time from LiDAR detection to the end of CNN classification is less than 0.2 seconds. The experimental results show that the proposed system is fast, accurate, stable, and of practical use.

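    Table of Contents; List of Figures; List of Tables; Abstract (Chinese); Abstract; Acknowledgments
    Chapter 1 Introduction: Section 1 Research Motivation; Section 2 Research Difficulties; Section 3 Research Contributions; Section 4 Thesis Organization
    Chapter 2 Literature Review: Section 1 Related Work on Virtual Fences; Section 2 Related Work on Moving Object Localization; Section 3 Related Work on CNNs and CNN-Based Object Detection; Section 4 PTZ Cameras and LiDAR
    Chapter 3 Virtual Fence System: Section 1 System Objectives; Section 2 Research Environment and Equipment; Section 3 System Workflow
    Chapter 4 Moving Object Localization: Section 1 Background Subtraction; Section 2 Temporal Differencing; Section 3 Comparative Analysis; Section 4 Results of Temporal Differencing on Consecutive Frames with Longer Intervals
    Chapter 5 CNN Architectures: Section 1 Introduction to CNNs; Section 2 VGG-16; Section 3 Darknet-19
    Chapter 6 Experimental Results: Section 1 Experimental Environment; Section 2 Object Classification Experiments with Darknet-19 and VGG-16; Section 3 Effect of Lens Zoom on Object Detection Results; Section 4 Integration Test
    Chapter 7 Conclusions and Future Work: Section 1 Conclusions; Section 2 Future Work
    References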

