
Author: Chou, Wen-Yen (周文姸)
Thesis title: Sensor-Based Gesture Detection Using Bidirectional LSTM with Self-Attention and Conditional Random Field
Advisor: Hwang, Wen-Jyi (黃文吉)
Degree: Master
Department: Department of Computer Science and Information Engineering
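Year of publication: 2021
Academic year of graduation: 109 (ROC calendar)
Language: English
Number of pages: 41
Keywords (Chinese): 自注意力機制 (self-attention mechanism), 條件式隨機場 (conditional random field)
Keywords (English): self-attention, conditional random field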
DOI URL: http://doi.org/10.6345/NTNU202100338
Thesis type: Academic thesis
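  • The purpose of this thesis is to present a novel gesture detection algorithm for sensor-generated data. The algorithm uses self-attention, bidirectional LSTM (Bi-LSTM), and a conditional random field (CRF). Self-attention lets the model focus on the important parts of the input signal. Bi-LSTM draws on information from both past and future time steps. Finally, the CRF corrects the Bi-LSTM output according to the behavior patterns expected of the predictions, producing the final detection results. We evaluate the algorithm experimentally using a sensor-equipped smart glove developed in our laboratory. The experimental results show that the algorithm not only detects gesture sequences in the signal effectively, but can also be combined with existing gesture classification algorithms to achieve sensor-based recognition of gesture sequences.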

    The goal of this thesis is to present a novel hand gesture detection algorithm for the sensory data produced by flex sensors. The algorithm combines self-attention operations, Bidirectional Long Short-Term Memory (Bi-LSTM), and a Conditional Random Field (CRF) for the effective detection of hand gestures. The self-attention operations highlight the portions of the input sensory data that are significant for detection. The Bi-LSTM further exploits the correlation among the input sequences in both directions. The CRF then explores the correlation between input samples and output labels, and among the output labels themselves, to produce the final detection results. A prototype smart glove equipped with flex sensors has been built to evaluate the proposed algorithm. Experimental results reveal that the proposed algorithm is able to detect gesture sequences in the sensory data. The proposed algorithm can also operate in conjunction with existing gesture classification algorithms for the accurate recognition of gesture sequences based on flex sensors.
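
The role the abstract assigns to the CRF, correcting per-sample Bi-LSTM outputs using the correlation among output labels, can be illustrated with standard Viterbi decoding over a linear-chain CRF. This is a minimal sketch and not the thesis implementation: the two labels (0 = background, 1 = gesture), the `EMISSIONS` scores standing in for Bi-LSTM outputs, and the `TRANSITIONS` scores are all made-up values chosen for illustration.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring label path for a linear-chain CRF.

    emissions[t][j]   : score of label j at time step t
    transitions[i][j] : score of moving from label i to label j
    """
    num_labels = len(transitions)
    # best[j] holds the best score of any path ending in label j so far.
    best = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for j in range(num_labels):
            # Pick the previous label that maximizes the path score into j.
            candidates = [best[i] + transitions[i][j] for i in range(num_labels)]
            i_best = max(range(num_labels), key=candidates.__getitem__)
            new_best.append(candidates[i_best] + emissions[t][j])
            pointers.append(i_best)
        best = new_best
        backpointers.append(pointers)
    # Trace the best path backwards from the best final label.
    path = [max(range(num_labels), key=best.__getitem__)]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical per-sample scores (assumed values, not thesis data).
# Step 2 is a noisy sample that slightly favors "background" mid-gesture.
EMISSIONS = [[2.0, 0.0], [0.0, 2.0], [0.6, 0.4], [0.0, 2.0], [2.0, 0.0]]
# Assumed transition scores: staying in a label is rewarded, switching is penalized.
TRANSITIONS = [[0.5, -0.5], [-0.5, 0.5]]

# Labeling each sample independently keeps the glitch at step 2.
framewise = [max(range(len(row)), key=row.__getitem__) for row in EMISSIONS]
# Decoding with transition scores smooths the glitch into one gesture segment.
decoded = viterbi_decode(EMISSIONS, TRANSITIONS)
print(framewise)  # [0, 1, 0, 1, 0]
print(decoded)    # [0, 1, 1, 1, 0]
```

In this toy case the independent per-sample labeling splits one gesture into two fragments, while the decoded path treats it as a single contiguous segment, because two extra label switches cost more than the small emission advantage at the noisy sample.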

    Chapter I. Introduction
      Section 1.1 Background and Motivation
      Section 1.2 Problem Definition
      Section 1.3 Organization of the Thesis
    Chapter II. Related Work
      Section 2.1 Fundamental Neural Network
      Section 2.2 Basic Structures
      Section 2.3 Advanced Structures
    Chapter III. The Proposed Algorithm
      Section 3.1 Overview
      Section 3.2 Self-Attention
      Section 3.3 Bi-LSTM
      Section 3.4 Conditional Random Field for Bi-LSTM
    Chapter IV. Experiments
      Section 4.1 Experimental Setup
      Section 4.2 Models for Ablation Study
      Section 4.3 Evaluation Criteria
      Section 4.4 Numerical Results and Comparisons
      Section 4.5 Visualization Results
    Chapter V. Conclusions
    Bibliography


    Electronic full text: public release delayed until 2026/02/28