
Author: 洪翊誠 (Hung, Yi-Cheng)
Thesis title: 以異質網路圖學習病況事件表示法進行死亡風險預測
Data Representation Learning from Heterogeneous Network of Medical Data for Mortality Prediction
Advisor: 柯佳伶 (Koh, Jia-Ling)
Oral defense committee: 吳宜鴻, 徐嘉連, 柯佳伶 (Koh, Jia-Ling)
Oral defense date: 2022/01/24
Degree: Master
Department: Department of Computer Science and Information Engineering
Publication year: 2022
Academic year of graduation: 110
Language: Chinese
Number of pages: 93
Keywords (Chinese): 異質網路圖 (heterogeneous network), 資料特徵表示法 (data feature representation), 死亡預測模型 (mortality prediction model)
Keywords (English): heterogeneous network, data representation, mortality prediction model
DOI URL: http://doi.org/10.6345/NTNU202200245
Thesis type: Academic thesis
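  • In recent years, having machines automatically learn feature representations of data has been shown to help improve the accuracy of prediction tasks. This thesis takes the different types of clinical data in electronic medical records and, based on the co-occurrence of clinical events within a specified time interval, builds a heterogeneous network of clinical events. Combined with different patterns for generating clinical event sequences, representations of the clinical events derived from instrument-measured data (chart events) are learned from the sampled event sequences. These representations are then used with an LSTM neural network architecture to predict mortality risk from the clinical data of the first 48 hours after a patient is admitted to the intensive care unit. The experiments compare how clinical event representations learned with homogeneous-feature and heterogeneous-feature walk-path extraction strategies differ in their effect on model prediction. On the tasks of in-hospital mortality prediction and short-term mortality prediction, preliminary results show that clinical event representations learned from heterogeneous-feature walk paths improve the prediction performance of both models.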

    In recent years, learning feature representations from data has been shown to improve the accuracy of prediction tasks. In this thesis, the various types of attributes in the electronic medical record (EMR), combined with their values, implicitly describe a patient's condition and are called clinical events. We construct a heterogeneous network of clinical events according to their co-occurrence in the same patient within a specified time interval. Event sequences are then sampled by walking along different meta-paths, and the representations of chart events are learned from the sampled sequences. The learned chart-event representations are fed into an LSTM neural network framework that predicts the mortality of ICU patients from the first 48 hours of their in-ICU EMR data. In the experiments, we compare the prediction effectiveness of the learned event representations when varying the time interval used to construct the heterogeneous network and when applying homogeneous or heterogeneous meta-path walks. Preliminary experimental results show that chart-event representations learned from heterogeneous meta-paths improve recall and AUROC on both in-hospital mortality prediction and short-term mortality prediction.
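
    The graph construction and meta-path sampling described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the toy records, the event codes, the fixed-width time buckets (`hour // window_hours`), and the function names `build_graph` and `metapath_walk` are all assumptions made for illustration. Two clinical events are linked when they occur for the same patient in the same time bucket, and a walk only steps to neighbours whose type matches the next entry of the meta-path.

```python
import random
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records: (patient_id, hour, event_type, event_code).
RECORDS = [
    ("p1", 0, "chart", "HR_high"),
    ("p1", 1, "lab", "lactate_high"),
    ("p1", 5, "chart", "SBP_low"),
    ("p1", 6, "lab", "lactate_high"),
    ("p2", 0, "chart", "HR_high"),
    ("p2", 2, "chart", "SBP_low"),
    ("p2", 3, "lab", "WBC_high"),
]


def build_graph(records, window_hours):
    """Link events that co-occur for one patient in the same time bucket.

    Assumption: the "specified time interval" is modelled as fixed-width,
    non-overlapping buckets; the thesis may define it differently.
    """
    buckets = defaultdict(set)
    for patient, hour, etype, code in records:
        buckets[(patient, hour // window_hours)].add((etype, code))
    graph = defaultdict(set)
    for nodes in buckets.values():
        for a, b in combinations(sorted(nodes), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def metapath_walk(graph, start, metapath, length, rng):
    """Random walk whose i-th node has type metapath[i % len(metapath)]."""
    walk = [start]
    while len(walk) < length:
        wanted = metapath[len(walk) % len(metapath)]
        candidates = sorted(n for n in graph.get(walk[-1], ()) if n[0] == wanted)
        if not candidates:
            break  # no neighbour of the required type, stop early
        walk.append(rng.choice(candidates))
    return walk


graph = build_graph(RECORDS, window_hours=4)
start = ("chart", "HR_high")
# Heterogeneous meta-path: alternate chart events and lab events.
hetero_walk = metapath_walk(graph, start, ("chart", "lab"), 5, random.Random(0))
# Homogeneous meta-path: visit chart events only.
homo_walk = metapath_walk(graph, start, ("chart",), 4, random.Random(0))
```

    Walks such as `hetero_walk` and `homo_walk` would then serve as the "sentences" from which event embeddings are learned, for example with a skip-gram style model; that choice of embedding model is likewise an assumption of this sketch.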

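    Chapter 1 Introduction
    1.1 Research Motivation
    1.2 Research Objectives
    1.3 Research Limitations and Scope
    1.4 Thesis Method
    1.5 Thesis Organization
    Chapter 2 Literature Review
    2.1 Representation Learning Methods
    2.1.1 Representation Learning for Text Data
    2.1.2 Representation Learning for Medical Codes
    2.1.3 Network Representation Learning Models
    2.1.4 Representation Learning for Medical Heterogeneous Networks
    2.2 Mortality Prediction Models
    Chapter 3 Problem Definition and System Architecture
    3.1 Prediction Task Definition
    3.2 System Workflow Architecture
    Chapter 4 Research Methods
    4.1 Data Preprocessing
    4.1.1 Missing Value Imputation
    4.1.2 Outlier Removal
    4.1.3 Clinical Event Encoding
    4.2 Feature Representation Learning
    4.2.1 Constructing the Heterogeneous Network of Clinical Events
    4.2.2 Learning Clinical Event Feature Representations
    4.3 Mortality Prediction Model
    4.3.1 Model Architecture
    4.3.2 Extended Prediction Model
    Chapter 5 Experimental Results and Discussion
    5.1 Experimental Dataset and Model Parameters
    5.2 Evaluation Metrics
    5.3 Analysis of Clinical Event Representation Learning
    5.3.1 Similarity Search Analysis of the Representations
    5.3.2 Clustering of Chart-Event Clinical Events
    5.4 Prediction Performance Evaluation
    5.4.1 Comparison of Data Preprocessing Effects
    5.4.2 Evaluation of Training Data Extraction Strategies for Clinical Event Representations
    5.4.3 Prediction Performance with Laboratory Test Result Clinical Events Added
    Chapter 6 Conclusion and Future Work
    References
    Appendix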

