
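Graduate Student: 楊千艎
Yang, Chien-Huang
Thesis Title: 基於高維度資料分解的空氣污染視覺化分析
Visual Analytic of Air Pollution Based on PARAFAC-Like Decomposition
Advisor: 王科植
Wang, Ko-Chih
Oral Defense Committee: 賀耀華
Ho, Yao-Hua
曾琬鈴
Tseng, Wan-Ling
王懌琪
Wang, Yi-Chi
王科植
Wang, Ko-Chih
Oral Defense Date: 2023/07/31
Degree: Master
Department: 資訊工程學系
Department of Computer Science and Information Engineering
Publication Year: 2023
Academic Year of Graduation: 111
Language: English
Number of Pages: 44
Chinese Keywords: 視覺化 (visualization), 資料分析 (data analysis), 資料探勘 (data mining), 資料分解 (data decomposition)
English Keywords: visualization, data analysis, pattern mining, data decomposition
Research Methods: experimental design, secondary data analysis
DOI URL: http://doi.org/10.6345/NTNU202301163
Thesis Type: Academic thesis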
  • Air pollution is a serious global environmental problem that harms human health and ecological balance. PM2.5, the subset of particulate matter with a diameter of 2.5 micrometers or less, has been linked to severe respiratory and cardiovascular problems, soil and water pollution, and ecosystem disruption. To better understand the sources and distribution of PM2.5, we apply a PARAFAC-like decomposition method to air quality data collected in Taiwan by AirBox devices. The method identifies the factors that contribute to high PM2.5 concentrations in a given area and time, which gives insight into PM2.5 distribution patterns. To support the analysis of these patterns, we propose an interactive multi-view visualization for exploring and understanding complex data sets. The approach aims to improve the understanding of air quality and to make complex data sets easier to analyze and interpret.
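The decomposition step named in the abstract can be illustrated with a plain CP/PARAFAC model fitted by alternating least squares. This is a minimal sketch under stated assumptions, not the thesis's PARAFAC-like method: the tensor layout (device × day × hour), the synthetic data, and the helper names `khatri_rao`, `unfold`, and `cp_als` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def khatri_rao(a, b):
    """Column-wise Kronecker product: (I x R), (J x R) -> (I*J x R)."""
    rank = a.shape[1]
    return (a[:, None, :] * b[None, :, :]).reshape(-1, rank)


def unfold(x, mode):
    """Matricize a tensor along `mode`, keeping the other modes in C order."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)


def cp_als(x, rank, n_iter=500, seed=0):
    """Fit a rank-`rank` CP/PARAFAC model to a 3-way tensor with ALS."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in x.shape]
    for _ in range(n_iter):
        for mode in range(3):
            # Hold the other two factor matrices fixed and solve the
            # least-squares problem for the factor of the current mode.
            first, second = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(first, second)
            gram = (first.T @ first) * (second.T @ second)
            factors[mode] = unfold(x, mode) @ kr @ np.linalg.pinv(gram)
    return factors


# Synthetic stand-in for a (device x day x hour) PM2.5 tensor of exact rank 2.
rng = np.random.default_rng(1)
true_factors = [rng.standard_normal((n, 2)) for n in (6, 5, 4)]
tensor = np.einsum("ir,jr,kr->ijk", *true_factors)

device_f, day_f, hour_f = cp_als(tensor, rank=2)
approx = np.einsum("ir,jr,kr->ijk", device_f, day_f, hour_f)
rel_error = np.linalg.norm(tensor - approx) / np.linalg.norm(tensor)
print(f"relative reconstruction error: {rel_error:.2e}")
```

In this sketch each column of `device_f`, `day_f`, and `hour_f` describes one component: which devices, which days, and which hours of the day it loads on. That is the kind of area-and-time factor the abstract refers to, although the thesis's own model may differ in structure and constraints.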

    Chinese Abstract
    English Abstract
    Dedication
    Acknowledgments
    List of Figures
    1. Introduction
    2. Related Work
      2.1 Air Pollution Analysis
      2.2 Data Decomposition
      2.3 Clustering
    3. Task
    4. Airbox Data and Data Analysis
      4.1 AirBox Data
      4.2 Tensor Decompose
    5. Visual Interface
      5.1 Map View
      5.2 Device t-SNE View
      5.3 Raw Data Comparison View
      5.4 Group Representation View
      5.5 Interactive
    6. Use Cases
      6.1 Find Similar Device from Map View and Device t-SNE View
      6.2 Detect Abnormal Occurrences from Group Representation View
    7. Conclusion
    Bibliography

