
Graduate student: 簡宏宇
Hung-Yu Chien
Thesis title: 基於關聯式規則在影響個股漲跌之財經新聞事件探勘之應用研究
A Study of Financial News and Stock Trading Mining Based on Association Rules
Advisor: 洪欽銘
Hong, Chin-Ming
Degree: Master
Department: Department of Industrial Education
Year of publication: 2005
Academic year of graduation: 93 (ROC calendar)
Language: Chinese
Number of pages: 54
Keywords (Chinese): 資料探勘, 關聯式規則, 資訊擷取, K-means群聚演算法
Keywords (English): Data Mining, Association Rules, Data Crawler, K-means Clustering Algorithm
Thesis type: Academic thesis
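  • This thesis proposes a research system covering both the news side and the numerical side of the stock market. It combines technologies from several areas: Internet concepts, information retrieval (Data Crawler), data analysis (Data Analyzer), a Chinese word segmentation system, the K-means clustering algorithm, and data mining. The goal is to find the implicit association rules between news events about an individual stock and that stock's trading, so as to provide information of reference value. The system retrieves the required information over the Internet and stores it in a database. A Chinese word segmentation system extracts the key items from the title of each news event in the database, and a similarity check on those key items filters out near-duplicate news events. After all price changes are normalized, the K-means clustering algorithm groups them into clusters. Association rules are then used to find the large itemsets within these clusters, and two criteria, support and confidence, determine which implicit association rules between a stock's news events and its trading are mined, giving users credible reference information for stock trading.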
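The normalization and clustering steps described above can be illustrated with a minimal sketch. The record does not say which normalization formula, how many clusters, or which initialization the thesis used, so min-max scaling, `k=3`, the evenly spaced initial centroids, and the sample price changes below are all assumptions made for illustration.

```python
def min_max_normalize(values):
    """Scale values linearly into [0, 1]; constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def kmeans_1d(points, k, max_iterations=100):
    """Plain Lloyd-style K-means on scalars. Returns (centroids, labels)."""
    ordered = sorted(points)
    n = len(ordered)
    # Assumed initialization: k points spread evenly across the sorted data.
    centroids = [ordered[(i * (n - 1)) // max(k - 1, 1)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(p - centroids[c])) for p in points]
        # Update step: mean of each cluster; an empty cluster keeps its old centroid.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Hypothetical daily percentage changes for one stock (made-up numbers).
changes = [-6.8, -6.1, -5.5, -0.4, 0.0, 0.3, 0.5, 5.9, 6.4, 6.9]
normalized = min_max_normalize(changes)
centroids, labels = kmeans_1d(normalized, k=3)
print(labels)
```

With this sample, the three clusters correspond to sharp falls, near-flat days, and sharp rises. Each cluster label can then be attached to the news events of the same day before rule mining.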

    This thesis presents a data mining system that combines stock news with stock trading data. The system is built from several technologies: the Internet, a data crawler, data mining, the K-means clustering algorithm, and association rules. Its aim is to find the hidden rules linking news items and trading. The system's data crawler agent captures information from the Internet and stores it in a database.
    Before the data are used, the system extracts the key items of each news title and filters out similar news items using a similarity threshold. The price fluctuations are then normalized and grouped into clusters by the K-means clustering algorithm. Association rules are discovered by finding the large itemsets in these clusters. Finally, the system provides credible reference information by reporting the hidden rules found in each cluster.
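As a rough illustration of how support and confidence select rules from large itemsets, the sketch below counts itemsets by brute-force enumeration rather than with the candidate pruning of Apriori, and it is not the thesis's implementation. The transactions, the keyword items, the `UP`/`DOWN` cluster labels, and both thresholds are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Count every itemset up to max_size; keep those whose support meets min_support."""
    n = len(transactions)
    counts = Counter()
    for transaction in transactions:
        for size in range(1, max_size + 1):
            for itemset in combinations(sorted(transaction), size):
                counts[frozenset(itemset)] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def derive_rules(frequent, min_confidence):
    """Build rules X -> Y with confidence = support(X and Y together) / support(X)."""
    found = []
    for itemset, support in frequent.items():
        if len(itemset) < 2:
            continue
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                antecedent = frozenset(antecedent)
                # Every subset of a frequent itemset is itself frequent, so this lookup succeeds.
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    found.append((antecedent, itemset - antecedent, support, confidence))
    return found


# Hypothetical transactions: news-title key items plus the price-change cluster of that day.
transactions = [
    {"earnings", "record", "UP"},
    {"earnings", "UP"},
    {"earnings", "lawsuit", "DOWN"},
    {"lawsuit", "DOWN"},
    {"recall", "DOWN"},
    {"earnings", "record", "UP"},
]
frequent = frequent_itemsets(transactions, min_support=0.3)
found = derive_rules(frequent, min_confidence=0.8)
for antecedent, consequent, support, confidence in found:
    print(sorted(antecedent), "->", sorted(consequent), round(support, 2), round(confidence, 2))
```

Raising `min_support` discards rare keyword combinations, while raising `min_confidence` keeps only rules whose antecedent is almost always followed by the consequent.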

    Chapter 1 Introduction
      1.1 Research Motivation and Background
      1.2 Research Objectives and Methods
      1.3 Research Scope and Limitations
      1.4 Research Procedure
    Chapter 2 Data Processing and Data Mining Theory
      2.1 Overview of Stock Market Investment
      2.2 Data Retrieval
      2.3 Data Processing
      2.4 Data Mining
    Chapter 3 System Architecture
      3.1 Information Retrieval Agent Unit
        3.1.1 Financial News Event Collection
        3.1.2 Stock Trading Information Collection
      3.2 Information Processing Agent Unit
        3.2.1 Integration of News and Trading Information
        3.2.2 Normalization of Price Changes
        3.2.3 News Event Classification
      3.3 Data Mining Agent Unit
        3.3.1 K-means Clustering Algorithm
        3.3.2 Association Rules
    Chapter 4 System Experiments and Analysis
      4.1 Data Integration
      4.2 Data Collection
      4.3 Experimental Objectives
      4.4 News Event Categorization
      4.5 Classification of Individual Stock Price Changes
      4.6 Data Mining and Analysis of Results
    Chapter 5 Research Results and Recommendations
      5.1 Research Results
      5.2 Research Recommendations

