| Graduate student | 郭瑞男 |
|---|---|
| Thesis title | 考慮密度限制之數值區間關聯規則探勘 (Mining Quantitative Association Rules with Density Constraint) |
| Advisor | 柯佳伶 (Koh, Jia-Ling) |
| Degree | Master (碩士) |
| Department | 資訊教育研究所 (Graduate Institute of Information and Computer Education) |
| Year of publication | 2003 |
| Academic year of graduation | 91 (ROC calendar) |
| Language | Chinese |
| Pages | 50 |
| Chinese keywords | 資料探勘, 關聯規則 |
| English keywords | data mining, association rule |
| Document type | Academic thesis |
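This thesis proposes a new method for mining quantitative association rules, called the PQAR (Partition-based Quantitative Association Rule mining) algorithm. It uses space partitioning to first mine the frequent interval itemsets that satisfy a relative density constraint, and then generates quantitative association rules from them. When mining frequent interval itemsets, PQAR applies a relative density constraint in addition to the minimum support threshold, so that it avoids finding intervals that meet the same support threshold but whose data are not concentrated. PQAR also uses space partitioning to mine the largest intervals that satisfy the requirements. This reduces the number of database scans, which greatly shortens execution time, and it leaves fewer intervals in the mining result, so the resulting quantitative association rules are concise and important. Experimental results show that PQAR achieves high accuracy when mining frequent interval itemsets under different support and relative density settings. Moreover, at the same accuracy, the proposed method runs more efficiently than the QAR algorithm.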
This thesis proposes a new approach to mining quantitative association rules, called the PQAR (Partition-based Quantitative Association Rules mining) algorithm. Using a space partitioning method, the approach finds all frequent interval itemsets that satisfy the minimum relative density requirement, and then produces the quantitative association rules from these interval itemsets. When mining frequent interval itemsets, the PQAR algorithm uses not only the minimum support as a filtering condition but also the minimum relative density, which prevents it from finding intervals in which the data distribution is sparse. In addition, because space partitioning is used to find the largest intervals that meet the threshold requirements, the number of qualified intervals is reduced, so the resulting rules are significant and concise. Furthermore, because the PQAR algorithm can reduce the number of database scans, its mining time is considerably shorter than that of previous approaches. The experimental results show that, on data sets tested under various support and relative density settings, the PQAR algorithm obtains results with high accuracy and recall in most cases. Moreover, at the same accuracy, the PQAR algorithm takes much less time than the QAR algorithm.
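To make the two filtering conditions concrete, the following is a minimal sketch of a support check combined with a relative density check on a single numeric attribute. It is not the PQAR algorithm itself: the abstract does not give the thesis's definition of relative density, so the definition used here (the density of the interval divided by the density of the full value range) is an assumption, and the function names and the `ages` sample data are hypothetical.

```python
from typing import Sequence


def support(values: Sequence[float], lo: float, hi: float) -> float:
    """Fraction of records whose value lies in the closed interval [lo, hi]."""
    inside = sum(1 for v in values if lo <= v <= hi)
    return inside / len(values)


def relative_density(values: Sequence[float], lo: float, hi: float) -> float:
    """ASSUMED definition, not taken from the thesis:
    (records per unit width inside [lo, hi]) / (records per unit width overall)."""
    full_width = max(values) - min(values)
    if hi <= lo or full_width <= 0:
        raise ValueError("interval and data range must have positive width")
    inside = sum(1 for v in values if lo <= v <= hi)
    interval_density = inside / (hi - lo)
    overall_density = len(values) / full_width
    return interval_density / overall_density


def is_frequent_and_dense(
    values: Sequence[float],
    lo: float,
    hi: float,
    min_support: float,
    min_relative_density: float,
) -> bool:
    """Keep an interval only if it passes both thresholds."""
    return (
        support(values, lo, hi) >= min_support
        and relative_density(values, lo, hi) >= min_relative_density
    )


# Hypothetical sample data: most values cluster in [1, 4], a few are spread out.
ages = [1, 2, 2, 3, 3, 3, 4, 10, 20, 30]

# Both intervals meet a 0.4 support threshold, but only the first is dense.
print(is_frequent_and_dense(ages, 1, 4, 0.4, 1.0))
print(is_frequent_and_dense(ages, 4, 30, 0.4, 1.0))
```

Under this assumed definition, the wide interval [4, 30] reaches the support threshold only because it covers most of the value range, which is the kind of sparse interval the density constraint is meant to filter out.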