研究生: |
邱奕智 I-Chih Chiu |
---|---|
論文名稱: |
有效率探勘社交標籤系統中前k名擴展查詢字集之研究 An Efficient Method for Mining Top-k Query Expansions on Social Tagging System |
指導教授: |
柯佳伶
Koh, Jia-Ling |
學位類別: |
碩士 Master |
系所名稱: |
資訊工程學系 Department of Computer Science and Information Engineering |
論文出版年: | 2012 |
畢業學年度: | 101 |
語文別: | 中文 |
論文頁數: | 75 |
中文關鍵詞: | 社交標籤系統 、前k名擴展查詢字集 、動態樹狀結構 |
英文關鍵詞: | social-tagging system, top-k query expansion, dynamic tree structure |
論文種類: | 學術論文 |
相關次數: | 點閱:228 下載:2 |
分享至: |
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報 |
本論文考慮資料物件具有標籤集且含有評分資訊的社交標籤系統,當給定一個查詢找出標籤集中包含所有查詢字的物件資料,本論文方法將對這些物件資料所包含的標籤進行處理,找出可用性分數值最高的前k名擴展查詢字集,且每一個擴展查詢字集都能找到指定數量以上的資料物件。本論文提出的方法分成從查詢結果挑選具代表標籤,以及有效率地探勘前k名擴展查詢字集兩部分。首先,我們運用兩種挑選具代表性標籤的方法:平均差異性及新穎性,計算標籤的代表分數,選定代表分數最高的n個標籤為代表性標籤。接下來,本論文採用一個稱為UT-tree的樹狀結構,用來儲存可形成擴展查詢字集的標籤集資訊。我們提出一個稱為UT-growth的演算法,可從UT-tree中有效率找出可用性最高的前 名擴展查詢字集,且不會產生過多不必要檢查的擴展查詢字集。此外,我們運用動態估算一個擴展查詢字集可用性分數的上限值和下限值的概念,提出一個動態建立UT-tree的方法,並提出dynamic UT-growth演算法,動態更新所找到可用性前 名的擴展查詢字集,若第 名擴展查詢字集的下限值比第 名的擴展查詢字集上限值高,即可提前結束UT-tree的建立及探勘。實驗結果證實先採用代表標籤選取比未採用代表標籤選取,其找出的擴展查詢字集可達到較好的查詢效果。此外,實驗結果顯示UT-growth演算法比相關方法有較好的執行效率,且Dynamic UT-growth演算法在多數情況可提供比UT-growth演算法更有效率的處理。
In this thesis, we consider the social tagging systems in which each object contains a tag set and a corresponding score. After giving a query to find all the objects whose tagsets contain the query keywords, our goal is to find the top-k utility of query expansions from the tags of the objects such that, each query expansion can find the required number of objects. The proposed methods consist of two parts: to select representative tags from query results and to efficiently discover the top-k query expansions. Firstly, we apply two different functions, AveDiversity and Novelty, to compute the representative score of each tag. Then the tags with the top-n highest scores are selected as the representative tags. Next, we design a tree structure called UT-tree which is used to store the tag sets of objects and their corresponding information for generating query expansions. We propose an algorithm called UT-growth, which can efficiently discover out the top-k utility query expansions from the UT-tree by prevent from generating unnecessary query expansions for checking. In addition, by dynamically estimating the upper bound and lower bound of the utility for a query expansion, we provide a dynamic approach to construct UT-tree and propose the Dynamic UT-growth algorithm. This approach dynamically updates the top-k utility of the query expansions. Accordingly, when the utility lower bound of the kth query expansion is larger than the utility upper bound of the (k+1)th query expansion, the construction and mining process on the UT-tree can be early terminated. The experimental results show that finding representative tags before mining top-k query expansions can improve the searching effectiveness of the discovered top-k query expansions. Furthermore, the UT-growth algorithm has better performance on efficiency than the related method, and the Dynamic UT-growth algorithm can provide even more efficient processing than the UT-growth algorithm in most cases.
[1] H. S. Al-Khalifa and H. C. Davis, ”Measuring the Semantic Value of Folksonmies,” Innovations in Information Technology, 2006.
[2] R. Agrawal and R. Srikant, “Fast Algorithm for Mining Association Rule in Large Databases,” in Proceedings of the 20th International Conference on Very Large Databases, 1994.
[3] T.-S. Chua, J. Tang, R. Hong, H. Li, Z. Luo, and Y.-T. Zheng, ”NUS-WIDE: A Real-World Web Image Database from National University of Singapore,” ACM International Conference on Image and Video Retrieval, 2009.
[4] J. Fokker, J. Pouwelse, W. Buntine, “Tag-Based Navigation for Peer-to-Peer Wikipedia,” in Proceedings of the 15th international conference on World wide web(WWW), 2006.
[5] M. Gupta, R. Li, Z. Yin and J. Han, “Survey on Social Tagging Techniques,” in Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining(KDD), 2010.
[6] S. A. Golder and B. A. Huberman, “The Structure of Collaborative Tagging System, ”Information dynamics Lab, HP Labs.
[7] J. Gemmell, A. Shepitsen, B. Mobasher and R. Burke, “Personalization in Folksonomies Based on Tag Clustering,” in Proceedings of the 23rd Association for the Advancement of Artificial Intelligence(AAAI), 2008.
[8] J. Han, J. Pei and Y. Yin, “Mining Frequent Pattern without Candidate Generation,” in Proceedings of the 2000 ACM SIGMOD international conference on Management of data, 2000.
[9] J.-L. Koh, G.-T. Chiang, and I-C. Chiu “The Strategies for Supporting query Specialization and Query Generalization in Social Tagging System,” in Proceedings of the 4th International Workshop on Social Networks and Social Web Mining(SNSM), 2013.
[10] D. Liu, X.-S. Hua, L. Yang, M. Wang and H.-J. Zhang, “Tag Ranking”, in Proceedings of the 18th international conference on World wide web(WWW), 2009.
[11] X. Liang, M. Xie, L. V.S. Lakshmanan, “Adding Structure to Top-K: From Items to Expansions,” in Proceedings of the 20th ACM international conference on Information and knowledge management(CIKM), 2011.
[12] R. Lémdani, G. Polaillon, N. Bennacer, and Y. Bourda, “A semantic similarity measure for recommender systems,” in Proceedings of the 7th International Conference on Semantic Systems(I-Semantics), 2011
[13] C. Mesnage and M. J. Carman, “Tag Navigation,” in Proceedings of the 2nd international workshop on Social software engineering and applications, 2009.
[14] J. Peng, D. Zeng, H. Zhao and F.-Y. Wang, “Collaborative Filtering in Social Tagging Systems Based on Joint Item-Tag Recommendations,” in Proceedings of the 19th ACM international conference on Information and knowledge management(CIKM), 2010.
[15] D. Skoutas and M. Alrifai, “Tag Clouds Revisited,” in Proceedings of the 20th ACM international conference on Information and knowledge management(CIKM), 2011.
[16] P.-N. Tan, M.Steinbach, V. Kumar, “Introduction to Data Mining,” 2006.
[17] D. Vandic, J.-W. v. Dam and F. Hogenboom, “A Semantic Clustering-Based Approach for Searching and Browsing Tag Spaces,” in Proceedings of the 26th ACM Symposium on Applied Computing(SAC), 2011.
[18] Wikipedia. Tag cloud — wikipedia, the free encyclopedia, 2013.[Online; accessed 25-February-2013].
[19] L. Wu, L. Yang, , N. Yu, X.-S. Hua, “Learning to tag,” in Proceedings of the 18th international conference on World wide web(WWW), 2009.
[20] K. Wang, L. Tang, J. Han and J. Lin, “Top Down FP-Growth for Association Rule Mining,” in Proceedings of the 6th Pacific-Asia Conference on Knowledge Discovery and Data Mining(PAKDD), 2002.
[21]A. Zubiaga, A. P. García-Plaza, V. Fresno, and R. Martínez, “Content-based Clustering for Tag Cloud Visualization,” in Proceedings of International Conference on Advances in Social Networks Analysis and Mining(ASONAM), 2009.