Graduate student: | 陳思穎 Chen, Sih-Ying |
---|---|
Thesis title: | 自動分群搜尋引擎之使用者評估研究 User-based Study of Automatic Clustering Search Engines |
Advisor: | 卜小蝶 Pu, Hsiao-Tieh |
Degree: | Master |
Department: | 圖書資訊學研究所 Graduate Institute of Library and Information Studies |
Publication year: | 2007 |
Academic year of graduation: | 95 |
Language: | Chinese |
Pages: | 227 |
Chinese keywords: | 自動分群, 相關排序, 搜尋引擎, 使用者研究 |
English keywords: | automatic clustering, relevance ranking, search engines, user study |
Thesis type: | Academic thesis |
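With the rapid growth of Internet resources, search engines have become the most useful tool for retrieving them. However, today's search engines, which rely mainly on relevance ranking, still cannot effectively filter large and heterogeneous result sets, and they can easily frustrate users. Automatic clustering search engines offer users an alternative: they automatically group results into topical clusters in order to improve retrieval effectiveness.
Although automatic clustering techniques have been studied for many years, user-oriented research on them is still lacking. This study therefore evaluates cluster structures and their usability from the user's point of view. It uses a two-stage experiment in which users carry out tasks in both self-defined and researcher-assigned scenarios. The study observes how users work with automatic clustering search engines, records their search processes for log analysis, and uses questionnaires and interviews to understand their subjective perceptions in depth. The research instruments are a purpose-built experimental platform, search tasks, an evaluation questionnaire, an interview outline, and screen-recording software. Research material was collected according to the research purposes and questions: observation and interviews were used to summarize the characteristics of participants' search behavior; search logs, interviews, and the evaluation questionnaire were used to analyze the efficiency and effectiveness of the search engines; and interviews and the evaluation questionnaire were used to analyze user satisfaction.
The results show that the greatest differences between automatic clustering search engines and relevance-ranking search engines lie in the time and effort users spend filtering information and in the occasions on which users turn to each type of engine. Second, the greatest benefits of automatic clustering search engines are better retrieval effectiveness, a narrower topical scope, emphasis on important concepts, and multiple directions of thought; they perform best on simple or closed questions. Finally, the study proposes improvements to automatic clustering search engines: personalized services such as frequently used cluster categories and combinations provided according to user needs, user-defined hierarchy levels, and user-defined numbers of results; and reference to manual classification and user feedback, so that clustering quality is improved in a user-oriented way.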
This study evaluates the structure and usability of automatic clustering search engines from the user's perspective. The rapid growth of Internet resources has made search engines one of the most important tools for finding and accessing them. Major search engines present results by relevance ranking. Because the number of results returned is overwhelming, they may not filter results effectively or efficiently, which makes it difficult for users to locate precisely what they are seeking. Automatic clustering search engines offer users another option: they automatically cluster results into topical categories and may thus increase search effectiveness.
This study designs experiments in which users take part in search tasks using automatic clustering search engines. The tasks include tasks defined by the participants themselves and tasks assigned by the study. The study observes participants' behavior while they use the assigned automatic clustering search engines, and it logs the search process in order to analyze the effectiveness of the selected search engines. Finally, it assesses user satisfaction through interviews and an evaluation questionnaire administered after the assigned tasks were completed. The research tools include the experimental platform, search tasks, an evaluation questionnaire, an interview outline, and log analysis software.
The results suggest that the time and effort participants spent with the selected automatic clustering search engines differed markedly from those spent with relevance-ranking search engines, as did the contexts in which participants used them. The results also show that the selected automatic clustering search engines help enhance search effectiveness, narrow the topical scope of a search, highlight key concepts, and suggest diverse directions of thought for searching. Finally, this study offers suggestions for improving automatic clustering search engines: combinations of the clusters that users frequently use, personalized hierarchical clusters, user-defined numbers of displayed results, human-designed clusters, and user feedback may all be used to improve clustering quality in a user-oriented way.