Graduate student: | 王思涵 Wang, Szu-Han |
---|---|
Thesis title: | 針對問答社群中的事實問題句自動產生答案摘要之研究 Automatic Answer Generation for Factual Questions on Community Question Answering |
Advisor: | 柯佳伶 Koh, Jia-Ling |
Degree: | Master |
Department: | Department of Computer Science and Information Engineering |
Publication year: | 2015 |
Academic year of graduation: | 103 |
Language: | Chinese |
Number of pages: | 84 |
Keywords (Chinese): | 問題句分類、問題句關鍵字擷取、自動產生問題句答案 |
Keywords (English): | question classification, question keywords extraction, automatic question answering |
Thesis type: | Academic thesis |
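With the development of Community Question Answering (cQA) platforms, more and more users post questions on these platforms and wait for others to answer. However, a large share of the questions posted there are not answered in a timely manner, or are never answered at all. The goal of this thesis is therefore to take the factual questions that users post in a cQA community, use a web search engine, and automatically identify and summarize the factual information in the returned results, which is then provided to the user as the answer. If the question itself is submitted directly to the search engine as the query, the query may contain irrelevant words, so the returned results include too many irrelevant answers. This study therefore investigates how to automatically classify whether a user's question is a factual question, and how to automatically extract the target terms and facet terms from a factual question. The extracted query keywords are then combined with a technique for automatically extracting important facet-related factual content from web search results, so that factual information is summarized and provided to the user as the answer. The experimental results show that the proposed question classification method classifies questions effectively, and that the extracted query keywords, combined with the result summarization method, can effectively provide factual information for factual questions.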
With the development of Community Question Answering (cQA), more and more users post questions on cQA platforms and wait for others to answer. However, not all of the questions posted there receive informative answers, and many are not answered in a timely manner. Accordingly, this thesis aims to automatically summarize facet information from web search results as the answer to factual questions in cQA, so that users can quickly obtain the facet information they need. First, we explore how to automatically classify questions as factual or non-factual. Second, we extract the target term and the facet term from a factual question as the query keywords for a search engine. Finally, we apply a search results summarization technique to obtain factual information from the search results. The summary of the factual information is provided to the user as the answer to the factual question. The experimental results show that the proposed classification method identifies factual questions with high accuracy and high recall. Furthermore, by using the query keywords automatically extracted in this study, a factual question can be effectively answered from the facet summarization of the web search results.
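The three stages described in the abstract (question classification, keyword extraction, and search result summarization) can be sketched roughly as follows. This is only an illustrative sketch and not the method used in the thesis: the cue-word lists, the "X of Y" extraction rule, the word-overlap scoring, and every function name (`is_factual`, `extract_keywords`, `summarize`, `answer`) are assumptions made for illustration, and `search` is a hypothetical callable standing in for a web search engine.

```python
import re

# Hypothetical cue words; the abstract does not state the actual features.
FACTUAL_CUES = {"what", "when", "where", "who", "which", "how"}
OPINION_CUES = {"should", "best", "better", "recommend", "opinion", "think"}
STOPWORDS = FACTUAL_CUES | {"is", "are", "was", "the", "a", "an",
                            "do", "does", "did", "in", "for", "to"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def is_factual(question):
    """Toy stand-in for stage 1: rule-based factual/non-factual decision."""
    tokens = tokenize(question)
    if any(t in OPINION_CUES for t in tokens):
        return False
    return bool(tokens) and tokens[0] in FACTUAL_CUES


def extract_keywords(question):
    """Toy stand-in for stage 2: return (target_term, facet_term).

    Assumes a '<facet> of <target>' pattern; otherwise every content
    word is treated as part of the target and the facet is empty.
    """
    tokens = tokenize(question)
    if "of" in tokens:
        split = tokens.index("of")
        facet = [t for t in tokens[:split] if t not in STOPWORDS]
        target = [t for t in tokens[split + 1:] if t not in STOPWORDS]
    else:
        facet = []
        target = [t for t in tokens if t not in STOPWORDS]
    return " ".join(target), " ".join(facet)


def summarize(snippets, target, facet, limit=2):
    """Toy stand-in for stage 3: keep the snippets sharing the most
    words with the target and facet terms (ties keep input order)."""
    wanted = set(tokenize(target)) | set(tokenize(facet))
    scored = [(len(wanted & set(tokenize(s))), s) for s in snippets]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [s for overlap, s in ranked if overlap > 0][:limit]


def answer(question, search):
    """Run the three stages; return None for non-factual questions."""
    if not is_factual(question):
        return None
    target, facet = extract_keywords(question)
    query = f"{target} {facet}".strip()
    return summarize(search(query), target, facet)


# Made-up snippets standing in for search engine results.
def fake_search(query):
    return [
        "The iPhone 6 price is listed on the product page.",
        "Apple opened a new store.",
        "iPhone 6 review and photos.",
    ]


result = answer("What is the price of iPhone 6?", fake_search)
```

In this sketch the question is only sent to the search engine after the irrelevant words have been stripped, which mirrors the motivation given in the abstract for extracting target and facet terms instead of querying with the raw question.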