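Graduate student: | 盧治均 |
---|---|
Thesis title: | 一種可推測網路行為者路徑之模型-以教育部技職傳播網為例 A Model to Estimate the Website Behavior Route-An Example of the Website of the Technological and Vocational Education |
Advisor: | 戴建耘 |
Degree: | Master |
Department: | Department of Industrial Education (工業教育學系) |
Year of publication: | 2007 |
Academic year of graduation: | 95 (ROC calendar) |
Language: | Chinese |
Number of pages: | 90 |
Keywords (Chinese): | Data Warehouse (資料倉儲), Association Rule Analysis (關聯分析法則), Petri Nets (派翠西網路), Data Mining (資料探勘) |
Thesis type: | Academic thesis |
Usage counts: | Views: 188, Downloads: 6 |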
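Predicting the paths that website users follow when searching for knowledge on web pages is an important concern in website management and design. This study therefore first extracts historical user-path data from the back-end data warehouse of the Ministry of Education's Technological and Vocational Education portal (教育部技職傳播網) to serve as the training data set for the proposed model. It then uses Petri nets to build a dynamic model of user actions and to analyze the paths users are likely to take. Finally, a data mining model that combines the Bayesian probability rule with association rule analysis is used to estimate, from the training data set, the probabilities of users' action paths and behavior attributes, and to build a document probability table over the behavior attributes. This table serves as the basis for predicting whether the next user will find a document. From the document discovery rate and its association with users' behavior paths, the study was able to propose revisions and problem-solving mechanisms for the web page system or architecture of the technological and vocational education website. The study found the following: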
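1. If the data have too few dimensions and Bayesian classification is used for predictive analysis, the error grows. Because the attributes are treated as mutually independent, their conditional probabilities are multiplied together, so once one factor is 0 the result is 0 regardless of the other conditional probabilities. To overcome this problem, the study adopts the M-estimate correction, which was verified to improve it.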
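2. The study randomly selected 150 user-behavior records from the Technological and Vocational Education website, used Bayesian probability classification to compute the probability of each attribute, and built a data mining engine. It then randomly selected 100 user-behavior records as test data. Verification showed that, with Bayesian probability classification, the success rate for predicting whether a document is found reached at least 91.2%.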
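3. Combining the Bayesian network with the association rule model gives a more accurate prediction model than one built from a Bayesian network alone, and it also shows the effectiveness of website paths graphically.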
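The zero-product problem in finding 1 and the M-estimate correction can be illustrated with a minimal Python sketch. The attribute names, the toy records, the choice m = 2, and the uniform prior p = 1/k (k being the number of observed values of an attribute) are all assumptions made for illustration; none of them is taken from the thesis.

```python
from collections import Counter, defaultdict

def train(records, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (attr, label) -> Counter of values
    domains = defaultdict(set)            # attr -> set of observed values
    for rec, label in zip(records, labels):
        for attr, value in rec.items():
            value_counts[(attr, label)][value] += 1
            domains[attr].add(value)
    return class_counts, value_counts, domains

def cond_prob(model, attr, value, label, m=0.0):
    """P(value | label) using the M-estimate (n_c + m*p) / (n + m).

    m = 0 gives the plain relative frequency; p is a uniform prior 1/k.
    """
    class_counts, value_counts, domains = model
    n = class_counts[label]
    n_c = value_counts[(attr, label)][value]
    p = 1.0 / len(domains[attr])
    return (n_c + m * p) / (n + m)

def score(model, rec, label, m=0.0):
    """Unnormalised naive Bayes score: P(label) * product of P(value | label)."""
    class_counts = model[0]
    result = class_counts[label] / sum(class_counts.values())
    for attr, value in rec.items():
        result *= cond_prob(model, attr, value, label, m)
    return result

# Hypothetical behavior records: where the user entered and which section was visited.
records = [
    {"entry": "home", "section": "news"},
    {"entry": "home", "section": "courses"},
    {"entry": "search", "section": "courses"},
    {"entry": "home", "section": "news"},
]
labels = ["found", "found", "found", "not_found"]

model = train(records, labels)
query = {"entry": "search", "section": "news"}

# "search" never co-occurs with "not_found", so one factor is 0 without smoothing.
plain = score(model, query, "not_found")
smoothed = score(model, query, "not_found", m=2.0)
print(plain, smoothed)
```

With so few records, a single unseen attribute value wipes out the whole product, while the M-estimate keeps every factor strictly positive so the other attributes still contribute to the score.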
Predicting users' behavior routes on a website is very important for the website's administrator. This research first extracts historical data from the data warehouse of the website of the technological and vocational education and uses the behavior-route data as training data. It then uses Petri nets to build a motion model and analyze possible behavior routes. Finally, it combines naive Bayesian classification with an association rule analysis model from data mining to infer behavior-route attributes. From these it builds a probability table, which is used to predict a user's behavior route. Based on this model, suggestions can be proposed to the website's administrator. The research findings are as follows.
1. If there are too few attributes, the computed probability can equal zero. The research proposes the M-estimate to mitigate this problem.
2. We use 150 behavior routes as the training data. On verification, the success rate reaches at least 91.2%.
3. Combining Bayesian classification with the association rule analysis model can be more accurate than Bayesian classification alone.
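The association-rule side of the model can be sketched in the same spirit. The following minimal Python example computes support and confidence for rules of the form "users who visited page A also visited page B" over a set of sessions. The page names, the sessions, and both thresholds are hypothetical, and the thesis may derive its rules differently.

```python
from itertools import permutations

def pair_rules(sessions, min_support=0.3, min_confidence=0.6):
    """Return rules A -> B as (A, B, support, confidence) tuples.

    support    = fraction of sessions containing both A and B
    confidence = fraction of sessions containing A that also contain B
    """
    n = len(sessions)
    page_sets = [set(s) for s in sessions]
    pages = sorted(set().union(*page_sets))
    rules = []
    for a, b in permutations(pages, 2):
        count_a = sum(1 for s in page_sets if a in s)
        count_ab = sum(1 for s in page_sets if a in s and b in s)
        support = count_ab / n
        confidence = count_ab / count_a if count_a else 0.0
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules

# Hypothetical user sessions, each a list of visited pages.
sessions = [
    ["home", "news", "download"],
    ["home", "courses"],
    ["home", "news"],
    ["search", "download"],
]

rules = pair_rules(sessions)
for a, b, support, confidence in rules:
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```

Rules that pass both thresholds point to page pairs that users tend to reach together, which is the kind of path evidence that could be set beside the Bayesian probability table.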