簡易檢索 / 詳目顯示

研究生: 李惠雅
Hui-Ya Li
論文名稱: 分群演算法之超大型積體電路架構研究
VLSI Architectures for Clustering Algorithms
指導教授: 黃文吉
Hwang, Wen-Jyi
學位類別: 博士
Doctor
系所名稱: 資訊工程學系
Department of Computer Science and Information Engineering
論文出版年: 2012
畢業學年度: 100
語文別: 英文
論文頁數: 183
中文關鍵詞: 分群c-平均值競爭式學習模糊c-平均值可程式化閘陣列可程式化系統晶片
英文關鍵詞: clustering, c-means, competitive learning, fuzzy c-means, field programmable gate array (FPGA), system on programmable chip (SOPC)
論文種類: 學術論文
相關次數: 點閱:277下載:0
分享至:
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報
  • 本論文對於c-平均值(c-means)、競爭式學習(competitive learning)、模糊c-平均值(fuzzy c-means),以及帶空間約束之模糊c-平均值(fuzzy c-means with spatial constraint)等多種分群演算法分別提出硬體架構。這些架構皆已在場域可程式化閘陣列(Field Programmable Gate Array,FPGA)裝置上實作建構出適用於分群(clustering)的可程式化系統晶片(System on Programmable Chip,SOPC)系統。
    由於分割(partitioning)與質心計算(centroid computation)等運算全為管線化運作,故本文所提出的c-平均值架構可同時處理多筆訓練向量(training vector)。查表式除法器(lookup table based divider)則用以減少面積成本及質心計算的延遲。
    文中另提出兩種針對k贏家全取(k-winners-take-all,kWTA)操作的硬體實現。第一種架構,經由在小波域(wavelet domain)中執行部分距離搜尋(partial distance search,PDS)來找出關於每一個輸入向量的k個贏家。一種單純利用查表來做計算的硬體除法器則用以構成神經元的更新程序。部分距離搜尋模組及除法器均採取有限精度計算(finite precision calculation)來降低部分距離搜尋及硬體除法器的面積成本。另採用子空間搜尋(subspace search)及多係數累積(multiple-coefficient accumulation)等技巧來降低PDS的運算延遲。第二種則是一個高效率的管線化架構,可同時進行不同訓練向量的kWTA競賽。此管線化架構使用了一個嶄新的碼字交換機制(codeword swapping scheme),使那些在競賽過程中落敗的神經元可立即投入後續訓練向量的競賽。
    文中所提出的模糊c-平均值架構是個高效率的平行計算方案。此架構利用查表式除法來降低計算權重值(membership coefficient)與質心的面積成本及計算複雜度。為了避開龐大的儲存需求,權重矩陣(membership coefficient matrix)及質心的更新,從過去慣用的迭代法,改為合併成單一步驟。這樣的架構還延伸到帶空間約束之模糊c-平均值的實現。並採用查表法來處理開根號運算,以便放寬模糊度(degree of fuzziness)的限制。
    實驗結果顯示文中所提出的架構具有成本效益(cost-effective),且在面對龐大的資料集合及/或眾多的群集數時,較其他軟硬體實現能有更高的加速(speedup)。

    In this dissertation, several hardware architectures are proposed for various clustering algorithms, including the c-means, competitive learning, fuzzy c-means, and fuzzy c-means with spatial constraint algorithms. All these architectures have been implemented on field programmable gate array (FPGA) devices to construct system on programmable chip (SOPC) systems for clustering.
    Both the partitioning and centroid computation operations in the proposed c-means architecture are fully pipelined thus multiple training vectors can be concurrently processed. A lookup table based divider is employed to reduce the area cost and latency for centroid computation.
    Two kinds of hardware realization of competitive learning algorithm with k-winners-take-all (kWTA) activation are presented. In the first architecture, the k winners associating with an input vector are identified by a module performing partial distance search (PDS) in the wavelet domain. The neuron updating process is based on a hardware divider with simple table lookup operations. Both the partial distance search module and hardware divider adopt finite precision calculation for area cost reduction. Subspace search and multiple-coefficient accumulation techniques are also employed to reduce the computation latency for the PDS search. The second architecture is based on an efficient pipeline allowing kWTA competition processes associated with different training vectors to be performed concurrently. The pipeline architecture employs a novel codeword swapping scheme so that neurons failing the competition for a training vector are immediately available for the competitions for subsequent training vectors.
    The proposed fuzzy c-means architecture is an efficient parallel solution. The architecture reduces the area cost and computational complexity for membership coefficients and centroid computation by employing lookup table based dividers. The usual iterative operations for updating the membership matrix and cluster centroid are merged into one single updating process to evade the large storage requirement. Such architecture is also extended to for the implementation of fuzzy c-means with spatial constraint. In the architecture, lookup table based root operators are adapted to relax the restriction on the degree of fuzziness.
    Experimental results show that the proposed architectures are cost-effective, and can attain high speedup over other hardware or software implementations for large data sets and/or large number of clusters.

    中文摘要 i Abstract iii Table of Contents vi List of Figures vii List of Tables xiii Chapter 1. Introduction 1 Chapter 2. Architecture for C-Means Algorithm 12 2.1 C-Means Algorithm 13 2.2 The Proposed Architecture 15 2.2.1 Partitioning Unit 17 2.2.2 Centroid Computation Unit 20 2.2.3 The SOPC system based on the proposed architecture 28 2.3 Experimental Results 29 Chapter 3. Architecture for Competitive Learning with Partial Distance Search in the Wavelet Domain 38 3.1 PDS in the Wavelet Domain for CL 40 3.2 The Proposed Architecture 44 3.2.1 Subspace Search and Finite Precision Calculation 46 3.2.2 Multiple-Coefficient Partial Distance Accumulation 48 3.2.3 Sorting Circuit 50 3.2.4 Neuron Updating 55 3.2.5 Outline of the CL with kWTA for hardware implementation 57 3.2.6 Complete CL Architecture 60 3.2.7 Complete CL Architecture 63 3.3 Experimental Results 66 Chapter 4. Pipelined Architecture for Competitive Learning 74 4.1 The proposed Architecture 75 4.1.1 Architecture for codeword swapping scheme 76 4.1.2 The architecture of the winner selection unit 85 4.1.3 The architecture of the winner update unit 90 4.1.4 The Proposed Architecture for On-Chip Learning 93 4.2 Experimental Results 93 Chapter 5. Architecture for Fuzzy C-Means Algorithm 102 5.1 Fuzzy C-Means Algorithm 104 5.2 The Proposed Architecture 106 5.2.1 Pre-computation unit 107 5.2.2 Membership coefficients updating unit 110 5.2.3 Centroid updating unit 113 5.2.4 Cost function computation unit 116 5.2.5 The SOPC system based on the proposed architecture 117 5.3 Experimental results 119 Chapter 6. Fuzzy C-Means Architecture for Image Segmentation 129 6.1 Fuzzy C-Means for Image Segmentation 131 6.2 The proposed Architecture 133 6.2.1 Pre-Computation Unit for Original FCM with General Fuzziness 134 6.2.2 Membership Coefficients Updating Unit for Original FCM with General Fuzziness 140 6.2.3 Centroid Updating Unit for Original FCM with General Fuzziness 142 6.2.4 Cost Function Computation Unit for Original FCM with General Fuzziness 145 6.2.5 FCM-S Architecture 146 6.2.6 The SOPC System Based on the Proposed Architecture 151 6.3 Experimental Results 152 Chapter 7. Conclusion 163 References 174

    [1] A.K. Jain, M.N. Murty and P.J. Flynn, “Data Clustering: A Review,” ACM Computing Surveys, Vol. 31, No. 3, 1999, pp. 264–323.

    [2] B.S. Everitt, S. Landau, and M. Leese, “Clustering Analysis,” Arnold Publishers, London, fourth edition, May 2001.

    [3] M.R. Anderberg, “Clustering Analysis for Applications,” Academic Press, New Your, December 1973.

    [4] P. Arabie, L. Hubert, and G.D. Soete, “An Overview of Combinatorial Data Analysis,” Clustering and Classification, World Scientific, Singapore, pp. 188-217, January 1996.

    [5] T. Mitchell, “Machine Learning,” McGraw-Hill, Boston, MA, 1997.

    [6] R.O. Duda, P.E. Hart, and D.G. Stork, “Pattern Classification,” John Wiley & Sons, Inc., New York, second edition, 2001.

    [7] P. Berkhin, “Survey of Clustering Data Mining Techniques,” Technical report, Accrue Software, San Jose, CA, 2002.

    [8] C. Chinrungrueng and C. H. Sequin, “Optimal Adaptive K-means Algorithm with Dynamic Adjustment of Learning Rate,” IEEE Transactions on Neural Networks, Vol. 6, No. 1, January 1995, pp.157-169.

    [9] M. Sarkar and B. Yegnanarayana, “A Clustering Algorithm Using Evolutionary Programming,” IEEE International Conference on Neural Networks, USA, Vol. 2, 1996, pp. 1162-1167.

    [10] D. Lee, S. Back, and K. Sung, “Modified K-means Algorithm for Vector Quantizer Design,” IEEE Signal Processing Letters, Vol. 4, No. 1, January 1997, pp.2-4.

    [11] K. Krishna and M. N. Murty, “Genertic K-means Algorithm,” IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics, Vol. 29, No.3, June 1999, pp. 433-439.

    [12] C. Olaru and L. Wehenkel, “Data Mining,” IEEE Computer Application in Power, Vol. 12, No. 3, July1999, pp. 19-25.

    [13] K. Alsabti, S. Ranka, and V. Singh, “An Efficient K-means Clustering Algorithm,” First Workshop on High-Performance Data Mining, 1998.

    [14] C. Elkan, “Using the Triangle Inequality to Accelerate K-means,” Proceedings of International Conference on Machine Learning, 2003.

    [15] W. J. Hwang, S.S. Jeng and B.Y. Chen, “Fast Codeword Search Algorithm Using Wavelet Transform and Partial Distance Search Techniques,” Electronic Letters, Vol.33, February 1997, pp. 365-366.

    [16] Estlick, M., Leeser, M., Theiler, J., and Szymanski, J.J., “Algorithmic Transformations in the Implementation of K-means Clustering on Reconfigurable Hardware,” Proceedings of ACM/SIGDA ninth international symposium on Field programmable gate arrays, 2001.

    [17] A. Filho, S. Gda, A.C. Frery, C.C. de Araujo, H. Alice, J. Cerqueira, J.A. Loureiro, M.E. de Lima, Oliveira, G.S. Mdas, M.M. Horta, “Hyperspectral Images Clustering on Reconfigurable Hardware Using the K-means Algorithm,” Proceedings of 16th Symposium on Integrated Circuits and Systems Design, 2003, pp. 99–104.

    [18] Gokhale, M., Frigo, J., Mccabe, K., Theiler, J., Wolinski, C., and Lavenier, D., “Experience with a Hybrid Processor: K-means Clustering,” The Journal of Supercomputing, 2003, pp. 131-148.

    [19] Maruyama, T. “Real-time K-means Clustering for Color Images on Reconfigurable Hardware,” Proceedings of 18th International Conference on Pattern Recognition, 2006.

    [20] S. Haykin, “Neural Networks: A Comprehensive Foundation,” Prentice Hall: Engle Cliffs, NJ, 1998.

    [21] Grossberg S, “Competitive Learning: From Interactive Activation to Adaptive Response,” Cognitive Science, Vol. 11, 1987, pp. 23-63.

    [22] Kohonen T., “The Self-Organizing Map,” Proceedings of the IEEE, Vol. 78, 1990, pp.1464-1480.

    [23] Hwang W. J., Ye B. Y., Lin C. T., “A Novel Competitive Learning Algorithm for the Parametric Classification with Gaussian Distributions,” Pattern Recognition Letters, Vol.21, pp.375-380, 2000.
    [24] Xu L., Krzyzak A., Oja A. E., “Rival Penalized Competitive Learning for Clustering Analysis, RBF Net, and Curve Detection,” IEEE Transactions on Neural Networks, Vol. 4, pp. 636-649, 1993.

    [25] A. Gersho and R.M. Gray, “Vector Quantization and Signal Compression,” Kluwer, Norwood, Massachusetts, 1992.

    [26] Hofmann T., Buhmann J.M., “Competitive Learning Algorithm for Robust Vector Quantization,” IEEE Transactions on Signal Processing, Vol. 46, 1998, pp. 1665-1675.

    [27] J. Hertz, A. Krogh, R.G. Palmer, “Introduction to the Theory of Neural Computation,” Addison-Wesley: New York, NY, 1991.

    [28] W.J. Wolfe, D. Mathis, C. Anderson, J. Rothman, M. Gottler, G. Brady, R.Walker, G. Duane, “K-Winner Networks,” IEEE Transactions on Neural Networks, Vol. 2, 1991, pp. 310-315.

    [29] X. Hu and J. Wang, “An Improved Dual Neural Network for Solving a Class of Quadratic Programming Problems and Its k Winners-Take-All Application,” IEEE Transactions on Neural Network, Vol. 19, No. 12, 2008, pp. 2022-2031.

    [30] Q. Liu, J. Wang, “Two k-Winners-Take-All Networks with Discontinuous Activation Functions,” Neural Networks, Vol. 21, 2008, pp. 406-413.

    [31] Q. Liu, C. Dang and J. Cao, “A Novel Recurrent Neural Network With One Neuron and Finite-Time Convergence for k-Winners-Take-All Operation,” IEEE Trans. Neural Networks, Vol. 21, No. 7, 2010, pp. 1140-1148.

    [32] C.D. Bei and R.M. Gray, “An Improvement of the Minimum Distortion Encoding Algorithm for Vector Quantization,” IEEE Transactions on Communication, Vol. COM-33, Oct. 1985, pp. 1132-1133.

    [33] Vetterli M. and Kovacevic J., “Wavelets and Subband Coding,” Prentice Hall: Engle Cliffs, NJ, 1995.

    [34] W.J. Hwang, F.J. Lin, Y.C. Zeng, “Fast Design Algorithm for Competitive Learning,” Electronics Letters, 1997, Vol.33, pp.1469-1470.

    [35] T.C. Hsu, S.D Wang, “k-Winners-Take-All Neural Net with Θ(1) Time Complexity,” IEEE Transactions on Neural Networks, Vol. 8, 1997, pp. 1557-1561.

    [36] C.S. Lin, P. Ou, B.D. Liu, “Design of k-WTA/Sorting Network Using Maskable WTA/MAX Circuit,” Proceedings of the 2001 VLSI Symposium on Technology, Systems and Applications, pp. 69-72.

    [37] F. Höppner, F. Klawonn, R. Kruse, T. Runkler, “Fuzzy Cluster Analysis: Methods for Classification, Data Analysis and Image Recognition,” Wiley, Chichester, 1999.

    [38] J. C. Bezdek, “Fuzzy Mathematics in Pattern Classification,” Ph. D. dissertation, Cornell University, Ithaca, NY 1973.

    [39] J. C. Bezdek, “Pattern Recognition with Fuzzy Objective Function Algorithms,” Plenum Press, New York, 1981.

    [40] R. Cannon, J. Dave, J. Bezdek, “Efficient Implementation of the Fuzzy C-means Clustering Algorithm,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 1986, pp. 248-255.

    [41] T.W. Cheng, D, B. Goldgof and L.O. Hall, “Fast Fuzzy Clustering,” Fuzzy Sets and Systems, 1998, pp. 49-56.

    [42] S. Eschrich, J. Ke, L. O. Hall, and D. B. Goldgof, “Fast Accurate Fuzzy Clustering Through Data Reduction,” IEEE Transactions on Fuzzy Systems, 2003, pp. 262-270.

    [43] J.F. Kolen and T. Hutcheson, “Reducing the Time Complexity of the Fuzzy C-means Algorithm, IEEE Transactions on Fuzzy Systems, Vol. 10, 2002, pp. 263-267.

    [44] J. Garcia-Lamont, L.M. Flores-Nava, F. Gomez-Castaneda, J.A. Moreno-Cadenas, “CMOS Analog Circuit for Fuzzy C-Means Clustering,” IEEE Proc. 5th Biannual World Automation Congress, 2002.

    [45] J. Lazaro, J. Arias, J. L. Martin, C. Cuadrado and A. Astarloa, “Implementation of a Modified Fuzzy C-means Clustering Algorithm for Realtime Applications,” Microprocessors and Microsystems, 2005, pp. 375-380.

    [46] J.K. Udupa, S. Samarasekera, “Fuzzy Connectedness and Object Definition: Theory, Algorithm and Applications in Image Segmentation,” Graphical Models and Image Processing, Vol. 58, Issue 3, 1996, pp. 246-261.

    [47] S.M. Yamany, A.A. Farag, S. Hsu, “A Fuzzy Hyperspectral Classifier for Automatic Target Recognition (ATR) Systems,” Pattern Recognition Letters, Vol. 20, No. 11, pp. 1431-1438.

    [48] M.S. Yang, Y.J. Hu, K. C.R. Lin and C. C.L. Lin, “Segmentation Techniques for Tissue Differentiation in MRI of Ophthalmology Using Fuzzy Clustering Algorithms,” Magnetic Resonance Imaging, Vol. 20, Issue 2, 2002, pp. 173-179.
    [49] N. Ferahta, A. Moussaoui, K. Benmahammed and V. Chen, “New Fuzzy Clustering Algorithm Applied to RMN Image Segmentation,” International Journal of Soft Computing, Vol. 1, Issue 2, 2006, pp. 137-142.

    [50] M.N. Ahmed, S.M. Yamany, N. Mohamed, A.A. Farag, T. Moriarty, “A Modified Fuzzy C-means Algorithm for Bias Field Estimation and Segmentation of MRI Data.” IEEE Transactions on Medical Imaging, Vol. 21, 2002, pp. 193-199.

    [51] S.C. Chen, D.Zhang, “Robust Image Segmentation Using FCM with Spatial Constraints Based on New Kernel-Induced Distance Measure,” IEEE Systems, Man, and Cybernetics, Vol. 34, 2004, pp. 1907-1916.

    [52] K.S. Chuang, H.L. Tzeng, S. Chen, J. Wu, T.J. Chen, “Fuzzy C-means Clustering with Spatial Information for Image Segmentation,” Computerized Medical Imaging and Graphics, 2006, Vol. 30, Issue 1, pp. 9-15.

    [53] M.A. Mahjoub, “Improved FCM Algorithm applied to Color Image Segmentation,” Canadian Journal on Image Processing and Computer Vision, Vol. 2, No. 2, 2011, pp. 16-19.

    [54] K. Bondalapati, V.K. Prasanna, “Reconfigurable Computing Systems,” Proceedings of the IEEE, 2002, Vol.90, Issue 7, pp. 1201-1217.

    [55] S. Hauck and A. Dehon, “Reconfigurable Computing,” Morgan Kaufmann, 2008.

    [56] Altera Corporation (2011). NIOS II Processor Reference Handbook,
    http://www.altera.com/literature/lit-nio2.jsp

    [57] K. Sayood, “Introduction to Data Compression,” Morgan Kaufmann, 2000.
    [58] M. Bracco, S. Ridella, R. Zunino, “Digital Implementation of Hierarchical Vector Quantization,” IEEE Transactions on Neural Network, 2003, pp. 1072–1084.

    [59] W.J. Hwang, W.K. Wei, Y.J. Yeh, “FPGA Implementation of Full-Search Vector Quantization based on Partial Distance Search,” Microprocessors and Microsystems, 2007, pp. 516–528.

    [60] H. Park, V.K. Prasanna, “Modular VLSI Architectures for Real-Time Full-Search-Based Vector Quantization,” IEEE Transactions on Circuits and Systems for Video Technology, 1993, pp. 309–317.

    [61] C.L. Wang, L.M. Chen, “A new VLSI Architecture for Full-Search Vector Quantization,” IEEE Transactions on Circuits and Systems for Video Technology, Vol. 5, 1996, pp. 389–398.

    [62] Altera Corporation (2011). Stratix II Device Handbook,
    online:http://www.altera.com/literature/lit-stx2.jsp

    [63] M. Hutton, J. Schleicher, D. Lewis, B. Pedersen, R. Yuan, S. Kaptanoglu, G. Baeckler, B. Ratchev, K. Padalia, M. Bourgeault, A. Lee, H. Kim and R. Saini, “Improving FPGA Performance and Area Using an Adaptive Logic Module,” Lecture Notes in Computer Science, Vol. 3203, 2004, pp. 135–144.

    [64] P. Hung, H. Fahmy, O. Mencer, M.J. Flynn, “Fast Division Algorithm with a Small Lookup Table,” IEEE Asilomar Conference on Signals, Systems, and Computers, 1999, pp. 1465–1468.

    [65] A.A. Colavita, A. Cicuttin, F. Fratnik, G. Capello, “SORTCHIP: A VLSI Implementation of a Hardware Algorithm for Continuous Data Sorting,” IEEE Journal of Solid-State Circuits, 2003, Vol. 38, pp. 1076-1079.
    [66] R. Cole, A.R. Seigel, “Optimal VLSI Circuit for Sorting,” Journal of ACM, 1998, Vol. 35, pp. 777-809.

    [67] Altera Corporation (2002). Custom Instructions for NIOS Embedded Processors, Application Notes 188, http://www.altera.com/literature/lit-nio.jsp

    [68] Altera Corporation (2005). Stratix Device Handbook,
    http://www.altera.com/literature/lit-stx.jsp

    [69] Altera Corporation (2008). Cyclone III Device Handbook,
    http://www.altera.com/products/devices/cyclone3/cy3-index.jsp

    [70] D.Y. Aksin, “A High-Precision High-Resolution WTA-MAX Circuit of O(N) Complexity,” IEEE Transactions on Circuits and Systems 2002, Vol. 49, No. 1, pp. 48-53.

    [71] H.Y. Li, Y.J. Yeh and W.J. Hwang, “Using Wavelet Transform and Partial Distance Search to Implement kNN Classifier on FPGA with Multiple Modules,” Lecture Notes in Computer Science (ICIAR 2007), Vol. 4633, pp. 1105-1116, 2007.

    [72] P. Hore, L. O. Hall, and D. B. Goldgof, “A Fuzzy C Means Variant for Clustering Evolving Data Streams,” IEEE International Conference on Systems, Man and Cybernetics, 2007, pp. 360-365.

    [73] P. Jihong, Y. Xuan, G. Xinbo, and X. Weixing, “Weighting Exponent m in Fuzzy C-Means (FCM) Clustering Algorithm,” Proceedings of SPIE, Vol. 4554, 2001, pp. 246-251.

    [74] N.R. Pal and J.C. Bezdek, “On Cluster Validity for the Fuzzy C-means Model,” IEEE Transactions on Fuzzy System, Vol. 3, No. 3, 1995, pp. 370-379.

    [75] Vriend. S.P. van Gaans, P.F.M., Middelburg, J. and de Nijs. A. “The Application of Fuzzy C-means Cluster Analysis and Nonlinear Mapping to Geochemical Datasets: Examples from Portugal,” Appl. Geochem., 1988, Vol. 3, pp. 213-224.

    [76] Zimmermann, Hans J., “Fuzzy Set Theory and Its Applications,” Kluwer Academic Publishers, Boston, 1990.

    下載圖示
    QR CODE