Graduate student: | 蔡洪弦 Tsai, Hong-Xian |
---|---|
Thesis title: | 基於非監督式跨領域深度學習之單張影像雜訊去除 Unsupervised Cross Domain Deep Learning for Single Image Noise Removal |
Advisor: | 康立威 Kang, Li-Wei |
Oral examination committee: | 陳士煜 Chen, Shih-Yu; 李曉祺 Li, Hsiao-Chi; 康立威 Kang, Li-Wei |
Oral defense date: | 2022/07/26 |
Degree: | Master |
Department: | 電機工程學系 Department of Electrical Engineering |
Publication year: | 2022 |
Academic year of graduation: | 110 |
Language: | Chinese |
Pages: | 52 |
Chinese keywords: | 影像雜訊去除 (image noise removal), 非監督式網路 (unsupervised network), 深度學習 (deep learning), 生成對抗網絡 (generative adversarial network) |
English keywords: | Image denoising, Unsupervised Learning, Deep Learning, Generative Adversarial Network |
Research methods: | Experimental design, quasi-experimental design, comparative method |
DOI URL: | http://doi.org/10.6345/NTNU202201103 |
Thesis type: | Academic thesis |
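Digital multimedia data are now ubiquitous in daily life, with images and video making up the bulk: countless images are captured at every moment by mobile devices and by the roadside surveillance cameras found everywhere. This vast amount of image information can enable a wide range of everyday applications. However, image sources are highly diverse and image quality is hard to control. Images of too low a quality can severely degrade the performance of the applications that rely on them, or even render them useless. Restoring or enhancing digital image quality has therefore become an important research topic. With the rapid development of deep learning in recent years, many image restoration techniques based on deep networks have appeared. Most current architectures, however, rely on end-to-end supervised learning and on synthetic training image datasets. The main problem is that a network trained on synthetic data may not suit real-world image degradation, while datasets that pair real low-quality images with their high-quality versions are hard to obtain. Cross-domain deep learning has therefore recently been studied as a way to address the possible gap between domains. This thesis studies image restoration based on cross-domain deep learning and attempts to solve potential problems of current methods, for example: (1) limited generalization: existing methods may be hard to apply to different kinds of images; (2) domain shift: in unsupervised learning without paired training data, domain shift may arise because good image feature representations are hard to learn and because the noise in low-quality images varies too widely; and (3) unclear domain boundaries: when the noise in the training images varies too widely, the image content is too complex, and no paired training data are available, the boundary between the low-quality and high-quality image domains is unclear, which makes good cross-domain learning hard to achieve.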
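To solve these problems with practical applications in mind, this thesis proposes an image noise removal network architecture based on cross-domain unsupervised deep learning. Our goal is to learn an image feature representation from the input noisy image dataset and to bring this representation close to that of clean images, so as to achieve better image restoration. We propose using a bidirectional generative adversarial network to translate unpaired training images in both directions (noisy to clean and clean to noisy), and we train an image noise removal (denoising) deep network with several loss functions in the image spatial and frequency domains. In the experiments, we used several well-known image datasets (CBSD68, SIDD, and the NIH-, AAPM- and Mayo Clinic-sponsored Low Dose CT Grand Challenge) to train and test the proposed deep learning model. The experimental results confirm that the proposed method outperforms traditional non-deep-learning methods and representative recent deep learning methods, and that it is suitable for solving practical problems.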
Digital multimedia data have become ubiquitous in our daily life, especially images and videos. For example, a huge amount of image data is captured by mobile devices and ubiquitous surveillance cameras. Such data may enable many kinds of everyday applications, such as face/object detection and recognition, event detection, security monitoring, environment mining, autonomous driving, medical diagnosis, industrial inspection, and social media mining. However, image sources are highly diverse and their quality is hard to control, so low-quality images may significantly degrade the performance of the related applications. Image quality restoration has therefore become a popular and important research topic. With the recent rapid development of deep learning techniques, several deep learning-based image restoration frameworks have been presented. Most of them are end-to-end supervised deep networks trained on synthesized paired image datasets, which may not fit real-world problems, and real paired training data are hard to collect. Cross-domain deep learning has therefore recently been investigated to address the domain gap, and it has also been applied to unsupervised image restoration.
In this thesis, we investigate cross-domain deep learning-based image restoration for single image denoising and address problems that remain in most current cross-domain learning frameworks: (i) limited generalization: a learned deep model may not generalize well to different types of images; (ii) domain shift: an unsupervised domain-adaptation model may not extract sufficiently strong feature representations because of the high diversity of noise in the input image data, which may cause domain shift; and (iii) unclear domain boundary: highly diverse noise and complicated image content may blur the domain boundaries between unpaired image inputs, resulting in poor image reconstruction performance.
To solve the above-mentioned problems, this thesis presents a novel cross-domain unsupervised deep learning network for single image noise removal. Our goal is to learn an invariant feature representation from input noisy images that is expected to align with the representation of clean images for better image restoration. More specifically, we propose a generative adversarial network (GAN)-based architecture with different types of discriminators and loss functions in both the image spatial and frequency domains. In our framework, we aim to learn two image generators that translate noisy images to clean images and clean images to noisy images, respectively, based on unpaired training images. Extensive experimental results on several well-known image datasets, such as CBSD68, SIDD, and the NIH-, AAPM- and Mayo Clinic-sponsored Low Dose CT Grand Challenge, have verified that the proposed deep learning model for image denoising outperforms traditional non-deep-learning-based methods and state-of-the-art deep learning-based methods both quantitatively and qualitatively.
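The abstract does not spell out the loss functions, so the following is only a minimal sketch of how a loss combining the image spatial and frequency domains might be evaluated around two generators. Everything named here is an assumption made for illustration: the functions `dft2_magnitude`, `mean_abs_diff` and `dual_domain_loss`, the weight `freq_weight`, the choice of an L1 distance on amplitude spectra, the cycle-style reconstruction check, and the toy stand-in generators `to_clean` and `to_noisy`. The adversarial (discriminator) terms are omitted entirely.

```python
import cmath


def dft2_magnitude(img):
    """Amplitude spectrum of a small 2-D image via a naive DFT (demo only, O(N^4))."""
    h, w = len(img), len(img[0])
    return [[abs(sum(img[y][x] * cmath.exp(-2j * cmath.pi * (u * y / h + v * x / w))
                     for y in range(h) for x in range(w)))
             for v in range(w)]
            for u in range(h)]


def mean_abs_diff(a, b):
    """Mean absolute (L1) difference between two equally sized 2-D arrays."""
    total = sum(abs(p - q) for row_a, row_b in zip(a, b) for p, q in zip(row_a, row_b))
    return total / (len(a) * len(a[0]))


def dual_domain_loss(pred, target, freq_weight=0.1):
    """Spatial L1 term plus a weighted L1 term on amplitude spectra (assumed form)."""
    spatial = mean_abs_diff(pred, target)
    spectral = mean_abs_diff(dft2_magnitude(pred), dft2_magnitude(target))
    return spatial + freq_weight * spectral


# Hypothetical stand-ins for the two learned generators: here "noise" is just a
# constant offset, so the two mappings are exact inverses of each other.
def to_clean(img):
    return [[v - 0.1 for v in row] for row in img]


def to_noisy(img):
    return [[v + 0.1 for v in row] for row in img]


# Unpaired toy samples: the two images need not show the same content.
noisy = [[0.2, 0.6], [0.9, 0.4]]
clean = [[0.3, 0.1], [0.7, 0.5]]

# Cycle-style check (an assumption, in the spirit of unpaired translation):
# translating to the other domain and back should reproduce the input.
cycle_loss = (dual_domain_loss(to_noisy(to_clean(noisy)), noisy)
              + dual_domain_loss(to_clean(to_noisy(clean)), clean))
print("cycle loss:", cycle_loss)
print("loss between the two unpaired samples:", dual_domain_loss(noisy, clean))
```

The naive DFT keeps the sketch dependency-free; a real training pipeline would use an FFT routine from a numerical library and differentiable tensors instead of nested lists.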