Graduate student: | 楊士民 Yang, Shih-Min |
---|---|
Thesis title: | 基於多任務學習之非監督式域適應方法 Unsupervised Multi-Task Domain Adaptation |
Advisor: | 葉梅珍 Yeh, Mei-Chen |
Degree: | Master |
Department: | Department of Computer Science and Information Engineering |
Year of publication: | 2019 |
Academic year of graduation: | 107 (ROC calendar) |
Language: | Chinese |
Number of pages: | 42 |
Chinese keywords: | 深度學習、非監督式域適應、遷移學習、多任務學習 |
English keywords: | Deep learning, Unsupervised domain adaptation, Transfer learning, Multi-task learning |
DOI URL: | http://doi.org/10.6345/NTNU201900467 |
Thesis type: | Academic thesis |
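With ever-growing amounts of richly labeled data, deep (convolutional) neural networks have achieved great success in many visual recognition tasks. Consequently, a growing body of recent work transfers these powerful features to new datasets that have few or no labels. The main challenge in such knowledge transfer is called domain shift, a phenomenon that occurs when training and test samples follow different distributions. Unsupervised domain adaptation techniques aim to handle the domain shift between the source-domain data used for training (labeled) and the target-domain data used for testing (unlabeled).
Recently, adversarial adaptation methods have attracted increasing attention; they reduce the domain discrepancy through adversarial training. For example, the I2I Adapt framework proposed by Murez et al. uses a set of loss functions to constrain the feature-extracting encoder so that the extracted features are unaffected by domain shift. These domain adaptation methods improve model generalization and mitigate the effects of domain shift, but most of them are built and evaluated in a single-task learning setting. Multi-task learning, in contrast, aims to exploit the relatedness among tasks to learn multiple tasks jointly and thereby improve the generalization performance of each task. This principle has been successfully applied, through shared parameters or feature representations, to improve prediction on multiple different but related tasks.
In this thesis, we explore unsupervised domain adaptation based on multi-task learning. This setting involves multiple tasks, each comprising images from two different distributions (a source domain and a target domain); only the source-domain images have labels. We extend the I2I Adapt framework to the multi-task setting and use the source-domain labels of multiple tasks (for example, classification and semantic segmentation) to improve its domain adaptation performance. Finally, our experimental results show that the multi-task domain adaptation setting can further mitigate the domain shift problem.
Keywords: deep learning, unsupervised domain adaptation, transfer learning, multi-task learning.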
With the availability of abundant labeled data, deep (convolutional) neural networks have shown great success in many visual recognition tasks. As such, a growing body of recent work transfers the learned, powerful features to novel datasets whose data are unlabeled or sparsely labeled. The major challenge for such knowledge transfer is a phenomenon known as domain shift, in which the training and testing examples follow different distributions. Unsupervised domain adaptation techniques aim to correct the mismatch between the source domain (with labels), on which classifiers are trained, and the target domain (without labels), to which those classifiers are applied.
Recently, adversarial adaptation methods have received increasing attention; they seek to minimize the domain discrepancy through an adversarial objective with respect to a domain discriminator. For example, the I2I (image to image) Adapt framework proposed by Murez et al. [9] uses a set of losses to constrain the features extracted by the encoder. Such methods advance the state of the art and improve generalization performance; however, they are mostly built and evaluated in a single-task learning setting. In contrast, multi-task learning (MTL) aims to learn multiple tasks jointly by exploiting their relatedness to improve the generalization performance for each task. This principle has been successfully employed to improve prediction performance on multiple different but related problems through shared parameters or representations.
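As a rough illustration of what an adversarial objective with respect to a domain discriminator can look like, the following is the generic GAN-style formulation; it is an assumption made for illustration and not necessarily the exact set of losses used by I2I Adapt or by this thesis. The symbols are hypothetical: $E$ is the encoder, $D$ is the domain discriminator, and $p_s$ and $p_t$ are the source and target image distributions.

```latex
\min_{E} \max_{D} \;
\mathbb{E}_{x_s \sim p_s}\big[\log D(E(x_s))\big]
+ \mathbb{E}_{x_t \sim p_t}\big[\log\big(1 - D(E(x_t))\big)\big]
```

The discriminator is trained to tell source features from target features, while the encoder is trained to make the two indistinguishable, which is what pushes the extracted features toward domain invariance.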
In this thesis, we investigate the problem of unsupervised multi-task domain adaptation. This formulation involves multiple related tasks, and each task consists of training images from two domains. Annotations are available only for source-domain images. The goal is to jointly learn, for all tasks, models that can be applied to unlabeled target-domain images. We extend the I2I Adapt framework to multi-task domain adaptation, leveraging the annotations available for different tasks (e.g., classification and segmentation) to improve the performance of each individual task. Finally, the experimental results show that performing domain adaptation in a multi-task learning setting can further mitigate the domain shift problem.
Keywords: Deep learning; Unsupervised domain adaptation; Transfer learning; Multi-task learning
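The multi-task setting described in the abstract can be pictured as a single training objective that sums the supervised source-domain losses of each task and adds an adversarial term. The sketch below is a minimal illustration under stated assumptions, not the thesis's implementation: the function names, the toy probabilities, the equal task weights, and the adversarial weight of 0.1 are all hypothetical.

```python
import math

def cross_entropy(probs, label):
    """Negative log-likelihood of the true class under predicted probabilities."""
    return -math.log(probs[label])

def segmentation_loss(pixel_probs, pixel_labels):
    """Mean per-pixel cross-entropy over a flattened label map."""
    losses = [cross_entropy(p, y) for p, y in zip(pixel_probs, pixel_labels)]
    return sum(losses) / len(losses)

def encoder_adversarial_loss(source_prob_on_target):
    """Encoder-side adversarial loss (hypothetical form): small when the
    discriminator assigns a high 'source' probability to target features."""
    return -math.log(source_prob_on_target)

def total_loss(task_losses, task_weights, adv_loss, adv_weight):
    """Weighted sum of supervised source-domain task losses plus the adversarial term."""
    supervised = sum(task_weights[name] * loss for name, loss in task_losses.items())
    return supervised + adv_weight * adv_loss

# Toy values standing in for network outputs on labeled source images.
task_losses = {
    "classification": cross_entropy([0.7, 0.2, 0.1], 0),
    "segmentation": segmentation_loss([[0.9, 0.1], [0.4, 0.6]], [0, 1]),
}
task_weights = {"classification": 1.0, "segmentation": 1.0}  # hypothetical weights

# Toy discriminator output on target-domain features (0.5 means fully confused).
adv = encoder_adversarial_loss(0.5)

loss = total_loss(task_losses, task_weights, adv, adv_weight=0.1)
print(f"total loss: {loss:.4f}")
```

In an actual system the per-task losses would come from task-specific heads on top of a shared encoder, so that gradients from every task's source labels shape the same feature representation.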
[0] DENG, Jia, et al. Imagenet: A large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition. IEEE, 2009. p. 248-255.
[1] KRIZHEVSKY, Alex; SUTSKEVER, Ilya; HINTON, Geoffrey E. Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems. 2012. p. 1097-1105.
[2] RUSSAKOVSKY, Olga, et al. Imagenet large scale visual recognition challenge. International journal of computer vision, 2015, 115.3: 211-252.
[3] HE, Kaiming, et al. Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2016. p. 770-778.
[4] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, et al. Imagenet large scale visual recognition challenge
[5] Ali Sharif Razavian, Hossein Azizpour, Josephine Sullivan, and Stefan Carlsson. Cnn features off-the-shelf: An astounding baseline for recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014.
[6] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A Efros. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1125–1134, 2017.
[7] Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 2223–2232, 2017.
[8] Ming-Yu Liu, Thomas Breuel, and Jan Kautz. Unsupervised image-to-image translation networks. In Advances in Neural Information Processing Systems, pages 700–708, 2017.
[9] Zak Murez, Soheil Kolouri, David Kriegman, Ravi Ramamoorthi, and Kyungnam Kim. Image to image translation for domain adaptation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4500–4509, 2018.
[10] A. Gretton, AJ. Smola, J. Huang, M. Schmittfull, KM. Borgwardt, and B. Schölkopf. Covariate shift and local learning by distribution matching, pages 131–160. MIT Press, Cambridge, MA, USA, 2009.
[11] YUILLE, Alan L.; LIU, Chenxi. Deep Nets: What have they ever done for Vision? arXiv preprint arXiv:1805.04025, 2018.
[12] PENG, Xingchao, et al. Visda: The visual domain adaptation challenge. arXiv preprint arXiv:1710.06924, 2017.
[13] Sebastian Ruder. An overview of multi-task learning in deep neural networks. arXiv preprint arXiv:1706.05098, 2017.
[14] KENDALL, Alex; GAL, Yarin; CIPOLLA, Roberto. Multi-task learning using uncertainty to weigh losses for scene geometry and semantics. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018. p. 7482-7491.
[15] Shai Ben-David, John Blitzer, Koby Crammer, and Fernando Pereira. Analysis of representations for domain adaptation. In Advances in neural information processing systems, pages 137–144, 2007.
[16] A. Gretton, AJ. Smola, J. Huang, M. Schmittfull, KM. Borgwardt, and B. Schölkopf. Covariate shift and local learning by distribution matching, pages 131–160. MIT Press, Cambridge, MA, USA, 2009.
[17] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Advances in neural information processing systems, pages 2672–2680, 2014.
[18] Yaniv Taigman, Adam Polyak, and Lior Wolf. Unsupervised cross-domain image generation. arXiv preprint arXiv:1611.02200, 2016.
[19] Judy Hoffman, Eric Tzeng, Taesung Park, Jun-Yan Zhu, Phillip Isola, Kate Saenko, Alexei A Efros, and Trevor Darrell. Cycada: Cycle-consistent adversarial domain adaptation. arXiv preprint arXiv:1711.03213, 2017.
[20] Yaroslav Ganin and Victor Lempitsky. Unsupervised domain adaptation by backpropagation. arXiv preprint arXiv:1409.7495, 2014.
[21] Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, and Victor Lempitsky. Domain-adversarial training of neural networks. Journal of Machine Learning Research, 17(1):2096–2030, 2016.
[22] Eric Tzeng, Judy Hoffman, Trevor Darrell, and Kate Saenko. Simultaneous deep transfer across domains and tasks. In Proceedings of the IEEE International Conference on Computer Vision, pages 4068–4076, 2015.
[23] Judy Hoffman, Dequan Wang, Fisher Yu, and Trevor Darrell. Fcns in the wild: Pixel-level adversarial and constraint-based adaptation. arXiv preprint arXiv:1612.02649, 2016.
[24] Eric Tzeng, Judy Hoffman, Kate Saenko, and Trevor Darrell. Adversarial discriminative domain adaptation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7167–7176, 2017.
[25] LONG, Jonathan; SHELHAMER, Evan; DARRELL, Trevor. Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2015. p. 3431-3440.
[26] MAO, Xudong, et al. Least squares generative adversarial networks. In: Proceedings of the IEEE International Conference on Computer Vision. 2017. p. 2794-2802.
[27] HE, Kaiming, et al. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In: Proceedings of the IEEE international conference on computer vision. 2015. p. 1026-1034.
[28] KINGMA, Diederik P.; BA, Jimmy. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
[29] Maaten, Laurens van der, and Geoffrey Hinton. "Visualizing data using t-SNE." Journal of machine learning research 9.Nov (2008): 2579-2605.