Deep Learning and Metric Learning Based Person Re-identification
HOU Li, LIU Qi
(School of Information Engineering, Huangshan University, Huangshan, Anhui 245041, China)
【Abstract】Pedestrians can vary greatly in appearance across cameras because of differences in illumination, viewpoint, and pose, which poses serious challenges for person re-identification. This paper proposes a person re-identification method based on deep learning and metric learning. Pedestrian image features are first extracted by a feature fusion net (FFN) that combines hand-crafted and deep features, and a kernel matrix is then applied to KISSME distance metric learning to obtain a better distance metric model. Simulation experiments on two challenging public datasets, VIPeR and PRID450S, demonstrate the effectiveness of the proposed algorithm.
【Key words】Person re-identification; Feature fusion net; Deep learning; Distance metric learning
CLC number: TP391.41 Document code: A Article ID: 2095-2457(2019)29-0112-002
DOI:10.19694/j.cnki.issn2095-2457.2019.29.051
0 Introduction
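Person re-identification is an intelligent video analysis technique of considerable research value for cross-camera pedestrian tracking and pedestrian behavior analysis. It asks a computer to decide whether pedestrian images captured by different cameras show the same identity, matching images across cameras by the pedestrian's appearance. Because surveillance scenes vary widely and a pedestrian's appearance changes in complex ways across cameras, person re-identification remains highly challenging.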
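Current research on person re-identification concentrates on two directions: extracting discriminative features to describe pedestrian appearance [1-11], and developing discriminative distance metric learning methods [12-18]. Most hand-crafted features (color, texture, shape, etc.), however, are either insufficiently discriminative or not robust to viewpoint changes when matching pedestrians across cameras. Deep features compensate for these shortcomings to some extent, but a good feature model can only be obtained through supervised learning on a large number of samples. Distance metric learning mitigates cross-camera appearance differences to some extent, yet with limited training data it may fail to produce a good cross-camera distance metric.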
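To better cope with the significant appearance changes of pedestrians across cameras, this paper combines deep learning and metric learning for person re-identification; the algorithm flow is shown in Fig. 1. First, the feature fusion net (FFN), which fuses hand-crafted and deep features, extracts discriminative features from the pedestrian training samples. A kernel matrix K is then applied to KISSME distance metric learning to obtain a better distance metric model, improving the accuracy and robustness of person re-identification.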
Fig. 1 Algorithm flow
1 Discriminative feature extraction
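To describe pedestrian appearance more accurately, this paper extracts pedestrian image features with the feature fusion net (FFN), which combines hand-crafted and deep features [3], as shown in Fig. 2. FFN consists of two sub-networks. The first processes the input pedestrian image with a conventional CNN (convolution, pooling, activation functions); the second represents the same image with additional hand-crafted features (RGB, HSV, LAB, YCbCr, and YIQ color features and Gabor texture features). Together the two sub-networks give a fuller description of the pedestrian image, and during feature learning the second sub-network steers the learning direction of the first. Finally, the fusion layer outputs a 4096-dimensional FFN feature vector.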
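The fusion step can be pictured with a minimal NumPy sketch. It is not the trained FFN of [3]: the CNN output is replaced by a random stand-in vector, the hand-crafted branch is reduced to a per-channel color histogram, and the fusion weights `W` and `b` are random. The function names `color_histogram` and `fuse`, the 64-dimensional CNN stand-in, and the 32-dimensional output (4096 in the paper) are all assumptions made for illustration.

```python
import numpy as np

def color_histogram(image, bins=8):
    """Hand-crafted stand-in: per-channel intensity histogram, each normalised to sum to 1."""
    feats = []
    for c in range(image.shape[2]):
        hist, _ = np.histogram(image[:, :, c], bins=bins, range=(0, 256))
        feats.append(hist / max(hist.sum(), 1))
    return np.concatenate(feats)

def fuse(cnn_feat, hand_feat, W, b):
    """Assumed fusion layer: ReLU(W @ [cnn_feat; hand_feat] + b)."""
    joint = np.concatenate([cnn_feat, hand_feat])
    return np.maximum(W @ joint + b, 0.0)

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(128, 48, 3))  # stand-in for an RGB pedestrian crop
cnn_feat = rng.standard_normal(64)               # stand-in for the CNN sub-network output
hand_feat = color_histogram(image)               # 3 channels x 8 bins = 24 values

out_dim = 32                                     # 4096 in the paper; kept small here
W = 0.1 * rng.standard_normal((out_dim, cnn_feat.size + hand_feat.size))
b = np.zeros(out_dim)

fused = fuse(cnn_feat, hand_feat, W, b)
print(fused.shape)
```

In a trained network the weights would be learned jointly, which is how the hand-crafted branch can influence what the CNN branch learns; the sketch only shows the data flow into a single fused vector.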
Fig. 2 Illustration of FFN feature extraction[3]
2 Kernel distance metric learning
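To mitigate cross-camera appearance changes, the pedestrian matching stage uses a kernel-trick-based KISSME [12] distance metric learning method to obtain an optimal Mahalanobis distance metric model.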
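Given a pair of samples $(x_i, x_j)$, their Mahalanobis distance is defined in Eq. (1):
$$d_M^2(x_i, x_j) = (x_i - x_j)^T M (x_i - x_j) \tag{1}$$
where $M = \Sigma_S^{-1} - \Sigma_D^{-1}$ is the positive semidefinite Mahalanobis distance matrix, which is easily learned from the training samples. $\Sigma_S = \frac{1}{|S|}\sum_{(x_i, x_j) \in S}(x_i - x_j)(x_i - x_j)^T$ and $\Sigma_D = \frac{1}{|D|}\sum_{(x_i, x_j) \in D}(x_i - x_j)(x_i - x_j)^T$ denote the covariance matrices of the similar pedestrian image pairs $S$ and the dissimilar pairs $D$, respectively.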
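Eq. (1) and the two covariance matrices translate almost directly into NumPy. The following is a minimal sketch on synthetic data, not the authors' implementation: the function names, the small `ridge` term added to keep the covariances invertible, and the toy two-camera data are assumptions.

```python
import numpy as np

def pair_covariance(X_a, X_b):
    """(1/n) * sum of (x_i - x_j)(x_i - x_j)^T over the given pairs (rows)."""
    diff = X_a - X_b
    return diff.T @ diff / diff.shape[0]

def kissme(X_a, X_b, same, ridge=1e-6):
    """Return M = inv(Sigma_S) - inv(Sigma_D).

    X_a, X_b: (n_pairs, d) arrays; same: boolean array, True for similar pairs.
    ridge is an assumed regulariser, not part of the formula in the text.
    """
    d = X_a.shape[1]
    sigma_s = pair_covariance(X_a[same], X_b[same]) + ridge * np.eye(d)
    sigma_d = pair_covariance(X_a[~same], X_b[~same]) + ridge * np.eye(d)
    return np.linalg.inv(sigma_s) - np.linalg.inv(sigma_d)

def mahalanobis_sq(X_a, X_b, M):
    """Squared distance of Eq. (1) for every row pair."""
    diff = X_a - X_b
    return np.einsum("nd,de,ne->n", diff, M, diff)

# Toy data: each identity has a centre; each camera adds its own noise.
rng = np.random.default_rng(0)
n_ids, d = 200, 5
centers = 3.0 * rng.standard_normal((n_ids, d))
cam_a = centers + 0.3 * rng.standard_normal((n_ids, d))
cam_b = centers + 0.3 * rng.standard_normal((n_ids, d))
perm = np.roll(np.arange(n_ids), 1)  # pairs every identity with a different one

X_a = np.vstack([cam_a, cam_a])
X_b = np.vstack([cam_b, cam_b[perm]])
same = np.concatenate([np.ones(n_ids, dtype=bool), np.zeros(n_ids, dtype=bool)])

M = kissme(X_a, X_b, same)
mean_same = mahalanobis_sq(cam_a, cam_b, M).mean()
mean_diff = mahalanobis_sq(cam_a, cam_b[perm], M).mean()
print(mean_same, mean_diff)
```

On this toy data the learned metric assigns much smaller distances to similar pairs than to dissimilar ones. In general the difference of two inverse covariances is not guaranteed to be positive semidefinite; a common remedy, not shown here, is to clip the negative eigenvalues of $M$.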
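The kernel trick maps sample feature vectors from the input feature space into a high-dimensional kernel space. The kernel matrix K is obtained by evaluating a kernel function on pairs of sample feature vectors, i.e. $K = \Phi^T(X)\Phi(X)$, where $X$ denotes the sample features and $\Phi(X)$ the nonlinear mapping from the input feature space to the kernel space. Introducing the kernel function avoids the "curse of dimensionality" and can greatly reduce computation; the performance of the algorithm can also be improved by freely choosing a suitable kernel function.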
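A minimal sketch of how such a kernel matrix might be computed is given below. The text does not state which kernel is used, so the RBF kernel, its `gamma` value, and the function names are assumptions; samples are stored as rows, so the linear kernel $\Phi^T(X)\Phi(X)$ with identity mapping becomes `X @ X.T`.

```python
import numpy as np

def linear_kernel(X, Y):
    """K[i, j] = <X[i], Y[j]>, i.e. the kernel with the identity mapping."""
    return X @ Y.T

def rbf_kernel(X, Y, gamma):
    """K[i, j] = exp(-gamma * ||X[i] - Y[j]||^2) for row-wise samples."""
    sq = (X ** 2).sum(axis=1)[:, None] + (Y ** 2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))  # clamp tiny negative round-off

rng = np.random.default_rng(0)
X_train = rng.standard_normal((6, 4096))  # 6 training samples, 4096-D features
X_test = rng.standard_normal((3, 4096))   # 3 test samples

gamma = 1.0 / X_train.shape[1]            # assumed bandwidth heuristic
K_train = rbf_kernel(X_train, X_train, gamma)  # shape (6, 6)
K_test = rbf_kernel(X_test, X_train, gamma)    # shape (3, 6)
print(K_train.shape, K_test.shape)
```

The size of `K_train` depends only on the number of training samples, not on the 4096-dimensional feature length, which is where the computational saving comes from when samples are few. One possible way to combine this with KISSME, assumed here rather than stated in the text, is to treat each row of `K_train` or `K_test` as the kernelised representation of a sample and learn $M$ on those vectors.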
3 Experimental results
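This paper uses two challenging public datasets, VIPeR and PRID450S, to evaluate the cumulative matching characteristic (CMC) of the proposed person re-identification algorithm. Half of the pedestrian identities are randomly selected as the training set and the other half as the test set. The training samples are used to learn the distance metric model, and the test samples are used to measure feature distances between cross-camera pedestrian images.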
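The evaluation protocol above can be sketched as follows. This is an illustrative reading, not the authors' evaluation code: the function names are hypothetical, the identity count of 632 is an assumed example value, and the distance matrix is a hand-made toy in which the true match of probe `i` is gallery item `i`.

```python
import numpy as np

def split_identities(n_ids, rng):
    """Randomly assign half of the identities to training and the rest to testing."""
    perm = rng.permutation(n_ids)
    half = n_ids // 2
    return perm[:half], perm[half:]

def cmc(dist):
    """Cumulative matching characteristic.

    dist[i, j] is the distance between probe i and gallery item j, and the
    true match of probe i is gallery item i. curve[k - 1] is the fraction of
    probes whose true match appears among the k nearest gallery items.
    """
    n = dist.shape[0]
    true_d = np.diag(dist)[:, None]
    ranks = (dist < true_d).sum(axis=1)      # 0-based rank of the true match
    hits = np.bincount(ranks, minlength=n)
    return np.cumsum(hits) / n

rng = np.random.default_rng(0)
train_ids, test_ids = split_identities(632, rng)  # 632 is an assumed identity count

# Toy 3 x 3 distance matrix: probe 0 is matched at rank 1, probes 1 and 2 at rank 3.
dist = np.array([[0.1, 0.5, 0.9],
                 [0.2, 0.4, 0.3],
                 [0.7, 0.6, 0.8]])
curve = cmc(dist)
print(curve)  # rank-1 and rank-2 scores are 1/3, rank-3 score is 1.0
```

Reported rank-1, rank-5, rank-10, and rank-20 scores correspond to `curve[0]`, `curve[4]`, `curve[9]`, and `curve[19]` of a curve computed on the test half.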
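Table 1 and Fig. 3 give the experimental results on the VIPeR and PRID450S datasets. Table 1 shows that, with the same FFN features, recognition rates are higher on PRID450S: the rank-1 recognition rate is only 26.9% on VIPeR but 49.33% on PRID450S.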
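Table 1 Top recognition rates (%) on the VIPeR and PRID450S datasets. Cumulative matching scores at ranks 1, 5, 10, and 20 are listed.
Fig. 3 Top recognition rates (%) on the VIPeR and PRID450S datasets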
4 Conclusion
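This paper proposes a person re-identification algorithm based on deep learning and metric learning. The feature fusion net (FFN), which fuses deep and hand-crafted features, extracts pedestrian image features, and a kernel matrix K is applied to KISSME distance metric learning to obtain a better distance metric model. Experimental results on two challenging person re-identification datasets, VIPeR and PRID450S, demonstrate the effectiveness of the proposed algorithm.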
【References】
[1]S. Liao, Y. Hu, X. Zhu, and S. Z. Li, “Person re-identification by local maximal occurrence representation and metric learning,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, Massachusetts, USA, 2015.6.7-2015.6.12.
[2]T. Xiao, H. Li, W. Ouyang, and X. Wang, “Learning deep feature representations with domain guided dropout for person re-identification,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, Nevada, USA, 2016.6.26-2016.7.1.
[3]S. Wu, Y. C. Chen, X. Li, A. C. Wu, J. J. You, and W. S. Zheng, “An enhanced deep feature representation for person re-identification,” IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Placid, New York, USA, 2016.3.7-2016.3.9.
[4]D. Cheng, Y. Gong, S. Zhou, J. Wang, and N. Zheng, “Person re-identification by multi-channel parts-based CNN with improved triplet loss function,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, Nevada, USA, 2016.6.26-2016.7.1.
[5]Y. Chen, X. Zhu, and S. Gong, “Person re-identification by deep learning multi-scale representations,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, Hawaii, USA, 2017.7.21-2017.7.26.
[6]H. Zhao, et al., “Spindle net: Person re-identification with human body region guided feature decomposition and fusion,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, Hawaii, USA, 2017.7.21-2017.7.26.
[7]X. Liu, et al., “Hydraplus-net: Attentive deep features for pedestrian analysis,” IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017.10.22-2017.10.29.
[8]Y. Sun, et al., “Beyond part models: Person retrieval with refined part pooling (and a strong convolutional baseline),” European Conference on Computer Vision (ECCV), Munich, Germany, 2018.9.8-2018.9.14.
[9]L. Zhao, et al., “Deeply-learned part-aligned representations for person re-identification,” IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017.10.22-2017.10.29.
[10]L. He, et al., “Deep spatial feature reconstruction for partial person re-identification: Alignment-free approach,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, Utah, USA, 2018.6.18-2018.6.22.
[11]X. Chang, et al., “Multi-level factorisation net for person re-identification,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, Utah, USA, 2018.6.18-2018.6.22.
[12]M. Koestinger, et al., “Large scale metric learning from equivalence constraints,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, Rhode Island, USA, 2012.6.16-2012.6.21.
[13]S. Pedagadi, et al., “Local fisher discriminant analysis for pedestrian re-identification,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Portland, Oregon, USA, 2013.6.23-2013.6.28.
[14]F. Xiong, M. Gou, O. Camps, and M. Sznaier, “Person re-identification using kernel-based metric learning methods,” European Conference on Computer Vision (ECCV), Zurich, Switzerland, 2014.9.6-2014.9.12.
[15]S. Paisitkriangkrai, C. Shen, A. Hengel, “Learning to rank in person re-identification with metric ensembles,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, Massachusetts, USA, 2015.6.7-2015.6.12.
[16]Y. Yang, S. Liao, Z. Lei, S. Z. Li, “Large scale similarity learning using similar pairs for person verification,” AAAI Conference on Artificial Intelligence (AAAI), Phoenix, Arizona, USA, 2016.2.12-2016.2.17.
[17]L. Hou, K. Han, W. G. Wan, J-N Hwang, H. Y. Yao, “Normalized Distance Aggregation of Discriminative Features for Person Re-identification,” Journal of Electronic Imaging, 2018, 27(2): 023006.
[18]X. Yang, M. Wang, and D. Tao, “Person re-identification with metric learning using privileged information,” IEEE Transactions on Image Processing, 2018, 27(2): 791-805.