






Pedestrian Recognition Algorithm for Cross-Modal Images Based on Generalized Transfer Deep Learning
CAI Xianlong, LI Yang, CHEN Xi
(School of Information Engineering, Xi'an Mingde Institute of Technology, Xi'an 710124, China)
Abstract: Changes in lighting conditions, differences in pedestrian height and similar factors cause large cross-modal differences between surveillance video images captured at different times. To accurately identify pedestrians across such cross-modal images, a pedestrian recognition algorithm based on generalized transfer deep learning is proposed. Cross-modal images are generated with a CycleGAN (Cycle Generative Adversarial Network). The reference image is segmented with single-target image processing to obtain candidate human-body regions; matching regions are searched for in the matching image to obtain the disparity of each human-body region, and the depth and perspective features of the region are then extracted from this disparity. An attention mechanism is combined with cross-modal pedestrian recognition to analyze the differences between the two image types, and the two subspaces are mapped into a single shared feature space. A generalized transfer deep learning algorithm is introduced for metric learning of the loss function and for automatic screening of pedestrian features in the cross-modal images; finally, a modality fusion module fuses the screened features to complete pedestrian recognition. Experimental results show that the proposed algorithm can quickly and accurately extract pedestrians from images of different modalities and achieves good recognition performance.
Key words: generalized transfer deep learning; cross-modal images; pedestrian recognition; feature extraction
CLC number: TP311    Document code: A
0 Introduction
Single-modality pedestrian recognition in poorly lit environments cannot meet the recognition performance expected in related application fields, so deep learning techniques have been applied to pedestrian recognition [1-2] and have achieved high recognition rates on the corresponding data sets. However, the pronounced difference in illumination between day and night makes cross-modal pedestrian recognition a major challenge. Many studies on cross-modal pedestrian recognition have been reported. For example, Wang Liuyang et al. [3] first built a dual-modality feature extraction network, used it to extract deep image features, enhanced all of the extracted features, and then fused all pixel information of the images to complete pedestrian recognition. ……
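To make the pipeline outlined in the abstract more concrete, the following is a minimal PyTorch sketch of a two-branch cross-modal network: modality-specific encoders map the two image types into one shared feature space, a channel-attention module screens (re-weights) pedestrian features, and a fusion module combines them before metric learning. All module names, layer sizes, and the triplet loss shown here are illustrative assumptions standing in for the paper's generalized transfer metric-learning loss, not the authors' implementation.

```python
# Minimal sketch (assumed architecture, not the authors' code):
# two modality branches -> shared feature space -> attention screening -> fusion.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style attention used to re-weight (screen) channel features."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.fc(x.mean(dim=(2, 3)))            # global average pool -> channel weights
        return x * w.unsqueeze(-1).unsqueeze(-1)   # re-weight the feature maps


class CrossModalReID(nn.Module):
    """Two modality-specific branches projected into one shared space, then fused."""
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                ChannelAttention(128),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
        self.branch_a = branch()      # e.g. well-lit / visible modality
        self.branch_b = branch()      # e.g. low-light or CycleGAN-generated modality
        self.shared = nn.Linear(128, feat_dim)           # projection into the shared space
        self.fusion = nn.Linear(2 * feat_dim, feat_dim)  # modality fusion module

    def forward(self, img_a: torch.Tensor, img_b: torch.Tensor) -> torch.Tensor:
        f_a = self.shared(self.branch_a(img_a))
        f_b = self.shared(self.branch_b(img_b))
        return self.fusion(torch.cat([f_a, f_b], dim=1))  # fused pedestrian descriptor


if __name__ == "__main__":
    # Triplet loss is shown only as a common re-identification metric-learning stand-in.
    model = CrossModalReID()
    anchor = model(torch.randn(4, 3, 128, 64), torch.randn(4, 3, 128, 64))
    positive = model(torch.randn(4, 3, 128, 64), torch.randn(4, 3, 128, 64))
    negative = model(torch.randn(4, 3, 128, 64), torch.randn(4, 3, 128, 64))
    loss = nn.TripletMarginLoss(margin=0.3)(anchor, positive, negative)
    print(loss.item())
```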