張建軍 趙小明 何亞東 文虹茜 卿粼波



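Abstract: Visual emotion analysis aims to analyze people's emotional responses to visual stimuli, and in recent years it has drawn attention from fields that deal with multimedia visual data, such as sharing platforms and social networks. Traditional image emotion analysis focuses on single-label emotion classification; it overlooks the complexity of the emotions an image conveys and the latent emotion distribution information in the image, and therefore cannot reflect the correlations among the different emotions an image expresses. To address these problems, ViT and ResNet networks are first used to extract multi-scale emotional features that fuse global and local information; image emotion is then recognized through dominant emotion classification and label distribution learning, so that the complex emotions of the image are fully represented. The method achieves significant results on the public Flickr_LDL and Twitter_LDL datasets, demonstrating its effectiveness.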
Keywords: Visual emotion analysis; Deep learning; Label distribution learning; Image emotion
CLC number: TP391.4  Document code: A  DOI: 10.19907/j.0490-6756.2023.043002
Image emotion distribution learning based on multi-scale feature fusion
ZHANG Jian-Jun¹, ZHAO Xiao-Ming¹, HE Ya-Dong¹, WEN Hong-Qian², QING Lin-Bo²
(1. CHN ENERGY Dadu River Dagangshan Power Generation Co., Ltd., Ya'an 625409, China;
2. College of Electronics and Information Engineering, Sichuan University, Chengdu 610065, China)
Abstract: Visual emotion analysis aims to analyze the emotional response of human beings to visual stimuli, and in recent years it has attracted attention from fields related to multimedia visual data, such as sharing platforms and social networking. Traditional image emotion analysis focuses on single-label emotion classification. It ignores the complexity of the emotions expressed in pictures and the latent emotion distribution information of images, and so fails to reflect the correlation between the different emotions a picture expresses. To solve these problems, ViT and ResNet networks are used to extract multi-scale emotional features that fuse global and local information, and the label distribution learning method is used for image emotion prediction. Significant results are achieved on the publicly available Flickr_LDL and Twitter_LDL datasets, which demonstrates the effectiveness of the proposed method.
Keywords: Visual emotion analysis; Deep learning; Label distribution learning; Image emotion
1 Introduction
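Understanding the emotional expression implicit in the contours and colors of images has long attracted attention in art and psychology. With the development of the Internet, visual emotion analysis has become an important topic in computer vision [1,2], with applications in many areas such as aesthetic analysis, intelligent advertising, and social media public opinion monitoring [3-6]. To analyze the emotion an image expresses, the image must first be annotated with emotions; its features are then extracted with hand-crafted or deep learning methods, the emotions are recognized and categorized, and further analysis is carried out on that basis. Most current methods ignore the emotion distribution information implicit in images, and how to extract the emotional features of an image effectively also remains a pressing open problem.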
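To make the notion of an emotion distribution concrete, the following minimal Python sketch contrasts it with a single dominant label. It is an illustration under stated assumptions, not the procedure used in this paper: the eight emotion names, the vote counts, and the helper functions `votes_to_distribution` and `dominant_emotion` are all hypothetical.

```python
# Illustrative sketch only: the emotion categories and vote counts are assumptions,
# not values taken from Flickr_LDL or Twitter_LDL.
EMOTIONS = ["amusement", "awe", "contentment", "excitement",
            "anger", "disgust", "fear", "sadness"]


def votes_to_distribution(votes):
    """Normalize per-emotion annotator votes into a probability distribution."""
    total = sum(votes)
    if total == 0:
        raise ValueError("at least one vote is required")
    return [v / total for v in votes]


def dominant_emotion(distribution):
    """Return the emotion with the highest probability (the single-label view)."""
    best = max(range(len(distribution)), key=lambda i: distribution[i])
    return EMOTIONS[best]


# Hypothetical votes from 11 annotators for one image.
votes = [1, 5, 3, 2, 0, 0, 0, 0]
dist = votes_to_distribution(votes)

print("dominant label:", dominant_emotion(dist))
print("distribution:", dict(zip(EMOTIONS, (round(p, 2) for p in dist))))
```

A single-label view keeps only the dominant emotion, whereas the distribution also records how much weight annotators gave to the other emotions, which is the information that single-label classification discards.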
Visual feature extraction is an important part of image emotion recognition [1]. Traditional visual emotion recognition uses low-level……