Design and Implementation of Intelligent Retail Cabinet Based on Double Neural Network Model
ZENG Min1, WU Sheng-jian2, LI Fang1, CHEN Zhi1
(1. Dept. of Communication and Information Engineering, Shanghai Technical Institute of Electronics & Information, Shanghai 201411, China; 2. FinVolution Group, Shanghai 201203, China)
Abstract: In recent years, image recognition based on deep learning models has become the main solution for intelligent retail cabinets. This paper presents a new intelligent retail cabinet system based on a dual neural network model. Compared with a single-model design, the system significantly improves detection recall and classification accuracy, and it also greatly reduces the model retraining time needed when new product varieties are added. First, a Faster RCNN model performs detection and coarse classification of products by packaging type, to improve detection recall; second, a ResNet50 model performs fine classification of products by variety, to improve classification accuracy. In addition, several data augmentation ablation experiments were carried out on the hardest-to-classify variety set, to improve classification accuracy on the coarse-category dataset to which that set belongs.
Key words: deep learning; image detection; image classification; intelligent retail cabinet; neural network model
CLC number: TP181    Document code: A
Article ID: 1009-3044(2021)26-0009-05
Open Science (Resource Service) Identifier (OSID):
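In recent years, unmanned retail, a convenient new retail format, has developed rapidly in many Chinese cities. A report by the Qianzhan Industry Research Institute on business model innovation and investment opportunities in China's new retail industry forecast that unmanned retail would reach 245 million users in 2022, with transaction volume exceeding 1.8 trillion yuan [1]. This rapid growth has been driven by the development and convergence of several technologies, in particular the spread of mobile payment and the practical deployment of artificial intelligence, cloud computing, and other advanced technologies [2].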
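At present, unattended retail cabinets in China are implemented in four technical forms [3,5]: (1) Mechanical vending machines, represented by Ubox (友寶). These appeared early, are technically simple, and are mature products, but they are expensive to manufacture and the purchase process is relatively cumbersome. (2) RFID (Radio Frequency Identification) retail cabinets, represented by MissFresh (每日優鮮). The technology is mature and its market share is high, but RFID tags are also costly to produce. (3) Gravity-sensing retail cabinets, represented by JD Daojia (京東到家). These identify the category and price of goods by weight; goods can be placed freely and space utilization is high, but the weighing sensors must be highly sensitive. (4) Visual recognition retail cabinets, represented by DeepBlue (深蘭) and Gouya (購呀). These rely mainly on image recognition, can adapt to complex and varied consumption scenarios, and represent the future direction of intelligent retail [6].
Visual recognition cabinets are further divided into dynamic and static types. DeepBlue is known for its 3D dynamic vision technology, and its TakeGo is similar to Amazon Go. Raising the recognition rate of a dynamic system requires not only a larger neural network model but also error-correction algorithms to reduce recognition errors caused by behaviors such as a user taking several items with one hand. Its equipment cost and computational load are both higher than those of static recognition, which makes it hard to scale up. Gouya currently focuses on static recognition cabinets, whose equipment is simple, low in cost, and easy to scale [3-4]. The technical difficulty of such low-cost unattended cabinets, however, is how to raise the detection recall and classification accuracy for the goods on sale.
To this end, this paper designs a new intelligent retail cabinet system based on a dual neural network model. Its sales process is shown in Figure 1: the customer scans a code with a mobile phone to open the door and takes the goods; after the door is closed, the system recognizes the goods and settles the payment. With limited hardware support, the system aims to use the dual neural network model to bring the detection recall and classification accuracy for its goods up to the level required for commercial deployment.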
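The two-stage recognition and the settle-after-closing flow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the detector and fine classifiers are plain callables standing in for Faster RCNN and ResNet50, and the use of one fine classifier per packaging class, the before/after count comparison used for billing, and all names (`Detection`, `recognize`, `settle`, the labels and prices) are assumptions made for illustration only.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


@dataclass
class Detection:
    box: Box
    coarse_class: str  # packaging type, e.g. "bottle" (hypothetical label)


# Stage 1 stand-in: returns boxes tagged with a coarse (packaging) class.
Detector = Callable[[object], List[Detection]]
# Stage 2 stand-in: maps (image, box) to a variety name.
FineClassifier = Callable[[object, Box], str]


def recognize(image, detector: Detector,
              fine_classifiers: Dict[str, FineClassifier]) -> Counter:
    """Count items per variety in one shelf image using two stages."""
    counts = Counter()
    for det in detector(image):
        # Assumption: each coarse class has its own fine classifier.
        classify = fine_classifiers[det.coarse_class]
        counts[classify(image, det.box)] += 1
    return counts


def settle(before: Counter, after: Counter, prices: Dict[str, float]):
    """Bill items present before the door opened but missing afterwards."""
    taken = before - after  # Counter subtraction keeps positive counts only
    total = sum(prices[sku] * n for sku, n in taken.items())
    return dict(taken), total


# Toy stand-ins: an "image" is just a list of (box, coarse_class) pairs.
def fake_detector(image):
    return [Detection(box, coarse) for box, coarse in image]


fake_fine = {
    "bottle": lambda img, box: "cola" if box[0] < 100 else "water",
    "bag": lambda img, box: "chips",
}

before_img = [((0, 0, 50, 100), "bottle"),
              ((120, 0, 170, 100), "bottle"),
              ((200, 0, 260, 80), "bag")]
after_img = before_img[1:]  # the first bottle was taken

before = recognize(before_img, fake_detector, fake_fine)
after = recognize(after_img, fake_detector, fake_fine)
taken, total = settle(before, after, {"cola": 3.5, "water": 2.0, "chips": 6.0})
print(taken, total)
```

In a real system the two callables would wrap trained models. One possible benefit of routing each detection to a per-class classifier is that adding a new variety would only touch the classifier for its packaging class, but whether the paper's reduction in retraining time arises this way is not stated in this section.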