Vision pre-positioning method for litchi picking robot under large field of view
Chen Yan, Wang Jiasheng, Zeng Zeqin, Zou Xiangjun※, Chen Mingyou
(1. College of Engineering, South China Agricultural University, Guangzhou 510642, China; 2. Key Laboratory of Key Technology on Agricultural Machine and Equipment, Ministry of Education, South China Agricultural University, Guangzhou 510642, China)
When picking litchi, a robot needs the spatial positions of multiple target litchi clusters to plan an efficient motion trajectory. This paper studied a vision pre-positioning method for the litchi picking robot under a large field of view. First, litchi images were captured with a binocular camera; then the original YOLOv3 network was improved to design the YOLOv3-DenseNet34 litchi cluster detection network, and a litchi cluster pairing method based on the same-row order-consistency constraint was proposed; finally, the spatial coordinates of each litchi cluster were computed by the triangulation principle of binocular stereo vision. Experimental results show that the YOLOv3-DenseNet34 network improved both the detection accuracy and the detection speed of litchi clusters, with a mean average precision (mAP) of 0.943 and an average detection speed of 22.11 frames/s. At a detection distance of 3 m, the maximum absolute error of the binocular pre-positioning was 36.602 mm, the mean absolute error was 23.007 mm, and the mean relative error was 0.836%, which meets the vision pre-positioning requirements of picking robots under a large field of view and provides a reference for the vision pre-positioning of other fruit and vegetable picking under a large field of view.
robots; image processing; object detection; litchi picking; large field of view; convolutional neural network; stereo vision
Developing litchi picking robots to automate litchi harvesting is an important way to address the low level of automation in domestic litchi picking operations. The vision system is a key component of the litchi picking robot [1], and machine-vision-based positioning technology has been widely applied in agriculture in recent years [2-4]. The visual positioning algorithm is the core of the vision system, and its performance directly affects the picking efficiency and quality of the robot. Research on visual positioning for litchi picking is therefore of great significance.
Professor Zou Xiangjun's team at South China Agricultural University has carried out extensive research on vision-based litchi picking robots [5-8] and proposed litchi segmentation methods for natural environments [9-12]. Many other domestic researchers have also studied the recognition and positioning of various fruits for picking [13-16]. However, the above studies are all based on small fields of view containing only one or two litchi clusters. Drawing on foreign experience in fruit and vegetable picking [17-18], pre-positioning the overall fruit distribution on a litchi tree before the robot reaches its working range can guide the robot to the picking position, after which the picking points are located precisely, thereby improving picking efficiency. A large field of view means that the camera covers a wide area, but under this condition multiple litchi clusters appear in the camera's view, which makes cluster positioning more difficult. It is therefore necessary to study the vision pre-positioning of litchi picking robots under a large field of view.
In recent years, with the development of deep learning, especially convolutional neural networks, many researchers have applied convolutional neural networks to classification, segmentation, recognition and detection [19-32]. For example, the study in [20] optimized the network structure on the basis of VGGNet to improve feature extraction for the main organs of tomato, and generated detection regions via Selective Search to detect tomato organs of different types and maturities. The studies in [28-29] used the YOLO algorithm to recognize and locate picking targets with good results. Deep learning methods are therefore well suited to the pre-positioning of litchi clusters.
The experimental setup consisted of hardware and software. The hardware mainly included: a binocular stereo vision system composed of two GigE industrial cameras (Microvision MV-EM200C, resolution 1600×1200 pixels, frame rate 60 frames/s, lens focal length 16 mm); a Bosch GLM50 laser range finder (effective range 0.05-50 m, accuracy ±1.5 mm); a Microvision high-precision dot calibration board (9×11 dots, center spacing (30±0.01) mm); and a laptop with an i7-7700HQ processor, 16 GB of 2 400 MHz RAM and a GTX 1060 6 GB graphics card.
The software system was written mainly on top of the OpenCV vision library and the DarkNet deep learning framework.
Before image acquisition, the binocular stereo vision system must be calibrated. According to the triangulation principle, a larger baseline gives higher measurement accuracy, but it also shrinks the common field of view of the two cameras. To keep a large common field of view at relatively high accuracy, a baseline of 110 mm was chosen after repeated adjustment. To ensure image accuracy, camera calibration was performed over the large field of view, i.e., with the cameras 2.5-3 m from the target fruit. Before collecting images, the binocular stereo vision system was calibrated with the dot calibration board.
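The calibration step can be sketched with OpenCV's circle-grid detector. This is a minimal illustration assuming OpenCV's standard calibration API; the board geometry (9×11 dots, 30 mm pitch) follows the text, while the file names, pose count and variable names are illustrative.

```python
import cv2
import numpy as np

ROWS, COLS, PITCH_MM = 9, 11, 30.0            # dot grid of the calibration board
grid = np.zeros((ROWS * COLS, 3), np.float32)
grid[:, :2] = np.mgrid[0:COLS, 0:ROWS].T.reshape(-1, 2) * PITCH_MM

obj_pts, left_pts, right_pts, size = [], [], [], None
for i in range(20):                            # number of board poses is illustrative
    imgL = cv2.imread(f"calib/left_{i}.png", cv2.IMREAD_GRAYSCALE)
    imgR = cv2.imread(f"calib/right_{i}.png", cv2.IMREAD_GRAYSCALE)
    okL, cL = cv2.findCirclesGrid(imgL, (COLS, ROWS))
    okR, cR = cv2.findCirclesGrid(imgR, (COLS, ROWS))
    if okL and okR:
        obj_pts.append(grid); left_pts.append(cL); right_pts.append(cR)
        size = imgL.shape[::-1]                # (width, height)

# Per-camera intrinsics first, then the stereo extrinsics (R, T).
_, K1, d1, _, _ = cv2.calibrateCamera(obj_pts, left_pts, size, None, None)
_, K2, d2, _, _ = cv2.calibrateCamera(obj_pts, right_pts, size, None, None)
ret, K1, d1, K2, d2, R, T, E, F = cv2.stereoCalibrate(
    obj_pts, left_pts, right_pts, K1, d1, K2, d2, size,
    flags=cv2.CALIB_FIX_INTRINSIC)
```

With a 110 mm baseline, the norm of the recovered translation vector T should come out close to 110 mm, which gives a quick sanity check on the calibration.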
The experimental images were taken in June-July 2018 in the Zengcheng and Conghua districts of Guangzhou. Litchi images were collected over a large field of view in the field, and the distances of litchi clusters were measured with the laser range finder for comparison with the results of the proposed algorithm. In total, 250 pairs of binocular images were collected. Because the sample set was small and prone to overfitting, the original and epipolar-rectified images were augmented by small random crops and rescalings, yielding a final data set of 4 000 images. Finally, the data set for the detection network was annotated with the open-source tool LabelImg.
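As a rough illustration of the augmentation described above, the following sketch applies a small random crop and rescale; the crop bound and scale range are assumptions rather than values from the paper, and since annotation was done after augmentation with LabelImg, no box coordinates need to be transformed here.

```python
import random
import cv2

def augment(img, max_crop=0.1, scale_lo=0.9, scale_hi=1.1):
    """Small random crop followed by a small random rescale."""
    h, w = img.shape[:2]
    x0 = random.randint(0, int(w * max_crop))   # crop at most 10% per border
    y0 = random.randint(0, int(h * max_crop))
    x1 = w - random.randint(0, int(w * max_crop))
    y1 = h - random.randint(0, int(h * max_crop))
    crop = img[y0:y1, x0:x1]
    s = random.uniform(scale_lo, scale_hi)      # small random rescale
    return cv2.resize(crop, None, fx=s, fy=s)
```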
Under a large field of view the image background is complex, so dense stereo matching over the whole image is inefficient and performs poorly. In addition, as shown by the blue and red boxes in Fig.1, some litchi clusters do not appear completely in the common field of view, which disturbs template matching of the cluster images and makes accurate cluster positioning difficult. Therefore, this paper first performs object detection on the left and right images, then proposes a litchi cluster pairing algorithm based on the same-row order-consistency constraint built on the detection results, and finally computes the three-dimensional coordinates of each cluster from the disparity of its center according to the triangulation principle.

Note: Yellow boxes indicate clusters fully visible in the common field of view; blue boxes indicate clusters partially missing from it; red boxes indicate clusters completely missing from it.
1.3.1 Litchi cluster detection
Drawing on the YOLOv3 [30] detection network and the DenseNet [31] classification network, and exploiting the single scene (orchard only) and single target class of the litchi cluster detection task, the network structure was optimized: a 34-layer densely connected convolutional module (hereafter Dense Module) was designed, and the litchi cluster detection network YOLOv3-DenseNet34 was built on it.
A convolution layer (Conv), a batch normalization layer (BN) and a leaky ReLU activation layer form a basic component layer (DarkNet convolution, batch normalization, leaky ReLU; hereafter DBL), shown at the lower left of Fig.2, where DBL(1×1) denotes a DBL whose convolution kernel is 1×1. Several DBL layers form a DBL block (lower right of Fig.2), and several DBL blocks form a Dense Module, connected as shown in Fig.2.

Fig.2 Structure of the Dense Module
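The paper builds its network in the DarkNet framework; purely as an illustration of the DBL component and the dense connection pattern, a PyTorch-style sketch might look as follows, where the channel widths and the number of DBL blocks per module are assumptions.

```python
import torch
import torch.nn as nn

class DBL(nn.Module):
    """Conv + BatchNorm + LeakyReLU, the basic component layer."""
    def __init__(self, c_in, c_out, k=3, stride=1):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(c_in, c_out, k, stride, padding=k // 2, bias=False),
            nn.BatchNorm2d(c_out),
            nn.LeakyReLU(0.1, inplace=True))

    def forward(self, x):
        return self.block(x)

class DenseModule(nn.Module):
    """Each DBL block sees the concatenation of all previous outputs."""
    def __init__(self, c_in, growth, n_blocks):
        super().__init__()
        self.blocks = nn.ModuleList(
            DBL(c_in + i * growth, growth) for i in range(n_blocks))

    def forward(self, x):
        feats = [x]
        for blk in self.blocks:
            feats.append(blk(torch.cat(feats, dim=1)))
        return torch.cat(feats, dim=1)
```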
The prior box sizes of YOLOv3-DenseNet34 were obtained by K-means clustering of the litchi widths and heights over all images in the sample set. Based on the scale distribution of the samples, the number of clusters was set to 6, giving prior boxes of (20, 20), (33, 27), (26, 39), (48, 49), (32, 56) and (57, 95).
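Anchor clustering of this kind is commonly done with K-means under a 1-IoU distance on the (width, height) pairs; the sketch below illustrates that procedure under this assumption, with `boxes` standing for the labeled litchi box sizes.

```python
import numpy as np

def iou_wh(boxes, anchors):
    """IoU of (w, h) pairs when boxes and anchors share the same center."""
    inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0]) *
             np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
    union = (boxes[:, 0] * boxes[:, 1])[:, None] + \
            (anchors[:, 0] * anchors[:, 1])[None, :] - inter
    return inter / union

def kmeans_anchors(boxes, k=6, iters=100, seed=0):
    boxes = np.asarray(boxes, dtype=float)       # (N, 2) array of (w, h)
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        assign = np.argmax(iou_wh(boxes, anchors), axis=1)  # nearest = highest IoU
        for j in range(k):
            if np.any(assign == j):
                anchors[j] = boxes[assign == j].mean(axis=0)
    return anchors
```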
From the clustering results, the longest prior-box side is 95. Given the receptive field of a 3×3 convolution, the number of downsampling operations of YOLOv3-DenseNet34 is 5.
To avoid discarding original data, YOLOv3-DenseNet34 performs downsampling with stride-2 convolutions instead of max pooling. The number of downsamplings, the convolution receptive field and the longest prior-box side satisfy the following relation:

式中為卷積的感受野尺寸;為下采樣次數;為最大的先驗框邊長。
The structure of the proposed YOLOv3-DenseNet34 detection network is shown in Fig.3, where DBL(stride=2) denotes the convolution layer that replaces pooling for downsampling. The network extracts multi-scale features with a 34-layer convolutional backbone containing 4 Dense Modules and makes predictions from feature maps at 3 scales, marked 1, 2 and 3 in Fig.3, which are downsampled 5, 4 and 3 times respectively. Each scale predicts 2 outputs, and each output contains 6 values: the offsets of the target position and scale in each direction, the confidence, and the one-hot target class; the prediction depth at each scale is therefore 12.
1.3.2 Litchi cluster pre-positioning based on binocular stereo vision
After the monocular and stereo calibration of the cameras, corresponding points in the left and right images must be stereo-matched, their disparity computed, and their three-dimensional coordinates obtained by triangulation. Dense stereo matching over the whole large-field-of-view litchi image would be computationally expensive and prone to mismatches, and even a complete global match would still not yield the position of each litchi cluster.
Once the litchi clusters in the image are detected, direct template matching can be applied: each cluster detected in the left image serves as a template matched over the right image, and the highest-scoring point is taken as the match, yielding sparse stereo matching. However, direct template matching must search the entire right image for every cluster in the left image, which is computationally expensive and error-prone, as shown in Fig.4, where the same number in the left and right images marks regions that the direct method considers the same cluster. Clusters 5, 6 and 8 are clearly mismatched.
To solve this problem, a sparse stereo matching algorithm based on the same-row order-consistency constraint is proposed on top of the cluster detection results. The pairing is performed after epipolar rectification. With the detection result in the left image as the template, the row constraint restricts the search to the same rows, reducing the search range. Moreover, in the parallel-axis binocular model, the x-coordinate of a spatial point in the right image is always smaller than that in the left image. Hence, if the x-coordinate of the bottom-right corner of the template in the left image is x_l, the search range in the right image can be limited to 0 to x_l, further narrowing the search. The same-row order-consistency constraint thus reduces the search range, speeds up matching and reduces mismatches; the constrained matching range is shown in Fig.5.

Fig.3 Architecture of the YOLOv3-DenseNet34 network

Note: Yellow boxes show the detection results of litchi clusters in the left image; purple boxes show those in the right image.

Note: x_l is the x-coordinate of the target in the left image; x_r is the x-coordinate of the target in the right image.
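A minimal sketch of the constrained search described above, assuming rectified images and OpenCV template matching; the row margin is an illustrative slack value rather than a parameter from the paper.

```python
import cv2

def match_in_row(left_img, right_img, box, margin=5):
    """box = (x0, y0, x1, y1): a detection in the rectified left image."""
    x0, y0, x1, y1 = box
    tmpl = left_img[y0:y1, x0:x1]
    # Epipolar (row) constraint: search only the same rows, with small slack;
    # order constraint: the matched region must end left of x1 = x_l.
    band = right_img[max(0, y0 - margin):y1 + margin, 0:x1]
    res = cv2.matchTemplate(band, tmpl, cv2.TM_CCOEFF_NORMED)
    _, score, _, (mx, my) = cv2.minMaxLoc(res)
    return (mx, my + max(0, y0 - margin)), score  # match top-left, similarity
```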
To remove the mismatches left by the same-row order-consistency matching, the overlap between each candidate match region and the detection results in the right image is computed; each candidate keeps only the detection with the largest overlap as its pairing result, and pairs with no overlap or very low overlap (IoU < 0.2) are discarded.
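The overlap test can be illustrated with a plain IoU computation; the pairing helper below follows the rule stated above (keep the right-image detection with the largest overlap, drop pairs with IoU < 0.2), with function names chosen for illustration.

```python
def iou(a, b):
    """a, b = (x0, y0, x1, y1) axis-aligned boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def pair_matches(match_boxes, right_dets, thr=0.2):
    pairs = []
    for m in match_boxes:
        best = max(right_dets, key=lambda d: iou(m, d))  # largest overlap wins
        if iou(m, best) >= thr:      # discard non/barely overlapping pairs
            pairs.append((m, best))
    return pairs
```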
最后,對上述匹配結果修正。在右目圖像上取配對結果的重合區域(圖6b中白色框)作為模板,在左目圖像上用基于同行順序一致性約束的匹配方法進行模板匹配。但此時約束范圍稍有變化:假設重合區域在右目圖像的左上角橫坐標為x,則滑窗檢索的范圍在與重合區域同行的(x,)內,其中為圖像寬度。修正效果如圖6中白色框所示。

Note: Yellow boxes are detection results; purple boxes show the matches of left-image detections on the right image; white boxes are the overlap regions of the left-right matching results.
1.3.3 Sub-pixel disparity calculation
After the litchi clusters are paired, a matching point is needed to compute the disparity. When the paired boxes have the same size, the disparity of the center points of the left and right boxes equals that of their top-left corners. Because direct disparity computation is only pixel-accurate, a sub-pixel disparity method was designed as follows: first compute the similarity and disparity of the paired boxes; then compute the matching similarities at the two neighboring pixel-level disparities. Together with the original match, this gives 3 points in the disparity-similarity plane (points p1, p2 and p3 in Fig.7), which uniquely determine a quadratic curve (the curve in Fig.7). Solving for the vertex of this curve (point t in Fig.7), the x-coordinate of the vertex is the sub-pixel disparity, from which the three-dimensional coordinates of the matching point are computed.

Note: p1, p2 and p3 are the three points in the disparity-similarity plane determined by the original matching point and the similarities; t is the vertex of the quadratic curve.
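The vertex of a parabola through the three (disparity, similarity) points has a closed form, so the sub-pixel step can be sketched directly; the function below assumes the similarities at disparities d-1, d and d+1 as inputs.

```python
def subpixel_disparity(d, s_minus, s, s_plus):
    """Vertex of the parabola through (d-1, s_minus), (d, s), (d+1, s_plus)."""
    denom = s_minus - 2.0 * s + s_plus
    if denom == 0:               # degenerate: the three points are collinear
        return float(d)
    return d + (s_minus - s_plus) / (2.0 * denom)
```

For example, with d = 10 and similarities (0.80, 0.95, 0.90), the vertex lies at 10.25, shifted toward the neighbor with the higher similarity.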
1.3.4 Pre-positioning error calculation
The coordinates of a matching point in the left camera frame cannot be measured directly, so the errors of the three coordinate values cannot be computed individually. The distance error of the spatial point is therefore used to measure the positioning error, computed as follows:

式中為測量誤差,mm;為視覺測量距離,mm;d激光測量距離,mm;,,為視覺測量激光點的坐標值,mm。
The time, sites and equipment of data collection were the same as in sections 1.1 and 1.2. In the experiment, the binocular stereo vision system was calibrated first. The tripod head was then adjusted so that the laser spot fell on a litchi cluster and locked in place; once the laser range finder reading stabilized, the laser distance d_t was recorded while the two cameras captured litchi images simultaneously. Repeating this procedure yielded 30 data records and 30 pairs of litchi images. For each image pair, a fixed-size region centered on the laser spot was taken as the detection result, matched with the same-row order-consistency method, and its disparity computed; the three-dimensional coordinates of the matching point then gave the distance d_v from the fruit to the camera. Finally, the error between d_v and d_t was computed.
DarkNet-53 [30] is the backbone of the original YOLOv3, and the DarkNet framework was used to build and train the networks; the training parameter settings of DarkNet-53 (YOLOv3) and YOLOv3-DenseNet34 are listed in Table 1.

Table 1 Training parameter settings of the networks
Following previous studies [29-31], the loss value was used to characterize the training error and to assess the correctness and convergence of the networks. During training, the loss over the first 1 000 iterations was very large and uninformative, so the curves in Fig.8 are recorded from iteration 1 000 onward.
As Fig.8 shows, both networks fit rapidly during the first 2 000 iterations and then level off; the loss of YOLOv3-DenseNet34 decreases more slowly than that of the original network, but both converge, indicating that the proposed network structure is sound.

Fig.8 Loss curves of the litchi cluster detection networks during training
The mean average precision (mAP) [33-34] was used to measure litchi cluster detection accuracy; it reflects the recognition ability of an object detection network well and is currently the most common metric in object detection. Detection speed is expressed as frame rate (frames per second, FPS). The mAP is computed as follows:

式中為準確率;tp為正例正確地分類為正例的數量;fp為負例錯誤地分類為正例的數量;A為平均精度;為識別圖像總數;mAP為平均精度均值;C為識別類別總數。
The mAP, average detection speed and model size measured in the tests are summarized in Table 2.

Table 2 Performance comparison of the litchi cluster detection networks
As Table 2 shows, YOLOv3-DenseNet34 detects about 0.6 times faster than the original YOLOv3, reaching 22.11 frames/s, its mAP is 5.6% higher at 0.943, and its model size is only 9.3 MB, about 1/26 of the original network's. The improved litchi cluster detection network YOLOv3-DenseNet34 therefore outperforms the original YOLOv3 in detection speed, detection accuracy and model size.
The laser measurements, vision measurements and errors of litchi cluster pre-positioning are listed in Table 3. The maximum absolute error of the binocular pre-positioning is 36.602 mm, the mean absolute error is 23.007 mm, the standard deviation is 7.434 mm, and the mean relative error is 0.836%, showing that the proposed method is accurate enough to meet the pre-positioning requirements.

Table 3 Vision measurements and errors of litchi cluster pre-positioning
This paper studied a vision pre-positioning method for litchi picking robots under a large field of view. The original YOLOv3 was improved into the litchi cluster detection network YOLOv3-DenseNet34; a litchi cluster pairing method based on the same-row order-consistency constraint was proposed; and the spatial coordinates of litchi clusters were computed from binocular stereo vision by the triangulation principle. Experimental results show that YOLOv3-DenseNet34 improves both the detection accuracy and the detection speed for litchi clusters, with an mAP of 0.943 and an average detection speed of 22.11 frames/s. At a detection distance of 3 m, the maximum absolute error of the binocular pre-positioning was 36.602 mm, the mean absolute error was 23.007 mm, and the mean relative error was 0.836%. The method therefore meets the accuracy and speed requirements of vision pre-positioning for picking under a large field of view and can serve as a reference for the vision pre-positioning of other fruit and vegetable picking under a large field of view.
[1] Cheng Xiangyun, Song Xin. A review of research on vision system of fruit and vegetable picking robot[J]. Journal of Zhejiang Agricultural Sciences, 2019, 60(3): 490-493. (in Chinese with English abstract)
[2] Luo Lufeng, Zou Xiangjun, Cheng Tangcan, et al. Design of virtual test system based on hardware-in-loop for picking robot vision localization and behavior control[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(4): 39-46. (in Chinese with English abstract)
[3] Xiong Juntao, He Zhiliang, Tang Linyue, et al. Visual localization of disturbed grape picking point in non-structural environment[J]. Transactions of the Chinese Society for Agricultural Machinery, 2017, 48(4): 29-33, 81. (in Chinese with English abstract)
[4] Zhu Rongjie, Zhu Yinghui, Wang Ling, et al. Cotton positioning technique based on binocular vision with implementation of scale invariant feature transform algorithm[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2016, 32(6): 182-188. (in Chinese with English abstract)
[5] Ye Min, Zou Xiangjun, Luo Lufeng, et al. Error analysis of dynamic localization tests based on binocular stereo vision on litchi harvesting manipulator[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2016, 32(5): 50-56. (in Chinese with English abstract)
[6] Zou X, Ye M, Luo C, et al. Fault-tolerant design of a limited universal fruit-picking end-effector based on vision-positioning error[J]. Applied Engineering in Agriculture, 2016, 32(1): 5-18.
[7] Zou X, Zou H, Lu J. Virtual manipulator-based binocular stereo vision positioning system and errors modelling[J]. Machine Vision and Applications, 2012, 23(1): 43-63.
[8] Chen Yan, Zou Xiangjun, Xu Dongfeng, et al. Mechanism design and kinematics simulation of litchi picking manipulator[J]. Journal of Machine Design, 2010, 27(5): 31-34. (in Chinese with English abstract)
[9] Xiong Juntao, Zou Xiangjun, Chen Lijuan, et al. Recognition of mature litchi in natural environment based on machine vision[J]. Transactions of the Chinese Society for Agricultural Machinery, 2011, 42(9): 162-166. (in Chinese with English abstract)
[10] Guo Aixia, Zou Xiangjun, Zhu Mengsi, et al. Color feature analysis and recognition for litchi fruits and their main fruit bearing branch based on exploratory analysis[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2013, 29(4): 191-198. (in Chinese with English abstract)
[11] Xiong Juntao, Zou Xiangjun, Wang Hongjun, et al. Recognition of ripe litchi in different illumination conditions based on Retinex image enhancement[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2013, 29(12): 170-178. (in Chinese with English abstract)
[12] Peng Hongxing, Zou Xiangjun, Chen Lijuan, et al. Fast recognition of multiple color targets of litchi image in field environment based on double Otsu algorithm[J]. Transactions of the Chinese Society for Agricultural Machinery, 2014, 45(4): 61-68. (in Chinese with English abstract)
[13] Fu Longsheng, Tola Elkamil, Al-Mallahi Ahmad, et al. A novel image processing algorithm to separate linearly clustered kiwifruits[J]. Biosystems Engineering, 2019, 183: 184-195.
[14] Fu Longsheng, Sun Shipeng, Vázquez-Arellano Manuel, et al. Kiwifruit recognition method at night based on fruit calyx image[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(2): 199-204. (in Chinese with English abstract)
[15] Liang Xifeng, Jin Chaoqi, Ni Meidi, et al. Acquisition and experiment on location information of picking point of tomato fruit clusters[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2018, 34(16): 163-169. (in Chinese with English abstract)
[16] Li Han, Zhang Man, Gao Yu, et al. Green ripe tomato detection method based on machine vision in greenhouse[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(Supp.1): 328-334, 388. (in Chinese with English abstract)
[17] Van Henten E J, Van Tuijl B A J, Hemming J, et al. Field test of an autonomous cucumber picking robot[J]. Biosystems Engineering, 2003, 86(3): 305-313.
[18] Mehta S S, Burks T F. Vision-based control of robotic manipulator for citrus harvesting[J]. Computers and Electronics in Agriculture, 2014, 102: 146-158.
[19] Xue Jinlin, Yan Jia, Fan Bowen. Classification and identification method of multiple kinds of farm obstacles based on convolutional neural network[J]. Transactions of the Chinese Society for Agricultural Machinery, 2018, 49(S1): 35-41. (in Chinese with English abstract)
[20] Zhou Yuncheng, Xu Tongyu, Zheng Wei, et al. Classification and recognition approaches of tomato main organs based on DCNN[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(15): 219-226. (in Chinese with English abstract)
[21] Fu Longsheng, Feng Yali, Elkamil Tola, et al. Image recognition method of multi-cluster kiwifruit in field based on convolutional neural networks[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2018, 34(2): 205-211. (in Chinese with English abstract)
[22] Chen Fengjun, Wang Chenghan, Gu Mengmeng, et al. Spruce image segmentation algorithm based on fully convolutional networks[J]. Transactions of the Chinese Society for Agricultural Machinery, 2018, 49(12): 188-194. (in Chinese with English abstract)
[23] Han Qiaoling, Zhao Yue, Zhao Yandong, et al. Soil pore segmentation of computed tomography images based on fully convolutional network[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2019, 35(2): 128-133. (in Chinese with English abstract)
[24] Gao Yun, Guo Jiliang, Li Xuan, et al. Instance-level segmentation method for group pig images based on deep learning[J]. Transactions of the Chinese Society for Agricultural Machinery, 2019, 50(4): 179-187. (in Chinese with English abstract)
[25] Wang Dandan, He Dongjian. Recognition of apple targets before fruits thinning by robot based on R-FCN deep convolution neural network[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2019, 35(3): 156-163. (in Chinese with English abstract)
[26] Yang Guoguo, Bao Yidan, Liu Ziyi. Localization and recognition of pests in tea plantation based on image saliency analysis and convolutional neural network[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(6): 156-162. (in Chinese with English abstract)
[27] Bi Song, Gao Feng, Chen Junwen, et al. Detection method of citrus based on deep convolution neural network[J]. Transactions of the Chinese Society for Agricultural Machinery, 2019, 50(5): 181-186. (in Chinese with English abstract)
[28] Zhao Dean, Wu Rendi, Liu Xiaoyang, et al. Apple positioning based on YOLO deep convolutional neural network for picking robot in complex background[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2019, 35(3): 164-173. (in Chinese with English abstract)
[29] Xue Yueju, Huang Ning, Tu Shuqin, et al. Immature mango detection based on improved YOLOv2[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2018, 34(7): 173-179. (in Chinese with English abstract)
[30] Redmon J, Farhadi A. YOLOv3: An incremental improvement[R]. arXiv, 2018.
[31] Huang G, Liu Z, Maaten L V D, et al. Densely connected convolutional networks[C]//CVPR. IEEE Computer Society, 2017.
[32] Lin G, Tang Y, Zou X, et al. Fruit detection combined with color, depth, and shape information[J/OL]. Precision Agriculture, 2019. https://doi.org/10.1007/s11119-019-09654-w
[33] Liu Ting, Qin Bing, Zhang Yu. Introduction to Information Retrieval Systems[M]. Beijing: China Machine Press, 2008. (in Chinese)
[34] Wu Shengli, McClean Sally. Lecture Notes in Computer Science[M]. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006.
Vision pre-positioning method for litchi picking robot under large field of view
Chen Yan, Wang Jiasheng, Zeng Zeqin, Zou Xiangjun※, Chen Mingyou
(1. College of Engineering, South China Agricultural University, Guangzhou 510642, China; 2. Key Laboratory of Key Technology on Agricultural Machine and Equipment, Ministry of Education, South China Agricultural University, Guangzhou 510642, China)
The litchi picking robot is an important tool for improving the automation of litchi harvesting, and it needs the spatial positions of litchi clusters when picking. To guide the robot to the picking position and improve picking efficiency, a vision pre-positioning method for the litchi picking robot under a large field of view was studied in this paper. Firstly, using a calibrated binocular stereo vision system composed of two industrial cameras, 250 pairs of litchi cluster images under a large field of view were taken in litchi orchards in Guangzhou, and the spatial positions of key litchi clusters were recorded with a laser range finder for comparison with the results of this paper. To enlarge the sample size, the original and epipolar-rectified images were randomly cropped and scaled within a small range, giving a final data set of 4 000 images, which was then annotated with LabelImg to build the detection data set. Secondly, based on the YOLOv3 network and the DenseNet classification network, and exploiting the single target and single scene (orchard only) of the litchi cluster detection task, the network structure was optimized: a Dense Module with a depth of 34 layers was designed, and the litchi cluster detection network YOLOv3-DenseNet34 was built on it. Thirdly, because the image background is complex under a large field of view, dense stereo matching of the whole image is inefficient and performs poorly, and some litchi clusters cannot appear completely in the common field of view at the same time; therefore a matching method based on the same-row order-consistency constraint was proposed and a sub-pixel disparity calculation method was designed. By solving the quadratic curve formed by disparity and similarity, the sub-pixel disparity was used to compute the spatial positions of the litchi clusters. Comparison with the original YOLOv3 showed that the YOLOv3-DenseNet34 network improved both the detection accuracy and the detection speed of litchi clusters: the mAP (mean average precision) was 0.943, the average detection speed was 22.11 frames/s, and the model size was 9.3 MB, 1/26 of the original YOLOv3. The detection results were then compared with the laser range finder readings: at a detection distance of 3 m, the maximum absolute error of the pre-positioning was 36.602 mm, the mean absolute error was 23.007 mm, and the mean relative error was 0.836%. The test results show that the vision pre-positioning method studied in this paper meets the accuracy and speed requirements of vision pre-positioning under a large field of view, and can provide a reference for the vision pre-positioning of other fruit and vegetable picking under a large field of view.
robots; image processing; object detection; litchi picking; large field of view; convolutional neural network; stereo vision
Chen Yan, Wang Jiasheng, Zeng Zeqin, Zou Xiangjun, Chen Mingyou. Vision pre-positioning method for litchi picking robot under large field of view[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2019, 35(23): 48-54. (in Chinese with English abstract) doi:10.11975/j.issn.1002-6819.2019.23.006 http://www.tcsae.org
Received: 2019-06-30
Revised: 2019-11-11
Supported by the National Natural Science Foundation of China (31571568) and the Natural Science Foundation of Guangdong Province (2018A030307067)
Chen Yan, associate professor, research interests: agricultural robots, intelligent agricultural equipment, and intelligent design and manufacturing. Email: cy123@scau.edu.cn
※Corresponding author: Zou Xiangjun, professor and doctoral supervisor, research interests: agricultural robots and machine vision. Email: xjzou1@163.com
doi: 10.11975/j.issn.1002-6819.2019.23.006
CLC number: TP391.41
Document code: A
Article ID: 1002-6819(2019)-23-0048-07