

















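Abstract: Most current SLAM systems target static scenes. Real environments, however, inevitably contain many dynamic objects, which greatly reduce the robustness of the algorithm and the localization accuracy of the camera. To address the trajectory deviation caused by dynamic objects, this paper proposes a dynamic SLAM algorithm that combines an object detection network with multi-view geometry. First, based on the YOLOv5 framework, the backbone network CSPDarkNet-53 is replaced with a lightweight L-FPN (Lite-FPN) structure, and the network is pre-trained on the VOC2007 dataset. Compared with the original YOLOv5s model, the new network reduces the amount of computation by 45.73% and increases the detection rate by 31.90%. Next, detected objects are divided into high-dynamic, medium-dynamic, and low-dynamic objects; a threshold is computed using multi-view geometry, and medium- and high-dynamic objects undergo a second check against this threshold to decide whether the feature points inside the prediction box are removed. Finally, experiments on the TUM dataset show that the method improves localization accuracy by 82.08% on average, a significant gain in accuracy.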
Keywords: simultaneous localization and mapping; dynamic environment; multi-view geometry; object detection; feature point; lightweight
CLC number: TN911.73-34; TP391.41    Document code: A    Article ID: 1004-373X(2025)01-0135-09
Dynamic SLAM algorithm based on object detection and multi-view geometry
YU Qingcang, DONG Genyang, FANG Caiwei, SUN Shusen
(School of Computer Science and Technology (School of Artificial Intelligence), Zhejiang Sci-Tech University, Hangzhou 310018, China)
Abstract: Most current SLAM (simultaneous localization and mapping) systems focus on static scenes. However, real environments inevitably contain many dynamic objects, which greatly reduce the robustness of the algorithm and the positioning accuracy of the camera. Therefore, a dynamic SLAM algorithm combining an object detection network with multi-view geometry is proposed to eliminate the trajectory deviation caused by dynamic objects. Based on the YOLOv5 framework, the backbone network CSPDarkNet-53 is replaced with a lightweight L-FPN (lightweight feature pyramid network) structure, and the VOC2007 dataset is used for pre-training. Compared with the original YOLOv5s model, the parameters of the proposed network are reduced by 45.73% and its detection rate is increased by 31.90%. The detected objects are then categorized into high dynamic, medium dynamic, and low dynamic objects. A multi-view geometric method is used to calculate a threshold, and the medium and high dynamic objects are checked a second time against this threshold to decide whether to eliminate the feature points in the prediction box. Experimental results on the TUM dataset show that the proposed method improves positioning accuracy by 82.08% on average, a significant improvement in accuracy.
Keywords: SLAM; dynamic environment; multi-view geometry; object detection; feature point; lightweight
0 Introduction
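As research has deepened, simultaneous localization and mapping (SLAM) has found an increasingly wide range of applications[1]. In complex environments where GPS signals are weak, such as indoor spaces and tunnels, a SLAM system can use its own sensors to obtain perceptual information, incrementally build a map consistent with the surrounding environment, and use that map to localize itself[2-3]. Depending on the sensor used, SLAM is mainly divided into sonar SLAM, laser SLAM, and visual SLAM (VSLAM); compared with LiDAR, cameras offer rich information, high flexibility, and low cost[4]. VSLAM systems such as ORB-SLAM2[5] and ORB-SLAM3[6] already perform very well in static scenes. However, because they rely on the static-scene assumption, the performance of visual SLAM algorithms degrades significantly when many dynamic objects are present in the scene, which limits their use in real dynamic environments[7]. ……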
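As a rough illustration of the idea summarized in the abstract (class-based dynamic levels plus a geometric second check), the following Python sketch culls matched feature points inside detection boxes. It is a minimal sketch under stated assumptions, not the paper's implementation: the class-to-level mapping `DYNAMIC_LEVEL`, the function names, the use of the mean epipolar distance as the per-box test, and the externally supplied `threshold` are all hypothetical; this section does not specify how the threshold is actually computed.

```python
import numpy as np

# Hypothetical mapping from detector class to dynamic level (an assumption;
# the paper's actual class assignment is not given in this section).
DYNAMIC_LEVEL = {"person": "high", "chair": "medium", "monitor": "low"}


def epipolar_distances(F, pts1, pts2):
    """Pixel distance of each point in pts2 to the epipolar line F @ x1.

    F is a 3x3 fundamental matrix; pts1 and pts2 are (N, 2) arrays of
    matched pixel coordinates in the previous and current frame.
    """
    ones = np.ones((len(pts1), 1))
    x1 = np.hstack([pts1, ones])              # (N, 3) homogeneous points
    x2 = np.hstack([pts2, ones])
    lines = x1 @ F.T                          # row i is F @ x1_i = (a, b, c)
    num = np.abs(np.sum(lines * x2, axis=1))  # |a*u + b*v + c|
    den = np.hypot(lines[:, 0], lines[:, 1])  # sqrt(a^2 + b^2)
    return num / np.maximum(den, 1e-12)


def keep_mask(F, pts1, pts2, detections, threshold):
    """Boolean mask over matches: True means the feature point is kept.

    detections is a list of (label, (left, top, right, bottom)) boxes in the
    current frame. Low-dynamic boxes are never culled. For medium- and
    high-dynamic boxes, all points inside the box are culled when their mean
    epipolar distance exceeds `threshold` (an assumed decision rule).
    """
    dist = epipolar_distances(F, pts1, pts2)
    keep = np.ones(len(pts2), dtype=bool)
    for label, (left, top, right, bottom) in detections:
        if DYNAMIC_LEVEL.get(label, "low") == "low":
            continue
        inside = ((pts2[:, 0] >= left) & (pts2[:, 0] <= right) &
                  (pts2[:, 1] >= top) & (pts2[:, 1] <= bottom))
        if inside.any() and dist[inside].mean() > threshold:
            keep[inside] = False
    return keep
```

In this sketch a stationary chair keeps its feature points because its matches still satisfy the epipolar constraint, while a walking person's points are dropped. Deciding per box rather than per point is a design choice made here for simplicity; a per-point test would use `dist > threshold` directly.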