








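Abstract: To manage multimodal data more effectively, speed up data access, and reduce database load, a hierarchical caching system for multimodal data based on a recurrent neural network is designed. In the DRAM/NVM hybrid memory module, DRAM serves as a cache for the NVM main memory. When a cache miss occurs in DRAM, a high-speed acquisition card built into the access monitoring module collects the historical access records of frequently accessed 4 KB data blocks on NVM. These records are encoded as access vectors to build a training set, which is fed to a long short-term memory (LSTM) network to predict access frequency. In the cache filtering module, the 4 KB multimodal data blocks whose predicted access frequency exceeds a set threshold are read into DRAM for caching. Experimental results show that the designed system reduces system bandwidth usage to the greatest extent, achieves a low TLB miss rate and high cache execution efficiency, and offers a significant caching advantage for large pages.
Keywords: multimodal data; hierarchical caching; recurrent neural network; long short-term memory (LSTM) network; DRAM; NVM; access frequency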
CLC number: TN919.5-34; TP303    Document code: A    Article ID: 1004-373X(2025)04-0052-05
Design of multimodal data hierarchical caching system based on recurrent neural network
ZHANG Yan
(Xinjiang Normal University, Urumqi 830017, China)
Abstract: To manage multimodal data more effectively, speed up data access, and reduce database load, a hierarchical caching system for multimodal data based on a recurrent neural network is designed. In the DRAM/NVM hybrid memory module, DRAM is used as a cache for the NVM main memory. When a cache miss occurs in DRAM, the high-speed acquisition card built into the access monitoring module collects the historical access records of frequently accessed 4 KB data blocks on NVM. The historical access records are encoded as access vectors to construct a training set, which is used as input to a long short-term memory (LSTM) network to predict access frequency. In the cache filtering module, the 4 KB multimodal data blocks whose predicted access frequency exceeds the set threshold are read into DRAM for caching. The experimental results show that the designed system reduces system bandwidth usage to the greatest extent, and has a low TLB miss rate, high cache execution efficiency, and a significant caching advantage for large pages.
Keywords: multimodal data; hierarchical caching; recurrent neural network; long short-term memory (LSTM) network; DRAM; NVM; access frequency
0 Introduction
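Multimodal data are data composed of several different modalities [1-3], such as text, images, audio, and video. Multimodal data arise from multiple information sources and modes of perception; each modality has its own characteristics and means of expression and can provide richer, more comprehensive information, which helps improve how data are represented and transmitted [4-5]. However, multimodal data are large in volume and complex in their features, so processing them directly consumes substantial computing resources and time. Caching multimodal data in memory reduces repeated access to, and recomputation of, the raw data and so improves processing efficiency. Moreover, a well-designed caching strategy can store data in tiers according to access frequency and importance, raising the cache hit rate and further improving data-processing performance [6].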
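The idea of tiering data by access frequency can be illustrated with a small sketch. It is not the system described in this paper: the class `FrequencyFilteredCache`, the function `predict_frequency`, and all parameter names are hypothetical, and a simple moving average over recent monitoring windows stands in for the LSTM predictor. The sketch only shows the admission rule: a 4 KB block is copied from the slow tier into the fast tier when its predicted access frequency exceeds a threshold.

```python
from collections import OrderedDict, defaultdict, deque

BLOCK_SIZE = 4096  # 4 KB blocks, matching the block size named in the abstract


def predict_frequency(history):
    """Stand-in predictor (assumption): mean accesses per window.

    The paper uses an LSTM here; a moving average keeps the sketch dependency-free.
    """
    return sum(history) / len(history) if history else 0.0


class FrequencyFilteredCache:
    """Toy fast tier (DRAM stand-in) in front of a slow tier (NVM stand-in)."""

    def __init__(self, capacity, threshold, window_count=4,
                 predictor=predict_frequency):
        self.capacity = capacity      # maximum number of blocks in the fast tier
        self.threshold = threshold    # admit only if predicted frequency > threshold
        self.predictor = predictor    # maps an access vector to a predicted frequency
        self.fast = OrderedDict()     # block id -> data, ordered from LRU to MRU
        # Access vector per block: access counts for the last `window_count` windows.
        self.history = defaultdict(
            lambda: deque([0] * window_count, maxlen=window_count))
        self.hits = 0
        self.misses = 0

    def next_window(self):
        """Start a new monitoring window for every block seen so far."""
        for vector in self.history.values():
            vector.append(0)  # the oldest window drops out because of maxlen

    def read(self, block_id, slow_tier):
        """Return the block's data, serving it from the fast tier when possible."""
        self.history[block_id][-1] += 1
        if block_id in self.fast:
            self.hits += 1
            self.fast.move_to_end(block_id)  # mark as most recently used
            return self.fast[block_id]
        self.misses += 1
        data = slow_tier[block_id]
        # Cache filtering: admit only blocks predicted to be accessed often.
        if self.predictor(self.history[block_id]) > self.threshold:
            self.fast[block_id] = data
            if len(self.fast) > self.capacity:
                self.fast.popitem(last=False)  # evict the least recently used block
        return data


if __name__ == "__main__":
    nvm = {i: bytes(BLOCK_SIZE) for i in range(8)}
    cache = FrequencyFilteredCache(capacity=2, threshold=0.5)
    for block in [0, 0, 0, 0, 1, 0]:
        cache.read(block, nvm)
    print(cache.hits, cache.misses, list(cache.fast))
```

The filter keeps blocks that are touched only once out of the fast tier, so they cannot evict frequently used blocks. Swapping in a learned predictor would only require passing a different callable as `predictor`.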