曾閩川 方勇 許益家



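Research on log-based anomaly detection has made considerable progress, yet two challenges remain under real-world conditions. (1) Log data is usually stored on different servers as "data islands". The log data of a single company or organization contains too few anomalous samples, and its anomaly patterns are relatively fixed, so it is difficult to train a high-accuracy detection model from such data alone. Merging log data from different sources into a larger dataset can improve model training, but it may leak log data during transmission. (2) Log data from different types of application systems usually differs in structure and syntax, so simply merging it for model training gives poor results. For these reasons, this paper proposes LogFTL, a training framework for log anomaly detection models based on federated transfer learning. The framework uses a federated learning algorithm based on matched averaging: while protecting the privacy and security of client data, the server aggregates the clients' model parameters into a global model. The global model is then distributed to the clients, and transfer learning is performed on each client's local data to improve the local model's ability to detect the anomalous behaviors common to that client. Experiments show that LogFTL outperforms traditional log anomaly detection methods in federated learning scenarios, and they also demonstrate the effectiveness of the transfer learning step in the framework.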
Log anomaly detection; Federated learning; Transfer learning; LSTM; Data islands
TP391.1  A  2023.033002
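Received: 2023-01-04
Funding: National Natural Science Foundation of China (U20B2045)
About the author: ZENG Min-Chuan (b. 1998), male, from Leshan, Sichuan; master's student; research interest: network information countermeasures. E-mail: 2422342691@qq.com
Corresponding author: XU Yi-Jia. E-mail: xuyijia@stu.scu.edu.cn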
Research on application system log anomaly detection based on federated transfer learning
ZENG Min-Chuan, FANG Yong, XU Yi-Jia
(School of Cyber Science and Engineering, Sichuan University, Chengdu 610065, China)
Significant progress has been made in research on log anomaly detection. However, two challenges remain in practice. First, log data is often stored on different servers, creating "data islands". The number of abnormal samples in the log data of a single company or organization is insufficient and the abnormal patterns are relatively limited, so it is difficult to train a high-accuracy detection model from these data. Integrating log data from different sources can improve the model's performance but may result in log data leakage during transmission. Second, the log data of different application system types varies in structure and syntax, so simply merging it for model training is ineffective. To address these issues, this paper proposes LogFTL, a log anomaly detection training framework based on federated transfer learning, which uses a federated learning algorithm based on matched averaging. While ensuring the privacy and security of client data, LogFTL aggregates the clients' model parameters on the server side to form a global model, which is then distributed to the clients. Using each client's local data, LogFTL applies transfer learning to optimize the client's local model and improve its ability to detect the anomalous behaviors common to that client. The experimental results show that the LogFTL framework outperforms traditional log anomaly detection methods in federated learning scenarios and demonstrate the effectiveness of transfer learning in LogFTL.
Log anomaly detection; Federated learning; Transfer learning; LSTM; Data islands
1 Introduction
With the cloud computing industry's ……