王碩 王培良



Abstract: Traditional data-driven fault detection methods for batch processes usually require assumptions about the distribution of the process data, and they are prone to false alarms and missed detections when monitoring nonlinear and other complex data. To address this, a supervised learning method that combines a Long Short-Term Memory (LSTM) network with Batch Normalization (BN) is proposed, which requires no assumption about the distribution of the raw data. First, the raw batch process data are preprocessed by variable-wise unfolding followed by continuous sampling, so that the processed data can be fed to the LSTM units. Then, an improved deep LSTM network is used for feature learning; by adding BN layers and using a cross-entropy loss, the network extracts the features of batch process data effectively and learns quickly. Finally, simulation experiments are carried out on a semiconductor etching process. The results show that, compared with the Multilinear Principal Component Analysis (MPCA) method, the proposed method identifies more fault types, detects each type of fault effectively, and achieves an overall fault detection rate above 95%. Compared with the traditional single-layer LSTM model, it builds the model faster and raises the overall fault detection rate by more than 8 percentage points, which makes it well suited to fault detection in batch processes with nonlinear and multi-operating-condition characteristics.
Key words: data-driven; deep learning; Long Short-Term Memory (LSTM) network; batch process; fault detection
CLC number: TP277
Document code: A
0 Introduction
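As industrial systems grow larger and more complex, traditional data-driven fault diagnosis methods can no longer meet the diagnosis requirements of industrial big data, which is characterized by large volume, great variety, and low value density. Although the data have many dimensions, not all of them are useful or valuable for monitoring and diagnosis tasks [1]. A batch production process [2] is a class of complex industrial process in which production is carried out at the same location but in batches at different times. Its operating state is unsteady and its process parameters vary with time; because different operating phases have different process characteristics, the monitored variables are affected along the time dimension. Traditional fault diagnosis methods based on multivariate statistical analysis, such as Principal Component Analysis (PCA) and Partial Least Squares (PLS), are widely used in fault diagnosis [3-5], but they perform poorly in fault detection for batch processes that are multi-stage, nonlinear, and non-Gaussian. For example, the traditional PCA method assumes that the process is linear; in particular, determining the control limits of the Hotelling's T-squared (T2) statistic and the Squared Prediction Error (SPE) statistic requires assuming that the variables follow a multivariate Gaussian distribution [6], and these assumptions are usually hard to satisfy in actual production. The multi-phase batch process fault detection method based on Support Vector Data Description (SVDD) proposed in Reference [7] divides a batch process into phases according to changes in the hypersphere radius and the number of support vectors of the SVDD built from time-slice sample sets. It does not need to assume that the process data are normally distributed or that the variables are linearly correlated, and it accomplishes phase division and fault detection at the same time; however, for batch processes with large data volume and great variety, its modeling is slow and it tends to overfit. Reference [8] proposes a fault detection method based on the k-nearest neighbor rule, which accommodates nonlinear and multi-operating-condition data during fault detection and has achieved good results in applications; however, it still sets control limits according to a statistical significance level and assumes that the raw data are Gaussian, and the experimental results show some error when detecting complex data such as non-Gaussian data. In contrast, the Long Short-Term Memory (LSTM) unit [9] from deep learning can learn and extract the features of nonlinear, multi-phase, or multi-condition batch processes well; it requires no assumption about the distribution of the raw data and learns features entirely from the process data.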
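For reference, the two PCA monitoring statistics discussed above are commonly written as follows. The notation is an assumption introduced here, not taken from this paper: x is a scaled sample vector, P is the loading matrix of the retained principal components, and Λ is the diagonal matrix of the corresponding eigenvalues.

```latex
T^{2} = \mathbf{x}^{\top} P \Lambda^{-1} P^{\top} \mathbf{x},
\qquad
\mathrm{SPE} = \left\lVert \left( I - P P^{\top} \right) \mathbf{x} \right\rVert^{2}
```

A sample is flagged when either statistic exceeds its control limit. Those limits are usually derived under the multivariate Gaussian assumption noted above, which is the dependence the LSTM-based approach avoids.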
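The concept of deep learning originates from research on neural networks [10]; a multilayer perceptron with multiple hidden layers is the defining feature of a deep learning model. Compared with ordinary artificial neural networks, deep learning algorithms approximate complex nonlinear functions better and offer many techniques for addressing problems of ordinary multilayer networks such as vanishing gradients and overfitting; compared with shallow neural networks, they need fewer parameters and improve both convergence speed and classification accuracy [10]. The basic deep learning model is the Deep Neural Network (DNN). In the field of fault diagnosis, many architectures have been developed from it, including the Deep Belief Network (DBN) [11], the Convolutional Neural Network (CNN) [12], the Stacked Autoencoder (SAE) [13], and the Recurrent Neural Network (RNN) [14]. Among them, the RNN is a neural network with memory units; it fully accounts for the correlation between sample batches, can process time-series or sequentially correlated data, and is suited to real-time fault diagnosis of complex equipment or systems. For example, Reference [15] uses a recurrent deep neural network to model the operating behavior of a wind power generation system: a dynamic neural network model is built to simulate the behavior of the normal system, and residuals are obtained by comparing the real system with the model. Simulations show that this method detects faults within a very short time and with a very low false alarm rate, which also indicates that RNNs are well suited to problems that are highly correlated with time series. LSTM is an improvement on the RNN that effectively alleviates the vanishing gradient problem of RNNs when multiple layers are stacked [16].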
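To make the memory mechanism concrete, the following is a minimal sketch of one time step of a standard LSTM cell in plain Python. It is an illustration of the textbook formulation, not the network used in this paper; the function names (`lstm_step`, `zero_params`), the parameter layout, and the toy sizes are all assumptions made for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lstm_step(x, h_prev, c_prev, params):
    """One step of a standard LSTM cell.

    params maps each gate name ("i", "f", "o", "g") to a tuple (W, U, b):
    W acts on the input x, U acts on the previous hidden state h_prev.
    """
    pre = {}
    for gate in ("i", "f", "o", "g"):
        W, U, b = params[gate]
        pre[gate] = [wx + uh + bk
                     for wx, uh, bk in zip(matvec(W, x), matvec(U, h_prev), b)]
    i = [sigmoid(z) for z in pre["i"]]    # input gate
    f = [sigmoid(z) for z in pre["f"]]    # forget gate
    o = [sigmoid(z) for z in pre["o"]]    # output gate
    g = [math.tanh(z) for z in pre["g"]]  # candidate cell update
    # Additive cell-state update: the path that lets information and
    # gradients persist over many time steps.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def zero_params(input_size, hidden_size):
    """Hypothetical helper: all-zero parameters, only for demonstration."""
    def block():
        W = [[0.0] * input_size for _ in range(hidden_size)]
        U = [[0.0] * hidden_size for _ in range(hidden_size)]
        b = [0.0] * hidden_size
        return (W, U, b)
    return {gate: block() for gate in ("i", "f", "o", "g")}


# Usage: run a short sequence of 2-variable samples through a 1-unit cell.
params = zero_params(input_size=2, hidden_size=1)
h, c = [0.0], [0.0]
for x_t in ([0.1, 0.2], [0.3, 0.1], [0.0, 0.5]):
    h, c = lstm_step(x_t, h, c, params)
```

The cell state `c` is updated by a gated sum rather than by repeated matrix multiplication, which is the usual explanation for why LSTM suffers less from vanishing gradients than a plain RNN.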
4 Conclusion
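This paper addresses fault detection for batch processes by building a deep learning network based on LSTM-BN to monitor batch process faults, and carries out simulation experiments on a semiconductor etching process. The results show that the LSTM-BN network is effective for batch process fault detection, with an overall fault detection rate above 95%. Compared with the commonly used MPCA method and the DNN-BN method, the LSTM-BN model is well suited to problems that are highly correlated with time series: it requires no assumption about the distribution of the raw data and retains time-series information well. It also builds the model faster than the traditional single-layer LSTM model.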
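The two ingredients added to the LSTM, batch normalization and a cross-entropy loss, can be sketched in plain Python as follows. This is a minimal illustration of the standard definitions under assumed names (`batch_norm`, `cross_entropy`) and scalar `gamma`/`beta`; it uses training-mode mini-batch statistics only and is not the paper's implementation.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature over the mini-batch, then scale and shift.

    batch is a list of samples, each a list of feature values.
    gamma and beta are scalars here for brevity (an assumption); in
    practice they are learned per feature.
    """
    n, d = len(batch), len(batch[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        column = [row[j] for row in batch]
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n
        for i in range(n):
            out[i][j] = gamma * (batch[i][j] - mean) / math.sqrt(var + eps) + beta
    return out


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy loss for one sample with an integer class label."""
    return -math.log(softmax(logits)[label])


# Usage: normalize a toy mini-batch of two 2-feature samples, then score
# one sample against the label "fault" (class 1 in this toy setup).
normalized = batch_norm([[1.0, 2.0], [3.0, 2.0]])
loss = cross_entropy([0.2, 1.5], label=1)
```

Normalizing the activations of each mini-batch keeps their scale stable from layer to layer, which is the usual reason BN speeds up the training of deep networks.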
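In the experiments of this paper, the fault set is much smaller than the normal set, so supervised learning tends to overfit. However, the LSTM network model can keep learning and updating: when a new sample is known to be a fault but the model fails to detect it, that sample can be passed through the loss function again to update the parameters. In other words, as more data become available, the model can continue to learn the characteristics of the new data to improve its detection rate and generalization ability, which the traditional MPCA model cannot do.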
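The idea of updating the parameters from a single newly labelled sample can be sketched as one gradient step on the cross-entropy loss. To stay short, the example uses a linear softmax classifier as a stand-in for the full LSTM-BN network and plain stochastic gradient descent as the optimizer; both choices, along with the names `predict_proba` and `update_on_sample` and the learning rate, are assumptions made for illustration.

```python
import math


def softmax(logits):
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, bias, x):
    """Class probabilities of a linear softmax classifier."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)


def update_on_sample(weights, bias, x, label, lr=0.1):
    """One in-place SGD step on the cross-entropy loss for one sample.

    Returns the loss measured before the update.
    """
    probs = predict_proba(weights, bias, x)
    for k in range(len(weights)):
        # Gradient of cross-entropy w.r.t. logit k is p_k - 1[k == label].
        grad = probs[k] - (1.0 if k == label else 0.0)
        for j in range(len(x)):
            weights[k][j] -= lr * grad * x[j]
        bias[k] -= lr * grad
    return -math.log(probs[label])


# Usage: a missed fault (label 1) is fed back to the model a few times.
weights = [[0.0, 0.0], [0.0, 0.0]]  # 2 classes x 2 features
bias = [0.0, 0.0]
missed_fault = [1.0, -1.0]
losses = [update_on_sample(weights, bias, missed_fault, label=1)
          for _ in range(3)]
```

Each call lowers the loss on the fed-back sample, so the model becomes more likely to flag similar samples as faults. A model fitted once by matrix decomposition, such as MPCA, has no comparable incremental step.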
References:
[1] REN H, QU J F, CHAI Y, et al. Deep learning for fault diagnosis: the state of the art and challenge [J]. Control and Decision, 2017, 32(8): 1345-1358. (in Chinese)
[2] ZHAO C H, WANG F L, YAO Y, et al. Phase-based statistical modeling, online monitoring and quality prediction for batch processes [J]. Acta Automatica Sinica, 2010, 36(3): 366-374. (in Chinese)
[3] HUNG H, WU P, TU I, et al. On multilinear principal component analysis of order-two tensors [J]. Biometrika, 2012, 99(3): 569-583.
[4] WANG J, HE Q P, QIN S J, et al. Recursive least squares estimation for run-to-run control with metrology delay and its application to STI etch process [J]. IEEE Transactions on Semiconductor Manufacturing, 2005, 18(2): 309-319.
[5] YU J. Fault detection using principal components-based Gaussian mixture model for semiconductor manufacturing processes [J]. IEEE Transactions on Semiconductor Manufacturing, 2011, 24(3): 432-444.
[6] JACKSON J E, MUDHOLKAR G S. Control procedures for residuals associated with principal component analysis [J]. Technometrics, 1979, 21(3): 341-349.
[7] WANG J L, MA L Y, QIU K P, et al. Multi-phase batch processes fault detection based on support vector data description [J]. Chinese Journal of Scientific Instrument, 2017, 38(11): 2752-2761. (in Chinese)
[8] HE Q P, WANG J. Fault detection using the k-nearest neighbor rule for semiconductor manufacturing processes[J]. IEEE Transactions on Semiconductor Manufacturing, 2007, 20(4): 345-354.
[9] GRAVES A. Supervised Sequence Labelling with Recurrent Neural Networks[M]. Berlin: Springer, 2012: 37-45.
[10] HINTON G E, SALAKHUTDINOV R R. Reducing the dimensionality of data with neural networks [J]. Science, 2006, 313(5786): 504-507.
[11] WU S, ZHANG L, ZHENG W, et al. A DBN-based risk assessment model for prediction and diagnosis of offshore drilling incidents [J]. Journal of Natural Gas Science and Engineering, 2016, 34: 139-158.
[12] SUN J, XIAO Z, XIE Y. Automatic multi-fault recognition in TFDS based on convolutional neural network [J]. Neurocomputing, 2017, 222: 127-136.
[13] LU C, WANG Z-Y, QIN W-L, et al. Fault diagnosis of rotary machinery components using a stacked denoising autoencoder-based health state identification [J]. Signal Processing, 2017, 130: 377-388.
[14] de BRUIN T, VERBERT K, BABUŠKA R. Railway track circuit fault diagnosis using recurrent neural networks [J]. IEEE Transactions on Neural Networks and Learning Systems, 2017, 28(3): 523-533.
[15] TALEBI N, SADRNIA M A, DARABI A. Robust fault detection of wind energy conversion systems based on dynamic neural networks [J]. Computational Intelligence and Neuroscience, 2014, 4(7): 580972.
[16] PASCANU R, MIKOLOV T, BENGIO Y. On the difficulty of training recurrent neural networks [C]// Proceedings of the 30th International Conference on Machine Learning: Vol. 28. Atlanta, GA: JMLR, 2013: 1310-1318. https://arxiv.org/pdf/1211.5063.pdf
[17] IOFFE S, SZEGEDY C. Batch normalization: accelerating deep network training by reducing internal covariate shift [C]// Proceedings of the 32nd International Conference on Machine Learning: Vol. 37. Atlanta, GA: JMLR, 2015: 448-456. https://arxiv.org/pdf/1502.03167.pdf
[18] GOODFELLOW I, BENGIO Y, COURVILLE A. Deep learning [M]. Cambridge, MA: MIT Press, 2016: 172-187.
[19] DUCHI J, HAZAN E, SINGER Y. Adaptive subgradient methods for online learning and stochastic optimization [J]. Journal of Machine Learning Research, 2011, 12: 2121-2159.
[20] WISE B M, GALLAGHER N B, BUTLER S W, et al. A comparison of principal component analysis, multiway principal component analysis, trilinear decomposition and parallel factor analysis for fault detection in a semiconductor etch process [J]. Journal of Chemometrics, 1999, 13(3/4): 379-396.
[21] CHANG Y Q, WANG S, TAN S, et al. Research on multistage-based MPCA modeling and monitoring method for batch processes [J]. Acta Automatica Sinica, 2010, 36(9): 1312-1320. (in Chinese)
[22] TAO D Q, BO C M, YI H. Semiconductor etch process monitoring based on multi-stage MPCA [J]. Chinese Journal of Sensors and Actuators, 2015, 28(6): 798-802. (in Chinese)
[23] GLOROT X, BORDES A, BENGIO Y. Deep sparse rectifier neural networks [C]//Proceedings of the 2011 Fourteenth International Conference on Artificial Intelligence and Statistics: Vol. 15. Atlanta, GA: JMLR, 2011: 315-323.