
Multi-view feature fusion for rolling bearing fault diagnosis using random forest and autoencoder

2019-10-15

Sun Wenqing Deng Aidong Deng Minqiang Zhu Jing Zhai Yimeng Cheng Qiang Liu Yang

(National Engineering Research Center of Turbo-Generator Vibration, Southeast University, Nanjing 210096, China)(School of Energy and Environment, Southeast University, Nanjing 210096, China)

Abstract: To improve the accuracy and robustness of rolling bearing fault diagnosis under complex conditions, a novel method based on multi-view feature fusion is proposed. Firstly, multi-view features from the perspectives of the time domain, frequency domain and time-frequency domain are extracted through the Fourier transform, Hilbert transform and empirical mode decomposition (EMD). Then, the random forest (RF) model is applied to select features which are highly correlated with the bearing operating state. Subsequently, the selected features are fused via the autoencoder (AE) to further reduce the redundancy. Finally, the effectiveness of the fused features is evaluated by the support vector machine (SVM). The experimental results indicate that the proposed method based on multi-view feature fusion can effectively reflect the differences in the state of the rolling bearing and improve the accuracy of fault diagnosis.

Key words: multi-view features; feature fusion; fault diagnosis; rolling bearing; machine learning

The rolling bearing is one of the key parts in a wind turbine drive train. Due to the harsh operating environment of wind turbines, rolling bearing failures occur frequently. According to statistics, 30% of rotating machinery failures are caused by rolling bearings[1] and about 80% of wind turbine gearbox failures are caused by bearing failures[2]. Therefore, bearing fault diagnosis is essential for the efficient and reliable operation of wind turbines.

Traditionally, the fault diagnosis of wind turbine rolling bearings is based on the spectrum analysis of vibration signals[3]. The key step is to extract the fault characteristic frequency from noisy signals. Spectrum analysis methods include the Fourier transform[4] and the Hilbert transform, along with joint time-frequency analysis methods such as empirical mode decomposition (EMD)[5] and variational mode decomposition (VMD). These traditional methods study the bearing vibration signal only from a single perspective, and the features are manually extracted, which depends heavily on prior knowledge of signal processing techniques and diagnostic expertise[6] and cannot meet the real-time and portability requirements of fault diagnosis in the era of big data.

In order to comprehensively analyze the difference between faults, it is necessary to examine the vibration signal of the bearing from multiple perspectives, so as to grasp the overall state of the bearing. In this paper, both the spectrum analysis of the vibration signal and the time-frequency analysis are performed. Then, the features are extracted from the time domain, frequency domain (frequency spectrum and envelope spectrum) and time-frequency domain (EMD). Although multi-view features are highly complementary, these features tend to be redundant, which is not conducive to fault diagnosis. Therefore, feature selection and feature fusion before classification are necessary.

Feature selection is the process of selecting the most effective subset of the original features to reduce the dimension of the feature set, and it is an important means of improving the performance of a learning algorithm[7]. Features are ranked according to certain evaluation criteria, and the number of selected features is then determined according to need. Commonly used evaluation criteria include feature missing values, feature variance and Pearson correlation coefficients. In recent years, many studies have used the random forest (RF) model for feature selection[8-9]. Random forest derives feature importance from its performance on the training data, by calculating the importance of each feature in each tree and then taking a weighted average to obtain the final importance assessment.

Feature fusion further reduces feature redundancy after feature selection. The methods can be divided into linear and nonlinear fusion. Common linear fusion methods are principal component analysis (PCA) and linear discriminant analysis (LDA). Locally linear embedding (LLE) and the autoencoder are typical nonlinear fusion methods. In an autoencoder (AE), compression and decompression of features are implemented in an unsupervised manner via a neural network[10]. Feature selection and fusion facilitate model training and visualization; the deeper purpose is to transform raw features into a new and concise representation[11].

The multi-view feature set can make full use of the information of the original signal to reflect the difference among states. The extraction of multi-view features and the reduction of redundancy by feature fusion are two essential components in this model.

In this paper, a new strategy for rolling bearing fault diagnosis using multi-view features is proposed. The scheme consists of two parts: feature extraction and feature fusion. Specifically, features are extracted from the time domain, frequency domain and time-frequency domain. Then, the random forest (RF) model and the autoencoder are employed for feature selection and feature fusion, respectively. Finally, the support vector machine (SVM)[12] is introduced to evaluate the features.

The main contributions are summarized as follows:

1) The technology of feature extraction from multiple perspectives is proposed in this work, which takes advantage of the strengths of signal processing in feature extraction.

2) Aiming at the redundancy of multi-view features, a novel feature fusion strategy based on the random forest and the autoencoder is developed.

3) The validity and superiority of the proposed method in practical applications are verified through experimental analyses.

1 Signal Processing and Feature Extraction

The flow chart of the proposed method is shown in Fig.1. The main part of the method includes feature extraction and feature fusion.

Fig.1 The framework of the proposed method

In this section, the extraction of features from the vibration signal in the time domain, frequency domain and time-frequency domain is introduced.

1.1 Time domain features

The original vibration signal contains information about the normal vibration component, the fault vibration component and environmental noise. To reduce the dependence on prior knowledge, 43 statistical features are extracted from the time-domain waveform. Some of them are as follows:

Let x(i), i = 1, 2, …, N, denote the sampled vibration signal.

Mean

\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x(i)  (1)

Maximum amplitude

x_{\max} = \max_{i}|x(i)|  (2)

Maximum peak

x_{p} = \max_{i} x(i) - \min_{i} x(i)  (3)

Standard deviation

\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x(i)-\bar{x}\right)^{2}}  (4)

Square root amplitude

x_{r} = \left(\frac{1}{N}\sum_{i=1}^{N}\sqrt{|x(i)|}\right)^{2}  (5)

Skewness

S = \frac{1}{N}\sum_{i=1}^{N}\left(\frac{x(i)-\bar{x}}{\sigma}\right)^{3}  (6)

Kurtosis

K_{u} = \frac{1}{N}\sum_{i=1}^{N}\left(\frac{x(i)-\bar{x}}{\sigma}\right)^{4}  (7)

Clearance factor

C_{f} = \frac{x_{\max}}{x_{r}}  (8)
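As an illustration, these statistics can be computed directly with numpy. This is a sketch under the assumption that the input is a 1-D array of samples and that Eq.(3) denotes the peak-to-peak value; the function and key names are ours, not the paper's.

```python
import numpy as np

def time_domain_features(x):
    """Compute a few of the time-domain statistics of Eqs.(1)-(8).
    `x` is a 1-D vibration signal; names are illustrative."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()                             # Eq.(1)
    max_amp = np.abs(x).max()                   # Eq.(2)
    p2p = x.max() - x.min()                     # Eq.(3), peak-to-peak reading
    std = x.std()                               # Eq.(4)
    sra = np.sqrt(np.abs(x)).mean() ** 2        # Eq.(5) square root amplitude
    skew = ((x - mean) ** 3).mean() / std ** 3  # Eq.(6)
    kurt = ((x - mean) ** 4).mean() / std ** 4  # Eq.(7)
    clearance = max_amp / sra                   # Eq.(8)
    return {"mean": mean, "max_amp": max_amp, "p2p": p2p, "std": std,
            "sra": sra, "skew": skew, "kurt": kurt, "clearance": clearance}
```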

1.2 Frequency domain features

The time-domain data points are first transformed into the frequency domain by the Fourier transform and the Hilbert transform, and the signal is then analyzed there. The frequency domain features include all 43 basic statistical features, together with additional frequency-related features, shown as follows:

Let s(k) denote the amplitude of the spectrum at frequency f_k, k = 1, 2, …, K.

Average frequency

\bar{s} = \frac{1}{K}\sum_{k=1}^{K} s(k)  (9)

Center frequency

f_{c} = \frac{\sum_{k=1}^{K} f_{k}s(k)}{\sum_{k=1}^{K} s(k)}  (10)

Frequency root mean square

f_{\mathrm{rms}} = \sqrt{\frac{\sum_{k=1}^{K} f_{k}^{2}s(k)}{\sum_{k=1}^{K} s(k)}}  (11)

Frequency standard deviation

\sigma_{f} = \sqrt{\frac{\sum_{k=1}^{K}\left(f_{k}-f_{c}\right)^{2}s(k)}{\sum_{k=1}^{K} s(k)}}  (12)
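A hedged numpy sketch of these spectral statistics, assuming the amplitude spectrum is obtained from an FFT of the signal (the paper's exact definitions may differ slightly):

```python
import numpy as np

def frequency_domain_features(x, fs):
    """Spectral statistics in the spirit of Eqs.(9)-(12),
    computed from the one-sided amplitude spectrum."""
    x = np.asarray(x, dtype=float)
    spec = np.abs(np.fft.rfft(x))                  # amplitude spectrum s(k)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)    # frequency axis f_k
    avg_amp = spec.mean()                                          # Eq.(9)
    fc = (freqs * spec).sum() / spec.sum()                         # Eq.(10)
    rmsf = np.sqrt((freqs ** 2 * spec).sum() / spec.sum())         # Eq.(11)
    stdf = np.sqrt(((freqs - fc) ** 2 * spec).sum() / spec.sum())  # Eq.(12)
    return avg_amp, fc, rmsf, stdf
```

For a pure sine at 50 Hz, the center frequency should come out near 50 Hz with a small frequency standard deviation.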

1.3 Time frequency domain features

EMD adaptively decomposes a signal into a finite number of intrinsic mode functions (IMFs) of unequal bandwidth, from high frequency to low frequency. The internal volatility of the signal is reflected in the extracted IMFs, which contain real physical information about the signal. In addition to the basic 43 statistical features, entropy features and the fractal dimension are also extracted from the selected IMFs.

1) Entropy feature. Entropy reflects the degree of chaos in a signal and characterizes its uncertainty. Assume that the IMF sequence is X = {x1, x2, …, xn}, and that the probability of the i-th data point is

p_{i} = \frac{x(i)}{\sum_{j=1}^{n} x(j)}  (13)

Then, the information entropy of the IMF signal can be expressed as

H(X) = -\sum_{i=1}^{n} p_{i}\ln p_{i}  (14)

2) Fractal dimension. The box dimension characterizes the irregularity and complexity of a fractal set at different scales, and the fractal dimension can describe the structural features of signals. Specific calculation methods can be found in Ref.[13].
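The entropy of Eqs.(13) and (14) can be sketched as follows, under the common convention (an assumption on our part) that the probabilities are the normalized absolute amplitudes of the sequence:

```python
import numpy as np

def signal_entropy(x, eps=1e-12):
    """Information entropy of a sequence per Eqs.(13)-(14),
    with p_i taken as |x_i| / sum|x_j| (assumed convention)."""
    x = np.abs(np.asarray(x, dtype=float))
    p = x / (x.sum() + eps)   # Eq.(13): normalized amplitudes
    p = p[p > 0]              # drop zero-probability points
    return float(-(p * np.log(p)).sum())  # Eq.(14)
```

A uniform sequence of length n gives the maximum entropy ln n, while a single spike gives entropy near zero.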

2 Feature Selection and Fusion

2.1 Feature selection

In this step, a random forest is applied to the training data, and the decisions of all the trees in the forest are aggregated by majority voting for classification[14]. The feature importance assessment using RF computes the contribution of each feature in each tree, takes the average, and finally compares the contributions of different features[15].

Metrics of contribution include the Gini index (G for short) and the out-of-bag (OOB) error. In the Gini-based evaluation, the Gini index of node m is

G_{m} = \sum_{k=1}^{K} P_{mk}\left(1 - P_{mk}\right) = 1 - \sum_{k=1}^{K} P_{mk}^{2}  (15)

where K is the number of categories and P_{mk} represents the proportion of category k in node m. The importance of the feature X_i at node m is the change in the Gini index before and after branching at node m:

\mathrm{VIM}_{im} = G_{m} - G_{l} - G_{r}  (16)

where G_l and G_r are the Gini indices of the two child nodes after the split. If feature X_i appears in the node set M of tree t, its importance in that tree is

\mathrm{VIM}_{i}^{(t)} = \sum_{m \in M} \mathrm{VIM}_{im}  (17)

Assume that the forest has a total of n trees; then

\mathrm{VIM}_{i} = \sum_{t=1}^{n} \mathrm{VIM}_{i}^{(t)}  (18)

All the obtained importance scores are normalized to obtain the score of the importance of the feature X_i:

\mathrm{VIM}_{i}' = \frac{\mathrm{VIM}_{i}}{\sum_{j} \mathrm{VIM}_{j}}  (19)
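A small numpy illustration of the Gini computations of Eqs.(15) and (16); function names are ours, and the split importance is written in the unweighted form used here, whereas library implementations such as scikit-learn weight the child nodes by their sample fractions:

```python
import numpy as np

def gini_index(labels):
    """Gini index of Eq.(15) for the labels reaching one node."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()         # category proportions P_mk
    return 1.0 - (p ** 2).sum()

def split_importance(parent, left, right):
    """Change in the Gini index across one split, cf. Eq.(16)."""
    return gini_index(parent) - gini_index(left) - gini_index(right)
```

For a perfectly separating split of a balanced two-class node, the importance equals the parent's Gini index of 0.5.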

2.2 Feature fusion

Since the autoencoder is a neural network model, standardizing the data before input effectively improves the convergence speed and the training effect. All feature vectors were standardized by removing the mean and scaling to unit variance. The standard score of a sample x is calculated as

z = \frac{x - u}{s}  (20)

where u is the mean of the training samples and s is the standard deviation of the training samples.
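A minimal sketch of Eq.(20): u and s are fitted on the training set only and reused for the test set, equivalent to what scikit-learn's StandardScaler does.

```python
import numpy as np

def standardize(train, test):
    """Z-score of Eq.(20), with statistics from the training set only."""
    u = train.mean(axis=0)   # per-feature mean u
    s = train.std(axis=0)    # per-feature standard deviation s
    return (train - u) / s, (test - u) / s
```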

The autoencoder consists of an encoder and a decoder. To learn more abstract features, feature fusion can be performed by operating on the trained encoder. The basic architecture of an autoencoder is shown in Fig.2.

Fig.2 Architecture of an autoencoder

The first layer is the input layer, the middle layer is the hidden layer, and the output of the hidden layer is

h = f(Wx + b), h ∈ R^m  (21)

where W is the weight matrix of the hidden layer nodes; b is the bias of the hidden layer neurons; f is the activation function; and m is the number of hidden layer neurons.

The trained encoder can conduct a nonlinear transformation to reduce the dimensionality of high-dimensional features by back propagation and gradient descent.
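A didactic numpy sketch of such a one-hidden-layer autoencoder, with a tanh encoder as in Eq.(21) and a linear decoder, trained by plain gradient descent; this is an illustrative implementation under our own hyperparameters, not the paper's exact model.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_autoencoder(X, n_hidden=2, epochs=500, lr=0.05):
    """Train encoder h = tanh(Wx + b) and a linear decoder on the
    squared reconstruction error; returns the encode function,
    which produces the fused low-dimensional features."""
    n, d = X.shape
    W1 = rng.normal(0, 0.1, (d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.1, (n_hidden, d)); b2 = np.zeros(d)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)          # encoder, cf. Eq.(21)
        Y = H @ W2 + b2                   # linear decoder
        E = Y - X                         # reconstruction error
        # backpropagation of the mean squared error
        gW2 = H.T @ E / n; gb2 = E.mean(axis=0)
        dH = (E @ W2.T) * (1 - H ** 2)    # tanh derivative
        gW1 = X.T @ dH / n; gb1 = dH.mean(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2
    return lambda Z: np.tanh(Z @ W1 + b1)
```

After training, only the encoder is kept and applied to standardized features to obtain the fused representation.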

3 Experiments

3.1 Datasets

Bearing vibration data is obtained from Case Western Reserve University[17]. The test stand is shown in Fig.3.

Fig.3 The bearing test stand

The test stand consists of a 2-hp motor (left), a torque transducer/encoder (center), a dynamometer (right), and control electronics (not shown). The test bearings support the motor shaft. Single point faults were introduced to the test bearings using electro-discharge machining with fault diameters of 2.13, 4.26 and 6.39 mm. An accelerometer was attached to the motor housing at the drive end of the motor.

Vibration data were collected at a sampling frequency of 12 kHz, and each sample contains 4 000 points. To more clearly demonstrate the gain of the proposed method in classification accuracy, noise with an SNR of -2 dB is added to the bearing vibration signals. The details of the bearing datasets are shown in Tab.1.

Tab.1 Descriptions of bearing datasets

The dataset contains 300 signal samples covering nine different conditions, i.e., the normal condition, ball faults, inner race faults and outer race faults. When the model is trained, the entire sample set is randomly divided into the training set and the test set. The training set contains 200 samples, and the test set contains 100 samples.

3.2 Signal transformation and decomposition

The feature extraction for the vibration signal is from three aspects: time domain, frequency domain and time-frequency domain. The time domain waveform, frequency spectrum and envelope spectrum of the signal with the outer race fault are given in Fig.4.

The waveforms of the first five IMFs containing the main feature information are given in Fig.5. Time-frequency domain features are extracted from the first five IMF components obtained after EMD decomposition.

3.3 Feature extraction

The time domain features are statistical features. The frequency domain features include statistical features and four frequency related features based on the frequency spectrum and envelope spectrum, and the time-frequency domain features include statistical features, three entropy features, and a box-dimensional feature. The dimensions of time domain features, frequency domain features and time-frequency features are 43, 94 and 235, respectively, as shown in Tab.2. Therefore, features with a dimension of 372 are extracted for each sample.


Fig.5 The first five IMFs yielded by EMD

Tab.2 Composition of the multi-view feature set

Time domain: 43 statistic features (43 features in total)
Frequency domain: 86 statistic features, 8 frequency features (94 features in total)
Time-frequency domain: 215 statistic features, 15 entropy features, 5 fractal dimension features (235 features in total)

A multi-view feature set can comprehensively reveal the conditions of the bearing from multiple perspectives. However, among them there are many features that do not change with fault conditions, which make fault recognition more difficult. Usually, a good classification result is not guaranteed if all the features are directly fed to the classifier without feature fusion.

3.4 Feature selection and fusion

When using the random forest (RF) model for feature selection, the features and labels of the training set are fed into the model for training. The trained RF model can give the importance of each feature based on its performance on the decision tree branches. The specific parameters of the RF are shown in Tab.3.

Tab.3 Parameters of random forest and autoencoder

Fig.6 shows the importance of each feature from four different feature sets: time domain features (T), frequency domain features consisting of the frequency spectrum (FFT) and envelope spectrum (HT), and time-frequency domain features from the IMFs (EMD). The histogram of feature importance is shown in Fig.6(b). For ease of observation, features with an importance of less than 0.000 1 have been removed from Fig.6(a).

Fig.6 Feature importance of the multi-view features: (a) importance of each feature; (b) histogram of feature importance

As can be seen from Fig.6(a), the features extracted from the original waveform and frequency spectrum are generally more important than those from the envelope spectrum and IMFs. To more clearly reveal the difference between states, it is necessary to take the important features and discard useless ones based on feature importance.

It is evident in Fig.6(b) that only a few features are of high importance. However, the optimization of the number of selected features still needs to be executed since a small number of important features may fail to guarantee a good reconstruction of fault characteristics. The optimization of feature selection is shown in Fig.7.

Fig.7 Relationship between the number of selected features and the accuracy of SVM classification

To find the optimal number of selected features, the model is evaluated on a validation set, where the training set is further divided into a smaller training set and a validation set with a validation size of 0.33. As seen in Fig.7, the classification accuracy is not the best when only the 20 most important features are selected. Therefore, to improve the accuracy of the classifier, it is necessary to also select some features of lower importance. In this experiment, the number of selected features is 120.
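The sweep behind Fig.7 can be sketched as below, assuming scikit-learn is available; the model parameters here are illustrative, not the paper's.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def best_k_features(X, y, ks, seed=0):
    """Rank features by RF importance, then score an SVM on a held-out
    validation set (validation size 0.33, as in the paper) for each
    candidate number k of top-ranked features."""
    Xtr, Xval, ytr, yval = train_test_split(X, y, test_size=0.33,
                                            random_state=seed)
    rf = RandomForestClassifier(n_estimators=100, random_state=seed)
    rank = np.argsort(rf.fit(Xtr, ytr).feature_importances_)[::-1]
    scores = {}
    for k in ks:
        idx = rank[:k]   # keep the k most important features
        scores[k] = SVC().fit(Xtr[:, idx], ytr).score(Xval[:, idx], yval)
    return scores
```

Plotting `scores` against k reproduces the kind of curve shown in Fig.7, from which the best k is read off.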

The random forest (RF) can eliminate the irrelevant features which fusion methods cannot remove completely and the autoencoder can further reduce the redundancy of the selected features.

The autoencoder model used in this experiment has a hidden layer and uses L1 regularization to prevent model overfitting. Specific parameters are shown in Tab.3.

As an unsupervised machine learning model, the autoencoder has two important parameters to optimize, the number of hidden layer nodes and the number of epochs. The results optimized by grid search are shown in Fig.8.

Fig.8 Grid search optimization results

As shown in Fig.8, there are three combinations that allow the SVM to obtain good classification results, marked as points A, B and C. Their numbers of hidden layer nodes are 10, 20 and 30, and their epochs are 450, 250 and 450, respectively. To maximize the accuracy, point C is selected. Therefore, the number of hidden layer nodes and the number of epochs are set to 30 and 450, respectively. The errors of the autoencoder on the training set and the validation set are shown in Fig.9.

Fig.9 Autoencoder errors

3.5 SVM classification results

To prove the superiority of the proposed feature extraction and fusion method, the SVM classification results of the proposed method and other methods are listed in Tab.4.

From the perspective of error generation, the error mainly comes from the outer race fault classification. The differences of these four outer race faults are merely the angle or depth of the crack. The time domain features cannot fully reveal the difference between the faults. After introducing the features of the frequency domain and the time-frequency domain, the classification error is reduced.

According to the results in Tab.4, compared with the SVM performance on the original time-domain features and on all the original features, the improvement in classification accuracy is not obvious. Adding features from the frequency domain and the time-frequency domain introduces a large number of irrelevant features, which obscures the differences between faults. Therefore, when the random forest is used to remove a large number of less important features, the accuracy of the classifier is greatly improved, from 89% to 97%. Furthermore, when the autoencoder is used to reduce the redundancy of the feature set, the classification accuracy is improved by another 2% and reaches 99%.

To further reveal the superiority of feature dimension reduction using the RF and autoencoder, comparisons are made between three different feature fusion methods, namely PCA, kernel PCA (KPCA) and locally linear embedding (LLE), and the proposed method. The results are shown in Tab.4, where N represents the dimension of the fused features.

Tab.4 SVM classification accuracy of different feature sets

KPCA uses a better-performing polynomial kernel function instead of the Gaussian kernel. For PCA and KPCA, the parameter n_components is set to 40. Limited by the algorithm, LLE integrates the original features into 10 dimensions and n_neighbors is set to 100.

It can be observed that the features of PCA fusion are the worst in SVM classification, and they even fail to achieve the accuracy before fusion. In contrast, the KPCA shows the difference clearly and the accuracy increases by 2.3% compared to that of the original features. LLE demonstrates a great improvement in accuracy, mainly due to its good nonlinear mapping ability. Among these models, the proposed RF+AE model has the highest accuracy, which further illustrates the robustness of the methods for extracting effective information and reducing feature redundancy.

4 Conclusions

1) Multi-view features can fully grasp the fault state of the bearing. After feature selection and fusion, features from multiple views can clearly reveal the state difference between normal and fault conditions. Experiments show that the fault feature set can be constructed well when the features of the vibration signals are extracted from the time domain, frequency domain and the time-frequency domain.

2) Combined with the feature selection and fusion method of the random forest and autoencoder in this paper, the accuracy of bearing fault classification can be effectively improved. The classification accuracy reaches 99.10%, which exceeds the accuracy of the feature set from the single perspective and outperforms other feature fusion methods.

3) In future studies, more features will be added to achieve better classification results and the performance of the fused features can be enhanced by using a deeper autoencoder. In addition, the proposed method can be applied to the fault diagnosis of gearboxes and life prediction of rotating machinery.
