,,,,
College of Civil Aviation/College of Flight,Nanjing University of Aeronautics and Astronautics,Nanjing 211106,P.R.China
(Received 8 June 2020;revised 25 June 2020;accepted 3 July 2020)
Abstract: Anomalous trajectory detection and traffic flow classification for complicated airspace are of vital importance to safety and efficiency analysis. Some researchers employed density-based unsupervised machine learning method to exploit these trajectories related to air traffic control(ATC)actions. However,the quality of position data and the tiny density difference between traffic flows in the terminal area make it particularly challenging. To alleviate these two challenges,this paper proposes a novel framework which combines robust deep auto-encoder(RDAE)model and density peak(DP)clustering algorithm. Specifically,the RDAE model is utilized to reconstruct denoising trajectory and identify anomaly trajectories in the terminal area by two different regularizations. Then,the nonlinear components captured by the encoder of RDAE are input in the DP algorithm to classify the global traffic flows. An experiment on a terminal airspace at Guangzhou Baiyun Airport(ZGGG)with anomaly label shows that the proposed combination can automatically capture non-conventional spatiotemporal traffic patterns in the aircraft movement. The superiority of RDAE and combination are also demonstrated by visualizing and quantitatively evaluating the experimental results.
Key words:ADS-B data;robust deep auto-encoder;anomaly detection;trajectory clustering
The foreseen air traffic demand[1]of sustainable and limited capacity will impose new challenges on efficiency,bringing security risks to the already congested terminal areas,especially busy airports like Beijing and Guangzhou. The Civil Aviation Administration of China is planning to improve the current air traffic management system in an intelligent way[2]. Particularly,the rise of machine learning technology and accessible data from the automatic dependent surveillance broadcast system(ADS-B)can accelerate the improvement by data extraction of operational anomalies and air traffic control(ATC)action patterns in terminal maneuvering areas(TMAs).
Currently,several tactical initiatives on aircraft route focus on maintaining TMA situations,such as circumnavigation,holding,and direct fly. These tactical ATC actions can associate to the recorded ADS-B data. The actual trajectories that are composed of spatiotemporal points have different characteristics and meanings[3]. In detail,trajectories with rare occurrence or a certain degree of anomaly can present significant events in TMAs. Anomalous trajectory data can be regarded as valuable analysis materials for airspace safety. And trajectories that are already patterned but different from standard routes correspond to controllers’preference of frequent tactical ATC actions on the standard procedure[4]. The extraction of non-conventional patterns can enhance the accuracy of the controller workload assessment,design of flight procedure and calculation of airspace complexity. Both two types of traffic flows have respective data features different from standard routes,so that unsupervised machine learning algorithms are able to classify them automatically.
In previous literature,anomaly detection based on clustering(ADCluster)identified nominal trajectories and regarded the rest not belonging to any cluster as anomalous flights. There were two main methods in the research of ADCluster. The first focused on clustering through statistical or priori features,such as the local average velocity of the trajectory,the density of the aircraft,and the distance between the aircraft and the reference point[5]. Eckstein[6]decomposed the trajectory into various segments,which was defined as the spatial measurement of significant changes in the trajectory position or direction,and then used clustering results to monitor the anomalous behaviors of the aircraft. Gariel et al.[7]summarized and improved the first methods.First,the spatial turning points were clustered to obtain discrete sequence. Second,the main traffic flow was identified by clustering the sequence contained spatial variation. Since these methods do not directly use trajectory data,they may have limited application in the development of real-time trajectory based operations(TBO)tools in the future.
Gariel proposed the second widely used method for clustering trajectories at the level of position measurement. It consists of two well-differentiated steps. Firstly,principal component analysis(PCA)was used to eliminate redundant information on the resampling data. Secondly,density-based clustering algorithm(DBSCAN)identified anomalous trajectories and clustered them at the linear dimensionality reduced vector. The position-based data can be utilized to classify traffic flows in real time[8],which is the principle behind air traffic flow modeling[9].Some researchers used position measurement and the differentiated method to identify the nominal trajectory pairs in en-routes or terminal areas[10-12].
In spite of the good quality results,limitations in ADCluster with the position measurement have been pointed out in studies[13-14]:The poor sensitivity to short duration anomalies. These researchers utilized recurrent neural networks(RNNs)and vector auto-regressive(VAR)method to model the surveillance data and identify the anomaly trajectories. Alternatively,Olive et al.[15]utilized auto-encoder networks,a kind of deep learning which has been proved successful at anomaly detection,to identify anomalous trajectories in en-route. However,aircraft trajectories in terminal areas have shorter duration anomalies than those in en-routes. Without the enhancement in anomaly detection,the standard DAE model cannot carry out this work fairly well. And the noise in ADS-B data will further impair anomaly in trajectories. Thus,this paper introduces two regularization methods based on robust deep auto-encoder(RDAE)model to smooth the data noise and detect anomaly trajectories.
Moreover,other research direction of the trajectory clustering has not been neglected. A state-ofthe-art application by trajectory clustering is extracting the non-conventional pattern trajectories from the actual aircraft operations. Conde et al.[16]separated the weather-related non-conventional patterns from all non-conforming behaviors in en-routes,and then provided a reference for airspace resource management.However,Andrienko et al.[17]found classical DBSCAN algorithm cannot simultaneously extract nominal and non-conventional trajectories in en-routes with only one set of parameters. Theoretically,hierarchical DBSCAN-based algorithms can better cluster dataset with different densities than classical DBSCAN. But in fact,Gallego et al.[4]found that quantitative cluster values went down though hierarchical DBSCAN-based algorithms when hyper-parameters were finely tuned to enable discrimination between small clusters. He attributed this phenomenon to special nature of the data samples being a result of the PCA application to actual flight trajectories. The dimensionality reduction method was not improved. On the contrary,a parameter adjustment model has been established to customize the extraction of clusters with RDBSCAN.
Note that,there are two limitations in the trajectory clustering for the non-conventional traffic patterns:(1)Using the PCA method to reduce position data dimension will damage the necessary details.(2)DBSCAN-based methods are sensitive to the hyper-parameters when clustering the non-conventional traffic patterns. We further utilize the encoder of trained RDAE model as substitute for PCA method due to the preferable dimensionality reduction ability to extract key information. Then,a DP clustering algorithm is combining with RDAE encoder to classify the nominal trajectories and nonconventional tactical traffic flows simultaneously.The evaluation results show that the proposed framework of the combination of RDAE model and DP algorithm outperforms baseline models.
The aircraft trajectory dataset used in this paper is denoted as:τ1,…,τNtraj. Each trajectory consists of a set of time and position ordered data,i.e.τ={(ti,x?i,y?i,z?i)}(i=1,…,Npoint),where the time isti∈Z+and the measurements arex?i,y?i,z?i∈R.
The ADS-B data contain the following information:timestamp,flight number,departure airport,target airport,current location(using the longitude and latitude of WGS-84 coordinates),sea level altitude (m),heading,ground speed,and vertical speed. The experiment trajectory data come from the TMA of Guangzhou Baiyun Airport,and it is separated from the national flights data by the following methods.
First,the essence of the deep learning model is to construct a main traffic flow model in the terminal area through trajectory data. The actual trajectory timestamp cannot form traffic flows,so we shifttime so that the first time measurement in each trajectory is always att=0. Second,by the Mercator projection method with the center of the ZGGG runway,the latitude and longitude coordinates are converted into east-north-up(ENU)coordinates(represented byxyz). Finally,we intercepte trajectory points within a rectangle of -120 km ≤x≤120 km and -120 km ≤y≤120 km(the rectangle is greater than TMA). And the target airport of all trajectories is ZGGG. However,some aircraft do not update the target airport in time after landing,and the ADS-B receiver does not revise it when recording,so the intercepted data contain aircraft trajectories in other operation states. Therefore,for the intercepted data,a heuristic method is used to further distinguish the landing,takeoff,and overflying trajectories in the terminal airspace.
For each trajectory,we set the index of the closest point to the center of the runway asc=arg mini‖pi‖2,and theindex of the farthest point asf=arg maxi‖pi‖2. The time of the last measurement in the trajectory isT=maxiti,and the average descent speed isz?avg. The trajectory is then divided as:

The length of the runway at ZGGG airport is about 3 km,and the longest distance between the surrounding approach sector boundary and the runway center is about 100 km. This method works well with ADS-B data in Central South China.
For these three types of trajectories,the overflying aircraft have only a slight effect on the capacity in the terminal area. The departure aircraft use lots of performance on the initial climb process.Even if the terminal area capacity is limited due to the increased traffic complexity or severe weathers,the control methods of the departure aircraft are often limited to ground delay procedure(GDP)or predeparture,and the probability of ocurrence of anomalous trajectories is extremely low.
In order to keep sustainable operations in the terminal area,the approaching aircraft are usually tactically controlled based on the standard procedure. The control behavior reflects the complexity of the airspace[14]. Therefore,we choose the approach trajectory as the experimental data. Since the trajectories containing ATC actions occur in the airborne phase,the point after the approach trajectory data indexcis deleted. So far,it is assumed that the experimental data has been separated independently.An example of ZGGG area navigation(RNAV)standard arrival procedure in the terminal area is shown in Fig.1(available on the eAIP China[18]),in which the black lines with arrows represent the aircraft standard routes.
For the part of the anomalous trajectory detection,the key criterion is the reconstruction error,which is related to the number of trajectories. Therefore,the time is divided equally to get the same amount of trajectory points. The other preprocessing is to scale each measurement by a standardization. The data set is restored and rescaled separately in each experiment to eliminate the effect of noise and anomalous trajectories on the remaining data. In the experiment of trajectory clustering,in order to clarify the difference between trajectories,Gariel’s strategy is duplicated to add three dimensions to each trajectory point[7]

whereR=is the top left corner with coordinates ofα=arctanis the angular position in cylindrical coordinates.
This section outlines the steps of anomaly detection model and traffic flow classification. The steps are to reconstruct the trajectories,identify anomalies,and then cluster the rest trajectories.
There are multi-deficiencies with the trajectories in ADS-B data. First,the measurements are noisy. Second,the trajectories can have any number(including zero) measurements at a given time range. Third,trajectories have to share the same number of points. If we used linear interpolation to solve the third problem directly,the influence of noise would be increased. The deep auto-encoder(DAE)is widely used for denoising and anomaly detection[19-20]through compressive reconstruction of raw data. A standard DAE structure is shown in Fig.2,composed of encoder and decoder,which has the same number of units in the output layer and the input layer. And the number of middle hidden layer units is less than the number of both layer units.

Fig.2 Structure of standard DAE network
In the encoding stage,the input dataXis compressed into the hidden layerhto eliminate outliers.The connection function follows the standard practice[21],which uses the logit function to connect units

whereWis the weight connecting the input layer and the hidden layer andbEthe bias of the input layer.Similarly,the decoding function is defined as

whereWTis the weight connecting the hidden layer and the output layer andbDthe bias of the hidden layer. So,the objective function of DAE model is denoted as

However,the DAE model needs to train clear data,which is difficult to achieve due to the error of the hardware system. Thus,the RDAE model is used to reconstruct the trajectories to smooth noisy data. In the RDAE model,the outlierSand the part of the input dataLDthat is well represented by the hidden layer model are separated into two parts:X=LD+S. The outlierScontains the noise and anomalous vectors that are difficult to reconstruct.Then,the RDAE model can execute denoising and anomaly detection through employing different regularization in the loss function. In order to sparse the noise as much as possible,the loss function should be set as the sum of the reconstruction errorLDand the normalizati onterm of the‖S‖0. But for the calculation reason,one can relax the combinatorial term of the optimization by replacing it with a convex relax ati on ‖S‖1. The objective function can be denoted as

where ‖ ? ‖Fi s t he Frobenius norm of a matrix. As the parameter ofL1is regularized,a smallerλ1will promote more data to be isolated intoSas noise,and therefore minimize the reconstruction error.
We construct theL1to regularize RDAE(L1-RDAE)model. The number of input layer units and output layer units is the dimension of the data. The number of units in the middle of the hidden layer is determined by the method of eigen-dimensional estimation. The sameλ1value is selected for all input trajectories to maintain the consistency of denoising.We choose appropriate parameters in RDAE model that have the lowest reconstruction loss on the test data. Fig.3 shows theL1-RDAE reconstructed trajectories that are generated by varyingλ1,and the result of the standard DAE network with the same layers and units of RDAE.

Fig.3 Trajectory reconstruction
Fig.3(a)is the original noisy trajectories. A lowλ1leads to smooth trajectories (Fig.3(c))whereas highλ1leads to intact trajectories(Fig.3(b)). These trajectories are too smooth withλ1=0.1,and several anomaly trajectories are eliminated in the bottom right corner of Fig.3(c),which increases the reconstruction loss value to 8.85. The optimal validation loss value is 6.71,achieved by
λ1=2. Fig.3(d)is the result of DAE,which faithfully,but inappropriately,reproduces the noise.
The anomalous trajectories have different formats with the noise removed in Section 2.1. In Fig.4,a few trajectories interfered by noise are represented in blue,noise position in red circles,anomalous trajectories in green,and the remaining trajectories in light gray. It can be intuitively seen that noise appears as a break point in the trajectories,and anomaly trajectories have some different segments with nominal trajectories.

Fig.4 Visualization of noise and anomaly
The reconstructed trajectory data is processed to an×300 matrix through linear interpolation.Each row has 75 equal time position points of a flight,and each column is the measurement value of the trajectory in time and spatial dimensions. Anomalous trajectories reflect aircraft behaviors caused by unhealthy airspace[7],corresponding to the row vector of the data matrix. And the noise is distributed in each trajectory,corresponding to the column vector of the matrix. According to this feature,this paper combineL2,1regularization and RDAE model(L2,1-RDAE) to identify anomalous trajectories. TheL2,1regularization formula is as follows

It can be regarded as aL2-norm regularizing member of each column,which amplifies the influence of anomalies in the group to make anomalous vectors obvious. ThenL1regularization is used between each row to reduce the anomalous vectors' effect on low dimensional manifold,which enhances the robustness of the deep network. The anomalous trajectory data in this paper is reflected on the row vectors,soXneeds to be transposed. The objective function is as follows

However,the objective is not convex,and it is difficult to guarantee to converge the method to a global minimum. So we employ the block-wise softthresholding function[20]

wheregis a group index,ja within-group index,andλ2the regularization parameter of the objective functionL2,1. There are anomalous labels marked by the controller. We use false alarm rates to tune parameters. Fig.5 shows the experiment steps of anomaly detection.

Fig.5 Anomaly detection method based on RDAE
In fact,such a semi-supervised training method makes properλ2chosen according to the already pattern instances. Thus,the RDAE model trained onLDcontains enough information to reconstruct the main traffic flows.Foreshadowing our results in Section 3.1,we note that the linear dimension reduction of PCA on the data may lead to insufficient trajectory details. Meanwhile,deep auto-encoder model can abstract low-dimensional codes that work much better than PCA as a tool to reduce the dimensionality of data[22]. Using the trainedL2,1-RDAE encoder to output nonlinear vectors may produce better result of traffic flow classification than PCA.Thus,in the following clustering section,the combination of RDAE+DBSCAN is added as a comparative experiment to examine this opinion. For the density difference in the trajectory data,we introduce the DP clustering algorithm as a substitute for DBSCAN.
DP algorithm is a recently published method[23]for density clustering. The density calculation of each point is related to the density of neighbors. The steps are as follows.
First,the neighbors can be recognized by a soft threshold like the Gaussian kernel function or a hard threshold as defined in Eq.(8). In order to reduce the computational complexity,we employ a hard threshold to calculate the local density

Suppose that a descending orderrepresents the subscript orderthat is:ρq1≥ρq2≥…≥ρqN.
Second,the distance parameter of each data point is calculated,which is measured by the minimum distance between the point and other high-density points. However,the distance of the trajectory point with the highest density is the maximum value of its distance from all the other high-density points,that is

Therefore,each point is given two quantities:Local density and distance. Points with high local density and distances far greater than the threshold(ρ0,δ0)can be identified as density peaks or cluster centers. The cluster centers in non-conventional pattern should correspond to a smallρ0and a largeδ0.Thus,we can set the reasonable parameters to identify the non-conventional trajectories near the edge of the nominal trajectory data. After these density peaks are found,other points are assigned to the same cluster as their nearest neighbor of higher density.
By this way,DP method can cluster the data contained various densities,which is the weakness of the DBSCAN-cored clustering algorithm[24]. Furthermore,the qualitatively setting of local density and distance can also identify anomaly trajectories.This is a direction of future work.
It is worth mentioning that the combination of RDAE+DP could improve the performance in the trajectory clustering. Due to the presence of two regularizations,the effect of noise and anomalous trajectories is constrained. Thus,there is a trend that the trajectory data would converge in the low-dimensional manifold,so that the non-conventional trajectories have higher density locally. This makes small clusters that are finely different from the classical data easier to be identified. Fig.6 shows the traffic flow classification step of RDAE+DP algorithm.

Fig.6 Trajectory clustering method based on RDAE+DP
We carried out our method on 524 real trajectories at the Guangzhou terminal airspace on 2019-06-16. Twenty-seven anomalous trajectory labels,marked by licensed controllers,were collected for evaluating the performance of our method. This is typical imbalance dataset in the binary classification.We used F1-score to verify the anomaly detection performance.Its definition is as follows

where precision is the number of correct anomalous results divided by the number of all anomalous results identified by the algorithm of anomaly detection,and recall is the number of correct anomalous results divided by the number of all anomalous samples marked by controllers. A single high precision or recall cannot correctly reflect the anomaly detection performance. Thus,we visualized the optimal F1-score results of each algorithm to analyze the performance difference.
In the traffic flow classification,we used the same data set and selected the silhouette criterion(SC)value as the evaluation metrics. The SC value definition is as follows

wherea(i) is the mean distance betweeniand all other data points in the same cluster,andb(i) the smallest mean distance ofito all points in any other cluster.
In order to compare the RDAE performance in anomaly detection,we repeated Gariel’s strategy(PCA+ DBSCAN)and trained a standard deep auto-encoder(DAE)model that had the same structure with RDAE. Fig.7 shows the precision,recall and F1-score with different parameters for RDAE and PCA+DBSCAN. Table 1 shows the anomaly detection among three algorithms with the optimal F1-score. The optimal F1-score of RDAE model was 0.809 byλ2=7×10-12. The F1-score of DAE model that had same structure with RDAE was 0.632. The optimal F1-score of PCA+DBSCAN model was 0.593 under the parameters ofε=1.5 andφ=11,whereεwas the maximum distance between two samples,andφthe number of samples in a neighborhood. The F1-score of PCA+DBSCAN and DAE model was approximately 36.4% and 28% worse than the value achieved by the RDAE.At the respective optimal F1-score,we found that the detection results of both PCA+ DBSCAN and DAE models were a subset of RDAE.

Fig.7 Anomaly detection results

Table 1 Anomaly recognition results of three algorithms
Fig.8 displays the complementary set between the results of PCA+DBSCAN and RDAE. They are represented by green trajectories. These aircraft routes shared a large number of segments with standard procedure. The anomaly degree in these trajectories that PCA+DBSCAN could not recognize were relatively small. The dimension reduction process used PCA further decreased the difference between anomaly and nominal trajectories,which made DBSCAN difficult to find these anomaly trajectories. Fig.7(c)indicates that the standard DAE model have the similar F1-score with PCA+DBSCAN in this trajectory data set. The RDAE model obtained a better result than DAE by amplifying the effect of anomalous vectors throughL2,1regularization.

Fig.8 Anomaly complementary set of two algorithms
However,there were three trajectories that still cannot be detected by RDAE. These trajectories are displayed in red in Fig.8. These aircraft flied into the terminal area near the west GYA waypoint at night. The TMA was in a low-density state. The controller carried out a radar vectoring procedure for saving time and assiged a direct route for these aircraft to the final approach point. Unsuccessful identification of the three trajectories may be attributed to that there are a large number of convergence behaviors from different directions before GYA waypoint.The corresponding data vectors had a large fluctuation range,which reduced the anomaly detection accuracy of the RDAE model near this waypoint. The situation simultaneously led to the poor anomaly detection performance of the PCA+DBSCAN model on the west trajectories. The RDAE result of anomaly detection on the west trajectories was better than the PCA+DBSCAN result,which seems to prove the preferable robustness of the RDAE model. The problem could be refined by using the trajectory points before the convergence behavior. Andrienko elaborated that different interception ranges would affect the traffic flow classification[15]. It is feasible to extract ADS-B data according to the spatial measurement boundary of the terminal area.
The quantitative results for the dataset are shown in Table 2. The SC value which compares the ratio of intra- and inter-cluster distances for evaluating compactness and separation between them,is the most common metric for clustering validation[4]. The SC value indicated that RDAE+DP had the best clustering performance among the three algorithms. But the SC value could not reflect the separation of non-conventional traffic flows. So we introduced visualization qualitative analysis method in Fig.9.

Table 2 Trajectory clustering results
Although PCA+DBSCAN had the optimal performance in identifying anomalous trajectories(the optimal F1-score),the quantitative SC value was not the optimal result of the algorithm. The visualization is shown in Fig.9(a). The trajectories with obvious differences on the north side of the airport are identified as gray. At the optimal SC value,the north trajectories were divided into three clusters,with three standard RNAV arrival procedures in Fig.1. But a large number of trajectories were categorized as the anomaly,which penalized the F1-score value. There may be two reasons:One is that linear dimension reduction process by PCA decreased necessary details in the trajectory,and the other is that DBSCAN was difficult to precisely separate dataset with tiny density differences. The visualization of RDAE+DBSCAN is shown in Fig.9(c),where a large number of trajectories were not misidentified to anomaly(DBSCAN would reidentify the anomaly,even if RDAE had excluded the anomaly trajectories),and the north trajectory set was divided into four categories. The RDAE+DBSCAN model was able to discriminate the standard RNAV arrival route ATAGA-1B in Fig.1,which could not be separated in PCA+DBSCAN.Meanwhile,the SC value was slightly better than PCA+DBSCAN. The RDAE+DBSCAN model had better clustering performance of identifying the nominal trajectory than PCA+DBSCAN.
Compared with RDAE+DBSCAN, the RDAE+DP combination further improved the quantitative SC value,and separated non-conventional pattern(orange trajectories in Fig.9(d))from standard RNAV route (dark blue trajectories in Fig.9(d)). This was a direct route only open at night through which these aircraft flied over a restricted area. So the number of cluster trajectories was relatively few,corresponding to low density at microscopic level. In the global trajectory clustering,DBSCAN either clustered it of the same cluster of the nominal trajectories(Figs.9(a)and(c)),or identified it as an anomaly (Fig.9(b)). The RDAE+DP combination could better capture this actual non-conventional pattern and the nominal trajectory,which would contribute to the designing and improvement of flight procedures in the terminal area and provide support to traffic classification in free airspace[25].

Fig.9 Visualization of three algorithms
(1)The combination of RDAE andL1-regularization can well eliminate the noise in trajectories,and it is easy to train. In fact,the performance of denoising can potentially be improved by the Huber loss function orL1/2-regularization[26]. This is a feasible improvement in the future. And we believe that the trajectory reconstruction procedure could be useful for trajectory smoothing a wide variety of vehicle data.
(2)The method of combiningL2,1-regularization and RDAE can effectively identify the anomalous trajectories in the airspace. Compared with the PCA+DBSCAN and DAE algorithms,the method has 36% and 28% performance in F1-score.
(3)The RDAE model can simultaneously capture non-linear components containing sufficient details in trajectories,so that the combination with DBSCAN can accurately cluster all standard RNAV aircraft routes. The last but not least,the combination of RDAE+DP can identify both the non-conventional spatiotemporal patterns and the nominal trajectories,which could be further used to train machine learning techniques aiming at improving the state-of-the-art of tactical deconfliction and prediction algorithms.
Transactions of Nanjing University of Aeronautics and Astronautics2020年4期