1. College of Civil Aviation, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, P. R. China; 2. CAAC Central and Southern Regional Air Traffic Management Bureau, Guangzhou 510403, P. R. China
(Received 1 April 2021; revised 28 May 2021; accepted 17 December 2021)
Abstract: A data-driven method for arrival pattern recognition and prediction is proposed to provide air traffic controllers (ATCOs) with decision support. For arrival pattern recognition, a clustering-based method is proposed to cluster arrival patterns by control intentions. For arrival pattern prediction, two predictors are trained to estimate the most likely command issued by the ATCOs in a particular traffic situation. Training the arrival pattern predictor can be regarded as building an ATCO simulator. The simulator can assign an appropriate arrival pattern to each arrival aircraft, just as real ATCOs do. Therefore, the simulator is expected to provide effective advice for part of the ATCOs' work. Finally, a case study is carried out and demonstrates that the convolutional neural network (CNN)-based predictor performs better than the random forest (RF)-based one.
Key words: air traffic management; decision support; arrival scheduling; deep learning; convolutional neural networks
Rapid growth of air traffic and limited space-time resources have led to traffic congestion in terminal control areas (TMAs), resulting in flight delays, energy waste, and air pollution. Improving runway utilization and scheduling arrival aircraft will mitigate these problems. Arrival management (AMAN) [1] is one particular tool in this field and has been widely used around the world.
AMAN can allocate the runway for arrival aircraft [2], as well as schedule the arrival aircraft according to the optimization objectives and operational constraints [3-4]. Recent studies on AMAN focus on model construction and algorithm design. For optimization objectives, Rosenthal et al. aimed to minimize the total cost [5], i.e., the sum of delay costs. Takeichi minimized the operational cost, defined as the linear sum of the fuel cost and time cost [6]. Ji et al. optimized the scheduled time of the last aircraft [7]. Mokhtarimousavi et al. evaluated ground staff's workload [8]. Zhao et al. chose the average schedule time, the maximum flow time, and the maximum delay time as multiple objectives [9]. For solution algorithms, Vadlamani et al. applied CPLEX to solve small-scale AMAN problems [10]; Lieder et al. implemented dynamic programming to solve large-scale AMAN problems [11]; and Sölveling et al. deployed branch and bound to solve large-scale AMAN problems [12].
In previous studies, AMAN only outputs the optimal landing runway and the scheduled time of arrival (STA). These outputs cannot provide air traffic controllers (ATCOs) with advice on how to arrange arrival aircraft to achieve such STAs. In other words, previous studies did not consider the feasibility of the optimal arrival schedule, and ATCOs had to spend extra time finding tactical commands that maintain the landing sequence and guarantee safe separation for arrival aircraft. The point merge system (PMS), initiated by Eurocontrol, can help ATCOs meet the requirements of AMAN outputs [13-14]. However, implementing PMS needs a large area of airspace, which would further worsen the lack of airspace resources in China [15].
To address such issues, this paper aims to develop a decision support tool for arrival pattern assignment. Inspired by multi-step prediction [16-17], this paper provides a data-driven method for arrival pattern recognition and prediction. The arrival patterns are those commonly used by the arrival aircraft in horizontal trajectories, which reflect the clearance delivered by the ATCOs. Recognizing the arrival pattern can be viewed as identifying the control instruction (the clearance delivered by the ATCOs). Predicting the arrival pattern can be viewed as estimating the control intention, i.e., the clearance most likely to be delivered by the ATCOs in a particular traffic situation. Therefore, exploring the relationship between the arrival pattern and the air traffic situation is the main task of this study.
The rest of this paper is organized as follows. The whole scheme for arrival pattern recognition and prediction is described in Section 1. Section 2 provides a clustering-based arrival pattern recognition method. Two classification-based arrival pattern prediction methods are proposed in Section 3. The results and discussion are presented in Section 4, followed by concluding remarks in Section 5.
The process of building the decision support model for ATCOs in the TMA is shown in Fig. 1. This framework includes data preprocessing, arrival pattern recognition, and arrival pattern prediction.

Fig. 1 Illustration of building decision suggestion models
After the four-dimensional (4D) trajectories are decoded from the raw radar data, data preprocessing is applied to clean the data. Trajectories that neither pass the boundary nor land on runways are removed from the dataset since they are incomplete; trajectory segments outside the boundary are also filtered out.
After the data are cleaned and outliers are removed, arrival pattern recognition is applied to identify the arrival patterns step by step. Trajectories are divided into different groups according to entry fixes, landing runways, and landing modes. For the entry fix, once the aircraft enters the terminal area (TMA), we can capture its position, so the aircraft's entry fix is known. For the landing runway, runway selection is affected by the entry fix and the stand assignment, which is outside the scope of this paper, so we assume that the landing runway is known. For the landing mode, aircraft fly differently in the TMA under different traffic situations. Therefore, the landing mode is unknown and needs to be recognized. In this work, we apply the trajectory clustering method to recognize the landing modes.
The task of arrival pattern prediction is to predict the landing mode that the aircraft is likely to adopt according to the traffic situation. First, we divide trajectories into different groups according to their entry fixes and landing runways. Second, we build a classification model for each group to predict the aircraft's landing mode. When conducting the arrival pattern prediction for a new arrival aircraft, we choose the corresponding classification model according to the entry fix and the landing runway.
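The model selection described above amounts to a lookup keyed by entry fix and landing runway. A minimal sketch follows; the class name `ArrivalPatternRouter`, its methods, the `n_arrivals` field, and the stand-in classifiers are hypothetical and only illustrate the dispatch, not the predictors trained in this paper.

```python
class ArrivalPatternRouter:
    """Keeps one landing-mode classifier per (entry fix, landing runway) group."""

    def __init__(self):
        self._models = {}

    def register(self, entry_fix, runway, classifier):
        # classifier: any callable mapping a traffic situation to a landing mode
        self._models[(entry_fix, runway)] = classifier

    def predict(self, entry_fix, runway, situation):
        key = (entry_fix, runway)
        if key not in self._models:
            raise KeyError(f"no classifier trained for {key}")
        return self._models[key](situation)


# Stand-in classifiers (hypothetical rules, for illustration only).
router = ArrivalPatternRouter()
router.register(
    "ATAGA", "RWY01",
    lambda s: "right-turning" if s["n_arrivals"] > 5 else "left-turning",
)
router.register("GYA", "RWY19", lambda s: "right-turning")
```

Keeping one model per group means each classifier only has to separate the landing modes that actually occur for that entry fix and runway.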
As an unsupervised method, trajectory clustering [18] can group trajectories so that the trajectories in the same group share similar flight intentions. To cluster trajectories efficiently, a new clustering method is proposed that consists of a new form of trajectory representation and a step-by-step clustering framework.
For the new form of trajectory representation, a control-intention-oriented format is introduced to represent the trajectories. This format focuses on the heading of the aircraft rather than their geometric positions, since ATCOs rely on radar vectoring for resolving conflicts and establishing the landing sequence.
The step-by-step clustering framework is introduced to mitigate the information loss of a one-time similarity measurement and to reduce the computational complexity of similarity measurement, which greatly improves the clustering speed.
The radar data used in this paper include the recorded timestamp and the aircraft position. Suppose there is a set of radar trajectories $R$, where $R=\{r_1,r_2,\ldots,r_i,\ldots,r_m\}$. Each trajectory $r_i$ consists of a series of points $r_i=\{p_{i1},p_{i2},\ldots,p_{iL(i)}\}$, where $L(i)$ represents the number of points of trajectory $r_i$. Each record of trajectory $r_i$ contains the timestamp $t$, heading $\psi$, latitude $La$, longitude $Lo$, and altitude $A$.
Heading and flight distance can represent control intention directly, since ATCOs rely on radar vectoring for resolving conflicts and establishing the landing sequence. Therefore, this paper denotes trajectories by adjusted headings and distance-to-go (DTG) to capture more details about control intention [19].
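As an illustration of the DTG part of this representation, the sketch below computes the remaining along-track distance at each point. It assumes that DTG is measured along the flown track to the last point of the trajectory and that positions have already been projected to planar coordinates in kilometres; the function name `distance_to_go` is hypothetical.

```python
import math


def distance_to_go(points_km):
    """Remaining along-track distance (km) from each point to the last point.

    points_km: list of (x, y) positions in km (assumed planar projection).
    """
    dtg = [0.0] * len(points_km)
    # Walk backwards so each point adds its leg length to the DTG of its successor.
    for j in range(len(points_km) - 2, -1, -1):
        (x0, y0), (x1, y1) = points_km[j], points_km[j + 1]
        dtg[j] = dtg[j + 1] + math.hypot(x1 - x0, y1 - y0)
    return dtg
```

For a track through (0, 0), (3, 4), and (3, 10), the legs are 5 km and 6 km long, so the DTG values are 11, 6, and 0 km.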
The way of representing trajectories by adjusted headings and DTG is described as follows.
Step 1 Find the difference in heading between two consecutive points.

where $j$ indexes the points of trajectory $r_i$ and $L(i)$ is the number of points of trajectory $r_i$.
Step 2 Modify $\Delta\psi$ according to whether the aircraft is turning left or right.

Step 3 Calculate the adjusted heading.

Then, the trajectory is represented by a series of adjusted headings $\{\theta_{i1},\theta_{i2},\ldots,\theta_{iL(i)}\}$, where $\theta_{i1}$ is the initial adjusted heading of trajectory $r_i$ and $\theta_{iL(i)}$ the final adjusted heading (landing direction).
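One possible reading of Steps 1 to 3 is sketched below. It rests on assumptions that are not confirmed by the text: heading differences are wrapped into [-180°, 180°) so that left turns are negative and right turns positive, and the adjusted heading is accumulated backwards from the final heading, so that trajectories landing in the same direction share the same final value while left-turning and right-turning arrivals differ in their initial value. The function name `adjusted_headings` is hypothetical.

```python
def adjusted_headings(headings_deg):
    """Return unwrapped headings anchored at the final heading (assumed convention)."""
    if not headings_deg:
        return []
    # Steps 1 and 2 (assumed): heading change between consecutive points,
    # wrapped into [-180, 180) so its sign gives the turning direction.
    deltas = [
        (curr - prev + 180.0) % 360.0 - 180.0
        for prev, curr in zip(headings_deg, headings_deg[1:])
    ]
    # Step 3 (assumed): accumulate backwards from the landing direction.
    adjusted = [0.0] * len(headings_deg)
    adjusted[-1] = headings_deg[-1] % 360.0
    for j in range(len(headings_deg) - 2, -1, -1):
        adjusted[j] = adjusted[j + 1] - deltas[j]
    return adjusted
```

Under these assumptions, a right-turning arrival with headings 270°, 300°, 330°, 0°, 10° and a left-turning arrival with headings 270°, 200°, 100°, 30°, 10° end at the same adjusted heading (10°) but start 360° apart (-90° versus 270°), which is what makes the two turning directions easy to separate.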
In this subsection, details of how to recognize arrival patterns are illustrated step by step.
Step 1 Eliminate trajectories with holding patterns. As shown in Fig. 2, AC#4 has a holding pattern, and its adjusted heading changes continuously when its DTG is between 130 and 160 km. Therefore, AC#4 is separated from other aircraft.

Fig. 2 Trajectory clustering diagram
Step 2 Divide trajectories into groups according to their entry fixes. As shown in Fig. 2(a), according to the aircraft's entry fix, AC#5 can be separated from other aircraft.
Step 3 Divide trajectories into groups according to their landing runways.
After the above steps, only AC#1 and AC#2 are in the same group, since they have the same entry fix and landing runway. The initial adjusted headings ($\theta_{i1}$) of AC#1 and AC#2 are different, so they can be used to distinguish the two aircraft conveniently.
Further clustering can be conducted if necessary. For trajectories with the same entry fix, landing runway, and landing mode, the K-means algorithm can be used to cluster those trajectories directly. K-means assigns each trajectory to exactly one of $k$ clusters defined by centroids, where $k$ is a predefined parameter. Squared Euclidean distance is adopted to measure the similarity between trajectories in this paper. However, such a measure requires all trajectories to have the same number of points. It is therefore necessary to resample the points of each trajectory by interpolation.
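The resampling and clustering step can be sketched in plain Python as follows. This is an illustrative implementation, not the one used in the paper: linear interpolation, random initial centroids, the iteration cap, and the names `resample`, `sq_dist`, and `kmeans` are all assumptions.

```python
import random


def resample(values, n):
    """Linearly interpolate a sequence onto n >= 2 evenly spaced samples."""
    if n < 2:
        raise ValueError("n must be at least 2")
    if len(values) == 1:
        return [float(values[0])] * n
    out = []
    for k in range(n):
        pos = k * (len(values) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iters=50, seed=0):
    """Assign each vector to exactly one of k clusters defined by centroids."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest centroid under squared Euclidean distance.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids
```

A typical use is to resample every adjusted-heading sequence to the same length, for example `kmeans([resample(t, 50) for t in tracks], k=2)`, where the length 50 is an arbitrary choice.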
Two different classification methods are adopted to predict the most likely command issued by the ATCOs in a particular traffic situation. The first one is based on random forest (RF) [20], which requires efficient feature engineering to extract features from data. The second one is based on a convolutional neural network (CNN) [21], which can learn features from data automatically.
Ensemble learning methods, such as RF, often achieve better performance than single-model methods, both in Kaggle competitions and in academic research. RF builds multiple decision trees and merges them to get a more accurate and stable prediction. Table 1 shows the feature set and the corresponding description for RF. The features used in previous studies [22-23] are also included in this feature set.
The sine and cosine values of the heading (#Fea1 and #Fea2) are used to replace the aircraft's heading. #Fea3 and #Fea4 are used to reflect the control intentions of ATCOs, since changes of altitude and speed are the most common tactical commands. #Fea5 and #Fea6 are used to represent the position information of the aircraft. #Fea7 and #Fea8 are used to capture periodic information of the TMA. The values of #Fea7 and #Fea8 range from 0 to 1. For example, when the arrival time of the aircraft is 8:30, #Fea7 = 0.354 (8.5 × 3 600/86 400 ≈ 0.354).
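The heading encoding and the time-of-day example above can be reproduced with a few lines of Python. The function names are hypothetical, and only #Fea1, #Fea2, and the #Fea7 example are covered, since the remaining features are defined in Table 1.

```python
import math


def heading_features(heading_deg):
    """Sine and cosine of the heading (#Fea1 and #Fea2)."""
    rad = math.radians(heading_deg)
    return math.sin(rad), math.cos(rad)


def time_of_day_fraction(hour, minute=0, second=0):
    """Arrival time as a fraction of a 24 h day, in [0, 1)."""
    return (hour * 3600 + minute * 60 + second) / 86400
```

Encoding the heading by its sine and cosine avoids the artificial jump between 359° and 0°, and `time_of_day_fraction(8, 30)` gives 30 600/86 400 ≈ 0.354, matching the example.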

Table 1 Features for arrival pattern classification
3.2.1 Air traffic situation image construction
One of the contributions of this study is to extract features about the air traffic situation from images to predict arrival patterns. To this end, we construct images that reflect the air traffic situation. The traffic situation image is composed of three channels, each of which is an image of 28×28 pixels, as shown in Fig. 3.
Fig. 3(a) presents the historical information layer, which records the trajectories of all the arrival aircraft that landed on the runway in the past 30 min. From the historical information layer, we can find the ATCOs' working habits and the current landing direction. The process is as follows: First, the TMA is converted into an image of 28×28 pixels; second, for each pixel in this image, if no landing aircraft has passed this pixel in the past 30 min, the value of this pixel is set to 0; otherwise, it is set to 1.

Fig. 3 Traffic situation image
Fig. 3(b) presents the current information layer, which records the current positions of all the arrival aircraft within the TMA. From the current information layer, we can find the relative position of each aircraft. The process is as follows: First, the TMA is converted into an image of 28×28 pixels; second, for each pixel in this image, if there is no arrival aircraft, the value of this pixel is set to 0; otherwise, it is set to a number greater than 0 (the higher the aircraft's altitude, the closer the value is to 1). Since each pixel is a square with sides of 7.14 km, two aircraft are unlikely to appear in the same pixel.
Fig. 3(c) presents the dynamic information layer, which records the movement information of all the arrival aircraft within the TMA. From the dynamic information layer, we can find the aircraft's movements in the recent past. The process is as follows: First, the TMA is converted into an image of 28×28 pixels; second, for each pixel in this image, if no arrival aircraft has passed this pixel in the past 5 min, the value of this pixel is set to 0; otherwise, it is set to 1.
So far, we have defined three layers with different information. We then combine them into a color image, as shown in Fig. 3(d). The red, green, and blue layers denote the historical information layer, the current information layer, and the dynamic information layer, respectively.
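The three layers can be rasterized as sketched below. The sketch assumes that positions are given in kilometres east and north of the airport reference point, that the 28×28 image covers the square enclosing the 100 km research scope (which yields the 7.14 km pixel size quoted above), and that altitude is scaled by a hypothetical ceiling `max_alt_m`; all function names are illustrative.

```python
import math

GRID = 28
HALF_SIDE_KM = 100.0  # assumed: the image spans the square around the 100 km scope


def to_pixel(x_km, y_km):
    """Map a position (km east/north of the ARP) to (row, col), or None if outside."""
    col = math.floor((x_km + HALF_SIDE_KM) * GRID / (2 * HALF_SIDE_KM))
    row = math.floor((HALF_SIDE_KM - y_km) * GRID / (2 * HALF_SIDE_KM))
    if 0 <= row < GRID and 0 <= col < GRID:
        return row, col
    return None


def blank_layer():
    return [[0.0] * GRID for _ in range(GRID)]


def occupancy_layer(tracks):
    """Binary layer: 1 where any point of any track falls in the pixel, else 0."""
    layer = blank_layer()
    for track in tracks:
        for x_km, y_km in track:
            cell = to_pixel(x_km, y_km)
            if cell is not None:
                layer[cell[0]][cell[1]] = 1.0
    return layer


def current_layer(aircraft, max_alt_m=6000.0):
    """Layer with a value in (0, 1] at each aircraft position, larger when higher."""
    layer = blank_layer()
    for x_km, y_km, alt_m in aircraft:
        cell = to_pixel(x_km, y_km)
        if cell is not None:
            # Clamp so that an occupied pixel is always greater than 0.
            layer[cell[0]][cell[1]] = min(max(alt_m / max_alt_m, 0.01), 1.0)
    return layer


def situation_image(history_tracks, aircraft, recent_tracks):
    """Stack the layers as (R, G, B) = (historical, current, dynamic)."""
    return [
        occupancy_layer(history_tracks),  # landed flights, past 30 min
        current_layer(aircraft),          # current positions
        occupancy_layer(recent_tracks),   # movements, past 5 min
    ]
```

Selecting which track points fall within the 30 min and 5 min windows is left to the caller, since it depends on how timestamps are stored.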
3.2.2 CNN configuration
CNN is used to extract features from a traffic situation image.
The CNN configuration proposed in this paper is kept simple in order to improve the speed of training and prediction and to reduce the risk of overfitting. Table 2 lists the CNN configuration used in this paper.

Table 2 Network configuration of CNN
If there are only two arrival patterns in a group of trajectories, arrival pattern prediction can be regarded as a binary classification problem. Precision, recall, and accuracy are used to measure the performance of the binary classifier.
$$\text{Precision}=\frac{TP}{TP+FP},\quad \text{Recall}=\frac{TP}{TP+FN},\quad \text{Accuracy}=\frac{TP+TN}{TP+TN+FP+FN}$$
where true-positive (TP) is the number of arrival patterns correctly predicted as positive; false-positive (FP) the number of arrival patterns incorrectly predicted as positive; true-negative (TN) the number of arrival patterns correctly predicted as negative; and false-negative (FN) the number of arrival patterns incorrectly predicted as negative.
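These three measures can be computed from paired lists of actual and predicted arrival patterns, as in the sketch below. The function name `binary_metrics` and the choice to return 0 when a denominator is zero are assumptions made for illustration.

```python
def binary_metrics(y_true, y_pred, positive):
    """Return (precision, recall, accuracy) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return precision, recall, accuracy
```

For example, with actual patterns R, R, R, L, L and predictions R, R, L, R, L, taking R as the positive class gives TP = 2, FP = 1, FN = 1, and TN = 1, hence a precision and recall of 2/3 and an accuracy of 3/5.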
If there are more than two arrival patterns in a group of trajectories, arrival pattern prediction can be viewed as a multi-classification problem. Purity and Rand index (RI) are used to measure the performance of the multi-classifier.
$$\text{Purity}(\Omega,C)=\frac{1}{m}\sum_{i=1}^{K}\max_{j}\left|\omega_i\cap c_j\right|,\quad RI=\frac{N_{SS}+N_{DD}}{N_{SS}+N_{SD}+N_{DS}+N_{DD}}$$
where $\Omega=\{\omega_i\mid i=1,2,\ldots,K\}$ is the classification result; $\omega_i$ the $i$th class in the classification result; $C=\{c_j\mid j=1,2,\ldots,L\}$ the set of actual classes; $c_j$ the $j$th label in the actual classes; $m$ the number of data; $N_{SS}$ the number of pairs of arrival patterns in the same class in $C$ and the same cluster in $\Omega$; $N_{SD}$ the number of pairs in the same class in $C$ but not in the same cluster in $\Omega$; $N_{DS}$ the number of pairs in the same cluster in $\Omega$ but not in the same class in $C$; and $N_{DD}$ the number of pairs in different classes in $C$ and different clusters in $\Omega$.
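Both measures follow directly from their standard definitions, as the sketch below shows. The function names are hypothetical, and the pairwise loop for RI is quadratic in the number of samples, which is acceptable for illustration but not tuned for large data sets.

```python
from collections import Counter
from itertools import combinations


def purity(pred, true):
    """Fraction of samples that belong to the majority actual class of their predicted class."""
    total = 0
    for cluster in set(pred):
        members = [t for p, t in zip(pred, true) if p == cluster]
        total += Counter(members).most_common(1)[0][1]
    return total / len(true)


def rand_index(pred, true):
    """(N_SS + N_DD) divided by the number of all sample pairs."""
    agree = 0
    pairs = 0
    for i, j in combinations(range(len(true)), 2):
        same_pred = pred[i] == pred[j]
        same_true = true[i] == true[j]
        agree += same_pred == same_true  # counts both the SS and the DD pairs
        pairs += 1
    return agree / pairs
```

With actual labels 0, 0, 1, 1, 2, 2 and predicted labels 0, 0, 1, 2, 2, 2, the purity is 5/6, and the 15 pairs split into $N_{SS}=2$, $N_{SD}=1$, $N_{DS}=2$, and $N_{DD}=10$, so RI = 12/15 = 0.8.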
In this section, we first introduce the dataset and the procedure of data preprocessing. Then, we offer two arrival pattern recognition cases. The numbers of arrival patterns in these two cases are two (ATAGA-RWY01) and three (GYA-RWY19). Finally, we present the results of arrival pattern prediction. The two cases, ATAGA-RWY01 and GYA-RWY19, are used to verify the proposed classification method in both binary and multi-classification problems.
The radar data used in this paper are from Guangzhou Baiyun International Airport (ZGGG), which has three parallel runways (01/19, 02L/20R, 02R/20L) and six entry fixes (ATAGA, IGONO, P270, IDUMA, GYA, P71). Fig. 4 presents the heatmap of radar tracks over two months of operation (November and December 2019). In Fig. 4, the research scope is defined as a green circle centered on the airport reference point with a radius of 100 km. Most of the terminal area is covered by the scope defined in this paper.

Fig. 4 Heatmap image of radar tracks and entry fixes
The pseudo-code of preprocessing is described as follows.
[ValidTra] = Preprocessing(RadarData)
ValidTra = []
for iflight in RadarData.ArrFlight:
    iflight.zip()  # Eliminate invalid or redundant points
    pdist = greatcircledistance(iflight, AirportRefPoint)
    last100 = (pdist > 100 km).index.tail(1)  # Last point farther than 100 km from the ARP
    infnl = isincuboid(iflight).index.head(1)  # First point inside a predefined cuboid
    if ~(empty(last100) | empty(infnl)):
        iflight.remove(infnl+1 : end)  # Confirm end point
        iflight.remove(1 : last100-1)  # Confirm start point
        ValidTra.append(iflight)
For each flight, we eliminated the invalid or redundant points. Then, we found the last point whose distance from the airport reference point (ARP) was larger than 100 km and removed all the points before that point. This step ensured that all the following points were within the research scope. Next, we found the first point that passed through the predefined cuboids and removed all the points after that point. The cuboids were predefined to determine whether the aircraft had joined the final approach. Finally, if a trajectory had both a start point and an end point, it was a valid trajectory.
After data preprocessing, 38 771 arrival flights remain from the original raw data.
A step-by-step recognition method was adopted to recognize arrival patterns from the cleaned data.
First, we divided trajectories into different groups according to their entry fixes. For example, Fig. 5 presents the trajectories entering from ATAGA, where the northbound trajectories are denoted by red and the southbound by blue.
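The trimming logic described above can be sketched in runnable Python as follows. The haversine formula with a mean Earth radius of 6 371 km stands in for the great-circle distance, the cuboid test is passed in as a caller-supplied function `in_final`, and the rejection of trajectories whose start point does not precede the end point is an added assumption; none of these details are specified in the text.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, used for the haversine distance


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two latitude/longitude positions in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def trim_trajectory(points, arp, in_final, scope_km=100.0):
    """Keep the segment from the last point outside the scope to the first point on final.

    points: list of (lat, lon, alt); arp: (lat, lon) of the airport reference point;
    in_final: callable returning True when a point lies inside a final-approach cuboid.
    Returns None when the trajectory has no valid start or end point.
    """
    outside = [
        k for k, (lat, lon, _) in enumerate(points)
        if great_circle_km(lat, lon, arp[0], arp[1]) > scope_km
    ]
    final = [k for k, p in enumerate(points) if in_final(p)]
    if not outside or not final:
        return None
    start, end = outside[-1], final[0]
    if start >= end:  # assumed sanity check, not stated in the text
        return None
    return points[start:end + 1]
```

Calling `trim_trajectory` on every decoded flight and discarding the `None` results yields the set of valid trajectories.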

Fig. 5 Trajectories entering from ATAGA
Second, for each group of trajectories with the same entry fix, we divided them into different groups based on the landing runways. Fig. 6 displays the trajectories that entered from ATAGA and landed on RWY01, where the right-turning trajectories are denoted by red and the left-turning by blue. However, these two kinds of trajectories, which reflect different control intentions, are difficult to separate, since a large part of them overlap. That is why the following step is needed.
Third, for each group of trajectories with the same entry fix and the same landing runway, we divided them into different groups according to their landing modes. To this end, we denoted trajectories by their DTG and adjusted heading. The trajectories with different turning directions can then be divided easily. As shown in Fig. 7, trajectories with different landing modes are significantly different under the new representation.

Fig. 7 Trajectories via ATAGA to RWY01 (new representation)

Fig. 8 Clustering results of arrival trajectories via GYA to RWY19 (right turning)
Further clustering can be done if necessary. As shown in Fig. 8, although these trajectories had the same entry fix, the same landing runway, and the same landing mode, they reflected two different control intentions. To divide these trajectories into two groups, Euclidean distance was adopted to measure the similarity between trajectories, and K-means was used to assign each trajectory to exactly one of two clusters defined by centroids.
Fig. 9 presents the final clustering results of the trajectories that entered from GYA and landed on RWY19. Compared with the trajectories in Fig. 6 that entered from ATAGA and landed on RWY01, there are three different landing modes in the trajectories in Fig. 9. This means the clearances delivered by the ATCOs for these trajectories can be more diverse than those for the trajectories in Fig. 6.

Fig. 9 Clustering results of arrival trajectories via GYA to RWY19
As mentioned above, when an aircraft enters the TMA, its entry fix and landing runway are known, and its landing mode is unknown. Therefore, the task of arrival pattern prediction is to predict the landing mode that the aircraft is likely to adopt. This paper uses data set ATAGA-RWY01 to refer to the trajectories in Fig. 6 that entered from ATAGA and landed on RWY01, and data set GYA-RWY19 to refer to the trajectories in Fig. 9 that entered from GYA and landed on RWY19.
Arrival pattern prediction for trajectories in ATAGA-RWY01 can be viewed as a binary classification problem since there are only two arrival patterns. Precision, recall, and accuracy were adopted to measure the performance of the predictor. Arrival pattern prediction for trajectories in GYA-RWY19 can be viewed as a multi-classification problem since there are three arrival patterns. Purity and RI were adopted to measure the performance of the predictor. The results of arrival pattern prediction are shown in Table 3.
For ATAGA-RWY01, the CNN-based arrival pattern predictor performed better than the RF-based one. Specifically, the CNN-based prediction precision for right-turning trajectories (93.8%) was higher than the RF-based one (72.1%). The CNN-based prediction recall for right-turning trajectories (91.0%) was higher than the RF-based one (75.6%). Besides, the CNN-based prediction accuracy (92.2%) was higher than the RF-based one (74.4%).
For GYA-RWY19, the CNN-based arrival pattern predictor also performed better than the RF-based one. Specifically, the CNN-based prediction purity (89.1%) was higher than the RF-based one (60.9%). Besides, the RI value for the CNN-based predictor was 0.853, which was 0.288 larger than that of the RF-based predictor.

Table 3 RF-based arrival pattern prediction results
The accuracy in binary classification and the purity in multi-classification can be calculated in the same way, so the two are comparable. Comparing the two data sets shows that the prediction performance decreased as the number of landing modes increased.
This paper aims to provide decision support for ATCOs when they provide service for arrival aircraft. One contribution of this work is to recognize control intentions from historical data, and the other is to build a classification model to predict ATCOs' intentions.
First, this paper proposes an arrival pattern recognition method that divides arrival patterns hierarchically by control intentions. Compared with traditional trajectory clustering methods, dividing trajectories step by step can reduce the loss of trajectory information.
Second, this paper proposes two arrival pattern prediction algorithms to estimate the most likely commands issued by the ATCOs in a particular traffic situation. The case study of ZGGG shows that the CNN-based arrival pattern predictor performs better than the RF-based one (the prediction purity reaches 89.1%). The reason is probably that the CNN-based predictor extracts more information about the traffic situation in the TMA. This suggests that the traffic situation has a non-negligible effect on trajectory prediction in most cases. Therefore, this paper may serve as a reference for future trajectory prediction.
In this paper, we had to reduce the number of arrival patterns to ensure that the predictor could achieve satisfactory performance. However, the limited number of arrival patterns means that detailed decision support cannot yet be provided for ATCOs. Therefore, increasing the number of arrival patterns while maintaining the prediction performance is our future work.
Transactions of Nanjing University of Aeronautics and Astronautics, 2021, Issue 6