Yuji Saito, Keigo Yamada, Naoki Kanda, Kumi Nakai, Takayuki Nagata, Taku Nonomura and Keisuke Asai
ABSTRACT A vector-measurement-sensor-selection problem in the undersampled and oversampled cases is considered by extending two previously proposed approaches: a greedy method based on D-optimality and a noise-robust greedy method. Extensions of these greedy algorithms to vector-measurement-sensor selection are proposed and applied to randomly generated systems and to practical datasets of flowfields around an airfoil and global climates, in order to reconstruct the full state from the vector-sensor measurement.
KEYWORDS Sparse sensor selection; vector-sensor measurement
Optimal sensor placement is an important challenge in the design, prediction, estimation, and control of high-dimensional systems. High-dimensional states can often leverage a latent low-dimensional representation, and this inherent compressibility enables sparse sensing. For example, in aerospace engineering applications such as launch vehicles and satellites, optimal sensor placement is an important subject in performance prediction, control of the system, fault diagnostics and prognostics, etc., because installation, cost, and the downlink capacity for transferring measurement data are limited. Reduced-order modeling has been attracting considerable attention in various fields. Proper orthogonal decomposition (POD) [1,2] is one of the effective methods for decomposing high-dimensional data into several significant modes. Here, POD is a data-driven modal decomposition method that gives the most significant and relevant structure in the data and exactly corresponds to principal component analysis and Karhunen-Loève (KL) decomposition, where the decomposed modes are orthogonal to each other. The POD analysis of a discrete data matrix can be carried out by applying singular value decomposition, as is often the case in the engineering fields. Although there are several advanced data-driven methods, such as dynamic mode decomposition [3,4], empirical mode decomposition, and others, including efforts by the authors [5,6], this research is based only on POD, which is the most basic data-driven method for reduced-order modeling. If the data can be effectively expressed by a limited number of POD modes, a limited number of sensors placed at appropriate positions will give the information needed for full state reconstruction. Such effective observation might be one of the keys to flow control and flow prediction. Therefore, the study of optimal sensor placement is important in this field. Such sparse point sensors should be selected considering the POD modes. Although compressed sensing can recover a wider class of signals, optimized sensing exploits the benefits of known patterns in the data. Drastic reductions in the required number of sensors and improved reconstruction can be expected in this case. This idea has been adopted by Manohar et al. [7], and the sparse-sensor-placement algorithm has been developed and discussed. The idea is expressed by the following equation:

Although Joshi et al. proposed a convex approximation method for this objective function [8] as discussed above, the proposed convex approximation methods suffer from a long computational time. Manohar et al. [7] proposed a QR method that provides an approximate greedy solution for the optimization, which is known to be a submatrix volume maximization, and thereby reduced the computational time required for the sensor selection problem relative to the convex approximation method. The QR method is based on the QR-discrete-empirical-interpolation method (QDEIM) [11,12]. The QR method works well for the sensor selection problem when the number of sensors is less than that of state variables (undersampling). However, the QR method does not seem to connect straightforwardly with the problem y = Cz when the number of sensors is greater than that of state variables (oversampling). Peherstorfer et al. proposed an oversampling-point selection method in the framework of a DEIM-based reduced-order model; the method is based on lower bounds of the smallest eigenvalues of certain structured matrix updates in the case where the number of sensors is greater than that of state variables [13]. They showed that their proposal is almost the best choice among the existing oversampling-point selection methods [7,11,14–16] reported in DEIM studies. Clark et al. extended the idea of the QR method and developed an optimization method for placing sensors under a cost constraint [17–19]. Manohar et al. [20] developed a sensor optimization method using balanced truncation for linear systems. Saito et al. [21] redefined the under/oversampling sensor optimization problems based on the QR method for the scalar-sensor measurement proposed by Manohar et al. [7]. Their method will be called “DG” (D-optimality-based greedy method) in the present study. The objective function of the DG method is based on the D-optimal design of experiments. The DG method is mathematically proved to be the same as the QR method in the undersampled case. In addition, Saito et al. [21] showed that the DG method for the undersampled case is equivalent to the conventional regularized greedy method [22] in the limit where a priori information approaches zero. Nakai et al. investigated the effect of the objective functions on the performance of the sensor selection using greedy methods [23] that are constructed similarly to that for D-optimality by Saito et al. The objective functions based on D-, A-, and E-optimality, which maximize the determinant, minimize the trace of the inverse, and maximize the minimum eigenvalue of the Fisher information matrix, respectively, are adopted in the greedy methods. Yamada et al. proposed the noise-robust greedy method [24]. Their method includes the data-driven noise covariance matrix and Bayesian priors in the objective function and can select the optimal sensor location while considering the correlated noise. Their method will be called “BDG” (Bayesian D-optimality-based greedy method) in the present study. Thus far, the sensor selection problem has been solved by the convex approximation and greedy methods, where the greedy method was shown to be much faster than the convex approximation method [7]. These ideas have recently been applied to several advanced fluid experiments [25–27].
The previous studies introduced so far consider scalar-sensor measurement, in which each selected sensor obtains a single component of data at the sensor location. There are several applications of vector-sensor measurement, such as two components of velocity, or simultaneous velocity, pressure, and temperature measurements used in weather forecasting. For example, real-time particle-image-velocimetry (PIV) measurement of the flowfield is required for the feedback control of a high-speed flowfield in laboratory experiments. In the PIV measurement, the velocity field is calculated from the cross-correlation coefficient for each interrogation window of the particle images, but the number of windows that can be processed in a short time is limited because of the high computational cost of calculating the cross-correlation coefficients. We have been developing a sparse processing particle-image-velocimetry (SPPIV) measurement system [27]. The key point of this SPPIV measurement system is the reduction of the amount of processing data and the estimation of the velocity field from a limited number of sparsely located windows, which allows real-time PIV measurement. The development of an appropriate vector-measurement-sensor selection method is required for highly accurate SPPIV.
Here, the difference between scalar and vector-sensor measurements is the number of components of data: scalar and vector-sensor measurements obtain a single component and multiple components of data at the sensor location, respectively. The extension of the convex approximation method to vector-measurement-sensor selection has already been addressed in Section C, Chapter V of the original paper [8]. However, the convex approximation method suffers from a long computational time, as in the scalar-measurement-sensor-selection problem. Saito et al. straightforwardly extended the greedy algorithm based on the QR method [7] to vector-sensor problems such as fluid dynamic measurement applications [28], but its application is limited to undersampled sensors. Therefore, it is necessary to consider the oversampled case and to improve the reconstruction accuracy for high-dimensional data such as fluid dynamic measurement data. In the present study, the more general greedy algorithms based on the D-optimality-based greedy (DG) and Bayesian D-optimality-based greedy (BDG) methods, which were designed for both undersampled and oversampled sensors, are extended to the vector-measurement-sensor-selection problem. The effectiveness of these algorithms on randomly generated systems and on practical datasets related to the flowfield around an airfoil and the global climate is demonstrated by comparing the results with those of previously proposed algorithms.
Fig. 1 and Algorithm 1 show the outline of the present study. The input data matrix has two or more components. For example, in the case of the two-component velocity fields in Fig. 1, the data matrix X is produced from the lateral velocity data matrix (X1) and the vertical velocity data matrix (X2). After that, a POD analysis is applied to the data matrix X, and a reduced-order model is predefined by the first r low-order POD modes. In addition, the vector-measurement-sensor selection is performed using the POD mode matrix as the sensor-candidate matrix U, instead of the scalar-measurement-sensor selection studied in previous work. Although vector-measurement-sensor selection based on the DC method was proposed in the previous study [8], it suffers from a long computational time. Therefore, vector-measurement-sensor selections based on the DG and BDG methods are newly developed in the present study. As a result, the calculation time required for vector-measurement-sensor selection is shown to be significantly reduced. Finally, the reconstruction of multi-component vector fields, such as velocity fields, using the vector-sensor measurement and the reduced-order model is demonstrated and evaluated. Tab. 1 summarizes the application range of the present study.
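The construction of the sensor-candidate matrix described above can be illustrated with a minimal sketch. It assumes that the two component matrices are stacked vertically (so that location i occupies rows i and i + n), that no mean subtraction is applied, and that random numbers stand in for real velocity data; the function name `pod_vector_data` is hypothetical and none of these choices is confirmed by the paper.

```python
import numpy as np

def pod_vector_data(X1, X2, r):
    """Stack two measured components and return the first r POD modes.

    X1, X2: (n, m) arrays, rows = sensor-candidate locations, columns = snapshots.
    The vertical stacking order is an assumption made for this sketch.
    """
    X = np.vstack([X1, X2])                            # (2n, m) data matrix
    U, sigma, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :r], sigma[:r], Vt[:r, :]              # truncated POD basis

# Placeholder data standing in for the lateral and vertical velocity fields.
rng = np.random.default_rng(0)
n, m, r = 50, 30, 5
X1 = rng.standard_normal((n, m))
X2 = rng.standard_normal((n, m))
U_r, sigma_r, Vt_r = pod_vector_data(X1, X2, r)
```

With this layout, the s = 2 rows of `U_r` that belong to one location form the block that a single vector sensor observes.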

Figure 1: Concept of the present study

Algorithm 1: Methodological framework of the present study
Input: Data matrix
Output: Error, computational time
1: Perform the POD
2: Select the sensor position and measure the computational time of the selection
3: Calculate the state estimation, $\hat{z}$ (Tab. 3)
4: Evaluate the quality of the sensor position by considering the error, e (Eq. (20)), between the original and estimated data.
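The four steps of Algorithm 1 can be sketched end to end as follows. This is only an illustration under stated assumptions: the sensors are chosen at random as a placeholder for the selection step, the state is estimated by least squares through a pseudo-inverse, and the error is taken as a ratio of Frobenius norms because Eq. (20) is not reproduced in this text. The helper names are hypothetical.

```python
import numpy as np

def lse_reconstruct(U_r, rows, Y):
    """Least-squares estimate of the latent state from the measured rows."""
    C = U_r[rows, :]                    # sensor matrix built from the selected rows
    Z_hat = np.linalg.pinv(C) @ Y       # covers both the under- and oversampled case
    return U_r @ Z_hat                  # reconstructed full state

def relative_error(X, X_hat):
    # Assumed metric: ratio of Frobenius norms (not necessarily the paper's Eq. (20)).
    return np.linalg.norm(X - X_hat) / np.linalg.norm(X)

rng = np.random.default_rng(1)
n, m, r, p = 200, 40, 5, 8
X = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))   # exactly rank-r data
U, _, _ = np.linalg.svd(X, full_matrices=False)                 # step 1: POD
U_r = U[:, :r]
rows = rng.choice(n, size=p, replace=False)                     # step 2: placeholder selection
X_hat = lse_reconstruct(U_r, rows, X[rows, :])                  # step 3: state estimation
err = relative_error(X, X_hat)                                  # step 4: error
```

Because the synthetic data are exactly rank r and p exceeds r, the reconstruction error here is at the level of round-off; real data with truncated modes and noise would not behave this way.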

Table 1: Scalar and vector-measurement-sensor selection methods
The main novelties and contributions of this paper are as follows:
• The present study extends the DG and BDG methods proposed for scalar-sensor measurements to vector-sensor measurements in the undersampled and oversampled cases.
• The extensions of the DG and BDG methods to vector-sensor measurement for the undersampled and oversampled cases offer significant novelty for sparse-sensor measurement in terms of reconstruction error and computational time.
• The present study compares the performance of the DG and BDG methods with the random selection and DC methods under the conditions p = 1–20 at r = 10 (undersampled and oversampled cases), whereas the previous study [28] considered the vector-sensor selection problem only in the undersampled cases.
• The present study is applied to randomly generated systems and practical datasets of flowfields around the airfoil and global climates. These results illustrate that the proposed DG and BDG methods extended to the vector-measurement-sensor-selection problem are superior to the random selection and DC methods in terms of the accuracy of the sensor selection and the computational cost in the present study.
2.1.1 D-Optimality-Based Greedy Algorithm for Scalar-Measurement-Sensor-Selection [21]
A D-optimal design corresponds to maximizing the determinant of the Fisher information matrix. Therefore, maximizing the determinant of CC^T and C^TC for the undersampled and oversampled cases, respectively, is equivalent to minimizing the determinant of the error covariance matrix, which results in minimizing the volume of the confidence ellipsoid of the regression estimates of the linear model parameters [21,23]. In principle, the exact solution of the scalar-measurement-sensor-selection problem can be obtained by searching all the combinations of p sensors out of n sensor candidates, which has an enormous computational complexity of O(n!/((n−p)! p!)) ≈ O(n^p). Instead, greedy methods that obtain a suboptimal solution by adding sensors step by step have been devised. For the D-optimal criterion, the objective function for the greedy method is given in the previous study [21]. The procedure is summarized in Algorithm 2.
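To give a feeling for the gap between the exhaustive search and a greedy search, the number of candidate evaluations can be counted directly. The sketch below is only a counting illustration: both function names are hypothetical, and the greedy count assumes one objective evaluation per remaining candidate at each step.

```python
from math import comb

def exhaustive_candidates(n, p):
    """Number of ways to choose p sensors out of n candidates, n!/((n-p)! p!)."""
    return comb(n, p)

def greedy_evaluations(n, p):
    """Objective evaluations of a greedy search: n + (n-1) + ... + (n-p+1)."""
    return sum(n - k for k in range(p))

# Sizes comparable to the random sensor problem (n = 1000) with p = 10 sensors.
n_candidates, n_sensors = 1000, 10
n_exhaustive = exhaustive_candidates(n_candidates, n_sensors)
n_greedy = greedy_evaluations(n_candidates, n_sensors)
```

For these sizes the exhaustive count exceeds 10^23 combinations, while the greedy search evaluates fewer than 10^4 candidates in total.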

Algorithm 2: Outline of D-optimality-based Greedy Algorithm for Scalar-Measurement-Sensor-Selection [21]
Input: $r$, $p$, $U$
Output: $S_p$
Set sensor candidate indices and selected indices: $S = \{1, \dots, n\}$, $S_0 = \emptyset$
for $k = 1, \dots, r, \dots, p$ do
    if $k \le r$ then
        $i_k = \arg\max_{i \in S \setminus S_k} \det\left(C_k C_k^{T}\right)$
    else if $k > r$ then
        $i_k = \arg\max_{i \in S \setminus S_k} \det\left(C_k^{T} C_k\right)$
    end if
    Update selected sensor indices: $S_k \leftarrow S_{k-1} \cup i_k$
end for
return $S_p$
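The greedy loop of Algorithm 2 can be written out as a minimal sketch. It evaluates the determinants directly with `numpy.linalg.slogdet` for every remaining candidate, which is simple but slower than the rank-one updates used in the previous study [21]; the function name and the random test matrix are assumptions made for illustration.

```python
import numpy as np

def dg_scalar_greedy(U, p):
    """Greedy D-optimality selection of p rows of the n x r matrix U (naive sketch)."""
    n, r = U.shape
    selected = []
    for k in range(1, p + 1):
        best_i, best_val = -1, -np.inf
        for i in range(n):
            if i in selected:
                continue
            C = U[selected + [i], :]               # trial sensor matrix, k x r
            M = C @ C.T if k <= r else C.T @ C     # undersampled / oversampled objective
            sign, logdet = np.linalg.slogdet(M)    # log-determinant avoids under/overflow
            val = logdet if sign > 0 else -np.inf
            if best_i < 0 or val > best_val:
                best_i, best_val = i, val
        selected.append(best_i)
    return selected

rng = np.random.default_rng(0)
U_demo, _ = np.linalg.qr(rng.standard_normal((30, 4)))   # orthonormal columns, n = 30, r = 4
sensors = dg_scalar_greedy(U_demo, p=6)                  # four undersampled steps, two oversampled
```

In the first step the objective reduces to the squared norm of a single row, so the first sensor is simply the row of U with the largest norm.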
2.1.2 Bayesian D-Optimality-Based Greedy Algorithm for Scalar-Measurement-Sensor-Selection [24]
An improved D-optimality-based greedy algorithm was presented in the previous study [24], in which two additional priors are exploited for a more robust sensor selection than the D-optimality-based greedy algorithm: one is the expected variance of the POD mode amplitudes, and the other is the spatial covariance of the components that are truncated in the reduced-order modeling. The former can be estimated from Σ as:




Algorithm 3: Outline of Bayesian D-optimality-based Greedy Algorithm for Scalar-Measurement-Sensor-Selection [24]
Input: $r$, $p$, $U$, $\Sigma$
Output: $S_p$
Set sensor candidate indices and selected indices: $S = \{1, \dots, n\}$, $S_0 = \emptyset$
Set amplitudes variance matrices: $Q = \Sigma_{1:r}^{2}$, $R = U_{(r+1):m} \Sigma_{(r+1):m}^{2} U_{(r+1):m}^{T}$
for $k = 1, \dots, p$ do
    $i_k = \arg\max_{i \in S \setminus S_k} \det\left( C_k^{(i)T} \left(R_k^{(i)}\right)^{-1} C_k^{(i)} + Q^{-1} \right)$ s.t. $R_k^{(i)} = H_k^{(i)} R H_k^{(i)T}$
    $H_k = \begin{bmatrix} H_{k-1} \\ h_{i_k} \end{bmatrix}$, $C_k = \begin{bmatrix} C_{k-1} \\ u_{i_k} \end{bmatrix}$, $R_k = H_k R H_k^{T}$
    Update selected sensor indices: $S_k \leftarrow S_{k-1} \cup i_k$
end for
return $S_p$
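A direct, unoptimized reading of the objective in Algorithm 3 is sketched below. It assumes that Q and R are formed from the squared singular values exactly as listed in the algorithm, without any additional normalization, and it recomputes the determinant for every candidate instead of using the fast updates of the previous study [24]. The function name and the random data are hypothetical.

```python
import numpy as np

def bdg_scalar_greedy(U, sigma, r, p):
    """Naive Bayesian D-optimality greedy selection.

    U: (n, m) left singular vectors, sigma: (m,) singular values.
    Maximizes det(C^T R_k^{-1} C + Q^{-1}) by brute-force evaluation.
    """
    U_r = U[:, :r]
    Q_inv = np.diag(1.0 / sigma[:r] ** 2)      # inverse prior variance of the amplitudes
    U_t = U[:, r:] * sigma[r:]                 # scaled truncated modes, so R = U_t @ U_t.T
    selected = []
    for _ in range(p):
        best_i, best_val = -1, -np.inf
        for i in range(U.shape[0]):
            if i in selected:
                continue
            rows = selected + [i]
            C = U_r[rows, :]
            R_k = U_t[rows, :] @ U_t[rows, :].T        # noise covariance at the trial sensors
            M = C.T @ np.linalg.solve(R_k, C) + Q_inv
            sign, logdet = np.linalg.slogdet(M)
            val = logdet if sign > 0 else -np.inf
            if best_i < 0 or val > best_val:
                best_i, best_val = i, val
        selected.append(best_i)
    return selected

rng = np.random.default_rng(0)
X_demo = rng.standard_normal((40, 12))         # placeholder data, n = 40, m = 12
U_full, sigma_full, _ = np.linalg.svd(X_demo, full_matrices=False)
sensors = bdg_scalar_greedy(U_full, sigma_full, r=3, p=5)
```

The sketch requires R_k to be invertible, which holds here because the number of selected sensors stays below the number of truncated modes.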
A vector-measurement-sensor-selection problem in the undersampled and oversampled cases is considered by extending the previously explained DG and BDG methods. Again, the number of components of data is different for scalar and vector-sensor measurements: scalar and vector-sensor measurements obtain a single component and multiple components of data at the sensor location, respectively. In the vector-sensor measurement, the following equation is considered:



2.2.1 D-Optimality-Based Greedy Algorithm for Vector-Measurement-Sensor Selection

In the undersampled case, the DG method is the same as the vector-measurement-sensor selection method [28] that was developed based on the QR method, though the mathematical proof is omitted for brevity. Many previous studies have aimed at reducing the numerical instability and the computational cost that arise when the QR decomposition is applied to “tall-skinny matrices” [30]. Although there may be a possibility of numerical instability related to the QR decomposition for tall-skinny matrices, no numerical instability of the DG method proposed in the present study has been observed in our experience.
The maximization of the determinant of D^T D is simply considered in the case of k ≥ r/s (see, e.g., [29] for a detailed derivation):

The size of the matrix is s × s, and the computational cost of the proposed method is significantly lower than that of the original objective functions, whose matrices are of size sp × sp and r × r in the undersampled and oversampled cases, respectively. The sensor-candidate matrix is assumed to be a full-column-rank matrix in the DG method. This is because the unobservable subspace in the latent variables does not exist, and any latent variables are assumed to be observed by at least one of the sensor candidates in oversampled cases. It will be difficult to apply the method proposed above when s is so large that the vector components of the sensor matrix are no longer linearly independent. In that case, sensors can be selected by considering the determinant of C^T C + εI for both undersampled and oversampled sensors in a modified Eq. (13), though the sufficiently small number ε should be determined by a trial-and-error process. Although the use of εI, which has been discussed in Refs. [21,22] for the scalar-sensor measurements, is not recommended because it introduces the hyperparameter ε, it makes the computation stable in the difficult situations in which the vector sensors are not linearly independent. When s is relatively small and the vector components of the sensor matrix are linearly independent, as in the cases discussed in the present study, the proposed method works well, as described later.

Algorithm 4: D-optimality-based Greedy Algorithm for Vector-Measurement-Sensor Selection
Input: $r$, $p$, $s$, $U$
Output: $S_p$
Set sensor candidate indices and selected indices: $S = \{1, \dots, n\}$, $S_0 = \emptyset$
for $k = 1, \dots, r/s, \dots, p$ do
    $u_i = \begin{bmatrix} U_{i,1} & U_{i,2} & \dots & U_{i,r} \end{bmatrix}$
    $W_i = \begin{bmatrix} u_{i,1}^{T} & u_{i,2}^{T} & \dots & u_{i,s-1}^{T} & u_{i,s}^{T} \end{bmatrix}^{T}$
    $D_{k-1} = \begin{bmatrix} W_{i_1}^{T} & W_{i_2}^{T} & \dots & W_{i_{k-2}}^{T} & W_{i_{k-1}}^{T} \end{bmatrix}^{T}$
    if $k = 1$ then
        $i_1 \leftarrow \arg\max_{i \in S \setminus S_{k-1}} \det\left( W_i W_i^{T} \right)$
    else if $k \le r/s$ then
        $i_k \leftarrow \arg\max_{i \in S \setminus S_{k-1}} \det\left( W_i \left( I - D_{k-1}^{T} \left( D_{k-1} D_{k-1}^{T} \right)^{-1} D_{k-1} \right) W_i^{T} \right)$
    else if $k > r/s$ then
        $i_k \leftarrow \arg\max_{i \in S \setminus S_{k-1}} \det\left( I + W_i \left( D_{k-1}^{T} D_{k-1} \right)^{-1} W_i^{T} \right)$
        $\left( D_k^{T} D_k \right)^{-1} \leftarrow \left( D_{k-1}^{T} D_{k-1} \right)^{-1} \left( I - W_{i_k}^{T} \left( I + W_{i_k} \left( D_{k-1}^{T} D_{k-1} \right)^{-1} W_{i_k}^{T} \right)^{-1} W_{i_k} \left( D_{k-1}^{T} D_{k-1} \right)^{-1} \right)$
    end if
    Update selected sensor indices: $S_k \leftarrow S_{k-1} \cup i_k$
end for
return $S_p$
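A minimal sketch of the vector-sensor greedy selection of Algorithm 4 follows. Instead of the step-wise s × s determinants, it evaluates det(D D^T) or det(D^T D) of the whole trial matrix, which selects the same sensor because the two differ only by a factor that does not depend on the candidate. The row layout (component c of location i stored at row i + c·n) and all function names are assumptions made for this illustration.

```python
import numpy as np

def sensor_rows(U, i, n, s):
    """Rows of U belonging to location i (component c is assumed to sit at row i + c*n)."""
    return U[[i + c * n for c in range(s)], :]

def dg_vector_greedy(U, n, s, p):
    """Naive D-optimality greedy selection of p vector sensors with s components each."""
    r = U.shape[1]
    selected = []
    for k in range(1, p + 1):
        best_i, best_val = -1, -np.inf
        for i in range(n):
            if i in selected:
                continue
            D = np.vstack([sensor_rows(U, j, n, s) for j in selected + [i]])  # (s*k) x r
            M = D @ D.T if s * k <= r else D.T @ D    # k <= r/s: undersampled objective
            sign, logdet = np.linalg.slogdet(M)
            val = logdet if sign > 0 else -np.inf
            if best_i < 0 or val > best_val:
                best_i, best_val = i, val
        selected.append(best_i)
    return selected

rng = np.random.default_rng(0)
n_loc, s_comp, r_modes = 25, 2, 6
U_demo, _ = np.linalg.qr(rng.standard_normal((s_comp * n_loc, r_modes)))
sensors = dg_vector_greedy(U_demo, n_loc, s_comp, p=5)
```

The step-wise form in Algorithm 4 is what makes the method fast in practice, since it only requires determinants of s × s matrices per candidate; the brute-force version above is meant for checking small cases.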
2.2.2 Bayesian D-Optimality-Based Greedy Algorithm for Vector-Measurement-Sensor Selection
A fast implementation is considered, as Saito et al. demonstrated in their determinant calculation using the rank-one lemma [21]. First, the covariance matrix generated by the ith sensor candidate in the kth sensor selection is:



Once the sensor is selected, the associated matrices and V_k are updated. The algorithm of the BDG method for vector-measurement-sensor selection is described in Algorithm 5. The previous study on the scalar-measurement-sensor-selection problem proposed an implementation that reduces the order of the noise covariance from (m − r) to r_2 and thereby reduces the computational complexity [24]. This process is called the r_2 truncation in the previous study. Although the r_2 truncation would reduce the computational complexity in the vector-measurement-sensor-selection problem, it is not addressed in the present study. This is because it additionally requires a parametric study to find a reasonable r_2, and because the present study focuses on the extensions of the DG and BDG methods to vector-measurement-sensor selection.

Algorithm 5: Bayesian D-optimality-based Greedy Algorithm for Vector-Measurement-Sensor Selection
Input: $r$, $p$, $s$, $U$, $\Sigma$
Output: $S_p$
Set sensor candidate indices and selected indices: $S = \{1, \dots, n\}$, $S_0 = \emptyset$
$u_i = \begin{bmatrix} U_{i,1} & U_{i,2} & \dots & U_{i,r-1} & U_{i,r} \end{bmatrix}$
$W_i = \begin{bmatrix} u_{i,1}^{T} & u_{i,2}^{T} & \dots & u_{i,s-1}^{T} & u_{i,s}^{T} \end{bmatrix}^{T}$
Set amplitudes variance matrices: $Q = \Sigma_{1:r}^{2}$
$T_i = H_{s1}^{(i)} \left( U_{(r+1):m} \Sigma_{(r+1):m}^{2} U_{(r+1):m}^{T} \right) H_{s1}^{(i)T}$
$i_1 = \arg\max_{i \in S \setminus S_0} \det\left( W_i^{T} T_i^{-1} W_i + Q^{-1} \right) = \arg\max_{i \in S \setminus S_0} \det\left( I + T_i^{-1} W_i Q W_i^{T} \right)$
Set observation matrix: $D_1 = W_{i_1}$
Set sensor-covariance matrix: $R_1 = T_{i_1}$
Update selected sensor index: $S_1 \leftarrow S_0 \cup i_1$
for $k = 2, \dots, r/s, \dots, p$ do
    Set amplitudes variance matrix:
        $S_k^{(i)} = H_{sk}^{(i)} \left( U_{(r+1):m} \Sigma_{(r+1):m}^{2} U_{(r+1):m}^{T} \right) \Pi_{k-1}^{T}$
        $T_k^{(i)} = H_{sk}^{(i)} \left( U_{(r+1):m} \Sigma_{(r+1):m}^{2} U_{(r+1):m}^{T} \right) H_{sk}^{(i)T}$
    $i_k = \arg\max_{i \in S \setminus S_{k-1}} \det\left( D_k^{(i)T} \left(R_k^{(i)}\right)^{-1} D_k^{(i)} + Q^{-1} \right)$
        $= \arg\max_{i \in S \setminus S_{k-1}} \det\left( I + \left( T_k^{(i)} - S_k^{(i)} R_{k-1}^{-1} S_k^{(i)T} \right)^{-1} \left( S_k^{(i)} R_{k-1}^{-1} D_{k-1} - W_i \right) V_{k-1}^{-1} \left( D_{k-1}^{T} R_{k-1}^{-1} S_k^{(i)T} - W_i^{T} \right) \right)$
    Set sensor-location and observation matrix: $D_k = \begin{bmatrix} D_{k-1} \\ W_{i_k} \end{bmatrix}$
    Set noise-covariance matrix: $R_k = \begin{bmatrix} R_{k-1} & S_k^{T} \\ S_k & T_k \end{bmatrix}$
    Update sensor location matrix: $\Pi_k = \begin{bmatrix} \Pi_{k-1} \\ H_{sk} \end{bmatrix}$
    Update selected sensor indices: $S_k \leftarrow S_{k-1} \cup i_k$
end for
return $S_p$
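The two forms of the first-step objective in Algorithm 5 select the same sensor, and the reasoning can be written out briefly. The following worked step assumes that W_i is an s × r matrix, that T_i is an invertible s × s matrix, and that Q is a positive definite r × r matrix; the subscripts on the identity matrices are added here only to indicate their sizes.

```latex
\begin{aligned}
\det\left( W_i^{T} T_i^{-1} W_i + Q^{-1} \right)
  &= \det\left( Q^{-1} \right) \det\left( I_r + Q\, W_i^{T} T_i^{-1} W_i \right) \\
  &= \det\left( Q^{-1} \right) \det\left( I_s + T_i^{-1} W_i\, Q\, W_i^{T} \right)
\end{aligned}
```

The first line factors out Q^{-1}, and the second uses the identity det(I_r + AB) = det(I_s + BA) with A = Q W_i^T and B = T_i^{-1} W_i. Since det(Q^{-1}) does not depend on the candidate i, maximizing either side gives the same index, and the right-hand side only requires the determinant of an s × s matrix.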
Numerical experiments are conducted to validate the proposed methods. The methods are applied to the random sensor, PIV, and NOAA-SST/ICEC problems in the present study. The vector-sensor-measurement matrix U for each case is predefined by the POD bases. Hereinafter, four different implementations are compared: the random selection, DC, DG, and BDG methods. The random selection and convex approximation methods [8] are evaluated as the references. All numerical experiments are solved under the computational environments listed in Tab. 2.

Table 2: Computational environments
The quality of the sensors is evaluated by considering the error between the original and estimated data. The error e is defined as below:


Table 3:Details of soil properties and pile material



Figure 2: Errors of the results using the DG(scalar)-LSE, DG(vector)-LSE, BDG(scalar)-BE and BDG(vector)-BE methods against the number of sensors in the random sensor problem: n = 1000, m = 100, r = 10, s = 2

Figure 3: Errors of the results using the random(vector)-LSE, DC(vector)-LSE, and DG(vector)-LSE methods against the number of sensors in the random sensor problem: n = 1000, m = 100, r = 10, s = 2

Figure 4: Errors of the results using the random(vector)-BE, DC(vector)-BE, DG(vector)-BE, and BDG(vector)-BE methods against the number of sensors in the random sensor problem: n = 1000, m = 100, r = 10, s = 2
Fig. 5 shows the relationship between the computational time of the random(vector), DC(vector), DG(vector), and BDG(vector) methods and the number of sensors p in the random sensor problem. The random(vector) and DC(vector) methods take a nearly constant amount of time regardless of the number of sensors, and the DC method has a particularly high calculation cost. The computational times of the DG(vector) and BDG(vector) methods increase monotonically as the number of sensors increases, since the sensors are determined greedily. In this random problem, the computational time of the DG(vector) method is much shorter than that of the DC(vector) method, and the computational time of the BDG(vector) method is shorter than that of the DC method in the case of p ≤ 13. The BDG(vector) method takes a longer time to calculate than the DC(vector) method in the case of p > 13, though the BDG(vector) method has the smallest error in the random sensor problem. Although the r_2 truncation proposed in the previous study for the scalar-measurement-sensor-selection problem [24] is not addressed in the random sensor selection problem, the r_2 truncation for the BDG(vector) method will reduce the computational time. The DG(vector) method, which has the same error as the DC method, is computationally fast, and the two methods extended in the present study are generally better methods in the random sensor problem.

Figure 5: Computational times of the results using the random(vector), DC(vector), DG(vector), and BDG(vector) methods against the number of sensors in the random sensor problem: n = 1000, m = 100, r = 10, s = 2
Particle image velocimetry (PIV) for acquiring time-resolved data of velocity fields around an airfoil was conducted previously [31]. The effectiveness of the present method for the PIV data is demonstrated hereafter. Here, the overview of the experimental data is briefly explained. The wind-tunnel test was conducted in the Tohoku-university Basic Aerodynamic Research Wind Tunnel (T-BART) with a closed test section of 300 mm × 300 mm cross-section. The airfoil of the test model had an NACA0015 profile, the chord length and span width of which were 100 and 300 mm, respectively. The freestream velocity U∞ and the angle of attack of the airfoil α were set to 10 m/s and 16 degrees, respectively. The chord Reynolds number was 6.4 × 10^4. Time-resolved PIV measurement was conducted with a double-pulse laser. The sampling rate at which the velocity fields were acquired, the particle image resolution, and the total number of snapshots were 5000 Hz, 1024 × 1024 pixels, and m = 1000, respectively. The tracer particles were a 50% aqueous solution of glycerin with an estimated diameter of a few micrometers. The particle images were acquired by using the double-pulse laser (LDY-300PIV, Litron) and a high-speed camera (SA-X2, Photron), which were synchronized with each other. These PIV data with noise were obtained through real wind-tunnel experiments.
The vector-measurement-sensor selection problem for the reconstruction of the lateral and vertical components of the velocities measured by PIV (Figs. 6 and 7) is solved using the same computational environments listed in Tab. 2 as those used in the random sensor problem. Fig. 8 shows the relationship between the errors of the results using the DG(vector)-LSE, DG(scalar, lateral)-LSE, DG(scalar, vertical)-LSE, BDG(vector)-BE, BDG(scalar, lateral)-BE and BDG(scalar, vertical)-BE methods in the PIV problem (n = 6096, m = 1000, r = 10, and s = 2), where lateral and vertical represent the lateral and vertical components of the velocity fields. The errors of the results using the DG(vector)-LSE and BDG(vector)-BE methods, which are extended to the vector-sensor measurement in the present study, are smaller than those of the results using the DG(scalar, lateral and vertical)-LSE methods and the BDG(scalar, lateral and vertical)-BE methods, respectively, in the PIV problem. Fig. 9 shows the snapshots of the original data and of the data reconstructed by the random(vector)-LSE, DC(vector)-LSE, and DG(vector)-LSE methods, respectively, in the case of sp = r = 10. The black open circles show the sensor positions selected by each method. The random(vector)-LSE method cannot correctly reconstruct the velocity fields compared to those of the original data. On the other hand, the velocity fields reconstructed by the DC(vector)-LSE and DG(vector)-LSE methods are almost the same as the velocity fields of the original data. Figs. 10 and 11 show the relationships between the errors of the results using the random(vector)-LSE, DC(vector)-LSE, and DG(vector)-LSE methods and the number of sensors for the PIV problem in which the training and validation data are the same and in which the training and validation data are different (K-fold cross-validation, K = 5), respectively. The error of the results using the random(vector)-LSE method is averaged over 1000 trials. The error bars represent the standard deviations of the five calculations in Fig. 11. All the errors increase in the case of sp ≈ r because C, which is determined by the sensor selection, is not a regular matrix, as shown in Figs. 10 and 11 as well as in Fig. 3. The error of the results using the DG(vector)-LSE method is smaller than or the same as the error of the results using the DC(vector)-LSE method and is lower than that of the results using the random(vector)-LSE method.
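The K-fold cross-validation mentioned above can be sketched as follows: the snapshots are split into K groups, the POD basis is computed from K − 1 of them, and the reconstruction error is evaluated on the held-out group. The sketch assumes a random shuffling of snapshots, a random sensor selection as a placeholder for the DG or DC method, least-squares estimation, and a Frobenius-norm error; none of these details is confirmed by the paper, and the function name is hypothetical.

```python
import numpy as np

def kfold_lse_error(X, r, p, K=5, seed=0):
    """Mean and standard deviation of the held-out reconstruction error over K folds."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    folds = np.array_split(rng.permutation(m), K)      # snapshot indices of each fold
    errors = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(m), test_idx)
        U, _, _ = np.linalg.svd(X[:, train_idx], full_matrices=False)
        U_r = U[:, :r]                                  # POD basis from training snapshots only
        rows = rng.choice(n, size=p, replace=False)     # placeholder for the sensor selection
        C = U_r[rows, :]
        X_test = X[:, test_idx]
        X_hat = U_r @ (np.linalg.pinv(C) @ X_test[rows, :])
        errors.append(np.linalg.norm(X_test - X_hat) / np.linalg.norm(X_test))
    return float(np.mean(errors)), float(np.std(errors))

rng_demo = np.random.default_rng(2)
X_demo = rng_demo.standard_normal((60, 3)) @ rng_demo.standard_normal((3, 50))  # rank-3 data
mean_err, std_err = kfold_lse_error(X_demo, r=3, p=6)
```

The returned standard deviation over the K folds corresponds to the kind of error bar described for the cross-validation figures.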

Figure 6: Dataset of PIV: snapshots of the lateral-velocity field

Figure 8: Errors of the results using the DG(vector)-LSE, DG(scalar, lateral)-LSE, DG(scalar, vertical)-LSE, BDG(vector)-BE, BDG(scalar, lateral)-BE and BDG(scalar, vertical)-BE methods against the number of sensors in the PIV problem: n = 6096, m = 1000, r = 10, p = 5 and s = 2

Figure 9: Single snapshots of the original data and reconstructed velocity fields by the random(vector)-LSE, DC(vector)-LSE, and DG(vector)-LSE methods, respectively, in the PIV problem: n = 6096, m = 1000, r = 10, p = 5 and s = 2

Figure 10: Errors of the results using the random(vector)-LSE, DC(vector)-LSE, and DG(vector)-LSE methods against the number of sensors in the PIV problem: n = 6096, m = 1000, r = 10, s = 2

Figure 11: Errors of the K-fold cross-validation results using the random(vector)-LSE, DC(vector)-LSE, and DG(vector)-LSE methods against the number of sensors in the PIV problem: n = 6096, m = 1000, r = 10, s = 2, and K = 5
Fig. 12 shows the snapshots of the data reconstructed by the BDG(vector)-BE, random(vector)-BE, DC(vector)-BE and DG(vector)-BE methods, respectively, in the case of sp = r = 10. The velocity fields reconstructed by the BDG(vector)-BE, random(vector)-BE, DC(vector)-BE and DG(vector)-BE methods are almost the same as the velocity fields of the original data shown in Fig. 9. Figs. 13 and 14 show the relationship between the errors of the results using the random(vector)-BE, DC(vector)-BE, DG(vector)-BE and BDG(vector)-BE methods and the number of sensors for the PIV problem in which the training and validation data are the same and in which the training and validation data are different (K-fold cross-validation, K = 5), respectively. All the errors estimated by the BE method are smaller than those estimated by the LSE method shown in Figs. 10 and 11, and they decrease monotonically as the number of sensors increases, as in the random sensor problem. The error of the BDG(vector)-BE method, which is extended to the vector-sensor measurement in the present study, is the smallest in this PIV problem. Although Loiseau et al. show the two-dimensional cylinder flow reconstructed by five selected sensors [32], the sensors are selected by the scalar-sensor measurement based on the QR method [7]. Therefore, their results are expected to be improved by the DG(vector) and BDG(vector) methods extended to the vector-sensor measurement. Fig. 15 shows the relationship between the computational time and the number of sensors p in the PIV problem. As in the random sensor problem, the random(vector) and DC(vector) methods take a nearly constant amount of time regardless of the number of sensors. The computational times of the DG(vector) and BDG(vector) methods increase monotonically as the number of sensors increases, since the sensors are determined greedily. The BDG(vector) method takes longer to calculate than the DC(vector) method in the case of p > 10; however, the BDG(vector) method has the smallest error in the PIV problem, as shown in Figs. 13 and 14. The r_2 truncation [24] for the BDG(vector) method will reduce the computational time, although the r_2 truncation is not addressed in the PIV problem. The DG(vector) method, which has the same error as the DC(vector) method, is faster than the BDG(vector) and DC(vector) methods. Therefore, the BDG(vector) and DG(vector) methods extended in the present study are better methods in the PIV problem obtained through real wind-tunnel tests.

Figure 12: Single snapshots of the reconstructed velocity fields by the BDG(vector)-BE, random(vector)-BE, DC(vector)-BE and DG(vector)-BE methods, respectively, in the PIV problem: n = 6096, m = 1000, r = 10, p = 5 and s = 2

Figure 13: Errors of the results using the random(vector)-BE, DC(vector)-BE, DG(vector)-BE and BDG(vector)-BE methods against the number of sensors in the PIV problem: n = 6096, m = 1000, r = 10 and s = 2

Figure 14: Errors of the K-fold cross-validation results using the random(vector)-BE, DC(vector)-BE, DG(vector)-BE and BDG(vector)-BE methods against the number of sensors in the PIV problem: n = 6096, m = 1000, r = 10, s = 2, and K = 5

Figure 15: Computational times of the results using the random(vector), DC(vector), DG(vector), and BDG(vector) methods against the number of sensors in the PIV problem: n = 6096, m = 1000, r = 10 and s = 2
The data set that we finally adopt is the NOAA OISST V2 mean sea surface dataset (NOAA-SST/ICEC), comprising weekly global sea surface temperature (Fig. 16) and ice concentrations (Fig. 17) in the years between 1990 and 2000 (m = 520) [33]. There are a total of 520 snapshots on a 360 × 180 spatial grid (Figs. 16 and 17). The vector-measurement-sensor-selection problem for NOAA-SST/ICEC is solved using the same computational environments in Tab. 2 as those used in the random sensor and PIV problems. In this NOAA-SST/ICEC problem, locations are excluded from S beforehand for simplicity if their RMSs are smaller than 10^-1 times the maximum RMS in the dataset, and the sensors are selected from the subset of S that remains after this exclusion.
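The exclusion of low-variability locations described above amounts to a simple threshold on the root-mean-square (RMS) value at each location. The sketch below assumes that the RMS is taken over the snapshots of a single component matrix and that the factor 10^-1 is applied to the maximum RMS; how the two components are combined in the paper is not specified in this text, and the function name is hypothetical.

```python
import numpy as np

def candidate_mask(X, ratio=0.1):
    """Boolean mask of locations whose RMS over snapshots is at least ratio * max RMS."""
    rms = np.sqrt(np.mean(X ** 2, axis=1))     # one RMS value per location (row)
    return rms >= ratio * rms.max()

# Four hypothetical locations with constant amplitudes 1, 0.05, 0.5 and 0 over four snapshots.
X_demo = np.outer(np.array([1.0, 0.05, 0.5, 0.0]), np.ones(4))
mask = candidate_mask(X_demo)
kept_locations = np.flatnonzero(mask)          # indices that remain as sensor candidates
```

Only the rows of the sensor-candidate matrix that correspond to the kept locations would then be passed to the selection algorithm.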

Figure 16: Dataset of NOAA-SST/ICEC: snapshots of the sea-surface-temperature field

Figure 17: Dataset of NOAA-SST/ICEC: snapshots of the ice-concentration field
Fig. 18 shows the relationship between the errors and the number of sensors for the DG(vector)-LSE, DG(scalar, SST)-LSE, DG(scalar, ICEC)-LSE, BDG(vector)-BE, BDG(scalar, SST)-BE, and BDG(scalar, ICEC)-BE methods in the NOAA-SST/ICEC problem (n = 88438, m = 520, s = 2, and r = 10), where SST and ICEC represent the sea surface temperature and ice concentrations, respectively. The errors of the DG(vector)-LSE and BDG(vector)-BE methods, which are extended to the vector-sensor measurement in the present study, are smaller than those of the DG(scalar, SST and ICEC)-LSE methods and the BDG(scalar, SST and ICEC)-BE methods, respectively, in the NOAA-SST/ICEC problem. Fig. 19 shows snapshots of the original data and of the data reconstructed by the random(vector)-LSE, DC(vector)-LSE, and DG(vector)-LSE methods in the case of sp = r = 10. The yellow open circles show the sensor positions selected by each method. The random(vector)-LSE and DC(vector)-LSE methods cannot correctly reconstruct the snapshots of the original data. On the other hand, the snapshots reconstructed by the DG(vector)-LSE method are almost the same as those of the original data. Figs. 20 and 21 show the relationships between the errors and the number of sensors for the random(vector)-LSE, DC(vector)-LSE, and DG(vector)-LSE methods in the NOAA-SST/ICEC problem, for the case in which the training and validation data are the same and for the case in which they are different (K-fold cross-validation, K = 5), respectively. The error of the random(vector)-LSE method is averaged over 1000 trials. The error bars in Fig. 21 represent the standard deviations over the five calculations. All the errors increase in the case of sp ≈ r because C, which is determined by the sensor selection, is not a regular matrix, as shown in Figs. 20 and 21 and as also seen in Figs. 3, 10, and 11. The error of the DG(vector)-LSE method is smaller than or equal to that of the DC(vector)-LSE method and is much lower than that of the random(vector)-LSE method.
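The LSE reconstruction from vector sensors, and the sensitivity near sp ≈ r, can be sketched as below. The sketch assumes that the state stacks the s components one after another (component 0 at all locations, then component 1, and so on) and that U holds the r POD modes as columns. The function name and this stacking layout are assumptions for illustration; np.linalg.pinv is used so that one expression covers both the undersampled (minimum-norm) and oversampled (least-squares) cases.

```python
import numpy as np

def lse_reconstruct(U, locations, n_loc, s, x):
    """Hypothetical helper: least-squares reconstruction from vector sensors.

    U         : (s * n_loc, r) POD modes, components stacked block-wise
    locations : selected sensor locations (each yields s measurements)
    x         : (s * n_loc,) full state, used here only to read the sensors
    """
    locations = np.asarray(locations)
    # one row per component at every selected location
    rows = np.concatenate([locations + k * n_loc for k in range(s)])
    C = U[rows, :]                    # (s * p, r) measurement matrix
    y = x[rows]                       # vector-sensor measurements
    z = np.linalg.pinv(C) @ y         # estimated POD coefficients
    return U @ z, np.linalg.cond(C)   # reconstruction and conditioning of C
```

A large condition number returned for a selection with sp close to r indicates that C is nearly singular, which is consistent with the error increase reported around sp ≈ r.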

Figure 18: Errors of the results using the DG(vector)-LSE, DG(scalar, SST)-LSE, DG(scalar, ICEC)-LSE, BDG(vector)-BE, BDG(scalar, SST)-BE, and BDG(scalar, ICEC)-BE methods against the number of sensors in the NOAA-SST/ICEC problem: n = 88438, m = 520, r = 10, p = 5, and s = 2

Figure 19: Single snapshots of the original data and the fields reconstructed by the random(vector)-LSE, DC(vector)-LSE, and DG(vector)-LSE methods in the NOAA-SST/ICEC problem: n = 88438, m = 520, r = 10, p = 5, and s = 2

Figure 20: Errors of the results using the random(vector)-LSE, DC(vector)-LSE, and DG(vector)-LSE methods against the number of sensors in the NOAA-SST/ICEC problem: n = 88438, m = 520, r = 10, and s = 2

Figure 21: Errors of the K-fold cross-validation results using the random(vector)-LSE, DC(vector)-LSE, and DG(vector)-LSE methods against the number of sensors in the NOAA-SST/ICEC problem: n = 88438, m = 520, r = 10, s = 2, and K = 5
Fig. 22 shows snapshots of the data reconstructed by the BDG(vector)-BE, random(vector)-BE, DC(vector)-BE, and DG(vector)-BE methods in the case of sp = r = 10. The snapshots reconstructed by all four methods are almost the same as the snapshot of the original data shown in Fig. 19. Figs. 23 and 24 show the relationships between the errors and the number of sensors for the random(vector)-BE, DC(vector)-BE, DG(vector)-BE, and BDG(vector)-BE methods in the NOAA-SST/ICEC problem, for the case in which the training and validation data are the same and for the case in which they are different (K-fold cross-validation, K = 5), respectively.
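For readers who want a concrete picture of a Bayesian estimate of the POD coefficients, a minimal sketch of the standard linear-Gaussian form is given here. It assumes a zero-mean Gaussian prior on the coefficients with covariance Q and zero-mean Gaussian measurement noise with covariance R. These symbols, the function name, and the exact form are assumptions for illustration and may differ from the BE formulation used in the paper.

```python
import numpy as np

def bayesian_estimate(C, y, Q, R):
    """Hypothetical helper: posterior-mean estimate of POD coefficients.

    C : (q, r) measurement matrix (q = s * p sensor measurements)
    y : (q,) measurements
    Q : (r, r) prior covariance of the coefficients (assumed symmetric)
    R : (q, q) noise covariance (assumed symmetric positive definite)
    """
    Rinv_C = np.linalg.solve(R, C)              # R^{-1} C
    precision = C.T @ Rinv_C + np.linalg.inv(Q) # posterior precision matrix
    return np.linalg.solve(precision, Rinv_C.T @ y)
```

Because the prior term keeps the posterior precision matrix invertible, this estimate stays well defined for any number of measurements, including the case where the number of measurements is close to r.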

Figure 22: Single snapshots of the original data and the fields reconstructed by the BDG(vector)-BE, random(vector)-BE, DC(vector)-BE, and DG(vector)-BE methods in the NOAA-SST/ICEC problem: n = 88438, m = 520, r = 10, p = 5, and s = 2

Figure 23: Errors of the results using the random(vector)-BE, DC(vector)-BE, DG(vector)-BE, and BDG(vector)-BE methods against the number of sensors in the NOAA-SST/ICEC problem: n = 88438, m = 520, r = 10, and s = 2

Figure 24: Errors of the K-fold cross-validation results using the random(vector)-BE, DC(vector)-BE, DG(vector)-BE, and BDG(vector)-BE methods against the number of sensors in the NOAA-SST/ICEC problem: n = 88438, m = 520, r = 10, s = 2, and K = 5
All the errors estimated by the BE method are smaller than those estimated by the LSE method shown in Figs. 20 and 21, and they decrease monotonically as the number of sensors increases, as in the random sensor and PIV problems. The error of the BDG(vector)-BE method, which is extended to the vector-sensor measurement in the present study, is the smallest in the NOAA-SST/ICEC problem. However, the difference between the upper and lower bounds of the error bars is larger than that in the PIV problem shown in Fig. 14, which might imply that the quantitative performance of the proposed methods depends on the problem. Fig. 25 shows the relationship between the computational time and the number of sensors p in the NOAA-SST/ICEC problem. Note that locations are excluded beforehand from S for simplicity if their RMSs are 10^{-1} times smaller than the maximum of those of the NOAA-SST/ICEC dataset, and the remaining locations form a subset of S. Therefore, the actual n is smaller than the initial n = 88438. The random(vector) and DC(vector) methods take a roughly constant amount of time regardless of the number of sensors. The computational times of the DG(vector) and BDG(vector) methods increase monotonically with the number of sensors because the sensors are determined greedily. The computational time of the BDG(vector) method is shorter than that of the DC(vector) method, and the BDG(vector) method has the smallest error in the NOAA-SST/ICEC problem, as shown in Figs. 23 and 24. In addition, the r2 truncation [24] for the BDG(vector) method would reduce the computational time, although the r2 truncation is not addressed in the NOAA-SST/ICEC problem. The DG(vector) method, which has the same error as the DC(vector) method, is faster than the BDG(vector) and DC(vector) methods in this NOAA-SST/ICEC problem. Therefore, the BDG(vector) and DG(vector) methods extended in the present study are the better methods in the NOAA-SST/ICEC problem.
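Why the cost of a greedy method grows with the number of sensors can be seen from a naive sketch of D-optimality-based greedy selection for vector sensors. Each step scans all remaining locations and adds the one whose s measurement rows maximize the determinant of C C^T while the number of rows does not exceed r, and of C^T C afterwards. This brute-force version, its function name, and the block-wise stacking of components in U are assumptions for illustration; it does not reproduce the authors' DG(vector) algorithm or any fast update formulas.

```python
import numpy as np

def dg_vector_greedy(U, n_loc, s, p):
    """Hypothetical helper: naive greedy D-optimal vector-sensor selection.

    U     : (s * n_loc, r) POD modes, components stacked block-wise
    n_loc : number of candidate locations
    s     : number of measured components per location
    p     : number of locations to select
    """
    r = U.shape[1]
    selected = []
    for _ in range(p):                       # one greedy step per sensor
        best_loc, best_val = None, -np.inf
        for i in range(n_loc):
            if i in selected:
                continue
            locs = np.array(selected + [i])
            rows = np.concatenate([locs + k * n_loc for k in range(s)])
            C = U[rows, :]
            # Gram matrix of the smaller dimension keeps the determinant nonzero
            G = C @ C.T if C.shape[0] <= r else C.T @ C
            sign, logdet = np.linalg.slogdet(G)
            val = logdet if sign > 0 else -np.inf
            if best_loc is None or val > best_val:
                best_loc, best_val = i, val
        selected.append(best_loc)
    return selected
```

The outer loop runs once per selected sensor, so the total work increases with p, whereas a method that solves a single optimization over all candidates does not have this per-sensor loop.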

Figure 25: Computational times of the results using the random(vector), DC(vector), DG(vector), and BDG(vector) methods against the number of sensors in the NOAA-SST/ICEC problem: n = 88438, m = 520, r = 10, s = 2, and K = 5
Tab. 4 summarizes the numerical test settings and results of the random, PIV, and NOAA-SST/ICEC problems for the vector-sensor measurement. Here, all the methods except the random method are compared. The trends of the errors and computational times are the same in all problems in the present study. The errors of the DG(vector)-LSE and BDG(vector)-BE methods are the smallest for all problems. In addition, as the number of sensors increases, the error of the DC(vector)-LSE method approaches that of the DG(vector)-LSE method, and the errors of the DG(vector)-BE and DC(vector)-BE methods approach that of the BDG(vector)-BE method. The computational time of the DG(vector) method is the shortest for all problems. These results indicate that the proposed DG(vector) and BDG(vector) methods extended to the vector-sensor measurement are superior to the random(vector) and DC(vector) methods in terms of the accuracy of the sensor selection and the computational cost in the present study.

Note: In the NOAA-SST/ICEC problem, locations are excluded beforehand from the sensor candidate matrix for simplicity if their RMSs are 10^{-1} times smaller than the maximum of those of the dataset.
A vector-measurement-sensor-selection problem in the undersampled and oversampled cases is considered in the present study by extending two previous approaches: a greedy method based on D-optimality (DG) and a noise-robust greedy method (BDG). Extensions of these greedy algorithms to vector-measurement-sensor selection are proposed and applied to randomly generated systems and to practical datasets of the flowfield around an airfoil and of the global climate, and the full states are reconstructed from the vector-sensor measurements. In all demonstrations, the random selection and convex approximation methods are evaluated as references in addition to the proposed DG and BDG methods. The least squares and Bayesian estimation methods are employed as the state estimation methods in the present study.
The results for randomly generated systems show that the proposed DG and BDG methods select better positions for the sparse sensors than the random selection and convex approximation methods. The results for the practical datasets of the flowfield around the airfoil and of the global climate are similar to those for randomly generated systems. In addition, the fields reconstructed from the sensors selected by the noise-robust greedy (BDG) method are closest to the original data in all demonstrations. These results illustrate that the proposed methods extended to the vector-sensor measurement are superior to the random selection and convex approximation methods in terms of the accuracy of the sensor selection and the computational cost in the present study.
Although the vector-measurement-sensor-selection problem addressed in the present study is a more realistic sensor placement problem than the traditional scalar-measurement-sensor-selection problem, there are still gaps between ideal models and real-world practicalities. For example, it remains unclear whether it is better to place a large number of cheap sensors with a low signal-to-noise ratio, a small number of expensive sensors with a high signal-to-noise ratio, or a mix of both. The sensor selection problem that takes cost into account [17–19] will be the subject of challenging future research.
Funding Statement: This work was supported by JST ACT-X (JPMJAX20AD), JST CREST (JPMJCR1763), and JST FOREST (JPMJFR202C), Japan.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
Computer Modeling in Engineering & Sciences, 2021, Issue 10