CHEN Cheng, JIN Tao, ZHOU Zhiquan
School of Information Science and Engineering, Harbin Institute of Technology (Weihai), Weihai 264209
Abstract: The ocean sound speed profile (SSP) is a key factor affecting acoustic propagation. Acquiring SSPs in real time with high precision is valuable for underwater activities. Remote sensing provides sea surface data in near-real time, and the subsurface fields are typically correlated with the sea surface parameters, so SSPs can be estimated from satellite remote sensing data. This paper reviews the history and current state of research on reconstructing subsurface fields from sea surface data. Two methods for reconstructing SSPs from sea surface data are then described in detail: the linear regression method using the empirical orthogonal function, and the self-organizing method based on big data theory.
Key words: Sound Speed Profile (SSP), sea surface data, linear regression method, self-organizing method
The ocean sound speed profile (SSP) is the key dynamic factor in the underwater acoustic field[1-3]. Currently, SSPs are obtained mainly by in-situ measurements, such as XBT and CTD casts, or from underwater moving platforms. Such methods consume considerable effort, time and money. Moreover, the data they provide cannot meet the pressing needs of underwater detection and processing because of their limited coverage in both space and time. In contrast, a satellite can acquire sea surface data with wide coverage over a long time base, but its measurements are limited to sea surface parameters.
Based on the sound speed empirical formula[4], the inversion of sound speed can be recast as the inversion of temperature and salinity profiles. With sea surface data from satellites, the in-situ temperature and salinity profiles can be estimated by means of an underwater analytical model. The reconstruction of SSPs rests on the fact that thermal expansion and contraction of sea water produce a sea level anomaly, through which a relation between the sea surface data and the subsurface temperature and salinity fields can be established. Figure 1 shows the sea surface height (SSH) field together with the temperature contours at different depth layers in the Kuroshio Extension region. The 0.05 m SSH contours align with the 14℃ temperature contours at the 200 m depth layer, the 12℃ contours at the 300 m depth layer, and the 9℃ contours at the 400 m depth layer, which indicates that reconstructing the subsurface field from sea surface data is feasible.

Figure 1 The SSH contours and the temperature contours on different depth layers (cited from reference [5])
Researchers have been using satellite data to infer subsurface parameters for many years. Early research found that altimetry data include both barotropic and baroclinic signals. The baroclinic signal is produced by variations in sea water density, which are caused by the thermal expansion and contraction of the sea water. The barotropic signal is produced by tidal movement.
Khedouri et al. (1983) found that the subsurface temperature field was highly correlated with the sea level anomaly in the Gulf Stream, so that the anomaly could be used to infer the subsurface temperature field[6]. Stammer (1997) found that the sea level anomaly was correlated with the variation of the thermocline layer[7]. Wunsch et al. (1997) found that the barotropic mode and the first baroclinic mode constitute the main signal in the vertical direction, and that the first baroclinic mode dominates the variation of the sea surface field[8]. Carnes et al. (1990) established the relation between steric height data and temperature at the standard depth layers of the ocean, and found that they were correlated[9]. Later, they applied empirical orthogonal function (EOF) decomposition to temperature profiles in the North Atlantic and North Pacific Oceans. They established a regression relation between the sea surface data and the EOF coefficients of the temperature profiles, namely the sEOF-r method, with which temperature profiles could be reconstructed in near real time[10]. The method has been used by the US Navy as part of its operational marine environment prediction scheme[11].
Subsequent research has concentrated mostly on methods for inverting ocean profiles.
Chu et al. (2000) showed that sea surface temperature observations could be used to determine the structure of subsurface temperature profiles in the South China Sea, and they established a parametric model relating the subsurface parameters to the sea surface data. The parametric model was found to perform better than the simple regression method[12]. Fischer et al. (2000) used the output of the MOM ocean model to determine the regression relation between the sea level anomaly and the sea surface temperature anomaly, and the multivariate projection method was shown to give better results than the univariate method[13]. Pascual and Gomis (2003) proposed a new method to reconstruct profiles without a regression database[14]. Brunno and Rosalia (2004) proposed a multivariate regression method that couples temperature profiles with steric height profiles[15]. They reconstructed the two kinds of profiles from sea surface temperature (SST) and SSH data. Some researchers have related the sea surface water density to the subsurface water density under quasigeostrophic theory, so that the subsurface current field as well as the water density profiles could be reconstructed from sea surface data[16]. Swart et al. (2010) used a variety of data sources to calculate time series of temperature and salinity fields using the gravest empirical mode (GEM) method[17]. Meijers et al. (2011) used altimetry and Argo floats to establish gridded, full depth, time-evolving temperature, salinity and velocity fields. They found that combining altimetry with the GEM fields allows the subsurface structure of filamentary fronts and eddy features to be resolved[18]. Guinehut et al. (2012) used multiple linear regression to derive a synthetic temperature field and then used Argo profiles to correct it[19]. This approach reduced the RMSE to a large degree.
In recent years, with the increase in ocean observations, the amount of ocean data has become sufficient for reconstruction methods based on big data theory, for example machine learning methods.
Ali et al. (2004) used a neural network to infer the subsurface thermal structure from surface parameters, where the input included sea surface temperature, sea surface height, wind stress, net radiation and net heat flux data from moorings in the Arabian Sea from October 1994 to October 1995[20]. Liu et al. (2007) used the self-organizing map (SOM) method to extract the variation patterns of the current field from high frequency radar and ADCP data[21]. Wu et al. (2012) used the SOM method to estimate subsurface temperature fields from observed sea surface data[22]. Charantonis et al. (2015) used a hidden Markov model to reconstruct sea water profiles from sea surface data[23]. Later (2015) they reconstructed temperature/salinity profiles from a sparse sea glider database with the SOM method[24]. Building on the work of Charantonis et al., Chapman et al. (2016) used the SOM method to reconstruct the subsurface current field from satellite data[25]. Su et al. (2015) used a support vector machine to estimate the subsurface temperature field from sea surface data[26]. Compared to the traditional methods, machine learning methods have the advantage of extracting nonlinear relations from the database, which makes them more suitable for the nonlinear situations found in the ocean.
Section 2 describes two methods for reconstructing SSPs: the traditional linear regression method and the SOM method.
SSPs can be calculated from temperature and salinity profiles by means of the sound speed empirical formula. Water temperature profiles can be reconstructed from sea surface data, and there typically exist stable relations between salinity profiles and temperature profiles. Thus, SSPs can be reconstructed from sea surface data. This section introduces two methods for doing so: the sEOF-r method and the SOM method. Data sources include the Argo float data[27], the WOA09 data[28], as well as sea surface height[29] and sea surface temperature data[30].
Typically, EOF decomposition can determine the main variation patterns and filter out low-level noise, which makes it suitable for ocean observations.
The ocean was divided into 2°×2° grids worldwide. The same EOF modes were assumed within each grid, and a regression database was established in each grid. Figure 2 presents the flow diagram of the SSP reconstruction. In Figure 2, the SSP anomaly is obtained by removing the climatological SSP, which is provided by the WOA09 data. In each grid, EOF decomposition is performed over the Argo profiles to obtain the EOF vectors and the corresponding coefficients. A regression database is then established between the EOF coefficients and the sea surface data. Finally, with the EOF coefficients as well as the EOF vectors, the SSPs can be reconstructed in real time from satellite data.
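The conversion from temperature and salinity profiles to an SSP can be sketched as follows. The formula of reference [4] is not reproduced in this paper, so the sketch assumes Mackenzie's (1981) nine-term equation, one widely used empirical formula; it may differ from the one actually used in [4]. The function names are hypothetical.

```python
def sound_speed_mackenzie(temp_c, salinity, depth_m):
    """Sound speed in m/s from Mackenzie's (1981) nine-term equation.

    Assumed here as a stand-in for the empirical formula of reference [4].
    Stated validity: roughly 2-30 degC, salinity 25-40, depth 0-8000 m.
    """
    t, s, d = temp_c, salinity, depth_m
    return (1448.96 + 4.591 * t - 5.304e-2 * t**2 + 2.374e-4 * t**3
            + 1.340 * (s - 35.0) + 1.630e-2 * d + 1.675e-7 * d**2
            - 1.025e-2 * t * (s - 35.0) - 7.139e-13 * t * d**3)


def sound_speed_profile(depths_m, temps_c, salinities):
    """Apply the formula level by level to build an SSP (list of m/s)."""
    return [sound_speed_mackenzie(t, s, d)
            for d, t, s in zip(depths_m, temps_c, salinities)]


# Check value commonly quoted for this equation: T=25, S=35, D=1000 m
print(round(sound_speed_mackenzie(25.0, 35.0, 1000.0), 3))  # 1550.744
```

Because the formula is applied independently at each depth level, any reconstructed temperature and salinity profile pair maps directly to an SSP.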

Figure 2 Flow diagram of the SSP reconstruction (cited from Ref [31])
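The flow in Figure 2 can be sketched for a single grid cell as below. This is a minimal illustration under stated assumptions, not the implementation of [31]: the EOFs are taken from an SVD of the anomalies relative to climatology, the regression is ordinary least squares with an intercept, and all function and parameter names are hypothetical.

```python
import numpy as np


def fit_seof_r(profiles, climatology, surface, n_modes=2):
    """Fit EOF modes and a surface-to-coefficient regression for one grid cell.

    profiles:    (n_samples, n_depths) in-situ SSPs (e.g. from Argo)
    climatology: (n_depths,) climatological SSP (e.g. from WOA09)
    surface:     (n_samples, n_features) surface anomalies (e.g. SSH, SST)
    """
    anomalies = profiles - climatology
    # Rows of vt are the EOF vectors, ordered by explained variance.
    _, _, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt[:n_modes]                       # (n_modes, n_depths)
    coeffs = anomalies @ eofs.T               # (n_samples, n_modes)
    # Assumed regression form: linear with intercept, solved by least squares.
    design = np.column_stack([np.ones(len(surface)), surface])
    weights, *_ = np.linalg.lstsq(design, coeffs, rcond=None)
    return eofs, weights


def reconstruct_ssp(surface_obs, climatology, eofs, weights):
    """Predict EOF coefficients from one surface observation and rebuild the SSP."""
    design = np.concatenate([[1.0], np.atleast_1d(surface_obs)])
    coeffs = design @ weights                 # (n_modes,)
    return climatology + coeffs @ eofs
```

In use, `fit_seof_r` would be run once per grid cell on the historical database, and `reconstruct_ssp` would then be called with each new satellite observation for that cell.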
Chen et al. (2018) investigated the reconstruction performance in different parts of the world oceans, and found that the SSP anomalies were highly correlated with eddy kinetic energy[31]. In ocean areas where eddies occur frequently, Argo SSP anomalies are relatively large. There the absolute error of the SSP reconstruction is larger, while the performance of the sEOF-r method remains relatively stable.
Chen et al. (2018) proposed a method based on the SOM to reconstruct SSPs, and the method was shown to perform well in the highly dynamic Kuroshio Extension region[32]. The method is similar to that proposed by Chapman et al.[25]. It exploits the correlations within the data space, with no assumption on the structure of the water profiles.
Figure 3 presents the scheme of the reconstruction method. In Figure 3, the left part is used to train on the samples and map them onto different groups. Each node refers to the eigenvector of a clustering group. The reconstruction of SSPs is achieved by finding the eigenvector that best matches the available data. The available data include the SSH, SST, longitude/latitude and the month. Typically, the profiles are continuous in the vertical direction, and the temperature fields at different depth layers are correlated with each other, which influences the reconstruction result.

Figure 3 The scheme of the SOM method (SSH: Sea surface height; SST: Sea surface temperature; LON: Longitude; LAT: Latitude; MON: Month; S1…Sn: Parameters to be determined) (cited from Ref [32])
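The scheme in Figure 3 can be illustrated with a small self-organizing map. This is a sketch under stated assumptions rather than the method of [32]: each sample is a vector whose first columns hold the known quantities (SSH, SST, LON, LAT, MON) and whose remaining columns hold the parameters to be determined (S1…Sn), all inputs are assumed to be normalized beforehand, and the grid size, learning schedule and function names are hypothetical.

```python
import numpy as np


def train_som(samples, grid=(4, 4), n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small SOM; returns node vectors of shape (n_nodes, n_features)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    # Position of every node on the 2-D map, used by the neighbourhood function.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)],
                      dtype=float)
    # Initialise nodes from randomly chosen samples.
    picks = rng.choice(len(samples), rows * cols, replace=True)
    nodes = samples[picks].astype(float)
    for step in range(n_iter):
        frac = step / n_iter
        lr = lr0 * (1.0 - frac)                    # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)    # shrinking neighbourhood
        x = samples[rng.integers(len(samples))]
        bmu = np.argmin(((nodes - x) ** 2).sum(axis=1))
        dist2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
        h = np.exp(-dist2 / (2.0 * sigma ** 2))
        # Pull the best-matching node and its map neighbours towards x.
        nodes += lr * h[:, None] * (x - nodes)
    return nodes


def reconstruct_from_surface(nodes, surface_obs, n_surface):
    """Match on the first n_surface components only; return the remaining ones."""
    diff = nodes[:, :n_surface] - np.asarray(surface_obs, dtype=float)
    bmu = np.argmin((diff ** 2).sum(axis=1))
    return nodes[bmu, n_surface:]
```

The key design point is the partial matching in `reconstruct_from_surface`: training uses the full vectors, while reconstruction searches for the best-matching node using only the components that are available from the surface observation.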
This paper first describes the history of research on reconstructing subsurface fields using satellite data. There exist two kinds of approaches to reconstructing the subsurface field. One is based on the physical relation between the sea surface data and the subsurface parameters. The other is based on machine learning theory, which makes use of the correlations between the surface fields and the subsurface fields. The sEOF-r method and the SOM method for reconstructing SSPs are then described in detail.
Because the SSP is a key factor affecting underwater activities, it is of great importance to obtain SSPs with high accuracy in real time. The accumulation of ocean data has enabled the use of machine learning methods for reconstructing subsurface fields from sea surface data. Because machine learning methods are capable of extracting nonlinear relations in the ocean, they are more suitable for complex situations in highly dynamic regions.
Currently, the spatial and temporal resolution of sea surface data limits the use of subsurface field reconstruction methods, which can hardly resolve mesoscale phenomena. However, with the development of space technology, resolution should cease to be a limitation on the reconstruction methods. More detailed methods should then be investigated to capture the specific structures of ocean mesoscale phenomena, so that SSPs can be obtained with higher accuracy.