A Survey of Image Information Hiding Algorithms Based on Deep Learning

2019-01-11 07:29:46 Ruohan Meng, Qi Cui and Chengsheng Yuan

Computer Modeling in Engineering & Sciences, 2018, Issue 12

Ruohan Meng, Qi Cui and Chengsheng Yuan

Abstract: With the development of data science and technology, information security has attracted growing concern. To address privacy problems such as the exposure of personal data and copyright infringement, information hiding algorithms have been developed. Image information hiding exploits the redundancy of a cover image to hide secret information in it, ensures that the resulting stego image cannot be distinguished from the cover image, and delivers the secret information to the receiver by transmitting the stego image. Deep learning models are now widely applied in the field of information hiding. This paper surveys image information hiding based on deep learning. It is divided into four parts: steganography algorithms, watermarking embedding algorithms, coverless information hiding algorithms and steganalysis algorithms based on deep learning. From these four aspects, the state-of-the-art information hiding technologies based on deep learning are illustrated and analyzed.

Keywords: Steganography, deep learning, steganalysis, watermarking, coverless information hiding.

1 Introduction

With the advent of the information age, more and more people use mobile devices to communicate, work and create. This brings great convenience to life and work, but it also exposes a growing number of security problems: personal privacy is snooped on and spread, works are stolen, copyright ownership is infringed, and so on. Information hiding has therefore received much attention as a way to protect privacy and copyright. Information hiding means that secret information is hidden in a cover image by exploiting some characteristics of that image, so that no anomalies are found by a detector during transmission and the stego image can be safely delivered to the receiver. The receiver extracts the secret information with a given algorithm, which realizes secret communication. The secret information can be a piece of text, an image and so on. The usual method is to convert the secret information into a bit stream and hide the bits in the cover media. Alternatively, an image can be hidden directly in another image, or the secret information can be associated with entries of a mapping dictionary so that the objects carrying the mapping are transmitted to the receiver. Steganography is a means of covert communication and has great significance in national security and military affairs. However, while it safeguards the security of network communication, steganography is also used by people with bad intentions. In recent years, information hiding has been used in espionage, terrorist attacks, crimes and other activities. Under such circumstances, how to supervise steganography effectively and how to prevent and block its malicious or illegal application have become urgent needs of military and security departments in various countries. Steganalysis has therefore received wide attention and has developed alongside information hiding. Steganalysis refers to the process in which a detector determines whether a published image contains secret information. Steganography and steganalysis are two kinds of algorithms that restrict and oppose each other.

2 Related works

2.1 Deep learning

In 2006, G. E. Hinton et al. [Hinton and Salakhutdinov (2006)] proposed unsupervised pre-training to initialize the network weights, followed by fine-tuning of the weights, which opened the prelude to deep learning. Deep learning is essentially divided into three types: supervised learning, unsupervised learning and reinforcement learning. Supervised learning refers to machine learning in which the input data contain both feature values and label values. By computing the error between the network output and the label, the network is trained iteratively until its output is as close to the label as possible. The problems addressed by supervised learning fall into two categories: regression [Fu, Gong, Wang et al. (2018)] and classification [Gurusamy and Subramaniam (2017); Yuan, Li, Wu et al. (2017)]. As an essential classification task, image classification is a research field that attracts much attention. The 1,000-category classification task on ImageNet [Russakovsky, Deng, Su et al. (2014)] contributed to the development of CNNs such as VGG [Simonyan and Zisserman (2014)] and ResNet [He, Zhang, Ren et al. (2016)]. Currently, popular supervised learning algorithms are represented by the convolutional neural network (CNN) and the deep belief network (DBN). The extreme learning machine [Gautam, Tiwari and Leng (2017)] is a machine learning method based on a feedforward neural network. It is also a kind of supervised learning and is used for prediction [Dutta, Murthy, Kim et al. (2017)], classification and so on. The goal of unsupervised learning is to find common features, structures or correlations among the feature values of the input data. Unsupervised learning methods include the auto-encoder [Kingma and Welling (2013)], the deep Boltzmann machine (DBM) [Montavon and Müller (2012)] and the currently popular generative adversarial network (GAN) [Goodfellow, Pouget-Abadie, Mirza et al. (2014)]. Reinforcement learning [Mnih, Kavukcuoglu, Silver et al. (2015)] emphasizes how to act on the environment to maximize the expected reward. In application, deep learning has developed greatly in the fields of video [Feichtenhofer, Fan, Malik et al. (2018); Wichers, Villegas, Erhan et al. (2018); Wang, Liu, Zhu et al. (2018)], image [Xie, He, Zhang et al. (2018); Barz and Denzler (2018); Wang and Chan (2018)], voice [Yang, Lalitha, Lee et al. (2018); Arik, Chen, Peng et al. (2018); Qian, Du, Hou et al. (2017)] and semantic understanding [Qin, Kamnitsas, Ancha et al. (2018); Zhuang and Yang (2018); Sanh, Wolf and Ruder (2018)], and has been further applied in object detection [Roddick, Kendall and Cipolla (2018); Jaeger, Kohl, Bickelhaupt et al. (2018)], image forensics [Yu, Zhan and Yang (2016); Cui, McIntosh and Sun (2018)], intelligent management [Liang, Jiang, Chen et al. (2018); Le, Pham, Sahoo et al. (2018); Duan, Lou, Wang et al. (2017)] and medicine [Mobadersany, Yousefi, Amgad et al. (2018); Rajpurkar, Irvin, Zhu et al. (2017); Akkus, Galimzianova, Hoogi et al. (2017)].

In the field of supervised learning, image classification methods based on deep learning have matured and can be applied to object detection and image retrieval. Object detection detects instances of object categories (such as dogs, vehicles or people) in digital images or videos. Faster R-CNN [Ren, He, Girshick et al. (2015)], R-FCN [Dai, Li, He et al. (2016)], YOLO [Redmon, Divvala, Girshick et al. (2016)] and SSD [Liu, Anguelov, Erhan et al. (2016)] are the four most widely used object detection models based on deep learning. Compared with traditional methods, CNNs handle such tasks better in cases where traditional methods cannot recognize features effectively.

In the field of unsupervised learning, GAN is a typical representative. A GAN consists of two models: a generator and a discriminator. The task of the discriminator is to determine whether a given sample looks “natural”, in other words, whether it is real or generated by the machine. The task of the generator is to capture the distribution of the training data so as to generate seemingly “natural” samples, which requires the generated distribution to be as consistent as possible with that of the original data. At present, GAN is widely used in many fields, such as image, vision and language. In addition, GAN can also be combined with reinforcement learning. The WGAN (Wasserstein GAN) proposed by Arjovsky et al. in 2017 effectively improved GAN [Arjovsky, Chintala and Bottou (2017)]. It alleviates the instability of GAN training, helps to ensure the diversity of generated samples, provides a loss value that indicates training progress in the way cross-entropy does for classifiers, and can be trained with a plain multi-layer neural network without a specially designed network structure. Least squares GAN (LSGAN) [Mao, Li, Xie et al. (2017)] improves GAN by using a least squares loss in the discriminator, whose gradient is smoother and non-saturating. Hjelm et al. [Hjelm, Jacob, Che et al. (2017)] propose boundary-seeking GAN, an improved GAN model that can be used to train generators with discrete outputs. Maximum-Likelihood Augmented Discrete GAN [Che, Li, Zhang et al. (2017)] uses the discriminator output to derive a new, low-variance objective that corresponds to the log-likelihood. Mode regularized GAN [Che, Li, Jacob et al. (2016)] helps to distribute probability mass fairly across the modes of the data-generating distribution at the early stage of training, thus providing a unified solution to the missing-modes problem. Brock et al. [Brock, Donahue and Simonyan (2018)] achieve new high-resolution generation results, making impressive progress. The latest research progress in image style transfer is also based on GAN [Karras, Laine and Aila (2018); Zhu, Park, Isola et al. (2017); Yi, Zhang, Tan et al. (2017); Kim, Cha, Kim et al. (2017)]. The aim of the training process is to reduce the transfer loss between the two transformation targets.

2.2 Information hiding

Information hiding technology has a long history. In early times, a secret message was written on a messenger's scalp and concealed by letting the hair grow back, so that military information could be transmitted.

With the rapid development of computers, the field of information hiding has also developed rapidly. In image information hiding, the most representative early algorithm is spatial least significant bit (LSB) steganography. This algorithm hides the secret information in the least significant bit of each pixel, exploiting the insensitivity of human eyes to small color changes. Color images are generally composed of three channels: Red (R), Green (G) and Blue (B), each of which occupies 8 bits, ranging from 0x00 to 0xFF. LSB steganography modifies the least significant bit of each RGB color component. For example, for the R channel, assume that R(x,y)=11011010, so the least significant bit is the last bit, 0. If the secret bit to hide is 1, the least significant bit is changed from 0 to 1 and the result is R(x,y)=11011011; if the secret bit is 0, the least significant bit is not modified and R(x,y)=11011010. The hiding capacity of LSB steganography is high, but it resists statistical analysis poorly. Steganography methods in the spatial domain conceal secret information mainly by modifying pixel values. Typical methods include LSB replacement [Wu, Wu, Tsai et al. (2005)], LSB matching [Mielikainen (2006); Xia, Wang, Sun et al. (2014); Xia, Wang, Sun et al. (2016)], Multi Bit Plane Image Steganography (MBPIS) [Nguyen, Yoon and Lee (2006)], histogram-based algorithms [Li, Chen, Pan et al. (2009)], color palette methods [Johnson and Jajodia (1998)], the Multiple-Based Notational System (MBNS) [Zhang and Wang (2005)], quantization index modulation (QIM) [Chen and Wornell (2001)] and so on. For frequency-domain steganography, secret information is hidden mainly by modifying specified frequency coefficients. The transforms used include the discrete cosine transform (DCT), the Fourier transform, the discrete wavelet transform (DWT) and so on. Frequency-domain steganography methods are usually divided into two categories: JPEG steganography [Westfeld (2001); Provos and Honeyman (2001); Sallee (2003); Provos and Honeyman (2003)] and discrete wavelet transform steganography [Al-Ataby and Al-Naima (2008); Yang and Deng (2006); Chen and Lin (2006); Talele and Keskar (2010)].
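
The LSB replacement example for R(x,y)=11011010 can be reproduced in a few lines. The following is a minimal sketch, assuming an 8-bit channel stored as a NumPy array; the function names lsb_embed and lsb_extract are illustrative and do not come from any cited work.

```python
import numpy as np

def lsb_embed(channel, bits):
    """Write each secret bit into the least significant bit of successive pixels."""
    flat = channel.astype(np.uint8).flatten()      # flatten() returns a copy
    payload = np.asarray(bits, dtype=np.uint8)
    if payload.size > flat.size:
        raise ValueError("payload is larger than the channel capacity")
    # Clear the lowest bit (AND with 11111110), then OR in the secret bit.
    flat[:payload.size] = (flat[:payload.size] & 0xFE) | payload
    return flat.reshape(channel.shape)

def lsb_extract(channel, n_bits):
    """Read back the lowest bit of the first n_bits pixels."""
    return [int(b) for b in channel.flatten()[:n_bits] & 1]

# Two pixels with the value used in the text, carrying the secret bits 1 and 0.
r = np.array([[0b11011010, 0b11011010]], dtype=np.uint8)
stego = lsb_embed(r, [1, 0])
print([format(int(v), "08b") for v in stego.ravel()])  # ['11011011', '11011010']
print(lsb_extract(stego, 2))                            # [1, 0]
```

Each pixel changes by at most one gray level, which is why the modification is invisible, while the systematic change to the lowest bit plane is what statistical detectors exploit.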

To improve the security of secret information transmission, image-adaptive steganography algorithms such as S-UNIWARD [Holub, Fridrich and Denemark (2014)], WOW [Holub and Fridrich (2012)] and HUGO [Pevny, Filler and Bas (2010)] have been proposed. These algorithms assign an embedding distortion to each image pixel and, by thresholding this distortion, select regions with complex texture for embedding. Fig. 1 compares the results of HUGO, S-UNIWARD and WOW. Comparing the stego images with the cover images, it is difficult to detect any visual anomaly in the stego images.

Figure 1: The comparison of effect diagrams in HUGO, S-UNIWARD and WOW
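
The idea of preferring textured regions can be illustrated with a much cruder rule than the distortion functions of HUGO, WOW or S-UNIWARD. The sketch below ranks pixels by the variance of their 3x3 neighborhood and returns the most textured ones; local variance is a stand-in chosen here for illustration, and the function names are hypothetical.

```python
import numpy as np

def local_variance(img, radius=1):
    """Variance of each pixel's (2*radius+1)^2 neighborhood (reflect padding)."""
    img = img.astype(np.float64)
    h, w = img.shape
    padded = np.pad(img, radius, mode="reflect")
    k = 2 * radius + 1
    # Stack every shifted copy of the image so axis 0 enumerates the window.
    windows = np.stack([padded[dy:dy + h, dx:dx + w]
                        for dy in range(k) for dx in range(k)])
    return windows.var(axis=0)

def select_textured_pixels(img, n_pixels):
    """Return (rows, cols) of the n_pixels pixels with the highest local variance."""
    order = np.argsort(local_variance(img), axis=None)[::-1][:n_pixels]
    return np.unravel_index(order, img.shape)

# Toy cover: flat left half, checkerboard (highly textured) right half.
yy, xx = np.indices((8, 8))
img = np.zeros((8, 16), dtype=np.uint8)
img[:, 8:] = (255 * ((yy + xx) % 2)).astype(np.uint8)
rows, cols = select_textured_pixels(img, 30)
print(int(cols.min()))  # every selected pixel lies in or next to the textured half
```

This only shows the selection step. Because embedding alters the very statistics used for selection, a receiver cannot in general recompute the same positions from the stego image, which is one reason practical adaptive schemes are considerably more involved.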

2.3 Watermarking

Digital media is now widely used, and the problems of its modification and reuse have attracted much attention. To solve the copyright problem of digital media, digital watermarking has been proposed and developed [Fridrich (1999); He, Zhang and Tai (2009); He, Chen, Tai et al. (2012)]. Digital watermarking refers to embedding information such as a number, serial number, text or image logo into audio, video or images, thus serving copyright protection, authenticity identification, secret communication and so on. After the watermark is embedded in the multimedia data, it can be detected and extracted. Digital watermarking therefore includes two aspects: watermark embedding, and watermark detection and extraction. Although digital watermarking was proposed relatively recently, it has developed considerably. Watermarking algorithms are divided into six categories: blind watermarking [Dorairangaswamy (2009); Eggers and Girod (2001); Kang and Lee (2006)], semi-blind watermarking [Liu, Zhang and Chen (2009); Rahman, Ahammed, Ahmed et al. (2017)], non-blind watermarking [Gunjal and Mali (2015)], robust watermarking [Deng, Gao, Li et al. (2009)], fragile watermarking [Chang, Fan and Tai (2008); Nazari, Sharif and Mollaeefar (2017)] and semi-fragile watermarking [Wu, Hu, Gu et al. (2005)]. Recently, deep learning and digital watermarking have been applied to each other. Using digital watermarks to protect the ownership of deep learning models is a relatively new research direction, and there has also been some progress in using deep learning to enhance the robustness of digital watermarking.

2.4 Coverless information hiding

Conventional information hiding conceals secret information by modifying the pixels of the cover image, and any such modification may be detected by a detector. To avoid this problem completely and resist steganalysis fundamentally, coverless information hiding was proposed in 2014. Coverless information hiding means that the secret information is used as a driver to find or generate a stego image corresponding to the secret information [Gunjal and Mali (2015); Zhou, Sun, Harit et al. (2015); Zhou, Cao and Sun (2016); Yuan, Xia and Sun (2016)]. The cover image does not need to be changed in this process. The general approach is to construct a mapping dictionary that relates secret information to feature information. The sender shares the mapping dictionary with the receiver and transmits a natural image or text that maps to the secret information; the receiver extracts the secret information according to the mapping relationship, and steganalysis is thereby resisted. There are two main branches in the field of coverless information hiding. One is coverless text information hiding, which is based on the transmission of text. With the prosperity of natural language processing (NLP), techniques such as word2vec [Goldberg and Levy (2014)] and LSTM [Sak, Senior and Beaufays (2014)] have proved effective. Because the mapping relationship of word embedding is strongly similar to that of coverless text information hiding, some works combine advanced NLP techniques with coverless text information hiding [Zhang, Huang, Wang et al. (2017); Long and Liu (2018); Zhang, Shen, Wang et al. (2016)]. The other is based on the transmission of images and is called image coverless information hiding [Duan and Song (2018)].
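
The mapping-dictionary idea can be made concrete with a toy example. In the sketch below, every image in a shared database is reduced to an 8-bit feature string, the dictionary maps feature strings to image indices, and the sender transmits unmodified images whose features spell out the secret bits. The feature used here (comparing the mean intensity of consecutive strips) and all function names are assumptions made for illustration; they are not taken from the cited methods.

```python
import numpy as np

def image_hash(img, n_bits=8):
    """Hypothetical feature: bit i is 1 if strip i+1 is brighter than strip i."""
    strips = np.array_split(img.astype(np.float64).ravel(), n_bits + 1)
    means = [s.mean() for s in strips]
    return "".join("1" if means[i + 1] > means[i] else "0" for i in range(n_bits))

def build_dictionary(images):
    """Map each feature string to the index of the first image that produces it."""
    table = {}
    for idx, img in enumerate(images):
        table.setdefault(image_hash(img), idx)
    return table

def hide(secret_bits, table, n_bits=8):
    """Pick, for each 8-bit segment, an image whose feature equals the segment."""
    segments = [secret_bits[i:i + n_bits] for i in range(0, len(secret_bits), n_bits)]
    return [table[s] for s in segments]   # KeyError if no image maps to a segment

def reveal(received_images):
    """The receiver recomputes the features; no pixel was ever modified."""
    return "".join(image_hash(img) for img in received_images)

def toy_image(bits):
    """Build a 6x6 toy image whose feature string equals bits (for the demo only)."""
    levels = [128]
    for b in bits:
        levels.append(levels[-1] + (1 if b == "1" else -1))
    return np.repeat(np.array(levels, dtype=np.uint8), 4).reshape(6, 6)

database = [toy_image("01000001"), toy_image("11110000"), toy_image("01000010")]
table = build_dictionary(database)
secret = "0100001001000001"
chosen = hide(secret, table)
print(chosen)                                        # [2, 0]
print(reveal([database[i] for i in chosen]) == secret)  # True
```

The sketch also makes the main practical limitation visible: a segment can only be sent if some image in the database maps to it, so capacity depends on the size and diversity of the database.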

2.5 Steganalysis

Steganalysis determines whether an image contains secret information; it is a binary classification task that decides whether an image is a stego image or a cover image. Early steganographic algorithms could be detected by human senses, but as steganographic algorithms developed, the human senses could no longer distinguish stego images, so steganalysis instead distinguishes them by analyzing the statistical features of the image. At present, high-order statistical features based on the complex correlations of image neighborhoods have become the mainstream features in steganalysis, such as SRM/SRMQ1 (Spatial Rich Model) [Fridrich and Kodovsky (2012)] and PSRM (Projection Spatial Rich Model) [Holub and Fridrich (2013)], models based on high-order and high-dimensional features that have achieved good detection results. Many studies address the selection of features. Chen et al. [Chen and Shi (2008)] proposed block-based Markov features. Pevny et al. [Pevny and Fridrich (2007)] designed the PEV features for steganalysis. As a tentative work on high-dimensional rich models for JPEG steganalysis, CC-C300 was proposed by Kodovsky et al. [Kodovsky and Fridrich (2010)]. It initiated the development of high-dimensional features for steganalysis. Kodovsky et al. [Kodovsky, Fridrich and Holub (2012)] improved their algorithm by pruning it to a more compact feature set. Considering the typical characteristics of the DCT modes of an image, Kodovsky et al. [Kodovsky and Fridrich (2012)] proposed CC-JRM. Aiming at the detection of S-UNIWARD, Denemark et al. [Denemark, Fridrich and Holub (2014)] proposed residual features selected according to image content. To reduce the complexity of the steganalysis algorithm, Holub et al. [Holub and Fridrich (2015a)] proposed DCTR, which extracts features from residual maps in the DCT domain. The phase-aware projection model (PHARM) proposed by Holub et al. [Holub and Fridrich (2015b)] is designed around the distinguishing features observed on the JPEG grid. Taking the selection channel into consideration, Denemark et al. [Denemark, Fridrich and Comesaña-Alfaro (2016); Denemark, Boroumand and Fridrich (2016)] proposed selection-channel-aware feature extraction algorithms. In addition to traditional methods based on hand-crafted features, steganalysis methods based on deep learning have been further developed.
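
To give a feel for what a neighborhood-based statistical feature looks like, the sketch below computes one truncated first-order residual and the co-occurrence histogram of horizontally adjacent residual pairs. It is a deliberately tiny illustration in the spirit of rich-model features, not an implementation of SRM or any other cited feature set; the function name and the threshold T are assumptions.

```python
import numpy as np

def residual_cooccurrence(img, T=2):
    """Normalized co-occurrence of adjacent truncated horizontal residuals.

    Returns a vector of length (2T+1)^2; index (a+T)*(2T+1) + (b+T) holds the
    frequency of the residual pair (a, b).
    """
    x = img.astype(np.int64)
    # First-order residual: difference between horizontal neighbors, clipped to [-T, T].
    r = np.clip(x[:, 1:] - x[:, :-1], -T, T)
    left, right = r[:, :-1] + T, r[:, 1:] + T      # shift values into 0..2T
    k = 2 * T + 1
    hist = np.zeros((k, k), dtype=np.float64)
    np.add.at(hist, (left.ravel(), right.ravel()), 1)  # count every adjacent pair
    return (hist / hist.sum()).ravel()

cover = np.full((16, 16), 100, dtype=np.uint8)   # perfectly smooth toy cover
stego = cover.copy()
stego[0, 1] ^= 1                                 # flip one least significant bit
f_cover = residual_cooccurrence(cover)
f_stego = residual_cooccurrence(stego)
print(f_cover[12], f_stego[12])                  # mass at the (0, 0) pair drops
```

In a real detector, many such histograms computed from different residual filters are concatenated and passed to a trained classifier; the single flipped bit in the toy example shifts probability mass away from the (0, 0) bin, which is the kind of deviation such a classifier learns to recognize.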

3 Image information hiding technology based on deep learning

The application of deep learning in information hiding has developed gradually. Deep learning can be applied to information hiding because some of its models, characteristics and processes correspond to those of information hiding. In addition, deep learning models are applied, to different degrees, across the branches of information hiding. In particular, the adversarial principle of GAN corresponds naturally to information hiding and its detection. GAN-based approaches can resist steganalysis in a more targeted way when trained against potential steganalysis algorithms. In coverless information hiding, GAN-related approaches can generate qualified stego images through a mapping dictionary and avoid detection thanks to the high quality of the generated images.

3.1 Steganographic algorithms based on deep learning

3.1.1 Steganographic algorithms of adversarial method

A GAN is adversarial: the generator and the discriminator compete, so that the images produced by the generator can fool the discriminator. In the basic application of GAN, the generator simulates the distribution of the target object category, and each simulated result is passed to the discriminator for binary classification into real images or fake images. If the image is judged to be generated, the result is fed back to the generator, which regenerates the image distribution according to the feedback. By cycling through this process continuously, a relatively realistic image is finally generated. The confrontation between the generator and the discriminator resembles the confrontation between steganography and steganalysis. As shown in Fig. 2, steganography obtains stego images by hiding the secret message in cover images with an embedding algorithm. The receiver extracts the secret information from the stego images with the extraction algorithm. In steganalysis, after the stego images are made public, the detector decides whether an image is a cover image or a stego image by determining whether it contains a secret message. In GAN, the generative network produces an image from a piece of noise fed to the generator. The fake images and real images are then fed to the discriminator, which outputs whether each one is a real image, that is, judges the authenticity of the generated image.

Figure 2: Comparison between steganography and corresponding steganalysis and GAN

This analysis shows that the structure of GAN corresponds closely to the structure of steganography. The generative network corresponds to steganography, which generates a stego image, and the discriminative network corresponds to steganalysis, which determines whether an image is a false (stego) image. Therefore, many papers apply GAN to steganography. Experiments reported in these papers indicate that combining steganography with GAN makes the steganography process more robust and the resulting stego images more concealed and secure. The total optimization function of GAN is shown in Eq. (1).
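
Eq. (1) refers to the GAN minimax objective. Written in the notation of the definitions that follow, and assuming the intended formula is the standard one from the original GAN paper of Goodfellow et al., it reads:

```latex
\min_{G}\max_{D} V(D,G)
  = \mathbb{E}_{x\sim P_{data}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z\sim P_{z}(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
  \tag{1}
```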

Where x represents the input data, Pz(z) is the distribution of the noise variable z, Pdata(x) is the distribution of the real data, and D(x) represents the probability that x is derived from the real data rather than the generated data.
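
The two expectations in the GAN objective can be estimated from samples, which makes the roles of D and G easy to inspect numerically. The sketch below is a toy one-dimensional illustration: the generator, the two discriminators and the data distribution are invented for the example, and gan_value is a hypothetical helper name.

```python
import numpy as np

def gan_value(D, G, real_x, z, eps=1e-12):
    """Monte Carlo estimate of E[log D(x)] + E[log(1 - D(G(z)))]."""
    real_term = np.mean(np.log(D(real_x) + eps))        # discriminator on real data
    fake_term = np.mean(np.log(1.0 - D(G(z)) + eps))    # discriminator on generated data
    return real_term + fake_term

rng = np.random.default_rng(0)
real_x = 2.0 + 0.5 * rng.standard_normal(1000)   # toy "real" data centered at +2
z = rng.standard_normal(1000)                    # noise fed to the generator

def G(noise):
    return -2.0 + 0.5 * noise                    # toy generator, samples centered at -2

def undecided(x):
    return np.full_like(x, 0.5)                  # discriminator that cannot tell

def sharp(x):
    return 1.0 / (1.0 + np.exp(-3.0 * x))        # discriminator that separates well

v_undecided = gan_value(undecided, G, real_x, z)
v_sharp = gan_value(sharp, G, real_x, z)
print(v_undecided, v_sharp)
```

A discriminator that outputs 0.5 everywhere yields a value of -log 4, while a discriminator that separates the two sample sets pushes the value toward 0. The discriminator tries to raise this value and the generator tries to lower it, which is the same tension as between a steganalyzer and an embedding algorithm.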

This approach was first proposed by Hayes et al. [Hayes and Danezis (2017)]. They define a three-party game among Alice, Bob and Eve. Alice and Bob try to hide secret information in an image and use it for secret communication, while Eve eavesdrops on their conversation and judges whether it contains secret information. All three parties are neural networks: Alice is the steganographic encoder and acts as the generator, Eve is the steganalyzer and acts as the discriminator, and Bob is the extractor. The stego image produced by the generator is adjusted according to the feedback of the steganalyzer, that is, the discriminator. Bob extracts the secret bits from the resulting stego image. Volkhonskiy et al. [Volkhonskiy, Nazarov, Borisenko et al. (2017)] propose SGAN (Steganographic Generative Adversarial Networks), which mainly adds a steganalysis discriminator to GAN. As shown in Fig. 3, SGAN consists of one generator and two discriminating networks, a discriminator and a steganalyzer. The discriminator determines whether an image is real, and the steganalyzer judges whether an image contains secret information. The total optimization function is shown in Eq. (2), in which the optimization function of the steganalyzer is added to that of the GAN model. This method reduces the detection rate of steganalysis, making information hiding more secure. SGAN is built on DCGAN [Radford, Metz and Chintala (2015)] with the additional steganalyzer. The visual effects of SGAN are shown in Fig. 4.
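
For reference, a form of Eq. (2) that is consistent with the description above (the GAN objective plus a steganalyzer term, balanced by a weight α) is sketched below. It is a reconstruction under the assumption that the survey follows the objective of the SGAN paper, with D the discriminator, S the steganalyzer and Stego(·) the embedding operation:

```latex
\min_{G}\max_{D}\max_{S}\; L
  = \alpha\Bigl(\mathbb{E}_{x\sim P_{data}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z\sim P_{z}(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]\Bigr)
  + (1-\alpha)\,\mathbb{E}_{z\sim P_{z}(z)}\bigl[\log S\bigl(Stego(G(z))\bigr)
  + \log\bigl(1 - S(G(z))\bigr)\bigr]
  \tag{2}
```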

Here, Pz(z) is the distribution of the noise variable, Pdata(x) is the distribution of real images, Stego(x) is the stego image obtained from x, and α is the weight parameter that controls the relative importance of D and S in the loss when G generates images.

Figure 3: The structure of SGAN

Figure 4: The visual effects of SGAN

Shi et al. [Shi, Dong, Wang et al. (2017)] improve on SGAN. By changing the base network of SGAN from DCGAN to WGAN, the generated images become more realistic and of higher quality, and the network trains faster. In addition, the steganalysis network is replaced by GNCNN [Qian, Dong, Wang et al. (2015)]. Through the confrontation between GNCNN and the generator, the stego image is more concealed and the robustness is enhanced. Tang et al. [Tang, Tan, Li et al. (2017)] propose a new framework, ASDL-GAN, which realizes steganography by finding suitable embedding locations in cover images. In addition, the network changes the discriminator to the steganalysis model of Xu et al. [Xu, Wu and Shi (2016)]. Yang et al. [Yang, Liu, Kang et al. (2018)] make three improvements on ASDL-GAN: replacing the activation function with the Tanh-simulator to reduce the number of training epochs; building the generator on U-NET [Ronneberger, Fischer and Brox (2015)]; and adding selection channel awareness (SCA) [Denemark, Boroumand and Fridrich (2016)] to the discriminator to improve resistance to SCA-based steganalysis schemes. Ma et al. [Ma, Guan, Zhao et al. (2018)] propose using adversarial samples to train a network that actively attacks steganalysis methods. Its steganographic ability has been verified experimentally against several different steganalysis methods. Hu et al. [Hu, Wang, Jiang et al. (2018)] propose mapping the secret information into a noise vector used as the input of the generator, so that the stego image is generated directly without modifying any image. The algorithm is divided into three steps. 1. A GAN is trained with noise vectors so that the generator can directly generate the carrier images. 2. The stego images are used as the input of the extractor, a network that mirrors the generator, which is trained to map each stego image to a one-dimensional vector that is as consistent as possible with the original input noise vector, so that the secret information can be extracted. The extractor’s loss function is shown in Eq. (3). 3. The parameters of the generator are supplied to the sender, and the parameters of the extractor are provided to the receiver. This method takes the extraction of secret information into account and solves the difficult problem of extracting secret information after GAN-based hiding.
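
The requirement that the recovered vector be as consistent as possible with the input noise vector suggests a squared-error loss. A sketch of Eq. (3) under that assumption, with n the (assumed) length of the noise vector and the subscript i indexing its components, is:

```latex
L(E) = \sum_{i=1}^{n}\bigl(z_i - E(stego)_i\bigr)^{2}
     = \sum_{i=1}^{n}\bigl(z_i - E(G(z))_i\bigr)^{2}
  \tag{3}
```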

Here, z represents the random noise input to the GAN, E(stego) is the noise vector recovered by the extractor, and G(z) represents the image generated by the generator from noise z.
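
Mapping secret bits to a noise vector, and back, can be sketched independently of any network. The scheme below splits the bit string into groups of sigma bits, assigns each group an interval of [-1, 1], and draws the noise value at random inside that interval, leaving a safety margin so that a small extraction error does not change the decoded bits. The interval layout, the margin parameter delta and the function names are assumptions made for illustration and are not necessarily the exact mapping used by Hu et al.

```python
import numpy as np

def bits_to_noise(bits, sigma=2, delta=0.05, rng=None):
    """Map every sigma-bit group to a random value inside its own sub-interval of [-1, 1]."""
    if rng is None:
        rng = np.random.default_rng()
    if len(bits) % sigma:
        raise ValueError("bit string length must be a multiple of sigma")
    width = 2.0 / 2 ** sigma                       # width of one sub-interval
    groups = [int(bits[i:i + sigma], 2) for i in range(0, len(bits), sigma)]
    lows = np.array(groups) * width - 1.0          # left edge of each group's interval
    margin = delta * width                         # keep away from the interval edges
    return rng.uniform(lows + margin, lows + width - margin)

def noise_to_bits(z, sigma=2):
    """Recover the bit groups from the interval index of each noise value."""
    width = 2.0 / 2 ** sigma
    idx = np.floor((np.asarray(z) + 1.0) / width)
    idx = np.clip(idx, 0, 2 ** sigma - 1).astype(int)
    return "".join(format(int(i), "0{}b".format(sigma)) for i in idx)

secret = "0110110001"
z = bits_to_noise(secret, rng=np.random.default_rng(0))
print(noise_to_bits(z) == secret)            # True
print(noise_to_bits(z + 0.02) == secret)     # True: error smaller than the margin
```

In the full scheme, z would be fed to the generator and the receiver would apply noise_to_bits to the extractor output E(stego) rather than to z itself; the margin is what gives the extractor some room for error.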

Li et al. [Li, Jiang and Cheslyar (2018)] propose using GAN-synthesized texture images as the cover. The input noise is mapped to a small patch selected from the original image. The network can generate different textures even from the same original image, which makes the cover difficult for a man-in-the-middle attacker to obtain. The synthesized texture image is sent together with the secret message to a second, information hiding network to realize secret communication. The information hiding network follows the auto-encoder architecture and is trained to encode and decode the secret message jointly. Two separate datasets are used in their experiments, one for texture generation and the other for information hiding.

When potential steganalysis methods are present in the transmission channel, GAN-based steganography can effectively reduce the detection rate of those specific steganalysis methods. Through the confrontation between the steganalysis algorithm and the generator of the GAN, the resulting stego image has higher security and stronger resistance to steganalysis. Although this approach can resist steganalysis to some extent, the visual quality of the generated stego images is not good, and its anti-steganalysis capability can be further improved.

3.1.2 Steganographic algorithms of hiding entire secret image

In addition to using GAN models to hide secret messages, some works hide an entire secret image in a cover image with deep learning and auto-encoders, so that the receiver can recover the secret image and the cover image. Baluja [Baluja (2017)] proposes using neural networks to find appropriate locations in the image for embedding secret information. As shown in Fig. 5, the encoding process is trained to embed the entire secret image in the cover image so that the secret information is dispersed across the bits of the image. First, the secret image is normalized by a preprocessing network, which also extracts its important features. Then the secret image and a cover image of the same size are encoded by the hiding network to obtain the stego image. The model also trains a decoder corresponding to the encoder to extract the secret image; this is the decoding process. This method hides an entire image and recovers both the cover image and the secret image, but certain problems remain. For example, after the secret image is hidden in the cover image, traces of the secret image can still be seen. The method is also not resistant to steganalysis. A similar method is proposed by Rahim et al. [Rahim and Nadeem (2017)], which also combines encoder-decoder networks and CNN. Their experimental results show that the image quality of the stego image is very good, but the color of the stego image differs from that of the cover image. The loss function for the whole network is defined as Eq. (4):
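
One form of Eq. (4) that is consistent with the symbols defined in the next paragraph (two reconstruction terms weighted by α and β, plus a weight regularization term scaled by λ) is sketched below. The use of squared norms is an assumption:

```latex
L(I_{cover}, I_{secret})
  = \alpha\,\lVert I_{cover} - O_{stego}\rVert^{2}
  + \beta\,\lVert I_{secret} - O_{recover}\rVert^{2}
  + \lambda\bigl(\lVert W_{encode}\rVert^{2} + \lVert W_{decode}\rVert^{2}\bigr)
  \tag{4}
```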

Where Icover and Isecret represent the input cover images and secret images, Ostego and Orecover represent the output stego images and recovered images, Wencode and Wdecode are the weights of the encoder and decoder, and α, β and λ are the controlling parameters for the corresponding terms.

Figure 5: The main structure of Baluja [Baluja (2017)]: Encoding process and decoding process

Based on these two methods, Zhang et al. [Zhang, Dong and Liu (2018)] propose dividing the cover image into three channels: Y, U and V. A grayscale secret image is hidden in the Y channel of the cover image by the encoder. Then, using the generator of the GAN model, the U and V channels of the cover image are merged with the stego Y channel. The discriminator uses the steganalysis network of Xu et al. [Xu, Wu and Shi (2016)] as the adversary. Finally, the stego image is generated. The receiver extracts the secret image with the decoder. This method greatly improves on the previous algorithms. The colors of the stego image and the cover image look the same to the naked eye, and no trace is visible to the naked eye even in the enhanced residual. The results of the algorithm are shown in Fig. 6: the first column shows cover images, the second secret images, the third stego images, and the fourth recovered secret images.
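
The channel handling in this scheme (separate Y from U and V, modify only Y, then merge) can be sketched without the networks. The code below assumes the common BT.601 analog YUV coefficients, which the text does not specify, and uses hypothetical function names; the encoder that would actually write the secret image into Y is left out.

```python
import numpy as np

# Assumed BT.601 RGB-to-YUV matrix (rows: Y, U, V).
RGB_TO_YUV = np.array([[0.299, 0.587, 0.114],
                       [-0.14713, -0.28886, 0.436],
                       [0.615, -0.51499, -0.10001]])
YUV_TO_RGB = np.linalg.inv(RGB_TO_YUV)

def split_yuv(rgb):
    """Return the Y, U and V planes of an H x W x 3 RGB image as float arrays."""
    yuv = rgb.astype(np.float64) @ RGB_TO_YUV.T
    return yuv[..., 0], yuv[..., 1], yuv[..., 2]

def merge_yuv(y, u, v):
    """Recombine the planes and convert back to 8-bit RGB."""
    rgb = np.stack([y, u, v], axis=-1) @ YUV_TO_RGB.T
    return np.clip(np.rint(rgb), 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
cover = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)
y, u, v = split_yuv(cover)
# A trained encoder would replace y with a stego Y channel here; U and V pass through.
restored = merge_yuv(y, u, v)
print(np.array_equal(restored, cover))  # True: the split/merge step itself is lossless

gray = np.full((2, 2, 3), 120, dtype=np.uint8)
gy, gu, gv = split_yuv(gray)
print(float(gy[0, 0]))                  # luminance of a gray pixel equals its gray level
```

Hiding a grayscale secret only in Y keeps the chrominance planes of the cover untouched, which is a plausible reason for the improved color fidelity reported for this method.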

Figure 6: The effect diagrams of the algorithm in Zhang et al. [Zhang, Dong and Liu (2018)]

The above algorithms realize high-capacity information hiding. Their distinguishing feature is that they can hide a whole image in a cover image of the same size and can recover both the cover image and the secret image. We therefore classify them as algorithms that hide entire secret images. This type of method is suitable for high-capacity secret information transmission or secret image transmission. Its advantage is improved hiding capacity: a whole secret image, rather than a small number of bits, can be hidden in one image. However, because of their high embedding rate, these approaches cannot resist steganalysis, so their security is poor.

3.1.3 Steganographic algorithms of selecting embedding location and others

In addition to the above two types of method, other algorithms use deep learning models for information hiding. Wu et al. [Wu, Wang and Shi (2016)] use a machine learning method to realize LSB hiding. Atee et al. [Atee, Ahmad, Noor et al. (2017)] propose a learning method based on the Extreme Learning Machine (ELM) that selects the optimal position for embedding information. The method better preserves the visual quality of the image and has better imperceptibility. Meng et al. [Meng, Rice, Wang et al. (2018)] propose using the Faster R-CNN object detection method to find complex object areas for information hiding, a scheme called MSA_ROI. Since an image may contain multiple objects, multiple adaptive steganography algorithms are used to hide information in different object areas. The structure of this algorithm is shown in Fig. 7. Firstly, the cover image is used as the input, and image features are extracted through VGG. Secondly, proposals are obtained from the feature maps through the region proposal network. Thirdly, classification and bounding-box regression are applied to the proposals. Lastly, steganography is performed in the different target areas using different adaptive steganography algorithms. Although this method can embed within specific areas, it cannot confine the secret information precisely to the foreground objects, and it reduces the hiding capacity. The effect of the algorithm is shown in Fig. 8.
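The idea of restricting embedding to a detected region can be sketched with plain LSB replacement inside a bounding box. This swaps in LSB replacement for the adaptive steganography algorithms that MSA_ROI actually uses, and the box is supplied by hand instead of coming from an object detector. The function names and the `(top, left, bottom, right)` box layout are hypothetical.

```python
def embed_lsb_in_box(image, bits, box):
    """Write bits into the least significant bits of pixels inside box.

    image: list of rows of integers in 0..255.
    box: (top, left, bottom, right), with bottom and right exclusive.
    """
    top, left, bottom, right = box
    stego = [row[:] for row in image]          # leave the cover untouched
    remaining = iter(bits)
    for r in range(top, bottom):
        for c in range(left, right):
            bit = next(remaining, None)
            if bit is None:                    # message fully embedded
                return stego
            stego[r][c] = (stego[r][c] & ~1) | bit
    if next(remaining, None) is not None:
        raise ValueError("message does not fit in the region")
    return stego


def extract_lsb_from_box(stego, n_bits, box):
    """Read n_bits least significant bits back from the same box."""
    top, left, bottom, right = box
    out = []
    for r in range(top, bottom):
        for c in range(left, right):
            if len(out) == n_bits:
                return out
            out.append(stego[r][c] & 1)
    return out
```

Because only pixels inside the box may change, the capacity is bounded by the box area, which illustrates why region-restricted embedding reduces the hiding capacity.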

Figure 7: The structure of MSA_ROI

Figure 8: The effect of MSA_ROI

3.2 Watermarking algorithms based on deep learning

Sharing deep learning networks greatly reduces the debugging and training burden of engineers and researchers, but the resulting problems, such as model tampering and loss of copyright, bring security risks. Therefore, how to protect the intellectual property of deep learning networks and the rights and interests of researchers are problems that must be solved for deep learning to be promoted and applied. At present, this research is still in its infancy.

Yalcin et al. [Yalcin and Vandewalle (2002)] propose a fragile watermarking technique and a CNN-UM structure that can be used to generate pseudorandom noise patterns that are added to the host image. Mun et al. [Mun, Nam, Jang et al. (2017)] propose a method for implementing blind watermarking with a CNN model. As shown in Fig. 9, the method consists of three main parts: watermark embedding, attack simulation and weight updating. First, the mark is embedded into the input image through the CNN to obtain the marked image. Second, attacks such as JPEG compression, noise addition, Gaussian filtering and median filtering are simulated on the marked image to obtain the attacked image. Last, the attacked image is fed back into training so that the weights are updated continuously, and the network adaptively captures regions that are more robust to each type of attack. Compared with QDFT-based methods, this method learns, through network training, a dedicated domain in which to hide the mark. Kandi et al. [Kandi, Mishra and Gorthi (2017)] propose a CNN-based codebook for robust non-blind watermarking, which outperforms transform-domain methods. In addition, for the protection of deep neural network models, Uchida et al. [Uchida, Nagai, Sakazawa et al. (2017)] propose embedding a digital watermark into the trained neural network model for copyright protection. Li et al. [Li, Deng, Gupta et al. (2018)] propose a CNN-based, security-guaranteed image watermarking generation scenario for city applications. Rouhani et al. [Rouhani, Chen and Koushanfar (2018)] propose DeepSigns, a novel end-to-end framework for systematic watermarking and IP protection based on deep learning. Its contributions include the framework itself and a set of metrics for assessing watermark embedding methods for deep learning models.
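The attack simulation step can be sketched with two of the attacks named above, median filtering and additive noise, applied to a small grayscale image stored as a list of rows. This is a simplified stand-in and not the pipeline of Mun et al.: JPEG compression and Gaussian filtering are omitted, border pixels are left unfiltered, and all function names and the noise level are hypothetical.

```python
import random
from statistics import median


def median_filter3(image):
    """3x3 median filter; border pixels are copied unchanged."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            window = [image[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = median(window)
    return out


def add_gaussian_noise(image, sigma, seed=0):
    """Add zero-mean Gaussian noise and clamp the result to 0..255."""
    rng = random.Random(seed)               # seeded so the attack is repeatable
    return [[min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row]
            for row in image]


def simulate_attacks(marked):
    """Return one attacked copy of the marked image per simulated attack."""
    return {
        "median": median_filter3(marked),
        "noise": add_gaussian_noise(marked, sigma=2.0),
    }
```

In a training loop of the kind described, each attacked copy would be passed to the extraction network, and the extraction error would drive the weight update.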

Figure 9: The structure of watermarking algorithm in Mun et al. [Mun, Nam, Jang et al.(2017)]

3.3 Coverless information hiding algorithms based on deep learning

Deep learning is applied to coverless information hiding mainly by combining GANs with coverless information hiding. A GAN can generate images that meet given requirements. Exploiting this property, coverless information hiding can directly generate a stego image driven by the secret information. Thus, the hiding capacity is enhanced and the security is improved.

Liu et al. [Liu, Zhang, Liu et al. (2017)] propose using the ACGAN generator directly for coverless information hiding. The method splits the secret information and expresses it as image category information by establishing a mapping dictionary between image categories and text. The image category information is then input into the generator to generate an image that serves as the stego image, thereby implementing coverless information hiding. To ensure the security of the GAN-based scheme, Ke et al. [Ke, Zhang, Liu et al. (2017)] propose a generator that satisfies the Kerckhoffs principle and generates the stego image directly, taking the key and the cover image as the input of the generator in the GAN. Duan et al. [Duan, Song, Qin et al. (2018)] propose generating two visually identical images with a generative model.
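The mapping-dictionary step can be sketched as a lookup between fixed-length bit groups and category labels. The sketch covers only the dictionary. The generator that turns each label into an image, and the receiver-side classifier that is assumed to recover the label from the image, are left out. The category names and function names are hypothetical, and the cited method maps text rather than raw bits.

```python
CATEGORIES = ["airplane", "bird", "cat", "dog"]   # hypothetical labels, 2 bits each


def bits_to_labels(bits, categories=CATEGORIES):
    """Split the secret bits into groups and map each group to a category."""
    n = len(categories)
    if n < 2 or n & (n - 1):
        raise ValueError("number of categories must be a power of two")
    k = n.bit_length() - 1                      # bits carried by one image
    if len(bits) % k:
        raise ValueError("bit string length must be a multiple of %d" % k)
    labels = []
    for i in range(0, len(bits), k):
        index = int("".join(str(b) for b in bits[i:i + k]), 2)
        labels.append(categories[index])
    return labels


def labels_to_bits(labels, categories=CATEGORIES):
    """Receiver side: map recovered categories back to the secret bits."""
    k = len(categories).bit_length() - 1
    bits = []
    for label in labels:
        index = categories.index(label)
        bits.extend(int(ch) for ch in format(index, "0%db" % k))
    return bits
```

With four categories each generated image carries two bits, so the capacity of such a scheme grows only with the logarithm of the number of categories.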

3.4 Steganalysis algorithms based on deep learning

CNN models are widely used in steganalysis because the process of traditional steganalysis algorithms corresponds to the CNN classification process.

Before deep learning was applied to information hiding, Qian et al. [Qian, Dong, Wang et al. (2015)] proposed a steganalysis framework based on deep learning in 2015. The structure of this algorithm is shown in Fig. 10. The image is first preprocessed with a high-pass filter, as in traditional algorithms, in order to enhance the noise. The preprocessed image is input into the CNN model for image feature extraction. The activation function is a Gaussian function, as shown in (5), whose response is close to 1 only when the input is near zero. Finally, the images are classified at the fully connected layer as cover images or stego images.

Where σ determines the width of the curve. Only when the input is near zero does the activation function give a significant positive response.
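The two ingredients described above, high-pass preprocessing and a Gaussian activation, can be sketched as follows. The 5×5 kernel is the one commonly used for SRM-style residual computation. The activation is written as f(x) = exp(-x²/σ²), and the exact normalisation inside the exponent is an assumption. The function names are hypothetical.

```python
import math

# 5x5 high-pass kernel (before the 1/12 scaling); its entries sum to zero.
KV_KERNEL = [
    [-1,  2,  -2,  2, -1],
    [ 2, -6,   8, -6,  2],
    [-2,  8, -12,  8, -2],
    [ 2, -6,   8, -6,  2],
    [-1,  2,  -2,  2, -1],
]


def high_pass(image):
    """Filter the image with the kernel over all fully covered positions."""
    h, w = len(image), len(image[0])
    out = []
    for r in range(h - 4):
        row = []
        for c in range(w - 4):
            acc = sum(KV_KERNEL[i][j] * image[r + i][c + j]
                      for i in range(5) for j in range(5))
            row.append(acc / 12.0)
        out.append(row)
    return out


def gaussian_activation(x, sigma=1.0):
    """Response is 1 at x = 0 and decays towards 0 as |x| grows."""
    return math.exp(-(x * x) / (sigma * sigma))
```

Since the kernel entries sum to zero, smooth image content is suppressed and only local deviations, including a weak embedding signal, remain in the residual.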

Figure 10: The framework of steganalysis algorithm in Qian et al. [Qian, Dong, Wang et al. (2015)]

Zheng et al. [Zheng, Zhang, Wu et al. (2017)] propose a steganography detection framework based on deep learning. A deep residual network is trained to distinguish between cover images and stego images containing a weak signal. The learned model is used to extract features from the images, and an agglomerative hierarchical clustering algorithm is used to find the stego images based on the maximum distance from the cover images. Following these methods, steganalysis based on deep learning falls mainly into two classes. One improves or transforms the structure of the network model; the other further enhances the expressive ability and generalization performance of the model by means of transfer learning and model fusion [Dong, Qian and Wang (2017)].

3.4.1 Steganalysis method for improving or transforming network model structure

Among the methods that improve or transform the network model structure, Xu et al. [Xu, Wu and Shi (2017)] propose a CNN-based steganalysis model that still uses the high-pass filter adopted by Qian et al. [Qian, Dong, Wang et al. (2015)]. However, it improves the CNN structure by adding an absolute value activation (ABS) layer and batch normalization (BN), and by changing some activation functions to tanh. Salomon et al. [Salomon, Couturier, Guyeux et al. (2017)] propose another CNN-based steganalysis model. Compared with the network model of Qian et al. [Qian, Dong, Wang et al. (2015)], the model uses only two convolutional layers and increases the number of feature maps in each convolutional layer. In addition, the pooling layer is removed, because its down-scaling operation is considered to smooth the noise, which is disadvantageous for the subsequent steganalysis. For the detection of JPEG steganography algorithms, traditional methods rely on extracting JPEG features. Chen et al. [Chen, Sedighi, Boroumand et al. (2017)], however, propose building JPEG phase awareness into the architecture of the CNN, thereby improving the detection accuracy of the detector. Xu et al. [Xu (2017)] propose a method for detecting J-UNIWARD [Holub, Fridrich and Denemark (2014)] with a 20-layer CNN structure, using the BOSSBase dataset [Bas, Filler and Pevny (2011)] of size 256×256 and the CLS-LOC dataset of size 512×512. Tan et al. [Tan and Li (2014)] propose using convolutional auto-encoders in the pre-training process, with multiple convolutional auto-encoders stacked to form a CNN.

At present, among the methods that improve the model structure, the best results are obtained by the improved CNN structure of Ye et al. [Ye, Ni and Yi (2017)], which achieves 99.9% detection accuracy in some experiments. There are four main improvements in this algorithm: (1) the high-pass filter kernels of SRM are used to initialize the weights of the first convolutional layer of the CNN, replacing random initialization; (2) a new truncated linear unit (TLU) is defined, which allows the network to adapt well to the distribution of the embedded signal; (3) selection-channel information is incorporated when the stego image is input; (4) for steganalysis at low embedding rates, a transfer learning strategy is used. Wu et al. [Wu, Zhong and Liu (2017)] propose shared normalization (SN), which shares statistics between the training and test processes. This approach trains the network effectively by capturing the weak signal of the stego image. As for transforming the network model structure, Wu et al. [Wu, Zhong and Liu (2016)] use a deep residual network for steganalysis. In DNA steganalysis, Bae et al. [Bae, Lee, Kwon et al. (2017)] mainly use deep recurrent neural networks (RNNs) to model the internal structure of DNA sequences by extracting the hidden layers of the RNNs. Many further algorithms that adapt the CNN model to steganalysis are still being proposed [Yedroudj, Comby and Chaumont (2018); Ma, Guan, Zhao et al. (2018); Yang, Shi, Wong et al. (2017); Zhang, Zhu and Liu (2018)].
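Two of the activation changes mentioned in this subsection are simple enough to write down directly. The sketch below shows a truncated linear unit, which passes values through unchanged inside [-T, T] and clips them outside, and an absolute value layer that discards the sign of a residual. The default threshold is a hypothetical value chosen for illustration.

```python
def tlu(x, threshold=3.0):
    """Truncated linear unit: identity on [-threshold, threshold], clipped outside."""
    return max(-threshold, min(threshold, x))


def abs_layer(values):
    """Absolute value activation applied elementwise to a list of residuals."""
    return [abs(v) for v in values]
```

Unlike ReLU, the truncated linear unit keeps small negative residuals, which matters because the embedding signal takes both signs, while the clipping limits the influence of large residuals caused by image content.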

3.4.2 Steganalysis method based on transfer learning

Among the methods that further enhance the expressive ability of the model by means of transfer learning, model fusion and so on, Qian et al. [Qian, Dong, Wang et al. (2016)] propose a new transfer learning framework to improve the feature learning ability of the CNN model. The model is first pre-trained on training images consisting of high-payload stego images and their corresponding cover images, and the learned feature representation is then transferred to a regularized CNN model to better detect low-payload stego images. In this way, auxiliary information from high-payload embedding can be used effectively to help the low-payload detection task. Based on the domain knowledge of the Steganalysis Rich Model (SRM) [Kodovsky and Fridrich (2013)], Zeng et al. [Zeng, Tan, Li et al. (2018)] propose a generic hybrid deep learning framework for JPEG steganalysis. The framework involves two main stages: the first stage is hand-crafted and corresponds to the convolution phase and the quantization and truncation phase of SRM, and the second stage contains a compound deep neural network whose model parameters are learned during training. Xu et al. [Xu, Wu and Shi (2016)] propose a steganalysis model based on a regularized CNN. It uses the prior information of traditional hand-crafted features (such as SRM and maxSRM [Denemark, Sedighi, Holub et al. (2014)]) to regularize the CNN model and reduce over-fitting in CNN training, thereby improving the steganalysis performance of the model. The global statistical information that is effective in traditional steganalysis features is not easily obtained by the convolutional network structure. Therefore, that paper proposes using this kind of information in CNN training through model regularization, so as to encourage the CNN to learn more effective feature representations for steganalysis.
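The quantization and truncation phase mentioned for the hand-crafted first stage can be sketched for a single residual value as follows. The quantization step `q` and truncation threshold `t` are illustrative defaults and are not taken from the cited framework, and the function name is hypothetical.

```python
import math


def quantize_truncate(residual, q=1.0, t=2):
    """Divide by the step q, round to the nearest integer, then clip to [-t, t]."""
    rounded = math.floor(residual / q + 0.5)    # round half up, avoids banker's rounding
    return max(-t, min(t, rounded))
```

After this step every residual takes one of only 2t + 1 integer values, which keeps the statistics that follow compact and focuses them on the small residuals where the embedding changes occur.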

Among deep learning based steganalysis methods, this kind of method detects traditional information hiding methods well, and can also detect stego images produced by deep learning based steganography algorithms. This poses challenges for steganography.

4 Evaluation and comparison

This section compares and analyses the performance of the algorithms using quantitative measures.

Table 1: Comparison of SGAN [Volkhonskiy, Nazarov, Borisenko et al. (2017)] and SSGAN [Shi, Dong, Wang et al. (2017)] in time cost and steganalysis accuracy using the method of Qian et al. [Qian, Dong, Wang et al. (2015)] under 0.4 bpp on the CelebA dataset [Liu, Luo, Wang et al. (2015)]

Tab. 1 compares the detection accuracy and training time of SGAN and SSGAN under the same steganalysis method. SSGAN resists the detector more effectively, as shown by its lower detection accuracy.

Tab. 2 shows the resistance to steganalysis of three related steganography algorithms when the SRM steganalysis algorithm is employed. According to the experimental results, S-UNIWARD achieves the best concealment. However, the GAN based approaches show potential for further development thanks to their advantages in feature learning.

Table 2: Comparison of ASDL-GAN [Tang, Tan, Li et al. (2017)], UT-GAN [Yang, Liu, Kang et al. (2018)] and S-UNIWARD [Holub, Fridrich and Denemark (2014)] in steganalysis accuracy using the method of Fridrich et al. [Fridrich and Kodovsky (2012)] under 0.4 bpp on the BOSSBase dataset [Bas, Filler and Pevny (2011)]

Tab. 3 compares the PSNR values. The highest PSNR between stego images and cover images, which reflects the best stego image quality, is achieved by ISGAN. The best quality of recovered secret images relative to the original secret images, however, is obtained by EDNS. The experiments show the trade-off between robustness and concealment.
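For reference, the PSNR used in this comparison can be computed as in the sketch below, which assumes 8-bit grayscale images stored as lists of rows and a peak value of 255. The function name is hypothetical.

```python
import math


def psnr(img_a, img_b, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    flat_a = [p for row in img_a for p in row]
    flat_b = [p for row in img_b for p in row]
    mse = sum((a - b) ** 2 for a, b in zip(flat_a, flat_b)) / len(flat_a)
    if mse == 0:
        return math.inf                      # identical images
    return 10.0 * math.log10(peak * peak / mse)
```

A higher PSNR means a smaller mean squared difference, so the same function serves for both comparisons in the table: stego against cover, and recovered secret against original secret.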

Table 3: Comparison of DS [Baluja (2017)], EDNS [Rahim and Nadeem (2017)] and ISGAN [Zhang, Dong and Liu (2018)] in PSNR between stego images and cover images and between recover images and secret images on Tiny-ImageNet set

In Tab. 4, the experimental data reported in the papers are converted into a uniform measure, bits per pixel (bpp). Under this measure, the hiding capacity of the algorithm of Ke et al. [Ke, Zhang, Liu et al. (2017)] is higher than that of Liu et al. [Liu, Zhang, Liu et al. (2017)].

Table 4: Comparison of experimental results for the hiding capacity in Liu et al. [Liu, Zhang, Liu et al. (2017)] and Ke et al. [Ke, Zhang, Liu et al. (2017)]

Tab. 5 and Tab. 6 compare the PSNR values of watermarked images without attack and under attack. The average PSNR of Kandi et al. [Kandi, Mishra and Gorthi (2017)] is the highest when there is no attack. However, the approach of Mun et al. [Mun, Nam, Jang et al. (2017)] resists attacks better than that of Kandi et al. [Kandi, Mishra and Gorthi (2017)].

Table 5: Comparison of experimental results for the PSNR value of watermarked images in Li et al. [Li, Deng, Gupta et al. (2018); Mun, Nam, Jang et al. (2017); Kandi, Mishra and Gorthi (2017)]

Table 6: Comparison of experimental results for the PSNR value of watermarked images attacked by different methods in Mun et al. [Mun, Nam, Jang et al. (2017); Kandi, Mishra and Gorthi (2017)]

Table 7: Comparison of detection errors of steganalysis algorithms for steganographic algorithms at embedding change rate of 0.1 bpp and 0.4 bpp on BOSSBase

[Ye, Ni and Yi (2017)] 0.2442 0.0959
[Qian, Dong, Wang et al. (2015)] － 0.2930
[Zheng, Zhang, Wu et al. (2017)] － －
[Tan and Li (2014)] － －
[Xu, Wu and Shi (2016)] － －
[Qian, Dong, Wang et al. (2016)] 0.3843 0.2028

Tab. 7 reports the performance of different deep learning based steganalysis algorithms against different steganography algorithms. For S-UNIWARD at 0.1 bpp, the steganalysis algorithm of Ye et al. [Ye, Ni and Yi (2017)] has the best detection performance, with the lowest detection error of 0.3220. For S-UNIWARD at 0.4 bpp, the steganalysis methods of Wu et al. [Wu, Zhong and Liu (2016)] and Zheng et al. [Zheng, Zhang, Wu et al. (2017)] perform well, with detection errors of 0.0630 and 0.0000 respectively. The method of Wu et al. [Wu, Zhong and Liu (2016)] also has low detection errors of 0.0410 and 0.0430 on stego images generated by HUGO and WOW at 0.4 bpp respectively. The steganalysis method of Ye et al. [Ye, Ni and Yi (2017)] obtains better experimental results on stego images generated by the WOW steganography algorithm.
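A detection error of the kind reported in Tab. 7 is commonly computed as the average of the false alarm rate and the missed detection rate. The sketch below assumes equal numbers of cover and stego priors and a fixed decision per image. The cited papers may instead minimise this quantity over a decision threshold. The function name and the 0/1 label convention are hypothetical.

```python
def detection_error(labels, predictions):
    """Average of false alarm and missed detection rates (0 = cover, 1 = stego)."""
    cover_preds = [p for l, p in zip(labels, predictions) if l == 0]
    stego_preds = [p for l, p in zip(labels, predictions) if l == 1]
    p_fa = sum(1 for p in cover_preds if p == 1) / len(cover_preds)   # covers flagged as stego
    p_md = sum(1 for p in stego_preds if p == 0) / len(stego_preds)   # stego images missed
    return 0.5 * (p_fa + p_md)
```

Under this definition a detector that always answers "cover" scores 0.5, so values close to 0.5 indicate detection no better than guessing, and values close to 0 indicate near-perfect detection.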

5 Conclusion and future work

This paper describes and analyses image information hiding algorithms based on deep learning from four aspects: steganography, watermarking, coverless information hiding and steganalysis. Although research has been carried out in these fields, some problems remain open to improvement. For example, some steganography algorithms are highly robust, but the corresponding extraction methods still need to be improved. In addition, the existing algorithms can be further optimized by introducing new optimization functions suited to information hiding algorithms, or by adding or using deep learning models that suit the algorithm.

Acknowledgement: This work is supported by the National Key R&D Program of China under grant 2018YFB1003205; by the National Natural Science Foundation of China under grants U1836208, U1536206, U1836110, 61602253 and 61672294; by the Jiangsu Basic Research Programs-Natural Science Foundation under grant number BK20181407; by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAP-D) fund; and by the Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (CICAEET) fund, China.
