
Study on Face Detection Method Based on Lightweight Convolutional Neural Network

2020-12-23

Machine Tool & Hydraulics (機床與液壓), 2020, No. 18

Gao-hua YAO,Jian-hai YU,Zhen-kun LU

(1 College of Electronic and Information Engineering, Wuzhou University, Wuzhou 543002, China)

(2 Guangxi Colleges and Universities Key Laboratory of Image Processing and Intelligent Information System, Wuzhou University, Wuzhou 543002, China)

(3 College of Information Science and Engineering, Guangxi University for Nationalities, Nanning 530000, China)

Abstract: Current convolutional neural networks have disadvantages such as a large number of parameters, slow detection speed and low detection accuracy in complex environments, and they cannot be embedded in mobile electronic devices. For improvement, this paper designs a face detection method with two front-to-back separated layers of lightweight convolutional neural networks. The first layer of the network uses a fully convolutional neural network to quickly extract facial features and generate a large number of face boundary candidate frames. The second layer of the network uses a deep fully connected convolutional neural network to screen the candidate face regions inferred by the first layer and output the face size, coordinates and confidence. Experiments show that the face detection method designed in this paper achieves higher detection accuracy and detection speed on the Face Detection Data Set and Benchmark (FDDB), and the lightweight network design makes it possible to transplant the algorithm to front-end electronic devices.

Key words: Parameter quantity, Electronic equipment, Complex environment, Full convolution network, Face boundary candidate box, Lightweight

1 Introduction

Face detection, as a part of face recognition, has always been a hot topic in academia. Before the rise of deep learning, face detection mainly extracted hand-designed image features such as Haar, HOG, SIFT and LBP, and used algorithms such as AdaBoost, ACF[1], SVM and DPM[2] to detect faces. The common disadvantage of these algorithms is that the extracted features are relatively simple, so they cannot detect faces under complex factors such as pose, expression, blur and occlusion.

With the development of integrated circuits[3], the performance of graphics cards has been greatly improved, and a number of deep learning algorithms have sprung up, such as the classic deep learning face detection algorithm Cascade CNN[4] proposed by H. Li and derived from VJ[5]. It uses a three-layer cascaded convolutional neural network and can, to a certain extent, achieve fast and accurate detection of faces in complex scenes. However, because it adds three additional calibration networks for the face frame on top of the three detection networks, it consumes more computation.

In 2014, Girshick R. proposed the algorithm R-CNN[6] and the improved series of algorithms Fast R-CNN[7] and Faster R-CNN[8]. These algorithms are proposed for general targets and can be used for face detection. Their advantage is better detection performance; their disadvantage is slow speed, which cannot meet the real-time requirements of face detection. In 2016, J. Redmon proposed the algorithm YOLO[9], an end-to-end network that can reduce many unnecessary calculations. Its detection speed is fast enough, but its detection accuracy is low. The algorithm SSD[10], which appeared in the same period, is as fast as YOLO, but its accuracy on dense small faces is poor. In 2018, Xu Tang et al. proposed the algorithm PyramidBox[11], which detects small faces in uncontrolled environments. Its backbone module is VGG16, augmented with low-level feature pyramid network layers, a context-sensitive prediction module and a context enhancement module. Its detection accuracy is extremely high, but it is still in the exploratory stage and is rarely used in actual engineering at present.

Most deep learning face detection algorithms rely on high-performance servers and are difficult to apply to embedded electronic devices. To realize the algorithm based on the idea of cascading, this paper designs a face detection method with two cascaded convolutional neural networks separated front and back. First, it uses the first layer, a lightweight convolutional network, to roughly extract facial features and generate a large number of face candidate frames. Referring to the design of Liu[12], it designs the second layer as a deep convolutional neural network to remove the large amount of redundant features in the candidate face image fragments generated by the first layer, and to filter the face frames and regress the true position of the face.

2 Network design

2.1 Overall framework

The face detection framework in this paper is designed in two levels, and its operation diagram is shown in Fig. 1. The original image is an RGB image downloaded randomly from the network. Before the detected image is input to the network, the original image must be scaled to generate a pyramid image, so that the first-level network can cover face image fragments of different sizes. The network then extracts facial features at the different scales of the input image fragments. The first-level network is used to generate face candidate frames, and the second level is used to judge and filter non-face candidate frames and output the final face border.
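The pyramid scaling step above can be sketched as follows. The function name `pyramid_scales` and the per-level shrink factor 0.709 are illustrative assumptions (the paper does not state the factor it uses); the 18-pixel input window is from the paper.

```python
def pyramid_scales(width, height, min_face=18, net_input=18, factor=0.709):
    """Compute the scale factors used to build the image pyramid.

    Each scale resizes the original image so that faces of a given size
    map onto the 18*18 input window of the first-level network.
    The per-level shrink factor (0.709 here) is an assumed value.
    """
    scale = net_input / float(min_face)   # map the smallest face to the window
    scales = []
    min_side = min(width, height) * scale
    while min_side >= net_input:          # stop once the image is smaller than the window
        scales.append(scale)
        scale *= factor
        min_side *= factor
    return scales
```

Each returned factor is applied to the original image before the 18*18 sliding window is run, so that faces of different sizes all appear at roughly the window size in some pyramid level.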

Fig.1 Face detection flow chart

2.2 F-net network structure

F-Net is designed as a lightweight fully convolutional network. The input of this network is an RGB image fragment of 18*18*3 pixels, obtained by sliding a window of 18*18 pixels over the detected image. After four convolutional layers and two pooling layers, the outputs are tensors of size 2*1*1 and 4*1*1, as shown in Fig. 2. These two outputs respectively represent the score of the face category and the scores of the four coordinates of the face frame. Comparing the face classification score with a preset threshold, the network deconvolutes the feature points that exceed the threshold and maps them back to their true positions on the original image to form several candidate frames R = {r1, r2, r3, …, rn}. To obtain a higher recall rate, we preset a smaller threshold for F-Net. The candidate box ri contains information such as the coordinates of the candidate box, the face prediction score and the feature value offset. ri can be expressed as

ri = (xi, yi, wi, hi, si, Δi)

where (xi, yi, wi, hi) are the coordinates of the candidate box, si is the face prediction score and Δi is the feature value offset.
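Mapping a feature-map point that passes the threshold back to a box on the original image can be sketched as below. The function name is hypothetical, and stride=4 is an assumption (two 2*2 pooling layers each halving the resolution); the paper does not state the exact network stride.

```python
def feature_point_to_box(fx, fy, scale, stride=4, window=18):
    """Map a feature-map cell (fx, fy), detected at a given pyramid
    scale, back to a candidate box on the original image.

    stride=4 is an assumed value (two pooling layers each halving
    the resolution); window is the 18-pixel receptive field of F-Net.
    """
    x1 = int(round(fx * stride / scale))
    y1 = int(round(fy * stride / scale))
    x2 = int(round((fx * stride + window) / scale))
    y2 = int(round((fy * stride + window) / scale))
    return (x1, y1, x2, y2)
```

Dividing by the pyramid scale undoes the resizing, so boxes found on a shrunken pyramid level come back as proportionally larger boxes on the original image.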

Because the first-level network outputs a large number of overlapping candidate frames, the overlapping frames need to be filtered. We use the NMS (Non-Maximum Suppression) algorithm to filter them. The filtered candidate frame image fragments are scaled to 34*34 pixels by a linear interpolation algorithm and input to the next-level network for screening.
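A minimal pure-Python sketch of the NMS filtering described above; the box format (x1, y1, x2, y2, score) and the 0.5 overlap threshold are illustrative assumptions, not values from the paper.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2, score)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

def nms(boxes, threshold=0.5):
    """Keep the highest-scoring box, drop boxes overlapping it by more
    than the threshold, and repeat on the remaining boxes."""
    boxes = sorted(boxes, key=lambda b: b[4], reverse=True)
    keep = []
    while boxes:
        best = boxes.pop(0)
        keep.append(best)
        boxes = [b for b in boxes if iou(best, b) <= threshold]
    return keep
```

For example, two heavily overlapping detections of the same face collapse to the single higher-scoring box, while a distant detection survives untouched.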

2.3 B-net network structure

The input of the B-Net network is the output of the F-Net network. F-Net processes the original image and its series of scaled versions through a sliding window to generate a number of feature maps. By comparing with the preset thresholds, the face features are mapped back to the original image position by deconvolution, and the image fragment is cropped and resized to 34*34 pixels by linear interpolation. B-Net is a deeper fully connected network than F-Net. The task of this network is to make more accurate judgments and filtering on the output of F-Net and output the final face coordinates. The network structure consists of four convolutional layers, three pooling layers and two fully connected layers[13], as shown in Fig. 3.

Fig.2 F-Net network structure diagram

Fig.3 B-Net network structure diagram

3 Experimental results and analysis

3.1 Experimental data

The experimental data come from WIDER FACE[14] and FDDB[15], two authoritative datasets in the world. WIDER FACE is used as the training data set for our networks, and FDDB is used as the test data set to evaluate the algorithm. We design F-Net and B-Net with inputs of 18*18 pixels and 34*34 pixels, and the two networks are trained separately. Before training F-Net, we randomly crop WIDER FACE into image fragments of 18*18 pixels, calculate the IOU value between each cropped fragment and the correspondingly labeled face area, and divide the training data into positive samples, negative samples and partial face samples. Image fragments with IOU values greater than 0.65 are taken as positive samples, fragments with 0.35 < IOU < 0.65 as partial face samples, and fragments with IOU < 0.35 as negative samples. Positive and negative samples are used for the classification task of the network, while positive samples and partial face samples are used for the regression task. The ratio of the three sample types is set to 3∶1∶1. The training samples of B-Net are generated by running the trained F-Net model on the data. Similarly, these samples are also divided into positive samples, partial face samples and negative samples to train the B-Net network.
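The IOU-based sample partitioning described above can be sketched as follows. The function names are hypothetical; the 0.65 and 0.35 thresholds are the ones stated in the paper.

```python
def box_iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

def label_crop(crop_box, face_box):
    """Assign a randomly cropped fragment to a training category by its
    IOU with the labeled face area."""
    v = box_iou(crop_box, face_box)
    if v > 0.65:
        return "positive"   # used for both classification and regression
    if v > 0.35:
        return "partial"    # partial face, used for regression only
    return "negative"       # used for classification only
```

A crop that exactly covers the labeled face is positive, a half-overlapping crop falls into the partial band, and a disjoint background crop is negative.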

3.2 Loss function

F-Net and B-Net designed in this paper are trained by jointly optimizing the face classification task and the face border regression task. Weights are assigned to the two tasks based on their importance, and the design of the loss function also varies according to the task of the network. The task of face classification is to determine whether each point on the feature map belongs to a face or a non-face, so its loss is represented by a cross-entropy loss function, as in formula (1):

L_cls = -[y log(p) + (1 - y) log(1 - p)]  (1)

The value of y is only 0 or 1, representing the true label of the sample: 0 represents a face and 1 represents a non-face. p represents the predicted probability of the sample. This loss function represents the difference between the true sample label and the predicted probability. The task of face border regression is to calculate the distance between the four coordinates of the face candidate window and the real face coordinates, and adjust the coordinates of the candidate window according to this difference. Therefore, the Euclidean distance loss function is used, as in formula (2):

L_box = ||ŷ_box - y_box||₂²  (2)

where ŷ_box is the predicted coordinate vector of the candidate window and y_box is the ground-truth coordinate vector.

To keep the gradients of the face classification task and the frame regression task at the same order of magnitude during training, and according to the importance of the two tasks, a weight λ is introduced and set to 0.5. The total loss function L_total is as in formula (3):

L_total = L_cls + λ L_box  (3)
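Formulas (1)-(3) can be checked numerically with a small pure-Python sketch; the function names are illustrative, and λ = 0.5 as in the paper.

```python
import math

def cls_loss(y, p):
    """Cross-entropy loss of formula (1): y is the true label (0 or 1),
    p is the predicted probability of the sample."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def box_loss(pred, target):
    """Squared Euclidean distance of formula (2) between the predicted
    and ground-truth box coordinates (x1, y1, x2, y2)."""
    return sum((a - b) ** 2 for a, b in zip(pred, target))

def total_loss(y, p, pred, target, lam=0.5):
    """Weighted total loss of formula (3)."""
    return cls_loss(y, p) + lam * box_loss(pred, target)
```

A confident, correct classification with a perfect box regression drives the total loss toward zero, while each mislocated coordinate adds its squared error scaled by λ.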

3.3 Training process

The experiments in this paper are performed under the deep learning framework TensorFlow 1.2. The experimental environment is shown in Table 1. TensorBoard is used to visualize the changes of cls_accuracy (recall rate), cls_loss (face classification loss) and bbox_loss (face border regression loss) with the number of iterations, where the black and gray curves represent the F-Net and B-Net networks respectively. As shown in Figs. 4-6, the recall rates of F-Net and B-Net reached 0.96 and 0.99 respectively, the face classification losses were reduced to 0.15 and 0.05, and the border regression losses were reduced to 0.05 and 0.04.

Table 1 Lab environment

Fig.4 Recall rate

Fig.5 Face classification loss

Fig.6 Border regression loss

3.4 ROC curve analysis

The ROC curve, also known as the receiver operating characteristic curve, is obtained under given stimulus conditions by plotting, for each judgment criterion, the false alarm probability P(y/N) on the abscissa against the hit probability P(y/SN) on the ordinate, and connecting the resulting points into a line.

We use the authoritative benchmark dataset FDDB to judge the performance of the algorithm. The FDDB dataset is derived from news pictures, a database created by extracting images from news reports. The pictures include a variety of complex poses, lighting conditions, backgrounds, expressions, actions and occlusions, close to real recognition environments. There are two types of FDDB evaluation: continuous score evaluation and discrete score evaluation. This paper uses the discrete evaluation method. Compared with current face detection algorithms such as Viola-Jones, Fast Bounding Box[16], SURF Cascade[17], XZJY[18], SURF Frontal[19], Boosted Exemplar[20], Joint Cascade[21], Cascade CNN, DDFD[22], CCF[23] and BBFCN[24], this algorithm has obvious advantages. As shown in Fig. 7, the abscissa represents the number of false positives and the ordinate represents the true positive rate. The accuracy of the algorithm in this paper can reach 0.917, and remains above 0.91 when the number of false positives is less than 500.

Fig.7 ROC curves of multiple algorithms

The advantages of the algorithm have been proven above by observing the ROC curve; its performance can also be felt intuitively through some examples. From the FDDB data set, we select images constrained by factors such as scale, skin color, expression, out-of-focus blur, pose and occlusion to observe the performance of our algorithm. As shown in Fig. 8, the algorithm still performs well under the constraints of multiple uncontrolled factors and successfully detects faces in pictures with complex backgrounds.

Fig.8 Performance of the algorithm under multiple complex factors

4 Conclusion

This paper explores a face detection method that is robust to complex environmental factors and has fast detection speed. It designs a two-layer lightweight convolutional neural network, with network input sizes of 18*18 and 34*34 respectively. The pyramidal preprocessing of the input image allows the network to detect multi-scale faces. At the same time, the lightweight network design and the cascade method reduce network parameters and improve detection accuracy, giving it obvious advantages over popular face detection algorithms. Future work will focus on exploring convolution methods, using depthwise separable convolutions to further accelerate the network, and transplanting the algorithm into embedded devices to realize its due value.
