
Optimized Image Multiplication with Approximate Counter Based Compressor

2022-08-24
Computers, Materials & Continua, 2022, Issue 8

M. Maria Dominic Savio, T. Deepa, N. Bharathiraja and Anudeep Bonasu

1 SRMIST, Chennai, 603203, India

2 Vel Tech Multi Tech Dr. Rangarajan Dr. Sakunthala Engineering College, Chennai, 601206, India

3 Intel Corporation, California, 93657, USA

Abstract: Processors are heavily burdened by large image and multimedia datasets. Approximate hardware logic is therefore gaining ground in multimedia processing, where a bounded amount of error is acceptable. This study proposes several higher-order approximate counter-based compressors (CBCs) built from an input-shuffled 6:3 CBC. In a Wallace multiplier, the CBC is a significant factor in partial product reduction. Accordingly, 10-4, 11-4, 12-4, 13-4 and 14-4 CBCs are designed in this paper using an input-shuffled 6:3 compressor to attain two-stage multiplication. The input shuffling reduces the number of output combinations of the 6:3 compressor from 64 to 27. The 15-4, 10-4, 9-4 and 7-3 CBCs are designed using the proposed 6:3 compressor, and the results are compared with existing models, which are constructed using multiplexers and 5-3 CBCs. Compared with the input-shuffled 5-3 CBC, the proposed 6:3 compressor shows better results in terms of area, power and delay. An approximation is applied to the 6:3 compressor to further reduce the computational energy of the system, which suits multimedia applications. The major contribution of this work is the development of a two-stage multiplier using the various proposed CBCs. All designs of the approximate compressor (AC) and true compressor (TC) are analysed with 8 x 8 and 16 x 16 image multiplication. According to MATLAB simulations, the proposed multipliers provide adequate accuracy in addition to greater hardware efficiency. These results are in line with current multimedia-oriented research, in which approximate circuits for image processing perform well in many deep learning networks.

Keywords: Multiplier; PSNR; image processing; approximate; compressor; NED

1 Introduction

Digital signal processing (DSP) units are constructed from several arithmetic circuits [1], and DSP blocks play a major role in the operation of the processor [2]. In a DSP design, most of the complexity lies in the multiplier: over half the computational energy of any DSP is consumed at the multipliers [3]. Multiplier construction has been a challenging research problem for the last three decades. Various approaches, such as Vedic multipliers and well-known algorithmic multipliers using shift-and-add, Wallace and Dadda tree multiplication, sequential multipliers and array multipliers, are adopted in the construction of a DSP [4]. Processors today face large datasets and heavy payloads [5]. Many studies process such data with cloud computing to overcome this problem. However, for stand-alone processors in critical applications, the GPU and FPGA boards used in cloud computing are insufficient. As a result, research at the circuit level to reduce processor strain continues, and approximate computation is one such strategy. Approximation can be applied at several levels, including software, architecture and circuit hardware; this study concentrates on circuit-level approximation. Image processing circuits such as multipliers, dividers, thresholding circuits and subtractors have been investigated. For multipliers, numerous researchers have considered using a compressor, so this work proposes a two-stage reduction Wallace tree multiplier using an advanced compression technique. The compressor architecture used for partial product reduction in the multiplier consists of N inputs, (N-3) cin, (N-3) cout, sum and carry [6,7]. Another type, the counter-based compressor (CBC), counts the number of 1’s at its inputs. The full adder is a basic CBC: for the three inputs “111” it produces carry=1 and sum=1, that is, the output “11”, which equals the decimal value three [8]. Using the same approach, an N-bit CBC is constructed by stacking half adders and full adders. Another approach, used for lower bit-length CBCs, derives the relations between the outputs with k-maps and implements them using logic gates and multiplexers [9].

The concept of input shuffling in the 5-3 CBC was introduced to reduce the number of output combinations, thereby improving system efficiency. In [10], a 15-4 CBC is constructed using 5-3 CBCs and the results are compared with existing ones; this architecture produces better results than the prior 16 x 16 multiplier. The issue with that work is that the number of full adders and 5-3 CBCs increases. Furthermore, only a 15-4 compressor was constructed, which increases the multiplier’s design complexity. The same approach is therefore applied here to a 6-3 CBC, and 8-4, 9-4, 10-4, 11-4, 12-4, 13-4, 14-4 and 15-4 CBCs are designed using the proposed 6-3 CBC, showing attractive results when compared with various existing results. Utilizing CBCs in the Wallace tree multiplier lowers the energy consumed by DSPs in the processor [11]. When a processor is developed specifically for multimedia operation, many researchers opt for output approximation to further improve circuit performance [12]. After input shuffling, a few output approximations in the truth table reduce the circuit complexity. This approximation introduces a limited error, and a trade-off is maintained between circuit parameters and multiplier accuracy. The accuracy of an arithmetic circuit is evaluated through the normalized error distance (NED) [13]. The approximate and true multipliers are used in 8-bit and 16-bit image multiplication to evaluate whether they are suitable for multimedia applications. The images produced by the existing and proposed approximate multipliers are compared with the true multiplication images, and the peak signal-to-noise ratio (PSNR) is computed; [14] provides an introduction to the PSNR parameters. The compressors constructed with the 6-3 CBC achieve better power and delay than existing designs. This paper proposes the construction of a 16-bit multiplier with two-stage reduction using 4-3, 5-3, 6-3, 7-3, 8-4, 9-4, 10-4, 11-4, 12-4, 13-4, 14-4 and 15-4 CBCs. The rest of the paper is organized as follows: Section 2 reviews the literature, Section 3 presents the proposed CBCs, Section 4 describes the multiplier designs, Section 5 covers the image multiplication application, and Section 6 concludes.

2 Literature

2.1 Compressor

Normal compressors differ from CBCs in the weights of their carry and cout outputs. The famous 4:2 compressor introduced by Shen-Fu Hsiao et al. [15] made a revolutionary change in multiplier architecture over the past two decades. It is made up of two full adders, each built with the state-of-the-art method of constructing a full adder from two XOR gates and a multiplexer, invented by Chang et al. [16]. With the same approach, several compressor architectures have been developed. For example, the weights of a 6:2 compressor are given by Eq. (1).
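
As a sketch only, and assuming the compressor definition from Section 1 (N inputs, (N-3) cin, (N-3) cout, sum and carry, hence three cin and three cout for N = 6), the weight relation of a 6:2 compressor can be written as follows. The symbol names x_i, cin_j and cout_j are assumptions and not necessarily the authors' notation.

\[
\sum_{i=1}^{6} x_i + \sum_{j=1}^{3} cin_j = \mathit{sum} + 2\left(\mathit{carry} + \sum_{j=1}^{3} cout_j\right)
\]

Both sides have the same range (0 to 9), and every output except sum carries weight 2.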

From Eq. (1), all the couts and the carry have equal weight: in the multiplier architecture, the carry or couts generated by a compressor in the ith column are passed to the (i+1)th column. CBCs differ in their weights; the 6-3 CBC has only three outputs, with the weights given in Eq. (2).
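
Since a 6-3 CBC counts the ones among its six inputs and encodes that count in three bits, its output weights can be sketched as below. The input name x_i is an assumption; sum, Cout1 and Cout2 are the output names used in Section 3.

\[
\sum_{i=0}^{5} x_i = \mathit{sum} + 2\,\mathit{Cout}_1 + 4\,\mathit{Cout}_2
\]

The left side ranges from 0 to 6, which fits in the three output bits.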

So the carry outputs of a CBC in the ith stage are given to stages i+1, i+2, ..., depending on their weights. Many CBC blocks have been used in multipliers over the last decade. Marimuthu et al. [17] proposed 8-4 and 9-4 CBCs constructed using multiplexers and half adders, as shown in Figs. 1 and 2.
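
The counting behaviour described above can be illustrated with a small behavioural model. This is a minimal Python sketch, not the gate-level designs of [17]; the function names `cbc_count` and `route_outputs` are hypothetical and introduced only for illustration.

```python
def cbc_count(bits):
    """Behavioural model of an N-input counter-based compressor.

    Returns the number of ones in `bits` as a list of output bits,
    least significant bit first. The output width is the number of
    bits needed to represent N (3 for a 6-3 CBC, 4 for a 15-4 CBC).
    """
    ones = sum(bits)
    width = len(bits).bit_length()
    return [(ones >> k) & 1 for k in range(width)]


def route_outputs(column, bits):
    """Map each CBC output bit to the multiplier column that receives it.

    Output k has weight 2**k, so a CBC placed in column i feeds
    columns i, i+1, i+2, ... with its successive output bits.
    """
    return {column + k: bit for k, bit in enumerate(cbc_count(bits))}


# A full adder is the 3-input case: "111" gives sum=1, carry=1.
full_adder = cbc_count([1, 1, 1])

# A 6-3 CBC in column 4 with all inputs high (count = 6 = 0b110).
routed = route_outputs(4, [1, 1, 1, 1, 1, 1])
```

Here `routed` places the sum bit in column 4 and the two carry outputs in columns 5 and 6, which is the weight-dependent routing the paragraph describes.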

Figure 1: 8-4 CBC [17]

Figure 2: 9-4 CBC [17]

In [18], the authors proposed a 5-3 CBC and a 15-4 CBC developed using it; the 5-3 CBC is constructed using XOR gates and multiplexers. The modified 5-3 CBC used in the 15-4 CBC was proposed in [19], as shown in Fig. 3. The concept of input reordering in the 5-3 CBC, introduced by Krishna et al. [10], shows a significant improvement in area, power and delay, as shown in Fig. 4. These 5-3 CBCs are also used in the 15-4 CBC, which is used to construct 16-bit multipliers. The 6-3 and 7-3 CBCs were proposed using full adders and parallel addition by Anup Dandapat et al. [20]. A modified architecture of the same using XOR-MUX was proposed in [21], as shown in Figs. 5 and 6.

Figure 3: 8-4 CBC [18]

Figure 4: 15-4 CBC [10]

Figure 5: 6-3 CBC [21]

Figure 6: 7-3 CBC [21]

2.2 Approximation

Approximation schemes are used when the operation involves image or multimedia data. At the cost of a limited error, approximation generally reduces the hardware. Consider the OR gate, which outputs one if any input is one. The OR gate can be replaced by a buffer connected to one of the inputs, leaving the other input unconnected. The buffer output differs from the OR output only when the connected input is zero and the unconnected input is one, that is, for one of the four input combinations, resulting in a 75% pass rate. In [21], approximation is applied to the 4:2 compressor using a probability method and utilized in an 8 * 8 multiplier. The impact of approximate multipliers on a convolutional neural network (CNN) is analyzed in [22]. In [23], the performance of the well-known VGG deep learning network is enhanced using an approximate multiplier.
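
The OR-gate example can be checked by enumerating the truth table. The following is a minimal Python sketch; `pass_rate` is a hypothetical helper name, and the choice of which input feeds the buffer is an assumption (either choice gives the same rate).

```python
from itertools import product


def exact_or(a, b):
    """Exact two-input OR gate."""
    return a | b


def approx_or(a, b):
    """Approximate OR: a buffer on input a, input b left unconnected."""
    return a


def pass_rate(exact, approx, n_inputs):
    """Fraction of input combinations where approx matches exact."""
    cases = list(product((0, 1), repeat=n_inputs))
    hits = sum(exact(*c) == approx(*c) for c in cases)
    return hits / len(cases)


rate = pass_rate(exact_or, approx_or, 2)
```

The only mismatch is the input pair (a, b) = (0, 1), so `rate` is three matches out of four combinations.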

2.3 Multiplier

Hammad et al. [24] developed four approximate multipliers by reducing the gates in the 5-3 CBC. In [25], the authors proposed an approximate multiplier using an 8:2 compressor. Another 16-bit multiplier targeting error-tolerant applications uses a 4:2 compressor [26]. Taheri et al. [27] approximated the 4:2 compressor and constructed an 8-bit multiplier for image multiplication. Anusha et al. [28] used an approximate full adder to construct an 8*8 multiplier that was utilized in image processing.

3 Proposed CBC

The major contribution of this paper is the design of the input-shuffled 6-3 CBC. The CBC counts “000001” and “100000” as “001”, and likewise “000011” and “110000” as “010”. The input shuffling circuit merges several input combinations; for example, both “000001” and “000010” are mapped to “010000”. The number of combinations is reduced from 64 to 27, and the remaining values are treated as don’t-cares in the k-maps, thereby optimizing the circuit architecture. The input shuffling circuit equations are a = x[1].x[0], b = x[1]+x[0], c = x[3].x[2], d = x[3]+x[2], e = x[5].x[4], f = x[5]+x[4], where “.” denotes AND and “+” denotes OR. The output of the input shuffling is given in Tab. 1.
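
The reduction from 64 to 27 combinations follows directly from the shuffling equations: each input pair maps to (AND, OR), which can only take the values 00, 01 or 11, giving 3 x 3 x 3 patterns. The Python sketch below enumerates this. The function name `shuffle` and the tuple ordering (x[0] first) are assumptions made for illustration.

```python
from itertools import product


def shuffle(x):
    """Apply the input shuffling equations to x = (x0, x1, x2, x3, x4, x5).

    Returns (a, b, c, d, e, f) with a = x1.x0, b = x1+x0, and so on.
    """
    x0, x1, x2, x3, x4, x5 = x
    return (x1 & x0, x1 | x0, x3 & x2, x3 | x2, x5 & x4, x5 | x4)


# Group all 64 input combinations by their shuffled pattern.
patterns = {}
for x in product((0, 1), repeat=6):
    patterns.setdefault(shuffle(x), []).append(x)

distinct = len(patterns)

# How many inputs collapse onto each shuffled pattern.
occurrences = {s: len(group) for s, group in patterns.items()}

# The number of ones is preserved, because AND + OR of a pair
# equals the sum of the pair, so counting still works after shuffling.
count_preserved = all(
    sum(s) == sum(x) for s, group in patterns.items() for x in group
)
```

With x[0] listed first, the inputs “000001” and “000010” correspond to `(1, 0, 0, 0, 0, 0)` and `(0, 1, 0, 0, 0, 0)`, and both shuffle to `(0, 1, 0, 0, 0, 0)`, that is, abcdef = “010000”.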

Table 1: Input shuffling

The 27 reduced combinations and their numbers of occurrences are shown in Tab. 2. From Tab. 2, the logic for sum, Cout1 and Cout2 is derived and given in Eqs. (3)-(5). The circuit for input shuffling is shown in Fig. 7.

Table 2: Repeated combination

Figure 7: Input shuffling

The proposed 6-3 CBC has been used in the construction of various higher-bit CBCs and compared with existing systems. Examples include (i) an 8-4 CBC designed using the 6-3 CBC, a full adder and a half adder, as represented in Fig. 8; (ii) a 9-4 CBC designed using the 6-3 CBC, a full adder and a half adder, as represented in Fig. 9; (iii) a 10-4 CBC designed using the 6-3 CBC, a full adder and a half adder, as represented in Fig. 10; and (iv) a 15-4 CBC designed using the 6-3 CBC, a full adder, a half adder and an XOR gate, as represented in Fig. 11.

For the two-stage reduction multiplier, higher-bit CBCs are designed using the proposed 6-3 CBC and compared with the 5-3 CBC architecture, as shown in Figs. 12-15.

Two design approximations have been applied to the proposed 6-3 CBC, making the design more efficient and suitable for image processing applications. The Design-1 approximation is applied to the sum term, as shown in Eq. (6). Design-2 approximates both the sum term, as in Eq. (6), and the Cout1 term, as shown in Eq. (7).

Figure 8: Proposed 8-4 CBC

Figure 9: Proposed 9-4 CBC

Figure 10: Proposed 10-4 CBC

Figure 11: Proposed 15-4 CBC

Figure 12: Proposed 14-4 CBC

Figure 13: Proposed 13-4 CBC

Figure 14: Proposed 12-4 CBC

Figure 15: Proposed 11-4 CBC

The functionality of the proposed and existing CBCs is verified in Verilog HDL. The Cadence RTL Compiler is used to calculate the power, speed and area. All designs are compiled in 90 nm technology, and the results are obtained using a typical Cadence library. Tab. 3 shows the area, power and delay of the proposed and existing CBCs. Tab. 4 shows the area, power and delay of the approximate compressors together with their pass rates.

Table 3: Results of various compressors

Table 4: Approximate compressor

4 Multiplier

The proposed and existing CBCs are used for partial product reduction in the 8 x 8 and 16 x 16 multiplier designs. The proposed 16 x 16 multiplier utilizes 5-3 to 15-4 CBCs to achieve two-stage reduction. It is compared with various conventional 16 x 16 multipliers that use only specific CBCs. The 8 x 8 multiplier is designed with the proposed CBC, and its performance is evaluated and compared with existing multipliers.
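
One way to see why a range of CBC sizes is useful is to look at how many partial-product bits fall into each column. The Python sketch below assumes a plain unsigned n x n AND-array of partial products; the helper name `column_heights` is hypothetical, and the sketch does not model how Figs. 16 and 17 actually assign CBCs to columns.

```python
def column_heights(n):
    """Number of partial-product bits in each column of an unsigned
    n x n multiplier, from the LSB column (index 0) upward.

    Bit a[i] & b[j] lands in column i + j, so column k holds every
    pair (i, j) with i + j == k and 0 <= i, j < n.
    """
    return [min(k, 2 * n - 2 - k) + 1 for k in range(2 * n - 1)]


heights_8 = column_heights(8)
heights_16 = column_heights(16)
```

The heights rise by one per column to a peak of n in the middle and then fall again, so a multiplier that compresses each column with a single counter stage needs counters of many different input widths.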

The proposed multipliers are shown in Figs. 16 and 17. The approximate 6-3 compressor is also used in the multiplier design and compared with related approximate works. In model-1 of the 8 x 8 multiplier, approximation is performed in the middle 5 columns. Two approximation models have been developed for the 16 x 16 multiplier: (i) model-2, with approximation on every CBC where the 6-3 CBC is used, and (ii) model-3, with approximation from the middle to the LSB side on CBCs where the 6-3 CBC is used. Tabs. 5 and 6 give the results of the 8 x 8 and 16 x 16 multipliers, respectively.

Figure 16: 8*8 multiplier

Figure 17: 16*16 multiplier

Table 5: 8*8 Multiplier results

Table 6: 16*16 Multiplier results

5 Image Multiplication

VLSI arithmetic circuits are essential in many digital applications. In this work, both true and approximate multipliers have been designed using various CBCs. The approximate multiplier can be suitable for multimedia applications. To check the quality of the proposed approximate multiplier, 8-bit and 16-bit images are multiplied using the proposed and existing multipliers. Two different 8-bit test samples were taken from the Signal and Image Processing Institute (SIPI) data set of the University of Southern California (USC) (http://sipi.usc.edu/database) and multiplied with the true, existing approximate and proposed approximate multipliers.

The results of multiplication with their corresponding PSNR values are shown in Fig. 18. For contrast scaling applications, an image is multiplied by itself, so the standard test image Lena is squared using all multipliers and the PSNR values are observed, as shown in Fig. 19. The performance of the 16-bit multipliers is evaluated with two test images taken from the data set at https://sourceforge.net/projects/testimages, multiplied with the true, existing approximate and proposed approximate multipliers, as shown in Fig. 20. The standard 16-bit Lena test image is squared and the PSNR values are noted, as shown in Fig. 21.
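
The evaluation flow described above (multiply two images pixel by pixel with a true and an approximate multiplier, then compute the PSNR against the true product) can be sketched in plain Python. This is an illustration under stated assumptions: `truncated_mul` is a hypothetical stand-in that simply clears low-order product bits and is not the proposed CBC-based multiplier, the tiny images are made-up values rather than SIPI data, and the peak value is taken as the largest possible 8 x 8 product.

```python
import math


def multiply_images(img_a, img_b, mul):
    """Pixel-wise product of two equally sized images using `mul`."""
    return [[mul(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(img_a, img_b)]


def psnr(reference, test, peak):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    ref = [p for row in reference for p in row]
    tst = [p for row in test for p in row]
    mse = sum((r - t) ** 2 for r, t in zip(ref, tst)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)


def exact_mul(a, b):
    """True multiplier."""
    return a * b


def truncated_mul(a, b, drop=4):
    """Hypothetical approximate multiplier: clears the `drop` lowest
    product bits. A placeholder only, not the design in this paper."""
    return ((a * b) >> drop) << drop


# Made-up 2 x 2 8-bit images, for illustration only.
img_a = [[10, 200], [37, 255]]
img_b = [[3, 17], [90, 128]]

true_img = multiply_images(img_a, img_b, exact_mul)
approx_img = multiply_images(img_a, img_b, truncated_mul)

peak = 255 * 255  # assumed peak: largest product of two 8-bit pixels
quality_db = psnr(true_img, approx_img, peak)
```

A higher `quality_db` means the approximate product image is closer to the true one; swapping `truncated_mul` for a bit-accurate model of a given approximate multiplier would reproduce the kind of comparison reported in Figs. 18-21.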

Figure 18: 8-bit clock*aeroplane

Figure 19: 8-bit lena*lena

Figure 20: 16-bit lion*pencil

Figure 21: 16-bit lena*lena

6 Conclusion

This paper has presented the construction of multipliers based on various CBCs, together with the design of an input-shuffled 6-3 counter compressor. Using the proposed 6-3 CBC, several higher-length CBCs have been constructed. The proposed two-stage reduction multiplier shows an average improvement of 6% in delay and 4% in power. The proposed 6-3 CBC, when used in the 15-4 CBC, shows an average improvement of 5% in area, 19% in power and 7% in delay. Approximate computation is applied in all compressors, and the proposed system shows favourable PSNR results when compared with conventional techniques. In future, the approximate circuits can be applied in many more applications. This work can be extended to the construction of a convolutional layer with an approximate multiplier for imaging applications.

Acknowledgement: The authors gratefully acknowledge Ms Anuj Chahal and Ms Antara Ghosh for their support.

Funding Statement: The authors received no specific funding for this study.

Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
