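Abstract  A conventional back-propagation (BP) neural network, a Bayesian-regularized BP neural network, and a genetic algorithm-Bayesian-regularized BP neural network were each used for nonlinear fitting of the principal components of titration data for multicomponent organic acids. The results show that Bayesian regularization constrains the network weights and avoids overfitting, while the genetic algorithm improves the global optimization ability and robustness of the network. For 26 test samples, fitting and prediction were repeated 15 times for six components (acetic, lactic, oxalic, succinic, citric, and aconitic acids) and for the total of citric and aconitic acids; the mean relative prediction errors (RSE) were 10.02%, 9.34%, 10.66%, 12.18%, 29.81%, 31.94%, and 3.80%, respectively. Citric and aconitic acids, which have similar properties, were predicted poorly when fitted individually, but their total was predicted well. The method was applied to the analysis of organic acids in two molasses samples, and the results were compared with those obtained by ion chromatography.
Keywords  Genetic algorithm; Neural network; Bayesian regularization; Molasses; Organic acid analysis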
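1 Introduction
Organic acids in molasses generally consist of small-molecule carboxylic acids such as aconitic, citric, malic, succinic, oxalic, and lactic acids. Accurate determination of these components is of practical value for the further comprehensive utilization of molasses. At present, organic acids are mainly analyzed by reversed-phase high-performance liquid chromatography (HPLC), ion chromatography (IC), and capillary electrophoresis (CE). All of these methods require prior separation of the components, which limits their application. Computational titration combines titration with computing techniques. Coupling multivariate calibration with potentiometric titration makes full use of the information in the entire continuous titration curve and allows several components to be determined simultaneously.
The BP neural network algorithm (BP-ANN) has strong nonlinear calibration ability. To address the tendency of the conventional BP neural network to become trapped in local minima, genetic algorithms, which are parallel and capable of global search, have been introduced into network training; however, the resulting networks still tend to have too many nodes and are prone to overfitting. To overcome these shortcomings, this study uses a genetic algorithm to optimize the network structure and parameters, and uses Bayesian regularization to adaptively adjust the number and magnitude of the parameters during training, thereby improving generalization and yielding a more robust optimization algorithm. A nonlinear calibration model for multicomponent titration was established and used to analyze several organic acids in complex samples.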
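2 Experimental
2.1 Instruments and reagents
809 Titrando potentiometric titrator (Metrohm, Switzerland).
Standard solutions of acetic, lactic, and oxalic acids (0.1 mol/L) and aqueous solutions of succinic, citric, and aconitic acids (0.05 mol/L) were all prepared from analytical-grade reagents and distilled water. Their concentrations were determined by titration with 0.1 mol/L NaOH standard solution; the solutions were stored refrigerated and used within one week.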
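2.2 Experimental methods
2.2.1 Measurement of mixed standards  To simulate the organic acid composition of molasses, 161 mixed organic acid standard solutions were prepared according to a uniform mixture design. To each, 20 mL of 0.50 mol/L Na2SO4 was added as an ionic strength adjuster; the pH was adjusted to 2.0 with 0.1 mol/L HCl, and the solution was diluted to 50 mL with distilled water. The solutions were titrated with 0.1 mol/L NaOH.
2.2.2 Titration parameters  DET dynamic titration mode; maximum volume increment (Max. vol. increment) 0.20 mL; stop condition (Stop meas. value) pH 10; stirring rate, level 7; titration rate optimized by the system.
2.2.3 DV-pH curves from potentiometric titration and measurement data  The pH-V data recorded by the titrator were exported and corrected for volume and concentration. All samples were then brought onto a common range of pH 2.0~10.0 by interpolation and resampled at equal intervals of 0.05 pH to give V-pH data. To make the titration measurements comparable, DV is defined as the increment of titrant consumed as the acidity changes from a reference pH to a specified pH. The resulting DV-pH curve reflects both the dissociation properties and the concentration levels of the weak acid and base components in the mixture.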
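The regridding and DV definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source says "interpolation" without naming a scheme, so linear interpolation is assumed; the measured pH values are assumed to increase strictly with titrant volume; and the function names `interp_volume` and `dv_curve`, as well as the default reference pH, are hypothetical.

```python
from bisect import bisect_left

def interp_volume(ph_meas, vol_meas, ph_target):
    """Linearly interpolate the titrant volume at ph_target.

    Assumes ph_meas is strictly increasing and ph_target lies
    inside the measured pH range (both are assumptions of this sketch).
    """
    if not ph_meas[0] <= ph_target <= ph_meas[-1]:
        raise ValueError("target pH outside the measured range")
    k = bisect_left(ph_meas, ph_target)
    if ph_meas[k] == ph_target:
        return vol_meas[k]
    p0, p1 = ph_meas[k - 1], ph_meas[k]
    v0, v1 = vol_meas[k - 1], vol_meas[k]
    return v0 + (v1 - v0) * (ph_target - p0) / (p1 - p0)

def dv_curve(ph_meas, vol_meas, ph_ref=2.0, ph_start=2.0, ph_end=10.0, step=0.05):
    """Return (pH grid, DV values) on an equally spaced pH grid.

    DV at each grid point is the titrant volume consumed between
    the reference pH and that grid pH.
    """
    n_points = round((ph_end - ph_start) / step) + 1
    grid = [round(ph_start + i * step, 2) for i in range(n_points)]
    v_ref = interp_volume(ph_meas, vol_meas, ph_ref)
    dv = [interp_volume(ph_meas, vol_meas, p) - v_ref for p in grid]
    return grid, dv
```

With the grid limits and spacing quoted in the text (pH 2.0~10.0 at 0.05 pH), each titration curve is reduced to a vector of the same length, so curves from different samples can be stacked into one data matrix.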
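2.3 Algorithm steps
(1) The titration data set is preprocessed and compressed by principal component transformation; (2) a three-layer BP neural network is constructed with the principal components as input variables and the compositions to be determined as output variables, the output function of each layer is specified, and Bayesian regularization is adopted as the network training function and initialized; (3) the weights and thresholds obtained from a number of training runs are taken as the initial population of the genetic algorithm; (4) the genetic algorithm is run to evolve this initial population of network weights and thresholds and obtain the optimal individual; (5) the optimal solution is decoded back into network weights and thresholds, which serve as initial values for a second round of Bayesian-regularized BP fitting; (6) the trained network is used to predict the concentrations of the prediction samples in order to evaluate the reliability of the method; (7) the finalized network is used to fit the measurement data of real samples.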
Chinese J. Anal. Chem., Vol. 39
No. 5  CAO Jia-Xing et al.: Fitting titration of organic acids in molasses by a genetic algorithm-Bayesian regularization BP neural network
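3 Results and discussion
3.1 Titration data set and its principal component analysis
Figure 1 shows the DV-pH curves of several mixed organic acid standard solutions, with pH 1.5 as the reference. Principal component analysis was performed on the measured data; the variation of the root mean square error of cross-validation (RMSECV) with the number of principal components is shown in Fig. 2.
Considering the complexity of the samples and the generality of the model, eight principal components were chosen for modeling. The distribution of the loadings of each principal component with pH shows that the components converge at pH≥7.5; data in the range pH 2.5~7.5 were therefore selected for analysis.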
Fig.1 DV-pH titration curves for some mixtures of organic acids
Fig.2 Root mean square error of cross-validation (RMSECV) as a function of the number of principal components
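3.2 Parameter settings of the GA-Bayesian-BP model
The eight principal components serve as inputs to the BP neural network. The output layer has seven nodes: the six analytes plus the total of citric and aconitic acids. A network with a single hidden layer is used, and the number of hidden nodes (m) is determined by the empirical formula m = N·(l+3)+1, where N and l are the numbers of input and output nodes, respectively. The transfer functions between the input and hidden layers and between the hidden and output layers are tansig and purelin, respectively. Bayesian regularization is used as the learning algorithm, and a linear combination of the mean squared error and the squared weights is used as the network performance function, namely: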
$$\mathrm{msereg} = \gamma\,\frac{1}{N}\sum_{i=1}^{N}(y_i - t_i)^2 + (1-\gamma)\cdot\frac{1}{m}\sum_{j=1}^{m} w_j^2 \qquad (1)$$
where y_i is the network prediction, t_i is the network target, and γ is the proportionality coefficient. Bayesian regularization thus automatically limits the magnitude of the network weights, prevents overfitting of the training data caused by an excessive number of nodes, and improves the generalization ability of the network.
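As a minimal sketch of the network and performance function described above, the Python below runs one forward pass through a tansig hidden layer and a linear (purelin) output layer and evaluates msereg for a given γ. The function names and the list-of-rows weight layout are assumptions made for illustration, not the authors' implementation. The sketch also takes γ as a fixed input, whereas in Bayesian regularization the balance between the error term and the weight term is adapted during training.

```python
import math

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Inputs -> tansig hidden layer -> linear (purelin) output layer.

    tansig(n) = 2 / (1 + exp(-2n)) - 1, which equals tanh(n).
    Each row of a weight matrix holds the weights feeding one node.
    """
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w_out, b_out)]

def msereg(predictions, targets, weights, gamma):
    """Eq. (1): gamma * (mean squared error) + (1 - gamma) * (mean squared weight)."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    mse = sum((y - t) ** 2 for y, t in zip(predictions, targets)) / len(targets)
    msw = sum(w * w for w in weights) / len(weights)
    return gamma * mse + (1.0 - gamma) * msw
```

Setting γ = 1 recovers the plain mean squared error; lowering γ penalizes large weights more heavily, which is how the performance function discourages overfitting.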
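The initial network weights w and thresholds b are encoded in sequence as real numbers to form an n-dimensional real vector (n is the total number of connection weights and thresholds), which constitutes one individual of the genetic algorithm. The results (Fig. 3) show that, with a crossover probability (Pc) of 0.85, a mutation probability (Pm) of 0.2, weights and thresholds restricted to (-1, 1), and minimization of msereg as the optimization objective (that is, with the fitness function Fitness = 1/msereg), a suitable initial population size is 60 and a suitable termination point is 500 generations.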
Fig.3 Relationship between fitness and population size (a); relationship between fitness and genetic algorithm (GA) generations (b)
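The real-number encoding of weights and thresholds into a GA individual, and the fitness definition Fitness = 1/msereg, can be illustrated as follows. The ordering of the blocks inside the chromosome (hidden weights, hidden thresholds, output weights, output thresholds) is an assumption, since the text only says the values are encoded in sequence; the names `encode`, `decode`, and `fitness` are hypothetical.

```python
def encode(w_hidden, b_hidden, w_out, b_out):
    """Flatten all weights and thresholds into one real-valued chromosome."""
    genes = [w for row in w_hidden for w in row]
    genes += list(b_hidden)
    genes += [w for row in w_out for w in row]
    genes += list(b_out)
    return genes

def decode(genes, n_in, n_hidden, n_out):
    """Inverse of encode: rebuild the four parameter blocks from a chromosome."""
    expected = n_hidden * n_in + n_hidden + n_out * n_hidden + n_out
    if len(genes) != expected:
        raise ValueError("chromosome length does not match the network shape")
    pos = 0
    w_hidden = [genes[pos + i * n_in: pos + (i + 1) * n_in] for i in range(n_hidden)]
    pos += n_hidden * n_in
    b_hidden = genes[pos: pos + n_hidden]
    pos += n_hidden
    w_out = [genes[pos + i * n_hidden: pos + (i + 1) * n_hidden] for i in range(n_out)]
    pos += n_out * n_hidden
    b_out = genes[pos: pos + n_out]
    return w_hidden, b_hidden, w_out, b_out

def fitness(msereg_value):
    """Fitness = 1 / msereg: a smaller regularized error gives a fitter individual."""
    return 1.0 / msereg_value
```

Because `decode` is the exact inverse of `encode`, the best individual found by the GA can be mapped back to network weights and thresholds without loss.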
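To improve the efficiency of the GA, the weights and thresholds obtained from 60 training runs of the network on the calibration set form the initial population. The GA is then run as an evolutionary search; once the evolution or the network error meets the set requirement, the best individual is decoded back into w and b and passed to the network for a second round of training.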
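A compact real-coded GA along these lines might look like the sketch below. The crossover probability, mutation probability, gene range, and generation count follow the values quoted in the text; everything else is an assumption made for illustration, because the source does not specify the operators: binary tournament selection, arithmetic (blend) crossover, single-gene uniform mutation, and elitism. The `fitness_fn` argument is a hypothetical callable that would, in the paper's setting, decode a chromosome, run the network, and return 1/msereg.

```python
import random

def evolve(population, fitness_fn, generations=500, pc=0.85, pm=0.2,
           low=-1.0, high=1.0, seed=0):
    """Evolve a pre-seeded population and return the fittest individual found."""
    rng = random.Random(seed)
    pop = [list(ind) for ind in population]
    best = max(pop, key=fitness_fn)
    for _ in range(generations):
        # Elitism (assumed): the current best survives unchanged.
        children = [list(best)]
        while len(children) < len(pop):
            # Binary tournament selection (assumed) for each parent.
            p1 = max(rng.sample(pop, 2), key=fitness_fn)
            p2 = max(rng.sample(pop, 2), key=fitness_fn)
            child = list(p1)
            if rng.random() < pc:
                # Arithmetic crossover (assumed): convex blend of the parents.
                a = rng.random()
                child = [a * g1 + (1 - a) * g2 for g1, g2 in zip(p1, p2)]
            if rng.random() < pm:
                # Mutation (assumed): redraw one gene uniformly within the range.
                child[rng.randrange(len(child))] = rng.uniform(low, high)
            children.append(child)
        pop = children
        best = max(pop, key=fitness_fn)
    return best
```

Seeding `population` with chromosomes taken from already trained networks, as the text describes, means the search starts from reasonable solutions instead of random points; with elitism, the best fitness can never decrease from one generation to the next.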
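3.3 Comparison of network training performance
Of the 161 measured samples, 26 were randomly selected as the test set and the remaining 135 formed the training set. The data were modeled with the BP neural network (BP-ANN), the Bayesian-regularized BP neural network (Bayesian-BP), and the genetic algorithm-Bayesian-regularized BP neural network (GA-Bayesian-BP). Fitting performance was evaluated over 15 repeated training runs using the prediction residual sum of squares (RMS), the relative prediction error (RSE), and the correlation coefficient r between predicted and target values; the standard deviation among the training runs is denoted s. Table 1 compares the predictive ability of the calibration models built by the different methods for the calibration set samples. As Table 1 shows, the BP-ANN model has the best average predictive ability for the calibration set, while Bayesian-BP and GA-Bayesian-BP give similar results. The run-to-run fluctuation differs, however: BP-ANN shows the largest differences between individual training runs, whereas GA-Bayesian-BP is comparatively stable. Citric and aconitic acids, which have similar properties, are predicted poorly, with excessive fitting errors. When the two are combined and their total is calculated, good predictions are obtained, indicating that the more similar the properties of the components, the lower the resolving power of the model.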
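The evaluation statistics named above can be computed as in the following sketch. The source does not give the formula for RSE, so the commonly used definition RSE = 100·sqrt(Σ(ŷ−y)²/Σy²) is assumed here; r is the ordinary Pearson correlation coefficient, and s is taken to be the sample standard deviation over repeated runs. All function names are hypothetical.

```python
import math
import statistics

def rse_percent(predicted, target):
    """Relative prediction error in percent (assumed definition, see lead-in)."""
    num = sum((p - t) ** 2 for p, t in zip(predicted, target))
    den = sum(t ** 2 for t in target)
    return 100.0 * math.sqrt(num / den)

def pearson_r(predicted, target):
    """Pearson correlation coefficient between predicted and target values."""
    mp, mt = statistics.fmean(predicted), statistics.fmean(target)
    cov = sum((p - mp) * (t - mt) for p, t in zip(predicted, target))
    sp = math.sqrt(sum((p - mp) ** 2 for p in predicted))
    st = math.sqrt(sum((t - mt) ** 2 for t in target))
    return cov / (sp * st)

def summarize_runs(values):
    """Mean and sample standard deviation s of a statistic over repeated runs."""
    return statistics.fmean(values), statistics.stdev(values)
```

Applying `summarize_runs` to the RSE values from the 15 repeated trainings would give the mean index and the run-to-run spread s that the comparison between methods relies on.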
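3.4 Comparison of prediction results
Table 2 lists the mean evaluation indices over 15 predictions of the test set samples for the calibration models built by each method, together with the variability among the individual predictions.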
Table 2, row for the total of citric and aconitic acids:
Method            RMS      RSE (%)   r       s
BP-ANN            0.0237   5.92      0.970   0.0051
Bayesian-BP       0.0253   6.34      0.965   0.0170
GA-Bayesian-BP    0.0152   3.80      0.988   0.0008
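As Table 2 shows, BP-ANN gives the worst RMS, RSE, and r values, and the variation among its training runs is also large, indicating that the calibration model built by this method is unstable and overfitted. The 15-run mean prediction indices of Bayesian-BP and GA-Bayesian-BP are comparable, but the latter is more stable and its overall evaluation indices are better than those of the other two methods, showing that GA-Bayesian-BP mitigates overfitting during network training and improves the robustness of prediction for the test samples.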
The correlation plot of predicted versus target values for the test set obtained with the GA-Bayesian-BP calibration model (Fig. 4) shows a good linear correlation between the predicted and experimental results, which is adequate for multicomponent analysis.
Fig.4 Correlation between predicted contents and real contents by GA-Bayesian-BP calibration
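3.5 Sample analysis
About 10 g of molasses was accurately weighed and made up to 100 mL with water. A 20 mL aliquot was loaded onto an exchange column (100 mm × φ 15 mm; 717 anion-exchange resin, conventionally pretreated, converted to the OH form with 4% NaOH solution, washed to neutrality, air-dried, and ground to 100 mesh), and the effluent was discarded. The column was rinsed once with 10 mL of water, and the organic acids retained on the column were then eluted with 20 mL of 0.50 mol/L Na2SO4 solution. The eluate was collected, adjusted to pH 2 with dilute HCl, made up to 50 mL with water, and titrated. The resulting pH-V titration data were converted into principal component scores and predicted with the trained GA-Bayesian-BP model. Organic acids in two molasses samples, from a sugar mill in Guangdong and a sugar mill in Taiwan, were analyzed, and the results were compared with the corresponding ion chromatography results (Table 3). Because the acid-base dissociation properties of citric and aconitic acids are too similar for them to be fitted well, the two were combined and only their total (expressed as citric acid) was analyzed. The predictions for the individual components are for reference only.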
Titration Analysis of Multi-Organic Acids in Sugarcane Molasses by
Back-Propagation Neural Network Integrated with Bayesian
Regularization and Genetic Algorithm
CAO Jia-Xing, LU Jian-Ping
(College of Chemistry and Chemical Engineering, Guangxi University, Nanning 530004)
Abstract  A back-propagation (BP) neural network integrated with Bayesian regularization and a genetic algorithm was used for nonlinear fitting of the principal components of titration data for multiple organic acids. The results show that combining Bayesian regularization, which adaptively adjusts the effective network parameters (weights and biases) to improve generalization, with a genetic algorithm, which finds the optimal initial weights and thresholds of the neural network, ensures a globally optimal solution with good performance. The method was applied to the simultaneous determination of acetic, lactic, oxalic, succinic, citric, and aconitic acids in a converted titration data set. The more similar the organic acids, the worse the predictive performance of the models when the acids were treated individually; the results were good, however, when they were treated together. For the six organic acids in the sample set, the average relative root mean square errors of prediction were 10.02%, 9.34%, 10.66%, 12.18%, 29.81%, and 30.94%, respectively, and 3.8% for the total of citric and aconitic acids. Organic acids in two sugarcane molasses samples were determined, and the results were compared with those from ion chromatography.
Keywords Genetic algorithm; Neural network; Bayesian regularization; Sugarcane molasses; Mixed organic acids