





Copula Hierarchical Variational Inference
OUYANG Jihong(a,b), CAO Jingyue(a,b), WANG Teng(a,b)
(a. College of Computer Science and Technology; b. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China)
Abstract: To improve the approximation performance of Copula variational inference (CVI), a Copula hierarchical variational inference (CHVI) method is proposed. The main idea is to combine the Copula function used in CVI with the hierarchical variational structure of the hierarchical variational model (HVM), so that the variational prior of the HVM follows the Copula function used in CVI. CHVI inherits both the strong ability of the Copula function in CVI to capture correlations between variables and the advantage of the HVM variational prior structure in capturing dependencies among the latent variables of the model. As a result, CHVI can better capture the correlations between latent variables and improve approximation accuracy. CHVI is validated on the classical Gaussian mixture model, and experimental results on synthetic and real-world application datasets show that its approximation accuracy is considerably higher than that of CVI.
Key words: variational inference; Copula function; hierarchy; correlation
CLC number: TP391  Document code: A
0 Introduction
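Variational inference (VI) [1-3] is a deterministic approximate inference technique that plays a central role in machine learning. Because VI is fast, simple in form, easy to parallelize, and suitable for large-scale datasets [4], it has been widely applied in recent years in many fields, such as large-scale document analysis [5-7], speech recognition [8-9], and computer vision [10-12]. A key question in VI is how to choose a variational distribution flexible enough to capture the information in the true posterior [4]. Traditional VI assumes that the variational distribution is fully factorized, which gives mean-field variational inference (MFVI). This is a strong independence assumption that ignores the posterior correlations among latent variables. In real applications, correlations between variables can be intrinsically meaningful: traffic accidents may be related to weather conditions, and in medicine a person's height is related to their weight. MFVI often approximates models containing such correlations poorly. To mitigate this problem, researchers have developed various methods that model posterior correlations in order to obtain more accurate approximations.……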
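To make the mean-field limitation concrete, the following minimal sketch uses a toy bivariate example. It is not the CHVI method of this paper, and all function names (`mean_field_variances`, `sample_gaussian_copula`, `exponential_inv_cdf`, `pearson`) are hypothetical and chosen only for illustration. The first function relies on the standard result that the KL(q||p)-optimal fully factorized Gaussian for a Gaussian target has factor variances equal to the inverse diagonal of the target's precision matrix. The second shows, under the assumption of a Gaussian copula, how a copula can couple arbitrary marginals and thereby reintroduce dependence.

```python
import math
import random
from statistics import NormalDist


def mean_field_variances(var1, var2, rho):
    """Variances of the KL(q||p)-optimal factorized Gaussian q(z1)q(z2)
    for a bivariate Gaussian target with the given variances and correlation."""
    # Each optimal factor has variance 1 / Lambda_ii, where Lambda is the
    # target's precision matrix; for a 2x2 covariance this is var_i * (1 - rho^2).
    shrink = 1.0 - rho ** 2
    return var1 * shrink, var2 * shrink


def exponential_inv_cdf(u):
    """Inverse CDF of a unit-rate exponential distribution (illustrative marginal)."""
    return -math.log(1.0 - u)


def sample_gaussian_copula(rho, inv_cdfs, n, rng):
    """Draw n pairs whose marginals are set by inv_cdfs and whose dependence
    comes from a Gaussian copula with correlation parameter rho."""
    std = NormalDist()
    pairs = []
    for _ in range(n):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z1 = e1
        z2 = rho * e1 + math.sqrt(1.0 - rho ** 2) * e2  # corr(z1, z2) = rho
        u1, u2 = std.cdf(z1), std.cdf(z2)  # uniform marginals, coupled by the copula
        pairs.append((inv_cdfs[0](u1), inv_cdfs[1](u2)))
    return pairs


def pearson(pairs):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    vx = sum((x - mx) ** 2 for x, _ in pairs)
    vy = sum((y - my) ** 2 for _, y in pairs)
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    # Mean-field factors are independent and narrower than the true marginals.
    print("mean-field variances:", mean_field_variances(1.0, 1.0, 0.9))

    # Same marginals, with and without copula dependence.
    rng = random.Random(0)
    marginals = (NormalDist(0.0, 2.0).inv_cdf, exponential_inv_cdf)
    for rho in (0.0, 0.8):
        corr = pearson(sample_gaussian_copula(rho, marginals, 5000, rng))
        print(f"copula rho={rho}: sample correlation {corr:.2f}")
```

For a target with unit variances and correlation 0.9, the factorized approximation has variances 1 - 0.9^2 = 0.19 and zero covariance, so it both discards the correlation and understates the marginal spread. The copula sampler keeps the chosen marginals fixed while the parameter `rho` alone controls the dependence, which is the property that copula-based variational families exploit.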