LAN Mei-hui,GAO Wei
(1.School of Computer Science and Engineering,Qujing Normal University,Qujing 655011,China; 2.School of Information,Yunnan Normal University,Kunming 650500,China)
?
A Newk-Partite Ranking Ontology Algorithm via Wavelet Filtering Technology
LAN Mei-hui1,GAO Wei2
(1.School of Computer Science and Engineering,Qujing Normal University,Qujing 655011,China; 2.School of Information,Yunnan Normal University,Kunming 650500,China)
Objective The ontology vertices are usually divided intokparts,and then the score function is obtained byk-partite ranking algorithm,so the similarity between two concepts is determined by the difference of their scores.Methods This paper tend to study thek-partite ranking based ontology algorithm under AUC criterion.Combined the wavelet filtering technology with the ontology iterative algorithm, and the vertex partition is controlled byNterm approximant.Results New algorithms are applied in gene ontology and physics education ontology,and we use P@N to measure the result data and compare them to the results from former algorithms.It is implied that the accuracy of our algorithm is significantly higher than that of other algorithms as the increase ofN.Conclusion The experimental results show that new algorithms are effective for ontology similarity computation and ontology mapping.
Ontology;similarity measure;ontology mapping;k-partite ranking;filtering trick;wavelet analysis
As a knowledge representation and conceptual shared model,ontology has been applied in image retrieval,knowledge management and information retrieval search extension.Acting as an effective concept semantic model,ontology is employed in disciplines beyond computer science,such as social science (for instance,see[1]),biology science(for instance,see[2]) and geography science(for instance,see[3]).
In fact,the ontology model is a graphG=(V,E),where each vertexvin an ontology graphGrepresents a concept and each edgee=vivjin an ontology graphGrepresents a relationship between conceptsviandvj.The target of ontology similarity measure is to find a similarity function Sim:V×V→+∪ {0} which maps each pair of vertices to a real number.The aim of ontology mapping is to bridge the link between two or more ontologies.LetG1andG2be two ontology graphs corresponding to ontologyO1andO2respectively.For eachv∈G1,find a setSv?V(G2) where the concepts correspond to vertices inSvare semantic close to the concept correspond tov.One method to get the mapping is,for eachv∈G1,computing the similarityS(v,vj)wherevj∈V(G2) and choose a parameter 0 For ontology similarity measure and ontology mapping,there are several effective learning tricks.Huang[4]presented fast ontology algorithm to calculate the ontology similarity in a short time.Gao and Liang[5]raised that the optimal ontology function can be determined by optimizing NDCG measure,and then they applied this idea to physics education.Gao and Gao[6]deduced the ontology function using the regression approach.Huang et al.[7],obtained ontology similarity function based on half transductive learning.Lan et al.[8],explored the learning theory approach for ontology similarity computation. Ink-partite ranking algorithm,the vertices can be divided into k parts which correspond to the k classes of rates.The rate values of all classes are decided by experts.Then,the value of the vertex in a rateais larger than the value of any vertex in rateb(where 1≤a The signification and value ofk-partite ranking in ontology application are shown clearly and intensively in the following two samples. Example 1 Fig 1 shows the structure of“University”ontologyO1.We observe that ontology graph is tree,and there are three branches—“Courses”,“Student”and“Academic Staff”.For similarity measuring on“University”ontologyO1,we setk=3,and each branch corresponds to a class of rate.Because “Student”and “Academic Staff”are people,the order of classes can be arranged as:“Courses”(rate 1),“Student”(rate 2)and“Academic Staff”(rate 3). Figure 1. “University” ontology O1 Figure 2.“University”ontology O2 Example 2 Fig 2 shows the structure of other“University”ontologyO2.This ontology has tree structure and two branches“Courses”,“People”.We want to get an ontology map betweenO1andO2.For this purpose,we setk=4,and branch“Course”inO1andO2corresponds to the same class.The order of classes can be arranged as:“Courses”(rate 1),“Student”(rate 2),“Academic Staff”(rate 3),“People”(rate 4). From the two examples presented above,we see thatk-partite ranking is suitable for ontology similarity measuring and ontology mapping,when the structure of ontology graph is tree.Since most ontologies in engineering applications are tree,the trick ofk-partite ranking can be well used in practical implement. Several theoretical analyses ofk-partite ranking based ontology algorithm are obtained.Gao et.al.[9],studied the strong and weak stability ofk-partite ranking ontology algorithm.Gao and Xu[10]learned the characteristics of ontology function ink-partite ranking setting.Gao et.al.[11],yielded some linear statistical results aboutk-partite ranking ontology algorithm.Furthermore,Gao et.al.[12],inferred the minimax learning rate for ontology algorithm ink-partite ranking setting. Although there have been several recent advances in developing algorithms about various settings of thek-partite ranking ontology problem,the study of implementation technologies ofk-partite ranking ontology algorithms has been largely limited to the special setting.For this reason,it has attracted tremendous academic and industrial interests to research thek-partite ranking ontology algorithm from a new perspective.In this paper,we determine the new ontology similarity computation and ontology mapping algorithms based on wavelet filtering tricks.The paper is structured as follows.In the first section,we introduce the ontology problem based onk-partite ranking,and the setting of wavelet filtering is also given;in section 2,we present the main idea of our paper and the new ontology iterative algorithm is designed;in section 3,we describe the procedures of new algorithms;in the last section,we concentrate on its implementation.Results of illustrative numerical experiments show the effectiveness of our new algorithms. In this section,we present the detailed setting of wavelet filtering andk-partite ranking based ontology algorithm. 1.1 Ontology Algorithm Based onk-Partite Ranking LetL2[1,0]bethespaceofLebesguemeasureontologyfunctionsf:[0,1]→(theHilbertnormonthisspaceisdenotedas‖·‖),χ=(χ(t))t∈[0,1]beastochasticprocess,takingitsvaluesinaontologyfunctionspaceV?L2[0,1],andYbearandomlabelin{1,2,…,k}. The stochastic processχis considered as a random observation for the labelY.For each pair of (a,b),setPa,b=P{Y=a|Y∈{a,b}andηa,b(v)=P{Y=a|V=v,Y∈{a,b}astheposteriorprobability.Byslightlyabusingtheterminologyandnotation,η,a regression function,expressed that η=ηa,bestablishedforeachpairof(a,b). Gao et.al.[13],raised AUC criterionk-partite ranking based ontology algorithm,and its computational model can be expressed as whereFis set of ontology functions,f∈F,and(V’,Y’)denotesanindependentcopyof(V,Y).SeveraltheoreticalconclusionsforontologyalgorithmwithAUCcriterioncanrefertoZhuet.al.[14]. 1.2 Wavelet Based Filtering Filtering approach is regarded as a good technology for dimension reducing.Let {gn}n≥1be an orthonormal basis ofL2[0,1].Consider the projection of the random ontology functionχonto a finite dimensional subspace,spanned by a subset {GN}N≥1withNbasis functions,the first ones are denoted byg1,…,gN, Thebasisthatminimizes foreachN≥1 is the Karhunen-Loeeve (KL) basis.In view of E[(〈χgN〉2)]=∫∫Cχ(t,s)gn(t)gn(S)dtds,we have that the KL basis can be presented to diagonalize the covariance operatorh(s)|→∫tCχ(t,s)h(t)dt,hereCχ(t,s)=E[χ(t)χ(s)]isacovariancefunction. Sincethelawoftheprocessχis unknown,we should rely on a set of realizations of the processχwhich is used to estimate KL basis.Let (ψ,φ)beapairofcompactlysupportedwaveletfunctionon[0,1]sothat∫trψ(t)dt=0 forr∈{0,…,M-1}.For any functionγ:→,).Forany(j,r)∈{0,…2j-1},set Letj≥0.Theprojectionofχonto the subspace spanned by theφjris obtained by: Nj=2jtermsareusedinsuchapproximation.LetXjbethesetofrandomcoefficients{αj,r∶0≤r≤2j}toexpressthestochasticprocessχ. Fixj0≥0.Wenowdiscussa(nonlinear)wavelet-basedapproximationoftherandomfunctionχbased on the N wavelet coefficients with the highest variance among those of resolution level larger thanj0.We abusively setφj0,r=ψj0-1,randβj0-1,r=αj0,rfor 0≤r<2j0andre-markthewaveletcoefficientsβj,r,j0-1≤jsothat Hence,the“N-term approximant”is deduced by keeping the terms corresponding to the coefficients with largest second-order moment and giving up the others: In Gao et al.[13],and Lan[15],two different iterative computationk-partite ranking based ontology algorithms are designed in AUC criterion.The main idea of iterative ontology algorithm in our paper is similar to[13] and[15].The main novelty in our paper is to present a new ontology algorithm based onk-partite ranking trick and AUC criterion associated with wavelet filtering technology. The iterative algorithm produces an oriented partition of the feature spaceV,so the top vertex in ontology graph corresponds to the whole spaceV,and each internal vertex represents a specific setC?V.And,letClandCrbe subsets which the left and right parts ofCcorrespond to,respectively. Thus,a partition ofCis formed.The technology of dividing can be found in Clemencon et.al.[16]. wherema,b=na+nb=|Sa|+|Sb|. IftheontologygraphhasfixedmaximumdepthD≥0oraminimumnumberofverticesinacell,belowwhichonestopsdividing,thefirststageoftheiterativecomputationimpliestousetheweightedalgorithmselectedinarecursiveway,intermsoftherateofverticeswithrateawithinthecellCa,btosplitforeachpairof(a,b) (1≤a Now,by means of the recursive partitioning principle andk-partite ranking trick with AUC criterion,we propose to implement the nonlinear wavelet filtering locally.Hence,we choose the indexes of the coefficients retained using the labeled data lying in the vertex to be divided. Suppose that ak-partite ranking algorithm (with asymmetric ontology loss function)Ais given. FixN≥1,D≥1andjmax≥0.The ontology iterative algorithm using wavelet filtering can be described as follows. Algorithm 1 New Iterative Ontology Algorithm Step2.Ford=0,…,D-1,r=0,…,2d-1,andeachpairof(a,b)with1≤a Step2.2.KeeptheNwaveletcoefficientswiththehighestlocalsecond-ordermoment Step3.Theoptimalontologyscorefunctionisthengivenby Asawidelyusedfunctionaldataanalysistechnology,waveletfilteringcanrarefythewaveletcoefficientsandthusplaytheroleofdimensionalityreduction.Intheaboveontologyalgorithm,thefilteringtrickreliedonwaveletanalysisisimprovedandthedecreaseofoptimalAUCcausedbythefilteringoperationisconcerningthedistortionrateobtainedbythewaveletapproximants.Usingorderedrecursivepartitioningofthespace,ournovelontologylearningalgorithmispresentedandtheadaptivefilteringisimplementedlocallyateachsplittingstepforeachpair(a,b). Thenewk-partite ranking based ontology learning algorithm can be used in ontology concepts similarity measurement and ontology mapping.The basic idea is:the ontology graph is mapped into a line consisting of real numbers by means of the iterative computation.The similarity between two concepts then can be measured by comparing the difference between their corresponding real numbers. Algorithm 2.k-partite ranking based ontology similarity measuring with wavelet filtering: Step 1.Mathematizing ontology information.For each vertex in ontology graph,we use a vector to express all its information. Step 2.By new iteration algorithm 1,we map the ontology graph to the real line and map the vertices of ontology graph to real numbers. Step 3.For eachv∈V(G),weuseoneofthefollowingstrategiestoobtainthesimilarverticesandreturntheoutcomestotheusers. Strategy 1.For eachv∈V(G).LetN∈beaparameter,and … Then,wededuce return(v)={v1,v2,…,vN}. Strategy 2.For eachv∈V(G).LetM∈+beaparameter,and return(v)={v′|v′∈{V(G)-v},|f(v)-f(v′)|≤M}. Clearly,strategy1lookslikefairerandstrategy2cancontrolthenumberofverticesthatreturntotheusers. Algorithm 3.k-partite ranking based ontology mapping with wavelet filtering: LetG1,G2,…,GmbeontologygraphscorrespondtoontologiesO1,O2,…,Om. Step1.Mathematizingontologyinformation.Foreachvertexinontologygraph,weuseavectortoexpressallitsinformation. Step2.Bynewiterationalgorithm1,wemaptheontologygraphtothereallineandmaptheverticesofontologygraphtorealnumbers. Step3.Forv∈V(Gi),where1≤i≤m,weuseoneoffollowingstrategiestoobtainthesimilarverticesandreturntheoutcometotheusers. Strategy 3.For eachv∈V(Gi),i=1,2,…,m.LetN∈beaparameter,and … Then,wededuce map(v)={v1,v2,…,vN}. Strategy 4.For eachv∈V(Gi),i=1,2,…,m.LetM∈+beaparameter,and map(v)={v′|v′?V(Gi),|f(v)-f(v′)|≤M} Also,strategy3lookslikefairerbutstrategy4cancontrolthenumberofverticesthatreturntotheusers. Inthissection,twosimulationexperimentsrelevanttoontologysimilaritymeasureandontologymappingaredesignedbelow.Inordertoadjacenttothesettingofontologyalgorithm,weuseavectorwithmultidimensiontoexpresseachvertex’sinformation.Thevectorcontainstheinformationofname,instance,attributeandstructureofvertex.Heretheinstanceofvertexreferstothesetofitsreachablevertexinthedirectedontologygraph. 4.1OntologySimilarityMeasuringonBiologyData Weuse“Go”ontologyO3which was constructed in http://www.geneontology.org.(Fig.3 shows the basic structure ofO3)for our experiment.P@N(Precision Ratio,see[20]for more detail) is used to measure the equality of the experiment.Firstly,the closest N concepts for every vertex in the ontology graph are given by experts,and then we obtain the first N concepts for every vertex in ontology graph by the algorithm 2 and compute the precision ratio.Ontology algorithms in [4],[5] and [6] are employed to“Go”ontology,and we compare the precision ratio which we get from the four methods.Several experiment results refer to Tab.1. Figure 3. “Go”ontologyO3Figure4. “PhysicalEducation”OntologyO4 Table 1 The Experiment Data for Ontology Similairty Measure WhenN=3,5,10or20,theprecisionratiobyvirtueofouralgorithm2ishigherthantheprecisionratiodeterminedbyalgorithmsproposedin[4],[5]and[6].Inparticular,whenNincreases,such precision ratios are increasing apparently.Therefore,the algorithm 2 described in our paper is superior to the method proposed by[4],[5] and[6]. Figure 5. “Physical Education”Ontology O5 4.2 Ontology Mapping on Physical Education Data We use physical education ontologiesO4andO5(thestructuresofO4andO5arepresentedinFig.4andFig.5respectively)foroursecondexperiment.ThegoalofthisexperimentistodetermineontologymappingbetweenO2andO3via similarity matrix which are deduced by Algorithm 3.P@N criterion is applied to measure the equality of the experiment.Firstly,we give the closestNconceptsforeachvertexintheontologygraphwiththehelpofexperts,andthenweobtainthefirstNconceptsforeveryvertexinontologygraphbythealgorithmandcomputetheprecisionratio.Also,ontologyalgorithmsin[4],[5]and[11]areemployedto“physicaleducation”ontology,andwecomparetheprecisionratiowhichwegetfromfourmethods.SeveralexperimentresultsrefertoTab.2. WhenN=1,3,or 5,the precision ratio by virtue of our algorithm 3 is higher than the precision ratio determined by algorithms proposed in[4],[5] and[11].In particular,whenNincreases,such precision ratios are increasing apparently.Therefore,the algorithm 3 described in our paper is superior to the method proposed by [4],[5] and [11]. Table 2 The Experiment Data for Ontology Mapping The reason why the new ontology algorithm can improve the accuracy rate can be summarized as follows:(1) thek-partite ranking fits the structure of ontology graph itself;(2) by means of wavelet filtering,the optimal ontology function has effectively learned and extracted the main features of ontology in the experiments. Wavelet filtering is a widely-used alternative approach in Functional Data Analysis (FDA).The main contribution of our paper is to address thek-partite ranking based ontology algorithm from the perspective of wavelet filtering.The essence of this algorithm is dimensionality reduction.It can be understood in two aspects:first,N terms approximation is actually a sparsity trick in the application;second,the multi dimension vector of ontology vertex is at last reduced to a dimension vector (a real number) by ontology score function.To the best of our knowledge,the new algorithms raised in our paper are the first to state in ontology applications. [1]KABIR M A,HAN J,YU J.User-centric social context information management:an ontology-based approach and platform[J].Personal and Ubiquitous Computing,2014,18(05):1061-1083. [2]IVANOVIC M,BUDIMACZ.An overview of ontologies and data resources in medical domains[J].Expert systems and Applications,2014,41(11):5158-15 166. [3]FONSECA F,EGENHOFE E,DAVIS C,et al.Semantic granularity in ontology-driven geographic information systems[J].AMAI Annals of Mathematics and Artificial Intelligence-Special Issue on Spatial and Temporal Granularity,2001(36):121-151. [4]HUANGX,XU T,GAO W,et al.Ontology similarity measure and ontology mapping via fast ranking method[J].International Journal of Applied Physics and Mathematics,2011,1(01):54-59. [5]GAO W, LIANG L.Ontology similarity measure by optimizing NDCG measure and application in physics education[J].Future Communication,Computing,Control and Management,2011(142):415-421. [6]GAO Y,GAO W.Ontology similarity measure and ontology mapping via learning optimization similarity function[J].International Journal of Machine Learning and Computing,2012,2(02):107-112. [7]HUANG X,XU T,GAO W,et al.Ontology similarity measure and ontology mapping using half transductive ranking[C]//Proceedings of 2011 4th IEEE international conference on computer science and Information technology.Chengdu,China,2011:571-574. [8]LAN M,XU J,GAO W.Ontology similarity computation usingk-partite ranking method[J].Journal of Computer Applications,2012,32(04):1094-1096. [9]GAO W,GAO Y,ZHANG Y.Strong and weak stability ofk-partite ranking algorithm[J].Information,2012,15(11A):4585-4590. [10]GAO W,XU T.Characteristics of optimal function for ontology similarity measure via multi-dividing[J].Journal of Networks,2012,7(08):1251-1259. [11]GAO W,XU T,GAN J,et al.Linear statistical analysis of multi-dividing ontology algorithm[J].Journal of Information and Computational Science,2014,11(01):151-159. [12]GAO W,GAO Y,ZHANG Y,et al.Minimax learning rate for multi-dividing ontology algorithm[J].Journal of Information and Computational Science,2014,11(06):1853-1860. [13]GAO Y,GAO W,LIANG L.A newk-partite ranking learning algorithm based on AUC metric and application in ontology[J].Scientific Journal of Computer Science,2013,3(05):136-144. [14]ZHU L,TAO W,MIN X,et al.Theoretical characteristics of ontology learning algorithm in multi-dividing setting[J].IAENG International Journal of Computer Science,2016,43(02):184-191. [15]LAN M.Calculation of optimalk-partite ranking functions under extended AUC model[J].Journal of Suzhou University of Science and Technology (Natural Science),2013,30(04):60-63. [16]CLEMENCON S,MARINED,NICOLAS V.Ranking forests[J].Journal of Machine Learning Research,2013,14:39-73. [17]CLEMENCON S,DEPECKER M,VAVATIS N.Adaptive partitioning schemes for bipartite ranking[J].Machine Learning,2011,83(01):31-69. [18]GAO W,YAN L,LIANG L.Piecewise function approximation and vertex partitioning schemes for multi-dividing ontology algorithm in AUC Criterion Setting (I)[J].International Journal of Computer Applications in Technology.2014,50(3-4):226-231. [19]YAN L,GAO W,LI J.Piecewise function approximation and vertex partitioning schemes for multi-dividing ontology algorithm in AUC criterion setting (II)[J].Journal of Applied Science,2013,13(16):3257-3262. [20]CRASWELL N,HAWKING D.Overview of the TREC 2003 web ttrack[C]//Proceeding of the Twelfth Text Retrieval Conference.Gaithersburg,Maryland,NIST Special Publication,2003:78-92. [責任編輯:劉志媛 英文編輯:劉彥哲] 國家自然科學基金項目(61262071);云南省教育廳科學研究基金資助項目(2014C131Y) 蘭美輝(1982-),女,云南宜良人,講師,碩士,主要從事信息檢索、機器學習、人工智能的研究。 TP 393.092 A 基于小波過濾方法的新k-部排序本體算法 蘭美輝1,高 煒2 (1.曲靖師范學院計算機科學與工程學院,云南 曲靖 655011;2.云南師范大學信息學院,云南 昆明650500) 目的 將本體結構圖劃分成k個部分,利用k-部排序學習得到一個得分函數,從而兩本體概念之間的相似度可通過它們之間得分的差值來計算。方法 研究AUC標準下基于k-部排序的本體算法。將小波過濾技術融入到本體迭代算法,通過小波的N項逼近來控制頂點的劃分。結果 將算法應用于基因本體和物理教育本體,利用P@N對結果進行評價并與以往算法得到的結果進行對比。發現隨著N的增大,算法的準確率明顯高于其他算法。結論 實驗結果說明新算法對于本體相似度計算和本體映射的建立是有效的。 本體;相似度計算;本體映射;k-部排序;過濾技術;小波分析 10.3969/j.issn.1673-1492.2017.09.004 來稿日期:2016-12-14
1 Setting of k-Partite Ranking Based Ontology Algorithm and Wavelet Filtering









2 Main Iterative Ontology Algorithm in k-Partite Ranking and AUC Setting Using Wavelet Filtering










3 Description of Algorithms







4 Experiments




5 Conclusion