陳婉杰 盛益強



摘 要:針對現有的社區發現算法難以解決網絡的多維性問題的現象,提出一種基于網絡表示學習的非單一維度的社區發現算法。該算法從節點屬性特征和網絡結構特征兩個維度考慮節點的差異性,首先根據節點屬性相似度計算得到節點轉移概率,結合小世界模型的六度分離理論設置網絡節點隨機游走路徑的長度。依據轉移概率選擇節點的鄰居節點,得到節點的游走路徑,然后用神經網絡模型訓練節點的游走路徑得到節點的網絡特征向量,將節點網絡特征向量的相似度重置為節點連接邊的權重,在Louvain算法的基礎上完成社區劃分。最后,在Facebook和Giraffe兩個數據集上進行了實驗,選用基于初始網絡結構的Louvain算法和基于單一維度的社區發現算法作為對比算法。實驗結果表明,在Giraffe數據集中,相比于Louvain算法,基于節點屬性的社區發現算法的模塊度指標提升了2.7%,基于網絡結構的社區發現算法的模塊度指標提升了3.0%,提出的非單一維度的社區發現算法的模塊度指標提升了3.7%。所提算法聚焦于網絡的多維性,有效提升了社區發現算法的模塊度。
關鍵詞:節點屬性;網絡結構;可擴展性;社區發現;網絡表示學習;節點差異性
中圖分類號: TP391.4模式識別與裝置文獻標志碼:A
Non-unidimensional community detection algorithm based on network representation learning
CHEN Wanjie1,2, SHENG Yiqiang 1,2*
(1. National Network New Media Engineering Research Center (Institute of Acoustics, Chinese Academy of Sciences), Beijing 100190, China;
2. University of Chinese Academy of Sciences, Beijing 100049, China)
Abstract: Focusing on the issue that it is difficult for the existing community detection algorithms to solve the multidimensionality problem of the network, a non-unidimensional community detection algorithm based on network representation learning was proposed. The algorithm considered the difference of nodes from the two dimensions of node attribute feature and network structure feature. Firstly, the node transition probability was calculated according to the node attribute similarity. The length of the random walk path of the network node was set according to the six-degree separation theory of the small world model. After obtaining the walking path of the node by selecting its neighbor nodes according to the transition probability, the walking path of the node was trained by the neural network model to achieve the network feature vectors. The similarity of the network feature vectors of the node was reset as the weight of the connected edge, and the community partition was completed based on the Louvain algorithm. Finally, experiments were conducted on two datasets, Facebook and Giraffe with the Louvain algorithm based on the initial network structure and the unidimensional community detection algorithm as comparison algorithms. Experimental results show that on the Giraffe dataset, compared to the Louvain algorithm, the community detection algorithm based on the node attribute has the modularity increased by 2.7%, the community detection algorithm based on the network structure has the modularity increased by 3.0%, and the proposed non-unidimensional community detection algorithm has the modularity increased by 3.7%. The proposed algorithm focuses on the multidimensionality of the network and improves the modularity of the community detection algorithm effectively.
Key words: node attribute; network structure; extensibility; community detection; network representation learning; node difference
0 引言
隨著互聯網的飛速發展和移動終端的普及應用,復雜網絡在生活中的很多領域均得到了快速的發展,如社交媒體領域、學術信息領域以及生物學領域等。……