

Abstract: If a news system cannot classify news effectively and make personalized recommendations, its users' efficiency and interest will inevitably suffer. This paper builds an automatic news classification and recommendation system using natural language processing, text classification, and collaborative filtering. Published news content is segmented into words and used for classification training, so that the system can determine the category of each news item automatically. If a user is not satisfied with the classification result returned by the system, the user can correct the category manually, which allows the attributes to be updated from time to time later on. A collaborative filtering algorithm then computes the similarity between users, identifies the users most similar to the target user, and recommends to the target user the news that those similar users have browsed but the target user has not. Text classification accuracy is tested on the news corpus that Li Ronglu of Fudan University used for text classification research. The test results show that the system serves news users well and reflects the personalization of the news system.
Keywords: recommendation algorithm; automatic classification; collaborative filtering
CLC number: TP393.09  Document code: A  Article ID: 2096-4706(2018)10-0009-03
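The recommendation step summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the Jaccard similarity over browsing sets, the function names `jaccard` and `recommend`, the neighbourhood size `k`, and the toy browsing history are all assumptions made here for illustration, since this section does not state which similarity measure the system uses.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity between two sets of news IDs (assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(history, target, k=2):
    """Recommend news the target has not browsed, taken from its k most similar users."""
    seen = history[target]
    # Rank every other user by similarity to the target user; drop users with no overlap.
    ranked = sorted(
        ((jaccard(seen, items), user) for user, items in history.items() if user != target),
        reverse=True,
    )
    neighbours = [(sim, user) for sim, user in ranked[:k] if sim > 0]
    # Score each unseen item by the summed similarity of the neighbours who browsed it.
    scores = defaultdict(float)
    for sim, user in neighbours:
        for item in history[user] - seen:
            scores[item] += sim
    # Highest score first; ties are broken by item ID for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical browsing history: user -> set of news IDs.
history = {
    "alice": {"n1", "n2", "n3"},
    "bob": {"n1", "n2", "n4"},
    "carol": {"n2", "n3", "n5"},
    "dave": {"n9"},
}
print(recommend(history, "alice"))  # ['n4', 'n5']
```

In this toy data, bob and carol each share two of alice's three items, so the items they browsed that alice has not (n4 and n5) are recommended, while dave, who shares nothing with alice, contributes nothing.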
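0 Introduction
In the 21st century, people's growing demand for science and technology has led them to place ever greater importance on technology and independent innovation. Attention has gradually shifted from material goods to efficiency, a change that has also drawn considerable interest from foreign companies. Meanwhile, as e-commerce continues to expand, the variety and quantity of goods keep increasing. Whether acting as a user or as an administrator, anyone faced with tens of thousands of products can easily feel at a loss. Users have to spend a great deal of time determining categories and browsing irrelevant product information, which, in an era that prizes efficiency, inevitably drives large numbers of users away. ……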