













Design of a natural language information extraction-translation-proofreading system based on AI algorithms

Abstract: Since the 1990s, with the rapid development of artificial intelligence (AI) and its broad integration with machine learning methods such as deep learning, natural language processing (NLP), a core area of AI, has made remarkable progress. As international academic exchange and cultural integration grow more frequent, people have a growing practical need to search for and read online information from other countries. When searching for information in a non-native language, information seekers face not only language barriers but also the inconvenience of intricate information of uneven quality. To help them obtain useful information quickly and efficiently, this paper designs an information extraction-translation-proofreading (ETP) system based on AI algorithms (PageRank/TextRank). First, the system automatically locates the important information on the page being read, extracts the text, generates a summary, and translates it through a machine translation API module. Second, after an intelligent proofreading system has reviewed the translation, the information is presented to the seeker, who can then pre-screen all of it efficiently and accurately, saving reading time and effort. Finally, the functions implemented by the system's algorithms are tested experimentally, and the results meet expectations.
Keywords: AI algorithm; natural language processing; information extraction; machine translation; translation proofreading; PageRank algorithm; TextRank algorithm
CLC number: TN912.3-34    Document code: A    Article ID: 1004-373X(2024)10-0111-06
0 Introduction
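As global integration deepens, exchanges between different languages and cultures, and among international academic communities, are becoming ever closer [1]. In the process, people inevitably encounter large amounts of information in non-native languages. For information seekers, unfamiliarity with the language and the intricacy of online information cause considerable inconvenience and make it hard to quickly find the useful information they need, because reading through a massive number of pages costs a great deal of time and effort.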
The rapid development of AI technology has made natural language processing a promising answer to this kind of problem. ……
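To make the extraction step named in the abstract more concrete, the following is a minimal sketch of TextRank-style sentence ranking. It is not the system's actual implementation. The function names (`split_sentences`, `tokenize`, `similarity`, `textrank`, `summarize`), the naive sentence splitter, the damping factor of 0.85, and the fixed iteration count are all assumptions made for illustration. The similarity measure is the word-overlap formula commonly used with TextRank.

```python
import math
import re


def split_sentences(text):
    """Split on ., ! or ? followed by whitespace (a deliberately naive rule)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercase the sentence and keep alphanumeric word tokens only."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def similarity(a, b):
    """Word overlap normalized by the log of the two sentence lengths."""
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    if denom <= 0:
        return 0.0
    return len(set(a) & set(b)) / denom


def textrank(sentences, damping=0.85, iterations=50):
    """Run weighted PageRank over the sentence-similarity graph."""
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            weights[i][j] = weights[j][i] = similarity(tokens[i], tokens[j])
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Each neighbour j passes on a share of its score proportional
            # to the weight of the edge between j and i.
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            updated.append((1 - damping) + damping * rank)
        scores = updated
    return scores


def summarize(text, k=2):
    """Return the k highest-scoring sentences in their original order."""
    sentences = split_sentences(text)
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

In a pipeline like the one the abstract describes, the output of `summarize` would be the text handed to the translation module. A sentence that shares no words with the rest of the page receives only the baseline score of `1 - damping`, so it drops out of the summary first.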