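中文搜索引擎的设计与实现.rar (The Design and Implementation of a Chinese Search Engine), Master's thesis, Huazhong University of Science and Technology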
A Thesis Submitted in Partial Fulfillment of the Requirements
for the Degree of Master of Engineering
The Design and Implementation of Chinese
Search Engine
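Search engines are the primary tool for Web information retrieval, and the
Crawler, which collects Web pages, is a core component of a search engine.
Building a scalable, high-performance, large-scale Chinese search engine
therefore comes down to designing a scalable, high-performance, large-scale
Crawler.
Given the size of the Web and its rate of growth, a parallel Crawler system
was designed. The system consists of multiple Crawler processes; each Crawler
process runs on its own machine, and each machine runs only one Crawler
process. Every Crawler process has its own local page repository and local
index repository; the pages it downloads, as well as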