File name: ThemeCrawler
Description (all downloadable content comes from the Internet; please study and use it at your own discretion)
Common crawler search strategies fall into two main types: those based on web link structure and those based on content evaluation. The first determines the importance of a page from the link relationships between pages and uses that importance to decide the order in which links are visited. Although this approach accounts for link structure and the relationships between pages, it ignores how relevant a page's content is to the topic, so the search is prone to "topic drift". The second considers mainly page content; its advantages are that the idea is clear and the computation is simple. However, it ignores the link relationships between pages, so it is weak at predicting the value of linked pages. To address these problems, this project proposes applying the cuckoo search algorithm to a topic (focused) crawler.
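As a rough illustration of the trade-off described above, the following Java sketch orders a crawl frontier by a weighted blend of a link-based score and a content-relevance score. It is not taken from the archive: the class `ScoredFrontier`, the `linkWeight` parameter, and the keyword-fraction relevance measure are all assumptions made for this example, and the cuckoo search step itself is not shown.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.Locale;
import java.util.PriorityQueue;
import java.util.Set;

/** Hypothetical sketch: a crawl frontier ordered by a blend of link and content scores. */
final class ScoredFrontier {

    /** A URL together with its blended priority (higher is visited sooner). */
    static final class Entry {
        final String url;
        final double priority;

        Entry(String url, double priority) {
            this.url = url;
            this.priority = priority;
        }
    }

    // Max-heap on priority: the most promising URL is polled first.
    private final PriorityQueue<Entry> queue =
            new PriorityQueue<>(Comparator.comparingDouble((Entry e) -> e.priority).reversed());
    private final Set<String> seen = new HashSet<>();
    private final double linkWeight; // assumed tuning parameter in [0, 1]

    ScoredFrontier(double linkWeight) {
        if (linkWeight < 0.0 || linkWeight > 1.0) {
            throw new IllegalArgumentException("linkWeight must be in [0, 1]");
        }
        this.linkWeight = linkWeight;
    }

    /** Assumed relevance measure: fraction of words that are (lowercase) topic keywords. */
    static double contentScore(String text, Set<String> keywords) {
        int total = 0;
        int hits = 0;
        for (String word : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (word.isEmpty()) {
                continue;
            }
            total++;
            if (keywords.contains(word)) {
                hits++;
            }
        }
        return total == 0 ? 0.0 : (double) hits / total;
    }

    /** Queues a URL once; returns false if it was already seen. */
    boolean offer(String url, double linkScore, double contentScore) {
        if (!seen.add(url)) {
            return false;
        }
        double priority = linkWeight * linkScore + (1.0 - linkWeight) * contentScore;
        queue.add(new Entry(url, priority));
        return true;
    }

    /** Returns the highest-priority URL, or null when the frontier is empty. */
    String next() {
        Entry e = queue.poll();
        return e == null ? null : e.url;
    }

    public static void main(String[] args) {
        Set<String> topic = new HashSet<>(Arrays.asList("crawler", "search"));
        ScoredFrontier frontier = new ScoredFrontier(0.5);
        // Well linked but off topic: blended priority 0.5 * 0.6 + 0.5 * 0.0.
        frontier.offer("http://example.com/a", 0.6,
                contentScore("cooking recipes and travel", topic));
        // Weakly linked but on topic: blended priority 0.5 * 0.2 + 0.5 * 0.5.
        frontier.offer("http://example.com/b", 0.2,
                contentScore("focused crawler search strategy", topic));
        System.out.println(frontier.next()); // the on-topic page b is dequeued first
        System.out.println(frontier.next());
    }
}
```

With `linkWeight` set to 1.0 the frontier degenerates to a pure link-structure ordering, and with 0.0 to a pure content ordering, which are the two strategies the description contrasts.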
(Automatically generated by the system; the contents can be reviewed before downloading)
Download file list
ThemeCrawler\.classpath
............\.project
............\.settings\org.eclipse.jdt.core.prefs
............\bin\gui\CrawlerFrame.class
............\...\search\Crawler$1.class
............\...\......\Crawler$Task.class
............\...\......\Crawler.class
............\...\......\Download.class
............\...\......\HttpConstants.class
............\...\......\PriorityURL.class
............\...\......\RegularTest.class
............\src\gui\CrawlerFrame.java
............\...\search\Crawler.java
............\...\......\Download.java
............\...\......\HttpConstants.java
............\...\......\PriorityURL.java
............\...\......\RegularTest.java
............\substance.jar
............\bin\gui
............\...\search
............\src\gui
............\...\search
............\.settings
............\bin
............\src
ThemeCrawler