File name: crawler
- Category: Artificial Intelligence / Neural Networks / Genetic Algorithms
- Resource type: [Java] [Source code]
- Upload date: 2013-07-03
- File size: 9.64 MB
- Provider: 蔷**
Description
A distributed version of a web crawler, implemented on top of Map-Reduce. The archive bundles the Hadoop 0.20.1 core jar and Lucene 3.0.0 jars alongside the Java sources.
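As a rough illustration of how one crawl round can be phrased as a map step and a reduce step, here is a minimal plain-Java sketch. It is not taken from the archive: the class `CrawlRoundSketch`, its methods, and the in-memory link graph are all hypothetical, and it uses neither Hadoop nor the package's own `Crawler` class. The map step emits one (outlink, source) pair per link on a fetched page, the shuffle groups pairs by outlink, and the reduce step keeps each distinct URL that has not been visited yet.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: one crawl round expressed as map -> shuffle -> reduce.
// An in-memory link graph stands in for real page fetching.
public class CrawlRoundSketch {

    // Map: for one fetched page, emit an (outlink, sourceUrl) pair per link.
    public static List<String[]> map(String url, List<String> outlinks) {
        List<String[]> pairs = new ArrayList<>();
        for (String link : outlinks) {
            pairs.add(new String[] {link, url});
        }
        return pairs;
    }

    // Shuffle: group the emitted pairs by key (the outlink).
    public static Map<String, Set<String>> shuffle(List<String[]> pairs) {
        Map<String, Set<String>> grouped = new TreeMap<>();
        for (String[] pair : pairs) {
            grouped.computeIfAbsent(pair[0], k -> new TreeSet<>()).add(pair[1]);
        }
        return grouped;
    }

    // Reduce: each distinct URL appears once; keep it only if it is new.
    public static Set<String> reduce(Map<String, Set<String>> grouped, Set<String> seen) {
        Set<String> next = new TreeSet<>();
        for (String url : grouped.keySet()) {
            if (!seen.contains(url)) {
                next.add(url);
            }
        }
        return next;
    }

    // One full round: returns the URLs to fetch in the next round.
    public static Set<String> nextFrontier(Map<String, List<String>> web,
                                           Set<String> frontier,
                                           Set<String> visited) {
        List<String[]> pairs = new ArrayList<>();
        for (String url : frontier) {
            pairs.addAll(map(url, web.getOrDefault(url, List.of())));
        }
        Set<String> seen = new TreeSet<>(visited);
        seen.addAll(frontier);
        return reduce(shuffle(pairs), seen);
    }

    public static void main(String[] args) {
        // Toy link graph (assumed data, for illustration only).
        Map<String, List<String>> web = Map.of(
                "a", List.of("b", "c"),
                "b", List.of("c", "a"),
                "c", List.of("d"));
        Set<String> visited = new TreeSet<>();
        Set<String> frontier = new TreeSet<>(Set.of("a")); // seed URL
        while (!frontier.isEmpty()) {
            System.out.println("fetching " + frontier);
            Set<String> next = nextFrontier(web, frontier, visited);
            visited.addAll(frontier);
            frontier = next;
        }
    }
}
```

In a real Map-Reduce job the grouping step is done by the framework and the frontier is stored in files rather than in memory, so this sketch only shows the shape of the data flow.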
File list
crawler\build.xml
.......\LISENCE
.......\README.txt
.......\seeds-csb.txt
.......\seeds-hadoop.txt
.......\seeds-hadoopcn.txt
.......\seeds-hi.txt
.......\seeds-localhost.txt
.......\seeds-nyt.txt
.......\seeds-scst.txt
.......\seeds-wiki.txt
.......\bin\crawler.sh
.......\conf\configuration.xsl
.......\....\joycrawler-csb.xml
.......\....\joycrawler-default.xml
.......\....\joycrawler-hadoop.xml
.......\....\joycrawler-hadoopcn.xml
.......\....\joycrawler-hi.xml
.......\....\joycrawler-localhost.xml
.......\....\joycrawler-nyt.xml
.......\....\joycrawler-scst.xml
.......\....\joycrawler-wiki.xml
.......\....\log4j.properties
.......\lib\commons-cli-2.0-SNAPSHOT.jar
.......\...\commons-httpclient-3.1.jar
.......\...\commons-logging-1.0.4.jar
.......\...\db.jar
.......\...\hadoop-0.20.1-core.jar
.......\...\log4j-1.2.15.jar
.......\...\lucene-core-3.0.0.jar
.......\...\lucene-smartcn-3.0.0.jar
.......\...\lucene-snowball-3.0.0.jar
.......\...\nekohtml.jar
.......\...\xercesImpl.jar
.......\...\xercesMinimal.jar
.......\...\xml-apis.jar
.......\...\native\libdb_java48.dll
.......\src\contrib\java\org\joy\analyzer\Analyzer.java
.......\...\.......\....\...\...\........\Document.java
.......\...\.......\....\...\...\........\DocumentCreationException.java
.......\...\.......\....\...\...\........\DocumentFactory.java
.......\...\.......\....\...\...\........\Hit.java
.......\...\.......\....\...\...\........\HitAnalyzer.java
.......\...\.......\....\...\...\........\Main.java
.......\...\.......\....\...\...\........\Paragraph.java
.......\...\.......\....\...\...\........\PipelineAnalyzer.java
.......\...\.......\....\...\...\........\TokenAnalyzer.java
.......\...\.......\....\...\...\........\html\Anchor.java
.......\...\.......\....\...\...\........\....\HTMLDocument.java
.......\...\.......\....\...\...\........\....\Main.form
.......\...\.......\....\...\...\........\....\Main.java
.......\...\.......\....\...\...\........\....\ParagraphSplitter.java
.......\...\.......\....\...\...\........\....\ParseException.java
.......\...\.......\....\...\...\........\....\Parser.java
.......\...\.......\....\...\...\........\....\TagWindow.java
.......\...\.......\....\...\...\........\....\TextExtractor.java
.......\...\.......\....\...\...\........\....\Utility.java
.......\...\.......\....\...\...\........\scoring\FrequencyScorer.java
.......\...\.......\....\...\...\........\.......\PWFScorer.java
.......\...\.......\....\...\...\........\.......\Scorer.java
.......\...\.......\....\...\...\........\.......\ZeroScorer.java
.......\...\.......\....\...\...\........\terms\SimpleTermExtractor.java
.......\...\.......\....\...\...\........\.....\TermExtractor.java
.......\...\.......\....\...\...\db\DB.java
.......\...\.......\....\...\...\..\DBCursor.java
.......\...\.......\....\...\...\..\DocHit.java
.......\...\.......\....\...\...\..\DocumentDB.java
.......\...\.......\....\...\...\..\DocumentEntry.java
.......\...\.......\....\...\...\..\Entry.java
.......\...\.......\....\...\...\..\Env.java
.......\...\.......\....\...\...\..\IndexDB.java
.......\...\.......\....\...\...\..\IndexEntry.java
.......\...\.......\....\...\...\..\MergedDocHits.java
.......\...\.......\....\...\...\..\Proximity.java
.......\...\.......\....\...\...\..\QueryServer.java
.......\...\.......\....\...\...\..\ResultEntry.java
.......\...\.......\....\...\...\..\SearchEntry.java
.......\...\.......\....\...\...\..\Searcher.java
.......\...\.......\....\...\...\..\query\Query.java
.......\...\.......\....\...\...\..\.....\SocketClient.java
.......\...\.......\....\...\...\..\.....\SocketServer.java
.......\...\.......\....\...\...\nlp\ChineseTokenizer.java
.......\...\.......\....\...\...\...\LuceneTokenizer.java
.......\...\.......\....\...\...\...\Word.java
.......\...\.......\....\...\...\...\WordTokenizer.java
.......\...\java\org\apache\hadoop\mapreduce\lib\input\KeyValueLineRecordReader.java
.......\...\....\...\......\......\.........\...\.....\KeyValueTextInputFormat.java
.......\...\....\...\joy\crawler\Crawler.java
.......\...\....\...\