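File name: SinaSpider
Description: all downloaded content comes from the internet; study and use it at your own discretion.
This is a Sina Weibo crawler that can fetch 16 million Sina Weibo records per day and provides crawler optimization strategies, including proxy IPs and caching (weibo spider).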
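The description mentions proxy IPs as one of the optimization strategies, and the file list includes middleware.py and user_agents.py. The archive's actual code is not shown on this page, so the following is only a minimal sketch of how such a strategy is commonly written as a Scrapy-style downloader middleware. The class name, the user-agent strings, and the proxy addresses are all hypothetical placeholders, and FakeRequest is a stand-in so the sketch runs without Scrapy installed.

```python
import random

# Hypothetical pools for illustration only; the archive's real lists are not shown.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]
PROXIES = ["http://127.0.0.1:8001", "http://127.0.0.1:8002"]


class RotatingIdentityMiddleware:
    """Picks a random User-Agent and proxy for every outgoing request."""

    def __init__(self, user_agents=USER_AGENTS, proxies=PROXIES, rng=None):
        self.user_agents = list(user_agents)
        self.proxies = list(proxies)
        self.rng = rng or random.Random()

    def process_request(self, request, spider):
        request.headers["User-Agent"] = self.rng.choice(self.user_agents)
        if self.proxies:
            # Scrapy's built-in HttpProxyMiddleware reads the proxy from request.meta["proxy"].
            request.meta["proxy"] = self.rng.choice(self.proxies)
        return None  # returning None lets Scrapy continue handling the request


class FakeRequest:
    """Minimal stand-in for scrapy.Request, exposing only url, headers and meta."""

    def __init__(self, url):
        self.url = url
        self.headers = {}
        self.meta = {}


if __name__ == "__main__":
    middleware = RotatingIdentityMiddleware()
    request = FakeRequest("https://weibo.cn/")
    middleware.process_request(request, spider=None)
    print(request.headers["User-Agent"])
    print(request.meta["proxy"])
```

In a real Scrapy project a class like this would be enabled through the DOWNLOADER_MIDDLEWARES setting in settings.py; choosing the identity per request rather than per spider spreads traffic across the pools so that no single proxy or user agent carries the whole crawl.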
Downloaded file list
SinaSpider\.git\config
..........\....\FETCH_HEAD
..........\....\HEAD
..........\....\index
..........\....\logs\HEAD
..........\....\....\refs\heads\master
..........\....\....\....\remotes\origin\master
..........\....\objects\pack\pack-3eb149d1f820b4c212128b1edef8304b3a6567ff.idx
..........\....\.......\....\pack-3eb149d1f820b4c212128b1edef8304b3a6567ff.pack
..........\....\refs\heads\master
..........\....\....\remotes\origin\master
..........\.project
..........\.pydevproject
..........\.settings\org.eclipse.core.resources.prefs
..........\README.md
..........\Sina_spider1\Begin.py
..........\............\scrapy.cfg
..........\............\Sina_spider1\cookies.py
..........\............\............\cookies.pyc
..........\............\............\items.py
..........\............\............\items.pyc
..........\............\............\middleware.py
..........\............\............\middleware.pyc
..........\............\............\pipelines.py
..........\............\............\pipelines.pyc
..........\............\............\settings.py
..........\............\............\settings.pyc
..........\............\............\.piders\spiders.py
..........\............\............\.......\spiders.pyc
..........\............\............\.......\__init__.py
..........\............\............\.......\__init__.pyc
..........\............\............\user_agents.py
..........\............\............\user_agents.pyc
..........\............\............\weiboID.py
..........\............\............\weiboID.pyc
..........\............\............\__init__.py
..........\............\............\__init__.pyc
..........\...........2\Begin.py
..........\............\scrapy.cfg
..........\............\Sina_spider2\commands\crawlall.py
..........\............\............\........\crawlall.pyc
..........\............\............\........\__init__.py
..........\............\............\........\__init__.pyc
..........\............\............\cookies.py
..........\............\............\cookies.pyc
..........\............\............\cookiesgenerator.py
..........\............\............\items.py
..........\............\............\items.pyc
..........\............\............\middleware.py
..........\............\............\middleware.pyc
..........\............\............\pipelines.py
..........\............\............\pipelines.pyc
..........\............\............\settings.py
..........\............\............\settings.pyc
..........\............\............\.piders\informationSpider.py
..........\............\............\.......\informationSpider.pyc
..........\............\............\.......\tweetsSpider.py
..........\............\............\.......\tweetsSpider.pyc
..........\............\............\.......\weiboIDSpider.py
..........\............\............\.......\weiboIDSpider.pyc
..........\............\............\.......\__init__.py
..........\............\............\.......\__init__.pyc
..........\............\............\user_agents.py
..........\............\............\user_agents.pyc
..........\............\............\weiboID.py
..........\............\............\weiboID.pyc
..........\............\............\__init__.py
..........\............\............\__init__.pyc
..........\.git\logs\refs\remotes\origin
..........\....\....\....\heads
..........\....\....\....\remotes
..........\....\refs\remotes\origin
..........\....\logs\refs
..........\....\objects\info
..........\....\.......\pack
..........\....\refs\heads
..........\....\....\remotes
..........\....\....\tags
..........\Sina_spider1\Sina_spider1\spiders
..........\...........2\Sina_spider2\commands
..........\............\............\spiders
..........\.git\branches
..........\....\hooks
..........\....\logs
..........\....\objects
..........\....\refs
..........\Sina_spider1\Sina_spider1
..........\...........2\Sina_spider2
..........\.git
..........\.settings
..........\Sina_spider1
..........\Sina_spider2
SinaSpider