File name: BootCaT-0.1.2.tar
Description (downloaded content comes from the Internet; please evaluate and use it at your own discretion)
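This is open-source software, mainly used for Chinese information processing and information retrieval. I mainly use it to collect bilingual corpora from the web. It is written in Perl and its modules are highly independent; once a set of seed URLs has been gathered, it can be used to harvest bilingual text from the web.
The Perl scripts included in the BootCaT toolkit implement an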
iterative procedure to bootstrap specialized corpora and terms from
the web, requiring only a list of "seeds" (terms that are expected
to be typical of the domain of interest) as input.
In implementing the algorithm, we followed the old UNIX adage that
each program should do only one thing, but do it well. Thus, we
developed a small, independent tool for each separate subtask of the
algorithm.
As a result, BootCaT is extremely modular: One can easily run a subset
of the programs, look at intermediate output files, add new tools to
the suite, or change one program without having to worry about the
others.
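
The iterative, one-tool-per-subtask design described above can be sketched in a few lines. This is only an illustration in Python, not the toolkit's Perl code: the function names (make_queries, collect_pages, extract_terms, bootstrap), the search callback, and the plain frequency count used to pick new terms are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import itertools
import random
from collections import Counter


def make_queries(seeds, tuple_size=2, n_queries=3, rng=None):
    """Combine seeds into tuples; each tuple becomes one search query.

    Assumption: queries are built from seed combinations. The tuple size
    and number of queries are illustrative defaults.
    """
    rng = rng or random.Random(0)
    combos = list(itertools.combinations(sorted(seeds), tuple_size))
    rng.shuffle(combos)
    return [" ".join(combo) for combo in combos[:n_queries]]


def collect_pages(queries, search):
    """Gather the texts returned for each query, dropping duplicates.

    `search` is a hypothetical callback: query string -> list of page texts.
    """
    seen, pages = set(), []
    for query in queries:
        for text in search(query):
            if text not in seen:
                seen.add(text)
                pages.append(text)
    return pages


def extract_terms(pages, known, top_n=2):
    """Pick the most frequent words that are not already seeds.

    Raw frequency is a simple stand-in for a real term extraction step.
    """
    counts = Counter(
        word for page in pages for word in page.lower().split()
        if word not in known
    )
    return [word for word, _ in counts.most_common(top_n)]


def bootstrap(seeds, search, rounds=2):
    """Run the loop: seeds -> queries -> pages -> new terms -> more seeds."""
    seeds = set(seeds)
    corpus = []
    for _ in range(rounds):
        queries = make_queries(seeds)
        pages = collect_pages(queries, search)
        corpus.extend(page for page in pages if page not in corpus)
        seeds.update(extract_terms(corpus, seeds))
    return corpus, seeds
```

Because each stage is a separate function with plain inputs and outputs, any one of them can be run on its own, inspected, or replaced without touching the others, which is the modularity the description refers to.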
Downloaded file list
11912894BootCaT-0.1.2.tar