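File name: segment
Description: all downloads are collected from the internet; please study and use them at your own discretion.
segment is a simple Chinese word segmentation program. Its command line is as follows: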
java -jar segmenter.jar [-b|-g|-8|-s|-t] inputfile.txt
-b Big5, -g GB2312, -8 UTF-8, -s simp. chars, -t trad. chars
Segmented text will be saved to inputfile.txt.seg
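
The usage text above can be turned into a small helper that assembles the command and predicts where the output lands. This is only a sketch: the class name `SegmenterCommand` and its methods are hypothetical and are not part of the archive. It assumes nothing beyond the usage text, namely that exactly one of the five listed flags is passed and that the result is written to the input file name with `.seg` appended.

```java
import java.util.List;

// Hypothetical helper, not part of the segment archive.
class SegmenterCommand {
    // The five flags listed in the usage text; no other flags are assumed.
    static final List<String> FLAGS = List.of("-b", "-g", "-8", "-s", "-t");

    // Assemble the argument list for: java -jar segmenter.jar <flag> <inputFile>
    static List<String> buildCommand(String flag, String inputFile) {
        if (!FLAGS.contains(flag)) {
            throw new IllegalArgumentException("Unknown flag: " + flag);
        }
        return List.of("java", "-jar", "segmenter.jar", flag, inputFile);
    }

    // Per the usage text, output is saved to the input name plus ".seg".
    static String outputPath(String inputFile) {
        return inputFile + ".seg";
    }

    public static void main(String[] args) {
        // Print the command and the expected output file for a UTF-8 input.
        System.out.println(String.join(" ", buildCommand("-8", "inputfile.txt")));
        System.out.println(outputPath("inputfile.txt"));
    }
}
```

The list returned by `buildCommand` could be handed to `java.lang.ProcessBuilder` to actually run the jar, provided `segmenter.jar` is in the working directory.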
(Generated automatically by the system; you can review the contents before downloading.)
Download file list
Archive: 47651498segment.rar
Contents:
META-INF\MANIFEST.MF
bothlexu8.txt
segmenter.class
segmenter.java
simplexu8.txt
tradlexu8.txt
data\sforeign_u8.txt
data\snotname_u8.txt
data\snumbers_u8.txt
data\ssurname_u8.txt
data\tforeign_u8.txt
data\tnotname_u8.txt
data\tnumbers_u8.txt
data\tsurname_u8.txt
META-INF
data