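File name: segment
Description -- all downloads are collected from the internet; please evaluate and use them at your own discretion
segment is a simple Chinese word segmentation program. Its command line is as follows: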
java -jar segmenter.jar [-b|-g|-8|-s|-t] inputfile.txt
-b Big5, -g GB2312, -8 UTF-8, -s simp. chars, -t trad. chars
Segmented text will be saved to inputfile.txt.seg
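The listing does not describe how segmenter.java splits text internally. As a rough illustration of one common dictionary-based approach, greedy forward maximum matching, here is a minimal sketch. The class name MaxMatchSketch, the segment method, and the toy lexicon are all hypothetical; whether the packaged tool works this way, and whether files such as simplexu8.txt are its lexicons, are assumptions based only on the file names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; not the code shipped in segmenter.java.
public class MaxMatchSketch {
    // Greedy forward maximum matching: at each position take the longest
    // lexicon entry that starts there, falling back to a single character.
    static List<String> segment(String text, Set<String> lexicon, int maxWordLen) {
        List<String> words = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int end = Math.min(text.length(), i + maxWordLen);
            // Shrink the candidate until it is in the lexicon or one character long.
            while (end > i + 1 && !lexicon.contains(text.substring(i, end))) {
                end--;
            }
            words.add(text.substring(i, end));
            i = end;
        }
        return words;
    }

    public static void main(String[] args) {
        // Hypothetical toy lexicon standing in for a real word list.
        Set<String> lexicon = new HashSet<>(Arrays.asList("中文", "分词", "程序"));
        System.out.println(String.join(" ", segment("中文分词程序", lexicon, 4)));
    }
}
```

Greedy matching is simple and fast, but it can pick the wrong split when a longer dictionary word overlaps the correct boundary, so real segmenters usually add special handling for names and numbers.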
(Generated automatically by the system; you can review the contents before downloading)
Download file list
bothlexu8.txt
data
....\sforeign_u8.txt
....\snotname_u8.txt
....\snumbers_u8.txt
....\ssurname_u8.txt
....\tforeign_u8.txt
....\tnotname_u8.txt
....\tnumbers_u8.txt
....\tsurname_u8.txt
META-INF
........\MANIFEST.MF
segmenter.class
segmenter.java
simplexu8.txt
tradlexu8.txt