File name: Chinese-Word-Segment-And-POS-Tagger
Description: all downloads are collected from the internet; please study and use them at your own discretion
Implements a Chinese word segmentation and part-of-speech (POS) tagging program. Segmentation uses "three-word forward maximum matching". POS tagging uses a hidden Markov model (HMM), decoded with the Viterbi algorithm. "Three-word forward maximum matching" keeps the speed of the plain forward maximum matching algorithm while improving segmentation accuracy.
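The description does not spell out how "three-word forward maximum matching" works, so the following is only a minimal sketch under one plausible reading: at each position, look ahead over chunks of up to three consecutive words and keep the first word of the chunk that covers the most characters. The function names, the `lookahead` parameter, and the tiny dictionary are hypothetical and are not taken from `seg.py`.

```python
def candidates(text, start, dictionary, max_len):
    """Words that can begin at `start`, longest first.

    A single character is always accepted so segmentation never gets stuck.
    """
    longest_end = min(len(text), start + max_len)
    return [text[start:end]
            for end in range(longest_end, start, -1)
            if end - start == 1 or text[start:end] in dictionary]


def max_span(text, pos, dictionary, max_len, words_left):
    """Most characters that `words_left` consecutive words can cover from `pos`."""
    if words_left == 0 or pos >= len(text):
        return 0
    return max(len(w) + max_span(text, pos + len(w), dictionary,
                                 max_len, words_left - 1)
               for w in candidates(text, pos, dictionary, max_len))


def segment(text, dictionary, lookahead=3):
    """Forward matching with a chunk of `lookahead` words (assumed reading).

    With lookahead=1 this reduces to plain forward maximum matching.
    Ties are resolved in favour of the longer first word.
    """
    max_len = max(map(len, dictionary), default=1)
    words, pos = [], 0
    while pos < len(text):
        best = max(candidates(text, pos, dictionary, max_len),
                   key=lambda w: len(w) + max_span(
                       text, pos + len(w), dictionary, max_len, lookahead - 1))
        words.append(best)
        pos += len(best)
    return words


# Hypothetical toy dictionary, for illustration only.
TOY_DICT = {"研究", "研究生", "生命", "科学", "生命科学"}

print(segment("研究生命科学的人", TOY_DICT, lookahead=1))
# ['研究生', '命', '科学', '的', '人']
print(segment("研究生命科学的人", TOY_DICT))
# ['研究', '生命科学', '的', '人']
```

In this toy case plain forward maximum matching greedily takes "研究生" and strands "命", while the three-word lookahead prefers "研究" because the chunk "研究 / 生命科学 / 的" covers seven characters instead of six. The lookahead is bounded, so the extra work per position stays small.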
(Generated automatically by the system; you can review the contents before downloading.)
Download file list
SegAndTag\chnsegtager_segtag_200828016029024.py
.........\CovertToUTF-8.py
.........\dict.py
.........\dict.pyc
.........\diction.py
.........\diction.py.bak
.........\seg.py
.........\seg.pyc
.........\selecttool.py
.........\selecttool.pyc
.........\viterbi.py
.........\viterbi.pyc
.........\word.py
.........\word.pyc
.........\__init__.py
data\dict.dat
....\diction.txt
....\segoutput.txt
....\tagoutput.txt
....\testinput.txt
....\utf8train.txt
SegAndTag
data
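The description states that POS tagging uses an HMM decoded with the Viterbi algorithm. As a rough illustration of that decoding step (not the contents of `viterbi.py`), here is a minimal sketch in log space. The tag set, all probabilities, and the `floor` smoothing value are invented toy values, not estimates from `utf8train.txt`.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-8):
    """Return the most probable tag sequence for `words` under an HMM.

    Missing or zero probabilities are replaced by `floor` (an assumed,
    very crude smoothing) so that log() is always defined.
    """
    if not words:
        return []

    def lg(p):
        return math.log(p if p > 0 else floor)

    # scores[i][t]: best log-probability of a tag path for words[:i+1] ending in t
    scores = [{t: lg(start_p.get(t, 0)) + lg(emit_p[t].get(words[0], 0))
               for t in tags}]
    back = []  # back[i][t]: best previous tag when words[i+1] is tagged t

    for word in words[1:]:
        prev, cur, ptr = scores[-1], {}, {}
        for t in tags:
            best_prev = max(tags,
                            key=lambda s: prev[s] + lg(trans_p[s].get(t, 0)))
            cur[t] = (prev[best_prev] + lg(trans_p[best_prev].get(t, 0))
                      + lg(emit_p[t].get(word, 0)))
            ptr[t] = best_prev
        scores.append(cur)
        back.append(ptr)

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: scores[-1][t])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Toy model: r = pronoun, v = verb, n = noun. All numbers are made up.
TAGS = ["r", "v", "n"]
START_P = {"r": 0.6, "n": 0.3, "v": 0.1}
TRANS_P = {
    "r": {"v": 0.8, "n": 0.1, "r": 0.1},
    "v": {"n": 0.6, "r": 0.3, "v": 0.1},
    "n": {"v": 0.5, "n": 0.4, "r": 0.1},
}
EMIT_P = {
    "r": {"我": 0.9},
    "v": {"爱": 0.7, "研究": 0.3},
    "n": {"北京": 0.5, "研究": 0.3, "爱": 0.2},
}

print(viterbi(["我", "爱", "北京"], TAGS, START_P, TRANS_P, EMIT_P))
# ['r', 'v', 'n']
```

Working with log-probabilities avoids numeric underflow on long sentences. A real tagger would estimate the three tables from a tagged corpus and use a more principled smoothing scheme than a fixed floor.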