File name: DataMining3rd
Description -- all downloaded content comes from the Internet; please study and use it on your own
In open tests of the classification process, with stop words removed from the evaluation data, introducing the Good-Turing algorithm improved classification performance by 3.05 over the Laplace rule and by 1.00 over the Lidstone method. When feature words are selected by cross-entropy, the Bayesian classification method with Good-Turing added can outperform maximum entropy classification by 95. This kind of data smoothing helps overcome the problem of missing feature words caused by data sparsity.
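The three smoothing schemes named in the description can be compared with a small sketch. This is only an illustration under stated assumptions, not the method used in the downloaded file: the function names, the toy counts, and the default `lam=0.5` are all hypothetical, and the Good-Turing variant here is the basic form (raw count-of-counts, falling back to the raw count when no word has count c+1) rather than a fitted version such as Simple Good-Turing.

```python
from collections import Counter


def lidstone(counts, vocab_size, lam=0.5):
    """Lidstone estimate: (c + lam) / (N + lam * V). lam=0.5 is an assumed default."""
    total = sum(counts.values())
    denom = total + lam * vocab_size
    return lambda word: (counts.get(word, 0) + lam) / denom


def laplace(counts, vocab_size):
    """Laplace (add-one) estimate, i.e. Lidstone with lam = 1."""
    return lidstone(counts, vocab_size, lam=1.0)


def good_turing(counts, vocab_size):
    """Basic Good-Turing estimate (illustrative, unfitted count-of-counts).

    Adjusted count: c* = (c + 1) * N_{c+1} / N_c, where N_c is the number
    of distinct words seen exactly c times. The mass N_1 / N is reserved
    for unseen words and split evenly among them.
    """
    total = sum(counts.values())
    freq_of_freq = Counter(counts.values())  # N_c
    unseen_types = vocab_size - len(counts)

    # Probability mass reserved for all unseen words together.
    p_unseen_total = freq_of_freq.get(1, 0) / total if unseen_types > 0 else 0.0

    adjusted = {}
    for word, c in counts.items():
        n_c = freq_of_freq[c]
        n_c1 = freq_of_freq.get(c + 1, 0)
        # Fall back to the raw count when N_{c+1} is zero (assumption of this sketch).
        adjusted[word] = (c + 1) * n_c1 / n_c if n_c1 > 0 else float(c)

    # Rescale seen words so that seen + unseen mass sums to 1.
    scale = (1.0 - p_unseen_total) / sum(adjusted.values())

    def prob(word):
        if word in counts:
            return adjusted[word] * scale
        return p_unseen_total / unseen_types if unseen_types > 0 else 0.0

    return prob


if __name__ == "__main__":
    # Toy class-conditional word counts (hypothetical data).
    vocab = ["data", "mining", "bayes", "entropy", "smoothing", "sparse"]
    counts = {"data": 3, "mining": 2, "bayes": 1, "entropy": 1}
    for name, make in [("Laplace", laplace), ("Lidstone", lidstone),
                       ("Good-Turing", good_turing)]:
        p = make(counts, len(vocab))
        print(name, {w: round(p(w), 4) for w in vocab})
```

All three estimators give a nonzero probability to a feature word that never occurred in the training counts, which is the sparsity problem the description refers to. They differ in how much mass they move from seen to unseen words, and on very small samples the basic Good-Turing form can behave erratically.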
(Automatically generated by the system; you can review the download contents before downloading)
Download file list
DataMining3rd.pdf