File name: Chinese-text-categorization-Study
Description: all downloadable content comes from the Internet; please study and use it at your own discretion.
This paper presents a comparative experimental study of Bayes, KNN, and SVM applied to Chinese text categorization.
ICTCLAS is used to segment the Chinese documents into words. For high-dimensional, large-volume data, TFIDF is used for
feature selection and, at the same time, to weight the feature terms, so that every text in the corpus
has a unified, processable structural model. The three classification algorithms are then trained on the
weighted data and used to classify it.
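The weighting and classification steps described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation. It assumes the documents are already segmented into tokens (ICTCLAS itself is not called), uses one common smoothed TF-IDF variant, and shows only a cosine-similarity KNN classifier (Bayes and SVM are omitted). The toy corpus, the labels, and the function names tfidf_vectors, weight, cosine, and knn_predict are all hypothetical.

```python
import math
from collections import Counter


def weight(tokens, idf):
    """Turn one segmented document into a sparse TF-IDF vector (dict)."""
    counts = Counter(tokens)
    total = len(tokens)
    # Terms never seen in the training corpus have no IDF and are dropped.
    return {t: (c / total) * idf[t] for t, c in counts.items() if t in idf}


def tfidf_vectors(docs):
    """docs: list of token lists. Returns (list of sparse vectors, idf table)."""
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # document frequency: count each term once per doc
    # Smoothed IDF (an assumed variant): terms found in every document
    # still keep a small positive weight.
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    return [weight(tokens, idf) for tokens in docs], idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def knn_predict(query, vectors, labels, k=3):
    """Majority vote among the k training vectors most similar to query."""
    ranked = sorted(zip(vectors, labels),
                    key=lambda pair: cosine(query, pair[0]),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical pre-segmented training corpus (output of a word segmenter).
train_docs = [
    ["足球", "比赛", "球员", "进球"],
    ["篮球", "比赛", "球员", "得分"],
    ["球队", "比赛", "冠军", "球员"],
    ["股票", "市场", "上涨", "投资"],
    ["银行", "利率", "投资", "市场"],
    ["基金", "股票", "投资", "收益"],
]
labels = ["sports", "sports", "sports", "finance", "finance", "finance"]

vectors, idf = tfidf_vectors(train_docs)

sports_query = weight(["球员", "比赛", "冠军"], idf)
finance_query = weight(["股票", "投资", "利率"], idf)
print(knn_predict(sports_query, vectors, labels))   # sports
print(knn_predict(finance_query, vectors, labels))  # finance
```

Sparse dict vectors keep the sketch dependency-free. A real corpus would more likely use a feature-selected vocabulary and a library implementation of each classifier.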
(Generated automatically by the system; you can preview the download contents before downloading)
Download file list
Chinese text categorization Study.pdf