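File name: Stemmer
- Category:
- Artificial Intelligence / Neural Networks / Genetic Algorithms
- Resource type:
- [Java] [Source code]
- Upload date:
- 2012-11-26
- File size:
- 4 KB
- Downloads:
- 0
- Provider:
- rong*****
- Related links:
- None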
Description (all downloadable content is collected from the Internet; please study and use it at your own discretion)
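In English, one word is often a variant of another, for example happy => happiness, where happy is called the stem of happiness. In information retrieval systems, a common step during term normalization is stemming: stripping the endings that mark inflected or derived forms of English words.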
The most widely used stemming algorithm based on suffix stripping, and one of moderate complexity, is the Porter stemming algorithm, also known as the Porter Stemmer. See the official website for details. The stemming filters in popular retrieval systems such as Lucene and Whoosh use the Porter stemming algorithm.
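To make the idea of suffix stripping concrete, here is a minimal sketch in Java of just the first rule group (step 1a, plural endings) from Porter's published 1980 algorithm. It is not the contents of Stemmer.java, which this page does not show. The class name SuffixStripSketch and the method name step1a are hypothetical, and the input is assumed to be a single lowercase word.

```java
// Hypothetical illustration class; NOT the Stemmer.java listed on this page.
final class SuffixStripSketch {

    /**
     * Applies step 1a of Porter's 1980 algorithm (plural endings).
     * Assumes the input is already a single lowercase word.
     * Rules are tried longest suffix first; only the first match applies.
     */
    static String step1a(String word) {
        if (word.endsWith("sses")) {
            return word.substring(0, word.length() - 2); // sses -> ss
        }
        if (word.endsWith("ies")) {
            return word.substring(0, word.length() - 2); // ies -> i
        }
        if (word.endsWith("ss")) {
            return word;                                 // ss -> ss (unchanged)
        }
        if (word.endsWith("s")) {
            return word.substring(0, word.length() - 1); // s -> (removed)
        }
        return word;                                     // no plural ending
    }

    public static void main(String[] args) {
        for (String w : new String[] {"caresses", "ponies", "caress", "cats"}) {
            System.out.println(w + " -> " + step1a(w));
        }
    }
}
```

Running main prints caresses -> caress, ponies -> poni, caress -> caress and cats -> cat. The full algorithm continues with further rule groups, many of which are guarded by conditions on the remaining stem (such as its "measure"), so this sketch alone will not reduce happiness to the same stem as happy.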
Downloaded file list
Stemmer.java