File name: adaboost
Description (all downloadable content comes from the Internet; please study and use it at your own discretion)
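AdaBoost is one of the most popular boosting meta-algorithms for combining classifiers. Put simply, it trains the classifiers serially and combines them at the end by a weighted sum.
The procedure is as follows: every training example starts with the same weight, and the weights are normalized to sum to 1. After the first classifier has been trained and has classified the examples,
its weight alpha is computed from its weighted error rate and the weight of every training example is updated. The second classifier is then trained on the reweighted examples in the same way, and so on,
until the training error rate reaches 0 or the specified number of training rounds is reached. The final predicted label is the sum of each classifier's output multiplied by its alpha, passed through sign(predicted value).
As this shows, training is serial and the example weights change in every round: the weights of misclassified examples increase, while the weights of correctly classified examples decrease.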
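The training loop described above can be sketched as follows. This is a minimal illustration, not the contents of adaboost.py: the weak learner (a one-feature threshold "stump"), the function names, and the toy data are all assumptions made for the example. The alpha formula 0.5 * ln((1 - err) / err) is the standard AdaBoost choice, which the description does not spell out.

```python
import math

def stump_predict(stump, x):
    """Classify one example with a stump (feature index, threshold, sign)."""
    j, thresh, sign = stump
    return -sign if x[j] <= thresh else sign

def train_stump(X, y, w):
    """Return the stump with the lowest weighted error under weights w."""
    best, best_err = None, float("inf")
    for j in range(len(X[0])):
        for thresh in sorted({x[j] for x in X}):
            for sign in (1, -1):
                stump = (j, thresh, sign)
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(stump, xi) != yi)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err

def adaboost_train(X, y, rounds=10):
    """Train up to `rounds` stumps serially; labels must be +1 or -1."""
    n = len(y)
    w = [1.0 / n] * n              # equal, normalized starting weights
    scores = [0.0] * n             # running alpha-weighted sum per example
    ensemble = []
    for _ in range(rounds):
        stump, err = train_stump(X, y, w)
        err = max(err, 1e-16)      # avoid division by zero when err == 0
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, stump))
        preds = [stump_predict(stump, xi) for xi in X]
        # Misclassified examples (y * pred == -1) gain weight, the rest lose it.
        w = [wi * math.exp(-alpha * yi * pi) for wi, yi, pi in zip(w, y, preds)]
        total = sum(w)
        w = [wi / total for wi in w]   # renormalize so the weights sum to 1
        scores = [s + alpha * p for s, p in zip(scores, preds)]
        # Stop early once the combined classifier makes no training errors.
        if all((s > 0) == (yi > 0) for s, yi in zip(scores, y)):
            break
    return ensemble

def adaboost_predict(ensemble, x):
    """Sign of the alpha-weighted sum of the stump outputs."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score > 0 else -1

# Hypothetical toy data: five 2-D points with labels +1 / -1.
X_toy = [[1.0, 2.1], [2.0, 1.1], [1.3, 1.0], [1.0, 1.0], [2.0, 1.0]]
y_toy = [1, 1, -1, -1, 1]
model = adaboost_train(X_toy, y_toy, rounds=10)
print([adaboost_predict(model, x) for x in X_toy])
```

Stumps are used here only because they are short to write; any classifier that can be trained against weighted examples could take their place.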
AdaBoost is only one popular boosting method; other ensemble methods include bagging and random forests.
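For imbalanced data sets, the usual remedies are undersampling and oversampling. Undersampling reduces the number of examples (usually from the majority class),
while oversampling repeatedly selects some examples (usually from the minority class). A combination of the two is often the better choice.
When undersampling, one option is to remove examples that lie far from the decision boundary.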
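A rough sketch of random undersampling, random oversampling, and their combination is given below. The function names, the `target` parameter, and the toy labels are hypothetical and chosen only for illustration; the sketch picks examples at random and does not implement the boundary-distance variant.

```python
import random

def undersample(examples, labels, majority_label, n_keep, rng):
    """Keep only n_keep randomly chosen majority-class examples."""
    majority = [i for i, lab in enumerate(labels) if lab == majority_label]
    others = [i for i, lab in enumerate(labels) if lab != majority_label]
    kept = others + rng.sample(majority, n_keep)   # sampling without replacement
    return [examples[i] for i in kept], [labels[i] for i in kept]

def oversample(examples, labels, minority_label, n_extra, rng):
    """Append n_extra minority-class examples drawn with replacement."""
    minority = [i for i, lab in enumerate(labels) if lab == minority_label]
    extra = rng.choices(minority, k=n_extra)       # repeats are allowed
    idx = list(range(len(labels))) + extra
    return [examples[i] for i in idx], [labels[i] for i in idx]

def rebalance(examples, labels, majority_label, minority_label, target, rng):
    """Combine both steps so each of the two classes ends up with `target` examples.

    Assumes a binary problem with minority count <= target <= majority count.
    """
    xs, ys = undersample(examples, labels, majority_label, target, rng)
    n_extra = target - ys.count(minority_label)
    return oversample(xs, ys, minority_label, n_extra, rng)

# Hypothetical imbalanced data: eight examples of class +1, two of class -1.
data = list(range(10))
labs = [1] * 8 + [-1] * 2
balanced_x, balanced_y = rebalance(data, labs, 1, -1, 5, random.Random(0))
print(balanced_y.count(1), balanced_y.count(-1))
```

Passing a seeded `random.Random` instance keeps the resampling reproducible between runs.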
(Generated automatically by the system; you can review the file contents before downloading)
Download file list
adaboost.py
adaboost.readme