HUANG Yong, WEI Le. An SVM Decision Tree Algorithm for Unbalanced Data Sets[J]. Journal of Chengdu University of Information Technology, 2019, (03): 274-277. [doi:10.16836/j.cnki.jcuit.2019.03.012]
一种针对不均衡数据集的SVM决策树算法
- Title:
- An SVM Decision Tree Algorithm for Unbalanced Data Sets
- Article ID:
- 2096-1618(2019)03-0274-04
- Keywords:
- big data; natural language processing; dynamic programming; complete binary tree; support vector machine; text classification; machine learning
- CLC number:
- TP391
- Document code:
- A
- Abstract:
- A new SVM decision tree algorithm is proposed for the unevenly distributed data commonly encountered in text classification. When constructing each classifier node, the algorithm uses dynamic programming to find an allocation that is optimal in both the number of categories and the number of samples. Experimental results show that the proposed method achieves significantly higher accuracy than an SVM classifier based on a complete binary tree.
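The abstract does not give the exact formulation, so the following is only a minimal sketch of one plausible reading of the node-construction step. It assumes that each tree node splits its classes into two groups with an equal number of classes (up to one) and sample totals as close to each other as possible, and that the split is found with a 0-1 knapsack-style dynamic program. The function names `balanced_split` and `build_tree` and the example class counts are hypothetical and do not come from the paper. No SVM is trained here: each internal node of the returned tree only marks where a binary SVM separating the left group from the right group would be placed.

```python
from typing import Dict, List, Tuple, Union

Tree = Union[str, tuple]


def balanced_split(counts: Dict[str, int]) -> Tuple[List[str], List[str]]:
    """Split classes into two groups balanced in class count and sample count.

    Hypothetical helper (an assumption, not the paper's method): choose exactly
    len(counts) // 2 classes whose total sample count is as close as possible
    to half of all samples, using a 0-1 knapsack-style dynamic program.
    """
    labels = sorted(counts)  # fixed order keeps the result deterministic
    n = len(labels)
    if n < 2:
        raise ValueError("need at least two classes to split")
    k = n // 2
    total = sum(counts.values())

    # reach maps a state (classes chosen, samples chosen) to
    # (previous state, index of the class added), or None for the start state.
    reach = {(0, 0): None}
    for i, label in enumerate(labels):
        w = counts[label]
        # Iterate over a snapshot so each class is used at most once.
        for (j, s) in list(reach):
            nxt = (j + 1, s + w)
            if j < k and nxt not in reach:
                reach[nxt] = ((j, s), i)

    # Among states with exactly k classes, take the sample total closest to
    # total / 2; ties are broken in favour of the smaller total.
    best = min(
        (s for (j, s) in reach if j == k),
        key=lambda s: (abs(2 * s - total), s),
    )

    # Walk back through the recorded predecessors to recover the chosen classes.
    left: List[str] = []
    state = (k, best)
    while reach[state] is not None:
        state, i = reach[state]
        left.append(labels[i])
    left.reverse()
    right = [c for c in labels if c not in left]
    return left, right


def build_tree(counts: Dict[str, int]) -> Tree:
    """Recursively split until every leaf holds a single class label."""
    if len(counts) == 1:
        return next(iter(counts))
    left, right = balanced_split(counts)
    return (
        build_tree({c: counts[c] for c in left}),
        build_tree({c: counts[c] for c in right}),
    )


# Hypothetical, deliberately unbalanced class sizes.
example_counts = {"a": 500, "b": 300, "c": 120, "d": 80}
print(balanced_split(example_counts))
print(build_tree(example_counts))
```

With these made-up counts, the top-level split groups the largest class with the smallest one, so each side of the root has two classes and the sample totals differ by 160 rather than by 600, as they would if the two largest classes were grouped together.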
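Memo
Received: 2019-02-24. Funding: Key R&D Project of the Sichuan Science and Technology Program (2017GZ0309); Key Project of the Youth Fund of the Sichuan Provincial Department of Education (16ZA0208)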