LI Bao-lin, ZHOU Kun, LI Shi-wei. Research on Mining Algorithm of Maximal Frequent Itemsets based on M-Bisearch[J]. Journal of Chengdu University of Information Technology, 2016, (05): 463-468.
Research on a Maximal Frequent Itemset Mining Algorithm Based on M-Bisearch
- Title:
- Research on Mining Algorithm of Maximal Frequent Itemsets based on M-Bisearch
- Article ID:
- 2096-1618(2016)05-0463-06
- Keywords:
- machine learning; data mining; association rules; frequent itemsets; maximal frequent itemsets; M-Bisearch
- CLC number:
- TP311
- Document code:
- A
- Abstract:
- Data mining is the theoretical core of big data analysis, and association rule mining is an important branch of data mining. It consists of two steps: generating frequent itemsets and generating association rules. Frequent itemset generation accounts for a large share of the algorithm's cost. Starting from the properties of maximal frequent itemsets, this paper changes the data storage structure and adopts the idea of M-Bisearch, compressing the storage space to reduce the number of scans and the cost of support counting, and thereby improving the efficiency of the algorithm. Experiments show that the improved algorithm has a clear advantage when mining frequent itemsets with medium and long patterns.
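The abstract does not spell out how M-Bisearch works, so the following is not the paper's algorithm. It is a minimal brute-force sketch of the two ideas the abstract names: what makes a frequent itemset maximal (no frequent proper superset), and how a compressed, vertical storage layout can make support counting cheap. All function names and the sample transactions are assumptions chosen for illustration.

```python
from itertools import combinations


def build_vertical_bitmaps(transactions):
    """Map each item to an int whose bit i is set when transaction i contains it."""
    bitmaps = {}
    for tid, items in enumerate(transactions):
        for item in items:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def support(itemset, bitmaps, n_transactions):
    """Count transactions containing every item by AND-ing the item bitmaps."""
    mask = (1 << n_transactions) - 1
    for item in itemset:
        mask &= bitmaps.get(item, 0)
    return bin(mask).count("1")


def maximal_frequent_itemsets(transactions, min_support):
    """Brute force, largest candidates first (illustrative only, exponential cost).

    Because larger candidates are tested before smaller ones, any frequent
    candidate that is not a subset of an already-found maximal set has no
    frequent proper superset, so it is itself maximal.
    """
    bitmaps = build_vertical_bitmaps(transactions)
    n = len(transactions)
    frequent_items = sorted(
        item for item in bitmaps if support([item], bitmaps, n) >= min_support
    )
    maximal = []
    for size in range(len(frequent_items), 0, -1):
        for candidate in combinations(frequent_items, size):
            cset = frozenset(candidate)
            if any(cset <= found for found in maximal):
                continue  # covered by a known maximal frequent itemset
            if support(candidate, bitmaps, n) >= min_support:
                maximal.append(cset)
    return maximal


# Hypothetical sample data: five transactions, absolute minimum support of 2.
sample = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b", "d"},
    {"b", "d"},
    {"c", "e"},
]
result = maximal_frequent_itemsets(sample, min_support=2)
```

With this sample, `{a, b, c}` and `{b, d}` are the only maximal frequent itemsets: every other frequent itemset is a subset of one of them. The database is scanned once to build the bitmaps, after which each support count is a handful of bitwise ANDs, which is the general kind of saving a compressed storage structure is meant to provide.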
Memo
Received: 2016-09-13. Funding: Science and Technology Support Program of the Sichuan Provincial Department of Science and Technology (2013SZ0056)