Finding minimum confidence threshold to avoid derived rules in association rule mining


Nzar Abdulqader Ali

School of Administration and Economy, University of Sulaimani,





Abstract
Data in a data warehouse often contain sensitive information. The concept of privacy
preservation has recently been proposed in response to concerns about protecting
sensitive information that can be derived from published rules, and a number of
privacy-preserving data publishing (PPDP) methods have been proposed. This paper
proposes an algorithm for hiding published rules that lead to the disclosure of
sensitive information. The algorithm determines the confidence values of those rules
from the raw data before association rule mining is run, using the prior and posterior
probabilities of the generated rules, and passes those confidence values to the data
miner to take into account when setting the minimum confidence threshold in
association rule mining algorithms. The experimental results show that the run time
for deriving sensitive rules is stable across different confidence values, in
comparison with other methods that use linear programming to find sensitive published
rules. Most of the rules derived from goal rules (the rules derived from sensitive
rules with the minimum confidence value) have confidence values between 0.5 and 0.8,
so this range is critical for the data miner. Finally, the experimental results show
that derived published rules still appear at support values of 40%, 58%, and 63%,
which means that derived published rules still appear in association rule algorithms
even with a large minimum support threshold.
 

Key Words: Data Mining, Association Rule, Privacy Preserving 
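
The support and confidence measures that the abstract relies on can be illustrated with a minimal Python sketch. It is not the algorithm proposed in this paper. It only shows the standard definitions, under which the confidence of a rule A -> C is the conditional (posterior) probability P(C | A) and the support of C alone is its prior probability P(C). The toy transactions and the names `support`, `confidence`, and `meets_threshold` are assumptions made for illustration.

```python
from typing import FrozenSet, List

Itemset = FrozenSet[str]


def count(itemset: Itemset, transactions: List[Itemset]) -> int:
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def support(itemset: Itemset, transactions: List[Itemset]) -> float:
    """Fraction of transactions containing `itemset` (a prior probability)."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent: Itemset, consequent: Itemset,
               transactions: List[Itemset]) -> float:
    """Estimate P(consequent | antecedent) from raw transaction counts."""
    n_antecedent = count(antecedent, transactions)
    if n_antecedent == 0:
        return 0.0  # the rule never fires, so treat its confidence as zero
    return count(antecedent | consequent, transactions) / n_antecedent


def meets_threshold(antecedent: Itemset, consequent: Itemset,
                    transactions: List[Itemset], min_conf: float) -> bool:
    """True if the rule would be published at the given minimum confidence."""
    return confidence(antecedent, consequent, transactions) >= min_conf


# Hypothetical toy data, not taken from the paper's experiments.
transactions = [frozenset(t) for t in (
    {"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"},
)]

a, b = frozenset({"a"}), frozenset({"b"})
print(support(b, transactions))        # prior P(b)
print(confidence(a, b, transactions))  # posterior P(b | a)
print(meets_threshold(a, b, transactions, min_conf=0.8))
```

In this toy data the item a occurs in four transactions and a together with b in three, so the rule a -> b has confidence 3/4 = 0.75. It is published at a minimum confidence threshold of 0.7 but not at 0.8, which is the kind of threshold sensitivity the abstract describes.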


