User menu

Optimal constraint-based decision tree induction from itemset lattices

Bibliographic reference Nijssen, Siegfried ; Fromont, Elisa. Optimal constraint-based decision tree induction from itemset lattices. In: Data Mining and Knowledge Discovery, Vol. 21, no.1, p. 9-51 (2010)
Permanent URL http://hdl.handle.net/2078.1/186620
  1. Agrawal R, Srikant R (1994) Fast algorithms for mining association rules in large databases. In: Bocca JB, Jarke M, Zaniolo C (eds) VLDB’94, proceedings of 20th international conference on very large data bases. Morgan Kaufmann, pp 487–499
  2. Agrawal R, Mannila H, Srikant R, Toivonen H, Verkamo AI (1996) Fast discovery of association rules. In: Fayyad UM, Piatetsky-Shapiro G, Smyth P, Uthurusamy R (eds) Advances in knowledge discovery and data mining. AAAI/MIT Press, pp 307–328
  3. Angelopoulos N, Cussens J (2005) Exploiting informative priors for Bayesian clas-sification and regression trees. In: Kaelbling LP, Saffiotti A (eds) IJCAI-05, proceedings of the 19th international joint conference on artificial intelligence. Professional Book Center, pp 641–646
  4. Asuncion A, Newman D (2007) UCI machine learning repository. University of California, Irvine, School of Information and Computer Sciences. http://www.ics.uci.edu/~mlearn/MLRepository.html
  5. Baixeries J, Szathmary L, Valtchev P, Godin R (2009) Yet a faster algorithm for building the Hasse diagram of a concept lattice. In: Ferre S, Rudolph S (eds) ICFCA’09, proceedings of the 7th international conference on formal concept analysis. Springer, pp 162–177
  6. Bayardo R, Goethals B, Zaki MJ (eds) (2004) FIMI ’04, proceedings ofthe IEEE ICDM workshop on frequent itemset mining implementations. CEUR-WS.org
  7. Blanchard G, Schafer C, Rozenholc Y, Müller KR (2007) Optimal dyadic decision trees. Mach Learn 66(2–3): 209–241
  8. Bonchi F, Lucchese C (2007) Extending the state-of-the-art of constraint-based pattern discovery. Data Knowl Eng 60(2): 377–399
  9. Boulicaut J-F, Bykowski A, Rigotti C (2000) Approximation of frequency queries by means of free-sets. In: Zighed DA, Komorowski HJ, Zytkow JM (eds) PKDD’00, proceedings ofthe 4th European conference on principles of data mining and knowledge discovery. Springer, pp 75–85
  10. Boulicaut J-F, Bykowski A, Rigotti C (2003) Free-sets: a condensed representation of boolean data for the approximation of frequency queries. Data Mining Knowl Discov 7(1): 5–22
  11. Breiman L, Friedman JH, Olshen RA, Stone CJ (1984) Classification and regression trees. Wadsworth Publishing Company, Belmont, California, USA
  12. Bringmann B, Nijssen S, Zimmermann A (2009) Pattern-based classification: a unifying perspective. In: Knobbe A, Furnkranz J (eds) LeGo ’09, proceedings of the ECML PKDD 2009 workshop ‘From Local Patterns to Global Models’
  13. Bucila C, Gehrke J, Kifer D, White WM (2003) DualMiner: a dual-pruning algorithm for itemsets with constraints. Data Mining Knowl Discov 7(3): 241–272
  14. Buntine W (1992) Learning classification trees. Stat Comput 2: 63–73
  15. Chipman HA, George EI, McCulloch RE (1998) Bayesian CART model search. J Am Stat Assoc 93(443): 935–947
  16. De Raedt L, Zimmermann A (2007) Constraint-based pattern set mining. In: Parthasarathy S, Liu B (eds) SDM’07, proceedings of the seventh SIAM international conference on data mining. SIAM Press, pp 1–12
  17. Dong G, Zhang X, Wong L, Li J (1999) CAEP: classification by aggregating emerging patterns. In: Arikawa S, Furukawa K (eds) DS’99, proceedings of the second international conference on discovery science. Springer, pp 30–42
  18. Esmeir S, Markovitch S (2007a) Anytime induction of cost-sensitive trees. In: Koller D, Schuurmans D, Bengio Y, Bottou L (eds) NIPS’07, proceedings of the 21st conference on neural information processing systems. MIT Press, pp 425–132
  19. Esmeir S, Markovitch S (2007b) Anytime learning of decision trees. J Machine Learn Res 8: 891–933
  20. Friedman A, Schuster A, Wolff R (2006) k-anonymous decision tree induction. In: Furnkranz J, Scheffer T, Spiliopoulou M (eds) PKDD’06, proceedings ofthe 10th European conference on principles and practice of knowledge discovery in databases. Springer, pp 151–162
  21. Ganter Bernhard, Wille Rudolf, Formal Concept Analysis, ISBN:9783540627715, 10.1007/978-3-642-59830-2
  22. Garey MR (1972) Optimal binary identification procedures. SIAM J Appl Math 23(2): 173–186
  23. Garofalakis MN, Hyun D, Rastogi R, Shim K (2003) Building decision trees with constraints. Data Min Knowl Discov 7(2): 187–214
  24. Han J, Pei J, Yin Y (2000) Mining frequent patterns without candidate generation. In: Chen W, Naughton J, Bernstein PA (eds) SIGMOD’00, proceedings ofthe 2000 ACM SIGMOD international conference on management of Data. ACM Press, pp 1–12
  25. Hyafil L, Rivest RL (1976) Constructing optimal binary decision trees is NP-complete. Inf Process Lett 5(1): 15–17
  26. Imielinski T, Mannila H (1996) A database perspective on knowledge discovery. Commun ACM 39: 58–64
  27. Knobbe A, Ho E (2006) Maximally informative k-itemsets and their efficient discovery. In: Eliassi-Rad T, Ungar LH, Craven M, Gunopulos D (eds) KDD’06, proceeding ofthe 12th ACM SIGKDD international conference on knowledge discovery and data mining. ACM Press, pp 237–244
  28. Knobbe A, Cremilleux B, Furnkranz J, Scholz M (2008) From local patterns to global models: the LeGo approach to data mining. In: Furnkranz J, Knobbe A (eds) LeGo’08, proceedings ofthe ECML PKDD 2008 workshop ‘From Local Pat-terns to Global Models’, pp 1–16
  29. Lew A (1978) Optimal conversion of extended-entry decision tables with general cost criteria. Commun ACM 21(4): 269–279
  30. Li W, Han J, Pei J (2001) CMAR: accurate and efficient classification based on multiple class-association rules. In: Cercone N, Lin TY, Wu X (eds) ICDM’01, proceedings of the 2001 IEEE international conference on data mining. IEEE Computer Society, pp 369–376
  31. Liu B, Hsu W, Ma Y (1998) Integrating classification and association rule mining. In: Agrawal R, Stolorz PE, Piatetsky-Shapiro G (eds) KDD’98, proceedings ofthe 4th international conference on knowledge discovery and data mining. AAAI Press, pp 80–86
  32. Machanavajjhala A, Kifer D, Gehrke J, Venkitasubramaniam M (2007) l-diversity: privacy beyond k-anonymity. ACM Trans Knowl Discov Data 1(1): 3
  33. Mannila H, Toivonen H, Verkamo AI (1994) Efficient algorithms for discovering association rules. In: Fayyad UM, Uthurusamy R (eds) KDD’94, proceedings ofthe AAAI workshop on knowledge discovery in databases. AAAI Press, pp 181–192
  34. Meisel WS, Michalopoulos D (1973) A partitioning algorithm with application in pattern classification and the optimization of decision tree. IEEE Trans Comput C-22:93–103
  35. Moore A, Lee MS (1998) Cached sufficient statistics for efficient machine learning with large datasets. J Artif Intell Res 8: 67–91
  36. Murphy PM, Pazzani MJ (1997) Exploring the decision forest: an empirical in-vestigation of Occam’s razor in decision tree induction. In: Greiner R, Petsche T, Jose S (eds) Computational learning theory and natural learning systems vol IV: making learning systems practical. MIT Press, pp 171–187
  37. Nadeau C, Bengio Y (2003) Inference for the generalization error. Machine Learn 52(3): 239–281
  38. Pasquier N, Bastide Y, Taouil R, Lakhal L (1999) Efficient mining of association rules using closed itemset lattices. Inf Syst 24(1): 25–46
  39. Payne HJ, Meisel WS (1977) An algorithm for constructing optimal binary decision trees. IEEE Trans Comput 26(9): 905–916
  40. Pei J, Han J, Lakshmanan LVS (2001) Mining frequent item sets with convertible constraints. In: ICDE’01, proceedings of the IEEE international conference on data engineering. IEEE Computer Society, pp 433–142
  41. Quinlan JR (1993) C4.5: programs for machine learning. Morgan Kaufmann, San Francisco
  42. Samarati P (2001) Protecting respondents’ identities in microdata release. IEEE Trans Knowl Data Eng 13(6): 1010–1027
  43. Schumacher H, Sevcik KC (1976) The synthetic approach to decision table conversion. Commun ACM 19(6): 343–351
  44. Sweeney L (2002) k-anonymity: a model for protecting privacy. Int J Uncertainty, Fuzziness Knowledge-Based Syst 10(5): 557–570
  45. Turney P (1995) Cost-sensitive classification: empirical evaluation of a hybrid genetic decision tree induction algorithm. J Artif Intell Res 2: 369–409
  46. Uno T, Kiyomi M, Arimura H (2004) LCM ver. 2: efficient mining algorithms for frequent/closed/maximal itemsets. In: Bayardo R, Goethals B, Zaki MJ (eds) FIMI ’04, proceedings of the IEEE ICDM workshop on frequent itemset mining implementations. CEUR-WS.org
  47. Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco
  48. Yan X, Cheng H, Han J, Xin D (2005) Summarizing itemset patterns: a profile-based approach. In: Grossman R, Bayardo RJ, Bennett KP (eds) KDD’05, proceedings of the 11th ACM SIGKDD international conference on knowledge discovery and data mining. ACM Press, pp 314–323
  49. Zaki MJ, Parthasarathy S, Ogihara M, Li W (1997a) New algorithms for fast discovery of association rules. In: Heckerman D, Mannila H, Pregibon D (eds) KDD’97, proceedings of the 3rd international conference on knowledge discovery and data mining. AAAI Press, pp 283–286
  50. Zaki MJ, Parthasarathy S, Ogihara M, Li W (1997b) Parallel algorithms for discovery of association rules. Data Mining Knowl Discov 1(4): 343–373