Doquire, Gauthier
[UCL]
Verleysen, Michel
[UCL]
In many real-world situations, data cannot be assumed to be precise. Uncertain data are often encountered, due for example to the imprecision of measurement devices or to continuously moving objects whose exact position is impossible to obtain. One way to model this uncertainty is to represent each data value as a probability distribution function; recent work shows that adequately taking the uncertainty into account generally improves classification performance. Working with such a representation, this paper proposes a feature selection approach based on mutual information. Experiments on 8 UCI data sets show that the proposed approach is effective at selecting relevant features.
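The idea summarised above can be illustrated with a minimal sketch. It is not the estimator used in the paper, which the abstract does not specify. It assumes each uncertain value is a Gaussian given by a mean and a standard deviation, draws samples from those distributions, and scores each feature with a simple histogram (plug-in) estimate of its mutual information with the class label. The function names `mutual_information` and `rank_features_uncertain`, the Gaussian model, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def mutual_information(x, y, bins=10):
    """Histogram (plug-in) estimate of I(X; Y) in nats, continuous x, discrete y."""
    edges = np.histogram_bin_edges(x, bins=bins)
    x_idx = np.digitize(x, edges[1:-1])            # bin index in 0..bins-1
    classes, y_idx = np.unique(y, return_inverse=True)
    joint = np.zeros((bins, len(classes)))
    np.add.at(joint, (x_idx, y_idx), 1.0)          # joint counts
    joint /= joint.sum()                           # joint probabilities
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0                                 # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def rank_features_uncertain(means, stds, labels, n_draws=50, bins=10, seed=0):
    """Rank features by MI with the label, sampling each uncertain value.

    Assumption: value (i, j) follows a Gaussian N(means[i, j], stds[i, j]**2).
    """
    rng = np.random.default_rng(seed)
    n, d = means.shape
    draws = rng.normal(means, stds, size=(n_draws, n, d))  # sampled realisations
    y_rep = np.tile(labels, n_draws)                       # labels aligned with draws
    scores = np.array([
        mutual_information(draws[:, :, j].ravel(), y_rep, bins) for j in range(d)
    ])
    return np.argsort(scores)[::-1], scores

# Synthetic example: feature 0 depends on the class, feature 1 is pure noise.
rng = np.random.default_rng(42)
labels = rng.integers(0, 2, size=200)
means = np.column_stack([3.0 * labels + rng.normal(size=200), rng.normal(size=200)])
stds = np.full(means.shape, 0.5)                   # assumed measurement uncertainty
order, scores = rank_features_uncertain(means, stds, labels)
```

Here `order` lists feature indices from most to least relevant under this score. A histogram estimator is used only because it is short; other estimators could be substituted without changing the ranking logic.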
Bibliographic reference
Doquire, Gauthier ; Verleysen, Michel. Feature selection with mutual information for uncertain data. 13th International Conference on Data Warehousing and Knowledge Discovery (DaWaK 2011) (Toulouse, France, from 29/08/2011 to 02/09/2011). In: Lecture Notes in Computer Science, Vol. 6862, p. 330-341 (2011)
Permanent URL
http://hdl.handle.net/2078.1/116589