Verleysen, Michel
[UCL]
François, Damien
[UCL]
Modern data analysis tools have to work on high-dimensional data whose components are not independently distributed. High-dimensional spaces show surprising, counter-intuitive geometrical properties that strongly influence the performance of data analysis tools. Among these properties, the concentration of the norm phenomenon makes Euclidean norms and Gaussian kernels, both commonly used in models, inappropriate in high-dimensional spaces. This paper presents alternative distance measures and kernels, together with geometrical methods to decrease the dimension of the space. The methodology is applied to a typical time series prediction example.
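The concentration of the norm mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function name `relative_contrast`, the sample size, the seed, and the choice of i.i.d. uniform components on [0, 1) are all assumptions made for illustration (the abstract itself concerns data whose components are not independent). The sketch draws random points and reports how far apart the largest and smallest Euclidean norms are, relative to the smallest.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (max - min) / min of the Euclidean norms of random points.

    Assumption for illustration only: each component is drawn
    independently and uniformly from [0, 1).
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    norms = [
        math.sqrt(sum(rng.random() ** 2 for _ in range(dim)))
        for _ in range(n_points)
    ]
    return (max(norms) - min(norms)) / min(norms)


# The relative contrast shrinks as the dimension grows, which is one
# way to observe norms concentrating around their mean.
for dim in (2, 20, 200, 2000):
    print(dim, round(relative_contrast(dim), 3))
```

Under these assumptions the printed contrast should drop sharply between the lowest and highest dimensions, meaning that all points end up at nearly the same distance from the origin and the Euclidean norm discriminates poorly between them.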
Bibliographic reference
Verleysen, Michel ; François, Damien. The curse of dimensionality in data mining and time series prediction. 8th International Work-Conference on Artificial Neural Networks (IWANN 2005) (Barcelona (Spain), from 08/06/2005 to 10/06/2005). In: Lecture Notes in Computer Science, Vol. 3512, p. 758-770 (2005)
Permanent URL
http://hdl.handle.net/2078.1/60793