Proceedings Abstracts of the Twenty-Fifth International Joint Conference on Artificial Intelligence

Clustering Financial Time Series: How Long Is Enough? / 2583
Gautier Marti, Sébastien Andler, Frank Nielsen, Philippe Donnat

Researchers have used anywhere from 30 days to several years of daily returns as source data for clustering financial time series based on their correlations. This paper sets up a statistical framework to study the validity of such practices. We first show that clustering correlated random variables from their observed values is statistically consistent. We then give a first empirical answer to the much-debated question: How long should the time series be? If too short, the clusters found can be spurious; if too long, dynamics can be smoothed out.
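
The kind of experiment this question suggests can be sketched in a few lines of Python. The sketch below is illustrative only and is not the authors' framework: it assumes a simple one-factor-per-cluster Gaussian model for the returns, the distance sqrt(2(1 - correlation)), and average-linkage hierarchical clustering, and all function names and parameter values (simulate_returns, cluster_by_correlation, same_partition, rho = 0.6, three clusters of five series) are hypothetical choices made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def simulate_returns(n_obs, n_clusters=3, per_cluster=5, rho=0.6, seed=0):
    """Simulate returns with a known block correlation structure.

    Assumed toy model: each series loads on one cluster-specific factor,
    so series in the same cluster have correlation rho and series in
    different clusters are uncorrelated.
    """
    rng = np.random.default_rng(seed)
    labels = np.repeat(np.arange(n_clusters), per_cluster)
    factors = rng.standard_normal((n_obs, n_clusters))
    noise = rng.standard_normal((n_obs, n_clusters * per_cluster))
    returns = np.sqrt(rho) * factors[:, labels] + np.sqrt(1.0 - rho) * noise
    return returns, labels


def cluster_by_correlation(returns, n_clusters):
    """Cluster the columns of `returns` from their sample correlations."""
    corr = np.corrcoef(returns, rowvar=False)
    # Map correlations to distances; clip guards against rounding error.
    dist = np.sqrt(np.clip(2.0 * (1.0 - corr), 0.0, None))
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def same_partition(a, b):
    """True if two label vectors define the same partition (up to renaming)."""
    a, b = np.asarray(a), np.asarray(b)
    return np.array_equal(a[:, None] == a[None, :], b[:, None] == b[None, :])


if __name__ == "__main__":
    # Check, for several sample lengths, whether the true clusters are recovered.
    for n_obs in (10, 30, 100, 500, 2000):
        returns, truth = simulate_returns(n_obs)
        found = cluster_by_correlation(returns, n_clusters=3)
        print(n_obs, same_partition(found, truth))
```

Repeating the loop over many seeds and recording the recovery rate for each length gives a rough empirical picture of how many observations such a toy model needs. Real returns are neither Gaussian nor stationary, so any threshold read off this sketch should not be taken as the paper's answer.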