Toeplitz Inverse Covariance-based Clustering of Multivariate Time Series Data

David Hallac, Sagar Vare, Stephen Boyd, Jure Leskovec

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Best Sister Conferences. Pages 5254-5258. https://doi.org/10.24963/ijcai.2018/732

Subsequence clustering of multivariate time series is a useful tool for discovering repeated patterns in temporal data. Once these patterns have been discovered, seemingly complicated datasets can be interpreted as a temporal sequence of only a small number of states, or clusters. However, discovering these patterns is challenging because it requires simultaneous segmentation and clustering of the time series. Here we propose a new method of model-based clustering, which we call Toeplitz Inverse Covariance-based Clustering (TICC). Each cluster in the TICC method is defined by a correlation network, or Markov random field (MRF), that characterizes the interdependencies between different observations in a typical subsequence of that cluster. Based on this graphical representation, TICC simultaneously segments and clusters the time series data. We solve the TICC problem with a scalable algorithm that efficiently handles tens of millions of observations. We validate our approach by comparing TICC to several state-of-the-art baselines in a series of synthetic experiments, and we then demonstrate on an automobile dataset how TICC can be used to learn interpretable clusters in real-world scenarios.
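
As a rough illustration of what "simultaneous segmentation and clustering" can look like in practice, the sketch below shows one possible assignment step in Python. It is not taken from the abstract and makes several assumptions: each cluster is modeled as a Gaussian whose mean and inverse covariance are already given, subsequences are formed by stacking a window of w consecutive observations, and a hypothetical penalty beta is charged each time the cluster label changes between adjacent time steps. The function names (stack_windows, neg_log_likelihood, assign_clusters) are illustrative only, and estimating the Toeplitz-structured inverse covariances themselves is not shown.

```python
import numpy as np

def stack_windows(X, w):
    """Stack each observation with its w-1 predecessors: (T, n) -> (T-w+1, n*w)."""
    T, n = X.shape
    return np.stack([X[t - w + 1 : t + 1].ravel() for t in range(w - 1, T)])

def neg_log_likelihood(S, mu, theta):
    """Gaussian negative log-likelihood (up to an additive constant) of each
    row of S under mean mu and inverse covariance theta."""
    D = S - mu
    quad = np.einsum("ti,ij,tj->t", D, theta, D)
    _, logdet = np.linalg.slogdet(theta)
    return 0.5 * (quad - logdet)

def assign_clusters(cost, beta):
    """Choose labels k_t minimizing sum_t cost[t, k_t] + beta * (#label switches)
    by dynamic programming. beta >= 0 is an assumed smoothness penalty."""
    T, K = cost.shape
    total = cost[0].copy()
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        best_prev = int(np.argmin(total))
        switch = total[best_prev] + beta   # cheapest way to arrive from another label
        back[t] = np.where(total <= switch, np.arange(K), best_prev)
        total = np.minimum(total, switch) + cost[t]
    # Trace the optimal label sequence backwards from the final time step.
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmin(total))
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path
```

With one cost column per cluster (for example, built by calling neg_log_likelihood once per cluster on the stacked windows), a larger beta yields longer contiguous segments, while beta = 0 reduces to assigning each window independently to its most likely cluster.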
Keywords:
Machine Learning: Time-series; Data Streams
Machine Learning: Unsupervised Learning