Clustering time series for automatic similarity measurement selection of Database

M Thurai Pandian; P Damodharan; K R Bhavya; Sanjay Singh; K Anitha; Ankur Kumar Aggarwal

Authors

M Thurai Pandian School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India
P Damodharan Department of Computer Engineering, Marwadi University, Rajkot, Gujarat, India
K R Bhavya School of Computing and Information Technology, REVA University, Bangalore, Karnataka, India
Sanjay Singh Department of Computer Science and Technology, Manav Rachna University, Faridabad, Haryana, India
K Anitha School of Computing and Information Technology, REVA University, Bangalore, Karnataka, India
Ankur Kumar Aggarwal Department of Computer Science and Technology, Manav Rachna University, Faridabad, Haryana, India

Keywords:

Multi-label classification framework, SOM Clustering, K-Means, Clustering, Time Series database

Abstract

Clustering has become a popular task associated with time series. The choice of a suitable distance measure is crucial to the clustering process and, given the vast number of distance measures for time series available in the literature and their diverse characteristics, this choice is not straightforward. With the aim of simplifying this task, we propose a multi-label classification framework that provides the means to automatically select the most suitable distance measures for clustering a time series database. This classifier is based on a novel collection of characteristics that describe the main features of time series databases and provide the predictive information necessary to discriminate between a set of distance measures. To test the validity of this classifier, we conduct a complete set of experiments using both synthetic and real time series databases and a set of 5 common distance measures. The positive results obtained by the designed classification framework for various performance measures show that the proposed methodology is useful for simplifying the process of distance selection in time series clustering tasks.
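
The multi-label framing in the abstract can be illustrated with a small sketch. It assumes each time series database has already been summarised as a numeric vector of characteristics and labelled with the set of distance measures that clustered it well. The two-dimensional feature vectors, the label names ("euclidean", "dtw", "edr"), and the k-nearest-neighbour voting rule are all illustrative assumptions; the abstract does not name the five distance measures, the characteristics, or the classifier actually used.

```python
import math


def predict_labels(train, query, k=3):
    """Multi-label k-NN sketch: return every label carried by a strict
    majority of the k training databases nearest to `query`.

    `train` is a list of (feature_vector, label_set) pairs. Both the
    features and the labels are hypothetical placeholders.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = {}
    for _, labels in nearest:
        for label in labels:
            votes[label] = votes.get(label, 0) + 1
    return {label for label, count in votes.items() if count > k / 2}


# Made-up training set: one (characteristics, suitable distances) pair per database.
TRAIN = [
    ((0.10, 0.90), {"euclidean"}),
    ((0.20, 0.80), {"euclidean", "dtw"}),
    ((0.15, 0.85), {"euclidean"}),
    ((0.90, 0.10), {"dtw"}),
    ((0.80, 0.20), {"dtw", "edr"}),
]

if __name__ == "__main__":
    # Characteristics of a new, unlabelled database (also made up).
    print(predict_labels(TRAIN, (0.12, 0.88)))
```

Because the output is a set rather than a single class, such a predictor can recommend several distance measures at once, or none, for a given database. That is the property that separates a multi-label classifier from an ordinary single-label one.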

