Abstract
Classifying the citation trajectories of scientific publications is crucial. However, citations diffuse anomalously owing to non-linearity, non-stationarity, and long-range correlations. Previous studies rely on hard thresholds, arbitrary parameters, and subjective rules to classify trajectories by their rise-and-fall patterns, which leads to substantial variance and, thus, ambiguous classification. This paper proposes CiteDEK, a hybrid classification framework that combines empirical mode decomposition (EMD), k-nearest neighbours (kNN), and dynamic time warping (DTW). It predicts the nature of 5,039 trajectories, each 30 years long, using only the raw time series. We obtain a classification accuracy of ≈ 76% and a Cohen’s kappa statistic of 0.63, which is significant.
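To make the kNN-DTW component concrete, the following is a minimal sketch of nearest-neighbour classification under a dynamic time warping distance. It is an illustration under stated assumptions, not the CiteDEK implementation: the EMD step is omitted, the function names `dtw_distance` and `knn_dtw_predict` are hypothetical, and the toy series and class names are invented for the example.

```python
from collections import Counter
from math import inf


def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] also matched b[j-1]'s successor path
                cost[i][j - 1],      # b[j-1] repeated against a[i-1]
                cost[i - 1][j - 1],  # one-to-one match
            )
    return cost[n][m]


def knn_dtw_predict(train_series, train_labels, query, k=1):
    """Label `query` by majority vote among its k DTW-nearest training series."""
    ranked = sorted(
        (dtw_distance(series, query), label)
        for series, label in zip(train_series, train_labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy yearly citation counts (invented for illustration, not data from the paper).
train = [
    [0, 2, 8, 15, 9, 4, 2, 1],   # early peak, then decay
    [0, 0, 1, 1, 2, 6, 12, 20],  # delayed rise
]
labels = ["early-peak", "delayed-rise"]  # hypothetical class names

query = [0, 1, 6, 14, 10, 5, 2, 1]
predicted = knn_dtw_predict(train, labels, query, k=1)
```

Because DTW aligns series elastically along the time axis, two trajectories with the same shape but slightly shifted peaks can still be close neighbours, which a pointwise Euclidean distance would penalise.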