Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories

Article Type:
Research/Original Article (accredited rank)
Abstract:
In recent years, the rapid growth of spatial trajectory data, together with the need to process it and extract useful information and meaningful patterns, has drawn many researchers to spatio-temporal trajectory clustering. Analyzing these trajectories yields information that addresses challenges in real-world applications such as traffic management and control, smart transportation, surveillance, security, and biological studies. Clustering is one of the most important methods for extracting trajectory patterns, reducing data volume, discovering outliers, indexing, and simple visualization. So far, various similarity functions and clustering algorithms have been proposed for trajectory clustering. Because these algorithms differ and produce different results, their strengths and weaknesses need attention. Some clustering algorithms are effective only on small datasets. Some can extract only convex clusters, whereas others extract clusters of any shape. Several clustering algorithms require the user to set initial values, such as the number of clusters, while others need no initial input. In addition, not all clustering algorithms can detect outliers. In this study, spatial trajectory clustering algorithms that are extended from point clustering algorithms are divided into four general categories: partitioning-based, hierarchical, optimization-based, and density-based clustering. The most commonly used algorithms in each category are then implemented and evaluated. The evaluation is performed on two datasets (cross and i5) of differing complexity.
The effect of noise and outliers, one of the most critical factors in clustering quality, is also considered in this study. The Silhouette index and computational time are used as the two criteria for comparison and evaluation. According to the results, choosing a proper clustering method requires considering the data, its features, and the distance function used. In general, however, optimization-based clustering gives the best clustering quality. Integrating a genetic algorithm into K-means improved the results on both datasets and with both distance functions. Using the genetic algorithm in K-means helps find the optimal locations of the cluster centers and mitigates the local minimum problem. High computational time is a notable weakness of optimization-based clustering. In clustering quality, partitioning-based, hierarchical, and density-based clustering rank second, third, and fourth, respectively, after optimization-based clustering. In computational time, the best results come from density-based clustering, followed by hierarchical, partitioning-based, and optimization-based clustering, in that order. Some methods, such as K-means (a sub-category of partitioning-based clustering), are highly sensitive to outliers, while the spectral sub-category of partitioning-based clustering is highly resistant to them. Moreover, the density-based and optimization-based clustering methods have the highest tolerance to noise.
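As a rough illustration of how the Silhouette index can score a trajectory clustering, the sketch below computes it from a pairwise distance matrix. It is not the authors' implementation: the abstract does not name the two distance functions used, so `trajectory_distance` (mean point-to-point Euclidean distance between equal-length trajectories) is an assumption made only for this example, and the four toy trajectories stand in for the cross and i5 datasets.

```python
import math


def trajectory_distance(a, b):
    """Toy distance (an assumption, not from the study): mean
    point-to-point Euclidean distance between equal-length trajectories."""
    if len(a) != len(b):
        raise ValueError("this toy distance needs trajectories of equal length")
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def silhouette_index(dist, labels):
    """Mean silhouette value over all items, given a full pairwise
    distance matrix and one cluster label per item."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("the Silhouette index needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the item's own cluster
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(labels)


# Four toy trajectories: two near-identical horizontal, two near-identical vertical.
trajectories = [
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    [(0.0, 0.1), (1.0, 0.1), (2.0, 0.1)],
    [(0.0, 5.0), (0.0, 6.0), (0.0, 7.0)],
    [(0.1, 5.0), (0.1, 6.0), (0.1, 7.0)],
]
dist_matrix = [[trajectory_distance(s, t) for t in trajectories] for s in trajectories]

good_score = silhouette_index(dist_matrix, [0, 0, 1, 1])  # groups similar trajectories
bad_score = silhouette_index(dist_matrix, [0, 1, 0, 1])   # splits similar trajectories
```

Values close to 1 indicate compact, well-separated clusters, while negative values suggest items assigned to the wrong cluster, so `good_score` should come out well above `bad_score` here. Because the index depends only on the distance matrix, a different trajectory distance function can be swapped in without changing `silhouette_index`.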
Language:
Persian
Published:
Journal of Geomatics Science and Technology, Volume:8 Issue: 4, 2019
Pages:
135 to 149
https://www.magiran.com/p2002804  
Authors:
  • Ali Abbaspour, Rahim
    Corresponding Author (2)
    Associate Professor, School of Surveying Engineering and Geospatial Information, University of Tehran, Tehran, Iran
  • Chehreghan, Alireza
    Author (3)
    Associate Professor, GIS, Sahand University of Technology, Tabriz, Iran