Abstract


Clustering in continuous vector data spaces is a well-studied problem. In recent years, a significant amount of research has addressed clustering of categorical data. However, most of this work deals with market-basket type transaction data and is not specifically optimized for high-dimensional vectors. Our focus in this paper is to efficiently cluster high-dimensional vectors in non-ordered discrete data spaces (NDDS). We define several necessary geometrical concepts in NDDS, which form the basis of our clustering algorithm. We also employ several new heuristics that exploit the characteristics of vectors in NDDS. Experimental results on large synthetic datasets demonstrate that the proposed approach is effective in terms of cluster quality, robustness, and running time. We have also applied our clustering algorithm to real datasets, with promising results.
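To make the setting concrete, the following is a minimal sketch of two generic building blocks often used with vectors in a non-ordered discrete space: a Hamming distance and a per-dimension mode used as a cluster representative. This is an illustration under stated assumptions, not the algorithm or the geometrical concepts of the paper, which the abstract does not specify; the function names `hamming_distance` and `mode_vector` are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence, Tuple

Vector = Sequence[Hashable]


def hamming_distance(a: Vector, b: Vector) -> int:
    """Count the dimensions in which two equal-length discrete vectors differ.

    Assumption: values have no ordering, so only equality is meaningful.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(x != y for x, y in zip(a, b))


def mode_vector(vectors: Sequence[Vector]) -> Tuple[Hashable, ...]:
    """Return the most frequent value in each dimension (a k-modes style
    representative; assumed here for illustration, not taken from the paper)."""
    if not vectors:
        raise ValueError("need at least one vector")
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*vectors))


# Example with short genome-like strings over the alphabet {A, C, G, T}.
cluster = ["ACGT", "ACGA", "TCGA"]
representative = mode_vector(cluster)
distances = [hamming_distance(v, representative) for v in cluster]
```

Because the values are unordered, a mean is undefined; the per-dimension mode plays the role a centroid would play in a continuous space.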
