The Concept of Ontology for Numerical Data Clustering

Peter Grabusts

Abstract


Classical clustering algorithms have been studied quite thoroughly; they are used to group numerical data into similar structures, or clusters. Similar objects are placed in the same cluster, and dissimilar objects in different clusters. All classical clustering algorithms share common parameters, and choosing them well determines the clustering results. The most important clustering parameters are the following: the clustering algorithm, the metric, the initial number of clusters, and the clustering validation criteria. In recent years there has been a strong trend toward extracting rules from clusters. Classical clustering algorithms do not use semantic knowledge, which makes the clustering results difficult to interpret. The possibilities for using ontologies are currently growing rapidly, making it possible to obtain knowledge about a specific data model. This work analyzes the ontology concept and the development of a prototype for numerical data clustering that includes the most important characteristics of clustering performance.

Keywords


cluster analysis; clustering; ontology


DOI: http://dx.doi.org/10.17770/etr2013vol2.848
