Deriving the Partial Order of Documents to Extend Clustering Applications

Authors

  • Raja AGL Department of Computer Science and Applications, SCSVMV University, Kanchipuram, Tamil Nadu, India
  • Francis FS Department of Computer Science and Engineering, Pondicherry Engineering College, Puducherry, India
  • Sugumar P Department of Computer Applications, Sacred Heart College (Autonomous), Tirupattur, Tamil Nadu, India

DOI:

https://doi.org/10.26438/ijcse/v7i1.424430

Keywords:

Clustering, Partial Ordering, Classification, Categorization, Indexing

Abstract

The exponential growth of text documents on the internet has made systematic document organization necessary. It is widely accepted that document clustering has improved the information retrieval process to a great extent. All text clustering algorithms aim to form appropriate clusters of text documents, and their accuracy is measured in terms of cluster cohesion and separation. Keeping to the basic principle of clustering, which is to maximize cohesion within clusters and maximize separation between them, the algorithms deploy different strategies to generate better-quality clusters. A detailed literature survey shows that classification, categorization, plagiarism detection and clustering are correlated: all of these text mining tasks are performed by indexing, searching or relating the key terms present in the documents. Moreover, all of these text mining methods focus on establishing the similarity or difference among text documents, and perform their intended tasks on that basis. Hence, they tend to limit the application of clustering to complementing the information retrieval task. This paper presents an algorithm to establish a partial order among text documents and thus to extend the applications of clustering.
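
As an illustration of what a partial order among documents can look like, the following minimal Python sketch orders documents by strict inclusion of their term sets. This is an assumption made for illustration only; the abstract does not describe the authors' algorithm, and the function names (term_set, partial_order, hasse_edges), the naive whitespace tokenizer and the sample documents are all hypothetical.

```python
from itertools import permutations


def term_set(text):
    """Lowercased set of whitespace-separated terms (a deliberately naive tokenizer)."""
    return frozenset(text.lower().split())


def partial_order(docs):
    """Return pairs (a, b) where the terms of document a are a proper subset of those of b.

    Proper-subset inclusion is irreflexive and transitive, so the result is a
    strict partial order; documents with unrelated term sets stay incomparable.
    """
    terms = {name: term_set(text) for name, text in docs.items()}
    return {(a, b) for a, b in permutations(terms, 2) if terms[a] < terms[b]}


def hasse_edges(order):
    """Drop pairs implied by transitivity, keeping only the covering relations."""
    nodes = {x for pair in order for x in pair}
    return {
        (a, b)
        for a, b in order
        if not any((a, c) in order and (c, b) in order for c in nodes)
    }


# Hypothetical sample documents, reduced to a few key terms each.
docs = {
    "d1": "clustering",
    "d2": "clustering text",
    "d3": "clustering text documents",
    "d4": "indexing",
}

order = partial_order(docs)
edges = hasse_edges(order)
print(sorted(order))
print(sorted(edges))
```

Under this assumed relation, d1, d2 and d3 form a chain, while d4 is comparable to none of them. A similarity measure alone would not expose that distinction, since it assigns a score to every pair of documents without ordering them.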

Published

2019-01-31

How to Cite

A. G. L. Raja, F. S. Francis, and P. Sugumar, “Deriving the Partial Order of Documents to Extend Clustering Applications”, Int. J. Comp. Sci. Eng., vol. 7, no. 1, pp. 424–430, Jan. 2019.

Section

Research Article