Deriving the Partial Order of Documents to Extend Clustering Applications

Authors

Raja AGL Department of Computer Science and Applications, SCSVMV University, Kanchipuram, Tamil Nadu, India
Francis FS Department of Computer Science and Engineering, Pondicherry Engineering College, Puducherry, India
Sugumar P Department of Computer Applications, Sacred Heart College (Autonomous), Tirupattur, Tamil Nadu, India

DOI:

https://doi.org/10.26438/ijcse/v7i1.424430

Keywords:

Clustering, Partial Ordering, Classification, Categorization, Indexing

Abstract

The exponential growth of text documents on the internet has created a need for systematic document organization. It is widely accepted that document clustering has improved the information retrieval process to a great extent. Essentially, all text clustering algorithms aim to form appropriate clusters of text documents, and their accuracy is measured by cluster cohesion and separation. Following the basic principle of clustering, which is to maximize both cohesion within clusters and separation between clusters, the algorithms deploy different strategies to generate better-quality clusters. A detailed literature survey shows that classification, categorization, plagiarism detection and clustering are correlated. All these text mining tasks are performed by indexing, searching or relating the key terms present in the documents. Moreover, all these text mining methods focus on establishing the similarity or difference among text documents, and perform their intended tasks on that basis. Hence, they tend to limit the application of clustering to complementing the information retrieval task. This paper presents an algorithm to establish a partial order among text documents and thus extend the applications of clustering.
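To illustrate what a partial order among text documents can look like, the following is a minimal sketch and not the algorithm proposed in the paper, which the abstract does not specify. It assumes, purely for illustration, that document A precedes document B when the set of terms in A is a proper subset of the set of terms in B. The function names `term_set`, `partial_order` and `hasse_edges`, the whitespace tokenizer and the sample documents are all hypothetical.

```python
from itertools import permutations


def term_set(text):
    """Lowercased set of whitespace-separated tokens (a deliberately naive tokenizer)."""
    return frozenset(text.lower().split())


def partial_order(docs):
    """Return every pair (a, b) where a's term set is a proper subset of b's.

    Proper-subset inclusion is irreflexive and transitive, so the result is a
    strict partial order. This relation is an assumption made for illustration.
    """
    terms = {name: term_set(text) for name, text in docs.items()}
    return {(a, b) for a, b in permutations(terms, 2) if terms[a] < terms[b]}


def hasse_edges(order):
    """Drop pairs implied by transitivity, leaving only the covering relation."""
    nodes = {x for pair in order for x in pair}
    return {
        (a, b)
        for (a, b) in order
        if not any((a, c) in order and (c, b) in order for c in nodes)
    }


# Hypothetical sample documents.
docs = {
    "d1": "clustering",
    "d2": "clustering text",
    "d3": "clustering text documents",
    "d4": "indexing",
}

order = partial_order(docs)
edges = hasse_edges(order)
```

In this sample, `d1`, `d2` and `d3` form a chain, while `d4` shares no inclusion relation with the others and so stays incomparable. That incomparability is what distinguishes a partial order from a total ranking.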

Published

2019-01-31

How to Cite

[1]

A. G. L. Raja, F. S. Francis, and P. Sugumar, “Deriving the Partial Order of Documents to Extend Clustering Applications”, Int. J. Comp. Sci. Eng., vol. 7, no. 1, pp. 424–430, Jan. 2019.

Issue

Vol. 7 No. 1 (2019): IJCSE January Edition

Section

Research Article

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
