Combine Approach of CADS and USHER Interfaces for Document Annotation
Keywords:
Annotation, attribute value, USHER, data quality, form design, CADS
Abstract
Organizations generate large volumes of data in textual form, and in such data the structured information is overshadowed by unstructured text. Many algorithms extract information from raw data, but they are costly, inefficient, and produce noisy results. Data quality is a further major issue. Existing systems use annotation to support query search and suggest attributes, which makes querying feasible, but annotation schemes based on attribute-value pairs require users to be more principled in their annotation efforts and to have a good idea of how to use and apply the annotations. We propose a new technique that combines the Collaborative Adaptive Data Sharing platform (CADS) with USHER for attribute suggestion and data quality improvement. In our approach, we first generate a CADS form and then evaluate components of real-world data sets using USHER. The technique shows superior results compared to the current approach: it improves document visibility and data quality at minimum cost.
References
Eduardo J. Ruiz, Vagelis Hristidis, and Panagiotis G. Ipeirotis, “Facilitating Document Annotation using Content and Querying Value,” IEEE Transactions on Knowledge and Data Engineering, Vol. 26, No. 2, February 2014.
S.R. Jeffery, M.J. Franklin, and A.Y. Halevy, “Pay-as-You-Go User Feedback for Dataspace Systems,” Proc. ACM SIGMOD Int’l Conf. Management Data, June 2008.
K. Chen, H. Chen, N. Conway, J.M. Hellerstein, and T.S. Parikh, “Usher: Improving Data Quality with Dynamic Forms,” IEEE Transactions on Knowledge and Data Engineering, Vol. 23, No. 8, August 2011.
M. Jayapandian and H.V. Jagadish, “Automated Creation of a Forms-Based Database Query Interface,” Proc. VLDB Endowment, Vol. 1, 2008, pp. 695-709.
M. Jayapandian and H. Jagadish, “Expressive Query Specification through Form Customization,” Proc. 11th Int’l Conf. Extending Database Technology: Advances in Database Technology (EDBT ’08), 2008, pp. 416-427.
M. Miah, G. Das, V. Hristidis, and H. Mannila, “Standing out in a Crowd: Selecting Attributes for Maximum Visibility,” Proc. Int’l Conf. Data Eng. (ICDE), 2008.
G. Tsoumakas and I. Vlahavas, “Random K-Labelsets: An Ensemble Method for Multilabel Classification,” Proc. 18th European Conf. Machine Learning (ECML ’07), 2007, pp. 406-417.
K. Saleem, S. Luis, Y. Deng, S.-C. Chen, V. Hristidis, and T. Li, “Towards a Business Continuity Information Network for Rapid Disaster Recovery,” Proc. Int’l Conf. Digital Govt. Research (dg.o ’08), 2008.
M.J. Cafarella, J. Madhavan, and A. Halevy, “Web-Scale Extraction of Structured Data,” SIGMOD Record, Vol. 37, March 2009, pp. 55-61.
J. Madhavan et al., “Web-Scale Data Integration: You Can Only Afford to Pay as You Go,” Proc. Third Biennial Conf. Innovative Data Systems Research (CIDR), 2007.
O. Etzioni, M. Banko, S. Soderland, and D.S. Weld, “Open Information Extraction from the Web,” Comm. ACM, Vol. 51, Dec. 2008, pp. 68-74.
“Google,” Google Base, 2011.
Microsoft, Microsoft SharePoint, 2012.
SAP, SAP Content Manager, 2011.
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
