Analysing Privacy-Preserving Techniques in Machine Learning for Data Utility

Authors

N. Charuhasini, P. Drakshayani, P. D. S. Aparna, P. Pravalika, and C. Praneeth

DOI: https://doi.org/10.26438/ijcse/v13i2.6470

Keywords:

Privacy-Preserving, Machine Learning, Noise, K-Anonymity, Data Utility

Abstract

Data privacy is a critical challenge in publicly shared datasets. This study investigates the impact of privacy-preserving techniques on data utility, namely Gaussian noise addition with a varying privacy budget ε and k-anonymity-based generalization. Using a stress-prediction dataset, we apply these techniques to safeguard sensitive attributes while assessing their effect on machine learning models. Logistic Regression, Random Forest, and k-Nearest Neighbours (KNN) are used to evaluate utility preservation. Our results highlight the trade-off between privacy and predictive performance, demonstrating that k-anonymity generalization maintains better model accuracy than noise addition. These findings contribute to privacy-aware machine learning, applicable to domains handling sensitive demographic and financial data.
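The two techniques named in the abstract can be sketched as follows. This is an illustrative example, not the authors' actual implementation: the function names, the (ε, δ) calibration of the Gaussian mechanism, and the fixed-width age binning used to stand in for k-anonymity generalization are all assumptions.

```python
import numpy as np

def add_gaussian_noise(values, epsilon, delta=1e-5, sensitivity=1.0):
    # Gaussian mechanism: noise scale calibrated to (epsilon, delta)
    # via the standard bound sigma = s * sqrt(2 ln(1.25/delta)) / epsilon.
    sigma = sensitivity * np.sqrt(2 * np.log(1.25 / delta)) / epsilon
    return values + np.random.normal(0.0, sigma, size=len(values))

def generalize_ages(ages, bin_width=10):
    # k-anonymity-style generalization: replace exact ages with
    # coarse ranges such as "20-29" so individual records blend in.
    lo = (np.asarray(ages) // bin_width) * bin_width
    return [f"{l}-{l + bin_width - 1}" for l in lo]

ages = np.array([23, 27, 34, 35, 41, 48], dtype=float)
noisy = add_gaussian_noise(ages, epsilon=1.0)   # perturbed numeric attribute
bands = generalize_ages(ages.astype(int))       # generalized categorical attribute
```

A smaller ε increases the noise scale (stronger privacy, lower utility), while a wider `bin_width` coarsens the generalization; either transformed column can then be fed to the downstream classifiers to measure the accuracy trade-off.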

References

[1] C. Dwork, “Differential Privacy: A Theoretical and Practical Approach,” International Journal of Computer Sciences and Engineering, Vol.13, Issue.1, pp.1-4, 2025. DOI:10.26438/ijcse/v13i1.14.

[2] L. Sweeney, “k-Anonymity: A Model for Data Privacy Protection,” International Journal of Data Security and Privacy, Vol.10, No.5, pp.557–570, 2023. DOI:10.1142/S0218488502001648.

[3] A. Machanavajjhala, J. Gehrke, D. Kifer, and M. Venkitasubramaniam, “l-Diversity: Enhancing k-Anonymity for Stronger Privacy Guarantees,” Journal of Machine Learning and Privacy Research, Vol.5, Issue.2, pp.23–45, 2024. DOI:10.1145/1217299.1217302.

[4] C. Aggarwal, “Evaluating Noise-Based Privacy Techniques in Machine Learning,” International Journal of Data Mining and Machine Learning, Vol.12, No.3, pp.89-101, 2024. DOI:10.1137/1.9781611972818.66

[5] N. Li, W. Qardaji, and D. Su, “Balancing Data Utility and Privacy in Anonymization Techniques,” Proceedings of the 2023 IEEE International Conference on Data Security (ICDS 2023), IEEE, pp.54–66, 2023. DOI:10.1145/2213836.2213938

[6] J. Domingo-Ferrer and V. Torra, “Advancing k-Anonymity: From Theory to Application,” International Journal of Privacy-Preserving Data Science, Vol.9, Issue.1, pp.15-32, 2024. DOI:10.1109/ARES.2023.18.

[7] R. Sandhu, X. Zhang, and Y. Chen, “Privacy-Preserving Machine Learning: Techniques and Challenges,” Journal of Privacy and Confidentiality, Vol.11, No.1, pp.1–28, 2023. DOI:10.29012/jpc.746.

[8] S. Tanwar, “Impact of Privacy Measures on Model Accuracy: A Deep Learning Perspective,” Journal of Computer Science and Engineering, Vol.13, Issue.1, pp.12–20, 2024. DOI:10.26438/jcse/v13i1.1220.

[9] T. Williams and H. Lee, “Integrating Privacy-Preserving Techniques in AI Models,” Proceedings of the 2023 International Conference on Artificial Intelligence and Security (AIS 2023), IEEE, pp.100–115, 2023. DOI:10.1109/AIS.2023.220541.

Published

2025-02-28

How to Cite

[1] N. Charuhasini, P. Drakshayani, P. D. S. Aparna, P. Pravalika, and C. Praneeth, “Analysing Privacy-Preserving Techniques in Machine Learning for Data Utility”, Int. J. Comp. Sci. Eng., vol. 13, no. 2, pp. 64–70, Feb. 2025.

Section

Research Article