Volume 22, Issue 3 (12-2025) | JSDP 2025, 22(3): 35-58



Mosayebi E, Ebrahimi Atani R. A Novel Privacy-Preserving Distributed Data Publishing Protocol Based on Probabilistic Models. JSDP 2025; 22 (3) : 3
URL: http://jsdp.rcisp.ac.ir/article-1-1467-en.html
Associate Professor, Department of Computer Engineering, University of Guilan, Rasht, Iran
Abstract:

In the era of digital transformation, government agencies and corporations increasingly rely on electronic services, generating vast volumes of sensitive data stored in distributed databases. While these records hold immense potential for knowledge discovery through data mining, publishing or sharing them raises critical privacy concerns, particularly when sensitive individual information is at risk. Traditional Privacy-Preserving Distributed Data Publishing (PPDDP) methods rely heavily on Trusted Third-Party (TTP) intermediaries and Secure Multi-Party Computation (SMC), which introduce systemic vulnerabilities such as communication bottlenecks, synchronization failures, insider attacks, and inherent distrust in centralized entities.

In healthcare analytics, hospitals leverage patient data to enhance diagnostic precision, optimize clinical workflows, and advance preventive and precision medicine. Yet reliance on siloed datasets from individual institutions often restricts model generalizability and impedes comprehensive insight into health outcomes. Patient health is a multidimensional construct influenced not only by genetic and biological factors but also by behavioral patterns and socio-environmental determinants. Cross-institutional collaboration that integrates diverse datasets from geographically distributed sources is therefore essential for developing robust analytical models. However, such collaboration raises its own privacy concerns, because centralized aggregation of sensitive data risks exposure to breaches or misuse.

This paper introduces a novel probabilistic framework for privacy-preserving distributed data publishing that directly addresses this challenge. By eliminating dependencies on TTP and SMC, the approach enables secure, decentralized integration of heterogeneous healthcare data. Unlike existing approaches, the method leverages uncertainty-aware probabilistic models to dynamically anonymize and perturb data across distributed nodes while preserving global data utility. Through uncertainty-aware probabilistic anonymization and adaptive noise injection, the framework ensures compliance with stringent privacy regulations (e.g., GDPR, CPRA, HIPAA) while preserving the analytical utility required for accurate, actionable health outcome predictions. This balance of utility and privacy empowers researchers to harness the full potential of distributed datasets without compromising individual confidentiality, ultimately fostering innovation in precision medicine and population health management.

The paper first surveys privacy-preserving data publishing methods and discusses the pros and cons of each technique. It then presents the proposed model and its implementation details. The results of the security evaluation show that the presented method, by using a probabilistic model, achieves a better balance between privacy and the accuracy of distributed data, without requiring a Trusted Third Party or Secure Multi-Party Computation.
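The abstract does not spell out the protocol itself, so the following is only a minimal sketch of the general idea of node-local noise injection, not the authors' method. It assumes each node publishes a histogram of one categorical attribute and perturbs it with the standard Laplace mechanism before release, so that no trusted third party ever sees raw counts. The function names, the `epsilon` parameter, and the example categories are all hypothetical.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def publish_local_histogram(records, categories, epsilon, rng):
    """Return one node's noisy count per category (illustrative only).

    Assumes neighbouring datasets differ by adding or removing one record,
    so each count has sensitivity 1 and the noise scale is 1 / epsilon.
    """
    counts = Counter(records)
    scale = 1.0 / epsilon
    return {c: counts.get(c, 0) + laplace_noise(scale, rng) for c in categories}


def merge_histograms(published):
    """Sum already-perturbed histograms; raw data never leaves a node."""
    merged = {}
    for hist in published:
        for category, value in hist.items():
            merged[category] = merged.get(category, 0.0) + value
    return merged


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed only to make the demo repeatable
    categories = ["flu", "asthma", "diabetes"]  # hypothetical attribute values
    node_a = publish_local_histogram(["flu", "flu", "asthma"], categories, 1.0, rng)
    node_b = publish_local_histogram(["diabetes", "asthma"], categories, 1.0, rng)
    print(merge_histograms([node_a, node_b]))
```

A smaller `epsilon` adds more noise (stronger privacy, lower accuracy), which is the kind of privacy and utility trade-off the abstract refers to. How the paper's framework chooses or adapts the noise is not stated here.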

Article number: 3
Type of Study: Research | Subject: Paper
Received: 2025/04/8 | Accepted: 2025/07/21 | Published: 2025/12/19 | ePublished: 2025/12/19



Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.