RDT: A New Data Replication Algorithm for Hierarchical Data Grid
Keywords:
Distributed systems, Data grid, Data replication, Dynamic Threshold, OptorSim
Abstract
Grid computing is a form of distributed computing that gives access to computational resources shared by different organizations, combining them into a single powerful virtual computer. The grid is now an essential technology for many kinds of high-performance applications, and its use is expected to grow as technology progresses. Data replication is a common method in distributed environments for improving ease of data access and providing high data availability, fault tolerance, and reliability, which is why it is used for data management in data grid systems. Because data files are very large and grid storage is limited, replicas must be managed carefully to use that storage effectively. In this paper, a novel data replication strategy called Replication with Dynamic Threshold (RDT) is proposed. RDT uses a new threshold to determine the number of appropriate sites for replication, where appropriate sites are those with a higher number of accesses to that particular replica from other sites. It also minimizes access latency by selecting the best replica when several sites hold replicas. Simulation results with OptorSim, the European Data Grid simulator, show that the RDT strategy performs better than the other algorithms and prevents the unnecessary creation of replicas, which leads to efficient storage usage.
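The two ideas named in the abstract, choosing replication sites with a threshold on access counts and reading from the best available replica, can be illustrated with a minimal sketch. This is not the RDT algorithm itself: the abstract does not give the threshold formula or the replica-selection criterion, so the sketch assumes the threshold is the mean access count and that the best replica is the one with the lowest estimated transfer time (file size divided by bandwidth). The function names, site names, and numbers are all hypothetical.

```python
from statistics import mean


def select_replication_sites(access_counts):
    """Return sites whose access count for a file exceeds a threshold.

    ASSUMPTION: the threshold is the mean access count over all
    requesting sites. The paper's actual dynamic threshold is not
    given in the abstract.
    """
    if not access_counts:
        return []
    threshold = mean(access_counts.values())
    chosen = [site for site, count in access_counts.items() if count > threshold]
    # Most frequently accessing sites first.
    return sorted(chosen, key=lambda site: access_counts[site], reverse=True)


def select_best_replica(file_size_mb, bandwidth_mb_per_s):
    """Return the replica-holding site with the lowest estimated transfer time.

    ASSUMPTION: transfer time is approximated as size / bandwidth.
    """
    return min(
        bandwidth_mb_per_s,
        key=lambda site: file_size_mb / bandwidth_mb_per_s[site],
    )


# Hypothetical access counts for one file, per requesting site.
counts = {"site_a": 10, "site_b": 2, "site_c": 6, "site_d": 2}
sites = select_replication_sites(counts)

# Hypothetical bandwidth from the requester to each replica holder.
best = select_best_replica(1000, {"site_x": 100, "site_y": 250})
```

With these inputs the mean access count is 5, so only `site_a` and `site_c` qualify for a replica, and `site_y` is chosen as the source because its estimated transfer time is lower. Filtering by a threshold rather than replicating to every requester is what limits the number of replicas created.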
License
Copyright (c) 2025 Dayyani S, Khayyambashi MR

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
