RDT: A New Data Replication Algorithm for Hierarchical Data Grid

Authors

  • S. Dayyani, Department of Computer Engineering, Sheikh Bahaei University, Iran
  • M. R. Khayyambashi, Department of Computer Engineering, University of Isfahan, Iran

Keywords

Distributed systems, Data grid, Data replication, Dynamic threshold, OptorSim

Abstract

Grid computing is a form of distributed computing that gives access to computational resources shared by different organizations, so that together they act as one integrated, powerful virtual computer. The grid is now regarded as an essential technology for many kinds of high-performance applications, and its use is expected to grow as technology progresses. Data replication is a common method in distributed environments for easing data access and providing high data availability, fault tolerance and reliability, which is why it is used for data management in data grid systems. Because data files are very large and grid storage is limited, replicas must be managed carefully to use storage effectively. In this paper, a novel data replication strategy called Replication with Dynamic Threshold (RDT) is proposed; it uses a new threshold to determine the number of appropriate sites for replication. Appropriate sites are those that receive a higher number of accesses to that particular replica from other sites. RDT also minimizes access latency by selecting the best replica when several sites hold replicas. Simulation results obtained with OptorSim, the European Data Grid simulator, show that RDT performs better than the other algorithms compared and prevents unnecessary replica creation, which leads to efficient storage usage.
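
The two ideas named in the abstract, threshold-based site selection and best-replica selection, can be illustrated with a minimal Python sketch. This is not the RDT algorithm itself: the abstract does not state the threshold formula or the replica selection criterion, so the sketch assumes the threshold is the mean access count across sites and that the best replica is the one with the lowest estimated transfer time (file size divided by bandwidth). The function names, parameters and sample figures are hypothetical.

```python
from statistics import mean


def select_replication_sites(access_counts):
    """Return the sites whose access count for a file exceeds a threshold.

    ASSUMPTION: the threshold is the mean access count over all sites.
    The paper's actual dynamic threshold is not given in the abstract.
    """
    if not access_counts:
        return []
    threshold = mean(access_counts.values())
    chosen = [site for site, count in access_counts.items() if count > threshold]
    # Most-accessed sites first; ties are broken by site name.
    return sorted(chosen, key=lambda site: (-access_counts[site], site))


def select_best_replica(file_size_mb, bandwidth_mbps):
    """Return the replica holder with the lowest estimated transfer time.

    ASSUMPTION: transfer time is approximated as size / bandwidth, with
    bandwidth_mbps mapping each holder to a positive bandwidth in Mbit/s.
    """
    if not bandwidth_mbps:
        return None
    return min(
        bandwidth_mbps,
        key=lambda site: (file_size_mb * 8 / bandwidth_mbps[site], site),
    )


# Hypothetical access counts for one file, keyed by site name.
sites = select_replication_sites({"A": 10, "B": 2, "C": 6, "D": 2})

# Hypothetical bandwidths (Mbit/s) from the requester to each replica holder.
best = select_best_replica(1000, {"A": 100, "B": 400, "C": 250})
```

With these sample figures the mean access count is 5, so only sites A and C qualify for a replica, and site B is chosen as the replica source because it has the highest bandwidth. Because the threshold is recomputed from the current access counts, the number of selected sites changes as the access pattern changes, which is the sense in which a threshold of this kind is dynamic.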

Published

2015-07-30

How to Cite

S. Dayyani and M. R. Khayyambashi, “RDT: A New Data Replication Algorithm for Hierarchical Data Grid”, Int. J. Comp. Sci. Eng., vol. 3, no. 7, pp. 186–197, Jul. 2015.

Section

Research Article