QOS-AWARE DATA REPLICATION FOR DATA-INTENSIVE APPLICATIONS IN CLOUD COMPUTING SYSTEMS
ABSTRACT:
Cloud computing provides scalable computing and storage resources. More and more
data-intensive applications are developed in this computing environment. Different
applications have different quality-of-service (QoS) requirements. To
continuously support the QoS requirement of an application after data
corruption, we propose two QoS-aware data replication (QADR) algorithms in cloud
computing systems. The first algorithm adopts the intuitive idea of high-QoS
first-replication (HQFR) to perform data replication. However, this greedy
algorithm cannot minimize the data replication cost and the number of
QoS-violated data replicas. To achieve these two minimum objectives, the second
algorithm transforms the QADR problem into the well-known minimum-cost
maximum-flow (MCMF) problem. By applying the existing MCMF algorithm to solve
the QADR problem, the second algorithm can produce the optimal solution to the
QADR problem in polynomial time, but it takes more computational time than the
first algorithm. Moreover, it is known that a cloud computing system usually
has a large number of nodes. We also propose node combination techniques to
reduce the possibly large data replication time. Finally, simulation
experiments are performed to demonstrate the effectiveness of the proposed algorithms
in the data replication and recovery.
EXISTING SYSTEM:
Because a cloud computing system has a large number of nodes, the probability
of hardware failures is nontrivial, as statistical analyses of hardware
failures show. Some hardware failures damage the disk data of nodes. As a
result, running data-intensive applications may fail to read data from disks.
To tolerate such data corruption, the data replication technique is
extensively adopted in cloud computing systems to provide high data
availability. The nodes of a cloud computing system are also heterogeneous. For
example, Amazon EC2 is a real heterogeneous cloud platform that provides
various infrastructure resource types to meet different user needs for
computing and storage resources. Note that the QoS requirement of an
application is defined from the aspect of the request information. For example,
in a content distribution system, the response time of a data object access is
defined as the QoS requirement of an application.
DISADVANTAGES OF EXISTING SYSTEM:
• The QoS requirement of an application is not taken into account in the data
replication. When data corruption occurs, the QoS requirement of the
application cannot be supported continuously.
• The data of a high-QoS application may be replicated in a low-performance
node (a node with slow communication and disk access latencies). Later, if data
corruption occurs in the node running the high-QoS application, the data of the
application will be retrieved from the low-performance node.
• Since the low-performance node has slow communication and disk access
latencies, the QoS requirement of the high-QoS application may be violated.
PROPOSED SYSTEM:
We propose the QoS-aware data replication (QADR) problem for data-intensive
applications in cloud computing systems. The QADR problem concerns how to
efficiently take the QoS requirements of applications into account during data
replication. Doing so can significantly reduce the probability that data
corruption occurs before data replication is complete. Because a storage node
has limited replication space, the data replicas of some applications may be
stored in lower-performance nodes. As a result, some data replicas cannot meet
the QoS requirements of their corresponding applications. These data replicas
are called QoS-violated data replicas. The number of QoS-violated data replicas
should be as small as possible.
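To make the notion of a QoS-violated data replica concrete, the following minimal sketch counts them for a given placement. It assumes, purely for illustration, that the QoS requirement is a maximum acceptable access time in milliseconds and that each node has a known access time. The names `Replica`, `qos_limit_ms`, and `node_access_ms` are hypothetical and do not come from the original work.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    app: str               # application that owns the replicated data
    qos_limit_ms: int      # assumed QoS requirement: max acceptable access time
    node_access_ms: int    # assumed access time of the node storing the replica


def is_qos_violated(replica: Replica) -> bool:
    """A replica violates QoS when its node is slower than the app allows."""
    return replica.node_access_ms > replica.qos_limit_ms


def count_qos_violated(replicas) -> int:
    """Number of replicas stored on nodes that cannot meet the app's QoS."""
    return sum(1 for r in replicas if is_qos_violated(r))


replicas = [
    Replica("video", qos_limit_ms=10, node_access_ms=8),     # meets QoS
    Replica("video", qos_limit_ms=10, node_access_ms=25),    # violated
    Replica("backup", qos_limit_ms=100, node_access_ms=25),  # meets QoS
]
violated = count_qos_violated(replicas)
```

In this example only the second replica is QoS-violated: the same 25 ms node is acceptable for the lenient `backup` application but not for `video`.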
To solve the QADR problem, we first propose a greedy algorithm, called the
high-QoS first-replication (HQFR) algorithm. In this algorithm, an application
with a higher QoS requirement takes precedence over other applications in
performing data replication. However, the HQFR algorithm cannot guarantee that
the data replication cost and the number of QoS-violated data replicas are
minimized. In principle, the optimal solution of the QADR problem can be
obtained by formulating the problem as an integer linear program (ILP), but
solving the ILP formulation is computationally expensive. To find the optimal
solution of the QADR problem efficiently, we propose a new algorithm in which
the QADR problem is transformed into the minimum-cost maximum-flow (MCMF)
problem.
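The greedy high-QoS-first idea can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes one replica per application, treats a smaller access-time limit as a higher QoS requirement, and uses access time as the replication cost. The function name `hqfr` and its parameters are hypothetical.

```python
def hqfr(requests, free_slots, access_ms):
    """Greedy high-QoS-first placement (illustrative sketch).

    requests:   list of (app, qos_limit_ms); a smaller limit is a higher QoS
    free_slots: dict node -> number of replicas the node can still store
    access_ms:  dict (app, node) -> access time, also used as the cost
    Returns dict app -> (node, violated).
    """
    free = dict(free_slots)  # work on a copy so the caller's dict is untouched
    placement = {}
    # Serve the tightest QoS requirement first.
    for app, limit in sorted(requests, key=lambda r: r[1]):
        candidates = [n for n, slots in free.items() if slots > 0]
        if not candidates:
            placement[app] = (None, True)  # no space left anywhere
            continue
        qualified = [n for n in candidates if access_ms[(app, n)] <= limit]
        # Prefer nodes that satisfy QoS; otherwise fall back to any free node.
        pool = qualified or candidates
        node = min(pool, key=lambda n: access_ms[(app, n)])
        free[node] -= 1
        placement[app] = (node, not qualified)
    return placement


requests = [("A", 10), ("B", 20)]
free_slots = {"n1": 1, "n2": 1}
access_ms = {("A", "n1"): 5, ("A", "n2"): 8,
             ("B", "n1"): 15, ("B", "n2"): 30}
greedy = hqfr(requests, free_slots, access_ms)
```

Here application A takes `n1` because it is the cheapest qualified node, which leaves B with only `n2` and a QoS violation. Placing A on `n2` and B on `n1` would violate nothing, which illustrates why the greedy approach is not optimal.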
Once the QADR problem is expressed as an MCMF problem, an existing MCMF
algorithm is used to solve it optimally in polynomial time. Compared to the
HQFR algorithm, this optimal algorithm takes more computational time.
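The flow-based approach can be illustrated with a small self-contained sketch. The graph construction below is an assumption made for illustration, since the text does not give the actual transformation: source to application edges have capacity 1, node to sink edges carry the node's free slots, and an application-to-node edge costs the access time plus a large penalty when the node violates the application's QoS. The solver is a textbook successive-shortest-path MCMF; all names are hypothetical.

```python
from collections import deque


class MinCostMaxFlow:
    def __init__(self, n):
        self.n = n
        self.graph = [[] for _ in range(n)]  # per-vertex lists of edge ids
        self.to, self.cap, self.cost = [], [], []

    def add_edge(self, u, v, cap, cost):
        # Forward edge gets an even id; its residual twin is id ^ 1.
        for a, b, c, w in ((u, v, cap, cost), (v, u, 0, -cost)):
            self.graph[a].append(len(self.to))
            self.to.append(b)
            self.cap.append(c)
            self.cost.append(w)

    def solve(self, s, t):
        flow = total = 0
        while True:
            # Shortest path by cost in the residual graph (queue-based
            # Bellman-Ford, which tolerates negative residual costs).
            dist = [float("inf")] * self.n
            prev_edge = [-1] * self.n
            in_queue = [False] * self.n
            dist[s] = 0
            queue = deque([s])
            while queue:
                u = queue.popleft()
                in_queue[u] = False
                for e in self.graph[u]:
                    v = self.to[e]
                    if self.cap[e] > 0 and dist[u] + self.cost[e] < dist[v]:
                        dist[v] = dist[u] + self.cost[e]
                        prev_edge[v] = e
                        if not in_queue[v]:
                            in_queue[v] = True
                            queue.append(v)
            if prev_edge[t] == -1:
                return flow, total  # sink unreachable: flow is maximum
            push, v = float("inf"), t
            while v != s:  # bottleneck capacity along the path
                push = min(push, self.cap[prev_edge[v]])
                v = self.to[prev_edge[v] ^ 1]
            v = t
            while v != s:  # augment along the path
                e = prev_edge[v]
                self.cap[e] -= push
                self.cap[e ^ 1] += push
                v = self.to[e ^ 1]
            flow += push
            total += push * dist[t]


def solve_qadr(requests, free_slots, access_ms, penalty=10**6):
    """penalty must exceed the sum of all access times (assumption)."""
    nodes, limits = list(free_slots), dict(requests)
    s, t = 0, 1 + len(requests) + len(nodes)
    net, edge_id = MinCostMaxFlow(t + 1), {}
    for i, (app, limit) in enumerate(requests):
        net.add_edge(s, 1 + i, 1, 0)
        for j, node in enumerate(nodes):
            ms = access_ms[(app, node)]
            edge_id[(app, node)] = len(net.to)
            net.add_edge(1 + i, 1 + len(requests) + j, 1,
                         ms + (penalty if ms > limit else 0))
    for j, node in enumerate(nodes):
        net.add_edge(1 + len(requests) + j, t, free_slots[node], 0)
    net.solve(s, t)
    # A saturated application-to-node edge means the replica is placed there.
    return {app: (node, access_ms[(app, node)] > limits[app])
            for (app, node), e in edge_id.items() if net.cap[e] == 0}


requests = [("A", 10), ("B", 20)]
free_slots = {"n1": 1, "n2": 1}
access_ms = {("A", "n1"): 5, ("A", "n2"): 8,
             ("B", "n1"): 15, ("B", "n2"): 30}
optimal = solve_qadr(requests, free_slots, access_ms)
```

On this input the flow solution places A on `n2` and B on `n1`, so no replica is QoS-violated. The penalty encoding makes the solver minimize violations first and access cost second; it is one possible design, chosen here for brevity.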
ADVANTAGES OF PROPOSED SYSTEM:
• While minimizing the data replication cost, the data replication can be
completed quickly.
• We use node combination techniques to keep the computational time of solving
the QADR problem from growing linearly as the number of nodes increases.
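The text does not say how nodes are combined. One plausible reading, assumed here only for illustration, is that nodes which look equivalent for replication purposes (for example, nodes in the same rack) are merged into one combined node whose capacity is the sum of its members, so the solver sees fewer vertices. The function `combine_nodes` and the rack labels are hypothetical.

```python
from collections import defaultdict


def combine_nodes(nodes):
    """Merge nodes that share a rack into one combined node (assumed rule).

    nodes: list of (name, rack, free_slots)
    Returns dict rack -> {"members": [names], "free_slots": total slots}.
    """
    combined = defaultdict(lambda: {"members": [], "free_slots": 0})
    for name, rack, free in nodes:
        combined[rack]["members"].append(name)
        combined[rack]["free_slots"] += free
    return dict(combined)


nodes = [
    ("n1", "rack-a", 2),
    ("n2", "rack-a", 1),
    ("n3", "rack-b", 4),
    ("n4", "rack-b", 0),
]
groups = combine_nodes(nodes)
```

Four storage nodes collapse into two combined nodes, so a placement algorithm run on `groups` handles half as many candidates. A replica assigned to a combined node would then be mapped back to any member with free space.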
SYSTEM ARCHITECTURE:
SYSTEM CONFIGURATION:-
HARDWARE REQUIREMENTS:-
• Processor - Pentium IV
• Speed - 1.1 GHz
• RAM - 512 MB (min)
• Hard Disk - 40 GB
• Keyboard - Standard Windows Keyboard
• Mouse - Two or Three Button Mouse
• Monitor - LCD/LED
SOFTWARE REQUIREMENTS:
• Operating system : Windows XP
• Coding Language : C# .NET
• Database : SQL Server 2005
• Tool : Visual Studio 2008
REFERENCE:
Jenn-Wei Lin, Chien-Hung Chen, and J. Morris Chang, “QoS-Aware Data Replication
for Data-Intensive Applications in Cloud Computing Systems,” IEEE Transactions
on Cloud Computing, vol. 1, no. 1, June 2013.