BESTPEER++:
A PEER-TO-PEER BASED LARGE-SCALE DATA PROCESSING PLATFORM
ABSTRACT:
The corporate network is
often used for sharing information among the participating companies and
facilitating collaboration in a certain industry sector where companies share a
common interest. It can effectively help the companies reduce their operational
costs and increase revenues. However, inter-company data sharing and
processing pose unique challenges to such a data management system, including
scalability, performance, throughput, and security. In this paper, we present BestPeer++,
a system which delivers elastic data sharing services for corporate network applications
in the cloud based on BestPeer—a peer-to-peer (P2P) based data management
platform. By integrating cloud computing, database, and P2P technologies into
one system, BestPeer++ provides an economical, flexible and scalable platform
for corporate network applications and delivers data sharing services to participants
based on the widely accepted pay-as-you-go business model. We evaluate
BestPeer++ on the Amazon EC2 cloud platform. The benchmarking results show that
BestPeer++ outperforms HadoopDB, a recently proposed large-scale data
processing system, in performance when both systems are employed to handle
typical corporate network workloads. The benchmarking results also demonstrate
that BestPeer++ achieves near linear scalability for throughput with respect to
the number of peer nodes.
EXISTING SYSTEM:
The existing approach, a centralized data warehouse, falls short for corporate
networks in three ways. First, the corporate network
needs to scale up to support thousands of participants, while the installation
of a large-scale centralized data warehouse system entails nontrivial costs,
including huge hardware/software investments (a.k.a. total cost of ownership)
and high maintenance costs (a.k.a. total cost of operations). In the real world,
most companies are not keen to invest heavily in additional information systems
until they can clearly see the potential return on investment (ROI). Second, companies
want to fully customize the access control policy to determine which business
partners can see which part of their shared data. Unfortunately, most
data warehouse solutions fail to offer such flexibility. Finally, to maximize
revenues, companies often dynamically adjust their business processes and may
change their business partners. Therefore, the participants may join and leave
the corporate network at will. The data warehouse solution has not been
designed to handle such dynamicity.
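DISADVANTAGES OF EXISTING SYSTEM:
· Most data warehouse solutions fail to offer flexible, fully customizable access control.
· Data warehouse solutions are not designed for real deployments in which participants join and leave at will.
· A centralized data warehouse is expensive to install and maintain.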
PROPOSED SYSTEM:
BestPeer++ achieves efficient query processing and is a promising approach for
corporate network applications, with the following distinguishing features:
· BestPeer++ is deployed as a service in the cloud. To form a corporate network, companies simply register their sites with the BestPeer++ service provider, launch BestPeer++ instances in the cloud, and finally export data to those instances for sharing.
· BestPeer++ adopts the pay-as-you-go business model popularized by cloud computing. The total cost of ownership is therefore substantially reduced, since companies do not have to buy any hardware/software in advance. Instead, they pay for what they use in terms of BestPeer++ instance hours and storage capacity.
· BestPeer++ extends role-based access control to the inherently distributed environment of corporate networks. Through a web console interface, companies can easily configure their access control policies and prevent undesired business partners from accessing their shared data.
· BestPeer++ employs P2P technology to retrieve data between business partners. BestPeer++ instances are organized as a structured P2P overlay network named BATON. The data are indexed by table name, column name and data range for efficient retrieval.
· BestPeer++ employs a hybrid design for achieving high-performance query processing. The major workload of a corporate network is simple, low-overhead queries. Such queries typically involve only a very small number of business partners and can be processed in a short time. BestPeer++ is mainly optimized for these queries. For infrequent, time-consuming analytical tasks, we provide an interface for exporting the data from BestPeer++ to Hadoop, allowing users to analyze those data using MapReduce.
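The indexing scheme mentioned above (table name, column name, data range) can be pictured with a small Java sketch. It is illustrative only: the class PeerIndexSketch, its methods, and the peer names are hypothetical, and a flat in-memory list stands in for the BATON overlay, which in BestPeer++ organizes the instances as a structured P2P network. The sketch shows only how a range query might be narrowed down to the few partners whose published data can match.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from BestPeer++ itself.
final class PeerIndexSketch {

    // One index entry: a peer holds values of table.column in the range [low, high].
    private static final class IndexEntry {
        final String table;
        final String column;
        final long low;
        final long high;
        final String peerId;

        IndexEntry(String table, String column, long low, long high, String peerId) {
            this.table = table;
            this.column = column;
            this.low = low;
            this.high = high;
            this.peerId = peerId;
        }
    }

    // Assumption: a flat list replaces the distributed overlay for simplicity.
    private final List<IndexEntry> entries = new ArrayList<>();

    // A peer announces which value range of a column it can serve.
    void publish(String table, String column, long low, long high, String peerId) {
        entries.add(new IndexEntry(table, column, low, high, peerId));
    }

    // Returns the peers whose published range overlaps the queried range [qLow, qHigh].
    List<String> lookup(String table, String column, long qLow, long qHigh) {
        List<String> peers = new ArrayList<>();
        for (IndexEntry e : entries) {
            boolean sameColumn = e.table.equals(table) && e.column.equals(column);
            boolean overlaps = e.low <= qHigh && qLow <= e.high;
            if (sameColumn && overlaps && !peers.contains(e.peerId)) {
                peers.add(e.peerId);
            }
        }
        return peers;
    }

    public static void main(String[] args) {
        PeerIndexSketch index = new PeerIndexSketch();
        index.publish("orders", "amount", 0, 999, "supplierA");
        index.publish("orders", "amount", 1000, 4999, "supplierB");
        index.publish("orders", "date", 0, 999, "supplierC");

        // Only peers that may hold matching rows are contacted.
        System.out.println(index.lookup("orders", "amount", 500, 1200));
        // prints [supplierA, supplierB]
    }
}
```

With the entries published in main, a query on orders.amount between 500 and 1200 is routed only to supplierA and supplierB; supplierC is never contacted because it indexed a different column. This is the property that keeps simple queries cheap when they involve only a small number of business partners.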
ADVANTAGES OF PROPOSED SYSTEM:
· It provides an economical, flexible and scalable solution for corporate network applications.
· It is more efficient than HadoopDB on typical corporate network workloads.
· It prevents undesired business partners from accessing shared data.
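As a rough illustration of role-based access control in this setting, the Java sketch below grants read access per role and per column, then checks a partner's request with default deny. All names (AccessPolicySketch, grantRead, assignRole, canRead, the roles and the partners) are assumptions made for the example; the source does not describe the actual BestPeer++ policy model or its web console interface.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of per-column, role-based read permissions.
final class AccessPolicySketch {

    // role -> set of "table.column" names that the role may read
    private final Map<String, Set<String>> readableByRole = new HashMap<>();
    // business partner -> roles assigned by the data owner
    private final Map<String, Set<String>> rolesByPartner = new HashMap<>();

    // The data owner allows a role to read one column of one table.
    void grantRead(String role, String table, String column) {
        readableByRole.computeIfAbsent(role, r -> new HashSet<>()).add(table + "." + column);
    }

    // The data owner assigns a role to a business partner.
    void assignRole(String partner, String role) {
        rolesByPartner.computeIfAbsent(partner, p -> new HashSet<>()).add(role);
    }

    // Default deny: access is allowed only if some assigned role grants it.
    boolean canRead(String partner, String table, String column) {
        String target = table + "." + column;
        Set<String> roles = rolesByPartner.getOrDefault(partner, Collections.<String>emptySet());
        for (String role : roles) {
            Set<String> readable = readableByRole.getOrDefault(role, Collections.<String>emptySet());
            if (readable.contains(target)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        AccessPolicySketch policy = new AccessPolicySketch();
        policy.grantRead("retailer", "inventory", "quantity");
        policy.assignRole("partnerX", "retailer");

        System.out.println(policy.canRead("partnerX", "inventory", "quantity"));  // prints true
        System.out.println(policy.canRead("partnerX", "inventory", "unit_cost")); // prints false
        System.out.println(policy.canRead("partnerY", "inventory", "quantity"));  // prints false
    }
}
```

Denying by default means a partner that has just joined the network sees nothing until the data owner assigns it a role, which suits a setting where participants come and go.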
SYSTEM CONFIGURATION:
HARDWARE REQUIREMENTS:
· Processor - Pentium IV
· Speed - 1.1 GHz
· RAM - 512 MB (min)
· Hard Disk - 40 GB
· Keyboard - Standard Windows Keyboard
· Mouse - Two or Three Button Mouse
· Monitor - LCD/LED
SOFTWARE REQUIREMENTS:
• Operating system : Windows XP
• Coding Language : Java
• Database : MySQL
• Tool : NetBeans IDE
REFERENCE:
Gang Chen, Tianlei Hu, Dawei Jiang, Peng Lu, Kian-Lee Tan, Hoang Tam Vo, and Sai Wu, “BestPeer++: A Peer-to-Peer Based Large-Scale Data Processing Platform,” IEEE Transactions on Knowledge and Data Engineering, vol. 26, no. 6, June 2014.