SANE: SEMANTIC-AWARE NAMESPACE
IN ULTRA-LARGE-SCALE FILE SYSTEMS
ABSTRACT:
The explosive growth in data volume and complexity poses great challenges for
file systems. To address these challenges, an innovative namespace management
scheme is urgently needed to provide both ease and efficiency of data access.
In almost all of today's file systems, namespace management is based on
hierarchical directory trees. This tree-based namespace scheme is prone to
severe performance bottlenecks and often fails to provide real-time response to
complex data lookups. This paper proposes a Semantic-Aware Namespace scheme,
called SANE, which provides dynamic and adaptive namespace management for
ultra-large storage systems with billions of files. SANE introduces a new
naming methodology based on the notion of semantic-aware per-file namespace,
which exploits semantic correlations among files, to dynamically aggregate
correlated files into small, flat but readily manageable groups to achieve fast
and accurate lookups. SANE is implemented as middleware in conventional file
systems and works orthogonally with hierarchical directory trees. The semantic
correlations and file groups identified in SANE can also be used to facilitate
file prefetching and data de-duplication, among other system-level
optimizations. Extensive trace-driven experiments on our prototype
implementation validate the efficacy and efficiency of SANE.
EXISTING SYSTEM:
According to a recent survey of 1,780 data center managers in 26 countries,
over 36 percent of respondents faced two critical challenges: efficiently
supporting a flood of emerging applications and handling sharply increased data
management complexity. This reflects a reality in which we are generating and
storing much more data than ever, and this trend continues at an accelerated
pace. This data volume explosion has imposed great challenges on storage
systems, particularly on the metadata management of file systems. For example,
many systems are required to perform hundreds of thousands of metadata
operations per second and the performance is severely restricted by the
hierarchical directory-tree based metadata management scheme used in almost all
file systems today.
The most important functions of namespace management are file identification
and lookup. The file system namespace, as an information-organizing
infrastructure, is fundamental to a system's quality of service, including
performance, scalability, and ease of use. Almost all current file systems,
unfortunately, are based on hierarchical directory trees.
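To make the lookup limitation concrete, the following is a minimal sketch,
not taken from the paper, of a toy directory tree held in nested Python
dictionaries. The tree contents, the `File` class, and the function names
`resolve` and `find_by_owner` are all hypothetical. It contrasts a lookup by
path, which touches one entry per path component, with a lookup by attribute,
which has to visit every entry in the tree.

```python
class File:
    """Hypothetical file record carrying a single metadata attribute."""
    def __init__(self, owner):
        self.owner = owner

# Toy tree: directories are dicts, files are File objects (illustrative data).
TREE = {
    "home": {
        "alice": {"a.doc": File("alice"), "b.doc": File("alice")},
        "bob": {"m.avi": File("bob")},
    },
    "tmp": {"x.log": File("alice")},
}

def resolve(tree, path):
    """Walk the tree one path component at a time; return (node, steps)."""
    node, steps = tree, 0
    for part in path.strip("/").split("/"):
        node = node[part]
        steps += 1
    return node, steps

def find_by_owner(tree, owner, prefix=""):
    """Search by attribute; return (matching paths, entries visited)."""
    matches, visited = [], 0
    for name, child in tree.items():
        visited += 1
        path = prefix + "/" + name
        if isinstance(child, dict):
            sub_matches, sub_visited = find_by_owner(child, owner, path)
            matches += sub_matches
            visited += sub_visited
        elif child.owner == owner:
            matches.append(path)
    return matches, visited

_, steps = resolve(TREE, "/home/alice/a.doc")
paths, visited = find_by_owner(TREE, "alice")
print(steps)    # 3
print(visited)  # 8
print(paths)    # ['/home/alice/a.doc', '/home/alice/b.doc', '/tmp/x.log']
```

In this toy case the attribute query visits all 8 entries, while the path
lookup touches 3. On a tree with billions of files, a full walk of this kind
is the sort of complex lookup that a directory hierarchy alone handles poorly.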
DISADVANTAGES OF EXISTING SYSTEM:
- Limited system scalability.
- Reliance on end-users to organize and look up data.
- Lack of metadata-semantics exploration.
PROPOSED SYSTEM:
We propose a new namespace management scheme, called SANE, which provides a
flat but small, manageable, and efficient namespace for each file. SANE
introduces the notion of a semantic-aware per-file namespace, in which a file
is represented by its semantic correlations to other files instead of by a
conventional static file name. Our goal is not to replace conventional
directory-tree management, which already has a large user base. Instead, we aim
to provide another metadata overlay that is orthogonal to directory trees. SANE
runs concurrently with the conventional file system that integrates it and
takes over the responsibilities of file search and semantic file grouping from
the file system when necessary. Moreover, SANE, while providing the same
functionality, uses a new naming scheme that requires only constant-scale
complexity to identify and aggregate semantically correlated files. SANE
extracts the semantic correlation information from a hierarchical tree.
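The idea of a per-file namespace can be illustrated with a minimal sketch.
This is not SANE's actual algorithm: it assumes, purely for illustration, that
each file carries a set of metadata attributes and that correlation is
measured by Jaccard similarity over those sets. The sample data and the names
`jaccard` and `per_file_namespace` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two attribute sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def per_file_namespace(target, metadata, k=2):
    """Return up to k files most correlated with `target`.

    Files with zero similarity are left out, so the group stays small.
    """
    scores = [
        (jaccard(metadata[target], attrs), name)
        for name, attrs in metadata.items()
        if name != target
    ]
    # Highest similarity first; ties broken by file name for determinism.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [name for score, name in scores[:k] if score > 0]

# Illustrative metadata; attribute names and values are assumptions.
METADATA = {
    "report_q1.doc": {"owner:alice", "type:doc", "project:finance"},
    "report_q2.doc": {"owner:alice", "type:doc", "project:finance"},
    "budget.xls": {"owner:alice", "type:xls", "project:finance"},
    "movie.avi": {"owner:bob", "type:avi", "project:media"},
    "trailer.avi": {"owner:bob", "type:avi", "project:media"},
}

print(per_file_namespace("report_q1.doc", METADATA))
# ['report_q2.doc', 'budget.xls']
print(per_file_namespace("movie.avi", METADATA))
# ['trailer.avi']
```

Each file ends up described by a small, flat group of its closest neighbors,
independent of where it sits in the directory tree. Note that this sketch
compares the target against every other file, which is linear in the number
of files; it does not reproduce the constant-scale identification described
above.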
ADVANTAGES OF PROPOSED SYSTEM:
- The metadata of files that are strongly correlated are automatically
  aggregated and then stored together in SANE.
- SANE is implemented as a transparent middleware that can be deployed or
  embedded in most existing file systems without modifying the kernels or
  applications.
SYSTEM CONFIGURATION:
HARDWARE REQUIREMENTS:
Processor - Pentium IV
Speed - 1.1 GHz
RAM - 512 MB (min)
Hard Disk - 40 GB
Keyboard - Standard Windows Keyboard
Mouse - Two or Three Button Mouse
Monitor - LCD/LED
SOFTWARE REQUIREMENTS:
Operating System - Windows XP
Coding Language - .NET
Database - SQL Server 2005
Tool - Visual Studio 2008
REFERENCE:
Yu Hua, Hong Jiang, Yifeng Zhu, Dan Feng, and Lei Xu, "SANE: Semantic-Aware
Namespace in Ultra-Large-Scale File Systems," IEEE Transactions on Parallel and
Distributed Systems, vol. 25, no. 5, May 2014.