FAST NEAREST NEIGHBOR SEARCH WITH KEYWORDS
ABSTRACT:
Conventional spatial
queries, such as range search and nearest neighbor retrieval, involve only
conditions on objects’ geometric properties. Today, many modern applications
call for novel forms of queries that aim to find objects satisfying both a spatial
predicate and a predicate on their associated texts. For example, instead of
considering all the restaurants, a nearest neighbor query would ask for
the restaurant that is the closest among those whose menus contain “steak,
spaghetti, brandy” all at the same time. Currently, the best solution to such queries
is based on the IR2-tree, which, as shown in this paper, has a few deficiencies
that seriously impact its efficiency. Motivated by this, we develop a new
access method called the spatial inverted index that extends the conventional
inverted index to cope with multidimensional data, and comes with algorithms
that can answer nearest neighbor queries with keywords in real time. As verified
by experiments, the proposed techniques significantly outperform the IR2-tree in
query response time, often by orders of magnitude.
EXISTING SYSTEM:
The widespread use of
search engines has made it realistic to write spatial queries in a brand new way.
Conventionally, queries focus on objects’ geometric properties only, such as
whether a point is in a rectangle, or how close two points are to each other.
We have seen some modern applications that call for the ability to select
objects based on both their geometric coordinates and their associated texts.
For example, it would be fairly useful if a search engine could be used to find
the nearest restaurant that offers “steak, spaghetti, and brandy” all at the
same time. Note that this is not the “globally” nearest restaurant (which would
have been returned by a traditional nearest neighbor query), but the nearest
restaurant among only those providing all the demanded foods and drinks. There
are easy ways to support queries that combine spatial and text features. For
example, for the above query, we could first fetch all the restaurants whose menus
contain the set of keywords {steak, spaghetti, brandy}, and then from the
retrieved restaurants, find the nearest one. Alternatively, one could proceed in
reverse order and target the spatial condition first: browse all the restaurants in
ascending order of their distances to the query point until encountering one
whose menu has all the keywords.
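The two straightforward approaches described above can be sketched as follows. This is only an illustrative sketch, not code from the paper: the `Place` and `NaiveKeywordNN` types and their method names are hypothetical, and the spatial-first variant sorts the whole list where a real system would presumably browse a spatial index incrementally.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

// Hypothetical record of one object: a location plus the words in its text.
final class Place {
    final String name;
    final double x, y;
    final Set<String> words;

    Place(String name, double x, double y, Set<String> words) {
        this.name = name;
        this.x = x;
        this.y = y;
        this.words = words;
    }

    // Euclidean distance from this place to the query point (qx, qy).
    double distTo(double qx, double qy) {
        return Math.hypot(x - qx, y - qy);
    }
}

final class NaiveKeywordNN {
    // Keyword-first: keep only places containing every query keyword,
    // then return the nearest of those (null if none qualifies).
    static Place keywordFirst(List<Place> places, double qx, double qy, Set<String> query) {
        Place best = null;
        for (Place p : places) {
            if (!p.words.containsAll(query)) {
                continue;
            }
            if (best == null || p.distTo(qx, qy) < best.distTo(qx, qy)) {
                best = p;
            }
        }
        return best;
    }

    // Spatial-first: visit places in ascending distance from the query point
    // and stop at the first one whose text has all the keywords.
    // Sorting here is a stand-in (assumption) for incremental index browsing.
    static Place spatialFirst(List<Place> places, double qx, double qy, Set<String> query) {
        List<Place> byDistance = new ArrayList<>(places);
        byDistance.sort(Comparator.comparingDouble((Place p) -> p.distTo(qx, qy)));
        for (Place p : byDistance) {
            if (p.words.containsAll(query)) {
                return p;
            }
        }
        return null;
    }
}
```

Both methods return the same place whenever a match exists and distances are distinct. They differ only in how much work they do: the first touches every keyword match, while the second may visit many nearby places that lack a keyword.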
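DISADVANTAGES OF EXISTING SYSTEM:
· These straightforward approaches fail to provide real-time answers
on difficult inputs.
· A typical difficult input is one where the closer neighbors are all missing
at least one of the query keywords, so many objects must be examined before
the answer is found.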
PROPOSED SYSTEM:
In this paper, we design a variant of the inverted index
that is optimized for multidimensional points, and is thus named the spatial
inverted index (SI-index). This access method successfully incorporates point
coordinates into a conventional inverted index with small extra space, owing to
a delicate compact storage scheme. Meanwhile, an SI-index preserves the spatial
locality of data points, and comes with an R-tree built on every inverted list
at little space overhead. As a result, it offers two competing ways for query
processing. We can (sequentially) merge multiple lists very much like merging
traditional inverted lists by ids. Alternatively, we can also leverage the
R-trees to browse the points of all relevant lists in ascending order of their
distances to the query point. As demonstrated by experiments, the SI-index
significantly outperforms the IR2-tree in query efficiency, often by orders
of magnitude.
ADVANTAGES OF PROPOSED SYSTEM:
· It offers two competing ways for query processing.
· It builds on the inverted index (I-index), which has proved to be an
effective access method for keyword-based document retrieval.
SYSTEM CONFIGURATION:
HARDWARE REQUIREMENTS:
· Processor : Pentium IV
· Speed : 1.1 GHz
· RAM : 512 MB (min)
· Hard Disk : 40 GB
· Keyboard : Standard Windows Keyboard
· Mouse : Two or Three Button Mouse
· Monitor : LCD/LED
SOFTWARE REQUIREMENTS:
· Operating System : Windows XP
· Coding Language : Java
· Database : MySQL
· Tool : NetBeans IDE
REFERENCE:
Yufei Tao and Cheng Sheng, “Fast Nearest Neighbor Search with Keywords,” IEEE Transactions on
Knowledge and Data Engineering, vol. 26, no. 4, April 2014.