Sunday 18 October 2015

Co-Extracting Opinion Targets and Opinion Words from Online Reviews Based On the Word Alignment Model

ABSTRACT
Mining opinion targets and opinion words from online reviews are important tasks for fine-grained opinion mining, the key component of which involves detecting opinion relations among words. To this end, this paper proposes a novel approach based on the partially-supervised alignment model, which regards identifying opinion relations as an alignment process. Then, a graph-based co-ranking algorithm is exploited to estimate the confidence of each candidate. Finally, candidates with higher confidence are extracted as opinion targets or opinion words. Compared to previous methods based on the nearest-neighbor rules, our model captures opinion relations more precisely, especially for long-span relations. Compared to syntax-based methods, our word alignment model effectively alleviates the negative effects of parsing errors when dealing with informal online texts. In particular, compared to the traditional unsupervised alignment model, the proposed model obtains better precision because of the usage of partial supervision. In addition, when estimating candidate confidence, we penalize higher-degree vertices in our graph-based co-ranking algorithm to decrease the probability of error generation. Our experimental results on three corpora with different sizes and languages show that our approach effectively outperforms state-of-the-art methods.
AIM
The main aim of this paper is to propose a novel approach based on the partially-supervised alignment model, which regards identifying opinion relations as an alignment process. A graph-based co-ranking algorithm is then exploited to estimate the confidence of each candidate, and candidates with higher confidence are extracted as opinion targets or opinion words.
SCOPE
The scope of this paper covers experiments on three corpora of different sizes and languages, whose results show that the approach effectively outperforms state-of-the-art methods.
EXISTING SYSTEM
Opinion target and opinion word extraction are not new tasks in opinion mining, and significant effort has been devoted to them. Existing work can be divided into two categories according to the extraction aim: sentence-level extraction and corpus-level extraction. In sentence-level extraction, the task is to identify the opinion target mentions or opinion expressions in sentences, so it is usually regarded as a sequence-labeling problem in which contextual words are selected as features that indicate opinion targets or opinion words. Most previous approaches adopted a collective unsupervised extraction framework. As noted in the abstract, detecting opinion relations and calculating opinion associations among words is the key component of this type of method. Some earlier work adopted the co-occurrence frequency of opinion targets and opinion words to indicate their opinion associations. Other work exploited nearest-neighbor rules to identify opinion relations among words and then extracted frequent and explicit product features using a bootstrapping process. However, co-occurrence information or nearest-neighbor rules alone cannot detect opinion relations among words precisely.
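To illustrate why a nearest-neighbor rule struggles, here is a minimal sketch that pairs each opinion word with the closest candidate target by token distance. The function name, the candidate sets and the example sentence are assumptions made for illustration; they are not taken from the paper or from any earlier system.

```python
def nearest_neighbor_pairs(tokens, target_candidates, opinion_candidates):
    """Pair each opinion word with the closest candidate target in the sentence."""
    target_positions = [i for i, tok in enumerate(tokens) if tok in target_candidates]
    pairs = []
    for i, tok in enumerate(tokens):
        if tok in opinion_candidates and target_positions:
            # The rule only looks at token distance, not at what the word modifies.
            nearest = min(target_positions, key=lambda j: abs(j - i))
            pairs.append((tokens[nearest], tok))
    return pairs


# Hypothetical review sentence: "colorful" describes "screen", not "phone".
tokens = "the screen of this phone is colorful".split()
pairs = nearest_neighbor_pairs(tokens, {"screen", "phone"}, {"colorful"})
print(pairs)  # [('phone', 'colorful')]
```

The rule links "colorful" to "phone" because it is two tokens away, while the real target "screen" is five tokens away. This is the kind of long-span relation that distance-based rules miss.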
DISADVANTAGES
·      This strategy cannot obtain precise results because reviews contain long-span modification relations and diverse opinion expressions.
·      Errors extracted in one iteration are not filtered out in subsequent iterations, so they propagate.
PROPOSED SYSTEM
In this paper, we propose a method based on a monolingual word alignment model (WAM), in which an opinion target finds its corresponding modifier through word alignment. The WAM is more robust than syntax-based methods because it does not need to parse informal texts. In addition, the WAM can integrate several intuitive factors, such as word co-occurrence frequencies and word positions, into a unified model for indicating the opinion relations among words. Thus, we expect to obtain more precise results on opinion relation identification. Partial alignment links are supplied as supervision, and a constrained EM algorithm based on hill-climbing is then performed to determine all of the alignments in sentences, keeping the model as consistent with these links as possible. A random-walk-based co-ranking algorithm is then used to estimate each candidate’s confidence on the graph. In this process, we penalize high-degree vertices to weaken their impact and to decrease the probability of a random walk running into unrelated regions of the graph. Meanwhile, we calculate prior knowledge about the candidates to flag noise and incorporate it into the ranking algorithm, so that both sources of evidence jointly determine each candidate's estimated confidence.
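The idea of constraining alignment with partial links can be sketched as follows. This is not the paper's model: it is a much simpler IBM Model 1 style EM used as a stand-in, with no word positions and no hill-climbing. The function `train_alignment`, the `partial_links` format and the toy sentences are all assumptions made for illustration.

```python
from collections import defaultdict


def train_alignment(sentences, partial_links=None, iters=10):
    """Estimate t[(w, v)], the probability that word w aligns to word v.

    partial_links maps a sentence index to {i: j}, meaning that token i of
    that sentence must align to token j (the supervised links).
    """
    partial_links = partial_links or {}
    t = defaultdict(lambda: 1.0)  # uniform start, left unnormalised
    for _ in range(iters):
        counts = defaultdict(float)
        totals = defaultdict(float)
        for s, sent in enumerate(sentences):
            fixed = partial_links.get(s, {})
            for i, w in enumerate(sent):
                # E-step: a supervised token keeps only its given link;
                # any other token may align to every other position.
                if i in fixed:
                    cands = [fixed[i]]
                else:
                    cands = [j for j in range(len(sent)) if j != i]
                z = sum(t[(w, sent[j])] for j in cands)
                for j in cands:
                    p = t[(w, sent[j])] / z
                    counts[(w, sent[j])] += p
                    totals[sent[j]] += p
        # M-step: renormalise the expected counts for each aligned-to word.
        t = defaultdict(float, {pair: c / totals[pair[1]] for pair, c in counts.items()})
    return t


sentences = [["colorful", "screen", "phone"], ["phone", "screen", "colorful"]]
# One supervised link: in sentence 0, token 0 ("colorful") aligns to token 1 ("screen").
supervised = train_alignment(sentences, partial_links={0: {0: 1}})
unsupervised = train_alignment(sentences)
print(supervised[("colorful", "screen")] > supervised[("colorful", "phone")])  # True
```

Without the link, this toy corpus gives the model no reason to prefer "screen" over "phone" for "colorful". A single supervised link in one sentence is enough to shift the estimate for the other sentence as well.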
ADVANTAGES
  1. The proposed model retains the advantages of the word alignment model for opinion relation identification and also achieves better precision because of the use of partial supervision.
  2. The confidence of each candidate is estimated in a global process with graph co-ranking, so error propagation is effectively alleviated.
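The global confidence estimation in advantage 2 can be sketched as a random walk with restart on a bipartite graph of target candidates and opinion word candidates. The exact penalty used in the paper is not given here, so dividing each edge weight by a power of the endpoint degrees is an assumption. The names `co_rank`, `mu` and `beta` and the toy graph are likewise hypothetical.

```python
def co_rank(edges, prior_t, prior_o, mu=0.3, beta=0.5, iters=50):
    """edges maps (target, opinion_word) to an association weight in [0, 1].

    mu is the restart weight on the prior knowledge; beta controls how
    strongly high-degree vertices are penalised (beta=0 means no penalty).
    """
    targets = sorted({t for t, _ in edges})
    words = sorted({o for _, o in edges})
    deg_t, deg_o = {}, {}
    for t, o in edges:
        deg_t[t] = deg_t.get(t, 0) + 1
        deg_o[o] = deg_o.get(o, 0) + 1
    # Assumed penalty: shrink edges that touch vertices with many neighbours.
    weight = {
        (t, o): a / (deg_t[t] * deg_o[o]) ** beta
        for (t, o), a in edges.items()
    }
    conf_t, conf_o = dict(prior_t), dict(prior_o)
    for _ in range(iters):  # fixed number of sweeps, no convergence test
        new_t = {
            t: (1 - mu) * sum(weight.get((t, o), 0.0) * conf_o[o] for o in words)
            + mu * prior_t[t]
            for t in targets
        }
        new_o = {
            o: (1 - mu) * sum(weight.get((t, o), 0.0) * conf_t[t] for t in targets)
            + mu * prior_o[o]
            for o in words
        }
        conf_t, conf_o = new_t, new_o
    return conf_t, conf_o


# Toy graph: "good" is a generic word linked to every target, including the noisy "thing".
edges = {
    ("screen", "colorful"): 0.8,
    ("screen", "good"): 0.5,
    ("battery", "good"): 0.5,
    ("thing", "good"): 0.5,
}
prior_t = {"screen": 0.5, "battery": 0.5, "thing": 0.5}
prior_o = {"colorful": 0.5, "good": 0.5}
penalised, _ = co_rank(edges, prior_t, prior_o, beta=0.5)
plain, _ = co_rank(edges, prior_t, prior_o, beta=0.0)
print(penalised["thing"] < plain["thing"])  # True
```

With the penalty switched on, less confidence flows through the high-degree word "good", so the noisy candidate "thing" receives a lower score than it does without the penalty.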

System Configuration
Hardware Requirements
  • Speed        -  1.1 GHz
  • Processor    -  Pentium IV
  • RAM          -  512 MB (min)
  • Hard Disk    -  40 GB
  • Keyboard     -  Standard Windows Keyboard
  • Mouse        -  Two or Three Button Mouse
  • Monitor      -  LCD/LED
Software Requirements
  • Operating System  :  Windows 7
  • Front End         :  ASP.NET and C#
  • Database          :  MSSQL
  • Tool              :  Microsoft Visual Studio

References
Kang Liu, Liheng Xu, Jun Zhao, "Co-Extracting Opinion Targets and Opinion Words from Online Reviews Based on the Word Alignment Model," IEEE Transactions on Knowledge and Data Engineering, Volume 27, Issue 3, July 2014.
