Sentiment Analysis


Sentiment Analysis, also known as Opinion Mining, is an area of Natural Language Processing focused on identifying and extracting subjective information from human language. Sentiment Analysis usually tries to identify the attitude of the author of the analysed source (e.g. a text) with respect to some topic, to the entities mentioned in the text, or to the overall contextual sentiment polarity (positive or negative) of the document.

Introduction

Nowadays Sentiment Analysis has become a popular discipline due to its close relation to studies of Social Media behaviour. Sentiment Analysis is commonly used to analyse the comments that people post on social web sites. It also allows identifying the preferences and criteria of users regarding situations, events, products, brands, etc.

Relevant works in this field include Wiebe[1], who defines Sentiment Analysis as the "linguistic expression of somebody’s opinions, sentiments, emotions, evaluations, beliefs and speculations". Other terms related to this field and proposed by Wiebe[2] include private states, opinions, beliefs, thoughts, feelings, emotions, goals, evaluations and judgments.

Relevance to SAM

Sentiment Analysis techniques will be applied in SAM to the large amount of subjective information that the project must deal with, enhancing the End User experience by discovering User Preferences and feelings while the Platform is being used.

In SAM, Sentiment Analysis will be applied in tasks such as T6.4 Business Intelligence and Social Mining, where the data providers will be able to discover tendencies, specific reactions to specific film sequences, etc. This allows media producers, Publishers or Broadcasters to make better-informed decisions.

State of the Art Analysis

The use of sentiment resources has proven to be a necessary step for training and evaluating systems implementing Sentiment Analysis, including fine-grained opinion mining[3].

Approaches

There are some approaches to document-level Sentiment Analysis (such as extracting opinions from movie reviews) that use as gold standard texts that have already been classified by their polarity. These texts are extracted from e-commerce sites and include information describing products (from 1 to 5 stars, 1 star being "bad" and 5 stars meaning "very good") pertaining to different categories[4]. Different techniques have been used on product reviews to obtain lexicons of subjective words with their associated polarity.

The strategy defined by Hu and Liu[5] starts with a set of seed adjectives ("good" and "bad") and reinforces the semantic knowledge by expanding the lexicon with synonymy and antonymy relations provided by WordNet[6]. Hu and Liu obtained an opinion lexicon composed of a list of positive and negative opinion words, or sentiment words, for English (around 6,800 words). A similar approach was used in building WordNet-Affect[7]. In this case the procedure starts with a larger set of seed affective words. These words, classified according to the six basic categories of emotion (joy, sadness, fear, surprise, anger and disgust), are also expanded through WordNet paths to increase the lexicon.
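
As an illustration of this kind of seed-based expansion, the following Python sketch uses NLTK's WordNet interface to grow positive and negative word lists from the seeds "good" and "bad". The number of iterations and the restriction to adjectives are illustrative assumptions, not the exact procedure of Hu and Liu:

  # Illustrative sketch of seed-based lexicon expansion via WordNet
  # (assumes NLTK with the WordNet corpus installed: nltk.download('wordnet')).
  from nltk.corpus import wordnet as wn

  def expand_lexicon(pos_seeds, neg_seeds, iterations=2):
      positive, negative = set(pos_seeds), set(neg_seeds)
      for _ in range(iterations):
          for word_set, opposite in ((positive, negative), (negative, positive)):
              for word in list(word_set):
                  for synset in wn.synsets(word, pos=wn.ADJ):
                      for lemma in synset.lemmas():
                          word_set.add(lemma.name())        # synonyms keep the polarity
                          for ant in lemma.antonyms():
                              opposite.add(ant.name())      # antonyms flip the polarity
      return positive, negative

  pos, neg = expand_lexicon(["good"], ["bad"])
  print(len(pos), "positive words,", len(neg), "negative words")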

Another widely used resource in Sentiment Analysis is SentiWordNet[8]. It was built from a set of seed words whose polarity was known beforehand and expanded using gloss similarity. The main assumption behind this approach was that "terms with similar glosses in WordNet tend to have similar polarity".
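
SentiWordNet can be queried directly through NLTK; a minimal lookup sketch is shown below (the word and part-of-speech tag are arbitrary examples):

  # Minimal SentiWordNet lookup through NLTK
  # (requires nltk.download('sentiwordnet') and nltk.download('wordnet')).
  from nltk.corpus import sentiwordnet as swn

  for senti_synset in swn.senti_synsets("good", "a"):   # 'a' = adjective
      print(senti_synset.synset.name(),
            "pos:", senti_synset.pos_score(),
            "neg:", senti_synset.neg_score(),
            "obj:", senti_synset.obj_score())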

In the collection of Appraisal Terms by Whitelaw[9], the terms also have an assigned polarity. This resource involves annotations according to the principles of the appraisal theory framework in linguistics.

Another popular lexicon is MicroWNOp[10]. This resource contains opinion words with their associated polarity. It was built on the basis of a set of terms (100 terms for each of the positive, negative and objective categories) extracted from the General Inquirer lexicon[11], subsequently adding all the WordNet synsets in which these words appear.

The problem with these resources is that they do not consider the context in which the words appear. Some methods have tried to overcome this issue by building sentiment lexicons from the local context of words. Pang and Lee[12] built a lexicon of sentiment words with an associated polarity value, starting with a set of classified seed adjectives and using conjunctions ("and") and disjunctions ("or", "but") to deduce the orientation of new words in a corpus.

Turney[13] classified words according to their polarity based on the idea that terms with similar orientation tend to co-occur in documents. Thus, the author computed the Pointwise Mutual Information (PMI) score between seed words and new words from the number of AltaVista hits returned when querying the seed word and the word to be classified with the "NEAR" operator.
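
Since the AltaVista NEAR operator is no longer available, the sketch below illustrates the same PMI-based semantic orientation using plain co-occurrence counts over an arbitrary document collection. The toy corpus, window size and add-0.01 smoothing are illustrative assumptions (the seeds "excellent" and "poor" are the ones used by Turney):

  # Turney-style semantic orientation via Pointwise Mutual Information (PMI),
  # approximated with co-occurrence counts instead of AltaVista NEAR hits.
  import math
  from collections import Counter

  def cooccurrence_counts(docs, window=10):
      """Count word frequencies and pair co-occurrences within a sliding window."""
      word_counts, pair_counts = Counter(), Counter()
      for doc in docs:
          tokens = doc.lower().split()
          word_counts.update(tokens)
          for i, w in enumerate(tokens):
              for v in tokens[i + 1:i + 1 + window]:
                  pair_counts[frozenset((w, v))] += 1
      return word_counts, pair_counts

  def semantic_orientation(word, word_counts, pair_counts,
                           pos_seed="excellent", neg_seed="poor"):
      """SO(word) = log2(hits(word, pos_seed) * hits(neg_seed) /
                         (hits(word, neg_seed) * hits(pos_seed)))."""
      hits_pos = pair_counts[frozenset((word, pos_seed))] + 0.01
      hits_neg = pair_counts[frozenset((word, neg_seed))] + 0.01
      return math.log2((hits_pos * word_counts[neg_seed]) /
                       (hits_neg * word_counts[pos_seed]))

  docs = ["the plot was excellent and the acting superb",
          "a poor and tedious film with dull dialogue"]
  wc, pc = cooccurrence_counts(docs)
  print(semantic_orientation("superb", wc, pc))   # > 0: positive orientation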

Balahur built a recommender system[14] which computed the polarity of new words using "polarity anchors" (words whose polarity is known beforehand) and Normalized Google Distance scores[15]. The authors used as training examples opinion words extracted from "pros and cons" reviews of the same domain, relying on the clue that opinion words appearing in the "pros" section are positive and those appearing in the "cons" section are negative. Research carried out by these authors employed the lexical resource Emotion Triggers[16].

Another interesting work, presented by Popescu and Etzioni[17], extracts the polarity from the local context to compute word polarity. To this end, it uses a weighting function over the words surrounding the term to be classified.
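
The intuition of weighting the surrounding context can be illustrated with a simple sketch in which each neighbouring word contributes its lexicon polarity divided by its distance to the target term. The toy lexicon and the 1/distance weighting are illustrative assumptions, not the actual relaxation-labelling method of Popescu and Etzioni:

  # Illustrative distance-weighted context polarity
  # (toy lexicon; the real approach uses relaxation labelling over many features).
  TOY_LEXICON = {"great": 1.0, "amazing": 1.0, "poor": -1.0, "awful": -1.0}

  def context_polarity(tokens, target_index, window=4):
      """Sum lexicon polarities of surrounding words, weighted by 1/distance."""
      score = 0.0
      for i, word in enumerate(tokens):
          distance = abs(i - target_index)
          if 0 < distance <= window and word in TOY_LEXICON:
              score += TOY_LEXICON[word] / distance
      return score

  tokens = "the battery life is great but the screen is awful".split()
  print(context_polarity(tokens, tokens.index("battery")))   # > 0: positive local context
  print(context_polarity(tokens, tokens.index("screen")))    # < 0: negative local context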

Recently, Deep Learning technologies have been considered for addressing Sentiment Analysis tasks, achieving very promising results. For example, the work presented by Duyu Tang and research conducted at Stanford University (see NAACL 2013) support this kind of approach.

Evaluation

To assess Sentiment Analysis systems, some international competitions have taken place, such as the SemEval Sentiment Analysis in Twitter task[18].

From these competitions many systems have emerged using novel state-of-the-art techniques. Nowadays some of them are used by competitive technology companies to address social challenges.

Applications

There are some relevant applications of Sentiment Analysis, such as Reputation and Recommendation, and User Profile Creation. The first consists of searching comments and beliefs for potential mentions of the target entity, filtering those that do not refer to it, detecting topics (e.g. clustering comments by subject) and ranking them based on the degree to which they signal reputation alerts (i.e. issues that may have a substantial impact on the reputation of the entity). User Profile Creation, on the other hand, consists of analysing the comments a user posts about certain entities to determine the degree of the user's preference for or rejection of those entities.

Related Projects

There have been a number of projects focused on Sentiment Analysis. The following link provides a comprehensive description of some of them:

Tools, Frameworks and Services

Developing a Sentiment Analysis system requires previously developed sentiment resources. These resources include annotated corpora, affective semantic structures and sentiment dictionaries. It is therefore clear that this research area depends on considerable economic and human effort in order to move forward. Many researchers, such as Wiebe[19], Balahur[20], Hatzivassiloglou[21] and Kim[22], have been working on this task and related areas.

Some authors consider the task of Sentiment Analysis to follow subjectivity analysis, involving an additional step: the classification of the retrieved opinion words according to their polarity. Thus, the existing lexical resources for the opinion task contain words and expressions that are subjective and that have an assigned polarity value. In parallel, a deeper analysis of the context is required to distinguish a text that presents facts but contains a subjective statement, and to classify what is said into valence categories. To deepen context information, apart from creating lexical resources that contain words and expressions with their a priori assigned polarity, research in Sentiment Analysis also focuses on the development of annotation schemes. There are several specific annotation schemes and corresponding corpora created for affect-related applications in Natural Language Processing:

  • The ISEAR (International Survey on Emotion Antecedents and Reactions) corpus[23];
  • The MPQA (Multi-perspective Question Answering System) corpus[24];
  • The EmotiBlog corpus[25];
  • The TAC (Text Analysis Conference) Opinion Pilot data (TAC 2008) and TREC (Text Retrieval Conference) data (from 2006 to 2010), which consist of texts annotated for the different opinion retrieval-specific tasks, on the Blog06 collection;
  • The NTCIR MOAT (Multilingual Opinion Analysis Track) data (2007-2010), which contains both monolingual annotated data for opinion mining in English, Chinese and Japanese, as well as cross-lingual analysis data (in MOAT 2010);
  • Semeval-2013 (Task 2. Sentiment Analysis in Twitter), which contains annotated data for opinion mining in English.

SAM Approach

Sentiment Analysis functionalities are included in the Social Mining subcomponent. This component is part of the Analytic component, which includes the Business Intelligence subcomponent.

Architecture and Dependencies

The Social Mining subcomponent architecture is presented in the Automatic Summarisation page.

Implementation and Technologies

After an extended analysis and comparison, the most appropriate technologies for the backend have been selected. They are described in the Automatic Summarisation page.

Subcomponents

A summary of the tasks carried out for the Social Mining subcomponent during the first and second versions of the prototype is shown in the Automatic Summarisation page.

Functionality and UI Elements

This section explains how to use the Social Mining subcomponent through the available interfaces, invoking the sentiment analysis and semantic functionalities of the Semantic Services component and extracting features from the UGC related to an Asset stored in the Cloud Storage component. The following section describes the details for accessing and using this interface.

Processing Social Data

The Social Mining subcomponent provides, through the Social Mining Controller, a RESTful interface for retrieving the UGC related to a specific Asset (by querying the Cloud Storage) and extracting sentiment and semantic features from it by means of the functionalities provided by the Semantic Services component (see figure below).

ProcessingSocialDataAPI.png

The figure above includes all necessary request and response parameters and their expected types and values. In the Parameters section of this demo page, the body textbox can be filled with a JSON example (as can be seen in the figure above). The Try it out! button executes the service.

The data input to the Processing Social Data service contains all the information required for retrieving a specific Asset’s UGC from the Cloud Storage. Furthermore, this data input includes analysis parameters (e.g. subjectList and listOfTechniques) for carrying out sentiment analysis and extracting semantic features from UGC using the Semantic Services functionalities. The data input for this RESTful interface is a JSON object which contains the following attributes (a client sketch follows the list below):

  • assetIDs: List of Asset’s identifiers for which to obtain their UGC
  • startDate: Initial date (mmddyyyy) to consider when retrieving UGC for an Asset (inclusive), i.e. retrieve only comments submitted on or after the specified date
  • endDate: Final date (mmddyyyy) to consider when retrieving UGC for an Asset (exclusive)
  • depth: Since the SAM platform can deal with complex Assets (i.e. an Asset can be linked to other Assets), the depth attribute establishes a numeric value determining whether UGC from linked Assets should also be retrieved and analysed. For instance, depth=0 instructs the system to analyse only the UGC of the main Asset, whereas depth=1 forces the retrieval and analysis of UGC not only from the main Asset, but also from its directly linked Assets. This parameter is optional; by default the depth is 0
  • subjectList: List of subjects that will be taken as the target for the UGC analysis
  • listOfTechniques: List with two possible values: sentiment_analysis (to apply sentiment analysis) and/or data_characterisation (to apply data characterisation). These values are optional; by default both are considered active.
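
A minimal client sketch in Python is shown below; the endpoint URL is a placeholder (the actual host and path depend on the SAM deployment) and the field values are examples consistent with the attribute list above:

  # Hypothetical client call to the Processing Social Data RESTful interface.
  # The URL below is a placeholder; replace it with the actual Social Mining
  # Controller endpoint of the deployed SAM platform.
  import requests

  SOCIAL_MINING_URL = "http://sam-platform.example/socialmining/processSocialData"

  payload = {
      "assetIDs": ["a100"],              # Assets whose UGC will be retrieved
      "startDate": "12012014",           # mmddyyyy, inclusive
      "endDate": "01012015",             # mmddyyyy, exclusive
      "depth": 1,                        # also analyse UGC of directly linked Assets
      "subjectList": ["OVERALL", "Casino_Royale", "Bond"],
      "listOfTechniques": ["sentiment_analysis", "data_characterisation"]
  }

  response = requests.post(SOCIAL_MINING_URL, json=payload, timeout=30)
  response.raise_for_status()
  analysis = response.json()             # JSON object described below
  print(analysis["contentAnalysisList"][0]["userID"])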

A JSON object (see the example below; a parsing sketch follows it) including a list of analysed user comments is obtained as output, following this schema:

  • userID: Identifier of the user that submitted the comment
  • followers: Number of followers of the user (where available)
  • commentAnalysis: An object that involves semantic and sentiment analysis features:
  • commentID: Identifier of the UGC
  • publicationDate: Publication date of the UGC
  • semanticFeatures: An object which involves semantic features found in analysed UGC:
  • relatedAsset: Asset data found in the current comment, where assetId refers to the Asset identifier and assetTitle is a string representing the title of the Asset
  • sentimentFeatures: An object which involves sentiment analysis features evaluated for the UGC analysed:
  • subject: Label or term evaluated in the current UGC
  • assetId: Asset identifier in case that the subject refers to a SAM Asset
  • intensity: Numeric value scoring the sentiment polarity intensity
  • emotionLabels: List of emotion labels that conceptualise the current UGC (e.g. joy or anger)
  • sentimentCategory: Sentiment polarity category (i.e. positive, negative or neutral)
  {
  "contentAnalysisList": [
    {
      "userID": "u155",
      "followers": 100,
      "commentAnalysis": [
        {
          "commentID": "c01",
          "publicationDate": "12312014",
          "semanticFeatures": [
            {
              "relatedAsset": [
                {
                  "assetId": "a101",
                  "assetTitle": "Mads_Mikkelsen"
                },
                {
                  "assetId": "a102",
                  "assetTitle": "Daniel_Craig"
                }
              ]
            }
          ],
          "sentimentFeatures": [
            {
              "subject": "OVERALL",
              "assetId": "", // Empty since the subject OVERALL is not focused on any specific Asset 
              "intensity": 0.647352,
              "emotionLabels": null,
              "sentimentCategory": "positive"
            },
            {
              "subject": "Casino_Royale",
              "assetId": "a100",
              "intensity": 0.8310321,
              "emotionLabels": null,
              "sentimentCategory": "positive"
            },
            {
              "subject": "Bond",
              "assetId": "", // Empty since the subject Bond does not refer to a SAM Asset
              "intensity": 0.52452224,
              "emotionLabels": null,
              "sentimentCategory": "negative"
            }
          ]
        }
      ]
    }
  ]
}
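
As an illustration of how this output can be consumed, the following sketch (assuming the response has already been parsed into a Python dictionary, e.g. with response.json() as in the client sketch above) aggregates the sentiment features per subject:

  # Aggregate the sentiment features returned by the Processing Social Data service:
  # average intensity and polarity counts per subject.
  from collections import defaultdict

  def summarise_sentiment(analysis):
      intensities = defaultdict(list)
      polarities = defaultdict(lambda: defaultdict(int))
      for user in analysis["contentAnalysisList"]:
          for comment in user["commentAnalysis"]:
              for feature in comment["sentimentFeatures"]:
                  subject = feature["subject"]
                  intensities[subject].append(feature["intensity"])
                  polarities[subject][feature["sentimentCategory"]] += 1
      return {subject: {"avg_intensity": sum(values) / len(values),
                        "polarity_counts": dict(polarities[subject])}
              for subject, values in intensities.items()}

  # summarise_sentiment(analysis) would yield, for the example above, entries such as
  # {"OVERALL": {"avg_intensity": 0.647352, "polarity_counts": {"positive": 1}}, ...}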


Latest Developments

During the last period of the SAM life cycle some new developments have been carried out. These developments have mostly addressed the improvement of the existing RESTful interfaces, based on an enriched SAM Asset structure, and the evaluation of the research technologies. In this case, the tasks generated results regarding the following functionalities: Sentiment Polarity classification and Emotion Detection.

Sentiment Polarity classification

In order to evaluate the Sentiment Polarity classification it was necessary to select suitable corpora. This task involved evaluating three different aspects:


Overall Sentiment Polarity

Corpus selection: The corpus used to train and test the sentiment analysis system comprises two datasets, the Semeval task 2 dataset (10,709 comments) and Metacritic (5,888 reviews). The former consists of a list of Social Media comments (i.e. tweets) with sentiment polarity annotations (i.e. positive, negative and none). The latter is a list of user reviews of films, music or games. These reviews have been rated following the Metacritic scales. Music and movie reviews share the same scale (i.e. positive, 7-10; neutral/mixed, 4-6; and negative, 0-3), whereas game reviews have a different one (i.e. positive, 8-10; neutral/mixed, 5-7; and negative, 0-4).
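
The mapping from Metacritic scores to polarity labels can be made explicit with a small helper; this is a sketch of the scales stated above, not of the actual preprocessing code used in SAM:

  # Map Metacritic review scores (0-10) to polarity labels, using the scales above.
  def metacritic_polarity(score, category):
      """category: 'music', 'movie' or 'game'."""
      if category in ("music", "movie"):
          if score >= 7:  return "positive"
          if score >= 4:  return "neutral"   # neutral/mixed, 4-6
          return "negative"                  # 0-3
      if category == "game":
          if score >= 8:  return "positive"
          if score >= 5:  return "neutral"   # neutral/mixed, 5-7
          return "negative"                  # 0-4
      raise ValueError("unknown category: " + category)

  print(metacritic_polarity(7, "movie"))  # positive
  print(metacritic_polarity(7, "game"))   # neutral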

Approaches: Two approaches to sentiment analysis classification were developed:

  • The first one is a system reused from the "Skipgram Scorer" research by Fernández et al.[26], which was adapted to the English language. This approach was tested in the SAM scenario (manually annotated user comments collected during the first trials) for global polarity, obtaining a precision of 62%, a recall of 50% and an F1 of 56%. From our point of view these results are not good enough for processing this kind of text, which motivated the study of a second approach for this task.
  • To improve on the first approach, a training dataset was created involving the tweets mentioned above and media content reviews (obtained from the Metacritic web site). The system was then retrained and tuned to fit the SAM scenario. The results obtained for global polarity were 75% of precision, recall and F1 (see the results table below).


Polarity’s Intensity

Corpus selection: To evaluate the sentiment polarity intensity, the corpus annotated at intensity level, "Movie Reviews - Scale Dataset 1.0", was considered. The results obtained for the classes Strong Negative and Negative were 61% of precision, 59% of recall and 57% of F1. Regarding Strong Positive and Positive, the results were 72% of precision, 67% of recall and 68% of F1. The mean precision for this task is 66.5%.

Aspect Based polarity

An evaluation of the Aspect Based analysis was performed. For this, the most appropriate annotated corpus to carry out the evaluation was "Semeval 2014 - Task 4 - Subtask 1 - Restaurants". The results obtained in this evaluation task were 61% of precision, 55% of recall and 57% of F1.

Emotion detection

This task consisted of studying and improving the sentiment analysis approaches to increase the accuracy of emotion detection, which initially reached 34% of precision. This technology has been substantially improved by creating two approaches, which have been evaluated over three datasets: Potter, Grimms and HCAndersen. The state of the art in emotion detection (see Özbal and Pighin [OP13]) obtains results of around 52% of average F1 considering the same classes as ours: Happiness, Sadness, Anger-Disgust, Fear and Surprise, so we compare our results against theirs (a macro-averaging sketch follows the approaches below).

Approaches:

  • The first approach consists of indexing with Apache Lucene, grouped by emotions, a semantic dataset generated by merging the WordNet dictionary with the WordNet-Affect semantic structure. A text is then submitted to Lucene, and the emotions whose context similarity is above a threshold are given as candidates. The results of this approach were 29% of precision, 29% of recall and 25% of F1.
  • The second approach (the definitive one for this prototype, and the one stable within the SAM platform) consists of a machine learning kernel based on Fernández et al. (2013)[26], which applies skip-gram techniques to process the texts from an emotional point of view. The results of this approach were 65% of precision, 48% of recall and 54% of F1.
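
The precision, recall and F1 figures reported for emotion detection are averaged over the emotion classes; a minimal sketch of how such macro-averaged scores can be computed with scikit-learn is shown below (the gold and predicted label lists are dummy examples, not data from the evaluated corpora):

  # Macro-averaged precision, recall and F1 over the five emotion classes
  # (dummy gold/predicted labels; in practice these come from the evaluated datasets).
  from sklearn.metrics import precision_recall_fscore_support

  gold = ["happiness", "sadness", "fear", "anger-disgust", "surprise", "happiness"]
  pred = ["happiness", "sadness", "happiness", "anger-disgust", "fear", "happiness"]

  precision, recall, f1, _ = precision_recall_fscore_support(
      gold, pred, average="macro", zero_division=0)
  print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")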


Research results

The following scientific publications constitute the research results of the SAM technologies regarding Sentiment Analysis: [27], [28], [29]


Technology | Corpus | Precision | Recall
Sentiment Polarity classification (evaluation I, overall analysis) | 146 comments from SAM trials | 75% | 75%
Sentiment Polarity classification (evaluation II, overall analysis) | 10,709 comments of the Semeval task 2 dataset, 5,888 reviews of Metacritic[30] | 75% | 68%
Sentiment Polarity classification (evaluation II, polarity's intensity) | 5,006 reviews from "Movie Reviews - Scale Dataset 1.0"[31] | 66.5% (average) | 63% (average)
Sentiment Polarity classification (evaluation II, aspect based) | 3,009 comments from "Semeval 2014 - Task 4 - Subtask 1 - Restaurants"[32] | 61% | 55%
Emotion detection | 11,117 documents[33] | 65% | 48%

References

  1. J. Wiebe, "Tracking point of view in narrative," Computational Linguistic, vol. 20, pp. 233-287, 1994.
  2. J. Wiebe, T. Wilson, and C. Cardie, "Annotating Expressions of Opinions and Emotions in Language," in Kluwer Academic Publishers, Netherlands, 2005.
  3. BALAHUR, A. (2011) Methods and Resources for Sentiment Analysis in Multilingual Documents of Different Text Types. Department of Software and Computing Systems. Alicante, University of Alicante.
  4. HU, M. & LIU, B. (2004) Mining and Summarizing Customer Reviews. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2004). USA.
  5. HU, M. & LIU, B. (2004) Mining and Summarizing Customer Reviews. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2004). USA.
  6. MILLER, G. A., BECKWITH, R., FELLBAUM, C., GROSS, D. & MILLER, K. (1990) Five papers on WordNet. Princeton University, Cognitive Science Laboratory.
  7. STRAPPARAVA, C. & VALITUTTI, A. (2004) WordNet-Affect: an affective extension of WordNet. Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004). Lisbon.
  8. ESULI, A. & SEBASTIANI, F. (2006) SentiWordNet: A Publicly Available Lexical Resource for Opinion Mining. Fifth International Conference on Language Resources and Evaluation (LREC 2006). Genoa, Italy.
  9. WHITELAW, C., GARG, N. & ARGAMON, S. (2005) Using appraisal groups for sentiment analysis. Proceedings of CIKM 2005. Bremen, Germany.
  10. CERINI, S., COMPAGNONI, V., DEMONTIS, A., FORMENTELLI, M. & GANDINI, G. (2007) Language resources and linguistic theory: Typology, second language acquisition, English linguistics (Forthcoming), chapter Micro-WNOp: A gold standard for the evaluation of automatically compiled lexical resources for opinion mining.
  11. STONE, P. J., DUNPHY, D. C., SMITH, M. S. & OGILVIE, D. M. (1966) The General Inquirer: A Computer Approach to Content Analysis. The MIT Press.
  12. PANG, B., LEE, L. & VAITHYANATHAN, S. (2002) Thumbs up? Sentiment Classification using Machine Learning Techniques. EMNLP-02, the Conference on Empirical Methods in Natural Language Processing. USA.
  13. TURNEY, P. D. (2002) Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. Proceeding 40th Annual Meeting of the Association for Computational Linguistic. ACL 2002. USA.
  14. BALAHUR, A. & MONTOYO, A. (2008) Building a recommender system using community level social filtering. 5th International Workshop on Natural Language and Cognitive Science (NLPCS).
  15. CILIBRASI, R. & VITÁNYI, P. Automatic Meaning Discovery Using Google.
  16. BALAHUR, A. & MONTOYO, A. (2008) Applying a culture dependent emotion trigger database for text valence and emotion classification. Procesamiento del Lenguaje Natural.
  17. POPESCU, A. M. & ETZIONI, O. (2005) Extracting product features and opinions from reviews. Proceedings of HLT-EMNLP. Canada.
  18. Z. Kozareva, P. Nakov, A. Ritter, S. Rosenthal, V. Stoyanov, and T. Wilson, "Sentiment Analysis in Twitter," in Proceedings of the 7th International Workshop on Semantic Evaluation: Association for Computational Linguistics, 2013.
  19. J. Wiebe, T. Wilson, and C. Cardie, "Annotating Expressions of Opinions and Emotions in Language," in Kluwer Academic Publishers, Netherlands, 2005.
  20. A. Balahur, E. Boldrini, A. Montoyo, and P. Martinez-Barco, "The OpAL System at NTCIR 8 MOAT," in Proceedings of NTCIR-8 Workshop Meeting, Tokyo, Japan., 2010, pp. 241-245.
  21. Hatzivassiloglou, Vasileios, and J. Wiebe, "Effects of Adjective Orientation and Gradability on Sentence Subjectivity," in International Conference on Computational Linguistics (COLING-2000), 2000.
  22. S.-M. Kim and E. Hovy, "Extracting Opinions, Opinion Holders, and Topics Expressed in Online News Media Text," in In Proceedings of workshop on sentiment and subjectivity in text at proceedings of the 21st international conference on computational linguistics/the 44th annual meeting of the association for computational linguistics (COLING/ACL 2006), Sydney, Australia, 2006, pp. 1-8.
  23. SCHERER, K. & WALLBOTT, H. (1997) The ISEAR Questionnaire and Codebook. Geneva Emotion Research Group.
  24. J. Wiebe, T. Wilson, and C. Cardie, "Annotating Expressions of Opinions and Emotions in Language," in Kluwer Academic Publishers, Netherlands, 2005.
  25. BOLDRINI, E., BALAHUR, A., MARTÍNEZ-BARCO, P. & MONTOYO, A. (2009) EmotiBlog: an annotation scheme for emotion detection and analysis in non-traditional textual genres. Proceedings of the 5th International Conference on Data Mining (DMIN 2009).
  26. Fernández, J.; Gutiérrez, Y.; Gómez, JM.; Martínez-Barco, P.; Montoyo, A.; Muñoz, R. Sentiment Analysis of Spanish Tweets Using a Ranking Algorithm and Skipgrams. Journal Sociedad Espanola de Procesamiento de Lenguaje Natural, pp 133-142, 2013.
  27. Gutiérrez, Y.; Tomás, D.; Fernández, J. Benefits of Using Ranking Skip-Gram Techniques for Opinion Mining Approaches. Proceedings of eChallenges 2015 e-2015. pp 1-10. 2015
  28. Fernández, J.; Gutiérrez, Y.; Tomás, D.; Gómez, J.M.; Martínez-Barco, P. Evaluating a Sentiment Analysis Approach from a Business Point of View. Taller de Análisis de Sentimientos en la SEPLN (TASS). 2015
  29. Fernández, J.; Gutiérrez, Y.; Gómez, JM.; Martínez-Barco, P.; Montoyo, A.; Muñoz, R. Sentiment Analysis of Spanish Tweets Using a Ranking Algorithm and Skipgrams. Journal Sociedad Espanola de Procesamiento de Lenguaje Natural, pp 133-142, 2013.
  30. http://www.metacritic.com/about-metacritic
  31. https://www.cs.cornell.edu/people/pabo/movie-review-data
  32. http://alt.qcri.org/semeval2014/task4/index.php?id=data-and-tools
  33. Downloaded from http://lrc.cornell.edu/swedish/dataset/affectdata/index.html