Research Article: 2021 Vol: 27 Issue: 2S
Pedro Neves Mata, ISCTE-IUL Instituto Universitário de Lisboa (ISCTE-IUL)
Mário Nuno Mata, ISCAL-IPL: Instituto Superior de Contabilidade e Administração de Lisboa
Jéssica Nunes Martins, NOVA -Information Management School (NOVA IMS), Universidade Nova de Lisboa
João Xavier Rita, ISCAL-IPL (Instituto Superior de Contabilidade e Administração de Lisboa)
Anabela Batista Correia, ISCAL-IPL (Instituto Superior de Contabilidade e Administração de Lisboa)
Sentiment Analysis (SA), Opinion Mining (OM), Text Mining, Natural Language Processing (NLP), Case Study, Application.
Sentiment Analysis (SA), or Opinion Mining (OM), is a field of study within the broader topic of Natural Language Processing. SA seeks to understand people's opinions, feelings, assessments, attitudes and emotions expressed in text in order to generate knowledge and relevant information on a particular subject; in the business world, the focus is mostly on understanding how products are evaluated. It can often be reduced to interpreting the attitude behind a text: whether it is positive, negative or neutral. The growing importance of SA coincides with the growth of social networks, opinions, criticism, forum discussions, and blogs, among others. This exponential growth of data has created the need to apply SA in almost all social and commercial domains, because opinions are key in almost all activities and are one of the factors that influence human and social behaviors, beliefs and perceptions of our own choices. Because opinion is one of the main factors influencing people's choices, the spectrum of analysis has become broader for organizations, making this a very relevant topic these days. This paper reveals that, although there have been some advances in algorithms, techniques and frameworks to help SA implementations, there is still a gap in identifying benefits for business applications.
The field of Artificial Intelligence (AI) is constantly growing and discovering new ways to solve real-world problems (Moreno & Redondo, 2016). One of the AI knowledge and research fields is Natural Language Processing (NLP), which attempts to classify and process human language data so that devices can comprehend humans.
NLP is an AI field focused on enabling computers to understand, process and act on human languages, bringing computers closer to a human-level understanding of language (Jurado & Rodriguez, 2015). Advances in Machine Learning (ML) have allowed computers to do many useful things using NLP techniques and deep learning (Zhang et al., 2018), such as online language translation or semantic understanding (Feldman, 2013).
One of the most popular and important uses of NLP is Sentiment Analysis (SA) (Sayeedunnisa et al., 2018). With this technique, we can build systems that attempt to identify and extract opinions or sentiments from speech or written texts (Kotzias et al., 2015). This type of analysis is extremely important for organizations because they can take customers' opinions into account and improve their products and businesses accordingly.
Also known as Opinion Mining (OM), SA can be defined as the “computational treatment of opinion, sentiment, and subjectivity in text” (Bakshi et al., 2016; Dwivedi et al., 2019). It has been applied to many contexts, like reviewing customers, products and services, examining reputations in social networks [REF], tracking people’s feelings about politicians, promoting marketing campaigns, among others (Feldman, 2013).
Both SA and OM were originally conducted for two purposes. The first purpose is to analyze people’s sentiment on an issue or phenomenon. Hence, sentiment analysis goes through a huge amount of textual data to identify people’s attitudes, thoughts, judgments, and emotions on an issue (Feldman, 2013; Yu, 2003; Hatzivassiloglou, 2003). The second purpose is to assess people’s opinion on a product, person, event, organization, or topic from the perspective of a user or group of users. Similar to SA, OM is an NLP task that uses an algorithmic technique to recognize opinionated content and classify it into positive, negative, or neutral polarity (Piryani et al., 2017). Nonetheless, the application of OM has been extended to other fields of human-computer applications, and the applications are growing with the growth in big data analytics (Shayaa et al., 2018).
Despite a large number of studies on SA and OM techniques, their impact on organizations has been less studied. Rather than concentrating on techniques and algorithms, this systematic review arises from the need to summarize all relevant information about the application of SA in organizations and the value it creates.
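As a rough illustration of classifying opinionated content into positive, negative, or neutral polarity, the following Python sketch counts hits against a tiny word list. The lexicon, the function name `polarity` and the sample reviews are hypothetical and are not taken from any of the reviewed studies, which rely on much larger lexicons or trained models.

```python
import re

# Hypothetical toy lexicon (an assumption for illustration only);
# real systems use far larger lexicons or trained classifiers.
POSITIVE = {"good", "great", "excellent", "love", "happy"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "slow"}


def polarity(text: str) -> str:
    """Label a text as positive, negative, or neutral by counting lexicon hits."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical sample reviews.
reviews = [
    "The battery life is great and I love the screen",
    "Terrible support and slow delivery",
    "The package arrived on Tuesday",
]
labels = [polarity(r) for r in reviews]
print(list(zip(reviews, labels)))
```

A word-counting approach like this ignores negation, sarcasm and context, which is one reason the literature explores supervised learning and deep learning methods instead.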
To draw a general conclusion about this phenomenon, this research evaluates individual studies that can help to understand the main features of this field.
After this introduction, the rest of the paper is structured in four sections. The next section defines the research methods applied. We then describe the data used in this paper and the metrics used to compare them, followed by the main results of the analysis. Finally, we present our conclusions.
A Systematic Literature Review (SLR) is a process of identifying, assessing and interpreting all available research evidence in order to answer a particular research question.
It is a form of secondary study that uses a well-defined methodology:
• To systematically accumulate, organize, evaluate and synthesize all existing research evidence in a research area.
• To present a fair evaluation of a research topic by using a trustworthy, rigorous, and auditable methodology.
• To produce reliable and unbiased results.
• To identify gaps in the existing research that will lead to topics for further investigation.
• To provide a background to position new research activities.
• To support evidence-based research.
Steps for Conducting a SLR
Planning the Review
• Formulate the review's research questions.
• Develop the review's protocol.
Conducting the Review
• Search the relevant literature
• Perform a selection of primary studies
• Perform data extraction
• Assess studies quality
• Conduct synthesis of evidence
Reporting the Review
• Write up the results of the review
• Draw conclusions
The objective of the paper is to discuss the value that can be added through sentiment analysis in organizations and try to identify potential risks so it can generate innovation throughout the business.
Therefore, our systematic review must answer three questions:
RQ1: Which industries have applied sentiment analysis and opinion mining?
RQ2: What kinds of advantages and disadvantages are related to implementation?
RQ3: What are the main innovations that are correlated with this kind of implementation?
Study Selection and Evaluation
To obtain a comprehensive set of papers, we started by searching for studies in the widely accepted literature search engines and databases ACM, IEEE, Science Direct and Springer, which contain text collections such as journal papers, chapters, conference proceedings, review articles and research articles. These papers were collected based on their title, keywords, abstract, and rank.
Since the study is based on sentiment analysis and its implementations, we used three different search strings to identify relevant papers:
• Sentiment Analysis application
• Sentiment Analysis case study
• Sentiment Analysis implementation
To ensure and maintain the quality of the paper, we constrained our selection of articles according to the following criteria:
1. Filter in title or abstract
2. Filter in title
3. Filter by rank: Q1 or Q2 for journals (based on SCImago Journal & Country Rank) and A or B for conference proceedings (based on ERA - Australian Computing Research and Education Association of Australasia)
4. Removing duplicates
The criteria were applied as shown in Table 1:
| Keyword | Database | 0. No filter | 1. Filter in title or abstract | 2. Filter in title | 3. Rank | 4. Duplicates |
| Sentiment Analysis application | ACM | 266,586 | 9 | 36 | 10 | 10 |
| Sentiment Analysis case study | ACM | 231,557 | 2 | 8 | 1 | 1 |
| Sentiment Analysis implementation | ACM | 200,406 | 0 | 0 | 0 | 0 |
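The four selection criteria can be sketched as a small filtering pipeline. This is only an illustrative sketch: it assumes the filters are applied one after another (the source does not state this explicitly), and the `Paper` record, the `select` function, the matching phrase and the sample entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    """Hypothetical record for one search result."""
    title: str
    abstract: str
    rank: str  # e.g. "Q1"/"Q2" for journals, "A"/"B" for conference proceedings


# Ranks accepted by criterion 3.
ACCEPTED_RANKS = {"Q1", "Q2", "A", "B"}


def select(papers, phrase):
    """Apply the four criteria in sequence (an assumption) and return the survivors."""
    phrase = phrase.lower()
    # 1. Phrase appears in the title or the abstract.
    step1 = [p for p in papers
             if phrase in p.title.lower() or phrase in p.abstract.lower()]
    # 2. Phrase appears in the title.
    step2 = [p for p in step1 if phrase in p.title.lower()]
    # 3. Venue rank is among the accepted ranks.
    step3 = [p for p in step2 if p.rank in ACCEPTED_RANKS]
    # 4. Remove duplicates, comparing titles case-insensitively.
    seen, step4 = set(), []
    for p in step3:
        key = " ".join(p.title.lower().split())
        if key not in seen:
            seen.add(key)
            step4.append(p)
    return step4


# Hypothetical sample entries.
papers = [
    Paper("Sentiment Analysis for Hotel Reviews", "We apply SA to reviews.", "Q1"),
    Paper("sentiment analysis for hotel reviews", "Same paper, other database.", "Q1"),
    Paper("A Survey of Topic Models", "Includes a sentiment analysis case study.", "Q1"),
    Paper("Sentiment Analysis of Forum Posts", "An implementation report.", "Q4"),
]
selected = select(papers, "sentiment analysis")
print([p.title for p in selected])
```

Deduplicating on a normalized title is one simple choice; matching on a DOI, where available, would be more robust.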
Data Extraction Analysis
A more in-depth analysis was done regarding the sources of datasets which are shown in Figures 1 and 2.
Figure 1 shows a rise in articles in 2014 and 2015, followed by a decline in 2016 and 2017; the numbers recover in 2018 and 2019, with four (4) articles.
The main industry of application, shown in Figure 2, is Social Media with ten (10) articles, followed by Technology with five (5) articles, Education with two (2) articles, and all the others with one (1) article each.
Synthesis of Selected Studies
SA and OM have grown in popularity in recent years and have been applied in diverse areas such as Communications and Media, Energy and Utilities, Industry, Healthcare, the Financial sector, the Public sector, Distribution, Banking, Social Media, and Technology, among others.
Some of these applications will be reviewed in this section following the studies that were selected.
E-commerce
The e-commerce market is gaining importance worldwide because it brings convenience to our lives as shoppers: users can search, browse, compare, and purchase various items without time or geographical constraints.
The selected study offers a decision support model for item comparison in e-commerce, using qualitative flexible multiple criteria methods and online reviews to support consumers (Ji, 2018).
Education
The acquisition of knowledge, skills, values, beliefs, and habits has been taking place for thousands of years, and educational methods include teaching, training, storytelling, discussion and directed research. Education frequently takes place under the guidance of teachers; however, learners can also educate themselves.
SA was applied from two different perspectives: on one side, it helped analyze student evaluation comments about a specific professor, and on the other, it was applied to assist language learning (Borromeo & Toyama, 2015; Cao, 2014).
Hospitality & Sales
The hospitality industry is a broad category of fields within the service industry that includes lodging, food and drink service, event planning, theme parks, transportation, traveling, airline and additional areas within the tourism industry.
The hospitality industry concentrates on customer satisfaction by creating good services and products that meet customers' needs. It is therefore important for service providers to establish a good relationship with customers so that they come back for more, which makes this an excellent area for SA evaluation.
Since the main goal of almost all corporations is to sell goods or services, we can find two good examples of SA applications: one supports choosing the best restaurant for an outing based on textual reviews available online (Dwivedi & Pant, 2019), and the other explores the relationship between the sales performance of products and their reviews (Liu, 2010).
Social Media and Technology
Social media is the technology that facilitates the sharing of ideas, thoughts, and information through the building of virtual networks and communities. By design, social media is internet-based and gives users quick electronic communication of content.
SA is extremely useful in social media monitoring, as it gives an overview of public opinion on certain topics, products or services. Extracting insights from social data is a practice that this technology enables and that is being adopted by organizations across the world.
The selected studies show a constant creation and application of algorithms, methodologies and frameworks to assist this pursuit of insights (Chen et al., 2014; Costa et al., 2012; Feldman, 2013; Gimnez et al., 2019; Jurado & Rodriguez, 2015; Kauffmann et al., 2019; Konan et al., 2016; Kranjc et al., 2015; Li et al., 2016; López et al., 2019; Shayaa et al., 2018; Oliveira et al., 2014; Tun Thura Thet et al., 2010). Some of the studies look for correlations in image, sound or audiovisual SA (Chen et al., 2014; Konan et al., 2016; Tun Thura Thet et al., 2010), while others pursue the best technique for applying SA (Feldman, 2013; Jurado & Rodriguez, 2015; López et al., 2019; Shayaa et al., 2018).
Since this is a field in which knowledge is still growing, these studies help to clarify it.
Transportation
Transportation is the movement of humans, animals and goods from one location to another. Since this is a primary concern in our daily lives, the selected paper focuses on the use of SA to help understand traffic information from websites, taking human affection into account to enrich the analysis (Cao et al., 2014).
The main source of data is Twitter, with five (5) articles; the remaining studies use several other sources for the analysis. Overall, there is a pattern of mainly using social media sources to obtain insights.
Advantages, Disadvantages & Innovation
We realize that this collection of papers doesn’t gather information on business impact for organizations.
Table 2 summarizes the selected studies in terms of industries, datasets, advantages versus disadvantages, and innovation criteria for business applications.
| Industry | Study | Contribution | Dataset | Advantages | Disadvantages | Innovation |
| E-commerce | (Ji et al., 2018) | Build a review-based decision support model for item comparison in e-commerce. | PConline.com | n/a | n/a | n/a |
| Education | (Borromeo & Toyama, 2015) | Compare SA identification from manual, crowdsourced and automatic systems. | .csv files | n/a | n/a | n/a |
| | (Chen et al., 2018) | Application of SA to language learning. | Several documents | n/a | n/a | n/a |
| Hospitality | (Dwivedi & Pant, 2019) | Framework for big data SA on real-time updates in online reviews or text for best decision selection. | Kaggle | n/a | n/a | n/a |
| Sales | (Liu et al., 2010) | Algorithm that can be applied to sales performance prediction. | IMDB | n/a | n/a | n/a |
| Social Media | (Chen et al., 2014) | Framework to detect visual concepts. | Flickr | n/a | n/a | n/a |
| | (Costa et al., 2012) | Framework for building blog mining applications in e-commerce. | Blogosphere | n/a | n/a | n/a |
| | (Konan et al., 2016) | Algorithm to choose the correct background music (BGM) for a photo or movie scene. | Movies | n/a | n/a | n/a |
| | (Kranjc et al., 2015) | Methodology and workflow implementation for SA using data streams. | | n/a | n/a | n/a |
| | (Li et al., 2015) | Two algorithms: a Weibo emotion classification algorithm and a Weibo open evaluation algorithm. | | n/a | n/a | n/a |
| | (Ofek et al., 2015) | Demonstrate that text styling, in terms of pronoun usage, is useful for some text analyses that relate to emotional states. | Cancer Survivors Network forum | n/a | n/a | n/a |
| | (Oliveira et al., 2014) | Algorithm to create a stock market lexicon. | StockTwits | n/a | n/a | n/a |
| | (Sharma et al., 2018) | Create a web-based application that allows visualization of current sentiments associated with a keyword in Twitter messages by plotting them on a map. | | n/a | n/a | n/a |
| | (Tellez et al., 2017) | Identify, in a large set of combinations, which text and token-weighting schemes have the most impact on the accuracy of a trained classifier (SVM). | | n/a | n/a | n/a |
| | (Tun Thura Thet et al., 2018) | Two frameworks for joint visual-textual sentiment analysis, both of which try to integrate textual and visual information into a unified model. | Visual Sentiment Ontology, Flickr, Getty Images | n/a | n/a | n/a |
| Technology | (Feldman, 2013) | Techniques and applications for SA. | n/a | n/a | n/a | n/a |
| | (Gimnez et al., 2019) | Methodology for applying semantic-based padding in Convolutional Neural Networks for NLP. | Stanford Sentiment Treebank | n/a | n/a | n/a |
| | (Jurado & Rodriguez, 2015) | SA techniques to identify and monitor the underlying sentiments in the text written by developers. | GitHub | n/a | n/a | n/a |
| | (Kauffmann et al., 2019) | Framework for big data analytics in commercial social networks. | Amazon | n/a | n/a | n/a |
| | (López et al., 2019) | Identify the most accurate supervised learning method for sentiment analysis. | Yelp, Amazon | n/a | n/a | n/a |
| | (Shayaa et al., 2018) | Understanding of the various OM and SA approaches performed on text analytics. | Twitter, Facebook, Amazon | n/a | n/a | n/a |
| Transportation | (Cao et al., 2014) | Traffic sentiment analysis (TSA) for processing traffic information from websites. | Twitter, Weibo, online forums, blogs | n/a | n/a | n/a |
In this paper, we have presented the results of a systematic literature review on SA, which included 23 different approaches. Based on this review, the main objectives were to help organizations classify existing and future applications in this area, as well as the advantages, disadvantages and innovation of possible implementations.
During the review process, we acquired knowledge of different research subareas and structured the results into several tables, which aim to speed up knowledge transfer among various research communities.
Regarding RQ1, the industries that applied SA were successfully identified and described; in number of applications, Social Media comes first, followed by Technology and Education, among others. Concerning RQ2 and RQ3, it was not possible to identify advantages, disadvantages and innovation for organizations, since these papers are oriented toward techniques and applications rather than toward organizations.
In summary, we learned that, although there have been some advances in algorithms, techniques and frameworks to help SA implementations, there is still a gap in identifying benefits for business applications. We believe that the results of our systematic review will help future studies to address these gaps.
Future research should address RQ2 and RQ3, for which we were not able to gather enough information to present a robust conclusion. As a starting point, a wider spectrum of studies could be included; in addition, it would be interesting to carry out a real application in a business organization.