Sentiment Analysis
Sentiment analysis, also known as opinion mining, is the contextual mining of text to extract subjective information. The goal is to understand the polarity of people's opinions, or their emotional reaction, towards a particular product, event or service.
A Real Life Example Where Sentiment Analysis Has Been Used
Sentiment analysis was used to measure public opinion on various policy announcements by the US President before the 2012 presidential election. By quickly visualizing the sentiment behind everything from posts to news articles, the people behind the effort could strategize and plan better.
Building A Sentiment Analysis Tool For Twitter Using Python
Twitter is a popular social media site and a good fit for running a sentiment analysis test. Here we start with a simple Python script for mining public opinion on Twitter.
The steps involved in the Python script are:
i) We gather Tweets using the Twitter API.
ii) We use the AYLIEN Text Analysis API to analyze their sentiment.
iii) We save the results in a CSV file for reporting and analysis.
iv) We visualize the results using matplotlib.
What we gain from the code:
i) Understand and measure the public’s reaction to, and opinion of, events on Twitter.
ii) Identify negative mentions of your competitors and use them to generate sales.
Pre-requisites
Installing the libraries and getting API keys:
i) Install the following libraries (this can be done with pip):
- tweepy (to gather Tweets)
- aylien-apiclient (to analyze the sentiment of the Tweets)
- matplotlib (to visualize the results)
ii) Get API keys for the following:
- API keys for Twitter:
You can get them from Twitter Developer.
Here is a reference – https://developer.twitter.com/en/community
(You can also use the free Twitter plan)
- API keys for AYLIEN:
You can get them by signing up for the Text API.
Here is a reference – https://developer.aylien.com/signup
(You can also use the free Text API plan)
iii) Copy, paste and run the script below.
The Python Script
import re
import sys
import csv
import tweepy
from tweepy import OAuthHandler
import matplotlib.pyplot as plt
from collections import Counter
from aylienapiclient import textapi


class TwitterClient(object):
    # Generic Twitter class for fetching tweets and cleaning their text.

    def __init__(self):
        # Keys and tokens for the Twitter API.
        consumer_key = "your consumer key here"
        consumer_secret = "your secret consumer key here"
        access_token = "your access token here"
        access_token_secret = "your secret access token here"
        try:
            self.auth = OAuthHandler(consumer_key, consumer_secret)
            self.auth.set_access_token(access_token, access_token_secret)
            self.api = tweepy.API(self.auth)
        except Exception:
            print("Error: Twitter Authentication Failed")

    def clean_tweet(self, tweet):
        # Remove @mentions, links, emojis and other special characters.
        return ' '.join(re.sub(r"(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+:\/\/\S+)", " ", tweet).split())

    def get_tweets(self, query, count=20):
        # Fetch tweets matching the query (default count is 20).
        # These are the tweepy 3.x names; tweepy 4 renamed them to
        # search_tweets and TweepyException.
        try:
            return self.api.search(q=query, count=count)
        except tweepy.TweepError as e:
            print("Error while fetching tweets : " + str(e))
            return []


class AYLIENClient(object):
    # Generic AYLIEN class for getting the sentiment of a given text.

    def __init__(self):
        # AYLIEN Text API credentials.
        application_id = "your app_id"
        application_key = "your api_key"
        try:
            self.api = textapi.Client(application_id, application_key)
        except Exception:
            print("Error: AYLIEN Authentication Failed")

    def get_tweet_sentiment(self, tweet):
        # Classify the sentiment of a tweet.
        return self.api.Sentiment({'text': tweet})


if sys.version_info[0] < 3:
    input = raw_input

query = input("What subject do you want to analyze for this example? \n")
# input() returns a string, so convert the count to an integer.
number = int(input("How many Tweets do you want to analyze? \n"))
file_name = 'Sentiment_Analysis_of_{}_Tweets_About_{}.csv'.format(number, query)

api = TwitterClient()
tweets = api.get_tweets(query=query, count=number)
text_api = AYLIENClient()

with open(file_name, 'w', newline='') as csvfile:
    csv_writer = csv.DictWriter(f=csvfile, fieldnames=["Tweet", "Sentiment"])
    csv_writer.writeheader()
    for result in tweets:
        cleaned_tweet = api.clean_tweet(result.text)
        # Skip tweets that are empty once cleaned.
        if len(cleaned_tweet) == 0:
            print('Empty Tweet')
            continue
        response = text_api.get_tweet_sentiment(cleaned_tweet)
        csv_writer.writerow({
            'Tweet': response['text'],
            'Sentiment': response['polarity']
        })

print("Analyzed {} Tweets about {} \n".format(number, query))
print("Saved data in {} \n".format(file_name))

with open(file_name, 'r') as data:
    counter = Counter()
    for row in csv.DictReader(data):
        counter[row['Sentiment']] += 1

pos = counter['positive']
neg = counter['negative']
neu = counter['neutral']
# Divide by the number of rows actually analyzed; avoid dividing by zero.
total = max(pos + neg + neu, 1)

print("Positive tweets percentage: {} %".format(100 * pos / total))
print("Negative tweets percentage: {} %".format(100 * neg / total))
print("Neutral tweets percentage: {} %".format(100 * neu / total))

colors = ['green', 'red', 'grey']
sizes = [pos, neg, neu]
labels = 'Positive', 'Negative', 'Neutral'

# Use matplotlib to plot the chart.
plt.pie(x=sizes, shadow=True, colors=colors, labels=labels, startangle=90)
plt.title("Sentiment of {} Tweets about {}".format(number, query))
plt.show()
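To see what the cleaning step does on its own, here is a minimal sketch that applies the same regular expression as clean_tweet to a single string. The sample Tweet is made up for illustration, and the standalone function is only a convenience for trying the pattern without Twitter credentials.

```python
import re

# Same pattern as in the script: @mentions, any character that is not a
# letter, digit, space or tab, and links of the form scheme://...
PATTERN = r"(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+://\S+)"


def clean_tweet(tweet):
    # Replace every match with a space, then collapse runs of whitespace.
    return ' '.join(re.sub(PATTERN, " ", tweet).split())


# Hypothetical sample Tweet, not real data.
sample = "@alice I love the new #Python release! https://example.com/x :)"
print(clean_tweet(sample))  # prints: I love the new Python release
```

Note that the hashtag word survives while the "#" itself is dropped, so topic words still reach the sentiment classifier.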
Inputs Required In The Code
The script is quick and easy to reuse because it asks you, through input(), for the following every time you run it from the shell:
- The terms you want to search Tweets for
- How many Tweets you want to gather and analyze (that is, the sample size)
query = input("What subject do you want to analyze for this example? \n")
number = input("How many Tweets do you want to analyze? \n")
NOTE:
In Python 3, input() returns what the user typed as a string, whereas in Python 2, input() evaluates it as a Python expression. The following lines therefore replace input() with raw_input() if you’re running Python 2.
if sys.version_info[0] < 3:
    input = raw_input
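Because input() hands back a string, the Tweet count has to be converted before it can be used as a number. A minimal sketch of that conversion follows; parse_count is a hypothetical helper introduced here for illustration and is not part of the original script.

```python
def parse_count(raw):
    """Convert the text typed at the prompt into a positive integer.

    Hypothetical helper, not part of the original script.
    """
    count = int(raw.strip())  # raises ValueError for input such as "ten"
    if count < 1:
        raise ValueError("the number of Tweets must be at least 1")
    return count


print(parse_count(" 20 "))  # prints: 20
```

In the script this could wrap the second prompt, for example number = parse_count(input(...)), so that a typo fails early instead of later in the run.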
A Few Important Things To Note
- Your results here are limited to 100 Tweets, and no error message is returned if you ask for more than 100. So even if the chart title reads “500 Tweets”, be aware that only the first 100 Tweets are analyzed.
- The Tweet text returned by the AYLIEN Text API is essentially the same as the text we got from Twitter. However, writing the text that the AYLIEN API returns to the CSV file reduces the chance of errors.
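Since fewer Tweets may be analyzed than were requested, percentages are safer when computed from the labels actually collected rather than from the requested count. The sketch below shows one way to do that; sentiment_percentages is a hypothetical helper and the label list is made-up sample data.

```python
from collections import Counter

POLARITIES = ("positive", "negative", "neutral")


def sentiment_percentages(labels):
    """Return the share of each polarity among the analyzed Tweets.

    Hypothetical helper: divides by the number of labels actually
    collected, not by the number of Tweets the user asked for.
    """
    counts = Counter(labels)
    total = len(labels)
    if total == 0:
        return {name: 0.0 for name in POLARITIES}
    return {name: 100.0 * counts[name] / total for name in POLARITIES}


# Made-up sample: four analyzed Tweets, whatever number was requested.
labels = ["positive", "positive", "negative", "neutral"]
print(sentiment_percentages(labels))
# prints: {'positive': 50.0, 'negative': 25.0, 'neutral': 25.0}
```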
Advanced Applications Of Sentiment Analysis
You can do much more with sentiment analysis using your AYLIEN API keys:
- Aspect-based sentiment analysis
Helps you identify the sentiment attached to each specific aspect mentioned in a Tweet.
- TAP
You can train your own language model using this feature.
Future Of Sentiment Analysis
Sentiment analysis is still at an early stage, and there are several aspects to work on. Some of them are:
i) Sentiment analysis needs to move beyond a one-dimensional scale that runs from positive to negative. We need a more sophisticated, multidimensional scale that can capture a broader range of emotions.
ii) Text analytics should be made capable of measuring hope, anxiety, excitement, etc.
iii) The accuracy and reliability of sentiment analysis are still a concern. We should focus on achieving results as close as possible to human performance.