Machine Learning For Sentiment Analysis (Using Python)


Sentiment Analysis

Sentiment analysis, also called opinion mining, is the contextual mining of text to extract subjective information. The goal is to understand the polarity (positive, negative or neutral) of people's emotional reaction to a particular product, event or service.

A Real Life Example Where Sentiment Analysis Has Been Used

Sentiment analysis was used to gauge public opinion on various policy announcements by the US President ahead of the 2012 presidential election. By quickly visualizing the sentiment behind everything from posts to news articles, the President's team could strategize and plan better.

Building A Sentiment Analysis Tool For Twitter Using Python

Twitter is a popular social media site and a good fit for a sentiment analysis experiment. Here we start with a simple Python script for mining public opinion on Twitter.

The steps involved in the Python script are:

i)     We gather Tweets using the Twitter API.

ii)    We use the AYLIEN Text Analysis API to analyze their sentiment.

iii)   We visualize the results using matplotlib.

iv)    We save the results in a CSV file for reporting and analysis.

What we gain from the code:

i)    Understanding and measuring the public's reaction to, and opinion of, events on Twitter.

ii)   Identifying negative mentions of your competitors and using them to generate sales.


Installing the libraries and getting API keys:

 i) Install the following libraries (this can be done with pip):

  • tweepy (to gather Tweets)
  • aylien-apiclient (to analyze the sentiment of the Tweets)
  • matplotlib (to visualize the results)

 ii) Get API Keys for the following :

  • API keys for Twitter :

You can get them from the Twitter Developer site.

Here is a reference –

(You can also use the free Twitter plan)

  • API keys for AYLIEN

You can get them by signing up for the AYLIEN Text API.

Here is a reference –

(You can also use the free Text API plan)

 iii) Copy, paste and run the script below.
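Before running the script, it can help to confirm that the three libraries are actually importable. The following is a minimal sketch, not part of the original script: the helper name missing_modules is hypothetical, and the import names are taken from the script's own import lines (note that aylien-apiclient is imported as aylienapiclient).

```python
import importlib.util

# Import names (not pip names): aylien-apiclient is imported as aylienapiclient.
REQUIRED_MODULES = ["tweepy", "aylienapiclient", "matplotlib"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules are installed.")
```

Any module reported as missing can then be installed with pip before moving on.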

The Python Script

import re
import sys
import csv
import tweepy
from tweepy import OAuthHandler
import matplotlib.pyplot as plt 

from collections import Counter
from aylienapiclient import textapi

class TwitterClient(object):
    # Generic Twitter class for fetching tweets from Twitter and cleaning them.

    def __init__(self):
        # Class constructor: authenticates against the Twitter API.
        # Keys and tokens for the Twitter API
        consumer_key = "your consumer key here"
        consumer_secret = "your secret consumer key here"
        access_token = "your access token here"
        access_token_secret = "your secret access token here"
        try:
            self.auth = OAuthHandler(consumer_key, consumer_secret)
            self.auth.set_access_token(access_token, access_token_secret)
            self.api = tweepy.API(self.auth)
        except Exception:
            print("Error: Twitter Authentication Failed")

    def clean_tweet(self, tweet):
        # Clean tweet text by removing mentions, links and special characters (including emojis)
        return ' '.join(re.sub(r"(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+:\/\/\S+)", " ", tweet).split())

    def get_tweets(self, query, count=20):
        # Fetch tweets matching the query.
        # The default count for fetching tweets is 20.
        try:
            # tweepy 3.x names; in tweepy 4.x these are search_tweets and TweepyException
            fetched_tweets = self.api.search(q=query, count=count)
            return fetched_tweets
        except tweepy.TweepError as e:
            print("Error while fetching tweets : " + str(e))

class AYLIENClient(object):
    # Generic AYLIEN class for getting the sentiment of a given text.

    def __init__(self):
        # Class constructor: authenticates against the AYLIEN Text API.
        # AYLIEN credentials
        application_id = "your app_id"
        application_key = "your api_key"
        try:
            self.api = textapi.Client(application_id, application_key)
        except Exception:
            print("Error: AYLIEN Authentication Failed")

    def get_tweet_sentiment(self, tweet):
        # Classify the sentiment of a tweet
        response = self.api.Sentiment({'text': tweet})
        return response

if sys.version_info[0] < 3:
    input = raw_input

query = input("What subject do you want to analyze for this example? \n")
number = input("How many Tweets do you want to analyze? \n")
file_name = 'Sentiment_Analysis_of_{}_Tweets_About_{}.csv'.format(number, query)

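api = TwitterClient()
text_api = AYLIENClient()
# input() returns a string, so convert the count; fall back to an empty list if the fetch failed
tweets = api.get_tweets(query=query, count=int(number)) or []

with open(file_name, 'w', newline='') as csvfile:
    csv_writer = csv.DictWriter(
        f=csvfile,
        fieldnames=["Tweet", "Sentiment"]
    )
    csv_writer.writeheader()

    for c, result in enumerate(tweets, start=1):
        tweet = result.text
        cleaned_tweet = api.clean_tweet(tweet)
        if len(cleaned_tweet) == 0:
            print('Empty Tweet')
            continue
        response = text_api.get_tweet_sentiment(cleaned_tweet)
        # Write the text as returned by AYLIEN together with its polarity
        csv_writer.writerow({
            'Tweet': response['text'],
            'Sentiment': response['polarity']
        })
        print("Analyzed Tweet {}".format(c))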

print("Analyzed {} Tweets about {} \n".format(number, query))

print("Saved data in Sentiment_Analysis_of_{}_Tweets_About_{}.csv \n".format(number, query))

with open(file_name, 'r') as data:
    counter = Counter()
    for row in csv.DictReader(data):
        counter[row['Sentiment']] += 1

    pos = counter['positive']
    neg = counter['negative']
    neu = counter['neutral']

# number of classified tweets (at least 1, to avoid dividing by zero when nothing was fetched)
total = max(pos + neg + neu, 1)
# percentage of positive tweets
print("Positive tweets percentage: {} %".format(100*pos/total))
# percentage of negative tweets
print("Negative tweets percentage: {} %".format(100*neg/total))
# percentage of neutral tweets
print("Neutral tweets percentage: {} %".format(100*neu/total))

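## declare the variables for the pie chart, using the Counter values for "sizes"
colors = ['green', 'red', 'grey']
sizes = [pos, neg, neu]
labels = 'Positive', 'Negative', 'Neutral'
## use matplotlib to plot the chart
plt.pie(
    x=sizes,
    shadow=True,
    colors=colors,
    labels=labels,
    startangle=90
)
plt.title("Sentiment of {} Tweets about {}".format(number, query))
plt.show()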
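
To see what the cleaning step in the script does to a Tweet before it is sent for analysis, here is a small standalone sketch. It reuses the same regular expression as clean_tweet; the sample Tweet is made up for illustration.

```python
import re

# Same pattern as clean_tweet in the script:
# mentions, non-alphanumeric characters (including emojis), and URLs.
PATTERN = r"(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+:\/\/\S+)"


def clean_tweet(tweet):
    """Replace every match with a space, then collapse runs of whitespace."""
    return ' '.join(re.sub(PATTERN, " ", tweet).split())


# Hypothetical sample Tweet, not real data
sample = "@alice Loving the new phone!!! https://t.co/abc123 #happy"
print(clean_tweet(sample))  # prints: Loving the new phone happy
```

Note that punctuation inside words is replaced too, so "don't" becomes "don t". That is acceptable for a quick experiment, but it may matter if you need the original wording.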

Inputs Required In The Code

Each time you run the script from the shell, it uses input() to ask for two values, which makes it quick and easy to change:

  1. The terms you want to search Tweets for
  2. How many Tweets you want to gather and analyze (that is, the sample size)

query = input("What subject do you want to analyze for this example? \n")

number = input("How many Tweets do you want to analyze? \n")


In Python 3, input() returns whatever you type as a string, whereas in Python 2, input() evaluates it as a Python expression. The following lines therefore rebind input to raw_input if you are running Python 2.

if sys.version_info[0] < 3:
    input = raw_input
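
Because input() returns a string, the number of Tweets should be converted and checked before it is used. The following is a minimal sketch of one way to do that. The helper name parse_tweet_count is hypothetical, and the default cap of 100 reflects the search limit this article mentions.

```python
def parse_tweet_count(raw, maximum=100):
    """Convert the raw input string to an int between 1 and maximum."""
    try:
        count = int(raw.strip())
    except ValueError:
        raise ValueError("Please enter a whole number, got {!r}".format(raw))
    if count < 1:
        raise ValueError("The number of Tweets must be at least 1")
    # Requests above the cap are silently reduced rather than rejected
    return min(count, maximum)


print(parse_tweet_count("25"))   # prints: 25
print(parse_tweet_count("500"))  # prints: 100
```

Clamping the value this way also keeps the file name and chart title in line with the number of Tweets that can actually be analyzed.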

A Few Important Things To Note

  1. Your results are limited to 100 Tweets per search, and no error message is returned if you ask for more than 100. So even if the chart title reads "500 Tweets", bear in mind that only the first 100 Tweets are analyzed.
  2. The Tweet text returned by the AYLIEN Text API is essentially the same as the text we got from Twitter. However, writing the text that the AYLIEN API returns to the CSV file reduces the chance of errors.
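
Regarding the first note, one common way around a per-request limit is to request several pages and stitch them together. The sketch below shows the idea only. It assumes the search can be paged by asking for Tweets with an id at or below a given max_id. The function collect_tweets and the fetch_page callable are hypothetical stand-ins, not part of tweepy or of the script above, and the fake data source exists only so the example runs offline.

```python
def collect_tweets(fetch_page, total, page_size=100):
    """Gather up to `total` tweets by requesting pages of at most `page_size`.

    `fetch_page(count, max_id)` is a hypothetical callable that returns a list
    of dicts with an integer "id", newest first, all with id <= max_id
    (or the newest tweets when max_id is None).
    """
    collected = []
    max_id = None
    while len(collected) < total:
        wanted = min(page_size, total - len(collected))
        page = fetch_page(wanted, max_id)
        if not page:
            break  # no more results available
        collected.extend(page)
        # Ask for strictly older tweets on the next request
        max_id = min(tweet["id"] for tweet in page) - 1
    return collected


def fake_fetch_page(count, max_id):
    """Offline stand-in for a search call: pretends 250 tweets exist (ids 250..1)."""
    newest = 250 if max_id is None else min(max_id, 250)
    return [{"id": i} for i in range(newest, max(newest - count, 0), -1)]


print(len(collect_tweets(fake_fetch_page, 120)))  # prints: 120
print(len(collect_tweets(fake_fetch_page, 500)))  # prints: 250
```

In a real run, fetch_page would wrap the Twitter search call. Keep in mind that API rate limits and the AYLIEN plan's request quota still apply to larger samples.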

Advanced Applications Of Sentiment Analysis

You can do much more with sentiment analysis using your AYLIEN API keys:

  • Aspect-based sentiment analysis

This feature helps you identify the specific sentiment attached to each aspect mentioned in a Tweet.

  • TAP

This feature lets you train your own language model.

Future Of Sentiment Analysis

Sentiment analysis is still at an early stage, and several aspects need work. Some of them are:

i) Sentiment analysis needs to move beyond a one-dimensional scale that runs from positive to negative. We need a more sophisticated, multidimensional scale that can capture a broader range of emotions.

ii) Text analytics should be able to measure hope, anxiety, excitement and similar emotions.

iii) The accuracy and reliability of sentiment analysis are still a concern. We should aim for results as close as possible to human performance.

