In this tutorial, you’ll learn how to do sentiment analysis on Twitter data using Python.
Twitter is one of the most popular social networking platforms. With hundreds of millions of active users, there is a huge amount of information within daily tweets and their metadata.
With Twitter sentiment analysis, companies can discover insights such as customer opinions about their brands and products to make better business decisions. Analyzing Twitter sentiment is also a popular way to study public views on political campaigns or other trending topics.
With an example, you’ll discover the end-to-end process of Twitter sentiment analysis in Python:
- How to extract data from Twitter APIs.
- How to process the data for TextBlob sentiment analysis.
- How to evaluate the sentiment analysis results.
- Plus, some visualizations of the insights.
If you want to learn about the sentiment of a product/topic on Twitter, but don’t have a labeled dataset, this post will help!
Let’s jump in.
Before we start
If you are new to Python, please take our FREE Python crash course for data science. This tutorial assumes you have basic knowledge of Python. It’s also good to know the Python library pandas: Learn Python Pandas for Data Science: Quick Tutorial.
We’ll also be requesting Twitter data by calling its APIs; you can learn the basics in How to call APIs with Python to request data.
Lastly, what is sentiment analysis?
Sentiment analysis (also known as opinion mining or emotion AI) refers to the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information.
Wikipedia
A basic sentiment analysis task is classifying the polarity of some given text. For example, a restaurant review saying, ‘This is so tasty. I love it!’ obviously shows a positive sentiment, while the sentence ‘I want to get out of here as soon as possible’ is more likely a negative one.
With this basic knowledge, we can start our process of Twitter sentiment analysis in Python!
Step #1: Set up Twitter authentication and Python environments
Before requesting data from Twitter, we need to apply for access to the Twitter API (Application Programming Interface), which gives the public easy access to Twitter data.
There are different tiers of APIs provided by Twitter. We’ll be using the Premium search APIs with the Search Tweets: 30-day endpoint, which provides tweets posted within the previous 30 days. If you are interested in exploring other APIs, check out the Twitter API documentation.
Following the instructions, you can easily apply for a Twitter developer account, create an app, and generate four keys/tokens as your credentials to use the API:
- consumer key
- consumer secret
- access token
- access token secret
Further Reading: if you are not familiar with APIs, check out our tutorial How to call APIs with Python to request data.
Next, we’ll install and import some Python libraries needed for our sentiment analysis:
- TwitterAPI: provides easy access to Twitter APIs.
  Note: If you read our Python API tutorial’s Twitter example, you’ll find it much easier to authenticate with this TwitterAPI package.
- pandas: data manipulation and analysis tool.
- json: encoder and decoder of the JSON format.
  The Twitter API responds in JSON format, so we need this package to decode it.
- time: useful time-related functions.
- textblob: a pre-built NLP tool based on NLTK, which we’ll be using for sentiment analysis.
  We don’t have a large training dataset with sentiment labels to build models, so we’ll apply the existing TextBlob tool.
You can use the ‘pip install <library_name>’ command to install these packages. For example, to install the TextBlob package, we can run the command below.
pip install textblob
Once you have all the packages installed, we can run the Python code below to import them.
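If you’d like to follow along, here is a minimal import cell covering the packages above (pandas is aliased as pd, the usual convention):

```python
from TwitterAPI import TwitterAPI   # Twitter API client
from textblob import TextBlob       # pre-built sentiment model

import pandas as pd                 # data manipulation
import json                         # decode JSON responses
import time                         # pause between API requests
```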
Next, let’s input the four tokens and instantiate a TwitterAPI object. These tokens are credentials to authenticate your access to the Twitter API, so please keep them secret like other usernames/passwords.
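The exact setup depends on your account and the TwitterAPI version, but the sketch below shows the general idea; the placeholder strings are where your own credentials go. Depending on the endpoint, you may also need to pass auth_type='oAuth2'.

```python
# Replace the placeholders with your own credentials from the Twitter developer portal.
# Keep them secret -- never commit them to a public repository.
consumer_key = 'YOUR_CONSUMER_KEY'
consumer_secret = 'YOUR_CONSUMER_SECRET'
access_token = 'YOUR_ACCESS_TOKEN'
access_token_secret = 'YOUR_ACCESS_TOKEN_SECRET'

# Instantiate the TwitterAPI client with the four credentials
api = TwitterAPI(consumer_key, consumer_secret, access_token, access_token_secret)
```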
Note: due to the changes with Twitter APIs, the detailed procedures might vary from time to time.
Step #2: Request data from Twitter API
Now we are ready to get data from Twitter.
To standardize the extraction process, we’ll create a function that:
- takes a search keyword as input
- searches for tweets with this keyword
- then returns the related tweets as a pandas dataframe.
To achieve this, we created the three functions below:
- get_df_from_tweets: transforms the JSON-format response from the Twitter API into the familiar pandas dataframe.
- get_df_from_search: mainly helps to specify the search parameters and grab the related last 30 days of Twitter data. We set the parameter called query as the search keyword within this tutorial. This function uses the previous function get_df_from_tweets.
- get_data: the final function to get data from the Twitter API, which uses the previous two functions.
Twitter only allows 100 records for each search query for our environment and account. To get more records, this function obtains the ‘next’ token after each query and uses it to grab the following page of data on the next request. You can learn more about the details by reading the commented code below.
When running this function, it will return the status code of the API calls and the quota/limitation left for our Twitter accounts.
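The original helper code isn’t reproduced here, so below is a rough sketch of what these three functions could look like with the TwitterAPI package. The endpoint label ‘:dev’, the response accessors, and the two-second pause are assumptions you may need to adapt to your own setup.

```python
# ':dev' is the label of your premium search environment -- replace it with your own (assumption).
ENDPOINT = 'tweets/search/30day/:dev'

def get_df_from_tweets(data):
    # Transform the parsed JSON response (a dict with a 'results' list) into a pandas dataframe.
    return pd.DataFrame(data.get('results', []))

def get_df_from_search(query, next_token=None):
    # Build the search parameters and request one page (up to 100 tweets) from the 30-day endpoint.
    params = {'query': query, 'maxResults': 100}
    if next_token is not None:
        params['next'] = next_token
    response = api.request(ENDPOINT, params)
    # Print the status code and the remaining request quota for our account.
    print(response.status_code, response.get_quota())
    data = json.loads(response.text)
    return get_df_from_tweets(data), data.get('next')

def get_data(query, max_requests=30):
    # Page through the results, roughly 100 tweets per request, up to max_requests pages.
    dfs, next_token = [], None
    for _ in range(max_requests):
        df, next_token = get_df_from_search(query, next_token)
        dfs.append(df)
        if next_token is None:  # no more pages available
            break
        time.sleep(2)           # small pause to stay well within the rate limit
    return pd.concat(dfs, ignore_index=True)
```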
That’s a lot of work!
With these predefined functions, we can easily grab data.
Let’s focus our analysis on tweets related to Starbucks, a popular coffee brand.
In the Python code below, we use the function get_data to extract 3000 (30*100) tweets mentioning the keyword ‘@starbucks’.
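Assuming the get_data sketch above, the call looks something like this:

```python
# Pull up to 30 pages of 100 tweets each that mention '@starbucks'
df_starbucks = get_data('@starbucks', max_requests=30)
```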
As the function runs, you’ll see the status code and the limit information printed out like below. If everything works well, you should expect to see 30 of these messages, all with status code ‘200’, which means a successful data pull.
200 {'remaining': 1795, 'limit': None, 'reset': None}
This is great!
We now have the data needed (df_starbucks) in the pandas dataframe format.
Step #3: Process the data and apply the TextBlob model
The dataset from Twitter certainly doesn’t have sentiment labels (e.g., positive/negative/neutral), and we don’t have the resources to label a large dataset to train a model, so we’ll use an existing model from TextBlob for the analysis.
Even though the dataset is in a pandas dataframe, we still need to wrangle it further before applying TextBlob. We want to define a function that:
- takes in the data from the previous step
- transforms it
- applies the existing TextBlob model to it.
To do this, we created the four functions below:
- flatten_user_info: unpacks the user object, which contains Twitter user account metadata, into four separate columns: ‘id’, ‘name’, ‘screen_name’, and ‘location’. There is more information stored in the user object; you can check out the details in the user object docs. This information is not needed when fitting the TextBlob model and is for demonstration only.
- get_full_text: unpacks the tweet text into full_text if available. There are two fields with text in tweets, ‘text’ and ‘full_text’, with the latter containing the complete message. You can read about them in the Extended Tweets docs.
- get_sentiment: applies the TextBlob sentiment model on a column of text.
- prepare_data: the final function we’ll be using, which uses the previous three functions. It prepares the data and applies the TextBlob model to produce the polarity score as a column called textblob_sentiment. You can read about its details in the code below.
When we apply TextBlob’s sentiment.polarity, it produces a polarity score between -1.0 and 1.0 for each tweet record.
Note: in this post, we only clean the data enough to fit the TextBlob model. It’s for demonstration purposes only. In reality, you may want to clean the data more by removing URLs, special characters, and emojis from the text.
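The original implementation isn’t shown here, so here is one possible sketch of the four helpers. The exact cleaning steps (for example, dropping duplicate tweet ids) and the fallback from extended_tweet to text are assumptions for illustration.

```python
def flatten_user_info(df):
    # Unpack the nested user object into four separate columns.
    df['user_id'] = df['user'].apply(lambda u: u['id'])
    df['username'] = df['user'].apply(lambda u: u['name'])
    df['user_screen_name'] = df['user'].apply(lambda u: u['screen_name'])
    df['user_location'] = df['user'].apply(lambda u: u['location'])
    return df

def get_full_text(df):
    # Prefer the complete message in extended_tweet['full_text'] when present,
    # otherwise fall back to the (possibly truncated) text field.
    def full_text(row):
        extended = row.get('extended_tweet')
        if isinstance(extended, dict) and 'full_text' in extended:
            return extended['full_text']
        return row['text']
    df['full_text'] = df.apply(full_text, axis=1)
    return df

def get_sentiment(df, text_column='full_text'):
    # Apply the TextBlob polarity score (-1.0 to 1.0) to a column of text.
    df['textblob_sentiment'] = df[text_column].apply(
        lambda text: TextBlob(text).sentiment.polarity)
    return df

def prepare_data(df):
    # Drop duplicate tweets (an assumed cleaning step), parse the timestamp,
    # then apply the three helpers above.
    df = df.drop_duplicates(subset='id').copy()
    df['created_at'] = pd.to_datetime(df['created_at'])
    df = flatten_user_info(df)
    df = get_full_text(df)
    df = get_sentiment(df)
    return df
```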
After the hard work of defining these functions, we can apply the prepare_data function on the dataframe df_starbucks. As the Python code below shows, we can also look at the summary information and the first few rows of the new dataframe.
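With the helpers defined, applying them could look like this:

```python
# Wrangle the raw tweets and score them with TextBlob
df_starbucks = prepare_data(df_starbucks)

# Summary information and the first few rows of the new dataframe
df_starbucks.info()
df_starbucks.head()
```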
Below is the summary info of the new dataframe. As you can see, we have a dataframe of shape 1821 x 42. Among the 42 columns, we have obtained the TextBlob score in textblob_sentiment.
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1821 entries, 0 to 2998
Data columns (total 42 columns):
 #   Column                      Non-Null Count  Dtype
---  ------                      --------------  -----
 0   created_at                  1821 non-null   datetime64[ns, UTC]
 1   id                          1821 non-null   int64
 2   id_str                      1821 non-null   int64
 3   text                        1821 non-null   object
 4   source                      1821 non-null   object
 5   truncated                   1821 non-null   bool
 6   in_reply_to_status_id       1140 non-null   float64
 7   in_reply_to_status_id_str   1140 non-null   float64
 8   in_reply_to_user_id         1339 non-null   float64
 9   in_reply_to_user_id_str     1339 non-null   float64
 10  in_reply_to_screen_name     1339 non-null   object
 11  user                        1821 non-null   object
 12  geo                         29 non-null     object
 13  coordinates                 29 non-null     object
 14  place                       135 non-null    object
 15  contributors                0 non-null      float64
 16  is_quote_status             1821 non-null   bool
 17  quote_count                 1821 non-null   int64
 18  reply_count                 1821 non-null   int64
 19  retweet_count               1821 non-null   int64
 20  favorite_count              1821 non-null   int64
 21  entities                    1821 non-null   object
 22  favorited                   1821 non-null   bool
 23  retweeted                   1821 non-null   bool
 24  filter_level                1821 non-null   object
 25  lang                        1821 non-null   object
 26  matching_rules              1821 non-null   object
 27  display_text_range          1365 non-null   object
 28  extended_entities           186 non-null    object
 29  possibly_sensitive          482 non-null    float64
 30  extended_tweet              534 non-null    object
 31  quoted_status_id            78 non-null     float64
 32  quoted_status_id_str        78 non-null     float64
 33  quoted_status               78 non-null     object
 34  quoted_status_permalink     78 non-null     object
 35  retweeted_status            0 non-null      object
 36  user_id                     1821 non-null   int64
 37  username                    1821 non-null   object
 38  user_screen_name            1821 non-null   object
 39  user_location               1257 non-null   object
 40  full_text                   1821 non-null   object
 41  textblob_sentiment          1821 non-null   float64
dtypes: bool(4), datetime64[ns, UTC](1), float64(9), int64(7), object(21)
memory usage: 561.9+ KB
To take a closer look at the new dataframe, the head of it is printed below.
 | created_at | id | id_str | text | source | truncated | in_reply_to_status_id | in_reply_to_status_id_str | in_reply_to_user_id | in_reply_to_user_id_str | … | quoted_status_id_str | quoted_status | quoted_status_permalink | retweeted_status | user_id | username | user_screen_name | user_location | full_text | textblob_sentiment |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2020-09-02 16:55:44+00:00 | 1301202157010477056 | 1301202157010477056 | I swear @Starbucks purposely just hiring cunts | <a href=”http://twitter.com/download/android” … | False | NaN | NaN | NaN | NaN | … | NaN | NaN | NaN | NaN | 1286814633672601600 | ×︵× | nu3BvTk35cRGS4B | None | I swear @Starbucks purposely just hiring cunts | 0.000 |
1 | 2020-09-02 16:55:36+00:00 | 1301202126803210240 | 1301202126803210240 | Gotta love the way @Starbucks spells “Vinny” 🤣… | <a href=”http://twitter.com/download/iphone” r… | False | NaN | NaN | NaN | NaN | … | NaN | NaN | NaN | NaN | 1296617554232913920 | OohOkay | ooh_okay_ | None | Gotta love the way @Starbucks spells “Vinny” 🤣… | 0.500 |
2 | 2020-09-02 16:54:24+00:00 | 1301201824603688961 | 1301201824603688960 | @FortniteLlama21 @Starbucks Aww well hey they … | <a href=”http://twitter.com/download/android” … | False | 1.301201e+18 | 1.301201e+18 | 1.054348e+18 | 1.054348e+18 | … | NaN | NaN | NaN | NaN | 3866775613 | Sr.AsdfLad🕷🕸 | Asdfman12800 | None | @FortniteLlama21 @Starbucks Aww well hey they … | 0.375 |
3 | 2020-09-02 16:53:06+00:00 | 1301201497359867904 | 1301201497359867904 | @Mwparsons @chiIIum @Starbucks Exactly 💯 | <a href=”http://twitter.com/download/android” … | False | 1.301169e+18 | 1.301169e+18 | 1.998923e+08 | 1.998923e+08 | … | NaN | NaN | NaN | NaN | 1288538307580719106 | moonflower | moonflower97035 | Portland, OR | @Mwparsons @chiIIum @Starbucks Exactly 💯 | 0.250 |
4 | 2020-09-02 16:53:00+00:00 | 1301201472953212934 | 1301201472953212928 | @Asdfman12800 @Starbucks Yeah. I was… yeah. … | <a href=”http://twitter.com/download/iphone” r… | False | 1.301201e+18 | 1.301201e+18 | 3.866776e+09 | 3.866776e+09 | … | NaN | NaN | NaN | NaN | 1054347907296567296 | 𝓜єη𝓓σƵ𝔞 | FortniteLlama21 | Married | @Asdfman12800 @Starbucks Yeah. I was… yeah. … | 0.000 |
5 rows × 42 columns
Nice!
Now we have a score for our Twitter sentiment analysis. But how do we know if it performs well? And how do we use it to classify?
Step #4: Label a sample manually
To learn more about the dataset’s sentiment, let’s save a sample of size 100 and label it manually.
Within the twitter-data.csv file, we only keep the columns full_text and textblob_sentiment, and add a column named label with three possible values:
- -1: negative sentiment
- 0: neutral sentiment
- 1: positive sentiment
Note: the label is based on our subjective judgment. It’s hard to classify the sentiment of tweets that aren’t well-written English or lack context.
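One simple way to export such a sample (the random_state below is an arbitrary choice):

```python
# Save a random sample of 100 scored tweets for manual labeling
sample = df_starbucks[['full_text', 'textblob_sentiment']].sample(100, random_state=42)
sample.to_csv('twitter-data.csv', index=False)
```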
After manually labeling the tweets in a spreadsheet, the file is renamed as twitter-data-labeled.csv and loaded into Python. We can also take a look at its first 10 rows.
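Loading the labeled file back is straightforward:

```python
# Load the manually labeled sample back into Python
df_labelled = pd.read_csv('twitter-data-labeled.csv')
df_labelled.head(10)
```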
 | full_text | textblob_sentiment | label |
---|---|---|---|
0 | @victoria0429 @MeganADutta @MachinaMeg Not a s… | 0.143750 | -1 |
1 | @themavennews @PatPenn2 @Starbucks Report to p… | 0.000000 | 0 |
2 | @Starbucks takes the cake worste drive through… | 0.000000 | -1 |
3 | @chiIIum @Starbucks https://t.co/Pdztc7l7QH | 0.000000 | 0 |
4 | @Briansweinstein @Starbucks Thanks, my friend! 🙏 | 0.250000 | 1 |
5 | @bluelivesmtr @Target @Starbucks Talk about a … | 0.002500 | -1 |
6 | My last song #Ahora on advertising for @Starbu… | 0.500000 | 1 |
7 | I propose that the @Starbucks Pumpkin Spice La… | 0.000000 | 1 |
8 | @beckiblairjones @mezicant @Starbucks @Starbuc… | 0.000000 | 0 |
9 | @QueenHollyFay20 @bluelivesmtr @Target @Starbu… | -0.233333 | -1 |
Since our sentiment label has three classes (negative, neutral, positive), we’ll encode it using the label_binarize function from scikit-learn to convert it into three indicator variables. For example, is_neg = 1 when label = -1, otherwise 0. In this way, we can look at the model classification results for negative and positive sentiment separately.
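A minimal sketch of this encoding step:

```python
from sklearn.preprocessing import label_binarize

# Convert the three-class label into three 0/1 indicator columns
indicators = label_binarize(df_labelled['label'], classes=[-1, 0, 1])
df_labelled['is_neg'] = indicators[:, 0]
df_labelled['is_neutral'] = indicators[:, 1]
df_labelled['is_pos'] = indicators[:, 2]
df_labelled.head(10)
```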
The converted dataframe df_labelled looks like below.
 | full_text | textblob_sentiment | label | is_neg | is_neutral | is_pos |
---|---|---|---|---|---|---|
0 | @victoria0429 @MeganADutta @MachinaMeg Not a s… | 0.143750 | -1 | 1 | 0 | 0 |
1 | @themavennews @PatPenn2 @Starbucks Report to p… | 0.000000 | 0 | 0 | 1 | 0 |
2 | @Starbucks takes the cake worste drive through… | 0.000000 | -1 | 1 | 0 | 0 |
3 | @chiIIum @Starbucks https://t.co/Pdztc7l7QH | 0.000000 | 0 | 0 | 1 | 0 |
4 | @Briansweinstein @Starbucks Thanks, my friend! 🙏 | 0.250000 | 1 | 0 | 0 | 1 |
5 | @bluelivesmtr @Target @Starbucks Talk about a … | 0.002500 | -1 | 1 | 0 | 0 |
6 | My last song #Ahora on advertising for @Starbu… | 0.500000 | 1 | 0 | 0 | 1 |
7 | I propose that the @Starbucks Pumpkin Spice La… | 0.000000 | 1 | 0 | 0 | 1 |
8 | @beckiblairjones @mezicant @Starbucks @Starbuc… | 0.000000 | 0 | 0 | 1 | 0 |
9 | @QueenHollyFay20 @bluelivesmtr @Target @Starbu… | -0.233333 | -1 | 1 | 0 | 0 |
How are the sentiment classifications distributed based on our labels?
Let’s look at the count of different labels.
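For example, with value_counts:

```python
# Count the 0/1 values of each indicator column
print(df_labelled['is_neg'].value_counts())
print(df_labelled['is_pos'].value_counts())
print(df_labelled['is_neutral'].value_counts())
```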
We can see that there are 37 negative, 23 positive, and 40 neutral tweets in our sample of 100 that mentioned Starbucks.
0    63
1    37
Name: is_neg, dtype: int64

0    77
1    23
Name: is_pos, dtype: int64

0    60
1    40
Name: is_neutral, dtype: int64
With this manually labeled sample, we can go back to the TextBlob polarity and evaluate its performance.
Step #5: Evaluate the sentiment analysis results
Now it’s the exciting part!
We’ll discover how well the model has classified the sentiment based on our sample.
To evaluate the performance of TextBlob, we’ll use metrics including ROC curve, AUC, and accuracy score. So let’s import these extra packages first.
Further Reading: if you are not familiar with these metrics, read 8 popular Evaluation Metrics for Machine Learning Models.
We’ll create a function plot_roc_curve to help us plot the ROC curve.
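Here is one possible version of the imports and the plotting helper (matplotlib is an assumption here; any plotting library would do):

```python
from sklearn.metrics import roc_curve, auc, accuracy_score
import matplotlib.pyplot as plt

def plot_roc_curve(fpr, tpr, roc_auc):
    # Plot the ROC curve together with the diagonal "no skill" reference line.
    plt.figure()
    plt.plot(fpr, tpr, label='ROC curve (AUC = %0.2f)' % roc_auc)
    plt.plot([0, 1], [0, 1], linestyle='--')
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('Receiver Operating Characteristic')
    plt.legend(loc='lower right')
    plt.show()
```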
As mentioned earlier, we’ll look into classifications of positive and negative sentiments separately.
Negative Tweets
First, let’s look at the ROC curve for the negative labels. We can calculate the metrics and plot the ROC curve for our 100-tweet sample dataset (df_labelled) as below.
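A sketch of this calculation (note that for the negative class a lower polarity score means a more negative tweet, so we flip the sign of the score before feeding it to roc_curve; this is an implementation choice in this sketch):

```python
# Flip the sign so that a higher score means "more negative"
fpr, tpr, neg_thresholds = roc_curve(df_labelled['is_neg'], -df_labelled['textblob_sentiment'])
roc_auc = auc(fpr, tpr)
plot_roc_curve(fpr, tpr, roc_auc)
```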
The AUC is 0.72, as shown below.

But what’s the optimal threshold we should use?
We can look at the accuracy of classification of different thresholds.
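For example, reusing the thresholds returned by roc_curve (negated back to the original polarity scale):

```python
# A tweet is predicted negative when textblob_sentiment is below the threshold
for threshold in -neg_thresholds:
    predicted_neg = (df_labelled['textblob_sentiment'] < threshold).astype(int)
    acc = accuracy_score(df_labelled['is_neg'], predicted_neg)
    print(f'threshold: {threshold}, accuracy: {acc}')
```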
We can see below that the accuracy is the highest (77%) when we use a threshold of -0.05, i.e., we consider the tweet negative when textblob_sentiment < -0.05.
threshold: -1.7, accuracy: 0.63
threshold: -0.7, accuracy: 0.63
threshold: -0.666666667, accuracy: 0.64
threshold: -0.6, accuracy: 0.65
threshold: -0.39, accuracy: 0.69
threshold: -0.35, accuracy: 0.7
threshold: -0.21666666699999998, accuracy: 0.71
threshold: -0.2, accuracy: 0.72
threshold: -0.058333333, accuracy: 0.76
threshold: -0.05, accuracy: 0.77
threshold: 0.0, accuracy: 0.76
threshold: 0.0025, accuracy: 0.55
threshold: 0.025, accuracy: 0.56
threshold: 0.14285714300000002, accuracy: 0.51
threshold: 0.14375, accuracy: 0.5
threshold: 0.2, accuracy: 0.5
threshold: 0.205194805, accuracy: 0.49
threshold: 0.25, accuracy: 0.5
threshold: 0.260416667, accuracy: 0.48
threshold: 0.28148148100000003, accuracy: 0.49
threshold: 0.28571428600000004, accuracy: 0.48
threshold: 0.41666666700000005, accuracy: 0.44
threshold: 0.5, accuracy: 0.43
threshold: 0.6, accuracy: 0.4
threshold: 0.8, accuracy: 0.38
How about the positive tweets classification?
Positive Tweets
We can use the same method as the negative tweets classification. Let’s first plot the ROC curve.
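This time the polarity score can be used as-is, since a higher score means a more positive tweet:

```python
fpr, tpr, pos_thresholds = roc_curve(df_labelled['is_pos'], df_labelled['textblob_sentiment'])
roc_auc = auc(fpr, tpr)
plot_roc_curve(fpr, tpr, roc_auc)
```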
As you can see, the AUC is higher at 0.85.

Then we can look at the accuracy of different thresholds.
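Using the same loop as before, but predicting positive when the score exceeds the threshold:

```python
for threshold in pos_thresholds:
    predicted_pos = (df_labelled['textblob_sentiment'] > threshold).astype(int)
    acc = accuracy_score(df_labelled['is_pos'], predicted_pos)
    print(f'threshold: {threshold}, accuracy: {acc}')
```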
We reach the highest accuracy (86%) at a threshold of 0.2857, i.e., we classify the tweets as positive when textblob_sentiment > 0.2857.
threshold: 1.8, accuracy: 0.77
threshold: 0.8, accuracy: 0.77
threshold: 0.625, accuracy: 0.8
threshold: 0.55, accuracy: 0.8
threshold: 0.5, accuracy: 0.79
threshold: 0.304545455, accuracy: 0.85
threshold: 0.28571428600000004, accuracy: 0.86
threshold: 0.28148148100000003, accuracy: 0.84
threshold: 0.260416667, accuracy: 0.85
threshold: 0.25, accuracy: 0.84
threshold: 0.14285714300000002, accuracy: 0.82
threshold: 0.066666667, accuracy: 0.81
threshold: 0.05, accuracy: 0.82
threshold: 0.0375, accuracy: 0.81
threshold: 0.025, accuracy: 0.82
threshold: 0.0025, accuracy: 0.82
threshold: 0.0, accuracy: 0.81
threshold: -0.5, accuracy: 0.25
threshold: -0.6, accuracy: 0.24
threshold: -0.7, accuracy: 0.24
Now we have the optimal thresholds for classification of both positive and negative sentiments based on our sample.
Let’s apply them on the entire dataset!
As shown below, we create a new column predicted_sentiment with labels ‘negative’, ‘neutral’, and ‘positive’ based on the optimal score thresholds.
We can print out some of the dataset to take a look at our new column.
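A simple way to do this is with a small classification function based on the thresholds found above (the function name and the exact cutoffs applied here are illustrative):

```python
def classify_sentiment(score):
    # Negative below -0.05, positive above 0.2857, neutral otherwise
    if score < -0.05:
        return 'negative'
    elif score > 0.2857:
        return 'positive'
    return 'neutral'

df_starbucks['predicted_sentiment'] = df_starbucks['textblob_sentiment'].apply(classify_sentiment)
df_starbucks[['full_text', 'textblob_sentiment', 'predicted_sentiment']].sample(10)
```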
 | full_text | textblob_sentiment | predicted_sentiment |
---|---|---|---|
2051 | Is nobody else suspicious of @Starbucks logo? … | 0.000000 | neutral |
1791 | @emilymchavez Same! What’s your favorite @Star… | 0.333333 | positive |
737 | @Starbucks can you bring back the flat lid ple… | -0.015625 | neutral |
2491 | @Starbucks If I say a bad word here, will I st… | -0.185000 | negative |
2879 | I like that @Starbucks finally has a fall drin… | -0.200000 | negative |
2659 | Starbucks barista teaches how to make poisonou… | 0.000000 | neutral |
591 | @TheAvayel @Starbucks and breathe….\n\nI am … | 0.000000 | neutral |
1103 | @katiecouric What’s his favorite @Starbucks dr… | 0.500000 | positive |
867 | @dmcdonald141 @Starbucks Oh yes!!!! Jealous t… | 0.000000 | neutral |
2972 | @Skitts01 @Starbucks Haha fuck wad got fired | -0.100000 | negative |
It looks good overall.
Step #6: Explore the results
Congratulations if you’ve made it this far!
In this final step, we’ll explore the results with some plots. The application of the results depends on the business problems you are trying to solve.
Sentiment by the hour of day
Maybe you want to know how the Twitter sentiment changes across the day?
We can certainly plot the number of negative, neutral, or positive tweets by the hour of day.
Let’s obtain the dataset first and print it out to take a look.
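One way to build this summary is to floor each tweet’s timestamp to the hour and count tweets per hour and predicted sentiment:

```python
# Count tweets by hour and predicted sentiment
df_by_hour = (
    df_starbucks
    .assign(created_at_hour=df_starbucks['created_at'].dt.floor('H'))
    .groupby(['created_at_hour', 'predicted_sentiment'])
    .size()
    .reset_index(name='cnt')
)
df_by_hour
```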
 | created_at_hour | predicted_sentiment | cnt |
---|---|---|---|
0 | 2020-09-01 19:00:00+00:00 | negative | 25 |
1 | 2020-09-01 19:00:00+00:00 | neutral | 67 |
2 | 2020-09-01 19:00:00+00:00 | positive | 30 |
3 | 2020-09-01 20:00:00+00:00 | negative | 33 |
4 | 2020-09-01 20:00:00+00:00 | neutral | 90 |
… | … | … | … |
64 | 2020-09-02 16:00:00+00:00 | neutral | 67 |
65 | 2020-09-02 16:00:00+00:00 | positive | 18 |
66 | 2020-09-02 17:00:00+00:00 | negative | 13 |
67 | 2020-09-02 17:00:00+00:00 | neutral | 25 |
68 | 2020-09-02 17:00:00+00:00 | positive | 8 |
69 rows × 3 columns
We’ll use Plotly Express to plot the count of tweets by hour. You may use other plotting packages of your preference.
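A minimal Plotly Express example, assuming the df_by_hour dataframe from the previous step:

```python
import plotly.express as px

# One line per predicted sentiment, tweet counts per hour
fig = px.line(df_by_hour, x='created_at_hour', y='cnt', color='predicted_sentiment')
fig.show()
```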

WordCloud
Another popular visualization is the word cloud, which shows us the keywords. Let’s see how to make it using our Starbucks dataset.
First, we can install and import the necessary packages.
Then we can follow the code below and plot it.
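Below is a basic sketch using the wordcloud package (installed with pip install wordcloud); the extra stopwords are an arbitrary choice to remove URL fragments and the brand handle itself:

```python
from wordcloud import WordCloud, STOPWORDS
import matplotlib.pyplot as plt

# Combine all tweets into one string and drop obvious noise words
stopwords = set(STOPWORDS) | {'https', 'co', 'Starbucks', 'amp'}
text = ' '.join(df_starbucks['full_text'])

wordcloud = WordCloud(stopwords=stopwords, background_color='white',
                      width=800, height=400).generate(text)

plt.figure(figsize=(12, 6))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.show()
```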
This looks nice!
We can see the recent trends (popular words) in tweets related to the Starbucks brand.

Besides looking at Starbucks only, you can also try comparing it with other popular coffee brands over time to see brand resilience.
In this tutorial, you’ve learned how to do Twitter sentiment analysis in Python.
Now it’s your turn to try it out!
We’d love to hear from you. Leave a comment for any questions you may have or anything else.
Further Reading: How to do Sentiment Analysis with Deep Learning (LSTM Keras)
A tutorial showing an example of sentiment analysis on Yelp reviews: learn how to build a deep learning model to classify the labeled reviews data in Python.