More SQL

Examine the data in the tweets table that you have imported into trump.db.

The data in this table is a list of more than 3,000 tweets by Donald Trump from 2009 to 2019.

Each tweet has been analysed to give it a sentiment score (whether it is positive or negative).

The Sentiment column indicates the overall sentiment, i.e. Positive, Negative or Neutral.

The Polarity column provides a numerical score ranging from -1 (very negative) through 0 (neutral) to 1 (very positive).

These scores allow us to perform some analysis of the data using SQL:

select 
min(Polarity),
avg(Polarity),
max(Polarity)
from tweets;

This returns the most negative, the average and the most positive Polarity scores of the tweets. As you can see, they vary from very negative to very positive, with the average leaning positive.

The average Polarity is around 0.18.
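
To see how the Sentiment labels line up with the Polarity scores, one option is to group the tweets by Sentiment. This is only a sketch: it assumes the Sentiment and Polarity columns are named as described above, and the aliases tweet_count and avg_polarity are illustrative names, not part of the table.

-- one row per Sentiment label, with its tweet count and average score
select
Sentiment,
count(*) as tweet_count,
avg(Polarity) as avg_polarity
from tweets
group by Sentiment;

Each row of the result should show one Sentiment label, how many tweets carry it, and their average Polarity.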

But what if we change the query to filter the tweets by a particular piece of text?

select 
avg(Polarity) from tweets
where Text LIKE '%Clinton%';

Using the wildcard % before and after our term ensures that it finds any reference to it in the tweet text.

You can see from the result of this query that the average Polarity of the matching tweets drops significantly, to about 0.003.

Experiment by editing this query with your own keywords, e.g.:
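
An average taken over only a handful of tweets can be misleading, so it may help to check how many tweets matched the filter. A small variation on the query, using count(*) next to avg (the alias names are illustrative):

-- how many tweets mention the term, and their average score
select
count(*) as matching_tweets,
avg(Polarity) as avg_polarity
from tweets
where Text LIKE '%Clinton%';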

select 
avg(Polarity) from tweets
where Text LIKE '%Ivanka%';

select 
avg(Polarity) from tweets
where Text LIKE '%Obama%';
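
To compare several keywords side by side rather than running each query separately, the queries can be combined with union all. This is one possible sketch; the keyword column is a label added for readability, not a column in the tweets table, and the other aliases are illustrative too.

-- one result row per keyword
select 'Ivanka' as keyword, count(*) as tweet_count, avg(Polarity) as avg_polarity
from tweets
where Text LIKE '%Ivanka%'
union all
select 'Obama', count(*), avg(Polarity)
from tweets
where Text LIKE '%Obama%';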

You can also use the % wildcard simply to list the tweets containing a specific term:

select 
* from tweets
where text LIKE '%impeach%' 
order by created;
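
The same filter can also be combined with the Polarity score, for example to list the most negative matching tweets first. A sketch, assuming ten rows is enough to browse:

-- ascending order puts the lowest (most negative) Polarity first
select
Text, Polarity
from tweets
where Text LIKE '%impeach%'
order by Polarity
limit 10;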

The results of any of these queries can be easily saved to a CSV file:

Just click on the button we used earlier to save a View, but this time select 'Export to CSV'.

Save View