A Sentiment and Tonal Analysis of the First Presidential Debate

It’s been about a week since the first debate between Donald Trump and Hillary Clinton, and data scientists have gone to TOWN on the transcript. (Check out analyses here, here, here, and here for some awesome examples.) The majority of these analyses focused on which candidate spoke the most, which words each candidate used and how often, and how long each candidate spoke.

Rather than rehash the same topics, I wanted to look at the sentiment (to what extent were a candidate’s statements positive or negative?) and the tone with which each candidate spoke (emotional, language, and social tone). To do this, I used Columbus Collaboratory’s CognizeR package, which calls on IBM Bluemix services. Sentiment lets me examine the overall positivity and negativity of the debate, and which candidate went further in each direction. The tonal analysis lets me look into how each candidate tries to get their ideas across. The data I used came from this Kaggle Kernel.

In terms of sentiment, the distribution shows that Trump’s statements tended to be more negative than Clinton’s. The distribution of his statements has a large (yuge?) peak in the negative score range, with a tiny peak in the positive range. Clinton’s distribution is more even, with roughly equal peaks for positive and negative statements.

[Figure: distribution of sentiment scores for each candidate]

Overall, Trump had 84 negative statements and 30 positive (2.8 negative per 1 positive), while Clinton had 44 negative statements and 36 positive (1.2 to 1). At a general level, this is not surprising given each campaign’s orientation to American Greatness. If America needs to be “Great Again”, then a candidate is likely to point out all the things currently wrong in the country, and vice versa for a candidate who thinks America is “Already Great”. Even taking all that into account, however, it is still a bit staggering to see Trump giving nearly twice as many negative statements as Clinton (84 versus 44).

[Figure: counts of positive and negative statements for each candidate]
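For anyone who wants to double-check the arithmetic, here is a minimal Python sketch that recomputes the ratios from the four counts quoted above. It is only an illustration: the analysis itself was done in R with CognizeR, and the variable and function names here are my own.

```python
# Statement counts as reported in the post.
counts = {
    "Trump": {"negative": 84, "positive": 30},
    "Clinton": {"negative": 44, "positive": 36},
}


def negative_per_positive(c):
    """Number of negative statements for every positive one."""
    return c["negative"] / c["positive"]


def negative_share(c):
    """Fraction of a candidate's scored statements that were negative."""
    return c["negative"] / (c["negative"] + c["positive"])


ratios = {name: round(negative_per_positive(c), 1) for name, c in counts.items()}
shares = {name: round(negative_share(c), 2) for name, c in counts.items()}

# How many negative statements Trump made for each one Clinton made.
negative_multiple = round(
    counts["Trump"]["negative"] / counts["Clinton"]["negative"], 2
)

print(ratios)             # negative-to-positive ratio per candidate
print(shares)             # share of statements that were negative
print(negative_multiple)  # Trump's negatives relative to Clinton's
```

The ratios come out to 2.8 for Trump and 1.2 for Clinton, and 84 / 44 is about 1.9, so "nearly twice as many" is the more accurate way to put the gap in negative statements.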

In terms of emotional tone (think Pixar’s Inside Out), there wasn’t as large a discrepancy as there was in sentiment, but there were still some interesting differences to point out. The largest difference between the two candidates was Clinton having a higher ‘joy’ score than Trump, even though both scored relatively low on the measure. I was expecting to see a bigger difference in the ‘disgust’ emotion, but one might have to dig into each candidate’s tweets and subsequent comments about Alicia Machado to find that.

[Figure: emotional tone scores for each candidate]

For language tones (examining a candidate’s speaking style), Clinton scores higher in ‘Analytic’ and ‘Tentative’, and Trump narrowly beats her out in ‘Confidence’. Again, this is not surprising. Depending on who you talk to, Clinton’s analytic style is one of her biggest strengths or weaknesses; she brings in a lot of facts and specific policies, but some interpret that as ‘lecturing’ or ‘speaking down’ to audiences. I’m not quite sure what to make of the difference in tentativeness. My guess is that tentativeness goes hand in hand with an analytic mindset: the more you examine something, the more you realize what you do and don’t know about it, and then you have to communicate with that in mind. I’m certainly open to other interpretations of that.

[Figure: language tone scores for each candidate]

Finally, differences in social tones (adapted from the Big Five personality traits) show that Trump had higher scores for Agreeableness, Emotional Range, and Extraversion, while Clinton had higher scores for Conscientiousness and Openness. I was surprised by Trump’s higher score on Agreeableness, as he seemed to become more flustered as the debate went on. The overall pattern that stands out to me (and it’s present in the Social Tones graph as well as the Emotional Tones graph) is that for tones with higher values, Trump scores higher than Clinton, but for tones with lower values, Clinton scores higher than Trump. In other words, if you put a line through the center of those two graphs, Clinton would always be closer to it. I think this fits into the main narratives for and against each candidate’s personality: Clinton is less variable than Trump. If you want to stay the course, Clinton is likely for you; if you want to shake things up, Trump is your guy.

[Figure: social tone scores for each candidate]
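The "line through the center" idea can be checked mechanically. Below is a rough Python sketch of one way to do it, under a few assumptions of mine: the center line is taken as the mean of all plotted scores (you could just as well pass in the midpoint of the axis), and the scores in the example are made-up placeholder numbers for two generic speakers, not values from the debate data. The function name `closer_to_center` is hypothetical too.

```python
from statistics import mean


def closer_to_center(scores, center=None):
    """For each tone, report which speaker's score lies closer to the center line.

    scores: dict mapping tone name -> (score_a, score_b)
    center: height of the center line; if omitted, the mean of all scores
            is used (an assumption, not something defined in the post).
    """
    if center is None:
        center = mean([value for pair in scores.values() for value in pair])
    result = {}
    for tone, (a, b) in scores.items():
        dist_a, dist_b = abs(a - center), abs(b - center)
        if dist_a < dist_b:
            result[tone] = "a"
        elif dist_b < dist_a:
            result[tone] = "b"
        else:
            result[tone] = "tie"
    return result


# Placeholder scores only: speaker a is higher on the high-valued tones
# and lower on the low-valued tone, mirroring the pattern described above.
toy_scores = {
    "tone_high": (0.80, 0.70),
    "tone_mid": (0.60, 0.55),
    "tone_low": (0.10, 0.20),
}

closest = closer_to_center(toy_scores)
print(closest)
```

With scores shaped like this, speaker b comes out closer to the center on every tone, which is the "less variable" pattern. Running the same function on the real tone scores would show whether it holds for every tone or only most of them.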

I hope this has been informative for you all. It’s always a lot of fun to dig into this kind of data and extract insights. As I noted previously with my Colin Kaepernick analysis, using IBM Watson within R makes this kind of text analysis a breeze. For anyone who wants to work with this data WITHOUT having to query IBM Bluemix yourself, I’ve added the dataset with each candidate’s statements, their sentiment, and each tonal score (along with my code) to my GitHub.

 
