Introduction:

This report is both technical and commercial. In it, we explain how we performed the analysis and give some basic insights from the different analyses.

This project was carried out as follows: we analyzed tweets related to Southwest Airlines, collected from the Twitter API with the rtweet package.

We divided the analysis into several topics:

Global Analysis:

The global analysis covers all the tweets we collected related to Southwest Airlines, whether they use a hashtag, mention the brand, or simply talk about the brand without mentioning it.

First, let’s check where the tweets that talk about the brand are located:

Now, let’s check the most common words related to the company:

Hashtag Analysis:

In this analysis, we focused on 3 hashtags:

We analyzed the number of tweets, measured their sentiment using the Bing and AFINN dictionaries, and built a topic model to determine the main subjects related to these hashtags.
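To make the difference between the two dictionaries concrete, here is a minimal Python sketch of lexicon-based scoring. It is an illustration only, not the code used for this report: the word lists `BING_TOY` and `AFINN_TOY` are tiny hypothetical stand-ins for the real lexicons, and the sample tweets are invented. A Bing-style dictionary labels each word as positive or negative, while an AFINN-style dictionary gives each word an integer score that is summed per tweet.

```python
import re
from collections import Counter

# Toy lexicons for illustration only (assumption); the real Bing and AFINN
# lexicons contain thousands of entries.
BING_TOY = {"great": "positive", "love": "positive", "thanks": "positive",
            "delayed": "negative", "lost": "negative", "worst": "negative"}
AFINN_TOY = {"great": 3, "love": 3, "thanks": 2,
             "delayed": -1, "lost": -3, "worst": -3}


def tokenize(text):
    """Lowercase a tweet and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bing_counts(tweets):
    """Count positive and negative words across all tweets (Bing-style labels)."""
    counts = Counter()
    for tweet in tweets:
        for word in tokenize(tweet):
            label = BING_TOY.get(word)
            if label:
                counts[label] += 1
    return counts


def afinn_score(tweet):
    """Sum the integer score of every scored word in one tweet (AFINN-style)."""
    return sum(AFINN_TOY.get(word, 0) for word in tokenize(tweet))


# Invented sample tweets, not data from the report.
sample_tweets = [
    "Love the crew, thanks for a great flight!",
    "Flight delayed again and they lost my bag. Worst trip.",
]
print(bing_counts(sample_tweets))
print([afinn_score(t) for t in sample_tweets])
```

The Bing-style count answers "how many positive versus negative words are there", while the AFINN-style sum also reflects how strongly each word is weighted.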

Here, we present some graphics for the tweets of the three hashtags merged together. If you want to look more closely at a specific hashtag, you can go to the dashboard and use the interactive filter: https://sebastienpavot.shinyapps.io/SouthWestAirlines_Dashboard/

Let’s look at the sentiment using the Bing dictionary:

We divided the hashtag tweets into four topics using a topic modeling algorithm:

We can see that the first topic is mainly related to the airline industry, while the second is more focused on questions. We can guess that it reflects people asking about the company or its flights. The third one is linked to job offers, as the company publishes tweets about open positions (and not only the company: some recruiters also publish offers for the company on their own accounts). The last one is more about travel, but it is hard to interpret. We can guess that it gathers everything the algorithm could not classify into the first three topics.

Mentions (@) Analysis:

For this analysis, we used the get_timeline function, focusing on a single handle (“@SouthWestAir”). This search returned the replies and the retweets from Southwest. We found it interesting to focus on an analysis of the replies, as follows:

We also used the “bing” dictionary to run a sentiment analysis on the replies made by Southwest. More words contribute to a positive sentiment than to a negative one. From this, it can be inferred that Southwest Airlines tends to conclude its interactions with users on Twitter positively.

Competitors Analysis:

To analyze the competitors, we ran a search using the search_tweets function from the rtweet package. We searched for tweets that include the name of a competitor's account (United, American Airlines, JetBlue and Delta) and the word "Southwest". The goal was to see how the competition interacts when referring to Southwest. The topic analysis output from this approach is as follows:

As a further analysis, and to view the interaction between the competitors and Southwest, we built a network using bigram tokenization, in which each edge originates from the name of a competitor or another word of interest:
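As an illustration of the bigram step described above, the following Python sketch splits each tweet into consecutive word pairs and keeps only the pairs whose first word is a term of interest. It is a simplified stand-in, not the code behind the network in this report: the function names, the `seeds` set, and the sample tweets are all assumptions made for the example. Each kept pair can be read as a weighted edge from the seed word to the word that follows it.

```python
import re
from collections import Counter


def bigrams(text):
    """Return consecutive word pairs from one tweet, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return list(zip(words, words[1:]))


def bigram_edges(tweets, seeds, min_count=1):
    """Count bigrams whose first word is a seed term; each becomes a graph edge."""
    counts = Counter()
    for tweet in tweets:
        for first, second in bigrams(tweet):
            if first in seeds:
                counts[(first, second)] += 1
    # Most frequent edges first; drop edges seen fewer than min_count times.
    return [(a, b, n) for (a, b), n in counts.most_common() if n >= min_count]


# Invented sample tweets and seed words, not data from the report.
sample_tweets = [
    "Delta cancelled my flight so I switched to Southwest",
    "Southwest boarding is faster than Delta boarding",
    "Delta cancelled again",
]
seeds = {"delta", "southwest"}

for origin, target, weight in bigram_edges(sample_tweets, seeds):
    print(origin, "->", target, weight)
```

Raising `min_count` keeps only the word pairs that recur, which is one way to keep such a network readable.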

Memberships:

We intended to analyze the list memberships of the followers (the sites of interest they have). However, we encountered an issue: most of them were written in languages other than English. The languages encountered were the following:

Unmentioned Analysis:

We ran a sentiment analysis using the “bing” dictionary to see how people feel about Southwest Airlines.

This part mainly focuses on when people tweet the most, and on how posting time and share differ between the tools used to post. The first plot shows how people tweet over the course of a day. People are most likely to post in the morning, and iPhone users post the most.
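The kind of aggregation behind such a plot can be sketched in a few lines of Python. This is a hedged illustration rather than the code used here: the records in `sample_tweets` are hypothetical, and the two helper functions are names introduced for this example. The sketch counts tweets per hour of day and posting tool, then computes each tool's share of all tweets.

```python
from collections import Counter
from datetime import datetime

# Hypothetical records (assumption): (timestamp in ISO format, posting tool).
sample_tweets = [
    ("2020-01-06T08:15:00", "Twitter for iPhone"),
    ("2020-01-06T08:40:00", "Twitter for iPhone"),
    ("2020-01-06T09:05:00", "Twitter for Android"),
    ("2020-01-06T18:30:00", "Twitter Web App"),
    ("2020-01-07T08:55:00", "Twitter for iPhone"),
]


def posts_by_hour_and_source(tweets):
    """Count tweets per (hour of day, source) pair."""
    counts = Counter()
    for created_at, source in tweets:
        hour = datetime.fromisoformat(created_at).hour
        counts[(hour, source)] += 1
    return counts


def source_share(tweets):
    """Return each source's share of all tweets, as a fraction between 0 and 1."""
    totals = Counter(source for _, source in tweets)
    n = sum(totals.values())
    return {source: count / n for source, count in totals.items()}


hourly = posts_by_hour_and_source(sample_tweets)
print(hourly[(8, "Twitter for iPhone")])
print(source_share(sample_tweets))
```

Replacing `.hour` with `.day` or `.weekday()` gives the same breakdown by day of the month or day of the week.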

This plot shows how people tweet during the month. We can see that people tweet a lot at the beginning of the month, and that the number of posts reaches its peak at the end of the month.

This image shows when during the week people are more likely to tweet. It shows that people post the most in the middle of the week.

This shows the posting data from the beginning of 2020; the proportion of tweets decreased as time went by.

This shows whether people “quote” when they tweet. Whatever tool people use, they always quote. In addition, from all the plots above, we can say that most people tweet from an iPhone, and that this number is much larger than the number of people who use Twitter for Android or the web app.