A dataset containing 30k+ so-called "self-help" tweets from 100+ authors.

Hsankesara/The-Tweets-of-Wisdom

The-Tweets-of-Wisdom

Context

Over the last few years, Twitter has become one of the most popular social media platforms. From celebrity updates to government policy, it accommodates a diverse range of people and ideas. Among them are many accounts that frequently tweet "self-help" thoughts, usually about improving one's life or excelling at one's work. I went down the rabbit hole searching for tweets of this kind and found many common themes among them. I therefore decided to scrape the tweets so that you can explore the language of these so-called "self-help" tweets and understand them better.

Content

I scraped the data using Tweepy. I collected all the tweets, retweets, and retweets with a comment from 40 authors. The data contains more than 40 authors because every retweet by any of the 40 authors is stored as a tweet from its original author. Every retweet with a comment contains <Q> and </Q> tags: the author's comment comes first, followed by a <Q> tag, then the content of the retweeted tweet, and finally a closing </Q> tag. I've also uploaded the same dataset to Kaggle so that you can explore it easily. Here is the link to the dataset there. Happy exploring.
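The <Q> and </Q> convention for retweets with a comment can be unpacked with a few lines of Python. This is a minimal sketch, not code from the repository: the function name `split_quote_tweet` and the sample strings are hypothetical, and it assumes each tweet's text holds at most one <Q>...</Q> pair, with the author's comment placed before it.

```python
import re

# Assumed layout of a retweet with a comment: "<comment> <Q><retweeted text></Q>".
# DOTALL lets the pattern span tweets that contain line breaks.
QUOTE_PATTERN = re.compile(r"^(?P<comment>.*?)<Q>(?P<quoted>.*?)</Q>", re.DOTALL)


def split_quote_tweet(text):
    """Split a tweet's text into (comment, quoted_text).

    For tweets without <Q> tags, the whole text is returned as the
    comment and quoted_text is None.
    """
    match = QUOTE_PATTERN.search(text)
    if match is None:
        return text.strip(), None
    return match.group("comment").strip(), match.group("quoted").strip()


# Hypothetical sample strings, not rows taken from the dataset.
print(split_quote_tweet("So true. <Q>Discipline equals freedom.</Q>"))
print(split_quote_tweet("A plain tweet with no quoted content."))
```

Separating the two parts this way makes it possible to analyze an author's own words apart from the content they chose to amplify.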

Acknowledgements

I would like to thank Stack Overflow, which helped me at literally every stage of this project, from scraping to data analysis. Kudos also to Tweepy, which made it far easier to fetch the tweets.

Inspiration

I collected this dataset for many reasons. The most important one is that I want to know how similar these tweets are. I would also like to know what makes some tweets go viral and which factors affect a viral tweet. I explore these and many more questions in my notebook, which you can find in the repository or here on Kaggle.

Contact Me
