Driven by our interest in sentiment analysis in NLP, we carried out our first sentiment analysis project as the first assignment of our studies.
This project explores the emotional characteristics of texts written by two celebrated British authors, Jane Austen and Charles Dickens, using text mining and sentiment analysis. We obtained paragraph-level datasets from Kaggle, filtered them, and reduced their size, then analyzed:
- Polarity (positive/negative sentiment)
- Subjectivity
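To make the two measures concrete, here is a minimal sketch of lexicon-based scoring, assuming polarity ranges over [-1, 1] and subjectivity over [0, 1]. The lexicon `TOY_LEXICON`, its values, and the function `score_paragraph` are hypothetical and exist only for illustration; they are not the tool or word list used in this project.

```python
import re

# Toy lexicon, invented for illustration only: word -> (polarity, subjectivity).
# Polarity is in [-1, 1]; subjectivity is in [0, 1].
TOY_LEXICON = {
    "happy": (0.8, 1.0),
    "agreeable": (0.5, 0.8),
    "miserable": (-1.0, 1.0),
    "poor": (-0.4, 0.6),
}


def score_paragraph(text, lexicon=TOY_LEXICON):
    """Return (polarity, subjectivity) averaged over the lexicon words found."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [lexicon[w] for w in words if w in lexicon]
    if not hits:
        # No opinion words found: treat the paragraph as neutral and objective.
        return 0.0, 0.0
    polarity = sum(p for p, _ in hits) / len(hits)
    subjectivity = sum(s for _, s in hits) / len(hits)
    return polarity, subjectivity
```

Calling `score_paragraph` on each paragraph would give one (polarity, subjectivity) pair per paragraph, which is the shape of data that the charts listed next would need. A real lexicon would be far larger and would also handle negation and intensifiers.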
The results were visualized using:
- Radar charts
- Box plots
- Word clouds
The main goal of this study is to compare the emotional tendencies in the two authors' writing styles and to test common literary views:
- Jane Austen: Often characterized as humorous and generally positive, with relatively balanced emotional changes.
- Charles Dickens: Known for greater emotional fluctuation, focusing on themes such as social injustice, poverty, and the complexity of human nature.
The findings of this project provide a valuable foundation for:
- Developing automated writing style simulations.
- Performing literary text analysis.
Additionally, the results help us:
- Better understand the authors’ backgrounds and the historical context they wrote in.
- Reflect on how readers today might respond to their works.
This project aims to build upon earlier research by further examining the stylistic and emotional characteristics found in the works of Jane Austen and Charles Dickens.
In that earlier analysis, we found:
- Austen’s writing tends to be more positively inclined overall.
- Dickens’s works display greater emotional fluctuation and a higher proportion of neutral sentiment.
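As a rough sketch of how such findings could be quantified, per-paragraph polarity scores can be reduced to a mean (overall inclination), a standard deviation (fluctuation), and a share of near-zero scores (neutral sentiment). The function name `summarize_polarity` and the `neutral_band` threshold of 0.05 are assumptions chosen for illustration, not values taken from this project.

```python
from statistics import mean, pstdev


def summarize_polarity(scores, neutral_band=0.05):
    """Summarize a list of per-paragraph polarity scores in [-1, 1]."""
    if not scores:
        raise ValueError("scores must not be empty")
    # Count paragraphs whose polarity lies within the neutral band around 0.
    neutral = sum(1 for s in scores if abs(s) <= neutral_band)
    return {
        "mean": mean(scores),        # overall positive or negative inclination
        "spread": pstdev(scores),    # population standard deviation as fluctuation
        "neutral_share": neutral / len(scores),
    }
```

Running this once per author on that author's scores would yield three numbers that can be compared side by side. Standard deviation is only one possible measure of fluctuation; the interquartile range shown in a box plot is another.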
We use the same pre-processed dataset to conduct a more in-depth analysis:
- Syntactic structure analysis
- Part-of-speech distribution (adjectives, adverbs, and verbs)
- Sentiment analysis enriched by:
  - Compound sentiment indices
  - Subjectivity indices
- TF-IDF (Term Frequency-Inverse Document Frequency)
- Named Entity Recognition (NER)
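Of the methods listed above, TF-IDF is the easiest to show in a few lines. The sketch below assumes the plain variant tf = count / document length and idf = ln(N / df); the function name `tf_idf` and this choice of variant are illustrative assumptions. Libraries such as scikit-learn apply smoothing by default, so their values will differ.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Compute TF-IDF weights for a list of tokenized documents.

    Returns one dict per document, mapping each term to its weight.
    """
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights
```

A term that occurs in every document gets a weight of zero under this variant, so the highest-weighted terms are those that distinguish one text from the rest of the corpus.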
By integrating both traditional and advanced NLP methods, we aim to:
- Present a more comprehensive picture of each author’s stylistic features and emotional tendencies.
- Investigate how different vocabulary, syntax, and characters influence the overall emotional style of the texts.
Through systematic corpus analysis, this project contributes to:
- A better understanding of the emotional dynamics in classic literary works.
- Insights for:
  - Developing automated writing style simulations.
  - Enhancing text sentiment analysis.
  - Supporting literary research.