GitHub - Emdya/Mental-Health-Sentiment-Analysis: This project applies Natural Language Processing (NLP) and machine learning techniques to classify emotional states in mental health-related text. Using a real-world dataset from Kaggle, the model learns to identify emotions such as Anxiety, Relief, Depression, and Optimism based on raw textual input.

📊 Emotion Classification Using LinearSVC

(NLP Pipeline with NLTK + Scikit-learn)

This project applies Natural Language Processing and Machine Learning to classify text statements into emotional categories like Anxiety, Depression, Relief, and more. It uses a combination of NLTK for preprocessing and scikit-learn for modeling and evaluation.

📁 Dataset

This project uses the dataset:
🔗 Sentiment Analysis for Mental Health – Kaggle

💬 About the Dataset:

Contains thousands of labeled statements collected from mental health-related sources.
Each entry has a statement (text) and a status (emotion category).
The dataset focuses on mental health sentiment and reflects a wide range of human emotions like:
- Anxiety
- Depression
- Loneliness
- Optimism
- Gratitude
- Relief

This makes the dataset ideal for building models that help understand emotional expression in real-world mental health contexts.
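Before modeling, it can help to check which labels the status column actually contains and how balanced they are. The following is a minimal sketch, assuming a DataFrame with the statement and status columns described above. The helper name `label_summary` and the three toy rows are illustrative and are not taken from the Kaggle file.

```python
import pandas as pd

def label_summary(df: pd.DataFrame, label_col: str = "status") -> pd.DataFrame:
    """Return per-label counts and shares, most frequent label first."""
    counts = df[label_col].value_counts()
    return pd.DataFrame({"count": counts, "share": counts / counts.sum()})

# Toy rows for illustration only; in practice use pd.read_csv('Combined Data.csv').
toy = pd.DataFrame({
    "statement": ["I can't stop worrying", "I feel so empty", "My heart keeps racing"],
    "status": ["Anxiety", "Depression", "Anxiety"],
})
summary = label_summary(toy)
print(summary)
```

Running the same helper on the real file shows the exact label set and any class imbalance, both of which affect how the accuracy figures should be read.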

🧠 What This Code Does

- Prepares and cleans text using NLTK tokenization and stopword removal.
- Converts text into numerical features using CountVectorizer.
- Trains a Linear Support Vector Classifier (LinearSVC).
- Evaluates the model using accuracy, precision, recall, and F1-score.
- Uses RandomizedSearchCV to optimize the C hyperparameter.
- Outputs the best model and its performance metrics.
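The steps above can be sketched end to end with a scikit-learn pipeline. This is a compact illustration under stated assumptions, not the notebook's code: the four statements are invented, `stop_words="english"` (scikit-learn's built-in list) stands in for the NLTK stopword filtering, and C=0.1 is borrowed from the tuned value reported in this README.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy statements and labels, invented for illustration only.
texts = [
    "I worry about everything all the time",
    "my chest is tight and I keep panicking",
    "I feel hopeless and empty inside",
    "nothing matters and I am always sad",
]
labels = ["Anxiety", "Anxiety", "Depression", "Depression"]

# Bag-of-words counts feed a linear SVM. The built-in English stopword list
# is a stand-in for the NLTK-based cleaning step.
model = make_pipeline(CountVectorizer(stop_words="english"), LinearSVC(C=0.1))
model.fit(texts, labels)

prediction = model.predict(["I keep panicking and worry all the time"])
print(prediction)
```

Wrapping the vectorizer and classifier in one pipeline keeps the vocabulary tied to whatever data `fit` receives, which matters once cross-validation is involved.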

📈 What This Code Tells Us About the Data

There are learnable patterns in the language used to describe emotional states.
Even a simple model like LinearSVC reaches about 75% accuracy (74.5% on the test set, with a best cross-validation score of 75.2%), which suggests that:
- Words and phrases correlate with specific emotion labels.
- Emotional text can be quantified and predicted with reasonable performance.
The model handles multiple emotional labels reasonably well; hyperparameter tuning selected C = 0.1 as the best value.

In other words:

The language people use when expressing mental health concerns contains enough signal for a machine learning model to recognize and classify emotions with meaningful accuracy.

⚙️ Pipeline Breakdown

1. NLTK Setup

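import nltk
nltk.download('punkt')      # tokenizer models (newer NLTK releases may ask for 'punkt_tab' instead)
nltk.download('stopwords')  # stopword lists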
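The README does not show the cleaning function itself, so the following is only a sketch of what tokenizing and removing stopwords might look like. The name `clean_text`, the regex tokenizer, and the tiny stopword set are assumptions chosen so that the example runs without any downloads.

```python
import re
from typing import Callable, Iterable, List

def clean_text(
    text: str,
    stop_words: Iterable[str],
    tokenize: Callable[[str], List[str]] = lambda s: re.findall(r"[a-z']+", s),
) -> str:
    """Lowercase, tokenize, and drop stopwords; return a space-joined string."""
    stops = set(stop_words)
    tokens = tokenize(text.lower())
    return " ".join(tok for tok in tokens if tok not in stops)

# With NLTK installed and the resources above downloaded, one could pass
#   stop_words=nltk.corpus.stopwords.words('english'), tokenize=nltk.word_tokenize
# The small set below is a stand-in so the sketch has no external dependencies.
demo_stops = {"i", "am", "the", "a", "so", "about"}
cleaned = clean_text("I am so worried about the exam", demo_stops)
print(cleaned)  # worried exam
```

Returning a string rather than a token list keeps the output directly usable as input to CountVectorizer.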

2. Data Preparation

import pandas as pd
data = pd.read_csv('Combined Data.csv')
X = data['statement'].fillna("").astype(str)  # treat missing statements as empty strings
y = data['status']  # emotion label for each statement
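The later steps refer to X_train and y_test, so a train/test split has to happen after loading. The README does not show it; the sketch below uses scikit-learn's `train_test_split`, and the values for `test_size`, `random_state`, and `stratify` are assumptions rather than the notebook's settings. The ten-row frame is toy data.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy frame standing in for Combined Data.csv (rows invented for illustration).
data = pd.DataFrame({
    "statement": ["a", None, "c", "d", "e", "f", "g", "h", "i", "j"],
    "status": ["Anxiety", "Depression"] * 5,
})
X = data["statement"].fillna("").astype(str)
y = data["status"]

# test_size, random_state, and stratify are assumed values, not the notebook's.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(len(X_train), len(X_test))  # 8 2
```

Stratifying on the label keeps the class proportions similar in both splits, which is useful when some emotion categories are much rarer than others.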

3. Vectorization

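from sklearn.feature_extraction.text import CountVectorizer
vectorizer = CountVectorizer()
X_train_features = vectorizer.fit_transform(X_train)  # learn the vocabulary from the training split only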

4. Training LinearSVC

from sklearn.svm import LinearSVC
clf = LinearSVC()
clf.fit(X_train_features, y_train)

5. Evaluation

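from sklearn.metrics import accuracy_score, precision_score, confusion_matrix
y_pred = clf.predict(vectorizer.transform(X_test))  # predictions on the held-out split
accuracy_score(y_test, y_pred)
precision_score(...)
confusion_matrix(...)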
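The arguments to the metric calls are elided above. For multi-class labels, scikit-learn's `precision_score`, `recall_score`, and `f1_score` need an `average` setting other than the default `'binary'`. The sketch below uses `average="weighted"`, which is an assumption, on invented labels.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Toy labels for illustration only.
y_test = ["Anxiety", "Anxiety", "Depression", "Depression", "Relief", "Relief"]
y_pred = ["Anxiety", "Depression", "Depression", "Depression", "Relief", "Anxiety"]
labels = ["Anxiety", "Depression", "Relief"]

acc = accuracy_score(y_test, y_pred)
# average='weighted' is an assumption; the README does not show the arguments.
prec = precision_score(y_test, y_pred, average="weighted", zero_division=0)
rec = recall_score(y_test, y_pred, average="weighted", zero_division=0)
f1 = f1_score(y_test, y_pred, average="weighted", zero_division=0)

# Rows are true labels, columns are predicted labels, both in `labels` order.
cm = confusion_matrix(y_test, y_pred, labels=labels)
print(acc, prec, rec, f1)
print(cm)
```

Passing `labels` explicitly to `confusion_matrix` fixes the row and column order, which makes the matrix easier to read when there are many emotion categories.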

6. Hyperparameter Tuning

from sklearn.model_selection import RandomizedSearchCV
RandomizedSearchCV(..., param_distributions={'C': [...]})
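A runnable version of the search might look like the sketch below. The candidate values for C, `n_iter`, `cv`, and `random_state` are all assumptions, and the data is invented. Unlike the call above, this sketch wraps the vectorizer and classifier in a Pipeline, so the parameter key becomes `svc__C` instead of `C`.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Toy statements invented for illustration only.
texts = [
    "I worry all the time", "panic attacks again",
    "so nervous and worried", "worry and panic at night",
    "I feel hopeless", "empty and sad",
    "hopeless and sad today", "nothing but sadness",
]
labels = ["Anxiety"] * 4 + ["Depression"] * 4

pipe = Pipeline([("vec", CountVectorizer()), ("svc", LinearSVC())])
search = RandomizedSearchCV(
    pipe,
    param_distributions={"svc__C": [0.01, 0.1, 1, 10]},  # assumed candidates
    n_iter=4,        # sample all four candidates
    cv=2,            # two stratified folds, enough for this toy data
    scoring="accuracy",
    random_state=0,
)
search.fit(texts, labels)
print(search.best_params_, search.best_score_)
```

Searching over the whole pipeline refits the vocabulary inside each fold, so the cross-validation score is not inflated by words seen only in the validation fold.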

✅ Model Results

Metric	Score
Accuracy	74.5%
Precision	74.1%
Recall	74.5%
F1 Score	74.2%
Best C Value	0.1
Best CV Score	75.2%
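The Accuracy and Recall rows above are identical while Precision differs. That pattern is consistent with support-weighted averaging, because weighted recall always equals accuracy. This is an inference, not something the README states. The plain-Python sketch below works through the arithmetic on invented labels; the function name is hypothetical.

```python
from collections import Counter

def accuracy_and_weighted_recall(y_true, y_pred):
    """Compute accuracy and support-weighted recall from two label lists."""
    n = len(y_true)
    support = Counter(y_true)  # number of true examples per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class
    accuracy = sum(hits.values()) / n
    # Each class recall (hits / support) is weighted by its share (support / n).
    weighted_recall = sum(
        (support[c] / n) * (hits[c] / support[c]) for c in support
    )
    return accuracy, weighted_recall

# Toy labels for illustration only.
y_true = ["Anxiety", "Anxiety", "Anxiety", "Depression", "Relief"]
y_pred = ["Anxiety", "Depression", "Anxiety", "Depression", "Anxiety"]
acc, rec = accuracy_and_weighted_recall(y_true, y_pred)
print(acc, rec)
```

The support terms cancel inside the sum, leaving the total number of correct predictions divided by the number of examples, which is the definition of accuracy.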

🛠️ Getting Started

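To run the code:

1. Download the dataset from Kaggle.
2. Install the required libraries:

pip install nltk scikit-learn pandas

3. Place the CSV file (Combined Data.csv) where the notebook can read it, uploading it first if you use a hosted notebook, then run Sentiment_Analysis_for_Mental_Health.ipynb.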

