This repository contains a sentiment analysis model implemented in Python using Natural Language Processing (NLP) techniques. The model is built in a Jupyter notebook and demonstrates various steps such as data preprocessing, vectorization, and classification of text data into positive or negative sentiment.
The goal of the project is to classify text as either positive or negative in sentiment. The Jupyter notebook in this repository contains the step-by-step implementation, from raw text to a trained and evaluated classifier.
To run this project locally, follow these steps:
- Clone the repository:
  git clone https://github.com/Umayanga12/Sentiment-Analysis-NLP.git
- Navigate to the project directory:
  cd Sentiment-Analysis-NLP
- Install the necessary packages:
  pip install pandas numpy scikit-learn nltk
- Download NLTK stopwords (if not already installed) from a Python session:
  import nltk
  nltk.download('stopwords')
To run the sentiment analysis model, open the Jupyter notebook Sentiment_Analysis_using_NLP.ipynb in your local environment:
- Start Jupyter Notebook:
  jupyter notebook
- Open Sentiment_Analysis_using_NLP.ipynb and run the cells sequentially.
The notebook will guide you through the process of text preprocessing, vectorization using TF-IDF, and building the sentiment analysis model.
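As a rough sketch of the text preprocessing stage, assuming the usual sequence of lowercasing, punctuation removal, stopword filtering, and Porter stemming, the cleaning function might look like the following. The `preprocess` helper and the sample sentence are hypothetical and not taken from the notebook, and the fallback to scikit-learn's stopword list is an assumption added so the snippet still runs before the NLTK corpus is downloaded.

```python
import string

from nltk.stem import PorterStemmer

try:
    from nltk.corpus import stopwords

    # Raises LookupError if nltk.download('stopwords') has not been run yet.
    STOPWORDS = set(stopwords.words("english"))
except LookupError:
    # Assumption: fall back to scikit-learn's built-in English stopword list.
    from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS

    STOPWORDS = set(ENGLISH_STOP_WORDS)

stemmer = PorterStemmer()
# Translation table that deletes every ASCII punctuation character.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(text: str) -> str:
    """Lowercase, strip punctuation, drop stopwords, and stem each token."""
    cleaned = text.lower().translate(PUNCT_TABLE)
    tokens = [stemmer.stem(tok) for tok in cleaned.split() if tok not in STOPWORDS]
    return " ".join(tokens)


sample = "The movie was AMAZING, I loved it!"
print(preprocess(sample))
```

The function returns a single space-joined string so that its output can be passed directly to a vectorizer that expects raw text.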
Sentiment_Analysis_using_NLP.ipynb: The main Jupyter notebook containing the implementation of the sentiment analysis model.
- Data Preprocessing:
  - Convert text to lowercase.
  - Remove punctuation.
  - Remove stopwords.
  - Apply stemming using PorterStemmer.
- Feature Extraction:
  - Convert the preprocessed text into numerical features using TfidfVectorizer.
- Model Training:
  - Train a machine learning model (e.g., Logistic Regression, SVM) using the TF-IDF features to classify sentiment.
- Evaluation:
  - Assess the performance of the model using metrics such as accuracy, precision, recall, and F1-score.
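The feature extraction, training, and evaluation steps above can be sketched with scikit-learn as follows. This is a minimal illustration rather than the notebook's actual code: the toy `texts` and `labels` are made up, Logistic Regression is picked as one of the two example models mentioned, and the split ratio and `random_state` are arbitrary assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical toy dataset (1 = positive, 0 = negative); the notebook's
# real data is not reproduced here.
texts = [
    "great film with a wonderful cast",
    "i loved every minute of it",
    "an excellent and moving story",
    "fantastic acting and great music",
    "a wonderful, heartwarming experience",
    "really enjoyable and fun to watch",
    "terrible plot and awful acting",
    "i hated this boring movie",
    "a dull and painful waste of time",
    "awful script, terrible pacing",
    "boring, predictable and badly made",
    "the worst film i have seen",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Hold out a stratified test set so both classes appear in each split.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=42, stratify=labels
)

# TF-IDF features feed straight into the classifier; the pipeline fits the
# vectorizer on the training texts only, which avoids leaking test data.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("accuracy:", accuracy)
# Per-class precision, recall, and F1-score.
print(classification_report(y_test, y_pred, zero_division=0))
```

Swapping `LogisticRegression` for `sklearn.svm.LinearSVC` would give the SVM variant without changing the rest of the pipeline. Scores on a dataset this small are not meaningful; the point is only the shape of the workflow.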
The results of the model are displayed in the notebook, including visualizations and performance metrics. The model classifies text into positive or negative sentiment; the exact accuracy, precision, recall, and F1-score values are reported in the notebook output.
This project is licensed under the MIT License. See the LICENSE file for more details.