Spam Classification

The primary objective of this project is to develop a robust text classification model using natural language processing (NLP) techniques. The model is trained to distinguish between spam and non-spam messages based on the content of email texts. The project is available at https://github.com/Yossranour1996/Text-Classification/tree/main.

Text Processing:

The project utilizes various text processing techniques, including Count Vectorization and Term Frequency-Inverse Document Frequency (TF-IDF), to convert raw email text into numerical features suitable for machine learning models.

Machine Learning Model:

Two classifiers are trained and compared: a Linear Support Vector Classifier (SVC) and a Naïve Bayes (NB) classifier. Each model is trained in a pipeline that combines TF-IDF vectorization with the classifier.
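A pipeline of this kind might look like the sketch below. Several details are assumptions rather than facts from the project: `MultinomialNB` is assumed as the Naïve Bayes variant, the training texts and the `spam`/`ham` label names are hypothetical, and all hyperparameters are left at scikit-learn defaults.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy training data, not the project's dataset.
texts = [
    "win a free prize now",
    "claim your free cash reward today",
    "urgent: you won a free vacation",
    "meeting moved to friday",
    "please review the attached report",
    "lunch at noon tomorrow?",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# Each pipeline vectorizes the raw text with TF-IDF, then fits a classifier.
svc_pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LinearSVC()),
])
nb_pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", MultinomialNB()),  # assumed NB variant
])

svc_pipeline.fit(texts, labels)
nb_pipeline.fit(texts, labels)

# Raw text goes straight into predict(); the pipeline handles vectorization.
new_message = ["claim your free cash prize"]
svc_prediction = svc_pipeline.predict(new_message)
nb_prediction = nb_pipeline.predict(new_message)

print("LinearSVC:", svc_prediction[0])
print("MultinomialNB:", nb_prediction[0])
```

Bundling the vectorizer with the classifier means the TF-IDF vocabulary is learned only from the data passed to `fit`, which avoids leaking test-set terms into training when the data is split.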

Evaluation Metrics:

Each model's performance is evaluated using standard classification metrics, including the confusion matrix, the classification report, and overall accuracy. The results showed that the SVC model outperformed the NB model.
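The three metrics listed above are all available in `sklearn.metrics`. The sketch below shows how they could be computed; the true and predicted labels are hypothetical values chosen for illustration and are not the project's results.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Hypothetical labels for illustration only; not results from the project.
y_true = ["spam", "ham", "ham", "spam", "ham", "spam"]
y_pred = ["spam", "ham", "spam", "spam", "ham", "ham"]

# Rows are true classes, columns are predicted classes, in the order given by `labels`.
cm = confusion_matrix(y_true, y_pred, labels=["ham", "spam"])

# Per-class precision, recall, and F1 as a formatted text table.
report = classification_report(y_true, y_pred, labels=["ham", "spam"])

# Fraction of messages classified correctly.
accuracy = accuracy_score(y_true, y_pred)

print(cm)
print(report)
print(f"accuracy: {accuracy:.3f}")
```

Running the same three calls on each model's held-out predictions gives a like-for-like comparison. For spam filtering, the per-class rows of the report are worth reading alongside accuracy, since spam datasets are often imbalanced.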

Skills:

#ScikitLearn #NLP #Pipeline #SVC #NB