Spam Classification
The primary objective of this project is to develop a robust text classification model using natural language processing (NLP) techniques. The model is trained to distinguish between spam and non-spam messages based on the content of email texts. The project is available at https://github.com/Yossranour1996/Text-Classification/tree/main.
Text Processing:
The project utilizes various text processing techniques, including Count Vectorization and Term Frequency-Inverse Document Frequency (TF-IDF), to convert raw email text into numerical features suitable for machine learning models.
Machine Learning Model:
Two classifiers are trained and compared: a linear support vector classifier (SVC) and a Naïve Bayes (NB) classifier. Each model is trained in a pipeline that combines TF-IDF vectorization with the classifier.
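A minimal sketch of how such pipelines might look in scikit-learn is shown here. It assumes `LinearSVC` for the linear support vector classifier and `MultinomialNB` for Naïve Bayes; the project may use different estimator classes or parameters. The training texts and labels are hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data, for illustration only.
texts = [
    "win a free prize now",
    "claim your free cash reward",
    "urgent: free offer, click now",
    "meeting at noon tomorrow",
    "please review the attached report",
    "lunch on friday with the team",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# Each pipeline vectorizes the raw text with TF-IDF, then fits a classifier.
# The estimator choices are assumptions, not confirmed by the project.
svc_pipeline = Pipeline([("tfidf", TfidfVectorizer()), ("clf", LinearSVC())])
nb_pipeline = Pipeline([("tfidf", TfidfVectorizer()), ("clf", MultinomialNB())])

svc_pipeline.fit(texts, labels)
nb_pipeline.fit(texts, labels)

new_message = ["claim your free prize"]
svc_pred = svc_pipeline.predict(new_message)
nb_pred = nb_pipeline.predict(new_message)
print(svc_pred, nb_pred)
```

Wrapping the vectorizer and classifier in one pipeline means the TF-IDF vocabulary is learned only from the training data, and the same transformation is applied automatically at prediction time.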
Evaluation Metrics:
Each model's performance is evaluated using standard classification metrics: the confusion matrix, the classification report, and overall accuracy. The results showed that the SVC model outperformed the NB model.
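The three metrics named above can be computed with scikit-learn roughly as follows. The true and predicted labels here are invented purely to show the function calls; they are not the project's results.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Hypothetical labels for illustration; not the project's actual predictions.
y_true = ["spam", "ham", "ham", "spam", "ham", "spam"]
y_pred = ["spam", "ham", "spam", "spam", "ham", "ham"]

# Rows are true classes and columns are predicted classes, in the order given.
cm = confusion_matrix(y_true, y_pred, labels=["ham", "spam"])

# Per-class precision, recall, and F1, returned as a formatted string.
report = classification_report(y_true, y_pred)

# Fraction of messages whose predicted label matches the true label.
acc = accuracy_score(y_true, y_pred)

print(cm)
print(report)
print(acc)
```

Running the same three calls on each pipeline's test-set predictions gives a like-for-like basis for comparing the two models.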
Skills:
#ScikitLearn #NLP #Pipeline #SVC #NB