This paper presents a comparative analysis of machine learning algorithms (including Naive Bayes, SVM, Logistic Regression, and Random Forest) combined with text vectorization techniques such as TF-IDF and Count Vectorizer. The study evaluates how well each combination detects spam messages on benchmark datasets and identifies the best-performing model-vectorizer pairings.
Machine Learning, Spam Detection, NLP, TF-IDF, SVM, Naive Bayes
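To make the difference between the two vectorizers named in the abstract concrete, here is a minimal standard-library sketch. It is an illustration only, not the paper's pipeline: the three messages are made-up toy data, the helper names `count_vectors` and `tfidf_vectors` are hypothetical, and the IDF used is the textbook `ln(N / df)` variant, which differs from the smoothed form some libraries apply by default.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for a toy example.
    return text.lower().split()


def count_vectors(docs):
    """Return (vocabulary, raw term-count vectors), one vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[tok] for tok in vocab])
    return vocab, vectors


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) using tf * ln(N / df)."""
    vocab, counts = count_vectors(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    return vocab, [[c * w for c, w in zip(row, idf)] for row in counts]


# Hypothetical toy messages, not taken from any benchmark dataset.
docs = ["win cash now", "win a free prize now", "meeting at noon"]
vocab, counts = count_vectors(docs)
_, tfidf = tfidf_vectors(docs)

cash, win = vocab.index("cash"), vocab.index("win")
print(counts[0][cash], counts[0][win])  # equal raw counts
print(tfidf[0][cash], tfidf[0][win])    # TF-IDF weights the rarer term higher
```

In the first message both "cash" and "win" occur once, so a count vectorizer treats them identically, while TF-IDF gives more weight to "cash" because it appears in fewer messages. That reweighting is the kind of difference a model-vectorizer comparison is probing.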
An in-depth comparison of supervised machine learning algorithms (including Decision Trees, Random Forest, KNN, and Logistic Regression) applied to breast cancer classification on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset. The paper benchmarks accuracy, precision, recall, and F1-score to identify the most reliable model for clinical decision support.
Machine Learning, Breast Cancer, Classification, Healthcare AI, Python, scikit-learn
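The four metrics named in the abstract can be computed from the confusion counts of a binary classifier. The sketch below is a standard-library illustration under stated assumptions: the function name `binary_metrics` is hypothetical, the labels are made-up toy values rather than WDBC results, and class `1` is assumed to be the positive (malignant) class.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels (1 = malignant, 0 = benign), not real WDBC output.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

Reporting recall alongside accuracy matters in a setting like this because a false negative (a malignant case predicted as benign) is usually the costlier error, and accuracy alone can hide it.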