International Journal For Multidisciplinary Research
E-ISSN: 2582-2160
Impact Factor: 9.24
A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal
Enhancement of Logistic Regression Algorithm Applied in Email Spam Detection
Author(s) | Vince Anthony S. Carlos, John Cedric C. Pancho, Vivien A. Agustin |
---|---|
Country | Philippines |
Abstract | Logistic regression is a popular binary classification approach, but like any machine learning algorithm, it has limitations, including sensitivity to class imbalance, difficulty scaling to large datasets, and overfitting, all of which reduce its accuracy and efficiency. This study enhanced the performance of the Logistic Regression algorithm for email spam detection by addressing these problems with three techniques: Term Frequency-Inverse Document Frequency (TF-IDF) for class imbalance, Recursive Feature Elimination (RFE) for large datasets, and Principal Component Analysis (PCA) for overfitting. TF-IDF improves feature representation by highlighting the key terms that differentiate spam from non-spam. RFE systematically eliminates irrelevant features, which reduces computational complexity and improves efficiency, particularly on large datasets. PCA mitigates overfitting by reducing the dimensionality of the feature space, helping the model generalize to unseen data. The enhanced Logistic Regression model showed a significant improvement in spam detection, achieving up to 98% accuracy with TF-IDF. RFE reduced training time while maintaining robust performance on large datasets, and PCA improved model generalization, reducing the risk of overfitting. The proposed enhancements address the key limitations of traditional Logistic Regression models in spam detection. The refined approach improves predictive accuracy, computational efficiency, and robustness, making it applicable to real-world email security systems. |
Keywords | Logistic Regression, Spam Detection, TF-IDF, Recursive Feature Elimination, Principal Component Analysis, Machine Learning |
Field | Computer |
Published In | Volume 6, Issue 6, November-December 2024 |
Published On | 2024-12-17 |
Cite This | Enhancement of Logistic Regression Algorithm Applied in Email Spam Detection - Vince Anthony S. Carlos, John Cedric C. Pancho, Vivien A. Agustin - IJFMR Volume 6, Issue 6, November-December 2024. DOI 10.36948/ijfmr.2024.v06i06.32374 |
DOI | https://doi.org/10.36948/ijfmr.2024.v06i06.32374 |
Short DOI | https://doi.org/g8wkpr |
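
The abstract describes TF-IDF as the step that highlights terms separating spam from non-spam before Logistic Regression is applied. The following is a minimal sketch of that idea, not the authors' implementation: it uses only the Python standard library, a smoothed IDF formula chosen for illustration, and a four-message toy corpus that is entirely hypothetical. The function names (`fit_idf`, `tfidf_vector`, `train_logreg`, `spam_probability`) and the hyperparameters are assumptions made for this example. RFE and PCA are not shown.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplification for this sketch)."""
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf_vector(doc, idf, vocab):
    """L2-normalized TF-IDF vector; terms outside the vocabulary are ignored."""
    counts = Counter(tokenize(doc))
    vec = [counts[term] * idf[term] for term in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, lr=0.5, epochs=500, l2=0.01):
    """Logistic regression fitted by stochastic gradient descent with L2 penalty."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * (err * xj + l2 * wj) for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def spam_probability(text, idf, vocab, w, b):
    """Estimated probability that `text` is spam under the fitted model."""
    x = tfidf_vector(text, idf, vocab)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical toy corpus: 1 = spam, 0 = non-spam.
emails = [
    "win free prize now",
    "free money claim prize",
    "meeting agenda for monday",
    "project report attached for review",
]
labels = [1, 1, 0, 0]

idf = fit_idf(emails)
vocab = sorted(idf)
X = [tfidf_vector(e, idf, vocab) for e in emails]
w, b = train_logreg(X, labels)

predictions = [int(spam_probability(e, idf, vocab, w, b) > 0.5) for e in emails]
```

After fitting, `spam_probability("claim free prize", idf, vocab, w, b)` scores an unseen message against the learned weights. On a corpus this small the numbers only illustrate the mechanics and say nothing about the accuracy reported in the paper.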
All research papers published on this website are licensed under Creative Commons Attribution-ShareAlike 4.0 International License, and all rights belong to their respective authors/researchers.