International Journal For Multidisciplinary Research

E-ISSN: 2582-2160     Impact Factor: 9.24

A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal

Duplicate Question Detection in Q&A Platforms: A Comparative Study of Traditional and Deep Learning Approaches

Author(s) Jwalin Thaker
Country India
Abstract As the world becomes more connected, the volume of data being generated is growing rapidly. Identifying duplicate and relevant data is an important and interesting problem in data science. This paper examines Quora, one of the largest collections of questions and answers on the internet, and presents a comprehensive analysis of duplicate question detection using Quora’s question-pair dataset. We compare traditional machine learning approaches (Random Forest, XGBoost) with modern deep learning architectures (LSTM networks) for semantic similarity detection. Our experiments demonstrate that LSTM-based models achieve superior performance (78% validation accuracy) compared to conventional methods (72-74% accuracy), highlighting the importance of sequence modeling for natural language understanding tasks. The study provides insights into feature engineering challenges, model scalability, and computational trade-offs in real-world NLP applications.
Keywords Duplicate Question Detection, LSTM, Random Forest, XGBoost, Natural Language Processing
Field Engineering
Published In Volume 3, Issue 2, March-April 2021
Published On 2021-03-06
DOI https://doi.org/10.36948/ijfmr.2021.v03i02.38544
Short DOI https://doi.org/
