International Journal For Multidisciplinary Research

E-ISSN: 2582-2160     Impact Factor: 9.24

A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal


AI-Powered Multilingual OCR System for Multi-Language Translation and Summarization

Author(s) Prof. Dr. SAKTHIVEL M, Ms. AKSHATHAA SA, Ms. YUVASRI V, Mr. SETHURAMAN C R, Ms. SREEJITHA S
Country India
Abstract Optical Character Recognition (OCR) has become a key technology for extracting and manipulating text from images, supporting digital accessibility and information retrieval. This study addresses the challenge of extracting and processing handwritten multilingual text from images with high accuracy. Conventional OCR systems often fail to recognize handwritten documents accurately, which introduces errors during text extraction and in subsequent translation. In this paper, we propose an AI-based OCR solution built on Google's Gemini 2.0 Flash that uses deep learning to improve text detection, language identification, translation, and summarization. Unlike conventional OCR systems, it relies on multimodal vision transformers for reliable text recognition across handwriting styles, which lets the whole workflow run as an efficient multilingual processing pipeline. The system was tested on real-life handwritten samples and shows promise in text extraction correctness, language detection, and translation quality. It attained an OCR accuracy of 92.5% and a language detection success rate of 95%. The solution broadens access to handwritten content, with potential benefits for multilingual communication, education, and cross-cultural information exchange. The system is intended to run on both web and mobile platforms.
Keywords Multilingual OCR, Handwritten Text Recognition, Google Gemini 2.0 Flash, Deep Learning, AI-based Translation, Language Detection, Text Summarization, Natural Language Processing, Image Processing, Flet Mobile and Web Application.
Field Engineering
Published In Volume 7, Issue 2, March-April 2025
Published On 2025-04-17
DOI https://doi.org/10.36948/ijfmr.2025.v07i02.41596
Short DOI https://doi.org/g9f4sg
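
The four stages named in the abstract (text extraction, language identification, translation, summarization) can be sketched as a single request to a multimodal model. This is a minimal illustration only, not the authors' implementation: the prompt wording, the JSON reply format, and the names `PROMPT_TEMPLATE`, `OcrResult`, `run_pipeline`, and `fake_generate` are all assumptions, and the model call is passed in as a plain function so the sketch runs without any API key.

```python
import json
from dataclasses import dataclass
from typing import Callable

# Hypothetical prompt: the abstract does not give the prompts actually used.
PROMPT_TEMPLATE = (
    "Read the handwritten text in the attached image. "
    "Reply with a JSON object with the keys "
    '"text" (the transcription), "language" (the detected language), '
    '"translation" (the text translated into {target}) and '
    '"summary" (a short summary in {target}).'
)


@dataclass
class OcrResult:
    text: str
    language: str
    translation: str
    summary: str


def _strip_code_fence(reply: str) -> str:
    """Remove a surrounding Markdown code fence, which models sometimes add."""
    fence = "`" * 3
    s = reply.strip()
    if s.startswith(fence):
        # Drop the opening fence line (it may carry a language tag).
        s = s.split("\n", 1)[1] if "\n" in s else ""
        s = s.rstrip()
        if s.endswith(fence):
            s = s[: -len(fence)]
    return s.strip()


def run_pipeline(
    image_bytes: bytes,
    target: str,
    generate: Callable[[str, bytes], str],
) -> OcrResult:
    """Run OCR, language detection, translation and summarization in one call.

    `generate` is any function that takes (prompt, image_bytes) and returns
    the model's reply as text; wiring it to a real model is left out here.
    """
    prompt = PROMPT_TEMPLATE.format(target=target)
    data = json.loads(_strip_code_fence(generate(prompt, image_bytes)))
    return OcrResult(
        text=data.get("text", ""),
        language=data.get("language", ""),
        translation=data.get("translation", ""),
        summary=data.get("summary", ""),
    )


def fake_generate(prompt: str, image_bytes: bytes) -> str:
    """Stand-in for the model call, returning a canned reply for illustration."""
    return json.dumps(
        {
            "text": "Bonjour",
            "language": "French",
            "translation": "Hello",
            "summary": "A greeting.",
        }
    )


if __name__ == "__main__":
    print(run_pipeline(b"", "English", fake_generate))
```

In practice `generate` would wrap a request to a multimodal model such as Gemini 2.0 Flash. Keeping it as a parameter separates the pipeline logic from the model client, so the parsing can be tested with canned replies.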
