A Comprehensive Theoretical and Empirical Framework for Fine-Tuning the CoRover's BharatGPT Transformer for Indic Languages

Author(s)	Ankush Sabharwal, Vikas Tripathi, Onkar Nath
Country	India
Abstract	The widespread adoption of transformer-based models in natural language processing (NLP) has led to significant breakthroughs across many languages. However, models such as BharatGPT (from CoRover.ai), though robust for high-resource languages, require specialized adaptation to handle the rich morphological and syntactic diversity of Indic languages. In this paper, we propose a comprehensive framework for fine-tuning the BharatGPT transformer to support Indic languages. Our approach integrates tailored data preprocessing, script-specific embedding enhancements, and rigorous convergence analysis. We derive key theoretical properties of the fine-tuning algorithm, including a convergence theorem under Lipschitz continuity and bounded gradient variance assumptions, and we validate the approach empirically using standard metrics such as perplexity, BLEU, and F1 score. The results show significant improvements across several Indic languages, underscoring the effectiveness of our methodology.
Keywords	BharatGPT, CoRover, Fine-Tuning, LLM, Indic Languages, AI
Field	Computer > Artificial Intelligence / Simulation / Virtual Reality
Published In	Volume 7, Issue 1, January-February 2025
Published On	2025-02-15
DOI	https://doi.org/10.36948/ijfmr.2025.v07i01.37188
Short DOI	https://doi.org/g847z5

International Journal For Multidisciplinary Research