International Journal For Multidisciplinary Research

E-ISSN: 2582-2160     Impact Factor: 9.24

A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal

The Future of AI in Production: Leveraging Kubernetes for Large Language Model Deployment

Author(s): Shikhar Srivastava, Harsh Srivastava, Ayushi Jaymani, Palak Singh
Country: India
Abstract: Deploying Large Language Models (LLMs) at scale presents significant challenges in resource allocation, cost efficiency, latency, multi-cloud compatibility, and system reliability. This paper presents an approach that combines Docker’s lightweight containerization with Kubernetes’ orchestration for LLM deployment. The proposed architecture targets scalability, efficient resource utilization, and multi-cloud flexibility, while addressing ethical, environmental, and security concerns. Through case studies, we show how these technologies can improve performance, reduce cost, and simplify operations for large-scale LLM production systems.
Keywords: Large Language Models, Docker, Kubernetes, Containerization, Orchestration, Multi-cloud Deployment, AI Workflows, Cloud Computing, Resource Management, Cost Efficiency, Security Best Practices, Ethical Considerations.
Field: Engineering
Published In: Volume 7, Issue 1, January-February 2025
Published On: 2025-01-29
Cite This: The Future of AI in Production: Leveraging Kubernetes for Large Language Model Deployment - Shikhar Srivastava, Harsh Srivastava, Ayushi Jaymani, Palak Singh - IJFMR Volume 7, Issue 1, January-February 2025. DOI 10.36948/ijfmr.2025.v07i01.36056
DOI: https://doi.org/10.36948/ijfmr.2025.v07i01.36056
Short DOI: https://doi.org/g834d3
