KubeCon Paris 2024: Exploring the AI Frontier

As the plane touched down in Paris, the anticipation of KubeCon Europe 2024 surged through me. This trip was a dive into the next wave of tech evolution, especially the AI advancements sweeping through the Kubernetes ecosystem. The theme of this year’s conference was GenAI, and the stage was perfectly set for an event that promised to be not just informative, but transformative.

The nights leading up to my trip were restless, filled with intense preparation for my presentation, “AI Assisted Runbooks – Instigating Precision and Efficiency in Kubernetes Operations”, which focused on running a local Llama 2 7B RAG pipeline on AWS GPU instances and using the gopaddle + AI integration to troubleshoot Kubernetes issues. However, the thrill of what was to come far outweighed my exhaustion. Arriving in Paris, I was greeted by the vastness of its airport and an unexpected challenge: the language barrier. Yet, the sight of KubeCon banners instilled a sense of pride and excitement in me.

The Grand Demo Day 🙂 

KubeCon Europe had become the prime spot for enthusiasts eager to delve into the convergence of Kubernetes and AI. My talk showed just how keen everyone was, with more than 620 people signing up and the room filled to the brim. Even though AI was new to many attendees, their enthusiasm to learn how to set up AI locally on their own infrastructure was clear.

My talk focused on creating an IT-compliant governance framework for AI interactions, followed by a demonstration of obtaining precise responses from Llama 2 7B using a RAG pipeline to troubleshoot Kubernetes issues. I built a dataset from various Kubernetes sources and fed it into the RAG pipeline. Using an AWS GPU instance and a preconfigured AMI with CUDA drivers, I demonstrated a Pod failure and compared the responses of the base Llama 2 7B model with those from my RAG pipeline; the RAG pipeline yielded markedly more accurate responses. I will cover the specifics of the implementation in a separate blog post.
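The core idea behind that comparison can be sketched in a few lines: retrieve the most relevant troubleshooting notes for a query and prepend them to the prompt before it reaches the model. The snippet below is a minimal, self-contained illustration with a toy bag-of-words retriever and hypothetical documents; my actual pipeline used Llama 2 7B together with a proper embedding model and a Kubernetes dataset.

```python
# Minimal sketch of the retrieval step in a RAG pipeline.
# The documents and scoring here are illustrative; a real pipeline would use
# a sentence-embedding model and a vector store instead of bag-of-words.
import math
from collections import Counter

def embed(text):
    # Toy "embedding": a bag-of-words term-frequency vector.
    return Counter(text.lower().split())

def cosine(a, b):
    num = sum(a[t] * b.get(t, 0) for t in a)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

# Tiny knowledge base of Kubernetes troubleshooting notes (hypothetical).
docs = [
    "A pod in CrashLoopBackOff restarts repeatedly; check container logs and liveness probes",
    "ImagePullBackOff means the image name or registry credentials are wrong",
    "A Pending pod usually signals insufficient resources or unsatisfied node selectors",
]
doc_vecs = [embed(d) for d in docs]

def retrieve(query, k=1):
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(range(len(docs)),
                    key=lambda i: cosine(q, doc_vecs[i]), reverse=True)
    return [docs[i] for i in ranked[:k]]

def build_prompt(query):
    # Prepend the retrieved context so the LLM answers from known-good notes
    # instead of relying on the base model's memory alone.
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

print(build_prompt("why is my pod stuck in CrashLoopBackOff"))
```

The difference my demo highlighted comes entirely from that injected context: the base model answers from general training data, while the augmented prompt grounds the answer in curated Kubernetes documentation.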

Key Highlights

A Decade of Kubernetes and the AI Revolution

Priyanka’s keynote celebrated a notable landmark in Kubernetes history: Kubernetes will hit its 10th anniversary on 6th June. With an attendance of over 12,000 people, it was clear that we were part of the largest KubeCon ever. Priyanka emphasized the journey from development to production, the scaling challenges, and the ongoing AI revolution. Her demo of an inference model that captured an image of the audience and converted it to text was quite interesting.

The introduction of the Cloud Native AI Working Group and the AI architecture white paper were key reference points. In her concluding remarks, Priyanka highlighted collaborations and advancements within the Kubernetes and AI community by introducing Jeffrey Morgan, the founder of Ollama; Paige Bailey from Google; and Timothee Lacroix, co-founder of Mistral AI. It was fascinating to learn that Mistral uses Kubernetes for training, fine-tuning, and inferencing.

Accelerating AI with Kubernetes

A keynote that resonated with me was by Kevin Klues and Sanjay Chatterjee from NVIDIA on “Accelerating AI Workloads with GPUs in Kubernetes”. Their insights into GPU sharing, dynamic resource allocation, and the intricacies of scheduling AI workloads at scale were eye-opening, as was their case for topology-aware scheduling, fault tolerance, and multi-dimensional optimization.

The AI Hub

As a continuation of the keynote, the AI Hub hosted a round table with Jeffrey, Paige, and Timothee. The primary concerns raised by the audience were the integrity of the data, the compliance requirements around it, and methods for evaluating model accuracy. Jeffrey gave a demo of Ollama; its approach of describing models in a Dockerfile-like format, backed by a model registry, resonated with familiar Docker concepts. Paige announced Gemma, a family of open models from Google, and showcased customized Jupyter notebooks on Google Colab for working with Gemma.
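To make the Dockerfile analogy concrete: an Ollama Modelfile declares a base model and its runtime configuration in a few directives. The example below is an illustrative sketch, not something shown in the demo; the parameter value and system prompt are my own.

```
# Hypothetical Modelfile for a Kubernetes troubleshooting assistant
FROM llama2:7b
PARAMETER temperature 0.2
SYSTEM "You are a Kubernetes troubleshooting assistant. Answer only from the provided context."
```

Running `ollama create` against a file like this builds a named local model, much as `docker build` does with a Dockerfile, and the result can be pushed to or pulled from a model registry.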

What amazed me was Janakiram’s demo of a RAG pipeline for a hypothetical airline customer-experience chatbot. I learned that he had set up a full-stack private infrastructure with GPUs, with Kubernetes running the AI models.

Reflections

As I reflect on my time at KubeCon, the excitement and inspiration from those few days continue to fuel my passion for AI and Kubernetes. The journey was a testament to the power of community, innovation, and the relentless pursuit of knowledge. KubeCon Paris 2024 was, undoubtedly, a milestone event in the journey of Kubernetes and AI, marking the beginning of a new chapter in cloud-native technologies.
