Large language models (LLMs) have taken the world by storm, enabling new applications, intensifying GPU shortages, and raising concerns about the accuracy of their outputs. In this talk, I will present several projects I have worked on to address these challenges. Specifically, I will focus on Ray, a distributed framework for scaling AI workloads; vLLM and SGLang, two high-throughput inference engines for LLMs; and LMArena, a platform for accurate LLM benchmarking. I will conclude with key lessons learned and outline directions for future research.
Ion Stoica is a Professor in the Electrical Engineering and Computer Science Department at the University of California, Berkeley, and holds the Xu Bao Chancellor Chair. He is the Director of the Sky Computing Lab and the Executive Chairman and Co-founder of Databricks and Anyscale. Professor Stoica’s current research focuses on AI systems and cloud computing. His work includes the open-source projects vLLM, SGLang, Chatbot Arena, SkyPilot, Ray and Apache Spark. He is a Member of the National Academy of Engineering, an Honorary Member of the Romanian Academy, and an ACM Fellow.