AI Summit — Session & Speakers
Track B - Session 8

The Journey to Scaling Up AI Inference

AI Summit Seoul 2025, 30 min

Session Overview

As adoption of generative AI accelerates and agentic AI systems add new inference demands, the greatest challenge lies in scaling workloads from prototype to production, where cost, latency, and the complexity of GPU management often stall deployment. This talk explores essential strategies, including quantization, batching, caching, and hardware-aware optimization, that bridge the gap between research-level performance and production-grade performance and reliability. Drawing on lessons from large-scale deployments, we highlight how these strategies enable developers to achieve higher throughput, lower costs, and predictable outcomes. We conclude by showing how these principles are realized in FriendliAI, which is powered by a purpose-built inference stack that abstracts infrastructure complexity and consistently delivers unmatched performance at scale.

Speaker

Jeon Byung-gon
CEO
FriendliAI

Jeon Byung-gon, Founder & CEO of FriendliAI, is a leading innovator in AI inference platforms, bridging research and industry. He oversees the platform architecture and core technologies for efficient model deployment, inference optimization, and operational automation. His background spans academia and top tech labs: Professor of Computer Science and Engineering at Seoul National University; Visiting Researcher at Facebook; Senior Researcher at Microsoft; and Researcher at Yahoo! and Intel. He holds a Ph.D. in Computer Science from UC Berkeley.

Session details may be updated as the event approaches. Final schedule to be announced on the official site.