Recap: 2nd vLLM Korea Meetup 2026
The weather is getting warmer, and outdoor activities are picking up. That also means more in-person meetups where we get to see familiar faces! 😊
On April 2, 2026, SqueezeBits co-hosted the 2nd vLLM Korea Meetup alongside Rebellions, Red Hat, and the PyTorch Korea User Group. vLLM drew huge developer interest last year, with two hands-on sessions in the second half of the year alone, and we rolled up our sleeves to support the growth of Korea's AI developer community. Despite the weekday-evening time slot, the event received over 300 pre-registrations. Attendees brought enthusiastic energy as the sessions covered everything from research to real-world use cases, and the evening wrapped up on a high note.
The Present and Future of vLLM and the Importance of Production
Over the past six months, the vLLM ecosystem has grown stronger through tight integration with Korean hardware accelerators.
Hongseok Kim, CSA at Rebellions, kicked off the first talk. Rebellions released dedicated hardware plugins for their neural processing units (NPUs), ATOM and REBEL, making a major contribution to the ecosystem. Their next-generation NPU lineup, REBEL100, drew particular attention for matching NVIDIA H200-level performance. He also highlighted how the vLLM KR community has been hosting regular hands-on workshops, building a solid foundation for Korean developers to continuously learn and share the latest technologies.
In the second session, Li Ming from Red Hat shared Red Hat's contributions to the global vLLM project and its various efforts to improve usability. The "vllm-playground" project grabbed attention; it lets anyone test vLLM features and visualize performance on the web without a complex setup. The session showed the global community's effort to lower barriers: expanding hardware support to include Rebellions NPUs and strengthening integration with Red Hat OpenShift AI, an enterprise-grade AI platform.
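For anyone curious what "testing vLLM without a complex setup" looks like underneath, here is a minimal smoke test against a locally running vLLM server, the kind of check vllm-playground wraps in a web UI. The model name and port are placeholders, not details from the talk:

```python
# Minimal smoke test of a local vLLM server via its OpenAI-compatible API.
# Assumes a server was started with e.g. `vllm serve <model>` on port 8000;
# the model name below is illustrative.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # vLLM's OpenAI-compatible endpoint
    api_key="EMPTY",                      # vLLM needs no real key by default
)

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",     # whatever model the server hosts
    messages=[{"role": "user", "content": "Say hello in Korean."}],
    max_tokens=64,
)
print(response.choices[0].message.content)
```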
In the third session, Taesu Kim, CTO of SqueezeBits, presented on "vLLM Production Stack." He covered what vLLM Production Stack offers in real production environments and how it plans to scale going forward. vLLM has moved beyond simple model serving: with dramatic speed improvements through LMCache and cost optimization via scale-to-zero, it now delivers the operational capabilities and scalability that production services demand.
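To make the LMCache point concrete, here is a rough sketch of pairing vLLM with LMCache for KV-cache reuse. The exact config surface varies across vLLM and LMCache releases, so treat the connector and field names below as assumptions to verify against the versions you run:

```python
# Sketch: route KV-cache save/load through LMCache so prefixes computed
# once can be reused across requests (and offloaded beyond GPU memory).
# Connector/field names follow recent releases and may differ in yours;
# LMCache itself may also need LMCACHE_* environment-variable config.
from vllm import LLM, SamplingParams
from vllm.config import KVTransferConfig

kv_cfg = KVTransferConfig(
    kv_connector="LMCacheConnectorV1",  # LMCache connector for vLLM v1
    kv_role="kv_both",                  # this instance both stores and loads KV
)

llm = LLM(model="Qwen/Qwen2.5-7B-Instruct",  # placeholder model
          kv_transfer_config=kv_cfg)
out = llm.generate(["Summarize vLLM in one sentence."],
                   SamplingParams(max_tokens=64))
print(out[0].outputs[0].text)
```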
A First: Two-Track Meetup with Diverse Sessions!
After the shared sessions, the dedicated technical sessions began! The highlight of this meetup was the two-track format, where attendees could choose sessions based on their interests. One track covered technical challenges in the open-source ecosystem, while the other explored real-world business cases. We worried about trying this format for the first time, but it let us deliver a wide spectrum of knowledge within a limited timeframe and maximize attendee satisfaction.
Track 1: vLLM in the Open-Source Ecosystem
In the first session, Jooho Lee from XCENA presented KV Cache optimization strategies for resolving data processing bottlenecks. Rather than simply speeding up computation, he proposed a "memory-centric" infrastructure direction that uses CXL memory and LMCache to store and reuse large-scale data more efficiently. The case showed that large language model (LLM) infrastructure is evolving beyond basic compute optimization.
Next, Inseo Song from Upstage, which is also leading Korea's Independent AI Foundation Model initiative, shared the journey of turning their open-source model "Solar" into a production service. He emphasized that engineering matters just as much as model training, and shared practical know-how on handling diverse user requirements reliably through Chat Template design and vLLM integration.
Track 2: Real-World Business Cases with vLLM
Track 2 opened with Sungsu Kim from Samsung Electronics tackling the most sensitive issue in enterprise environments: security. Semiconductor companies face particular challenges using external AI services. He shared how Samsung built a private LLM on in-house GPU infrastructure and deployed an air-gapped operation strategy serving over 4,000 employees. He candidly shared the real-world difficulties they encountered, which resonated deeply with the audience.
In the final session, Jaeeun Gil from Naver Cloud presented the serving case for the HyperCLOVA X Omni model. To serve this multimodal model, which processes text, images, and audio, the team adopted a Disaggregated Serving architecture that runs each stage of the pipeline as an independent service. She also shared how they identified and optimized the bottlenecks to achieve over 3x performance gains. The talk confirmed that LLM serving is expanding into a complex pipeline optimization challenge.
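For readers new to the pattern, here is a conceptual sketch of disaggregated serving in vLLM: one instance handles prefill (KV producer), another handles decode (KV consumer), and the KV cache is shipped between them. The flags follow vLLM's experimental disaggregated-prefill example and may change between versions; Naver Cloud's actual HyperCLOVA X Omni pipeline was not shared in detail, so this only illustrates the general idea:

```python
# Sketch of a two-instance disaggregated setup. The --kv-transfer-config
# values mirror vLLM's experimental disaggregated-prefill example and
# should be checked against your vLLM version; the model is a placeholder.
import subprocess

MODEL = "Qwen/Qwen2.5-7B-Instruct"

# Prefill instance: computes the prompt's KV cache and exports it.
prefill = subprocess.Popen([
    "vllm", "serve", MODEL, "--port", "8100",
    "--kv-transfer-config",
    '{"kv_connector":"PyNcclConnector","kv_role":"kv_producer",'
    '"kv_rank":0,"kv_parallel_size":2}',
])

# Decode instance: imports that KV cache and generates tokens.
decode = subprocess.Popen([
    "vllm", "serve", MODEL, "--port", "8200",
    "--kv-transfer-config",
    '{"kv_connector":"PyNcclConnector","kv_role":"kv_consumer",'
    '"kv_rank":1,"kv_parallel_size":2}',
])

# A thin proxy would then route each request through prefill, then decode,
# so each stage can be scaled and optimized independently.
```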
It was a meaningful event! The open-source ecosystem's efforts to mature vLLM's technology came together with real-world cases where teams tackled both security and performance.
Meetup Preparation and High Satisfaction
Thanks to thorough preparation, attendee satisfaction was high across the board. Many shared that they want to join the next vLLM event. After all, it is rare to find a community-driven event focused purely on vLLM rather than on corporate promotion.
Every session had nonstop questions. During breaks and networking time, attendees exchanged technical ideas and showed passion well into the evening. I couldn't help feeling a bit proud seeing that energy.
Reflections from the Organizing Team
This meetup also demonstrated the community's growth through volunteer staff. Beyond the SqueezeBits and Rebellions organizing teams, we recruited four volunteer staff members via LinkedIn — all passionate about vLLM.
I worried at first about coordinating so many people. But the volunteer staff helped across every stage, from pre-event promotion to event logistics and wrap-up. Their contributions truly made this community meetup shine.
I loved being part of a vLLM community event I had always been interested in. As a staff member, I got to see how the meetup comes together behind the scenes and talk with people who share the same interests. It was a great learning experience for me, too, and I'm glad the event ran smoothly.
It was a wonderful time gaining insights from such a valuable event and contributing in a small way. Watching speakers and attendees network freely left a real impression. If vLLM Meetup grows into an ongoing community and recruits staff again, I would definitely apply.
Community growth starts when people come together.
The vLLM Korea community was built by people who share a passion for vLLM, exchange challenges, and share insights with each other. We hope events like this continue to help talented Korean developers grow together and help the vLLM community expand naturally.
Thank you to everyone who made this event possible. We will be back with even richer content at the next vLLM Meetup.
See you next time! 🙌