LLM Inference Engine
High-performance LLM serving system with dynamic batching, quantization support, and distributed inference capabilities.
Kyrie Chen
I write about machine learning, LLM systems, AI infrastructure, and the cities and journeys that stay with me after the trip is over.
Engineer, researcher, traveler, husband, and father. This site is where technical notes and personal essays live together.
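My technical posts and notes record every step of my studies and my work; my writing on travel and life records the places and moments that are worth looking back on again and again.
Here you will find large language models, inference systems, training techniques, and engineering implementations, alongside the cities, coastlines, and backstreets of my travels and the everyday moments that stay in memory.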
Production-ready retrieval-augmented generation system with vector database, embedding optimization, and context management.
Distributed training infrastructure with experiment tracking, hyperparameter tuning, and model versioning.
Collection of NLP utilities for text processing, sentiment analysis, and entity recognition with multilingual support.
2026-04-01
2026-03-22
2026-01-16
2025-11-27
2025-11-05
2025-10-15