Coding Agents for GPU Kernel Generation
Note: This article is still being updated. Please check back for the latest version.

I recently participated in the FlashInfer AI Kernel Generation Contest (FlashInfer Contest, 2026). This blog post is not a tutorial on CUDA kernel optimization, and I am not an expert in GPU operator development. My main goal in joining the contest was to use its highly verifiable task environment, with clear feedback, to study how coding agents can continuously produce high-quality GPU kernels in a closed-loop workflow. The full technical report is Harness Engineering for LLM-Driven GPU Kernel Generation (Shui et al., 2026), and the public repository is mlsys26-flashinfer-contest. ...