Haoyang Ma Blog

Personal website of Haoyang Ma, a research engineer working on LLM agents and systems.
https://haoyang9804.github.io/

A Modern Kong Yiji: N Ways to Write softmax in CUDA
https://haoyang9804.github.io/blog/%E7%8E%B0%E4%BB%A3%E5%AD%94%E4%B9%99%E5%B7%B1---softmax%E7%9A%84n%E7%A7%8Dcuda%E5%86%99%E6%B3%95/
Notes from a CUDA kernel performance experiment, progressing from a naive softmax to a multi-kernel reduce.
Fri, 22 May 2026 00:00:00 GMT