# Blog LLM Index

- [Hello, Layout! ; Visualizing Memory in CuTe](https://www.dcbaslani.xyz/blog/01_hello_layout/)
  - Markdown: https://www.dcbaslani.xyz/blog/01_hello_layout/index.md
  - Text: https://www.dcbaslani.xyz/blog/01_hello_layout/index.txt
- [The Art of Slicing ; Partitioning Data Across Blocks and Threads](https://www.dcbaslani.xyz/blog/02_the_art_of_slicing/)
  - Markdown: https://www.dcbaslani.xyz/blog/02_the_art_of_slicing/index.md
  - Text: https://www.dcbaslani.xyz/blog/02_the_art_of_slicing/index.txt
- [The Naive Copy ; Scalar vs. Vectorized Memory Movement](https://www.dcbaslani.xyz/blog/03_the_naive_copy/)
  - Markdown: https://www.dcbaslani.xyz/blog/03_the_naive_copy/index.md
  - Text: https://www.dcbaslani.xyz/blog/03_the_naive_copy/index.txt
- [The Parallel Copy ; Orchestrating Threads with TiledCopy](https://www.dcbaslani.xyz/blog/04_the_parallel_copy/)
  - Markdown: https://www.dcbaslani.xyz/blog/04_the_parallel_copy/index.md
  - Text: https://www.dcbaslani.xyz/blog/04_the_parallel_copy/index.txt
- [Swizzling ; Avoiding Shared Memory Bank Conflicts](https://www.dcbaslani.xyz/blog/05_swizzling/)
  - Markdown: https://www.dcbaslani.xyz/blog/05_swizzling/index.md
  - Text: https://www.dcbaslani.xyz/blog/05_swizzling/index.txt
- [Hello, MMA — Your First Tensor Core Instruction](https://www.dcbaslani.xyz/blog/06_hello_mma/)
  - Markdown: https://www.dcbaslani.xyz/blog/06_hello_mma/index.md
  - Text: https://www.dcbaslani.xyz/blog/06_hello_mma/index.txt
- [The Global GEMM — Putting It All Together](https://www.dcbaslani.xyz/blog/07_the_global_gemm/)
  - Markdown: https://www.dcbaslani.xyz/blog/07_the_global_gemm/index.md
  - Text: https://www.dcbaslani.xyz/blog/07_the_global_gemm/index.txt
- [The TMA Revolution (Async Copy)](https://www.dcbaslani.xyz/blog/08_the_tma_revolution/)
  - Markdown: https://www.dcbaslani.xyz/blog/08_the_tma_revolution/index.md
  - Text: https://www.dcbaslani.xyz/blog/08_the_tma_revolution/index.txt
- [WGMMA ; Warpgroup MMA](https://www.dcbaslani.xyz/blog/09_wgmma/)
  - Markdown: https://www.dcbaslani.xyz/blog/09_wgmma/index.md
  - Text: https://www.dcbaslani.xyz/blog/09_wgmma/index.txt
- [Cute-DSL: I Wrote a CUDA Kernel in Python and My GPU Didn't Even Cry](https://www.dcbaslani.xyz/blog/cute-dsl-blog/)
  - Markdown: https://www.dcbaslani.xyz/blog/cute-dsl-blog/index.md
  - Text: https://www.dcbaslani.xyz/blog/cute-dsl-blog/index.txt
- [Breaking PyTorch Boundaries: Fusing RMSNorm and GDN in Triton for Qwen 3.5](https://www.dcbaslani.xyz/blog/qwen_3.5/)
  - Markdown: https://www.dcbaslani.xyz/blog/qwen_3.5/index.md
  - Text: https://www.dcbaslani.xyz/blog/qwen_3.5/index.txt
- [The Feynman GPU Lectures](https://www.dcbaslani.xyz/blog/gpu_masterclass/)
  - Markdown: https://www.dcbaslani.xyz/blog/gpu_masterclass/index.md
  - Text: https://www.dcbaslani.xyz/blog/gpu_masterclass/index.txt