Commits (88)
46794c4  [Enhancement] Refactor buffer index handling for improved precision a… (Jul 29, 2025)
499daa3  Remove obsolete test script for AMD example, streamlining the example… (Jul 29, 2025)
555537a  Remove unused dtype_size variable in AMD example script to streamline… (Jul 29, 2025)
f84bc97  Add input configuration file and update AMD example script for enhanc… (Jul 30, 2025)
21cf0c3  Remove input configuration file and obsolete test script; enhance AMD… (Jul 30, 2025)
9b2fab3  Refactor AMD example script for FlashAttention-2 (Jul 30, 2025)
24e08ae  Refactor formatting in AMD FlashAttention example script (Jul 30, 2025)
bc2663a  Update example_amd_flash_attn_fwd.py (LeiWang1999, Jul 31, 2025)
4d427d9  Enhance AMD example script and update CI workflows (Aug 18, 2025)
4fd8529  Merge branch 'main' into main (Alex4210987, Aug 18, 2025)
cf99bef  Remove redundant tool cache cleanup step in AMD CI workflow (Aug 18, 2025)
e839192  Remove `torch` dependency from `requirements-rocm.txt` to streamline … (Aug 18, 2025)
70f3f6a  Add new AMD FlashAttention example and test script (Aug 23, 2025)
2bf7961  Update configurations in `example_amd_flash_attn_fwd.py` for autotuner (Aug 23, 2025)
f7f6131  Update submodule 'tvm' to commit 6ccc74f622c7ec4ac25d430d0f6546e7b9ed… (Aug 24, 2025)
91e9548  Update submodule 'tvm' to commit 14ff70ab142b9e5a31bbf9c7923c8a697d41… (Aug 24, 2025)
460c64f  Merge branch 'tile-ai:main' into main (Alex4210987, Aug 24, 2025)
8eefca0  Merge branch 'tile-ai:main' into main (Alex4210987, Sep 3, 2025)
7bd45c5  Add example for AMD Flash Attention backward pass implementation (Sep 3, 2025)
4cf8c30  Merge branch 'amd_dev' (Sep 3, 2025)
bc22219  Merge branch 'main' of https://github.com/Alex4210987/tilelang (Sep 3, 2025)
50b97e1  Enhance AMD Flash Attention example with additional testing capabilities (Sep 3, 2025)
05305f2  Update submodule TVM to commit a64a5926a6e59f5417ef2501f9d88b467337cf6a (Sep 3, 2025)
923fc6d  Refactor HIP intrinsic rules to CUDA (Sep 3, 2025)
7b7fda3  Update AMD CI workflow to uninstall specific PyTorch packages before … (Sep 3, 2025)
1008679  Remove unused shared memory allocations in AMD Flash Attention backwa… (Sep 3, 2025)
f490b4a  Remove unnecessary pip uninstall command from AMD CI workflow (Sep 3, 2025)
b39ada8  Refactor DispatchHIPWarpActiveMask function in HIP intrinsic rules (Sep 3, 2025)
d62b898  Refactor formatting of HIP intrinsic rule registrations (Sep 3, 2025)
e7b0f30  Update file name and documentation for HIP intrinsic rules (Sep 3, 2025)
8c73c9c  Enhance DispatchHIPShuffle function with clang-analyzer comments (Sep 3, 2025)
c8aec22  lint fix (LeiWang1999, Sep 4, 2025)
4549e0e  Merge branch 'main' of https://github.com/tile-ai/tilelang into Alex4… (LeiWang1999, Sep 4, 2025)
ccadc2e  fix (LeiWang1999, Sep 4, 2025)
b491082  Enhance autotuner configurations in example_amd_flash_attn_fwd.py by … (Sep 7, 2025)
3289910  Add backward attention example to test script (Sep 7, 2025)
10870e1  Refactor FlashAttention implementation in example_amd_flash_attn_bwd.… (Sep 7, 2025)
f20cd33  Enhance FlashAttention backward implementation in example_amd_flash_a… (Sep 7, 2025)
570c6c9  Enhance FlashAttention backward implementation in example_amd_flash_a… (Sep 7, 2025)
fff5543  Refactor FlashAttention implementation in example_amd_flash_attn_bwd.… (Sep 8, 2025)
d5e3b6b  Enhance FlashAttention backward implementation in example_amd_flash_a… (Sep 10, 2025)
3f15c59  Expand autotuner configurations in example_amd_flash_attn_bwd.py and … (Sep 10, 2025)
0582143  Enhance performance calculations and benchmarking in example_amd_flas… (Sep 10, 2025)
e8f0d9f  Remove forward attention test commands from test.sh and retain backwa… (Sep 11, 2025)
335bbc6  Refactor FlashAttention forward and backward implementations in examp… (Sep 18, 2025)
cf8cc88  Refactor FlashAttention implementation in example_amd_flash_attn_bwd.py (Sep 20, 2025)
3a00c4d  Enhance FlashAttention backward implementation in example_amd_flash_a… (Sep 20, 2025)
3b839d2  Refactor configuration and tensor operations in example_amd_flash_att… (Sep 30, 2025)
4c11021  Merge remote-tracking branch 'upstream/main' (Sep 30, 2025)
bc9a5fb  Enhance HIP code generation and FP8 type support (Sep 30, 2025)
dd5b64f  Enhance FP8 type support and clarify accumulator handling in HIP (Sep 30, 2025)
42e5538  Remove deprecated files and update print statements for clarity in ex… (Oct 10, 2025)
9d53c8a  Update print statement formatting for clarity in example_amd_flash_at… (Oct 10, 2025)
cd3b6b5  Remove redundant verification results summary print statement in exam… (Oct 10, 2025)
3072de6  Fix formatting inconsistencies in example_amd_flash_attn_bwd.py and e… (Oct 10, 2025)
1913abb  Refactor and enhance HIP code generation for improved FP8 support (Oct 10, 2025)
acaf988  Fix formatting issue in HIP code generation for MFMA call (Oct 10, 2025)
4bc49cd  Refactor HIP code generation and enhance FP8 type handling (Oct 10, 2025)
ae39e35  Merge branch 'main' into main (Alex4210987, Oct 10, 2025)
0c0fa53  Remove unnecessary blank line in example_amd_flash_attn_bwd.py for im… (Oct 10, 2025)
9a4a08f  Merge branch 'main' of https://github.com/Alex4210987/tilelang (Oct 10, 2025)
8b345ae  Refactor backward attention implementation in example_amd_flash_attn_… (Oct 10, 2025)
c34315c  Fix formatting by removing an unnecessary blank line in example_amd_f… (Oct 10, 2025)
cd1564d  Merge branch 'main' of https://github.com/tile-ai/tilelang into Alex4… (LeiWang1999, Oct 14, 2025)
426df21  Merge branch 'tile-ai:main' into main (Alex4210987, Oct 15, 2025)
40eabcd  Add additional test cases for `assert_tl_matmul_correctness` with `fl… (Oct 15, 2025)
5b6bcaa  Refactor test case formatting for `assert_tl_matmul_correctness` in `… (Oct 15, 2025)
5440a1d  Refactor memory allocation in flash attention examples to use registe… (Oct 30, 2025)
3ae6316  Enhance flash attention backward example and GEMM implementation by a… (Nov 1, 2025)
f4a9fe4  Enhance flash attention backward example by adding runtime HIP code d… (Nov 1, 2025)
c139e7d  Merge remote-tracking branch 'upstream/main' (Nov 1, 2025)
46e3a15  Refactor flash attention backward example to use shared memory for V … (Nov 10, 2025)
886e822  Enhance flash attention backward example by adding HIP code compilati… (Nov 12, 2025)
ede28a3  Update flash attention backward example to use float16 for tensor ope… (Nov 12, 2025)
94ec505  Add body_sr function to GEMM implementation for optimized shared memo… (Nov 13, 2025)
eb4130b  Refactor flash attention backward example to utilize fragment allocat… (Nov 13, 2025)
e34448c  Merge remote-tracking branch 'upstream/main' (Nov 20, 2025)
02742c7  [Enhancement] Optimize flash attention backward pass and improve kern… (Nov 20, 2025)
9f62b3d  [Enhancement] Improve configuration validation and integrate Delta co… (Nov 24, 2025)
1bf181e  [Refactor] Enhance flash attention backward and forward pass implemen… (Nov 30, 2025)
88184fa  [Enhancement] Refine flash attention backward pass implementation (Dec 4, 2025)
b50e792  Merge remote-tracking branch 'upstream/main' (Dec 12, 2025)
56606af  [Refactor] Clean up code formatting and improve readability in AMD fl… (Dec 12, 2025)
0c29cd8  [Enhancement] Enable fast math optimizations in AMD flash attention a… (Dec 14, 2025)
8a1a9fd  [Enhancement] Add fast math support and improve LibraryGenerator func… (Dec 14, 2025)
ce2ea16  [Refactor] Simplify LibraryGenerator methods by removing redundant code (Dec 18, 2025)
0dee00a  Fix missing newline at end of file in LibraryGenerator class (Dec 18, 2025)
cb6af25  Merge branch 'main' into main (Alex4210987, Dec 18, 2025)