NVIDIA / Fuser Public

Pull requests: NVIDIA/Fuser

Labels 49 Milestones 0

162 Open 3,614 Closed

Pull requests list

Vectorize fp4 even if the last dim is not contiguous

#4669 opened Jun 24, 2025 by zasdfgbnm

FP4 support for discontiguous tensors

#4668 opened Jun 24, 2025 by zasdfgbnm

FP4 support in pointwise scheduler

#4666 opened Jun 24, 2025 by zasdfgbnm • Draft

Add new data type Float8_e8m0fnu

#4665 opened Jun 24, 2025 by zasdfgbnm

fix warp specialized tma for ln bwd

#4663 opened Jun 24, 2025 by liqiangxl

Create nvfp4 grouped gemm bindings in direct bindings • Labels: Cutlass, Matmuls, Thunder-Inference-Demo

#4662 opened Jun 24, 2025 by rdspring1 • Draft

2 tasks

Removing the 2D assumption of index put accumulate

#4660 opened Jun 19, 2025 by naoyam

Unify propagation rules in propagateShardingsPass

#4652 opened Jun 17, 2025 by Priya2698

Host IR LLVM Lowering 1: Build Config Change & Initial Allocate support

#4651 opened Jun 17, 2025 by wolfcomos

Adding a new benchmark for softmax fwd to track performance for large inner dimensions

#4650 opened Jun 17, 2025 by protonu • Draft

[RFC] Add Cutlass MXFP8 Grouped Gemm to nvfuser_direct python bindings • Labels: Cutlass, Matmuls, Thunder-Inference-Demo

#4649 opened Jun 17, 2025 by rdspring1

Option to run benchmarks with nsys

#4648 opened Jun 16, 2025 by Priya2698

Implement Warp-Specialized Ping-Pong Matmul • Labels: Matmuls

#4646 opened Jun 16, 2025 by rdspring1 • Draft

Fix condition for smem limit in kernel executor

#4644 opened Jun 16, 2025 by jacobhinkle • Draft

ws tma normalization dynamic shape

#4638 opened Jun 13, 2025 by liqiangxl • Draft

Add fused Embedding and RMSNorm benchmarks • Labels: Python Benchmarks

#4637 opened Jun 13, 2025 by IvanYashchuk

Add embedding_indexing benchmark and Llama 4 Maverick configuration • Labels: Python Benchmarks

#4636 opened Jun 13, 2025 by IvanYashchuk

Use CUPTI-python in lieu of torch.profiler

#4614 opened Jun 11, 2025 by Priya2698

[WIP] Always do CGA split in persistent Hopper matmul

#4610 opened Jun 10, 2025 by jacobhinkle • Draft

[WIP] Inline stmatrix with TMA store

#4609 opened Jun 10, 2025 by jacobhinkle • Draft

auto select between warp specialized and multi-wave approaches

#4603 opened Jun 9, 2025 by liqiangxl

[DO NOT REVIEW] Adding index_shuffling

#4588 opened Jun 6, 2025 by jjsjann123 • Draft

Privatize squeeze in addition to upcast

#4583 opened Jun 5, 2025 by protonu

LLVM lowering

#4581 opened Jun 5, 2025 by wolfcomos

Add option to insert resharding after

#4574 opened Jun 4, 2025 by samnordmann
