pytorch / FBGEMM Public

Pull requests: pytorch/FBGEMM

424 Open 3,293 Closed

Pull requests list

Add a workaround for stochastic rounding for AMD GPUs cla signed fb-exported

#3908 opened Apr 1, 2025 by sryap

Debug stochastic rounding issue 2 cla signed fb-exported

#3907 opened Apr 1, 2025 by sryap

Debug stochastic rounding issue cla signed fb-exported

#3906 opened Apr 1, 2025 by sryap

[fbgemm_gpu] Add docs for GenAI package cla signed

#3905 opened Mar 31, 2025 by q10

Enable slow accumulation in fp8 grouped gemm cla signed fb-exported

#3904 opened Mar 31, 2025 by jwfromm

Handle 0 inputs for gmm cla signed fb-exported

#3901 opened Mar 31, 2025 by jasonjk-park

support permute_multi_embedding_function on torch.export cla signed fb-exported

#3897 opened Mar 28, 2025 by zejunh

FBGEMM fp8 ck GEMM fix for irregular GEMM shapes cla signed fb-exported

#3894 opened Mar 28, 2025 by zjing14

Debug A100 too many resources requested for launch issue cla signed fb-exported

#3893 opened Mar 28, 2025 by jianyuh

Add NEON transpose kernel for half-precision cla signed

#3892 opened Mar 28, 2025 by skykongkong8

[fbgemm_gpu] Update Nova jobs cla signed

#3890 opened Mar 27, 2025 by q10

[fbgemm_gpu][DO NOT MERGE] Build for vLLM cla signed

#3887 opened Mar 26, 2025 by q10

Integrate D71065405 and D71079311 into stochastic rounding cla signed fb-exported

#3882 opened Mar 26, 2025 by q10

nested dispatching of segment_csr on cpu/gpu cla signed fb-exported

#3881 opened Mar 26, 2025 by jeetkanjani7

Improve Fused8BitRowwiseQuantizedSBFloatToFloatOrHalfNeon by 2%-10% cla signed fb-exported

#3879 opened Mar 25, 2025 by Nicoshev

int4 kv cla signed fb-exported

#3878 opened Mar 25, 2025 by Aya-ZIbra

Back out "Replace LR access with wrapper" cla signed fb-exported

#3857 opened Mar 20, 2025 by spcyppt

Fix CUDA kernel index data type in deeplearning/fbgemm/fbgemm_gpu/fb/src/programmable_kernel/op_merge_bucketized_dense.cuh +10 cla signed fb-exported

#3843 opened Mar 18, 2025 by r-barnes

Refactoring of NoPE cla signed fb-exported

#3840 opened Mar 17, 2025 by Aya-ZIbra

Cleanups for the EEG-based TBE benchmark CLI, pt 3 cla signed fb-exported

#3837 opened Mar 17, 2025 by q10

Revert D71179541: Multisect successfully blamed "D71179541: [fmoe] update the sorting kernel for bf16 ck fmoe kernel" for one test failure cla signed fb-exported

#3833 opened Mar 17, 2025 by zjing14

Comment out cache_weights assignment cla signed fb-exported

#3824 opened Mar 15, 2025 by sryap

Set cache_precision = FP16 cla signed fb-exported

#3823 opened Mar 15, 2025 by sryap

Unifying TBE API using List (Frontend) - reland ci-no-td cla signed fb-exported

#3821 opened Mar 14, 2025 by spcyppt

[CUTLASS] Roll cutlass version back a bit to hopefully fix compilation errors. cla signed

#3816 opened Mar 14, 2025 by jwfromm
