Issues: sgl-project/sglang
Issues list
[Bug] Decode Throughput Inconsistency Between bench_serving and Engine Logs
#3050
opened Jan 22, 2025 by
leepoly
5 tasks done
[Help wanted] Can't capture GPU activities using nsight system
#3049
opened Jan 22, 2025 by
sleepwalker2017
[Bug] Qwen2-VL-7B with sglang Performance Degradation
#3041
opened Jan 22, 2025 by
yileld
5 tasks done
[Feature] Support Beam Search
enhancement
New feature or request
#3032
opened Jan 21, 2025 by
laixinn
2 of 4 tasks
[Feature] FP8 weight only w8a16 quantization native support
quant
LLM Quantization
#3007
opened Jan 20, 2025 by
arunpatala
2 tasks done
What is the most efficient way to work with a 72B model and 8 * A100?
#3002
opened Jan 20, 2025 by
Chandler-Bing
[Feature] Add docs for Offline Engine token-in token-out
documentation
Improvements or additions to documentation
good first issue
Good for newcomers
RLHF
Using SGLang for post training
#2968
opened Jan 18, 2025 by
zhaochenyang20
2 tasks
[Feature] remove vllm _custom_ops
good first issue
Good for newcomers
help wanted
Extra attention is needed
high priority
#2965
opened Jan 18, 2025 by
zhyncs
7 tasks
[Bug] Regex isn't precluding parentheticals. And maybe more.
help wanted
Extra attention is needed
#2957
opened Jan 17, 2025 by
cinjon
5 tasks done
[Bug] JSONResponse fails if the probability distribution is very spiky.
#2955
opened Jan 17, 2025 by
cinjon
5 tasks done
[Feature] Add docs for local accuracy tests
documentation
Improvements or additions to documentation
good first issue
Good for newcomers
#2953
opened Jan 17, 2025 by
zhaochenyang20
2 tasks
[Feature] Enhancement on Sparse Attention and KV-Cache Compression
#2946
opened Jan 17, 2025 by
shadowpa0327
2 tasks done
[Bug] Unrecognized keys in rope_scaling for 'rope_type'='yarn': {'original_max_position_embeddings'}
help wanted
Extra attention is needed
#2943
opened Jan 17, 2025 by
rangehow
5 tasks done
[Feature] support EAGLE 2 with Triton Backend
good first issue
Good for newcomers
help wanted
Extra attention is needed
high priority
#2940
opened Jan 17, 2025 by
zhyncs
2 tasks
[Bug] KeyError: 'lm_head.weight' when loading quantized llama 3.2 3B and 1B models
#2935
opened Jan 17, 2025 by
arunpatala
5 tasks done
[Bug] def get_nvgpu_memory_capacity() causes crash on NVIDIA H100 MIG
#2933
opened Jan 17, 2025 by
dsingal0
5 tasks done
[Feature] (WIP) Support disaggregated serving to separate prefill and decoding
#2932
opened Jan 17, 2025 by
zhaohaidao
2 tasks
[Bug] [OpenAI compatible API] Chunks of tokens aren't being split into separate indexes when specifying n > 1 generations
documentation
Improvements or additions to documentation
#2912
opened Jan 16, 2025 by
accupham
5 tasks done