sgl-project / sglang Public

Issues: sgl-project/sglang

Development Roadmap (2024 Q4)

#1487 opened Sep 21, 2024 by Ying1123

Open 22

[Feature] DeepSeek V3 optimization

#2591 opened Dec 26, 2024 by zhyncs

Open 18

193 Open 726 Closed

Issues list

[Bug] Decode Throughput Inconsistency Between bench_serving and Engine Logs

#3050 opened Jan 22, 2025 by leepoly

5 tasks done

[Help wanted] CANN'T capture GPU activities using nsight system

#3049 opened Jan 22, 2025 by sleepwalker2017

[Feature] Reasoning model API support

#3043 opened Jan 22, 2025 by lambert0312

2 tasks done

[Bug] Qwen2-VL-7B with sglang Performance Degradation

#3041 opened Jan 22, 2025 by yileld

5 tasks done

[Feature]

#3040 opened Jan 22, 2025 by moxiegushi

2 tasks

[Feature] Support Beam Search

Labels: enhancement

#3032 opened Jan 21, 2025 by laixinn

2 of 4 tasks

Can router support --api-key parameter

Labels: router

#3031 opened Jan 21, 2025 by lambert0312

[Feature] FP8 weight only w8a16 quantization native support

Labels: quant

#3007 opened Jan 20, 2025 by arunpatala

2 tasks done

what is the most efficient way to do with a 72b model and 8 * A100 ?

#3002 opened Jan 20, 2025 by Chandler-Bing

[Feature] Add docs for Offline Engine token-in token-out

Labels: documentation, good first issue, RLHF

#2968 opened Jan 18, 2025 by zhaochenyang20

2 tasks

[Feature] remove vllm _custom_ops

Labels: good first issue, help wanted, high priority

#2965 opened Jan 18, 2025 by zhyncs

7 tasks

QVQ Prefill stage slow

#2961 opened Jan 18, 2025 by WuNein

[Bug] Regex isn't precluding parentheticals. And maybe more.

Labels: help wanted

#2957 opened Jan 17, 2025 by cinjon

5 tasks done

[Bug] JSONResponse fails if the probability distribution is very spiky.

#2955 opened Jan 17, 2025 by cinjon

5 tasks done

[Feature] Add docs for local accuracy tests

Labels: documentation, good first issue

#2953 opened Jan 17, 2025 by zhaochenyang20

2 tasks

[Feature] Enhancement on Sparse Attention and KV-Cache Compression

#2946 opened Jan 17, 2025 by shadowpa0327

2 tasks done

[Bug] Unrecognized keys in rope_scaling for 'rope_type'='yarn': {'original_max_position_embeddings'}

Labels: help wanted

#2943 opened Jan 17, 2025 by rangehow

5 tasks done

[Feature] support EAGLE 2 with Triton Backend

Labels: good first issue, help wanted, high priority

#2940 opened Jan 17, 2025 by zhyncs

2 tasks

[Bug] KeyError: 'lm_head.weight' when loading quantized llama 3.2 3B and 1B models

#2935 opened Jan 17, 2025 by arunpatala

5 tasks done

[Bug] def get_nvgpu_memory_capacity() causes crash on NVIDIA H100 MIG

#2933 opened Jan 17, 2025 by dsingal0

5 tasks done

[Feature] (WIP)Support disaggregated serving to separate prefill and decoding

#2932 opened Jan 17, 2025 by zhaohaidao

2 tasks

[Bug] tensor_model_parallel_all_reduce' is not defined

#2931 opened Jan 17, 2025 by bakch92

[Feature] Lora optimization

#2929 opened Jan 16, 2025 by Fridge003

2 of 11 tasks

Warning while running Deepseek-V3

Labels: amd

#2921 opened Jan 16, 2025 by ishaandatta

[Bug] [OpenAI compatible API] Chunks of tokens aren't being split into separate indexes when specifying n > 1 generations

Labels: documentation

#2912 opened Jan 16, 2025 by accupham

5 tasks done
