Actions: huggingface/text-generation-inference

Showing runs from all workflows
7,295 workflow runs
server: use chunked inputs
Server Tests #1912: Pull request #1985 opened by danieldk
May 31, 2024 08:59 11m 25s feature/server-chunks
server: use chunked inputs
Automatic Documentation for Launcher #1204: Pull request #1985 opened by danieldk
May 31, 2024 08:59 6m 29s feature/server-chunks
server: use chunked inputs
Build and push docker image to internal registry #2664: Pull request #1985 opened by danieldk
May 31, 2024 08:59 3m 28s feature/server-chunks
Close stale issues and PRs
Close stale issues and PRs #178: Scheduled
May 31, 2024 01:47 20s main
Nightly load test
Nightly load test #273: Scheduled
May 31, 2024 00:19 10m 59s main
router: send the input as chunks to the backend
Server Tests #1911: Pull request #1981 synchronize by danieldk
May 30, 2024 12:31 13m 2s feature/chunked-input
router: send the input as chunks to the backend
Automatic Documentation for Launcher #1203: Pull request #1981 synchronize by danieldk
May 30, 2024 12:31 1m 34s feature/chunked-input
router: send the input as chunks to the backend
Build and push docker image to internal registry #2663: Pull request #1981 synchronize by danieldk
May 30, 2024 12:31 1h 59m 1s feature/chunked-input
router: send the input as chunks to the backend
Automatic Documentation for Launcher #1202: Pull request #1981 opened by danieldk
May 30, 2024 12:04 1m 37s feature/chunked-input
router: send the input as chunks to the backend
Build and push docker image to internal registry #2662: Pull request #1981 opened by danieldk
May 30, 2024 12:04 7m 10s feature/chunked-input
router: send the input as chunks to the backend
Server Tests #1910: Pull request #1981 opened by danieldk
May 30, 2024 12:04 13m 36s feature/chunked-input
Upload PR Documentation
Upload PR Documentation #470: completed by fxmarty
May 30, 2024 11:39 31s
Update documentation version to 2.0.4
Build PR Documentation #538: Pull request #1980 opened by fxmarty
May 30, 2024 11:38 37s fxmarty:update-version-doc
Update documentation version to 2.0.4
Automatic Documentation for Launcher #1201: Pull request #1980 opened by fxmarty
May 30, 2024 11:38 1m 36s fxmarty:update-version-doc
[Major Change][Undecided yet] Move to FlashDecoding instead of PagedAttention kernel.
Build and push docker image to internal registry #2661: Pull request #1940 synchronize by Narsil
May 30, 2024 09:51 1h 59m 41s flashdecoding
[Major Change][Undecided yet] Move to FlashDecoding instead of PagedAttention kernel.
Server Tests #1909: Pull request #1940 synchronize by Narsil
May 30, 2024 09:51 11m 39s flashdecoding
[Major Change][Undecided yet] Move to FlashDecoding instead of PagedAttention kernel.
Automatic Documentation for Launcher #1200: Pull request #1940 synchronize by Narsil
May 30, 2024 09:51 1m 30s flashdecoding
Gemma GPTQ checks: skip logprob checks
Build documentation #93: Commit 967ced2 pushed by danieldk
May 30, 2024 09:28 38s main
Gemma GPTQ checks: skip logprob checks
Build and push docker image to internal registry #2660: Commit 967ced2 pushed by danieldk
May 30, 2024 09:28 1h 7m 20s main
pages build and deployment
pages-build-deployment #658: by danieldk
May 30, 2024 09:28 38s
[Major Change][Undecided yet] Move to FlashDecoding instead of PagedAttention kernel.
Build and push docker image to internal registry #2659: Pull request #1940 synchronize by Narsil
May 30, 2024 09:25 28m 33s flashdecoding
[Major Change][Undecided yet] Move to FlashDecoding instead of PagedAttention kernel.
Automatic Documentation for Launcher #1199: Pull request #1940 synchronize by Narsil
May 30, 2024 09:25 1m 32s flashdecoding
[Major Change][Undecided yet] Move to FlashDecoding instead of PagedAttention kernel.
Server Tests #1908: Pull request #1940 synchronize by Narsil
May 30, 2024 09:25 10m 52s flashdecoding
Upload PR Documentation
Upload PR Documentation #469: completed by danieldk
May 30, 2024 07:12 32s
Add support for exl2-quantized models
Build PR Documentation #537: Pull request #1965 synchronize by danieldk
May 30, 2024 07:11 1m 7s feature/exl2