Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch dev Command to Use llamafile #559

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open

Conversation

gorkem
Copy link
Contributor

@gorkem gorkem commented Oct 24, 2024

This pull request updates the dev command to utilize llamafile. It introduces two methods for packaging the harness:

Download Mode (Default): Downloads harness components dynamically first time the dev command is invoked.
Offline Mode: Prepackages harness components for offline use, allowing development without internet access.

  • Introduces a new build tag embed_harness to compile binaries specifically for offline mode. This mode embeds the llamafile and UI assets directly into the binary

  • Modifies the existing build process to generate both default and offline mode binaries. Produces separate archives for each mode.

  • go generate command downloads the llamafile for the specified version. Prepares the downloaded llamafile for both embedding (offline mode) and uploading (download mode).

  • Introduces a new script to streamline the uploading of llamafile binaries to the designated download location.

What is missing

  • Needs more testing on Wins
  • llamafile upload has not been automated
  • no download indicators for the default mode.

Related Issues

@gorkem gorkem requested a review from amisevsk October 24, 2024 16:42
@@ -4,7 +4,7 @@
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think some formatter replaced these spaces with a tab

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

go fmt does it somehow inconsistently

@@ -31,6 +42,20 @@ archives:
files:
- LICENSE
- README.md
- id: kit-archive-offile
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- id: kit-archive-offile
- id: kit-archive-offline

Comment on lines +33 to +56
OWNER=$1
REPO=$2
RELEASE_TAG=$3
ASSET_NAME=$4
OUTPUT_DIR=$5

echo "Downloading asset '${ASSET_NAME}' from release '${RELEASE_TAG}' of repository '${OWNER}/${REPO}'"

DOWNLOAD_URL="https://github.com/${OWNER}/${REPO}/releases/download/${RELEASE_TAG}/llamafile-${RELEASE_TAG}"

mkdir -p ${OUTPUT_DIR}

curl -L -o ${OUTPUT_DIR}/llamafile ${DOWNLOAD_URL}

echo "Asset '${ASSET_NAME}' downloaded successfully to: ${OUTPUT_DIR}/${ASSET_NAME}"

echo "${RELEASE_TAG}" > ${OUTPUT_DIR}/llamafile.version

COMPRESSED_FILE="llamafile.tar.gz"
compress ${OUTPUT_DIR} ../${COMPRESSED_FILE}

echo "Compressed asset saved to: ${COMPRESSED_FILE}"
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The usage of $ASSET_NAME is not clear here -- it looks like it's actually hard-coded to llamafile in commands. Maybe the argument can just be dropped?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was we could use it to download other binaries or checksums. We did not need that at the moment but kept it just in case.

Comment on lines +97 to +102
cmd = exec.Command("sh", "-c",
fmt.Sprintf("./llamafile --server --model %s --host %s --port %d --path %s --nobrowser --unsecure",
modelPath, harness.Host, harness.Port, uiHome),
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be wrapped in a sh -c?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, because of the way llamafile works.

pkg/lib/harness/llm-harness.go Outdated Show resolved Hide resolved
Comment on lines +148 to +151
_, err = io.Copy(out, resp.Body)
if err != nil {
return fmt.Errorf("failed to write to file %s: %w", filePath, err)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the response includes a size for the body, you could probably get a decent-looking progress bar by wrapping resp.Body via WrapReadCloser (though the info line would probably be wrong

Uses llamafile for harness functionality. Introduces
downloading the harness in addition to embedding to
allow for smaller binaries. Removes llama.cpp submodule
Adds script for uploading llamafile to downloads.
Updates release scripts to include both embedded (offline)
and download version.
Adds checksum download and verify. Also minor
improvements to error handling
Removes the wins/linux restriction. Fixes the
archive package for wins offline.
Fix the windows build archive and the exec command.
Minor updates to address review comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

dev command for windows dev-command for linux
2 participants