EdgeInfer

This repository contains an Android application for PowerInfer.

Setup and Installation

Prerequisites

Android Studio 2024.2.1 or later

Installation

Clone the repo to your local machine using the following command:

git clone https://github.com/Si1w/EdgeInfer.git

Open the project in Android Studio.
Build and run the project in Android Studio.

Alternatively, download the APK file from the release page.

ADD YOUR OWN MODEL

Open the file ./android/app/src/main/java/com/example/edgeinfer/MainActivity.java.
Find the following code:

val models = listOf(
      Downloadable(
            name = "Bamboo 7B (Q4)",
            Uri.parse(
                  "https://huggingface.co/PowerInfer/Bamboo-base-v0.1-gguf/" +
                  "resolve/main/bamboo-7b-v0.1.Q4_0.powerinfer.gguf?download=true"
            ),
            File(getExternalFilesDir(null), "bamboo-7b-v0.1.Q4_0.powerinfer.gguf")
      ),
      Downloadable(
            name = "Bamboo DPO (Q4)",
            Uri.parse(
                  "https://huggingface.co/PowerInfer/Bamboo-DPO-v0.1-gguf/" +
                  "resolve/main/bamboo-7b-dpo-v0.1.Q4_0.powerinfer.gguf?download=true"
            ),
            File(getExternalFilesDir(null), "bamboo-7b-dpo-v0.1.Q4_0.powerinfer.gguf")
      )
)

TIP: If PowerInfer does not support the model's architecture, you need to add the architecture (ARCH) code to llama.cpp. There is a similar example here.

To add your own model, append a new Downloadable entry to the models list.
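The following is a minimal sketch of what a new entry could look like, written as plain JVM Kotlin so it runs outside Android. It makes several assumptions: the Downloadable data class here is a stand-in for the app's own class (which takes an android.net.Uri, replaced here by java.net.URI), huggingFaceModel is a hypothetical helper that is not part of the project, File("models") stands in for getExternalFilesDir(null), and the "your-org/your-model-gguf" repository and file name are placeholders to replace with your own.

```kotlin
import java.io.File
import java.net.URI

// Stand-in for the app's Downloadable class (assumption: the real class
// takes an android.net.Uri instead of java.net.URI).
data class Downloadable(val name: String, val source: URI, val destination: File)

// Hypothetical helper: builds an entry following the URL pattern of the
// existing entries, saving the download under the same file name locally.
fun huggingFaceModel(
    name: String,
    repo: String,
    fileName: String,
    filesDir: File
): Downloadable = Downloadable(
    name,
    URI("https://huggingface.co/$repo/resolve/main/$fileName?download=true"),
    File(filesDir, fileName)
)

fun main() {
    // Stands in for getExternalFilesDir(null) on Android.
    val filesDir = File("models")

    val models = listOf(
        // Existing entry from the README.
        huggingFaceModel(
            "Bamboo 7B (Q4)",
            "PowerInfer/Bamboo-base-v0.1-gguf",
            "bamboo-7b-v0.1.Q4_0.powerinfer.gguf",
            filesDir
        ),
        // New entry: placeholder repository and file name, replace with your own.
        huggingFaceModel(
            "My Model (Q4)",
            "your-org/your-model-gguf",
            "your-model.Q4_0.powerinfer.gguf",
            filesDir
        )
    )

    // Print each model's display name and download URL.
    models.forEach { println("${it.name} -> ${it.source}") }
}
```

In MainActivity itself, the equivalent entry would follow the existing ones: a display name, Uri.parse(...) with the download URL, and File(getExternalFilesDir(null), ...) with the local file name.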

Paper and Citation

More technical details can be found in the PowerInfer paper.

@misc{song2023powerinfer,
      title={PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU},
      author={Yixin Song and Zeyu Mi and Haotong Xie and Haibo Chen},
      year={2023},
      eprint={2312.12456},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
