This repository contains the Android application for PowerInfer.
- Android Studio 2024.2.1 or later
- Clone the repo to your local machine using the following command:
git clone https://github.com/Si1w/EdgeInfer.git
- Open the project in Android Studio.
- Build and run the project in Android Studio.
OR
- Download the APK file from the release page.
- Open the file ./android/app/src/main/java/com/example/edgeinfer/MainActivity.java.
- Find the following code:
val models = listOf(
    Downloadable(
        name = "Bamboo 7B (Q4)",
        Uri.parse(
            "https://huggingface.co/PowerInfer/Bamboo-base-v0.1-gguf/" +
                "resolve/main/bamboo-7b-v0.1.Q4_0.powerinfer.gguf?download=true"
        ),
        File(getExternalFilesDir(null), "bamboo-7b-v0.1.Q4_0.powerinfer.gguf")
    ),
    Downloadable(
        name = "Bamboo DPO (Q4)",
        Uri.parse(
            "https://huggingface.co/PowerInfer/Bamboo-DPO-v0.1-gguf/" +
                "resolve/main/bamboo-7b-dpo-v0.1.Q4_0.powerinfer.gguf?download=true"
        ),
        File(getExternalFilesDir(null), "bamboo-7b-dpo-v0.1.Q4_0.powerinfer.gguf")
    )
)
TIPS: If the model is not supported by PowerInfer, its architecture (ARCH) code must first be added in llama.cpp. There is a similar example here.
- Add your own model by adding a new Downloadable object to the models list.
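As a rough illustration of that last step, the sketch below builds list entries from a Hugging Face repository ID and a GGUF file name, following the URL pattern of the two Bamboo entries above. It is a sketch under stated assumptions, not the app's actual code: `ModelEntry` and `huggingFaceModel` are hypothetical names, `java.net.URI` replaces `android.net.Uri` so the snippet runs outside Android, and `your-org/your-model-gguf` is a placeholder rather than a real repository.

```kotlin
import java.io.File
import java.net.URI

// Hypothetical stand-in for the app's Downloadable class, so this sketch runs
// without the Android SDK. The code in MainActivity uses android.net.Uri instead.
data class ModelEntry(val name: String, val source: URI, val destination: File)

// Builds an entry from a Hugging Face repo ID and a GGUF file name, using the
// same "resolve/main/<file>?download=true" pattern as the Bamboo entries.
fun huggingFaceModel(name: String, repo: String, fileName: String, filesDir: File): ModelEntry {
    require(fileName.endsWith(".gguf")) { "Expected a GGUF file, got $fileName" }
    val url = "https://huggingface.co/$repo/resolve/main/$fileName?download=true"
    return ModelEntry(name, URI(url), File(filesDir, fileName))
}

fun main() {
    // Stands in for getExternalFilesDir(null) in the Android app.
    val filesDir = File("models")
    val models = listOf(
        huggingFaceModel(
            "Bamboo 7B (Q4)",
            "PowerInfer/Bamboo-base-v0.1-gguf",
            "bamboo-7b-v0.1.Q4_0.powerinfer.gguf",
            filesDir
        ),
        // Placeholder entry: replace the repo ID and file name with your own model.
        huggingFaceModel(
            "My Model (Q4)",
            "your-org/your-model-gguf",
            "your-model.Q4_0.powerinfer.gguf",
            filesDir
        )
    )
    models.forEach { println("${it.name} -> ${it.source}") }
}
```

In the app itself, the equivalent change is to append another `Downloadable(...)` to `models`, passing `Uri.parse(url)` and `File(getExternalFilesDir(null), fileName)` as in the existing entries.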
More technical details can be found in the paper.
@misc{song2023powerinfer,
  title={PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU},
  author={Yixin Song and Zeyu Mi and Haotong Xie and Haibo Chen},
  year={2023},
  eprint={2312.12456},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}