
optimum-ascend

Optimized inference with Ascend and Hugging Face

Note: This project is in an early development stage. Many features are not yet refined and lack testing.

Installation

Install Optimum with the ONNX Runtime accelerator

pip install --upgrade-strategy eager "optimum[onnxruntime]"

Install this repository

python -m pip install git+https://github.com/BrightXiaoHan/optimum-ascend.git

Note: It is recommended to install and run this repository inside a pre-built Ascend CANN container environment.
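For reference, such a container is typically started by mapping the NPU devices and the host driver into it. The following is a sketch: the image name is a placeholder, and the exact device paths depend on your driver installation (/dev/davinci0 is the first NPU):

docker run -it --rm \
    --device /dev/davinci0 \
    --device /dev/davinci_manager \
    --device /dev/devmm_svm \
    --device /dev/hisi_hdc \
    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver:ro \
    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi:ro \
    <your-cann-image> /bin/bash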

Quick Start

Models can be converted through the Optimum command-line interface:

optimum-cli export ascend -m moka-ai/m3e-base ./m3e-base-ascend --task feature-extraction --soc-version "Ascend310P3"

Note that you need to specify the correct SoC version; you can check it by running the npu-smi info command.
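The directory produced by the export can then be loaded without converting again. A minimal sketch, assuming from_pretrained accepts a local export directory, as it does for other Optimum backends:

from optimum.ascend import AscendModelForFeatureExtraction
from transformers import AutoTokenizer

# Load the tokenizer from the original checkpoint and the model from
# the directory produced by the optimum-cli command above.
tokenizer = AutoTokenizer.from_pretrained("moka-ai/m3e-base")
model = AscendModelForFeatureExtraction.from_pretrained("./m3e-base-ascend")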

Alternatively, a model hosted on the 🤗 hub can be converted and loaded in one step by passing export=True to from_pretrained:

from optimum.ascend import AscendModelForFeatureExtraction
from transformers import AutoTokenizer

MODEL_NAME = "moka-ai/m3e-base"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

# export=True converts the 🤗 Transformers model for Ascend at load time;
# max_batch_size and max_sequence_length bound the input shapes the
# exported model accepts.
model = AscendModelForFeatureExtraction.from_pretrained(
    MODEL_NAME,
    export=True,
    task="feature-extraction",
    max_batch_size=8,
    max_sequence_length=512,
)

# Tokenize to NumPy arrays, which the Ascend runtime consumes directly.
model_inputs = tokenizer(
    ["你好"],  # "hello"
    padding="longest",
    truncation=True,
    max_length=512,
    return_tensors="np",
)

outputs = model(**model_inputs)
om_output = outputs["sentence_embedding"]
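The returned sentence embeddings can be compared directly. A minimal NumPy sketch, reusing the model and tokenizer from the snippet above:

import numpy as np

# Embed two sentences in one batch (within max_batch_size).
model_inputs = tokenizer(
    ["你好", "很高兴见到你"],  # "hello", "nice to meet you"
    padding="longest",
    truncation=True,
    max_length=512,
    return_tensors="np",
)
embeddings = model(**model_inputs)["sentence_embedding"]

# L2-normalize so the dot product equals the cosine similarity.
embeddings = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
print(float(embeddings[0] @ embeddings[1]))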

Running the examples

Check out the examples directory to see how 🤗 Optimum Ascend can be used to optimize models and accelerate inference.

Do not forget to install the requirements for each example:

cd <example-folder>
pip install -r requirements.txt
