The DIY version (with gory details)

Follow the prerequisites below to set up your environment before running the code:

  1. Python 3.11: Set up a Python 3.11 virtual environment and install FMBench.

    python -m venv .fmbench
    source .fmbench/bin/activate
    pip install fmbench
    
  2. S3 buckets for test data, scripts, and results: Create two buckets within your AWS account:

    • Read bucket: This bucket contains the tokenizer files, prompt template, source data, and deployment scripts, stored in the directory structure shown below. FMBench needs read access to this bucket.

      s3://<read-bucket-name>
          ├── source_data/
          │   └── <source-data-file-name>.json
          ├── prompt_template/
          │   └── prompt_template.txt
          ├── scripts/
          │   └── <deployment-script-name>.py
          └── tokenizer/
              ├── tokenizer.json
              └── config.json
      
      • The details of the bucket structure are as follows:

        1. Source Data Directory: Create a source_data directory that stores the dataset you want to benchmark with. FMBench uses Q&A datasets from the LongBench dataset or, alternatively, from this link. Support for bringing your own dataset will be added soon.

          • Download the files specified in the LongBench dataset into the source_data directory. The following is a good list to get started with (a download sketch follows the list):

            • 2wikimqa
            • hotpotqa
            • narrativeqa
            • triviaqa

            Store these files in the source_data directory.
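
          As a convenience, the subsets above can be pulled with the Hugging Face datasets library and written out under source_data. The sketch below is only illustrative: the trust_remote_code flag, the .json file names, and the JSON Lines output format are assumptions, so check what your FMBench configuration expects.

            import os
            from datasets import load_dataset

            # Sketch: fetch a few LongBench subsets and write each one as a JSON
            # Lines file under source_data/ (file names are illustrative).
            os.makedirs("source_data", exist_ok=True)
            for name in ["2wikimqa", "hotpotqa", "narrativeqa", "triviaqa"]:
                ds = load_dataset("THUDM/LongBench", name, split="test", trust_remote_code=True)
                ds.to_json(os.path.join("source_data", f"{name}.json"))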

        2. Prompt Template Directory: Create a prompt_template directory that contains a prompt_template.txt file. This .txt file contains the prompt template that your specific model supports. FMBench already provides a prompt template compatible with Llama models; an illustrative example is shown below.
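
          For illustration, a Llama 2 chat-style template could be written out as in the sketch below; the {context} and {question} placeholder names are assumptions and should match whatever your dataset and FMBench configuration expect:

            import pathlib

            # Illustrative Llama 2 chat-style prompt template; the {context} and
            # {question} placeholders are assumptions, not FMBench's exact format.
            template = (
                "<s>[INST] <<SYS>>\n"
                "You are a helpful assistant. Answer the question using only the provided context.\n"
                "<</SYS>>\n\n"
                "Context: {context}\n\n"
                "Question: {question} [/INST]"
            )
            pathlib.Path("prompt_template").mkdir(exist_ok=True)
            pathlib.Path("prompt_template/prompt_template.txt").write_text(template)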

        3. Scripts Directory: FMBench also supports a bring your own script (BYOS) mode for deploying models that are not natively available via SageMaker JumpStart, i.e. anything not included in this list. Here are the steps to use BYOS:

          1. Create a Python script to deploy your model on a SageMaker endpoint. This script needs to have a deploy function that 2_deploy_model.ipynb can invoke. See p4d_hf_tgi.py for reference.

          2. Place your deployment script in the scripts directory in your read bucket. If your script deploys a model directly from HuggingFace and needs access to a HuggingFace auth token, create a file called hf_token.txt and put the auth token in that file. The .gitignore file in this repo has rules to not commit hf_token.txt to the repo. Today, FMBench provides inference scripts for several model serving options out of the box.

            Deployment scripts for these options are available in the scripts directory; you can use them as a reference for creating your own deployment scripts.
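
          For orientation only, a deployment script might look like the sketch below. The deploy function's signature and return value are assumptions (use p4d_hf_tgi.py as the authoritative reference), and the image version, environment variables, and instance type are placeholders:

            from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

            def deploy(experiment_config: dict, role_arn: str) -> dict:
                """Sketch of a BYOS deploy function: host a Hugging Face model on a
                SageMaker endpoint and return its name. The signature and return
                value here are assumptions; match what 2_deploy_model.ipynb expects."""
                model = HuggingFaceModel(
                    role=role_arn,
                    image_uri=get_huggingface_llm_image_uri("huggingface", version="1.1.0"),
                    env={
                        "HF_MODEL_ID": experiment_config["model_id"],  # model to pull from HuggingFace
                        "SM_NUM_GPUS": "8",                            # depends on the instance type
                    },
                )
                predictor = model.deploy(
                    initial_instance_count=1,
                    instance_type=experiment_config.get("instance_type", "ml.p4d.24xlarge"),
                    container_startup_health_check_timeout=600,
                )
                return {"endpoint_name": predictor.endpoint_name}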

        4. Tokenizer Directory: Place the tokenizer.json, config.json and any other files required for your model's tokenizer in the tokenizer directory. The tokenizer for your model should be compatible with the tokenizers package. FMBench uses AutoTokenizer.from_pretrained to load the tokenizer.

          As an example, to use the Llama 2 tokenizer for counting prompt and generation tokens for the Llama 2 family of models: accept the license via the Meta approval form, then download the tokenizer.json and config.json files from the Hugging Face website and place them in the tokenizer directory.
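
          To sanity-check the tokenizer files before uploading them, you can load them locally the same way FMBench does; a minimal sketch (the prompt text is just an example):

            from transformers import AutoTokenizer

            # Load the tokenizer from the local tokenizer/ directory (the same files
            # that get uploaded to the read bucket) and count tokens for a prompt.
            tokenizer = AutoTokenizer.from_pretrained("tokenizer")
            prompt = "What is Amazon SageMaker?"
            print(f"prompt token count: {len(tokenizer.encode(prompt))}")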

    • Write bucket: All prompt payloads, model endpoint information, and metrics generated by FMBench are stored in this bucket. FMBench requires write permissions on this bucket to store the results. No directory structure needs to be pre-created in this bucket; everything is created by FMBench at runtime. A boto3 sketch for creating both buckets follows the layout below.

      s3://<write-bucket-name>
          └── <test-name>/
              └── data/
                  ├── metrics/
                  ├── models/
                  └── prompts/
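
    If you prefer to script the bucket setup, a minimal boto3 sketch is shown below; the bucket names and region are placeholders, and buckets outside us-east-1 additionally need a CreateBucketConfiguration with a LocationConstraint:

      import os
      import boto3

      region = "us-east-1"                   # placeholder region
      read_bucket = "<read-bucket-name>"     # placeholder, must be globally unique
      write_bucket = "<write-bucket-name>"   # placeholder, must be globally unique

      s3 = boto3.client("s3", region_name=region)

      # Create both buckets (add CreateBucketConfiguration for regions other than us-east-1).
      for bucket in (read_bucket, write_bucket):
          s3.create_bucket(Bucket=bucket)

      # Upload the local read-bucket layout created in the steps above.
      for prefix in ("source_data", "prompt_template", "scripts", "tokenizer"):
          for root, _, files in os.walk(prefix):
              for name in files:
                  path = os.path.join(root, name)
                  s3.upload_file(path, read_bucket, path.replace(os.sep, "/"))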