
feat: CompletionProvider API #379

Status: Open · wants to merge 105 commits into base branch main

Conversation

@FFFiend (Collaborator) commented Jul 12, 2023

What kind of change does this PR introduce?
Adds a rough sketch of what the infrastructure surrounding the models we use might look like. Addresses #252, #69, and #368, and may also be tangentially or directly relevant to other issues.

Checklist

  • My code follows the style guidelines of OpenAdapt
  • I have performed a self-review of my code
  • If applicable, I have added tests to prove my fix is functional/effective
  • I have linted my code locally prior to submission
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (e.g. README.md, requirements.txt)
  • New and existing unit tests pass locally with my changes

How can your code be run and tested?

Other information

@OpenAdaptAI OpenAdaptAI deleted a comment from cr-gpt bot Jul 12, 2023
Comment on lines 1 to 2 (Collaborator):

```python
PRIORITY = ["CPU", "GPU", "HOSTED"]
from loguru import logger
```

Suggested change (imports should come before module-level constants):

```python
from loguru import logger

PRIORITY = ["CPU", "GPU", "HOSTED"]
```
@abrichr (Contributor) commented Jul 18, 2023

Thanks for getting started on this, @FFFiend!

What do you think about:

```python
class Modality:
    TEXT = "TEXT"
    IMAGE = "IMAGE"

MODALITY_BY_DB_MODEL = {
    models.ActionEvent: Modality.TEXT,
    models.Screenshot: Modality.IMAGE,
    models.WindowEvent: Modality.TEXT,
}

ML_MODELS = [
    {
        "name": "gpt-4",
        "provider": openadapt.ml.openai,
        # "tune_func": openadapt.ml.openai.gpt4.tune,
        # "infer_func": openadapt.ml.openai.gpt4.infer,
        "modalities": openadapt.ml.openai.gpt4.MODALITIES,  # gpt4/__init__.py: MODALITIES = [Modality.TEXT]
    },
    {
        "name": "BLIP2",
        "provider": openadapt.ml.huggingface,
        # "tune_func": openadapt.ml.huggingface.blip2.tune,
        # "infer_func": openadapt.ml.huggingface.blip2.infer,
        "modalities": openadapt.ml.huggingface.blip2.MODALITIES,  # huggingface/__init__.py: MODALITIES = [Modality.TEXT, Modality.IMAGE]
    },
    ...
]


def select_ml_model(recording_ids):
    recording_data = get_data_by_modality(recording_ids)
    for ml_model in ML_MODELS:
        # Evaluate the performance of the given models autoregressively on an
        # entire recording (optionally with start_timestamp/end_timestamp
        # parameters for evaluating on a section of a recording).
        ...
```
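A minimal, self-contained sketch of how the modality-based filtering in `select_ml_model` might work. The registry entries, the function signature, and the filtering criterion are all assumptions for illustration; the real implementation would also evaluate performance per recording:

```python
class Modality:
    TEXT = "TEXT"
    IMAGE = "IMAGE"

# Hypothetical registry: each entry declares the modalities the model supports.
ML_MODELS = [
    {"name": "gpt-4", "modalities": [Modality.TEXT]},
    {"name": "BLIP2", "modalities": [Modality.TEXT, Modality.IMAGE]},
]

def select_ml_model(recording_modalities: list[str]) -> list[dict]:
    """Return the models that support every modality present in the recording."""
    return [
        model
        for model in ML_MODELS
        if all(m in model["modalities"] for m in recording_modalities)
    ]

# A recording containing screenshots requires an image-capable model,
# so only BLIP2 qualifies here.
candidates = select_ml_model([Modality.TEXT, Modality.IMAGE])
```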

@FFFiend (Collaborator, Author) commented Aug 30, 2023

> @FFFiend it looks like this PR includes several unrelated changes (e.g. alembic/versions/5139d7df38f6_add_recording_task_description.py).
>
> Can you please remove all unrelated code? 🙏

@abrichr removed all unrelated code; however, the privacy provider tests (fixed in #488) are failing, so I'm just waiting on that to get merged 👍

@KrishPatel13 (Collaborator) commented:

@FFFiend #488 is merged now. Please update the branch, then continue :-)

@FFFiend FFFiend requested a review from abrichr August 30, 2023 19:10
@FFFiend (Collaborator, Author) commented Aug 30, 2023

@abrichr Ready to merge.

Review comment (Contributor) on:

```python
class CompletionProviderFactory:
    """CompletionProviderFactory class."""

    def get_for_modality(modality: Modality) -> list[CompletionProvider]:
```

Please add @staticmethod.
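A self-contained sketch of the factory with @staticmethod applied. The Modality class and the provider subclasses here are simplified stand-ins for the real ones in this PR, and `get_children` is assumed to return the registered subclasses:

```python
class Modality:
    TEXT = "TEXT"
    IMAGE = "IMAGE"

class CompletionProvider:
    Modalities: list[str] = []

    @classmethod
    def get_children(cls) -> list[type["CompletionProvider"]]:
        """Return all registered provider subclasses."""
        return cls.__subclasses__()

class TextProvider(CompletionProvider):
    Modalities = [Modality.TEXT]

class ImageProvider(CompletionProvider):
    Modalities = [Modality.IMAGE]

class CompletionProviderFactory:
    """CompletionProviderFactory class."""

    @staticmethod
    def get_for_modality(modality: str) -> list[type[CompletionProvider]]:
        """Get all available CompletionProvider classes supporting the modality."""
        return [
            provider
            for provider in CompletionProvider.get_children()
            if modality in provider.Modalities
        ]
```

With @staticmethod, callers can write `CompletionProviderFactory.get_for_modality(Modality.TEXT)` without instantiating the factory.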

Review comment (Contributor) on:

```python
    """To be raised when a model is not supported by the infrastructure."""

    def __init__(self, model_name: str) -> None:
        """Init method for the ModelNotImplementedError class."""
        super().__init__(f"{model_name} is not currently supported")
```

Please remove.


Review comment (Contributor) on:

```python
class GpuNotAvailableError(BaseException):
```

Please rename to GPUNotAvailableError.

Review comment (Contributor) on:

```python
"""HuggingFace ml package."""
```

Can we please name the submodule openadapt.ml.huggingface?

Review comment (Contributor) on:

```python
    Capability.INFERENCE,
]
Modalities: list[Modality] = [Modality.TEXT]
Availabilities: list[Availability] = [Availability.HOSTED]
```

Should this be `Availabilities: list[Availability] = [Availability.LOCAL_CPU, Availability.LOCAL_GPU]`?

We would need a separate provider for HOSTED models.

Can you please rename this LocalHuggingFaceProvider, and add a TODO above the class to implement one or more HostedHuggingFaceProvider to support Inference API and Inference Endpoints at https://huggingface.co/docs/huggingface_hub/guides/inference?
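A sketch of the requested local/hosted split. The class name LocalHuggingFaceProvider and the TODO come from the review above; the Availability enum values are assumed to match the PR's:

```python
from enum import Enum

class Availability(str, Enum):
    """Simplified stand-in for the PR's Availability enum."""
    LOCAL_CPU = "LOCAL_CPU"
    LOCAL_GPU = "LOCAL_GPU"
    HOSTED = "HOSTED"

# TODO: implement one or more HostedHuggingFaceProvider classes to support the
# Inference API and Inference Endpoints
# (https://huggingface.co/docs/huggingface_hub/guides/inference).
class LocalHuggingFaceProvider:
    """Runs HuggingFace models on local hardware only."""

    Availabilities: list[Availability] = [
        Availability.LOCAL_CPU,
        Availability.LOCAL_GPU,
    ]
```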

Review comment by @abrichr (Contributor), Aug 30, 2023, on:

```python
    prompt: str,
    model_path: str,
    task_description: str,
    use_pipeline: bool = True,
```

Please remove if unused 🙏

Review comment (Contributor) on:

```python
    self,
    prompt: str,
    model_path: str,
    task_description: str,
```

Please rename to huggingface_task to differentiate it from https://github.com/OpenAdaptAI/OpenAdapt/blob/main/openadapt/models.py#L46

Review comment (Contributor) on:

```python
    def finetune(self, prompt_completion_pair: list[dict[str]]) -> None:
        """Fine-tune the GPT model."""
        # waiting on FineTuning PR which is currently
        # a work in progress.
```

Please modify this comment to say:

TODO: implement once fine-tuning is merged

Review comment by @abrichr (Contributor), Aug 30, 2023, on:

```python
valid_inference_models = openai.Model.list()["data"]
for model_dict in valid_inference_models:
    if model_dict["id"] == gpt_model_name:
        if gpt_model_name == "gpt-4" or gpt_model_name == "gpt-3.5-turbo":
```

Suggested change:

```python
if gpt_model_name in CHAT_MODEL_NAMES:
```

Ideally we would not have to define CHAT_MODEL_NAMES from scratch but load it from the openai library.
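A minimal sketch of the suggested membership check. CHAT_MODEL_NAMES is a hypothetical constant here; deriving it at runtime from `openai.Model.list()` would require an API key and network access, so a hardcoded fallback is shown instead:

```python
# Hypothetical constant; ideally loaded from the openai library instead of
# being defined from scratch.
CHAT_MODEL_NAMES = ["gpt-4", "gpt-3.5-turbo"]

def is_chat_model(gpt_model_name: str) -> bool:
    """Replace the chained equality checks with a single membership test."""
    return gpt_model_name in CHAT_MODEL_NAMES
```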

Review comment (Contributor) on:

```python
"""Conda Image creation tool."""
```

Can you please move this to openadapt/ml/utils/modal.py?

Review comment (Contributor) on:

```python
from openadapt.ml.huggingface_models.generic_provider import GenericHuggingFaceProvider
from openadapt.ml.open_ai.gpt.gptprovider import GPTCompletionProvider

test_gpt_provider = GPTCompletionProvider()
```

Can you please instantiate these inside of a @pytest.fixture?
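A sketch of the requested change. A stub class stands in for GPTCompletionProvider so the example is self-contained; the real test would import it from openadapt.ml.open_ai.gpt.gptprovider as in the snippet above:

```python
import pytest

class GPTCompletionProvider:
    """Stub standing in for the real GPTCompletionProvider."""
    Name = "GPT Completion Provider"

def make_gpt_provider() -> GPTCompletionProvider:
    """Construction logic shared by the fixture and any direct callers."""
    return GPTCompletionProvider()

@pytest.fixture
def gpt_provider() -> GPTCompletionProvider:
    """Provide a fresh GPTCompletionProvider to each test that requests it."""
    return make_gpt_provider()

def test_gpt_provider_name(gpt_provider: GPTCompletionProvider) -> None:
    # pytest injects the fixture's return value by parameter name.
    assert gpt_provider.Name == "GPT Completion Provider"
```

Using a fixture keeps construction out of module scope, so importing the test module stays cheap and each test gets a fresh instance.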

Review comment by @FFFiend (Collaborator, Author), Aug 31, 2023, on:

```python
@staticmethod
def get_for_modality(modality: Modality) -> list[CompletionProvider]:
    """Gets all available CompletionProviders with the given modality."""
    completion_providers = CompletionProvider.get_children()
```

@abrichr On adding self to the CompletionProvider class, one would have to pass the CompletionProvider parameters that adhere to its attribute types here inside the factory, or else it throws the following error:

```
E   pydantic.error_wrappers.ValidationError: 4 validation errors for CompletionProvider
E   Name
E     field required (type=value_error.missing)
E   Capabilities
E     field required (type=value_error.missing)
E   Modalities
E     field required (type=value_error.missing)
E   Availabilities
E     field required (type=value_error.missing)

pydantic/main.py:341: ValidationError
```

So we would be generating a CompletionProvider object with specific parameters. I'm not sure whether that adheres to the factory pattern: https://www.digitalocean.com/community/tutorials/factory-design-pattern-in-java
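A self-contained illustration of the error being described, assuming a simplified pydantic model with the same four required fields (the real CompletionProvider in this PR may differ):

```python
from pydantic import BaseModel, ValidationError

class CompletionProvider(BaseModel):
    """Simplified stand-in for the PR's pydantic-based CompletionProvider."""
    Name: str
    Capabilities: list[str]
    Modalities: list[str]
    Availabilities: list[str]

# Instantiating without arguments fails validation: all four fields are
# required, which is why the factory must supply concrete parameters.
missing_fields = []
try:
    CompletionProvider()
except ValidationError as exc:
    missing_fields = [err["loc"][0] for err in exc.errors()]
```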

@FFFiend FFFiend requested a review from abrichr August 31, 2023 20:26
@abrichr (Contributor) commented Dec 13, 2023

This is now stale and needs to be updated to support images.

Related: #527

@FFFiend do you have any bandwidth? 🙏

@abrichr (Contributor) commented Dec 14, 2023

@FFFiend would love your thoughts on #534 🙏 😄

Labels: none yet · Projects: none yet · 5 participants