Description
Feature request
Hi!
Currently, microsoft/Florence-2-large-ft and related models cannot be loaded with the HF pipeline("image-to-text"), because their config class is not recognised by AutoModelForVision2Seq.
When attempting to load it, Transformers raises:
“Unrecognised configuration class Florence2Config for this kind of AutoModel: AutoModelForVision2Seq.”
Florence-2 also requires trust_remote_code=True to be passed to the loading functions.
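For reference, here is a minimal sketch of the call that fails for me. The helper name `try_florence2_pipeline` is only for illustration. On versions where Florence-2 is not mapped to the pipeline, I would expect it to return the error message quoted above. The exact wording may differ between Transformers versions.

```python
def try_florence2_pipeline(model_id="microsoft/Florence-2-large-ft"):
    """Try to build the image-to-text pipeline for Florence-2.

    Returns the error message if construction fails, otherwise None.
    The helper name is hypothetical; only the pipeline() call matters.
    """
    # Imported lazily so that defining the helper does not require
    # transformers to be importable up front.
    from transformers import pipeline

    try:
        # trust_remote_code=True is required because Florence-2 ships
        # its modelling code inside the model repository.
        pipeline("image-to-text", model=model_id, trust_remote_code=True)
    except Exception as exc:  # broad on purpose: we only report the message
        return str(exc)
    return None


# Example call (downloads the model files on first use):
# print(try_florence2_pipeline())
```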
The current standard approach is to load Florence-2 with AutoModelForCausalLM and AutoProcessor, but this adds a separate code path if you are already using pipeline. LoRA support also works well. Having these models available in the pipeline would be an amazing addition for the tasks they are capable of.
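For context, the manual flow I am using today looks roughly like the sketch below, adapted from the usage pattern on the model card. The helper name `run_florence2`, the default `<CAPTION>` task, and the generation settings (`max_new_tokens=256`, `num_beams=3`) are my own choices for illustration, not anything required by the model.

```python
def run_florence2(image, task="<CAPTION>",
                  model_id="microsoft/Florence-2-large-ft"):
    """Run one Florence-2 task prompt on a PIL image without pipeline().

    The helper name and defaults are illustrative assumptions.
    """
    # Imported lazily so that defining the helper stays cheap.
    import torch
    from transformers import AutoModelForCausalLM, AutoProcessor

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    # Both loaders need trust_remote_code=True for Florence-2.
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=dtype, trust_remote_code=True
    ).to(device)
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

    # The task prompt (e.g. "<CAPTION>" or "<OD>") selects the behaviour.
    inputs = processor(text=task, images=image, return_tensors="pt")
    inputs = inputs.to(device, dtype)

    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=256,
        num_beams=3,
        do_sample=False,
    )
    # Keep special tokens: the post-processing step parses them.
    text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    return processor.post_process_generation(
        text, task=task, image_size=(image.width, image.height)
    )


# Example call ("example.jpg" is a placeholder path):
# from PIL import Image
# print(run_florence2(Image.open("example.jpg").convert("RGB")))
```

With pipeline support, all of this would ideally collapse into a single pipeline(...) call.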
Thanks!
Model:
https://huggingface.co/microsoft/Florence-2-large
Motivation
Adding pipeline support for these models would open up another great set of task options while lowering the barrier to entry, since the pipeline simplifies writing and reusing code. (For people like me!)
Thanks again for all the amazing work.
Your contribution
I can test any proposed updates.