Question about LLaVA-NeXT-Video fine-tuning example

Hello,

I tried following your example shown here: https://github.com/NielsRogge/Transformers-Tutorials/blob/master/LLaVA-NeXT-Video/Fine_tune_LLaVa_NeXT_Video_with_HFTrainer.ipynb

Without changing a line of code, the tutorial currently fails with the following error:

ValueError: Video features and video tokens do not match: tokens: 1004, features 4608
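
In case it helps narrow things down, here is a rough sanity check of the two numbers in the error. It is only a sketch: `frames_implied` is a hypothetical helper (not part of transformers or the notebook), and the value of 144 features per frame is my assumption (a 24x24 patch grid pooled with stride 2 to 12x12), which I have not verified against the model config.

```python
# Hypothetical helper for reasoning about the error message; not a library API.
# ASSUMPTION: every video frame contributes a fixed number of pooled features.
FEATURES_PER_FRAME = 144  # assumed: 24x24 patches pooled with stride 2 -> 12x12


def frames_implied(count, features_per_frame=FEATURES_PER_FRAME):
    """Return (whole_frames, leftover) for a given token or feature count."""
    return divmod(count, features_per_frame)


features = 4608  # "features" value from the ValueError
tokens = 1004    # "tokens" value from the ValueError

print(frames_implied(features))  # (32, 0)
print(frames_implied(tokens))    # (6, 140)
```

Under that assumption the feature count corresponds to exactly 32 frames, while the token count is not a whole number of frames. That could mean some video placeholder tokens are being dropped before the model sees them (for example by truncation to a maximum length), but that is a guess on my part.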


I googled around but couldn't find whether anyone has identified a clean solution for this yet. I'm wondering if you've had any luck.

Thanks

Question about LLaVA-NeXT-Video fine-tuning example #485

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Question about LLaVA-NeXT-Video fine-tuning example #485

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions