Data structure #16

floppycracken · 2024-06-25T14:15:48Z

Thanks for the great work and sharing the code.
I wanted to ask how to prepare the data for the case where the model outputs both image and text. This would be similar to the seedx-ppt model.

geyuying · 2024-07-21T06:18:12Z

Hi, we currently support the following dataloaders, each with its own data structure.

For the "build_llava_jsonl_datapipes" dataloader, each folder stores a number of jsonl files, and each jsonl file contains 10K records. An example record:

{"image": "coco/train2017/000000033471.jpg", "data": ["What are the colors of the bus in the image?", "The bus in the image is white and red.", "What feature can be seen on the back of the bus?", "The back of the bus features an advertisement.", "Is the bus driving down the street or pulled off to the side?", "The bus is driving down the street, which is crowded with people and other vehicles."]}

For "build_caption_datapipes_with_pixels" dataloder, each folder stores a number of .tar files and reads image-text pairs in the form of webdataset.

For "build_single_turn_edit_datapipes" dataloder, each folder stores a number of jsonl files, each jsonl file contains 10K pieces of content, with an example of the content as follows:

{"source_image": "source_images/f6f4d0669694df5b.jpg", "target_image": "target_images/f6f4d0669694df5b.jpg", "instruction": "Erase the car that is parked in front of the Roebuck building."}
