Added GQA as eval dataset #298

Closed
wants to merge 64 commits
Changes from 1 commit
Commits
64 commits
e19133d
deepspeed running
anas-awadalla Aug 25, 2023
870f20c
more progress
anas-awadalla Aug 26, 2023
f9162a0
added ds checkpointing
anas-awadalla Aug 26, 2023
ded3485
more progress
anas-awadalla Aug 30, 2023
3672042
mllm
Aug 30, 2023
99c350f
merge deepspeed
anas-awadalla Aug 30, 2023
2f634f0
rewrite src: add VLM, Kosmos, Flamingo
i-gao Sep 7, 2023
7261639
fix kosmos models
i-gao Sep 11, 2023
09977ba
cosmetic: num_params helper fn
i-gao Sep 11, 2023
6bb9071
revert to deepspeed branch code for train/
i-gao Sep 11, 2023
7984adb
add BLIP
i-gao Sep 12, 2023
7eab26a
minor train script fixes
i-gao Sep 12, 2023
aed0f21
fix vocab len issues
i-gao Sep 13, 2023
47c8e19
fixes
i-gao Sep 13, 2023
11ab894
big refactor of training code
i-gao Sep 15, 2023
cd4f3aa
many fixes + rewrite FSDP for torch nightly
i-gao Sep 16, 2023
74686a7
fixes
i-gao Sep 16, 2023
61f5a3d
fixes
i-gao Sep 16, 2023
ccfcb0f
run linter & fix gradient ckpting
i-gao Sep 16, 2023
303e707
no need to untie embeddings for fsdp
i-gao Sep 16, 2023
fc660e7
add in missing kwarg
i-gao Sep 16, 2023
be9a4dd
Merge branch deepspeed: eval code only
i-gao Sep 16, 2023
b0ff9a4
update eval code to match new src args
i-gao Sep 16, 2023
92bc4b7
update documentation and example scripts
i-gao Sep 16, 2023
60a82d7
fix deepspeed train script
anas-awadalla Sep 17, 2023
82d1c69
removed non default loss scale window
anas-awadalla Sep 17, 2023
4875822
init flamingo embeds new weights
anas-awadalla Sep 17, 2023
8f2f040
init flamingo embeds new weights
anas-awadalla Sep 17, 2023
beba4d2
Merge branch 'main' into mllm
anas-awadalla Sep 17, 2023
b81379f
fix mmc4 sim threshold arg
anas-awadalla Sep 17, 2023
f91c14a
add z-loss
anas-awadalla Sep 17, 2023
df96979
Merge pull request #262 from mlfoundations/add-z-loss
anas-awadalla Sep 17, 2023
bcc5a8f
Update eval README.md
i-gao Sep 17, 2023
770e653
have a default stdev for init
Sep 17, 2023
ef268be
Update run_train_deepspeed.sh
anas-awadalla Sep 17, 2023
da07e35
fix loss impl and model vocab size
Sep 17, 2023
3fcda82
Merge branch 'mllm' of https://github.com/mlfoundations/open_flamingo…
Sep 17, 2023
bcd2cf5
remove ds act checkpointing exception
Sep 18, 2023
9b1a764
fixes from PR review
i-gao Sep 19, 2023
866a780
Merge branch 'mllm' of github.com:mlfoundations/open_flamingo into mllm
i-gao Sep 19, 2023
5ad05c4
add weight/bias init to decouple linear
anas-awadalla Sep 20, 2023
939d460
Language stream changes (#264)
anas-awadalla Sep 21, 2023
ae76178
grad checkpointing + ds saving patch (we should find a cleaner solution)
anas-awadalla Sep 21, 2023
d29c8b8
Update run_train_deepspeed.sh
anas-awadalla Oct 18, 2023
b7af1d6
clearer parameter count logging
anas-awadalla Oct 18, 2023
43ac961
Fix model vocab size (now it is len of tokenizer)
anas-awadalla Oct 18, 2023
e7684b5
Update code example
anas-awadalla Oct 18, 2023
735a880
fix LR schedule
anas-awadalla Oct 23, 2023
496e656
fix var naming in load_deepspeed_checkpoint
anas-awadalla Oct 24, 2023
c5feb97
Update losses.py
anas-awadalla Nov 30, 2023
dbb1ad8
train_utils media token fix
Dec 2, 2023
fa6af69
remove unnecessary model unwrap lines
Dec 2, 2023
eb6b8aa
Merge pull request #283 from mlfoundations/media_token_fix
anas-awadalla Dec 2, 2023
1e75320
remove deepspeed, some fixes, and llava
Feb 22, 2024
feba465
fix for siglip, llava, and lr decay
anas-awadalla Feb 24, 2024
0b1c926
remove z-loss mess
anas-awadalla Feb 24, 2024
79ad152
some more fixes
anas-awadalla Mar 17, 2024
3945c87
Update data.py
anas-awadalla Mar 17, 2024
52ca075
Update losses.py
anas-awadalla Mar 17, 2024
292afa1
fix flamingo init
anas-awadalla Mar 20, 2024
a72c96b
fix resampler projection
anas-awadalla Mar 21, 2024
c7a5ae5
Update helpers.py
anas-awadalla Mar 21, 2024
a5378a8
blip.py import and output truncation fix
Mar 28, 2024
358cecc
added gqa as eval dataset
May 2, 2024
Prev Previous commit
Next Next commit
run linter & fix gradient ckpting
i-gao committed Sep 16, 2023
commit ccfcb0f539969e07d208ede4f251466fa948e3b3
11 changes: 9 additions & 2 deletions open_flamingo/src/factory.py
@@ -181,6 +181,8 @@ def check_embedding_fns(lang_model):
if not has_fn(lang_model, "get_input_embeddings"):
if hasattr_recursive(lang_model, "transformer.wte"): # MPT
lang_model.get_input_embeddings = lambda: lang_model.transformer.wte
elif hasattr_recursive(lang_model, "model.decoder.embed_tokens"): # OPT
lang_model.get_input_embeddings = lambda: lang_model.model.decoder.embed_tokens
else:
raise ValueError(
"We require the language encoder to have a get_input_embeddings method but we couldn't determine the name of the input embeddings attribute. Please supply this manually in factory.py."
@@ -191,6 +193,10 @@ def check_embedding_fns(lang_model):
lang_model.set_input_embeddings = lambda x: setattr_recursive(
lang_model, "transformer.wte", x
)
elif hasattr_recursive(lang_model, "model.decoder.embed_tokens"): # OPT
lang_model.set_input_embeddings = lambda x: setattr_recursive(
lang_model, "model.decoder.embed_tokens", x
)
else:
raise ValueError(
"We require the language encoder to have a set_input_embeddings method but we couldn't determine the name of the input embeddings attribute. Please supply this manually in factory.py."
@@ -211,13 +217,14 @@ def check_embedding_fns(lang_model):
)
else:
raise ValueError(
"We require the language encoder to have a get_output_embeddings method but we couldn't determine the name of the output embeddings attribute. Please supply this manually in factory.py."
"We require the language encoder to have a set_output_embeddings method but we couldn't determine the name of the output embeddings attribute. Please supply this manually in factory.py."
)


def has_fn(model, fn_name):
"""Try to call the fn_name function on the model"""
try:
getattr(model, fn_name)()
return True
except:
return False
return False
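
For reference, the factory.py patch above leans on dotted-path helpers (`hasattr_recursive`, `setattr_recursive`, and `getattr_recursive` used in vlm.py below) whose definitions are not part of this diff. A minimal sketch of how such helpers typically behave — written as an assumption for illustration, not the repo's exact code:

```python
# Minimal sketch of dotted-path attribute helpers like those the patch relies on.
# Illustrative assumptions, not the repo's exact implementations.
def getattr_recursive(obj, path: str):
    """Follow a dotted attribute path, e.g. "model.decoder.embed_tokens"."""
    for name in path.split("."):
        obj = getattr(obj, name)
    return obj


def hasattr_recursive(obj, path: str) -> bool:
    """Return True if every attribute along the dotted path exists."""
    try:
        getattr_recursive(obj, path)
        return True
    except AttributeError:
        return False


def setattr_recursive(obj, path: str, value) -> None:
    """Set the final attribute of a dotted path, e.g. replace embed_tokens."""
    *parents, leaf = path.split(".")
    for name in parents:
        obj = getattr(obj, name)
    setattr(obj, leaf, value)
```
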
4 changes: 2 additions & 2 deletions open_flamingo/src/vlm.py
@@ -75,7 +75,6 @@ def __init__(
self.lang_model.set_output_embeddings(out_embeds)

# gradient checkpointing
self._use_gradient_checkpointing = gradient_checkpointing
self.vision_tokenizer._use_gradient_checkpointing = gradient_checkpointing

def forward(
@@ -513,8 +512,9 @@ def __init__(
pad_token_id=pad_token_id,
gradient_checkpointing=gradient_checkpointing,
)
self.lang_model._use_gradient_checkpointing = gradient_checkpointing
self.decoder_layers_attr_name = decoder_layers_attr_name
for block in getattr_recursive(self.lang_model, self.decoder_layers_attr_name):
block._use_gradient_checkpointing = gradient_checkpointing
assert (
self.vis_embedding_dim == self.lang_embedding_dim
), "To place visual tokens direclty in the language stream, the visual and language tokens need to be the same dim."
2 changes: 0 additions & 2 deletions open_flamingo/train/train_utils.py
@@ -68,8 +68,6 @@ def train_one_epoch(
losses_to_log = {}
batch_metadata_to_log = {}
for dataset_ix, (images, (input_ids, attention_mask)) in enumerate(batches):
print(">> Dataset: ", datasets[dataset_ix].name, "Step: ", step_num)

# unpack the batch and move to device
images = images.to(device_id, dtype=cast_dtype, non_blocking=True)
input_ids = input_ids.to(device_id, non_blocking=True)
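
For context on `cast_dtype` in the snippet above: it is typically derived from the training precision flag. A hedged sketch of that mapping — the function name and accepted flag values are assumptions for illustration, not copied from the repo:

```python
# Hedged sketch: mapping a precision flag to the dtype used in
# images.to(device_id, dtype=cast_dtype, non_blocking=True).
from typing import Optional

import torch


def get_cast_dtype(precision: str) -> Optional[torch.dtype]:
    """Return the dtype to cast inputs to, or None for full precision."""
    if precision == "bf16":
        return torch.bfloat16
    if precision == "fp16":
        return torch.float16
    return None  # full precision: keep tensors in their original dtype
```
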