Great job! Are there any comparative experiments on the window_block_indexes and out_feature_indexes settings? Why is windowed attention applied specifically at layers 0, 1, 3, 6, 7, and 9?
For example, how would increasing or decreasing the number of entries in window_block_indexes affect the metrics? Thanks.
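
To make the question concrete, here is a minimal sketch of how I read the setting: blocks listed in `window_block_indexes` use windowed attention and every other block uses global attention. The 12-block depth and the helper name `split_attention_blocks` are my own assumptions for illustration, not taken from the repository.

```python
def split_attention_blocks(depth, window_block_indexes):
    """Partition block indices into windowed-attention and global-attention blocks.

    `depth` is the total number of transformer blocks (assumed, not from the repo).
    """
    window = sorted(set(window_block_indexes))
    out_of_range = [i for i in window if not 0 <= i < depth]
    if out_of_range:
        raise ValueError(f"block indexes outside [0, {depth}): {out_of_range}")
    # Any block not listed as windowed is treated as a global-attention block.
    global_blocks = [i for i in range(depth) if i not in window]
    return window, global_blocks


# Hypothetical 12-block backbone with the indexes mentioned in this issue.
window, global_blocks = split_attention_blocks(12, [0, 1, 3, 6, 7, 9])
print("windowed:", window)
print("global:  ", global_blocks)
```

Under that assumed depth, adding or removing entries in `window_block_indexes` directly changes how many blocks fall back to global attention, which is the trade-off I am asking about.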