What is the 1-shot / half-shot / quarter-shot constraint in the experiments? #185
Comments
Looking forward to your reply, thank you!
I also want to know what the compression targets are for GSM8K / BBH under the 1-shot / half-shot etc. constraints, i.e. what is the target_token? I was also wondering what zero-shot means here, specifically for the LongBench benchmark:
@iofu728 sorry for bothering, but what exactly is the definition of "zero-shot" in the context of the ZeroScrolls benchmark? As stated here, ZeroScrolls is already a zero-shot benchmark by itself:
so I am confused why there is an extra row for "zero-shot" for the ZeroScrolls benchmark in Table 2?
I also want to know how the summarization tasks
Hi @21-10-4, @cornzz, and @dongziyu1016, thanks for your questions, and apologies for the delayed response.
```python
import json

from datasets import load_dataset
from tqdm import tqdm

# TASKS, get_zero_scrolls, and encoding come from the repo's evaluation script.
res = []
for task in TASKS:
    dataset = load_dataset("tau/zero_scrolls", task)["validation"]
    for ii, jj in tqdm(enumerate(dataset), total=len(dataset)):
        (prompt, question), output = get_zero_scrolls(jj, task)
        if not question:
            # No explicit question: fall back to the first 200 tokens of the prompt.
            question = encoding.decode(encoding.encode(prompt)[:200])
        res.append({"id": ii, "task": task, "prompt": question, "output": output})

with open("prompt/zero_scrolls/zero_shot.json", "w") as f:
    json.dump(res, f)
```
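To illustrate the `encoding.decode(encoding.encode(prompt)[:200])` fallback in the snippet above: when a task has no explicit question, the first 200 tokens of the prompt are decoded back into text and used in its place. A minimal, self-contained sketch of that pattern, using a word-level tokenizer as a stand-in for the repo's `encoding` object (which I assume is a GPT-style tokenizer):

```python
def first_n_tokens(text: str, n: int = 200) -> str:
    """Keep only the first n tokens of the text.

    Word-level stand-in: text.split() plays the role of encoding.encode(),
    and " ".join() plays the role of encoding.decode().
    """
    tokens = text.split()
    return " ".join(tokens[:n])


prompt = " ".join(f"tok{i}" for i in range(500))
print(len(first_n_tokens(prompt).split()))  # 200
```

With a real subword tokenizer the cut would fall on token boundaries rather than whitespace, but the logic is the same.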
@iofu728 thanks for your response! I have some follow-up questions:
I do not quite understand this: in the results table for GSM8K under the 1-shot constraint, the value in the "Tokens" column for LLMLingua-2 is 457. However, the longest demonstration in the uncompressed CoT prompt (prompt_hardest.txt) is only 429 tokens long, so it cannot be the case that the 1-shot constraint means exactly one of the demonstrations is retained? (And I assume that only the token counts of the CoT demonstrations are counted in the "Tokens" column, as the value for Full-Shot exactly corresponds to the token count of
How exactly is the prompt built for the zero-shot case? Is the
Could you clarify: does this mean you keep 25 tokens from the beginning of the context / document and 25 tokens from the end, and cut out the middle?
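If that reading is correct, the truncation might look something like the following sketch (the word-level tokenizer and the exact 25/25 budget split are my assumptions, not taken from the paper; the actual code presumably operates on GPT tokens):

```python
def truncate_middle(tokens, head=25, tail=25):
    """Keep the first `head` and last `tail` tokens, dropping the middle."""
    if len(tokens) <= head + tail:
        return tokens  # already short enough, nothing to cut
    return tokens[:head] + tokens[-tail:]


# Word-level stand-in for real tokenizer output.
doc_tokens = [f"w{i}" for i in range(100)]
kept = truncate_middle(doc_tokens)
print(len(kept))  # 50: 25 from the head + 25 from the tail
```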
I do not understand: how do I find out which part of the code context corresponds to the question? The prompt template given in
What exactly does "no context information is used" mean? Especially for summarization tasks, there has to be something to summarize. Given your code example, which I do not quite understand (what is
I still cannot understand. If the 1-shot constraint refers to the original tokens (containing one demonstration), then what does the half-shot constraint refer to? Half a demonstration?
Originally posted by @21-10-4 in #164 (comment)