
What is the 1-shot / half-shot / quarter-shot constraint in experiments? #185

Open · 21-10-4 opened this issue Sep 23, 2024 · 6 comments
Labels: question (Further information is requested)

21-10-4 commented Sep 23, 2024

I still can't understand this. Does "1-shot constraint" mean the original tokens (containing one example)? Then what does "half-shot constraint" refer to, half an example?

Originally posted by @21-10-4 in #164 (comment)

21-10-4 (Author) commented Sep 23, 2024

Really looking forward to a reply, thanks!

cornzz commented Sep 27, 2024

I also want to know what the compression targets are for GSM8K / BBH under the 1-shot / half-shot etc. constraints, i.e. what is the target_token?

I was also wondering what zero-shot means here, specifically for the LongBench benchmark:
I suppose it's clear for the tasks where both context and input are given; in that case one would just leave context empty and only insert input into the prompt? But what about the summarization tasks, or the lcc task, where there is only context and no input at all?

cornzz commented Sep 30, 2024

@iofu728 sorry for bothering, but what exactly is the definition of "zero-shot" in the context of the ZeroScrolls benchmark? As stated here, ZeroScrolls is already a zero-shot benchmark by itself:

"ZeroSCROLLS is a zero-shot benchmark for natural language understanding over long texts."

So I am confused as to why there is an extra "zero-shot" row for the ZeroScrolls benchmark in Table 2?

[Screenshot: Table 2, showing a "zero-shot" row under the ZeroScrolls benchmark]

dongziyu1016 commented

(Quoting @cornzz's question above about what zero-shot means for the LongBench benchmark.)

I also want to know how the summarization tasks are handled in the zero-shot case.

iofu728 self-assigned this Oct 22, 2024
iofu728 added the question (Further information is requested) label Oct 22, 2024
iofu728 (Contributor) commented Oct 22, 2024

Hi @21-10-4, @cornzz, and @dongziyu1016, thanks for your questions, and apologies for the delayed response.

  1. "1-shot", "half-shot", and "quarter-shot" refer to the number of tokens used in the prompt. "1-shot" means only one example is retained, while "half-shot" and "quarter-shot" indicate that the compressed tokens are equivalent to half and one-quarter of the average tokens used by a demonstration, respectively.
  2. Zero-shot refers to not using any context or demonstrations beyond the question. For summarization, we retain 25 tokens before and after the document, while for LCC, we only retain the code context corresponding to the question.
  3. Apologies for the confusion. In ZeroScrolls, zero-shot means no context information is used. You can refer to the following code for further details:
    import json
    from datasets import load_dataset
    from tqdm import tqdm

    # TASKS, get_zero_scrolls, and encoding (a tiktoken-style encoder) are
    # defined elsewhere in the evaluation script.
    res = []
    for task in TASKS:
        dataset = load_dataset("tau/zero_scrolls", task)["validation"]
        for ii, jj in tqdm(enumerate(dataset), total=len(dataset)):
            (prompt, question), output = get_zero_scrolls(jj, task)
            if not question:
                # Fall back to the first 200 tokens of the full prompt.
                question = encoding.decode(encoding.encode(prompt)[:200])
            res.append({"id": ii, "task": task, "prompt": question, "output": output})
    json.dump(res, open("prompt/zero_scrolls/zero_shot.json", "w"))
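
To make the budget in point 1 concrete, it can be sketched roughly as follows (a simplified illustration, not the exact evaluation code; the blank-line separator and the tiktoken tokenizer are assumptions for the example):

    import tiktoken

    enc = tiktoken.encoding_for_model("gpt-3.5-turbo")  # assumption: tokenizer used for counting
    # Assumption: demonstrations in the CoT prompt are separated by blank lines.
    demos = open("prompt_hardest.txt").read().split("\n\n")
    avg_demo_tokens = sum(len(enc.encode(d)) for d in demos) / len(demos)

    # Token budget (target_token) handed to the compressor under each constraint:
    target_token = {
        "1-shot": int(avg_demo_tokens),            # one average demonstration
        "half-shot": int(avg_demo_tokens / 2),     # half of an average demonstration
        "quarter-shot": int(avg_demo_tokens / 4),  # a quarter of an average demonstration
    }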

cornzz commented Oct 26, 2024

@iofu728 thanks for your response! I have some follow-up questions:

"1-shot" means only one example is retained

I do not quite understand this, as in the results table for GSM8K under the 1-shot constraint the value in the "Tokens" column for LLMLingua-2 is 457. However, the longest demonstration in the uncompressed CoT prompt (prompt_hardest.txt) is only 429 tokens long, so it cannot be the case that the 1-shot constraint actually means only one of the demonstrations is retained? (And I assume that only the token counts for the CoT demonstrations are counted in the "Tokens" column, as the value for Full-Shot exactly corresponds to the token count of prompt_hardest.txt.)

Zero-shot refers to not using any context or demonstrations beyond the question.

How exactly is the prompt built for the zero-shot case? Is the {context} placeholder in the prompt template literally just filled with an empty string, so that e.g. the Narrative QA prompt becomes Story: \n\nNow, answer the question based on the story...? This leads to instruct models answering "There is no story provided." without even attempting to generate an answer. Is this intended?
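
To illustrate with code what I mean (a hypothetical, shortened version of the template, not the exact one from the repo):

    # Hypothetical, shortened NarrativeQA-style template:
    template = "Story: {context}\n\nNow, answer the question based on the story.\n\nQuestion: {input}\n\nAnswer:"
    # Zero-shot as I understand it: the context placeholder is just the empty string.
    prompt = template.format(context="", input="Who is the protagonist?")
    # -> "Story: \n\nNow, answer the question based on the story. ..."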

For summarization, we retain 25 tokens before and after the document

Could you clarify, does this mean you keep 25 tokens from the beginning of the context / document and 25 tokens from the end and cut out the middle?
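
I.e., is it something like the following (my guess, assuming a tiktoken-style encoder as in your snippet)?

    tokens = encoding.encode(document)  # `document` is the full context to be summarized
    kept = tokens[:25] + tokens[-25:]   # 25 tokens from the start, 25 from the end, middle cut out
    compressed_context = encoding.decode(kept)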

while for LCC, we only retain the code context corresponding to the question

I do not understand: how do I find out which part of the code context corresponds to the question? The prompt template given in eval_longbench.py is "Please complete the code given below. \n{context}Next line of code:\n". Perhaps you meant the repobench-p task, where a question field containing the relevant code is given for each sample, while the context field only contains additional code for context?
Could you also clarify how zero_shot works for all the other tasks, especially the following: Passage Count and Passage Retrieval?

What exactly does "no context information is used" mean, especially for the summary tasks, where there has to be something to summarize? Given your code example, which I do not quite understand (what is get_zero_scrolls() and what exactly does it return?), it seems that the first 200 tokens of the prompt are retained in case question is empty? Does that mean there are 200 tokens of context retained for the zero-shot case?
