
Add structured outputs. #7

Merged
merged 3 commits into from
Feb 9, 2025

Conversation

sayakpaul
Collaborator

Will fix #3.

Facing:

Traceback (most recent call last):
  File "/Users/sayakpaul/Downloads/AdaptSum/main.py", line 191, in <module>
    demo = main(args)
  File "/Users/sayakpaul/Downloads/AdaptSum/main.py", line 178, in main
    with gr.Column("chat-window", elem_id="chat-window"):
TypeError: __init__() takes 1 positional argument but 2 positional arguments (and 1 keyword-only argument) were given

Happens with main branch too. @deep-diver could you check?
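The traceback is consistent with `gr.Column` taking no positional arguments in the installed Gradio version, so the string `"chat-window"` passed positionally triggers the error. A minimal dependency-free sketch (this `Column` class is a stand-in, not Gradio itself) reproduces the failure mode:

```python
# Stand-in for gr.Column with a keyword-only signature: passing
# "chat-window" positionally raises the same kind of TypeError.
class Column:
    def __init__(self, *, elem_id=None):
        self.elem_id = elem_id

try:
    Column("chat-window", elem_id="chat-window")  # mimics the failing call
except TypeError as err:
    print("TypeError:", err)

col = Column(elem_id="chat-window")  # keyword-only call succeeds
print(col.elem_id)  # chat-window
```

Dropping the positional string and keeping only `elem_id="chat-window"` would be the corresponding fix in `main.py`.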

@deep-diver
Owner

I am on it!

@deep-diver
Owner

@sayakpaul which version of Gradio do you use?

@sayakpaul
Collaborator Author

3.36.1

@deep-diver
Owner

Oh, that's a pretty old version.
Could you please update it to 5.14.0?

@sayakpaul
Collaborator Author

That leads to:

ERROR: No matching distribution found for gradio==5.14.0

What am I missing? I am on Mac.

Meanwhile, I ran a simple test with the following:

from configs.responses import SummaryResponses
from google import genai
import os

client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))
response = client.models.generate_content(
    model="gemini-1.5-flash",
    contents=[
        ["Summarize Shakespeare's life work in a few sentences"]
    ],
    config={
        'response_mime_type': 'application/json',
        'response_schema': list[SummaryResponses],
    },
)
print(response.text)
print(response.parsed)

Gives me:

[{"previous_summary": "","updated_summary": "William Shakespeare's body of work includes 39 plays, 154 sonnets, two long narrative poems, and a few other verses, showcasing a mastery of language and exploration of universal human themes such as love, loss, ambition, and revenge. His plays are categorized into comedies, tragedies, and histories, each demonstrating his profound understanding of human nature and dramatic construction. His contributions to English literature and the theater are immeasurable and continue to resonate globally."}]
[SummaryResponses(previous_summary='', updated_summary="William Shakespeare's body of work includes 39 plays, 154 sonnets, two long narrative poems, and a few other verses, showcasing a mastery of language and exploration of universal human themes such as love, loss, ambition, and revenge. His plays are categorized into comedies, tragedies, and histories, each demonstrating his profound understanding of human nature and dramatic construction. His contributions to English literature and the theater are immeasurable and continue to resonate globally.")]
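`configs/responses.py` isn't shown in this thread, but the printed repr implies a schema with two string fields, most likely a Pydantic model (which would produce that `SummaryResponses(...)` repr). A dependency-free sketch of the same shape using a `TypedDict`, which `google-genai` also accepts as a `response_schema`:

```python
from typing import TypedDict

# Hypothetical reconstruction of the schema in configs/responses.py;
# the actual file likely uses a Pydantic BaseModel with the same fields.
class SummaryResponses(TypedDict):
    previous_summary: str
    updated_summary: str

example: SummaryResponses = {
    "previous_summary": "",
    "updated_summary": "Shakespeare wrote 39 plays and 154 sonnets.",
}
print(example["updated_summary"])
```

With `response_schema` set to `list[SummaryResponses]`, as in the snippet above, `response.parsed` comes back as a list of these objects.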

@sayakpaul
Collaborator Author

Okay, Python 3.10 resolved the issue.
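That fits the earlier pip error: Gradio 5.x targets Python 3.10 or newer, so an older interpreter sees "No matching distribution found". A small check (the 3.10 floor is the assumption implied by the fix):

```python
import sys

MIN_PY = (3, 10)  # assumed floor: Gradio 5.x wheels target Python >= 3.10

def can_install_gradio5(version_info=sys.version_info) -> bool:
    """True if this interpreter should find a gradio==5.14.0 distribution."""
    return tuple(version_info[:2]) >= MIN_PY

print(can_install_gradio5((3, 8, 10)))  # False: pip finds no matching distribution
print(can_install_gradio5((3, 10, 0)))  # True
```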

Tried with the ViT paper:

[screenshots of the summarization UI run on the ViT paper]

When I printed the response.parsed here, I got:

response.parsed=[SummaryResponses(previous_summary="This paper introduces Vision Transformer (ViT), a pure transformer network for image recognition.  Unlike previous approaches that combined transformers with convolutional neural networks (CNNs), ViT processes images directly by splitting them into patches and treating those patches as tokens, similar to words in natural language processing (NLP). The key finding is that while ViT's performance on mid-sized datasets like ImageNet is initially modest compared to CNNs,  pre-training ViT on very large datasets (14M-300M images) dramatically improves its accuracy.  When pre-trained at scale and transferred to various image recognition benchmarks (ImageNet, CIFAR-100, VTAB, etc.), ViT achieves state-of-the-art results, surpassing or matching the performance of ResNet-like CNNs while requiring substantially less computational resources for training.  The authors attribute this success to the scalability of the transformer architecture and the advantage of large-scale training over relying on CNNs' inductive biases.  The paper also explores variations of the architecture, including hybrid models combining CNNs and transformers, and investigates the impact of pre-training dataset size and self-supervised learning.  Overall, the study demonstrates the significant potential of transformers for large-scale image recognition.", updated_summary='This ICLR 2021 paper, "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale," introduces Vision Transformer (ViT), a novel architecture applying the Transformer directly to image patches without using convolutional neural networks (CNNs).  ViT treats image patches as "tokens," linearly projects them into embeddings, adds positional embeddings, and feeds the sequence into a Transformer encoder.  The key architectural innovation is this direct Transformer application to image data, removing inductive biases of CNNs.  
When trained on large datasets (14M-300M images), ViT achieves excellent results on various image recognition benchmarks, outperforming CNN-based models with fewer training resources.  The paper explores a hybrid CNN-Transformer approach, but its main contribution is demonstrating a pure Transformer architecture\'s effectiveness for large-scale image recognition.')]

So, I guess it's doing what it is supposed to be doing? What's next?

@sayakpaul sayakpaul requested a review from deep-diver February 7, 2025 03:22
@sayakpaul sayakpaul marked this pull request as ready for review February 7, 2025 03:22
Comment on lines 4 to 5
previous_summary: str
updated_summary: str
Owner

since this is the "response", I think we don't need previous_summary and updated_summary. Instead, just summary.

Collaborator Author

How do we keep track of the previous summary in a better manner otherwise?

Owner

The LLM only generates the summary; it does not generate the previous summary (keeping previous_summary in the response schema means we'd be asking the LLM to generate the previous summary). The previous summary is something to be given to the LLM as input (though we don't currently pass it as input).
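The division of labor being proposed can be sketched as follows (the helper name and prompt wording are hypothetical, not from the repo): the previous summary flows into the prompt as input, while the response schema only asks the model for the new summary.

```python
from typing import TypedDict

# Response schema: the only field the LLM should generate.
class SummaryResponse(TypedDict):
    summary: str

# Hypothetical prompt builder: previous_summary is model INPUT, not output.
def build_prompt(previous_summary: str, new_text: str) -> str:
    return (
        "Previous summary:\n" + previous_summary + "\n\n"
        "New content:\n" + new_text + "\n\n"
        "Produce an updated summary covering both."
    )

prompt = build_prompt("ViT splits images into patches.", "It scales with data.")
print("Previous summary:" in prompt)  # True
```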

Collaborator Author

Not sure I understand this bit:

it is something to be given to the LLM as input (we don't give it as input currently though).

Aren't we already doing that here:

previous_summary=state['summary'],

Or am I missing something?

Owner

@deep-diver deep-diver left a comment

Thanks!

the code (in main.py) below stores the summary from the response:

state['summary'] = response.text
state['summary_history'].append(response.text)

Now, response.text is a JSON-formatted string. Hence, we need to parse it, extract the value under the summary key, and use that in place of response.text in the original code, something like below:

state['summary'] = response.parsed.summary
state['summary_history'].append(response.parsed.summary)
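One caveat worth flagging (the objects below are hypothetical stand-ins, since the full `main.py` isn't shown here): if `response_schema` is declared as `list[SummaryResponses]`, as in the test snippet earlier in the thread, `response.parsed` is a list, so the summary has to be accessed by index. With a non-list schema, the attribute access suggested above works directly.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the Gemini response object, showing the two
# shapes response.parsed can take depending on the declared response_schema.
@dataclass
class SummaryResponses:
    summary: str

@dataclass
class FakeResponse:
    parsed: object

state = {"summary": None, "summary_history": []}

# Schema declared as SummaryResponses -> parsed is a single object:
response = FakeResponse(parsed=SummaryResponses(summary="A concise summary."))
state["summary"] = response.parsed.summary
state["summary_history"].append(state["summary"])

# Schema declared as list[SummaryResponses] -> parsed is a list:
response = FakeResponse(parsed=[SummaryResponses(summary="An updated summary.")])
state["summary"] = response.parsed[0].summary
state["summary_history"].append(state["summary"])

print(state["summary_history"])  # ['A concise summary.', 'An updated summary.']
```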

@sayakpaul sayakpaul requested a review from deep-diver February 8, 2025 02:42
Owner

@deep-diver deep-diver left a comment

Thanks for the update!!

@sayakpaul
Collaborator Author

@deep-diver I updated the schema to only include summary. Could you review once again?

@deep-diver
Owner

> @deep-diver I updated the schema to only include summary. Could you review once again?

Looks good to me!
We can discuss more later, but I think this PR is ready to be merged :)

@sayakpaul sayakpaul merged commit b8711d7 into main Feb 9, 2025
@sayakpaul sayakpaul deleted the structured-outs branch February 9, 2025 03:42
Successfully merging this pull request may close these issues.

Structured output support