Add citations #58

Open: wants to merge 9 commits into base: main

Conversation

galopyz commented Feb 4, 2025

Hi, @ncoop57

Adding an example to use citations in claudette as stated in #53.
I am still new to git, claudette, and open source projects, so I might have messed up something.

I only provided a plain-text example. PDFs take many tokens, but if we want a PDF example, I could find a small one and show how it can be used.
When using citations, Claude responds with alternating text and citation blocks. The response doesn't look pretty because only the first text block is shown without any citation blocks.
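The display problem described above can be illustrated with a small sketch. It assumes the response content is a list of text blocks where some blocks carry a `citations` list, and it uses plain dicts with made-up sample data rather than real SDK objects. The function names are hypothetical and are not part of claudette.

```python
# Hypothetical sample data: plain dicts imitating the content blocks of a
# response with citations enabled. Real responses use SDK objects instead.
blocks = [
    {"type": "text", "text": "Claudette is "},
    {"type": "text", "text": "a wrapper for Anthropic's Python SDK",
     "citations": [{"type": "char_location",
                    "cited_text": "Claudette is a wrapper for Anthropic's Python SDK.",
                    "document_index": 0,
                    "document_title": "Claudette LLM Context"}]},
    {"type": "text", "text": "."},
]

def first_text_only(blocks):
    "Mimic a display that shows only the first text block."
    return blocks[0]["text"]

def all_text(blocks):
    "Join the text of every block, ignoring the citations."
    return "".join(b["text"] for b in blocks)
```

Showing only the first block truncates the answer, while joining every block recovers the full sentence but drops the cited sources.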

Please let me know if I need to make any corrections.
Thanks.

gitnotebooks bot commented Feb 4, 2025

Found 1 changed notebook. Review the changes at https://app.gitnotebooks.com/AnswerDotAI/claudette/pull/58

ncoop57 (Contributor) commented Feb 5, 2025

Thanks @galopyz for this PR! It's coming together nicely 🤓 How about we work on getting the response to look better, especially when it includes a lot of citations? If you look at Simon Willison's blog post on it, he has a little section on rendering these citations: https://simonwillison.net/2025/Jan/24/anthropics-new-citations-api/#rendering-the-citations. How about you see if you could add something similar?

galopyz (Author) commented Feb 7, 2025

Hi, @ncoop57
I have added those changes to display messages with citations as block quotes.
I merged AnswerDotAI:main into my branch because mine was behind, and there are now many changes in gitnotebooks. Should I revert this change?

Let me know if anything needs to be changed. Should I also run all the notebooks to make sure I didn't break anything? I tried my best to follow the coding style and not break anything.
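Rendering citations as block quotes, as mentioned in this comment, might look roughly like the sketch below. It is not the code from the PR: it assumes each content block is a dict with a `text` key and an optional `citations` list whose entries have a `cited_text` key, and the function name `render_blockquotes` is hypothetical.

```python
def render_blockquotes(blocks):
    "Render text blocks as markdown, putting each citation in a block quote."
    parts = []
    for b in blocks:
        parts.append(b["text"])
        for c in b.get("citations") or []:
            # Prefix every line of the cited text so multi-line quotes stay quoted
            quoted = "\n".join("> " + line for line in c["cited_text"].strip().splitlines())
            parts.append(f"\n\n{quoted}\n\n")
    return "".join(parts).strip()
```

Each cited passage is set off on its own lines, so the answer text and its sources stay visually distinct when the markdown is rendered.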

ncoop57 (Contributor) commented Feb 23, 2025

@galopyz thanks for the awesome PR!! I fixed the formatting a bit for the citations and a slight bug in the contents function.

@jph00 want your thoughts on how these citations are now handled in claudette. Is automatically formatting them with inline citations for the response okay or do you think we need to handle it a different way?

ncoop57 requested a review from jph00 February 23, 2025 15:21

jph00 (Contributor) left a comment

I'm not sure how I feel about changing contents() like this -- it feels a bit too much like a fixed format. E.g. I suspect I'd prefer to use markdown footnotes instead -- different people will have different preferences for how citations are handled. Any thoughts on how it could be made more flexible?
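The markdown-footnote alternative raised in this comment could be sketched as follows. This is only an illustration, not a proposal for how `contents()` should work: it assumes content blocks are dicts with a `text` key and an optional `citations` list of dicts with `cited_text`, and `render_footnotes` is a hypothetical name.

```python
def render_footnotes(blocks):
    "Render text with markdown footnote markers and a trailing list of footnotes."
    body, notes = [], []
    for b in blocks:
        body.append(b["text"])
        for c in b.get("citations") or []:
            # Collapse whitespace so each footnote definition fits on one line
            notes.append(" ".join(c["cited_text"].split()))
            body.append(f"[^{len(notes)}]")
    text = "".join(body)
    if not notes:
        return text
    defs = "\n".join(f"[^{i}]: {n}" for i, n in enumerate(notes, 1))
    return f"{text}\n\n{defs}"
```

Because the formatter is a standalone function over the content blocks, it could be swapped for another style without touching how the response text itself is extracted.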

#| hide

from anthropic.types import Model

Note that imports and non-import statements must not be mixed in a single cell. So this cell should be split in two.

Ah @austinvhuang this is actually originally your code I think.

Claude is capable of providing detailed citations when answering questions about documents, helping you track and verify information sources in responses.

Claude expects an documents to have the following structure

"a document"
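For reference, a plain-text document block with citations enabled is usually written as a dictionary like the one below. This is a sketch based on Anthropic's public Messages API documentation, not the output of the `mk_doc` function in this PR; the helper name `text_document` is hypothetical.

```python
def text_document(data, title=None, context=None, citations=True):
    "Build a plain-text document block in the Messages API shape (hypothetical helper)."
    doc = {
        "type": "document",
        "source": {"type": "text", "media_type": "text/plain", "data": data},
        "citations": {"enabled": citations},
    }
    # Title and context are optional, so only include them when given
    if title is not None:
        doc["title"] = title
    if context is not None:
        doc["context"] = context
    return doc
```

For example, `text_document("The grass is green.", title="Example")` yields a block whose `source` holds the raw text and whose `citations` field is `{"enabled": True}`.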

llms_ctx = xget("https://claudette.answer.ai/llms-ctx.txt").text[:2_048]
doc = mk_doc(llms_ctx, title="Claudette LLM Context", context="This is a trustworthy document.", citation=True)
doc

Let's trim this output - it's too long for good docs.

#| exports

Each exported cell should have an example of use with it, and also prose explaining it.

c = Chat(model)
q = "What is Claudette?"
r = c([doc, q])
r

This output is a bit too lengthy.

#| exports
def cite_msgs(msg) -> str:

This function name should probably have "markdown" or "md" somewhere in it -- it's specifically a function to create markdown output.

galopyz (Author) commented Feb 25, 2025

@jph00
I think we can provide a good default for handling citations. For instance, we can use markdown footnotes.
If users want something different, they can pass an option to Chat to choose inline citations, endnotes, or a custom format. For the custom option, they can provide a function that is applied to the response message content.

We can also add the citation option to Chat.__call__.

How does it sound? Am I providing too many options for users when they may not need them? Or should I let users take full control of the response message?
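One way to picture the option described in this comment is a small dispatcher that accepts either a style name or a user-supplied function. Everything here is a hypothetical sketch: `format_citations`, the style names, and the dict-based block shape are assumptions for illustration and not claudette's API.

```python
def inline(blocks):
    "Put each cited passage in parentheses right after the text it supports."
    out = []
    for b in blocks:
        out.append(b["text"])
        out += [f' ("{c["cited_text"]}")' for c in b.get("citations") or []]
    return "".join(out)

def endnote(blocks):
    "Number each citation in the text and list the cited passages at the end."
    out, notes = [], []
    for b in blocks:
        out.append(b["text"])
        for c in b.get("citations") or []:
            notes.append(c["cited_text"])
            out.append(f"[{len(notes)}]")
    tail = "".join(f"\n[{i}] {n}" for i, n in enumerate(notes, 1))
    return "".join(out) + ("\n" + tail if notes else "")

FORMATTERS = {"inline": inline, "endnote": endnote}

def format_citations(blocks, style="endnote"):
    "Apply a named style, or a custom callable that takes the list of blocks."
    if callable(style):
        return style(blocks)
    if style not in FORMATTERS:
        raise ValueError(f"Unknown citation style: {style!r}")
    return FORMATTERS[style](blocks)
```

With this shape, a default covers the common case, while passing a function gives users full control over the response message content.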

3 participants