Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Folders improvements post title #11467

Merged
merged 3 commits into from
Mar 19, 2025
Merged

Folders improvements post title #11467

merged 3 commits into from
Mar 19, 2025

Conversation

spolu
Copy link
Contributor

@spolu spolu commented Mar 19, 2025

Description

FIxes: https://github.com/dust-tt/tasks/issues/2420

In the context of folders manual uploads:

  • Stop injecting title:X tags since we have titles now
  • Stop injecting $title: X as prefix because it's not idempotent so if you re-edit your doc it piles up. It's a bit of a loss but it wasn't really principled. Also we we're not doing it in api/v1 so it's better aligned this way.
  • Remove references to name in the font-end which don't make any sense now that we have title.

I kept documentId at creation to be equal to title. We could generate an sId but it's not required so leaving as is.

Tests

Tested locally. Covered to some extent by tests.

Risk

Low

Deploy Plan

  • deploy front

@spolu spolu requested a review from philipperolet March 19, 2025 12:32
Copy link
Contributor

@philipperolet philipperolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot! Left a few coms

const titleInTags = nonNullTags
.find((t) => t.startsWith("title:"))
?.substring(6);
if (!titleInTags) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a will to enforce consistency between title field and title: tag, that's manifested in v1 e.g. here

This would mark a move to stop relying on the title: tag, and deprecate it. Good thing, IMO, but since this tag is ancient we might see a few bumps down the road

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could follow up by a card to remove dependency on title: tags completely in code (i.e. ignore it if set)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we're doing anything anymore with title: at this stage. In any case we always handled smoothly it's lack thereof. Note that the link you shared is not used by API/v1 and in in API/v1 we don't enforce any presence of the title tag.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the upsertTable in my link is used by v1 /csv I think; there's also this one

which is not enforcing indeed, just monitoring
My comment was not meant on changing code, but suggesting we could take the opportunity to clean up all references to title: in the code (about ~30 refs or so)

if (titleInTags && titleInTags !== title) {
logger.warn(
{ dataSourceId: dataSource.sId, documentId, titleInTags, title },
"Inconsistency between tags and title."
);
}

// Add selection of tags as prefix to the section if they are present.
let tagsPrefix = "";
["title", "author"].forEach((t) => {
Copy link
Contributor

@philipperolet philipperolet Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was under the impression that we send either a clean section or separated text / title here on edit?

So I'm not sure we duplicate those prefixes when we edit a document (via the modal at least). Good to stop relying on the title tag, but maybe still worth adding the title (not tag) as prefix and keeping the author?

Copy link
Contributor Author

@spolu spolu Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well it really feels broken if you reopen a document in the modal. So it's definitely not a long standing solution. I introduced it and I think it was a bad idea.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh ok you meant in the UX
I was just sad not to leverage balanced structure-preserving chunking on this one, could have an impact on relevance for large document upserts.
But not strongly held so approving

// For folders documents created from the app (this endpoint) we use the document title as
// ID. This is inherited behavior that is perfectly valid but we might want to move to
// generating IDs in the future.
document_id: title,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO the limit with doing that is that if I create a document that has the same title as another in the same datasource, it updates the first 1 instead of creating a new one or failing.

Pre-existing issue to this PR though, will have a look

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this is the current behavior

@philipperolet philipperolet self-requested a review March 19, 2025 14:13
Copy link
Contributor

@philipperolet philipperolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@spolu spolu merged commit 72b0c0f into main Mar 19, 2025
11 checks passed
@spolu spolu deleted the spolu-folders_improvments branch March 19, 2025 14:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants