New pic #4858

Open
wants to merge 15 commits into base: main

ctrlz526

Import images into the knowledge base

cla-assistant bot commented May 21, 2025

CLA assistant check
All committers have signed the CLA.

cla-assistant bot commented May 21, 2025

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ c121914yu
❌ ctrlz526
You have signed the CLA already but the status is still pending? Let us recheck it.

Preview mcp_server Image: ghcr.io/labring/fastgpt-pr:fatsgpt_mcp_server_45d5a4cb497f1fff09d13ae5f52fbe88e56e3d41

Preview sandbox Image: ghcr.io/labring/fastgpt-pr:fatsgpt_sandbox_45d5a4cb497f1fff09d13ae5f52fbe88e56e3d41

Preview mcp_server Image: ghcr.io/labring/fastgpt-pr:fatsgpt_mcp_server_59347f25e843055ef21c70578b6152900c056902

Preview sandbox Image: ghcr.io/labring/fastgpt-pr:fatsgpt_sandbox_59347f25e843055ef21c70578b6152900c056902

Preview mcp_server Image: ghcr.io/labring/fastgpt-pr:fatsgpt_mcp_server_75d0e7c0dcd72f65ab8fdb3cac6805b24bb206da

Preview sandbox Image: ghcr.io/labring/fastgpt-pr:fatsgpt_sandbox_75d0e7c0dcd72f65ab8fdb3cac6805b24bb206da

Preview mcp_server Image: ghcr.io/labring/fastgpt-pr:fatsgpt_mcp_server_aa7c498ba040190c4b75b4c651b0d7f9bb0b0f47

Preview sandbox Image: ghcr.io/labring/fastgpt-pr:fatsgpt_sandbox_aa7c498ba040190c4b75b4c651b0d7f9bb0b0f47

Preview fastgpt Image: ghcr.io/labring/fastgpt-pr:fatsgpt_aa7c498ba040190c4b75b4c651b0d7f9bb0b0f47

Preview mcp_server Image: ghcr.io/labring/fastgpt-pr:fatsgpt_mcp_server_0a930f59e4f2eaefe759038acb76d6f10f14bdd1

Preview sandbox Image: ghcr.io/labring/fastgpt-pr:fatsgpt_sandbox_0a930f59e4f2eaefe759038acb76d6f10f14bdd1

Preview fastgpt Image: ghcr.io/labring/fastgpt-pr:fatsgpt_0a930f59e4f2eaefe759038acb76d6f10f14bdd1

Preview mcp_server Image: ghcr.io/labring/fastgpt-pr:fatsgpt_mcp_server_b39c5f6e96ad40b9efddeb43c6ccdb0f85634d71

Preview sandbox Image: ghcr.io/labring/fastgpt-pr:fatsgpt_sandbox_b39c5f6e96ad40b9efddeb43c6ccdb0f85634d71

Preview fastgpt Image: ghcr.io/labring/fastgpt-pr:fatsgpt_b39c5f6e96ad40b9efddeb43c6ccdb0f85634d71

Preview mcp_server Image: ghcr.io/labring/fastgpt-pr:fatsgpt_mcp_server_52abcd91efd8719a507cea45406f3fc9d88cb3ba

Preview sandbox Image: ghcr.io/labring/fastgpt-pr:fatsgpt_sandbox_52abcd91efd8719a507cea45406f3fc9d88cb3ba

Collaborator

This does not need to be changed.

Preview mcp_server Image: ghcr.io/labring/fastgpt-pr:fatsgpt_mcp_server_87d0338b940df0c9f123f4c5f6fb02e7f49cf933

Preview sandbox Image: ghcr.io/labring/fastgpt-pr:fatsgpt_sandbox_87d0338b940df0c9f123f4c5f6fb02e7f49cf933

Deployment Status: ✅ Success
🔗 Preview URL: https://a1ac917b.fastgpt-8gr.pages.dev

Preview mcp_server Image: ghcr.io/labring/fastgpt-pr:fatsgpt_mcp_server_bc119431e548d818ca2500ea69d7368037dc7d4b

Deployment Status: ✅ Success
🔗 Preview URL: https://bc9a86f2.fastgpt-8gr.pages.dev

Preview sandbox Image: ghcr.io/labring/fastgpt-pr:fatsgpt_sandbox_bc119431e548d818ca2500ea69d7368037dc7d4b

Preview fastgpt Image: ghcr.io/labring/fastgpt-pr:fatsgpt_bc119431e548d818ca2500ea69d7368037dc7d4b

Preview sandbox Image: ghcr.io/labring/fastgpt-pr:fatsgpt_sandbox_c97435df9891de0cd9cd5f7d8b7186b0466e3596

Deployment Status: ✅ Success
🔗 Preview URL: https://5470a753.fastgpt-8gr.pages.dev

Preview mcp_server Image: ghcr.io/labring/fastgpt-pr:fatsgpt_mcp_server_c97435df9891de0cd9cd5f7d8b7186b0466e3596

Preview fastgpt Image: ghcr.io/labring/fastgpt-pr:fatsgpt_c97435df9891de0cd9cd5f7d8b7186b0466e3596

Preview mcp_server Image: ghcr.io/labring/fastgpt-pr:fatsgpt_mcp_server_30df67e2c46be45b9b41e2f5243bcd7ec9a25ee7

Preview sandbox Image: ghcr.io/labring/fastgpt-pr:fatsgpt_sandbox_30df67e2c46be45b9b41e2f5243bcd7ec9a25ee7

Deployment Status: ✅ Success
🔗 Preview URL: https://0a88f0d9.fastgpt-8gr.pages.dev

Preview mcp_server Image: ghcr.io/labring/fastgpt-pr:fatsgpt_mcp_server_8cb62531a56435a17182d4af7fe53eb609e7fb57

Preview sandbox Image: ghcr.io/labring/fastgpt-pr:fatsgpt_sandbox_8cb62531a56435a17182d4af7fe53eb609e7fb57

Deployment Status: ✅ Success
🔗 Preview URL: https://9b0bc6fd.fastgpt-8gr.pages.dev

Preview fastgpt Image: ghcr.io/labring/fastgpt-pr:fatsgpt_8cb62531a56435a17182d4af7fe53eb609e7fb57

@@ -11,7 +11,7 @@ export const fileImgs = [
// { suffix: '.', src: '/imgs/files/file.svg' }
];

export function getFileIcon(name = '', defaultImg = 'file/fill/file') {
Collaborator

Can this be changed?

@@ -81,6 +81,7 @@ export type ApiDatasetCreateDatasetCollectionParams = ApiCreateDatasetCollection
};
export type FileIdCreateDatasetCollectionParams = ApiCreateDatasetCollectionParams & {
fileId: string;
collectionId?: string;
Collaborator

Why does creating a collection need to carry a collectionId?

@@ -129,6 +130,7 @@ export type PushDatasetDataChunkProps = {
a?: string; // bonus content
chunkIndex?: number;
indexes?: Omit<DatasetDataIndexItemType, 'dataId'>[];
imageFileId?: string; //file id preview
Collaborator

Put this below the `a` field.

Collaborator

Check the other places as well: don't write q, a, and imageId separately.

@@ -0,0 +1,19 @@
export interface DatasetCollectionImageSchema {
Collaborator

The codebase uses `type` throughout; don't use `interface` unless it is necessary.

Collaborator c121914yu, May 27, 2025

dataset/image/type.d.ts
This table is an image library, not an image knowledge base.

@@ -0,0 +1,19 @@
export interface DatasetCollectionImageSchema {
Collaborator

Name this `DatasetImageSchema`.
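
Taken together, the three comments on this declaration (use `type`, place it in dataset/image/type.d.ts, name it `DatasetImageSchema`) might look roughly like the sketch below. The field list is entirely hypothetical, because the diff shows only the first line of the declaration; `DatasetImageWithPreview` and `exampleImage` are illustrative names as well.

```typescript
// Sketch only: every field below is hypothetical, since the PR excerpt shows
// just the first line of the declaration.
export type DatasetImageSchema = {
  _id: string;
  teamId: string;
  datasetId: string;
  collectionId?: string; // hypothetical: unset until the image is attached
  name: string;
};

// Type aliases compose through intersections, the same pattern the diff uses in
// `ApiCreateDatasetCollectionParams & { fileId: string }`.
export type DatasetImageWithPreview = DatasetImageSchema & {
  imagePreviewUrl?: string;
};

export const exampleImage: DatasetImageWithPreview = {
  _id: '507f1f77bcf86cd799439011',
  teamId: 'team-1',
  datasetId: 'dataset-1',
  name: 'diagram.png',
  imagePreviewUrl: '/preview/diagram.png'
};
```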

@@ -56,6 +59,7 @@ export const createCollectionAndInsertData = async ({

billId?: string;
session?: ClientSession;
parentCollectionId?: string;
Collaborator

What is the logic here?

@@ -156,7 +170,26 @@ export const createCollectionAndInsertData = async ({
return newBillId;
})();

// 5. insert to training queue
// 5. Update the collectionId field in the image record (for image collections)
if (createCollectionParams.metadata?.isImageCollection && createCollectionParams.fileId) {
Collaborator

As already said, isImageCollection must not be specified explicitly.

Collaborator

createImageCollection is a separate API; this function should not be changed at all.

trainingType: DatasetCollectionSchemaType['trainingType'];
autoIndexes?: DatasetCollectionSchemaType['autoIndexes'];
imageIndex?: DatasetCollectionSchemaType['imageIndex'];
export const getTrainingModeByCollection = ({
Collaborator

No change needed; trainingType already handles images.

@@ -372,3 +411,38 @@ export async function delCollection({
);
});
}

export async function pushImageFileToTrainingQueue({
Collaborator

These need to be added in bulk, not one at a time.

Collaborator

Rewrite this file.

  1. Make rawText optional. Alongside rawText, add an imageIdList field that holds the IDs of the inserted images.
  2. Fix step one: for `const chunks =`, if rawText is present, split it into chunks. If not, use the image IDs as the chunks. Change the chunk structure to: q, a, imageId
  3. Finally, delete the expired indexes in bulk
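
Step 2 of the list above could be sketched as follows. This is not the project's implementation: `buildChunks`, `BuildChunksInput`, `chunkSize`, and the fixed-size `splitTextIntoChunks` are hypothetical stand-ins; only `rawText`, `imageIdList`, and the `q`/`a`/`imageId` chunk shape come from the review comment.

```typescript
// Chunk shape requested in the review: q, a and imageId kept together.
type TrainingChunk = {
  q: string;
  a?: string;
  imageId?: string;
};

type BuildChunksInput = {
  rawText?: string; // optional, as requested in step 1
  imageIdList?: string[]; // IDs of the images being inserted
  chunkSize?: number; // hypothetical knob for the stand-in splitter below
};

// Stand-in splitter (fixed-size slices); the real project has its own splitter.
function splitTextIntoChunks(text: string, chunkSize: number): string[] {
  const parts: string[] = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    parts.push(text.slice(i, i + chunkSize));
  }
  return parts;
}

function buildChunks({
  rawText,
  imageIdList = [],
  chunkSize = 512
}: BuildChunksInput): TrainingChunk[] {
  if (rawText) {
    // Text path: split the raw text and wrap each piece as a chunk.
    return splitTextIntoChunks(rawText, chunkSize).map((q) => ({ q, a: '' }));
  }
  // Image path: one chunk per image ID, built in a single pass so the caller
  // can push the whole list to the queue in one batch.
  return imageIdList.map((imageId) => ({ q: '', a: '', imageId }));
}
```

Returning the full list from one call also fits the earlier comment about adding items in bulk rather than one at a time.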


/* ============= dataset images ========== */

export async function createDatasetImage({
Collaborator

Move this into dataset/image/controller.ts.

return String(image._id);
}

export async function getDatasetImage(imageId: string): Promise<any> {
Collaborator

Don't use `any`.


export async function getDatasetImage(imageId: string): Promise<any> {
try {
if (!imageId || imageId.length !== 24) {
Collaborator

`Types.ObjectId(id).isValid()` can be used for this validation.

Collaborator

Import `Types` from mongoose.
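
For context on why the length check in the diff is too weak: a 24-character string is not necessarily a valid ObjectId hex string. In Mongoose the validator is, as far as I know, the static call `Types.ObjectId.isValid(id)` rather than an instance method. The sketch below deliberately avoids importing mongoose so that it runs on its own; `isHexObjectId` is a hypothetical stand-in that accepts only the 24-hex-character form, and `passesLengthCheck` mirrors the check from the diff.

```typescript
// Stand-in for a proper ObjectId check: exactly 24 hexadecimal characters.
const OBJECT_ID_HEX = /^[0-9a-fA-F]{24}$/;

// The check from the diff: only the length is verified.
function passesLengthCheck(id: string): boolean {
  return !!id && id.length === 24;
}

// Stricter stand-in; the PR itself would use mongoose's validator instead.
function isHexObjectId(id: unknown): id is string {
  return typeof id === 'string' && OBJECT_ID_HEX.test(id);
}

// 24 characters long, but not hexadecimal.
const notHex = new Array(25).join('z');
// A well-formed 24-character hex string.
const wellFormed = '507f1f77bcf86cd799439011';

const lengthCheckAcceptsNotHex = passesLengthCheck(notHex);
const hexCheckAcceptsNotHex = isHexObjectId(notHex);
const hexCheckAcceptsWellFormed = isHexObjectId(wellFormed);
```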

@@ -145,3 +147,69 @@ try {
}

export const MongoDataset = getMongoModel<DatasetSchemaType>(DatasetCollectionName, DatasetSchema);

export const DatasetCollectionImageCollectionName = 'dataset_collection_images';
Collaborator

Put this in dataset/image/schema.ts.

Collaborator

There is already a component for this.

@@ -184,6 +184,8 @@
"comfirm_leave_page": "确认离开该页面?",
"comfirn_create": "确认创建",
"commercial_function_tip": "请升级商业版后使用该功能:https://doc.fastgpt.cn/docs/commercial/intro/",
"common.Create Failed": "创建集合失败",
Collaborator

Put this in `dataset`.

@@ -1,4 +1,5 @@
{
" error?.message || t('core.dataset.collection.Create Failed'),\n status.core.dataset.collection.Create Failed": "",
Collaborator

This entry is malformed data.

@@ -54,11 +54,14 @@ const QuoteList = React.memo(function QuoteList({
const processedData = rawSearch.map((item) => {
if (chatItemDataId && quoteList) {
const currentFilterItem = quoteList.find((res) => res._id === item.id);
return {
const result = {
Collaborator

No need to introduce a separate variable name.

// indexes: DatasetDataSchemaType['indexes'];
imageFileId?: string;
imageSize?: number;
previewUrl?: string;
Collaborator

Name this `imagePreviewUrl`.

{t('dataset:retain_collection')}
</Button>
)}
{datasetDetail.type !== 'websiteDataset' && !!collection?.chunkSize && (
Collaborator

Don't change the logic arbitrarily.

@@ -283,11 +327,79 @@ const DataCard = () => {

{/* Data content */}
<Box wordBreak={'break-all'} fontSize={'sm'}>
<Markdown source={item.q} isDisabled />
{!!item.a && (
{isImageCollection ? (
Collaborator

Don't check whether this is an imageCollection; decide directly from whether the data has an imagePreviewUrl. If it does, render in image mode.
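
The rule in this comment can be expressed as a small pure function. A minimal sketch, assuming a data item shaped like the `q`/`a` fields in the diff plus the `imagePreviewUrl` field suggested elsewhere in this review; `DataItem`, `RenderMode`, and `getRenderMode` are hypothetical names.

```typescript
type DataItem = {
  q: string;
  a?: string;
  imagePreviewUrl?: string;
};

type RenderMode = 'image' | 'text';

// The decision depends only on the item itself, not on collection metadata.
function getRenderMode(item: DataItem): RenderMode {
  return item.imagePreviewUrl ? 'image' : 'text';
}

const imageMode = getRenderMode({ q: 'a chart', imagePreviewUrl: '/preview/1.png' });
const textMode = getRenderMode({ q: 'plain text', a: 'answer' });
```

In the component, the existing `isImageCollection ? (...)` branch would then switch on this per-item result instead.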

alignSelf="stretch"
borderRadius="md"
overflow="hidden"
bg="var(--Gray-Modern-100, #F4F4F7)"
Collaborator

bg={'myGray.100'}

padding="8px 8px 10px 8px"
justifyContent="center"
alignItems="center"
alignSelf="stretch"
Collaborator

Just use the tag directly,

<Box
flex="1 0 0"
color="var(--Gray-Modern-800, #1D2532)"
fontFamily="PingFang SC"
Collaborator

No need to specify the font.
fontSize={"sm"}
lineHeight, letterSpacing, and maxHeight don't need to be specified.

Collaborator

In most cases you only need to specify padding, margin, fontSize, fontWeight, and color. The other styles are already global.

Collaborator

create/images

Collaborator

Is this useful?

@@ -230,6 +233,15 @@ export async function insertData2Dataset({
{ session, ordered: true }
);

// 4. Remove TTL from image if imageFileId exists (prevent image expiration during training)
Collaborator

insertData does not need to remove the TTL; it was already removed when the collection was inserted.
The TTL removal should be done in the insertData API endpoint instead.

await deleteDatasetDataVector({
teamId: data.teamId,
idList: data.indexes.map((item) => item.dataId)
});

// 3. If there are any image files, delete the image records.
if (data.imageFileId) {
Collaborator

Don't catch here; the error needs to be thrown. A transaction guarantees that everything succeeds, and if anything fails it has to roll back.
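
To illustrate the point about not catching inside a transaction, here is a toy in-memory sketch. It does not use MongoDB or the project's session helpers: `Store`, `withTransaction`, `deleteImage`, and both `deleteData...` functions are hypothetical. The toy transaction works on a copy and commits only when the callback does not throw, which is the property the comment relies on.

```typescript
type Store = { data: string[]; images: string[] };

// Toy transaction: work on a copy, commit only if the callback does not throw.
function withTransaction(store: Store, fn: (draft: Store) => void): boolean {
  const draft: Store = { data: store.data.slice(), images: store.images.slice() };
  try {
    fn(draft);
  } catch (_err) {
    return false; // rollback: the original store is left untouched
  }
  store.data = draft.data;
  store.images = draft.images;
  return true;
}

function deleteImage(draft: Store, imageId: string): void {
  if (draft.images.indexOf(imageId) === -1) {
    throw new Error('image ' + imageId + ' not found');
  }
  draft.images = draft.images.filter((id) => id !== imageId);
}

// Anti-pattern: the inner catch hides the failure, so a half-finished
// change is committed.
function deleteDataSwallowingErrors(store: Store, dataId: string, imageId: string): boolean {
  return withTransaction(store, (draft) => {
    draft.data = draft.data.filter((id) => id !== dataId);
    try {
      deleteImage(draft, imageId);
    } catch (_err) {
      // swallowed: the transaction never learns that a step failed
    }
  });
}

// Requested behaviour: let the error propagate so nothing is committed.
function deleteDataPropagatingErrors(store: Store, dataId: string, imageId: string): boolean {
  return withTransaction(store, (draft) => {
    draft.data = draft.data.filter((id) => id !== dataId);
    deleteImage(draft, imageId); // throws on failure, aborting the whole change
  });
}
```

With the swallowing variant the data record disappears even though the image deletion failed; with the propagating variant both stay in place and the caller can see that the operation failed.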


// Handle imageFileId update
if (imageFileId !== undefined) {
await MongoDatasetData.findByIdAndUpdate(dataId, {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

无效命令

Collaborator

The image in the original data also needs to be deleted.
