feat(chat-attachment): Upload PDF file as Chat Attachment #149

Open
wants to merge 43 commits into
base: main
43 commits
dd59f6e
Upload files from frontend to server
kalpadhwaryu Nov 18, 2024
20dc9c0
Merge branch 'main' of https://github.com/xynehq/xyne into pdf-upload
kalpadhwaryu Nov 18, 2024
cc5d49f
Fix selected files UI
kalpadhwaryu Nov 18, 2024
c83dcd6
Fix Staged Files UI
kalpadhwaryu Nov 19, 2024
d8fe300
Add File Type in Staged UI
kalpadhwaryu Nov 19, 2024
25490e0
Ingest uploaded PDF in vespa
kalpadhwaryu Nov 19, 2024
4253d54
Add uploaded file to downloads folder and delete them when done
kalpadhwaryu Nov 20, 2024
3967d39
Add uploaded file metadata in postgres
kalpadhwaryu Nov 20, 2024
817802b
Fix attachments metadata & send it also as sources
kalpadhwaryu Nov 21, 2024
48c9542
Use Toasts for errors
kalpadhwaryu Nov 21, 2024
5686951
Add UI for file upload in messages
kalpadhwaryu Nov 21, 2024
eea9cbe
Add new chatAttachment schema & insert accordingly
kalpadhwaryu Nov 25, 2024
ce11bb1
Add tanstack router context & use state from it
kalpadhwaryu Nov 26, 2024
530edc1
Use for..of instead of forEach
kalpadhwaryu Nov 26, 2024
066e368
Add chatAttachment context
kalpadhwaryu Nov 26, 2024
1e02727
Small Fix and Comments
kalpadhwaryu Nov 26, 2024
5adbd85
Add attachments also to message as attachments
kalpadhwaryu Nov 26, 2024
3d53b21
Merge branch 'main' of https://github.com/xynehq/xyne into pdf-upload
kalpadhwaryu Nov 26, 2024
56ab70d
Updated chatAttachment schema
kalpadhwaryu Nov 26, 2024
2e5540e
New chatAttachment schema
kalpadhwaryu Nov 26, 2024
744ba98
Remove permissions from schema and add chatId, messageId to chat Atta…
kalpadhwaryu Nov 27, 2024
f65c5f5
Add attachments metadata to only message, fix docId and title
kalpadhwaryu Nov 27, 2024
853347f
Add a loader to indicate file upload
kalpadhwaryu Nov 28, 2024
8359f7c
Add query, setQuery as global state
kalpadhwaryu Nov 28, 2024
5865789
Disable sending if streaming is on
kalpadhwaryu Nov 28, 2024
e1870aa
Add searchVespaWithChatAttach fn
kalpadhwaryu Nov 29, 2024
77c6cd0
Fix getName & getIcon for showing attachments
kalpadhwaryu Nov 29, 2024
debea4e
Add hasAttachments flag to decide search function to use
kalpadhwaryu Nov 29, 2024
a8feffc
Improve chat update logic
kalpadhwaryu Nov 29, 2024
858e669
Fix types and remove unnecessary code
kalpadhwaryu Nov 29, 2024
f708394
Fix types for searchToCitation fn
kalpadhwaryu Nov 29, 2024
308c5a1
Fix zValidator for Upload api & add todo
kalpadhwaryu Nov 29, 2024
8217a93
Merge branch 'main' of https://github.com/xynehq/xyne into pdf-upload
kalpadhwaryu Nov 29, 2024
e94012e
Change Models & Remove unnecessary code
kalpadhwaryu Dec 2, 2024
f9bd178
Use MessageApiV2
kalpadhwaryu Dec 3, 2024
2387083
Restore MessageApi
kalpadhwaryu Dec 3, 2024
2e5a687
Merge branch 'main' of https://github.com/xynehq/xyne into pdf-upload
kalpadhwaryu Dec 3, 2024
51b8a9f
Merge main branch
kalpadhwaryu Jan 19, 2025
9a299f9
Merge branch 'main' of https://github.com/xynehq/xyne into pdf-upload
kalpadhwaryu Jan 19, 2025
ecfc704
Remove unnecessary code
kalpadhwaryu Jan 19, 2025
a040cff
Fix types
kalpadhwaryu Jan 20, 2025
fc1978b
Merge branch 'main' of https://github.com/xynehq/xyne into pdf-upload
kalpadhwaryu Jan 20, 2025
2db5863
Merge branch 'main' of https://github.com/xynehq/xyne into pdf-upload
kalpadhwaryu Jan 21, 2025
Fix types and remove unnecessary code
kalpadhwaryu committed Nov 29, 2024
commit 858e669a48c90da06799dbbd6ea8883fd7a9372a
50 changes: 26 additions & 24 deletions frontend/src/StateProvider.tsx
@@ -7,7 +7,7 @@ export const StateContext = createContext<{
stagedFiles: File[]
setStagedFiles: React.Dispatch<React.SetStateAction<File[]>>
handleFileRemove: (index: number) => void
-handleFileSelection: (file: File) => void
+handleFileSelection: (event: React.ChangeEvent<HTMLInputElement>) => void
loading: boolean
setLoading: React.Dispatch<React.SetStateAction<boolean>>
query: string
@@ -23,31 +23,33 @@ export const StateContextProvider: React.FC<{ children: React.ReactNode }> = ({
const [query, setQuery] = useState<string>("")
const { toast } = useToast()

-const handleFileSelection = (event) => {
-const files = Array.from(event.target!.files) as File[]
+const handleFileSelection = (event: React.ChangeEvent<HTMLInputElement>) => {
+const files = event?.target?.files
const validFiles: File[] = [] // Array to hold files that pass validation

-files.forEach((file: File) => {
-// File size check: 20 MB limit
-const fileSizeInMB = file.size / (1024 * 1024)
-if (fileSizeInMB > 20) {
-toast({
-title: `File Too Large`,
-description: `The file "${file.name}" exceeds the 20MB size limit. Please choose a smaller file.`,
-variant: "destructive",
-})
-} else if (!isSupportedFileType(file.type)) {
-// Check for unsupported file types
-toast({
-title: "File Type not supported",
-description: `The file "${file.name}" is of type "${file.type}", which is not supported. Please upload a valid file type.`,
-variant: "destructive",
-})
-} else {
-// If valid, add the file to the validFiles array
-validFiles.push(file)
-}
-})
+if (files) {
+Array.from(files).forEach((file) => {
+// File size check: 20 MB limit
+const fileSizeInMB = file.size / (1024 * 1024)
+if (fileSizeInMB > 20) {
+toast({
+title: `File Too Large`,
+description: `The file "${file.name}" exceeds the 20MB size limit. Please choose a smaller file.`,
+variant: "destructive",
+})
+} else if (!isSupportedFileType(file.type)) {
+// Check for unsupported file types
+toast({
+title: "File Type not supported",
+description: `The file "${file.name}" is of type "${file.type}", which is not supported. Please upload a valid file type.`,
+variant: "destructive",
+})
+} else {
+// If valid, add the file to the validFiles array
+validFiles.push(file)
+}
+})
+}

setStagedFiles((prev) => {
if (prev.length + validFiles.length > 5) {
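The validation in `handleFileSelection` above can also be read as a pure function that separates checking from toasting. The following is a minimal sketch under stated assumptions, not the PR's code: `FileLike`, `ValidationResult`, and `validateFiles` are hypothetical names, and the body of `isSupportedFileType` is a guess (PDF only), since its real implementation is not part of this diff.

```typescript
// Hypothetical, simplified stand-in for the browser File object.
interface FileLike {
  name: string
  size: number // size in bytes
  type: string // MIME type
}

interface ValidationResult {
  valid: FileLike[]
  errors: string[]
}

const MAX_FILE_SIZE_MB = 20 // limit taken from the diff above

// Assumption: only PDFs are accepted. The real isSupportedFileType is not shown in this diff.
const isSupportedFileType = (mimeType: string): boolean =>
  mimeType === "application/pdf"

// Accepts null so a caller can pass event.target.files directly.
const validateFiles = (files: ArrayLike<FileLike> | null): ValidationResult => {
  const result: ValidationResult = { valid: [], errors: [] }
  if (!files) return result

  for (const file of Array.from(files)) {
    const sizeInMB = file.size / (1024 * 1024)
    if (sizeInMB > MAX_FILE_SIZE_MB) {
      result.errors.push(
        `The file "${file.name}" exceeds the ${MAX_FILE_SIZE_MB}MB size limit.`,
      )
    } else if (!isSupportedFileType(file.type)) {
      result.errors.push(
        `The file "${file.name}" is of type "${file.type}", which is not supported.`,
      )
    } else {
      result.valid.push(file)
    }
  }
  return result
}
```

A caller could then show one toast per entry in `errors` and stage only `valid`, which keeps the size and type rules testable without a DOM.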
2 changes: 1 addition & 1 deletion frontend/src/components/ChatBox.tsx
@@ -14,7 +14,7 @@ interface ChatBoxProps {
handleSend: (messageToSend: string) => void
stagedFiles: File[]
handleFileRemove: (index: number) => void
-handleFileSelection: (event: any) => void // todo fix any
+handleFileSelection: (event: React.ChangeEvent<HTMLInputElement>) => void
loading: boolean
isStreaming?: boolean
}
13 changes: 4 additions & 9 deletions frontend/src/routes/_authenticated/chat.tsx
@@ -10,7 +10,7 @@ import {
} from "@tanstack/react-router"
import { Bookmark, Copy, Ellipsis, Eye, EyeOff, File } from "lucide-react"
import { useEffect, useRef, useState } from "react"
-import { ChatSSEvents, SelectPublicMessage, Citation } from "shared/types"
+import { ChatSSEvents, SelectPublicMessage, Citation, AttachmentMetadata } from "shared/types"
import AssistantLogo from "@/assets/assistant-logo.svg"
import Retry from "@/assets/retry.svg"
import { PublicUser, PublicWorkspace } from "shared/types"
@@ -154,9 +154,6 @@ export const ChatPage = ({ user, workspace }: ChatPageProps) => {
setLoading(true)
uploadedFilesMetadata = await handleFileUpload()
setStagedFiles([])
-if (uploadedFilesMetadata.length !== 0) {
-console.log(`handleFileUpload ran sucessfully...`)
-}
setLoading(false)
}

@@ -175,7 +172,6 @@ export const ChatPage = ({ user, workspace }: ChatPageProps) => {
setCurrentResp({ resp: "" })
currentRespRef.current = { resp: "", sources: [] }

-console.log(`Message Create API called...`)
const url = new URL(`/api/v1/message/create`, window.location.origin)
if (chatId) {
url.searchParams.append("chatId", chatId)
@@ -520,7 +516,6 @@ export const ChatPage = ({ user, workspace }: ChatPageProps) => {
{currentResp && (
<ChatMessage
message={currentResp.resp}
-attachments={currentResp.attachments}
citations={currentResp.sources?.map((c: Citation) => c.url)}
isUser={false}
responseDone={false}
@@ -642,7 +637,7 @@ const ChatMessage = ({
sourcesVisible,
}: {
message: string
-attachments: any
+attachments?: AttachmentMetadata[]
isUser: boolean
responseDone: boolean
isRetrying?: boolean
@@ -685,10 +680,10 @@ const ChatMessage = ({
>
{isUser ? (
<>
-{attachments?.length > 0 && (
+{attachments && attachments?.length > 0 && (
<div className="flex-col w-full">
<ul className="flex flex-col space-y-2 pb-2">
-{attachments.map((attachment, index) => (
+{attachments?.map((attachment, index) => (
<li
key={index}
className="flex items-center p-2 border rounded border-gray-300 min-w-[200px] max-w-[300px]"
6 changes: 3 additions & 3 deletions server/ai/context.ts
@@ -6,6 +6,7 @@ import {
mailSchema,
userSchema,
VespaSearchResultsSchema,
+type VespaChatAttachmentSearch,
type VespaEventSearch,
type VespaFileSearch,
type VespaMailSearch,
@@ -39,15 +40,14 @@ ${fields.chunks_summary && fields.chunks_summary.length ? `Content: ${fields.chu
}

const constructChatAttachmentContext = (
-fields: VespaFileSearch,
+fields: VespaChatAttachmentSearch,
relevance: number,
): string => {
return `Title: ${fields.title ? `Title: ${fields.title}` : ""}
Created: ${getRelativeTime(fields.createdAt)}
Updated At: ${getRelativeTime(fields.updatedAt)}
${fields.ownerEmail ? `Owner Email: ${fields.ownerEmail}` : ""}
${fields.mimeType ? `Mime Type: ${fields.mimeType}` : ""}
-${fields.permissions ? `Permissions: ${fields.permissions.join(", ")}` : ""}
${fields.chunks_summary && fields.chunks_summary.length ? `Content: ${fields.chunks_summary.slice(0, maxSummaryChunks).join("\n")}` : ""}
\nvespa relevance score: ${relevance}\n`
}
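The template literal in `constructChatAttachmentContext` emits an empty line for every missing field. One alternative is to collect only the lines that have a value and join them afterwards. This is a sketch under assumptions rather than the PR's implementation: `AttachmentFields` is a hypothetical, trimmed-down stand-in for `VespaChatAttachmentSearch`, `buildAttachmentContext` is an illustrative name, and the value of `maxSummaryChunks` is assumed because the real constant is defined outside this diff.

```typescript
// Hypothetical, trimmed-down shape; the real VespaChatAttachmentSearch has more fields.
interface AttachmentFields {
  title?: string
  ownerEmail?: string
  mimeType?: string
  chunks_summary?: string[]
}

const maxSummaryChunks = 3 // assumed value; the real constant lives elsewhere in context.ts

const buildAttachmentContext = (
  fields: AttachmentFields,
  relevance: number,
): string => {
  const lines: string[] = []

  // Each label is pushed once, and only when the field has a value.
  if (fields.title) lines.push(`Title: ${fields.title}`)
  if (fields.ownerEmail) lines.push(`Owner Email: ${fields.ownerEmail}`)
  if (fields.mimeType) lines.push(`Mime Type: ${fields.mimeType}`)
  if (fields.chunks_summary?.length) {
    const summary = fields.chunks_summary.slice(0, maxSummaryChunks).join("\n")
    lines.push(`Content: ${summary}`)
  }

  lines.push(`vespa relevance score: ${relevance}`)
  return lines.join("\n")
}
```

Building the string from an array keeps each label next to its own condition, so a field that is dropped from the schema (as `permissions` is here) can be removed in one place.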
@@ -390,4 +390,4 @@ export const replaceLinks = (text: string): string => {
return match
}
})
-}
+}
15 changes: 10 additions & 5 deletions server/api/chat.ts
@@ -278,13 +278,12 @@ const handlePDFFile = async (file: Blob, userEmail: string) => {

export const UploadFilesApi = async (c: Context) => {
try {
-//todo schema
const { sub } = c.get(JwtPayloadKey)
const email = sub

const formData = await c.req.formData()
const files = formData.getAll("files") as File[]
-const metadata: any[] = []
+const metadata: AttachmentMetadata[] = []

for (const file of files) {
// Parse file according to its type
@@ -301,7 +300,6 @@ export const UploadFilesApi = async (c: Context) => {
}
}

-Logger.info(`Returning metadata`)
return c.json({ attachmentsMetadata: metadata })
} catch (error) {
const errMsg = getErrorMessage(error)
@@ -372,6 +370,15 @@ const MinimalCitationSchema = z.object({

export type Citation = z.infer<typeof MinimalCitationSchema>

+const AttachmentMetadataSchema = z.object({
+docId: z.string(),
+fileName: z.string(),
+fileSize: z.number(),
+fileType: z.string(),
+})
+
+export type AttachmentMetadata = z.infer<typeof AttachmentMetadataSchema>
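The `AttachmentMetadata` type is inferred from a zod schema, but this commit does not show the schema being used to check data at runtime. To illustrate what a runtime check of the same four fields could look like, here is a dependency-free type guard. It is a sketch only: `isAttachmentMetadata` is a hypothetical helper that does not appear in the PR, the interface is restated by hand so the block stands alone, and the guard is not claimed to match zod's parsing rules exactly.

```typescript
// Restated by hand so the sketch is self-contained; in the PR this type comes from z.infer.
interface AttachmentMetadata {
  docId: string
  fileName: string
  fileSize: number
  fileType: string
}

// Hypothetical helper: narrows an unknown value (for example a parsed JSON
// response body) to AttachmentMetadata by checking each field's runtime type.
const isAttachmentMetadata = (value: unknown): value is AttachmentMetadata => {
  if (typeof value !== "object" || value === null) return false
  const v = value as Record<string, unknown>
  return (
    typeof v.docId === "string" &&
    typeof v.fileName === "string" &&
    typeof v.fileSize === "number" &&
    !Number.isNaN(v.fileSize) &&
    typeof v.fileType === "string"
  )
}
```

On the frontend, such a guard could filter the `attachmentsMetadata` array returned by the upload endpoint before it is stored in state.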

interface CitationResponse {
answer?: string
citations?: number[]
@@ -411,7 +418,6 @@ const searchToCitation = (
})
} else if (result.fields.sddocname === chatAttachmentSchema) {
citations.push({
-// todo
title: fields?.title,
sddocname: fields?.sddocname,
mimeType: fields?.mimeType,
@@ -915,7 +921,6 @@ export const MessageApiV2 = async (c: Context) => {
// if the value exists then we send the error to the frontend via it
let stream: any
try {
-Logger.info(`MessageApiV2 called....`)
const { sub, workspaceId } = c.get(JwtPayloadKey)
const email = sub
// @ts-ignore
4 changes: 0 additions & 4 deletions server/api/search.ts
@@ -163,10 +163,6 @@ export const SearchApi = async (c: Context) => {
)
}

-console.log("\nresults")
-console.log(results.root.children)
-console.log("results\n")

// TODO: deduplicate for google admin and contacts
const newResults = VespaSearchResponseToSearchResult(results)
newResults.groupCount = groupCount
7 changes: 6 additions & 1 deletion server/search/types.ts
@@ -274,7 +274,11 @@ export const VespaChatAttachmentSearchSchema = VespaChatAttachmentSchema.extend(
{
sddocname: z.literal("chatAttachment"),
},
-).merge(defaultVespaFieldsSchema)
+)
+.merge(defaultVespaFieldsSchema)
+.extend({
+chunks_summary: z.array(z.string()).optional(),
+})

export const VespaMailGetSchema = VespaMailSchema.merge(
defaultVespaFieldsSchema,
@@ -427,6 +431,7 @@ export type VespaSearchResponse = z.infer<typeof VespaSearchResponseSchema>

export type VespaFileGet = z.infer<typeof VespaFileGetSchema>
export type VespaFileSearch = z.infer<typeof VespaFileSearchSchema>
+export type VespaChatAttachmentSearch = z.infer<typeof VespaChatAttachmentSearchSchema>
export type VespaMailSearch = z.infer<typeof VespaMailSearchSchema>
export type VespaEventSearch = z.infer<typeof VespaEventSearchSchema>
export type VespaFile = z.infer<typeof VespaFileSchema>
2 changes: 1 addition & 1 deletion server/shared/types.ts
@@ -25,7 +25,7 @@ import { z } from "zod"
// @ts-ignore
export type { MessageReqType } from "@/api/search"
// @ts-ignore
-export type { Citation } from "@/api/chat"
+export type { Citation, AttachmentMetadata } from "@/api/chat"
export type {
SelectPublicMessage,
PublicUser,