
Error viewing uploaded file #48

Open
kyleavery opened this issue Mar 27, 2024 · 5 comments

@kyleavery

I set up a fresh instance of Nemesis and uploaded a PDF file. Accessing the file by UUID or browsing to /Files returns the following error message. I've completely rebuilt Nemesis (minikube delete...), but the result is the same.

```
Error retrieving file_data_enriched from the database: (psycopg2.errors.UndefinedTable) relation "file_data_enriched" does not exist
LINE 25: FROM file_data_enriched
              ^

[SQL: SELECT COUNT(*) FROM (SELECT file_data_enriched.project_id as project_id, file_data_enriched.source as source,
file_data_enriched.timestamp as "timestamp", file_data_enriched.unique_db_id::varchar, file_data_enriched.agent_id as agent_id,
file_data_enriched.object_id::varchar as object_id, file_data_enriched.path as path, file_data_enriched.name as name,
file_data_enriched.size as size, file_data_enriched.md5 as md5, file_data_enriched.sha1 as sha1, file_data_enriched.sha256 as sha256,
file_data_enriched.nemesis_file_type as nemesis_file_type, file_data_enriched.magic_type as magic_type,
file_data_enriched.converted_pdf_id::varchar as converted_pdf_id, file_data_enriched.extracted_plaintext_id::varchar as extracted_plaintext_id,
file_data_enriched.extracted_source_id::varchar as extracted_source_id, file_data_enriched.tags as tags,
file_data_enriched.originating_object_id as originating_object_id, triage.value as triage,
triage.unique_db_id as triage_unique_db_id, notes.value as notes
FROM file_data_enriched
LEFT JOIN triage ON file_data_enriched.unique_db_id = triage.unique_db_id
LEFT JOIN notes ON file_data_enriched.unique_db_id = notes.unique_db_id AND notes.value ILIKE %(notes)s
WHERE source ILIKE %(source)s AND project_id ILIKE %(project_id)s AND "timestamp" >= %(startdate)s AND "timestamp" <= %(enddate)s
AND (triage.value IS NULL OR triage.value = 'unknown')
AND originating_object_id = '00000000-0000-0000-0000-000000000000'
ORDER BY "timestamp" DESC) AS s]

[parameters: {'notes': '%%', 'source': '%', 'project_id': '%', 'startdate': datetime.datetime(2023, 12, 28, 0, 0), 'enddate': datetime.datetime(2024, 3, 28, 21, 34)}]
(Background on this error at: https://sqlalche.me/e/20/f405)
```
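As an aside, psycopg2's `UndefinedTable` (SQLSTATE 42P01) simply means the relation was never created in the database being queried. The kind of existence check a debugger would run first can be sketched as follows; this is illustrative only, not Nemesis code, and uses `sqlite3` as a stand-in so it runs without a live Postgres (in Postgres you would query `information_schema.tables` or run `\dt` instead):

```python
import sqlite3

def table_exists(conn, name):
    """Return True if a table with the given name exists in this database."""
    # sqlite3 stand-in: sqlite_master plays the role of Postgres's
    # information_schema.tables here.
    row = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None

# A fresh database whose schema was never initialized, mirroring the
# reported failure: the table is simply absent.
conn = sqlite3.connect(":memory:")
print(table_exists(conn, "file_data_enriched"))  # False
```

Any query against the table on such a connection fails the same way the dashboard does, so the check localizes the problem to schema initialization rather than the query itself.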
@kyleavery (Author)

Update: I tried uploading a simple .txt file and the result was the same. The issue doesn't seem to be related to PDF files specifically, or to the size of my original file.

@leechristensen (Collaborator)

Odd, it sounds like the Postgres schema isn't even initialized for some reason. Some questions:

  1. What OS are you using?
  2. What specs did you give your k8s cluster? We recommend 3 CPUs and at least 12GB RAM.
  3. Does your k8s host OS actually have the resources in question 2?
  4. If you navigate to http://<NEMESIS>/hasura/console/data/sql and run `select * from file_data_enriched;`, does it succeed?

If your cluster is still running, you could try redeploying postgres and then repeat the query in question 4 above to confirm that the table is being created. The steps to redeploy postgres are as follows:

```shell
kubectl delete deployment postgres
sleep 5  # wait for the deployment to delete
kubectl apply -f kubernetes/postgres/deployment.yaml
```

@kyleavery (Author)

> What OS are you using?

Debian 11

> What specs did you give your k8s cluster?

Minikube has the default 3 CPUs and 12GB memory. Confirmed with `minikube config view vm-driver`.

> Does your k8s host OS actually have the resources in question 2?

The host has 4 CPUs and 16GB memory.

> If you navigate to http://<NEMESIS>/hasura/console/data/sql and run `select * from file_data_enriched;`, does it succeed?

No. I ran the query and this was the result:

```
SQL Execution Failed
relation "file_data_enriched" does not exist
```

```json
{
    "arguments": [],
    "error": {
        "description": null,
        "exec_status": "FatalError",
        "hint": null,
        "message": "relation \"file_data_enriched\" does not exist",
        "status_code": "42P01"
    },
    "prepared": false,
    "statement": "select * from file_data_enriched;"
}
```

> If your cluster is still running, you could retry redeploying postgres and then repeating the query in question 4...

The result was the same. I also tried deleting the persistent volume, but that did not change anything for me.

@kyleavery (Author)

I haven't had time to pinpoint the exact setting, but the issue seems to be caused by the changes I made to my config file. I'm guessing it is the username or password for one of the services.

@kyleavery (Author)

It was the `postgres_user` variable. I can change everything else, including `postgres_password`, and it works fine.
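This failure mode is consistent with the schema-initialization step and the application using different identities, so the init appears to succeed while the app still sees a missing relation. A minimal illustration of that mismatch, using `sqlite3` as a self-contained stand-in (two separate databases play the role of two mismatched Postgres users/databases; this is not the actual Nemesis setup):

```python
import sqlite3

# Where the init script actually ran: the table is created successfully here.
init_conn = sqlite3.connect(":memory:")
init_conn.execute("CREATE TABLE file_data_enriched (object_id TEXT)")

# A *different* database, standing in for what the app reaches when the
# configured user doesn't match the one the schema was initialized for.
app_conn = sqlite3.connect(":memory:")
try:
    app_conn.execute("SELECT * FROM file_data_enriched")
except sqlite3.OperationalError as exc:
    print(exc)  # no such table: file_data_enriched
```

The init side reports no error, which is why the problem only surfaces later when the dashboard queries the table.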
