Remove base64 asset data when indexing wysiwyg fields #288

Open: wants to merge 3 commits into base 2.x

BlackbitDevs commented Feb 20, 2025

A WYSIWYG field in a data object can contain HTML such as:

<p>Some text</p>
<p><img src="blob:https://www.igp-powder.com/80000760-05b3-4ee8-aad8-2a263cd65829" alt=""></p>
<p>Some other text</p>

In the corresponding database table, the image is stored base64-encoded:

<p>Some text</p>
<p><img src="..."
<p>Some other text</p>

In our case the original image was very large, so the base64 string exceeded 50 MB.

When an object with this data is indexed, OpenSearch returns an error:

symfony-messenger    14:11:47 WARNING   [pimcore.opensearch] Request Failure: ["method" => "POST","uri" => "https://opensearch-1.opensearch:9200/_bulk?refresh=wait_for","port" => 9200,"headers" => ["Host" => ["opensearch-1.opensearch"],"Content-Type" => ["application/json"],"Accept" => ["application/json"],"User-Agent" => ["opensearch-php/2.3.1 (Linux 6.1.124; PHP 8.2.27)"]],"HTTP code" => 413,"duration" => 0.001124,"error" => "Unknown 413 error from OpenSearch """]

HTTP 413 means "Content Too Large": the request body is larger than the server is willing to accept.

Since the generic data index is used for searching, I see no benefit in storing base64 data there. This PR therefore removes the base64 data from the WYSIWYG HTML before it is indexed.
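To illustrate the idea, here is a minimal sketch in Python. It is not the code from this PR (the bundle itself is PHP), and the helper name `strip_base64_sources` is hypothetical. It assumes the stored markup uses quoted `src` attributes that hold `data:...;base64,...` URIs, and it empties those attributes while leaving the rest of the HTML untouched.

```python
import re

# Hypothetical helper for illustration only; not the bundle's implementation.
# Matches a quoted src attribute whose value is a base64 data URI, e.g.
#   src="data:image/png;base64,iVBORw0KGgo="
# Group 1 is `src=` (with optional whitespace), group 2 is the quote character.
_DATA_URI_SRC = re.compile(
    r"""(\bsrc\s*=\s*)(["'])data:[^"']*?;base64,[^"']*\2""",
    re.IGNORECASE,
)


def strip_base64_sources(html: str) -> str:
    """Replace base64 data URIs in src attributes with an empty value."""
    # Keep `src=` and both quotes, drop everything between them.
    return _DATA_URI_SRC.sub(r"\1\2\2", html)


if __name__ == "__main__":
    # The payload below is a short stand-in; the real one was over 50 MB.
    stored = (
        "<p>Some text</p>"
        '<p><img src="data:image/png;base64,iVBORw0KGgo=" alt=""></p>'
        "<p>Some other text</p>"
    )
    print(strip_base64_sources(stored))
    # <p>Some text</p><p><img src="" alt=""></p><p>Some other text</p>
```

Emptying the attribute instead of dropping the whole `<img>` tag keeps the surrounding markup and the `alt` text searchable. A regex is enough for this narrow pattern; an HTML parser would be the more robust choice if the markup can be malformed or use unquoted attributes.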

@BlackbitDevs
Author

Are the failing tests caused by this PR?

@herbertroth
Member

@BlackbitDevs Could you please rebase this PR onto the 2.x branch? There won’t be a 1.x release. Thanks!

@herbertroth herbertroth deleted the branch pimcore:2.x April 7, 2025 13:22
@herbertroth herbertroth closed this Apr 7, 2025
@herbertroth herbertroth reopened this Apr 7, 2025

sonarqubecloud bot commented Apr 7, 2025

@herbertroth herbertroth changed the base branch from 1.x to 2.x April 7, 2025 14:06
@BlackbitDevs
Author

BlackbitDevs commented Apr 7, 2025

I think it worked because you changed the base branch of this PR. If there is anything left for me to do, please let me know.

2 participants