Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(file_upload): validate mimetype as configured #1459

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

qvalentin
Copy link
Contributor

@qvalentin qvalentin commented Oct 22, 2024

@qvalentin qvalentin force-pushed the fix/mimetype-validation-file-upload branch from 08f1f68 to 2e4689b Compare October 22, 2024 09:17
@qvalentin qvalentin force-pushed the fix/mimetype-validation-file-upload branch from 2e4689b to f094ebc Compare October 22, 2024 09:27
@qvalentin qvalentin marked this pull request as ready for review October 22, 2024 14:37
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. backend Pertains to the Python backend. labels Oct 22, 2024
@dokterbob dokterbob added evaluate-with-priority What's needed to address this one? unit-tests Has unit tests. security labels Oct 31, 2024
Copy link
Collaborator

@dokterbob dokterbob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests look very clean. Thanks this already great as it is, but I would prefer some slight changes to validation architecture -- separating HTTP logic from validation logic. Important as we'll add more such functionality going forward, good to have the API clean now.

file_response = await session.persist_file(
name=file.filename, content=content, mime=file.content_type
)

return JSONResponse(content=file_response)


def validate_file_upload(file: UploadFile):
if config.features.spontaneous_file_upload is None:
return # TODO: if it is not configured what should happen?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The most elegant solution is to change the default value for config.features.spontaneous_file_upload to always be populated with SpontaneousFileUploadFeature then to have enabled default to False and move max_files and max_size_mb default values from the frontend to the backend config. This would remove the if ... is None in favour of more explicit code. But I get it if perhaps that's out of scope and left for a later PR.

The second best would be to follow the frontend in this and not allow uploads unless explicitly enabled. Definitely, we should try to avoid having a TODO in a PR. ;)

file_response = await session.persist_file(
name=file.filename, content=content, mime=file.content_type
)

return JSONResponse(content=file_response)


def validate_file_upload(file: UploadFile):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to have validation functions raise a well-defined Exception and to have a docstring specifying it's behaviour.

For example, in pydantic this is a ValueError and in Django it's a ValidationError. It should definitely not be a HTTPException as the file validation code should not have anything to do with HTTP, it's just validating an UploadFile.

We can catch validation errors and then re-raise them in a HTTP request handler (upload_file).


def validate_file_mime_type(file: UploadFile):
if config.features.spontaneous_file_upload.accept is None:
return
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add an explicit comment here that we'll allow any mime type unless a value is defined.

validate_file_size(file)


def validate_file_mime_type(file: UploadFile):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstring, please! :)

for pattern in accept:
if fnmatch.fnmatch(file.content_type, pattern):
return
else:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's either a list or a dict. Maybe add an assertion here so that assumption is clear (and the code will bulk if we ever haphazardly decide to change the typing).

@@ -1,19 +1,24 @@
import datetime # Added import for datetime
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's a good time to remove this redundant comment. ;)

@dokterbob dokterbob added the blocked Awaiting update or feedback from user after initial review/comments. label Nov 6, 2024
@qvalentin
Copy link
Contributor Author

Tests look very clean. Thanks this already great as it is, but I would prefer some slight changes to validation architecture -- separating HTTP logic from validation logic. Important as we'll add more such functionality going forward, good to have the API clean now.

Where would you put the validation logic?
It could be a method of SpontaneousFileUploadFeature or just some function somewhere else?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend Pertains to the Python backend. blocked Awaiting update or feedback from user after initial review/comments. evaluate-with-priority What's needed to address this one? security size:L This PR changes 100-499 lines, ignoring generated files. unit-tests Has unit tests.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants