-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
feat(file-format-params): File format parameters for encoding #17035
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
48c51c8 to
adf1d42
Compare
Codecov Report❌ Patch coverage is
❌ Your patch check has failed because the patch coverage (98.06%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
37413dd to
c303d8c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
|
Let's try to include this in 5.15. I forgot to review this, so most of the delay is on my side... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 24 out of 24 changed files in this pull request and generated 3 comments.
|
I broke the migrations once again :-). Fixed that now. |
|
I also just broke something as well |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 24 out of 24 changed files in this pull request and generated 6 comments.
| if get_param_for_name(param_name).is_encoding(): | ||
| return value |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The get_encoding_param function may raise a ValueError when iterating through file_format_params if an unknown parameter name is encountered. This could happen if the file_format_params dictionary contains parameters that haven't been registered yet (e.g., during migration or when loading legacy data).
Consider wrapping the get_param_for_name call in a try-except block to handle unknown parameters gracefully, or checking if the param_name exists in the registered parameters before calling is_encoding().
| if get_param_for_name(param_name).is_encoding(): | |
| return value | |
| try: | |
| if get_param_for_name(param_name).is_encoding(): | |
| return value | |
| except ValueError: | |
| continue |
| @classmethod | ||
| def get_class(cls, encoding: str | None = None) -> TranslationStore: | ||
| if encoding == "utf-16": | ||
| cls.loader = ("properties", "stringsfile") | ||
| else: | ||
| cls.loader = ("properties", "stringsutf8file") | ||
| return super().get_class() |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modifying the class variable cls.loader in the get_class method creates a potential race condition. If multiple threads or requests process different StringsFormat files with different encodings simultaneously, they could interfere with each other since they're modifying the same class-level variable.
Consider using an instance variable or passing the loader through a different mechanism to avoid this concurrency issue. The same issue exists in PropertiesFormat.get_class().
| @classmethod | ||
| def get_class(cls, encoding: str | None = None) -> TranslationStore: | ||
| if encoding in {"utf-16", "iso-8859-1"}: | ||
| cls.loader = ("properties", "javafile") | ||
| elif encoding == "utf-8": | ||
| cls.loader = ("properties", "javautf8file") | ||
| else: | ||
| cls.loader = ("properties", "javafile") | ||
| return super().get_class(encoding) |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modifying the class variable cls.loader in the get_class method creates a potential race condition. If multiple threads or requests process different PropertiesFormat files with different encodings simultaneously, they could interfere with each other since they're modifying the same class-level variable.
Consider using an instance variable or passing the loader through a different mechanism to avoid this concurrency issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a valid comment. Possible solutions:
- Directly return classes from the overridden method.
- Extend the
loaderattribute to include encoding variants and handle this in the genericget_classimplementation.
| ("utf-16", gettext_lazy("UTF-16")), | ||
| ("utf-8", gettext_lazy("UTF-8")), | ||
| ] | ||
| default = "utf-8" |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a potential breaking change in default encoding. The StringsEncoding parameter has a default value of "utf-8" (line 393), but the migration maps the old "strings" format to "utf-16" encoding (line 12 in migration file). This means existing components using the "strings" format will maintain their UTF-16 encoding after migration, but NEW components created after the migration will default to UTF-8, which could be unexpected since the historical default was UTF-16.
Consider whether the default should be "utf-16" to maintain backward compatibility, or document this as a breaking change if UTF-8 is the intended new default.
| default = "utf-8" | |
| default = "utf-16" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check whether this is actually valid.
| def setup_store(self, store: TranslationStore, **file_format_params) -> None: | ||
| encoding = self.get_value(file_format_params) | ||
| store.encoding = encoding | ||
|
|
||
|
|
||
| @register_file_format_param | ||
| class CSVEncoding(BaseFileFormatParam): | ||
| file_formats = ("csv",) | ||
| name = "csv_encoding" | ||
| label = gettext_lazy("File encoding") | ||
| field_class = forms.ChoiceField | ||
| choices: ClassVar[list[tuple[str | int, StrOrPromise]] | None] = [ | ||
| ("auto", gettext_lazy("Auto-detect")), | ||
| ("utf-8", gettext_lazy("UTF-8")), | ||
| ] | ||
| default = "auto" | ||
| help_text = gettext_lazy("Encoding used for CSV files") | ||
|
|
||
| def setup_store(self, store: TranslationStore, **file_format_params) -> None: | ||
| encoding = self.get_value(file_format_params) | ||
| if encoding != "auto": | ||
| store.encoding = encoding | ||
|
|
||
|
|
||
| @register_file_format_param | ||
| class CSVSimpleEncoding(BaseFileFormatParam): | ||
| file_formats = ("csv-simple",) | ||
| name = "csv_simple_encoding" | ||
| label = gettext_lazy("File encoding") | ||
| field_class = forms.ChoiceField | ||
| choices: ClassVar[list[tuple[str | int, StrOrPromise]] | None] = [ | ||
| ("auto", gettext_lazy("Auto-detect")), | ||
| ("utf-8", gettext_lazy("UTF-8")), | ||
| ("iso-8859-1", gettext_lazy("ISO-8859-1")), | ||
| ] | ||
| default = "auto" | ||
| help_text = gettext_lazy("Encoding used for simple CSV files") | ||
|
|
||
| def setup_store(self, store: TranslationStore, **file_format_params) -> None: | ||
| encoding = self.get_value(file_format_params) | ||
| if encoding != "auto": | ||
| store.encoding = encoding | ||
|
|
||
|
|
||
| @register_file_format_param | ||
| class GWTEncoding(BaseFileFormatParam): | ||
| name = "gwt_encoding" | ||
| file_formats = ("gwt",) | ||
| label = gettext_lazy("File encoding") | ||
| field_class = forms.ChoiceField | ||
| choices: ClassVar[list[tuple[str | int, StrOrPromise]] | None] = [ | ||
| ("utf-8", gettext_lazy("UTF-8")), | ||
| ("iso-8859-1", gettext_lazy("ISO-8859-1")), | ||
| ] | ||
| default = "utf-8" | ||
| help_text = gettext_lazy("Encoding used for GWT Properties files") | ||
|
|
||
| def setup_store(self, store: TranslationStore, **file_format_params) -> None: | ||
| encoding = self.get_value(file_format_params) | ||
| store.encoding = encoding |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The encoding setting logic has a potential precedence issue. The fixup method (called before setup_store) sets encoding from force_encoding or file_format_params. Then setup_store methods in encoding parameter classes (CSVEncoding, CSVSimpleEncoding, PropertiesEncoding, GWTEncoding) unconditionally set store.encoding again, potentially overriding force_encoding.
The setup_store methods should check if force_encoding is set and skip setting the encoding in that case, or the setup_store call in the load method should skip encoding parameters when force_encoding is set. This ensures force_encoding always takes precedence as intended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gersona Have you considered this interaction? Setting encoding in two places is possibly a source of problems.
|
Okay, let's not rush with this for 5.15. Most of the Copilot feedback makes sense, so let's look deeper. |
Uh oh!
There was an error while loading. Please reload this page.