Skip to content

Conversation

@gersona
Copy link
Contributor

@gersona gersona commented Nov 23, 2025

@gersona gersona force-pushed the 15801_format_encoding_params branch 6 times, most recently from 48c51c8 to adf1d42 Compare December 4, 2025 16:22
@codecov
Copy link

codecov bot commented Dec 4, 2025

Codecov Report

❌ Patch coverage is 98.06763% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.73%. Comparing base (14440b3) to head (4c2d113).
⚠️ Report is 24 commits behind head on main.

Files with missing lines Patch % Lines
weblate/trans/file_format_params.py 96.05% 2 Missing and 1 partial ⚠️
weblate/formats/ttkit.py 98.36% 0 Missing and 1 partial ⚠️

❌ Your patch check has failed because the patch coverage (98.06%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@gersona gersona force-pushed the 15801_format_encoding_params branch from 37413dd to c303d8c Compare December 8, 2025 05:28
@gersona gersona marked this pull request as ready for review December 8, 2025 06:17
@gersona gersona requested a review from AliceVisek as a code owner December 8, 2025 06:17
@nijel nijel requested a review from Copilot December 8, 2025 07:28
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@nijel nijel added this to the 5.15 milestone Dec 11, 2025
@nijel nijel self-assigned this Dec 11, 2025
@nijel
Copy link
Member

nijel commented Dec 11, 2025

Let's try to include this in 5.15. I forgot to review this, so most of the delay is on my side...

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 24 out of 24 changed files in this pull request and generated 3 comments.

@nijel
Copy link
Member

nijel commented Dec 12, 2025

I broke the migrations once again :-). Fixed that now.

@gersona
Copy link
Contributor Author

gersona commented Dec 12, 2025

I also just broke something as well

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 24 out of 24 changed files in this pull request and generated 6 comments.

Comment on lines +131 to +132
if get_param_for_name(param_name).is_encoding():
return value
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The get_encoding_param function may raise a ValueError when iterating through file_format_params if an unknown parameter name is encountered. This could happen if the file_format_params dictionary contains parameters that haven't been registered yet (e.g., during migration or when loading legacy data).

Consider wrapping the get_param_for_name call in a try-except block to handle unknown parameters gracefully, or checking if the param_name exists in the registered parameters before calling is_encoding().

Suggested change
if get_param_for_name(param_name).is_encoding():
return value
try:
if get_param_for_name(param_name).is_encoding():
return value
except ValueError:
continue

Copilot uses AI. Check for mistakes.
Comment on lines +1432 to +1438
@classmethod
def get_class(cls, encoding: str | None = None) -> TranslationStore:
if encoding == "utf-16":
cls.loader = ("properties", "stringsfile")
else:
cls.loader = ("properties", "stringsutf8file")
return super().get_class()
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modifying the class variable cls.loader in the get_class method creates a potential race condition. If multiple threads or requests process different StringsFormat files with different encodings simultaneously, they could interfere with each other since they're modifying the same class-level variable.

Consider using an instance variable or passing the loader through a different mechanism to avoid this concurrency issue. The same issue exists in PropertiesFormat.get_class().

Copilot uses AI. Check for mistakes.
Comment on lines +1457 to +1465
@classmethod
def get_class(cls, encoding: str | None = None) -> TranslationStore:
if encoding in {"utf-16", "iso-8859-1"}:
cls.loader = ("properties", "javafile")
elif encoding == "utf-8":
cls.loader = ("properties", "javautf8file")
else:
cls.loader = ("properties", "javafile")
return super().get_class(encoding)
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modifying the class variable cls.loader in the get_class method creates a potential race condition. If multiple threads or requests process different PropertiesFormat files with different encodings simultaneously, they could interfere with each other since they're modifying the same class-level variable.

Consider using an instance variable or passing the loader through a different mechanism to avoid this concurrency issue.

Copilot uses AI. Check for mistakes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a valid comment. Possible solutions:

  • Directly return classes from the overridden method.
  • Extend the loader attribute to include encoding variants and handle this in the generic get_class implementation.

("utf-16", gettext_lazy("UTF-16")),
("utf-8", gettext_lazy("UTF-8")),
]
default = "utf-8"
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a potential breaking change in default encoding. The StringsEncoding parameter has a default value of "utf-8" (line 393), but the migration maps the old "strings" format to "utf-16" encoding (line 12 in migration file). This means existing components using the "strings" format will maintain their UTF-16 encoding after migration, but NEW components created after the migration will default to UTF-8, which could be unexpected since the historical default was UTF-16.

Consider whether the default should be "utf-16" to maintain backward compatibility, or document this as a breaking change if UTF-8 is the intended new default.

Suggested change
default = "utf-8"
default = "utf-16"

Copilot uses AI. Check for mistakes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check whether this is actually valid.

Comment on lines +411 to +470
def setup_store(self, store: TranslationStore, **file_format_params) -> None:
encoding = self.get_value(file_format_params)
store.encoding = encoding


@register_file_format_param
class CSVEncoding(BaseFileFormatParam):
file_formats = ("csv",)
name = "csv_encoding"
label = gettext_lazy("File encoding")
field_class = forms.ChoiceField
choices: ClassVar[list[tuple[str | int, StrOrPromise]] | None] = [
("auto", gettext_lazy("Auto-detect")),
("utf-8", gettext_lazy("UTF-8")),
]
default = "auto"
help_text = gettext_lazy("Encoding used for CSV files")

def setup_store(self, store: TranslationStore, **file_format_params) -> None:
encoding = self.get_value(file_format_params)
if encoding != "auto":
store.encoding = encoding


@register_file_format_param
class CSVSimpleEncoding(BaseFileFormatParam):
file_formats = ("csv-simple",)
name = "csv_simple_encoding"
label = gettext_lazy("File encoding")
field_class = forms.ChoiceField
choices: ClassVar[list[tuple[str | int, StrOrPromise]] | None] = [
("auto", gettext_lazy("Auto-detect")),
("utf-8", gettext_lazy("UTF-8")),
("iso-8859-1", gettext_lazy("ISO-8859-1")),
]
default = "auto"
help_text = gettext_lazy("Encoding used for simple CSV files")

def setup_store(self, store: TranslationStore, **file_format_params) -> None:
encoding = self.get_value(file_format_params)
if encoding != "auto":
store.encoding = encoding


@register_file_format_param
class GWTEncoding(BaseFileFormatParam):
name = "gwt_encoding"
file_formats = ("gwt",)
label = gettext_lazy("File encoding")
field_class = forms.ChoiceField
choices: ClassVar[list[tuple[str | int, StrOrPromise]] | None] = [
("utf-8", gettext_lazy("UTF-8")),
("iso-8859-1", gettext_lazy("ISO-8859-1")),
]
default = "utf-8"
help_text = gettext_lazy("Encoding used for GWT Properties files")

def setup_store(self, store: TranslationStore, **file_format_params) -> None:
encoding = self.get_value(file_format_params)
store.encoding = encoding
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The encoding setting logic has a potential precedence issue. The fixup method (called before setup_store) sets encoding from force_encoding or file_format_params. Then setup_store methods in encoding parameter classes (CSVEncoding, CSVSimpleEncoding, PropertiesEncoding, GWTEncoding) unconditionally set store.encoding again, potentially overriding force_encoding.

The setup_store methods should check if force_encoding is set and skip setting the encoding in that case, or the setup_store call in the load method should skip encoding parameters when force_encoding is set. This ensures force_encoding always takes precedence as intended.

Copilot uses AI. Check for mistakes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gersona Have you considered this interaction? Setting encoding in two places is possibly a source of problems.

@nijel nijel modified the milestones: 5.15, 5.16 Dec 12, 2025
@nijel
Copy link
Member

nijel commented Dec 12, 2025

Okay, let's not rush with this for 5.15. Most of the Copilot feedback makes sense, so let's look deeper.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Use parameters for file format encoding

2 participants