The Chinese characters in the printout information are displayed as garbled characters #45

Open
CuteLicense opened this issue Jul 1, 2023 · 5 comments

Comments

@CuteLicense

64-bit Simplified Chinese version

2023-07-01_095459

@bfabiszewski
Owner

Mobitool output is UTF-8. The problem is probably that the Windows console does not display UTF-8 characters properly.
I don't have much experience using these tools on Windows; they were mainly designed for the Unix console.
Maybe this would be helpful for you.
Anyway, on my system the output looks correct.

Screenshot 2023-07-01 at 13 13 23

@CuteLicense
Author

system("chcp 65001>nul");
Using “chcp 65001” solved part of the problem, but another problem has appeared: the file name is not displayed properly.

2023-07-02_113747
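
As a rough sketch, the console code page can also be switched from inside the program instead of shelling out to `chcp`, using the Win32 calls `GetConsoleOutputCP` and `SetConsoleOutputCP`. The helper names `console_use_utf8` and `console_restore` are hypothetical and not part of mobitool; on non-Windows builds both do nothing. This only changes how the console decodes output, so it would not by itself fix the file name shown above.

```c
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: switch console output to UTF-8.
   Returns the previous output code page so it can be restored later,
   or 0 if nothing was changed (non-Windows build, no console,
   console already UTF-8, or the call failed). */
unsigned int console_use_utf8(void)
{
#ifdef _WIN32
    UINT previous = GetConsoleOutputCP();
    if (previous != 0 && previous != CP_UTF8 && SetConsoleOutputCP(CP_UTF8)) {
        return (unsigned int)previous;
    }
#endif
    return 0;
}

/* Hypothetical helper: restore the code page returned by console_use_utf8(). */
void console_restore(unsigned int previous)
{
#ifdef _WIN32
    if (previous != 0) {
        SetConsoleOutputCP((UINT)previous);
    }
#else
    (void)previous; /* nothing to restore outside Windows */
#endif
}
```

A caller would invoke `console_use_utf8()` once at startup and pass the returned value to `console_restore()` before exiting, so the user's console is left as it was found.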

@bfabiszewski
Owner

This is expected. Filenames on Windows are stored as UTF-16 and reach the program in the system's ANSI code page rather than UTF-8, so if you change the console encoding to UTF-8 the filenames will be garbled.
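
The mismatch can be illustrated with a single character. The character 中 (U+4E2D) is the single code unit 0x4E2D in UTF-16, the bytes D6 D0 in CP936/GBK (assumed here to be the ANSI code page of a Simplified Chinese system), and the bytes E4 B8 AD in UTF-8. A console set to UTF-8 cannot decode the first two forms. The sketch below is a small, portable UTF-8 encoder; `utf8_encode` and `GBK_ZHONG` are illustrative names, not mobitool code.

```c
#include <stddef.h>

/* Bytes for U+4E2D in CP936/GBK, the code page assumed in this sketch. */
const unsigned char GBK_ZHONG[2] = { 0xD6, 0xD0 };

/* Encode one Unicode code point as UTF-8.
   Writes up to 4 bytes into out and returns the number of bytes written,
   or 0 if cp is a surrogate or lies beyond U+10FFFF. */
size_t utf8_encode(unsigned long cp, unsigned char out[4])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            return 0; /* surrogates are not valid scalar values */
        }
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Encoding 0x4E2D with this function yields three bytes, none of which match the two GBK bytes, which is why a path printed in one encoding looks garbled when the console expects the other.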

@CuteLicense
Author

Solved, but I don't know how to submit a pull request.

#ifdef _WIN32
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <windows.h>

// Convert input_str from the system ANSI code page to a newly allocated UTF-8 string.
// Returns the size of *output_str in bytes (including the terminating NUL), or 0 on failure.
// On success the caller must free *output_str; on failure *output_str is NULL.
int convert_to_utf8(const char* input_str, char** output_str)
{
    *output_str = NULL;

    // Get the default (ANSI) code page of the system
    UINT code_page = GetACP();

    // If the system code page is already UTF-8, just copy the input string
    if (code_page == CP_UTF8) {
        size_t size = strlen(input_str) + 1;
        *output_str = (char*)malloc(size);
        if (*output_str == NULL) {
            return 0;
        }
        memcpy(*output_str, input_str, size);
        return (int)size;
    }

    // Convert the input string from the system code page to UTF-16
    int unicode_len = MultiByteToWideChar(code_page, 0, input_str, -1, NULL, 0);
    if (unicode_len == 0) {
        fprintf(stderr, "Failed to convert input string to UTF-16.\n");
        return 0;
    }
    wchar_t* unicode_str = (wchar_t*)malloc(unicode_len * sizeof(wchar_t));
    if (unicode_str == NULL) {
        return 0;
    }
    if (MultiByteToWideChar(code_page, 0, input_str, -1, unicode_str, unicode_len) == 0) {
        free(unicode_str);
        fprintf(stderr, "Failed to convert input string to UTF-16.\n");
        return 0;
    }

    // Convert the UTF-16 string to UTF-8
    int output_len = WideCharToMultiByte(CP_UTF8, 0, unicode_str, -1, NULL, 0, NULL, NULL);
    if (output_len > 0) {
        *output_str = (char*)malloc(output_len);
    }
    if (*output_str == NULL
        || WideCharToMultiByte(CP_UTF8, 0, unicode_str, -1, *output_str, output_len, NULL, NULL) == 0) {
        free(unicode_str);
        free(*output_str);
        *output_str = NULL;
        fprintf(stderr, "Failed to convert input string to UTF-8.\n");
        return 0;
    }

    // Free the intermediate UTF-16 buffer
    free(unicode_str);

    return output_len;
}
#endif
#ifdef _WIN32
    char* output_str = NULL;
    // Convert to UTF-8 encoding; fall back to the original path if conversion fails
    if (convert_to_utf8(cover_path, &output_str) > 0) {
        printf("Saving cover to %s\n", output_str);
        free(output_str);
    } else {
        printf("Saving cover to %s\n", cover_path);
    }
#else
    printf("Saving cover to %s\n", cover_path);
#endif

@bfabiszewski
Owner

谢谢 (thank you)! That really works in this particular case, but I am a bit worried about applying this workaround as a general solution.
While it works with a Windows 10 terminal set to UTF-8, I wonder whether it would cause problems in other environments under _WIN32.
I just don't know what the guidelines are for Windows console programs that need Unicode input and output. The console I tested on did not have Unicode support by default. Without the patch I at least saw the output path displayed correctly in its original encoding. With the patch and without a properly configured terminal (no UTF-8 or no suitable font installed) I didn't even see the path.
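
One possible way to limit that risk, offered only as a sketch, is to print the converted path solely when the console reports UTF-8 as its output code page and to fall back to the original string otherwise. `GetConsoleOutputCP` is a real Win32 call; `console_expects_utf8` and `pick_display_path` are hypothetical names, and the non-Windows branch simply assumes a UTF-8 terminal. A missing font cannot be detected this way.

```c
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: returns 1 if the console is set to decode output as UTF-8. */
int console_expects_utf8(void)
{
#ifdef _WIN32
    return GetConsoleOutputCP() == CP_UTF8;
#else
    return 1; /* assumption: non-Windows terminals are UTF-8 */
#endif
}

/* Hypothetical helper: choose which form of a path to print.
   native_path is the path in its original encoding;
   utf8_path is the converted copy, or NULL if conversion failed. */
const char *pick_display_path(const char *native_path, const char *utf8_path)
{
    if (utf8_path != NULL && console_expects_utf8()) {
        return utf8_path;
    }
    return native_path;
}
```

With this guard a console left at its default code page would keep showing the path in the original encoding, while a console switched with `chcp 65001` would get the UTF-8 copy.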

2 participants