The Chinese characters in the printout information are displayed as garbled characters #45

CuteLicense · 2023-07-01T02:01:43Z

64-bit Simplified Chinese version

bfabiszewski · 2023-07-01T11:17:11Z

Mobitool output is UTF-8. The problem may be that the Windows console does not display UTF-8 characters properly.
I don't have much experience using these tools on Windows. They were mainly designed for the Unix console.
Maybe this would be helpful for you.
Anyway, on my system the output looks correct.

CuteLicense · 2023-07-02T03:46:18Z

system("chcp 65001>nul");
Using “chcp 65001” solved part of the problem, but now another problem has appeared: the file name is not displayed properly.
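
For reference, the same code page switch can be made from inside the program without spawning a shell. This is only a minimal sketch, not mobitool code: the helper names `console_enter_utf8` and `console_leave_utf8` are hypothetical, and it assumes that calling the Win32 functions `GetConsoleOutputCP` and `SetConsoleOutputCP` directly is acceptable. On other platforms both helpers do nothing.

```c
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: switch the console output code page to UTF-8.
   Returns the previous code page, or 0 if nothing was changed. */
static unsigned int console_enter_utf8(void)
{
#ifdef _WIN32
    UINT previous = GetConsoleOutputCP();   /* 0 means the call failed */
    if (previous == 0 || previous == CP_UTF8) {
        return 0;                           /* nothing to change */
    }
    if (!SetConsoleOutputCP(CP_UTF8)) {
        return 0;                           /* switch failed, leave as is */
    }
    return (unsigned int)previous;
#else
    return 0;                               /* non-Windows: no-op */
#endif
}

/* Hypothetical helper: restore the code page saved by console_enter_utf8(). */
static void console_leave_utf8(unsigned int previous)
{
#ifdef _WIN32
    if (previous != 0) {
        SetConsoleOutputCP(previous);
    }
#else
    (void)previous;                         /* non-Windows: no-op */
#endif
}
```

The idea would be to call `console_enter_utf8()` at the start of `main` and `console_leave_utf8()` with its return value before exiting, so the console is not left in a different code page than the one the user started with.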

bfabiszewski · 2023-07-02T12:52:38Z

This is expected. Filenames in Windows are encoded with UTF-16. If you change the console encoding to UTF-8, the filenames will be garbled.
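
To illustrate the mismatch with concrete bytes, here is a small portable sketch. It assumes that the path reaches the program in the system ANSI code page, which on a Simplified Chinese system would normally be CP936 (GBK); this is an assumption about the reporter's setup. The character 中 is `D6 D0` in GBK and `E4 B8 AD` in UTF-8, so a console expecting UTF-8 cannot decode the GBK bytes. The function `is_valid_utf8` is a hypothetical helper written for this example, not part of mobitool.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the buffer is well-formed UTF-8,
   0 otherwise. Rejects overlong forms, surrogates and values above U+10FFFF. */
static int is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char c = s[i];
        size_t extra;            /* number of continuation bytes expected */
        unsigned long cp;        /* decoded code point */
        size_t k;

        if (c < 0x80) { i++; continue; }                  /* ASCII */
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; }
        else return 0;                                    /* invalid lead byte */

        if (len - i - 1 < extra) return 0;                /* truncated sequence */
        for (k = 1; k <= extra; k++) {
            if ((s[i + k] & 0xC0) != 0x80) return 0;      /* not a continuation */
            cp = (cp << 6) | (unsigned long)(s[i + k] & 0x3F);
        }
        if ((extra == 1 && cp < 0x80) ||
            (extra == 2 && cp < 0x800) ||
            (extra == 3 && cp < 0x10000)) return 0;       /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;       /* UTF-16 surrogate */
        if (cp > 0x10FFFF) return 0;                      /* out of range */
        i += extra + 1;
    }
    return 1;
}
```

Calling it on the three bytes `E4 B8 AD` returns 1, while the two GBK bytes `D6 D0` return 0. Treat this as a heuristic only: some byte sequences in a legacy code page can happen to be well-formed UTF-8.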

CuteLicense · 2023-07-08T07:10:24Z

Solved, but I don't know how to submit a code merge request.

#ifdef _WIN32
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <windows.h>

// Convert input_str from the system code page to a newly allocated UTF-8
// string stored in *output_str (the caller must free it).
// Returns the UTF-8 length in bytes including the terminating NUL, or 0 on failure.
int convert_to_utf8(const char* input_str, char** output_str)
{
    // Get the default code page of the system
    UINT code_page = GetACP();
    *output_str = NULL;

    // If the default code page is already UTF-8, return a plain copy
    if (code_page == CP_UTF8) {
        size_t len = strlen(input_str) + 1;
        *output_str = (char*)malloc(len);
        if (*output_str == NULL) {
            return 0;
        }
        memcpy(*output_str, input_str, len);
        return (int)len;
    }

    // Decode the input string from the system code page into UTF-16
    int unicode_len = MultiByteToWideChar(code_page, 0, input_str, -1, NULL, 0);
    wchar_t* unicode_str = NULL;
    if (unicode_len > 0) {
        unicode_str = (wchar_t*)malloc(unicode_len * sizeof(wchar_t));
    }
    if (unicode_str == NULL ||
        MultiByteToWideChar(code_page, 0, input_str, -1, unicode_str, unicode_len) == 0) {
        free(unicode_str);
        fprintf(stderr, "Failed to convert input string to Unicode encoding.\n");
        return 0;
    }

    // Encode the UTF-16 string as UTF-8
    int output_len = WideCharToMultiByte(CP_UTF8, 0, unicode_str, -1, NULL, 0, NULL, NULL);
    if (output_len > 0) {
        *output_str = (char*)malloc(output_len);
    }
    if (*output_str == NULL ||
        WideCharToMultiByte(CP_UTF8, 0, unicode_str, -1, *output_str, output_len, NULL, NULL) == 0) {
        free(unicode_str);
        free(*output_str);
        *output_str = NULL;
        fprintf(stderr, "Failed to convert input string to UTF-8 encoding.\n");
        return 0;
    }

    // Free the intermediate UTF-16 buffer
    free(unicode_str);

    return output_len;
}
#endif

#ifdef _WIN32
    char* output_str = NULL;
    // Convert to UTF-8 encoding; fall back to the original path if conversion fails
    if (convert_to_utf8(cover_path, &output_str) > 0) {
        printf("Saving cover to %s\n", output_str);
        free(output_str);
    } else {
        printf("Saving cover to %s\n", cover_path);
    }
#else
    printf("Saving cover to %s\n", cover_path);
#endif

bfabiszewski · 2023-07-09T17:56:27Z

Thank you! That really works in this particular case. But I am a bit worried about applying this workaround as a general solution.
While it works with a Windows 10 UTF-8 terminal, I wonder whether it would cause problems in other environments under _WIN32.
I just don't know what the guidelines are for console programs under Windows that need Unicode output and input. The console I tested on did not have Unicode support by default. Without the patch applied, I at least saw the correct output path displayed using the original encoding. With the patch and without a properly configured terminal (no UTF-8 or no proper font installed), I did not even see the path.
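
One possible way to limit the risk would be to convert only when the console is actually set to UTF-8, and to print the path unchanged otherwise. This is an untested sketch under that assumption: `console_wants_utf8` is a hypothetical helper name, and the check relies on the Win32 functions `GetConsoleOutputCP` and `GetACP`. It does not address the missing font case.

```c
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: returns 1 only when the console output code page is
   UTF-8 while the system ANSI code page is something else, 0 otherwise. */
static int console_wants_utf8(void)
{
#ifdef _WIN32
    UINT console_cp = GetConsoleOutputCP();   /* 0 if the call fails */
    return console_cp == CP_UTF8 && GetACP() != CP_UTF8;
#else
    return 0;                                 /* non-Windows: never convert */
#endif
}
```

With such a check, the cover message would call `convert_to_utf8` only when `console_wants_utf8()` returns 1 and would keep printing `cover_path` as before in every other case, so consoles that were not switched to UTF-8 would see the same output as without the patch.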
