Unreadable non-UTF-8 output on localized MSVC #35785

@Boddlnagg
Contributor

A localized (e.g. German) MSVC can produce non-UTF-8 output, which can become almost unreadable in the way it's currently forwarded to the user.

This can be seen in an (otherwise unrelated) issue comment: rust-lang/cc-rs#87 (comment)
Note especially this line:

note: Non-UTF-8 output: LINK : fatal error LNK1104: Datei \"ucrt.lib\" kann nicht ge\xf6ffnet werden.\r\n

With multiple LNK errors the output can be much longer: there is one line per error, but the lines are not separated in the output because the line breaks are printed as the escaped sequence \r\n, so the whole thing becomes really hard to read.
If possible, the output should be converted to Unicode in this case.
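To illustrate the difference, here is a minimal sketch of the two renderings of the sample line. It assumes the bytes are Windows-1252, in which 0xF6 is ö; the helper names `escape_bytes` and `decode_latin1_subset` are made up for this example and are not taken from rustc. The decoder only handles the byte range that Windows-1252 shares with Latin-1 and gives up on 0x80..=0x9F.

```rust
use std::ascii;

/// Render raw bytes with ASCII escapes, similar to the "Non-UTF-8 output" note:
/// non-ASCII bytes become \xNN and line breaks become the literal text \r\n.
fn escape_bytes(bytes: &[u8]) -> String {
    bytes
        .iter()
        .flat_map(|&b| ascii::escape_default(b))
        .map(char::from)
        .collect()
}

/// Decode bytes as Windows-1252, restricted to the part that coincides with
/// Latin-1 (every byte maps to the code point of the same value).
/// Bytes in 0x80..=0x9F differ between the two and are rejected here.
fn decode_latin1_subset(bytes: &[u8]) -> Option<String> {
    bytes
        .iter()
        .map(|&b| match b {
            0x80..=0x9f => None,
            _ => Some(char::from(b)),
        })
        .collect()
}
```

Applied to the bytes of the linker message, `escape_bytes` reproduces the `ge\xf6ffnet` form shown in the note, while `decode_latin1_subset` yields "geöffnet" with real line breaks.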

(previously reported as rust-lang/cargo#3012)


NOTE from @crlf0710: Please install Visual Studio English Language Pack side by side with your favorite language pack for UI

Activity

nagisa (Member) commented on Aug 18, 2016:

I’m surprised we try to interpret any output in Windows as UTF-8 as opposed to UTF-16, like we should.

petrochenkov (Contributor) commented on Aug 18, 2016:

I suspect that the MSVC compiler's raw output is encoded as Windows-1252 or Windows-1250 (for German) depending on the current console code page and not as UTF-16.

@Boddlnagg
Does cmd /c "chcp 65001" solve the encoding problem?

Boddlnagg (Contributor, Author) commented on Aug 18, 2016:

@petrochenkov

Does cmd /c "chcp 65001" solve the encoding problem?

No, neither in cmd nor in PowerShell (which I usually work with). It prints "Aktive Codepage: 65001." ("Active code page: 65001."), but the encoding problems persist.

retep998 (Member) commented on Aug 18, 2016:

Is it just MSVC which is localized, or is your system codepage itself different? If we know that a certain program's output is the system codepage we could use MultiByteToWideChar with CP_OEMCP to convert it. Perhaps we could check whether the output is UTF-8, and attempt to do the codepage conversion if it isn't.

Boddlnagg (Contributor, Author) commented on Aug 18, 2016:

@retep998 I am using a German localized Windows. How can I find out the system codepage? The default CP for console windows is "850 (OEM - Multilingual Lateinisch I)".

retep998 (Member) commented on Aug 18, 2016:

@Boddlnagg You can call GetCPInfoEx with CP_ACP or CP_OEMCP.

Boddlnagg (Contributor, Author) commented on Aug 18, 2016:

@retep998 That returns codepage numbers 1252 for CP_ACP and 850 for CP_OEMCP.

retep998 (Member) commented on Aug 18, 2016:

@Boddlnagg In which case the output from the linker appears to be CP_OEMCP, so now someone just has to add code to rustc which detects when the linker output isn't UTF-8 on Windows and uses MultiByteToWideChar with CP_OEMCP instead.
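The approach described here could look roughly like the following sketch. It is not the code that ended up in rustc; the function names `oem_to_string` and `decode_linker_output` are hypothetical, and the FFI declaration is written by hand from the documented `MultiByteToWideChar` signature. Using CP_OEMCP follows the suggestion above and is an assumption worth verifying: the 0xF6 byte in the sample message is ö in Windows-1252 but ÷ in code page 850. On non-Windows targets the conversion is stubbed out, so the function falls back to lossy UTF-8.

```rust
#[cfg(windows)]
#[link(name = "kernel32")]
extern "system" {
    fn MultiByteToWideChar(
        code_page: u32,
        flags: u32,
        multi_byte_str: *const u8,
        multi_byte_len: i32,
        wide_char_str: *mut u16,
        wide_char_len: i32,
    ) -> i32;
}

/// Convert bytes in the system OEM code page to a String (Windows only).
#[cfg(windows)]
fn oem_to_string(bytes: &[u8]) -> Option<String> {
    const CP_OEMCP: u32 = 1;
    const MB_ERR_INVALID_CHARS: u32 = 0x0000_0008;
    if bytes.is_empty() {
        return Some(String::new());
    }
    if bytes.len() > i32::MAX as usize {
        return None;
    }
    let len = bytes.len() as i32;
    unsafe {
        // First call: ask how many UTF-16 code units are needed.
        let needed = MultiByteToWideChar(
            CP_OEMCP, MB_ERR_INVALID_CHARS, bytes.as_ptr(), len, std::ptr::null_mut(), 0,
        );
        if needed <= 0 {
            return None;
        }
        let mut wide = vec![0u16; needed as usize];
        // Second call: perform the conversion into the buffer.
        let written = MultiByteToWideChar(
            CP_OEMCP, MB_ERR_INVALID_CHARS, bytes.as_ptr(), len, wide.as_mut_ptr(), needed,
        );
        if written <= 0 {
            return None;
        }
        wide.truncate(written as usize);
        String::from_utf16(&wide).ok()
    }
}

/// Stub for other platforms: no code page conversion available.
#[cfg(not(windows))]
fn oem_to_string(_bytes: &[u8]) -> Option<String> {
    None
}

/// Prefer UTF-8; if the bytes are not valid UTF-8, try the OEM code page,
/// and only then fall back to a lossy conversion.
fn decode_linker_output(bytes: &[u8]) -> String {
    match std::str::from_utf8(bytes) {
        Ok(s) => s.to_owned(),
        Err(_) => oem_to_string(bytes)
            .unwrap_or_else(|| String::from_utf8_lossy(bytes).into_owned()),
    }
}
```

Checking for valid UTF-8 first keeps the behaviour unchanged for tools that already emit UTF-8, so the code page conversion only comes into play for output that would otherwise be escaped.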

danyx23 commented on Jun 8, 2018:

I ran into this problem as well (also on a German Windows 10). As a workaround, in Windows 10 go to Settings -> Region and Language -> Language, add English as a language and make it the default (you may have to log out and back in again). After that, programs like link.exe should produce output in a locale that works with Rust as it is right now.

It would still be great if this could be fixed in rust :)

michaelfairley (Contributor) commented on Jul 12, 2018:

Just saw another instance of this in Rust-SDL2/rust-sdl2#783, except with GBK/CP936/GB2312 (a common encoding in China), where SDL2.lib : warning LNK4272:\xbf\xe2\xbc\xc6\xcb\xe3\xbb\xfa\xc0\xe0\xd0\xcd\xa1\xb0x86\xa1\xb1\xd3\xeb\xc4\xbf\xb1\xea\xbc\xc6\xcb\xe3\xbb\xfa\xc0\xe0\xd0\xcd\xa1\xb0x64\xa1\xb1\xb3\xe5\xcd\xbb got printed instead of the desired SDL2.lib : warning LNK4272:库计算机类型“x86”与目标计算机类型“x64”冲突.

added a commit that references this issue on Jul 1, 2019

Rollup merge of rust-lang#62021 - crlf0710:msvc_link_output_improve, …

548132b


Labels: C-bug (Category: This is a bug.), O-windows-msvc (Toolchain: MSVC, Operating system: Windows)

Participants: @michaelfairley, @lygstate, @crlf0710, @retep998, @nagisa