Why does csc.exe crash when I leave the output encoding as UTF-8?

Problem description




I have run into a very strange thing.

I wonder whether others have seen it too, and why it happens.

Running a one-line program consisting of System.Console.WriteLine(System.Console.OutputEncoding.EncodingName); shows that the encoding is Western European (DOS).

Fine.

Here is a list of some code pages: 1200 Unicode, 65001 UTF-8, Windows-1252 Western European (Windows), and 850 Western European (DOS), from https://msdn.microsoft.com/en-us/library/system.text.encoding(v=vs.110).aspx
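To relate those code page numbers to what a program actually sees, here is a minimal sketch that prints the numeric code page and web name of an encoding. CodePageInfo and Describe are illustrative names, not something from the original post, and the first line of output depends on whatever code page the shell is currently using.

```csharp
using System;
using System.Text;

static class CodePageInfo
{
    // Hypothetical helper: formats an encoding as "<code page> <web name>".
    public static string Describe(Encoding encoding)
    {
        return encoding.CodePage + " " + encoding.WebName;
    }

    static void Main()
    {
        // Whatever the console is currently set to (850 on the asker's machine).
        Console.WriteLine(Describe(Console.OutputEncoding));
        // The two Unicode entries from the list above.
        Console.WriteLine(Describe(Encoding.Unicode));
        Console.WriteLine(Describe(Encoding.UTF8));
    }
}
```

Running this before and after a program that changes the output encoding is one way to confirm that the shell has been left on code page 65001.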

Say I write a C# program to change the encoding to UTF-8:

class sdf
{
    static void Main(string[] args)
    {
        System.Console.WriteLine(System.Console.OutputEncoding.EncodingName);
        System.Console.OutputEncoding = System.Text.Encoding.GetEncoding(65001);
        System.Console.WriteLine(System.Console.OutputEncoding.EncodingName);
    }
}

It works, and prints:

Western European (DOS)
Unicode (UTF-8)

Now when I run csc again, csc crashes.

I checked my RAM with memtest for 14 hours (8 passes) and ran chkdsk on my hard drive; all fine. This is definitely not a hardware problem, it is a coding issue. I know that because if I open a new cmd prompt and run csc there, it doesn't crash.

So running that C# program changes the shell in such a way that the next run of csc crashes csc itself, in that big way.

If I compile the code below, run it, and then run csc (or csc whatever.cs), csc crashes.

So close the cmd prompt and open a new one.

This time, experiment with commenting and uncommenting the second line of the program.

I find that if the second line (the one that changes the code page to 850, DOS Western European) is there, then csc won't crash the next time I run it.

Whereas if I comment out that second line, so that the program exits with the code page/encoding changed to UTF-8, then the next time csc runs, it crashes.

// Comment out the last line and this still runs, but csc crashes the next time it is run.

class asdf
{
    static void Main()
    {
        System.Console.OutputEncoding = System.Text.Encoding.UTF8; // set output to UTF-8
        System.Console.OutputEncoding = System.Text.Encoding.GetEncoding(850); // set output back to code page 850
    }
}
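The idea of setting the code page back on the last line can be generalized so that the program restores whatever encoding was active when it started, rather than hard-coding 850. The following is only a sketch: EncodingScope and WithUtf8Output are hypothetical names, and it addresses only the crash on the next csc run, not the redirection concern.

```csharp
using System;
using System.Text;

static class EncodingScope
{
    // Runs the action with console output set to UTF-8 and always restores
    // the encoding that was active before, even if the action throws.
    public static void WithUtf8Output(Action action)
    {
        Encoding previous = Console.OutputEncoding;
        try
        {
            Console.OutputEncoding = Encoding.UTF8;
            action();
        }
        finally
        {
            Console.OutputEncoding = previous;
        }
    }

    static void Main()
    {
        // "\u10F5" is the Georgian letter used elsewhere in this question,
        // written as an escape so the source file encoding does not matter.
        WithUtf8Output(delegate { Console.WriteLine("\u10F5"); });

        // Back on the original encoding here.
        Console.WriteLine(Console.OutputEncoding.EncodingName);
    }
}
```

The try/finally means the shell is not left on code page 65001 when the program ends with an unhandled exception inside the action.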

I am not the only person to have run into something like this, though no explanation was found there: https://social.msdn.microsoft.com/Forums/vstudio/en-US/0e5f477e-0c32-4e88-acf7-d53d43d5b566/c-command-line-compiler-cscexe-immediately-crashes-when-run-in-code-page-65001-utf8?forum=csharpgeneral

I can deal with it by making sure the last line sets the code page to 850, though as I'll explain, that is an inadequate solution.

I'd also like to know whether this is a problem with csc that others have too, or whether there are any other solutions.

Added

uuu1.cs

// uuu1.cs
class asdf
{
    static void Main()
    {
        System.Console.InputEncoding = System.Text.Encoding.UTF8;
        System.Console.OutputEncoding = System.Text.Encoding.UTF8;

        // Not Unicode (UTF-16). UTF-8 means redirection will then work.
        System.Console.WriteLine("ჵ");

        // Try redirecting too, and check whether csc crashes afterwards.
        //System.Console.OutputEncoding = System.Text.Encoding.GetEncoding(850);
        //System.Console.InputEncoding = System.Text.Encoding.GetEncoding(850);
        // The problem is that when these lines are uncommented, redirection breaks.
    }
}

Uncommenting the last lines, so that I have

System.Console.OutputEncoding=System.Text.Encoding.GetEncoding(850);

stops the crash but is an inadequate solution. For example, if I want to redirect the output of a program to a file, then I need UTF-8 all the way from beginning to end, otherwise it doesn't work.

This works with the code page 850 lines commented out:

c:\blah>uuu1>r.r<ENTER>  
c:\blah>type r.r <ENTER>  
c:\blah>ჵ  

If I uncomment the last lines, changing the code page to 850, then csc won't crash on the next run, but the redirection doesn't work and r.r doesn't contain that character.
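One alternative that might be worth trying, and which was not tested in the original post, is to leave the console code page alone and write UTF-8 bytes straight to the standard output stream. The sketch below assumes the output is being redirected to a file; Utf8Stdout and WriteUtf8Line are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text;

static class Utf8Stdout
{
    // Writes one line of text to the stream as UTF-8 without a byte order mark.
    // The writer is flushed but not disposed, because the caller owns the stream.
    public static void WriteUtf8Line(Stream stream, string text)
    {
        StreamWriter writer = new StreamWriter(stream, new UTF8Encoding(false));
        writer.WriteLine(text);
        writer.Flush();
    }

    static void Main()
    {
        // Console.OutputEncoding is never touched, so the shell's code page
        // is the same after the program exits as it was before.
        using (Stream stdout = Console.OpenStandardOutput())
        {
            WriteUtf8Line(stdout, "\u10F5");
        }
    }
}
```

When the output goes to an interactive console that is still on code page 850, these bytes will not display as the intended character. The same applies to viewing the redirected file with type under code page 850, so the file is better checked in an editor that reads UTF-8.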

Added 2

Han's answer made me notice another way of triggering this error:

C:\Users\harvey\somecs3>csc<ENTER>
Microsoft (R) Visual C# Compiler version 4.0.30319.18408
for Microsoft (R) .NET Framework 4.5
Copyright (C) Microsoft Corporation. All rights reserved.

warning CS2008: No source files specified
error CS1562: Outputs without source must have the /out option specified

C:\Users\harvey\somecs3>chcp  65001<ENTER>
Active code page: 65001

C:\Users\harvey\somecs3>csc<ENTER>  <-- CRASH

C:\Users\harvey\somecs3>
Solution

Well, you found a bug in the way the C# compiler handles writing text to the console when the console is switched to UTF-8. It has a self-diagnostic to ensure that the conversion from a UTF-16 encoded string to the console output code page worked correctly, and it slams the Big Red Button when it didn't. The stack trace looks like this:

csc.exe!OnCriticalInternalError()  + 0x4 bytes  
csc.exe!ConsoleOutput::WideToConsole()  + 0xdc51 bytes  
csc.exe!ConsoleOutput::print_internal()  + 0x2c bytes   
csc.exe!ConsoleOutput::print()  + 0x80 bytes    
csc.exe!ConsoleOutput::PrintString()  + 0xb5 bytes  
csc.exe!ConsoleOutput::PrintBanner()  + 0x50 bytes  
csc.exe!_main()  + 0x2d0eb bytes    

The actual code for WideToConsole() is not available; the closest match is this version from the SSCLI20 distribution:

/*
 * Like WideCharToMultiByte, but translates to the console code page. Returns length,
 * INCLUDING null terminator.
 */
int ConsoleOutput::WideCharToConsole(LPCWSTR wideStr, LPSTR lpBuffer, int nBufferMax)
{
    if (m_fUTF8Output) {
        if (nBufferMax == 0) {
            return UTF8LengthOfUnicode(wideStr, (int)wcslen(wideStr)) + 1; // +1 for nul terminator
        }
        else {
            int cchConverted = NULL_TERMINATED_MODE;
            return UnicodeToUTF8 (wideStr, &cchConverted, lpBuffer, nBufferMax);
        }

    }
    else {
        return WideCharToMultiByte(GetConsoleOutputCP(), 0, wideStr, -1, lpBuffer, nBufferMax, 0, 0);
    }
}

/*
 * Convert Unicode string to Console ANSI string allocated with VSAlloc
 */
HRESULT ConsoleOutput::WideToConsole(LPCWSTR wideStr, CAllocBuffer &buffer)
{
    int cch = WideCharToConsole(wideStr, NULL, 0);
    buffer.AllocCount(cch);
    if (0 == WideCharToConsole(wideStr, buffer.GetData(), cch)) {
        VSFAIL("How'd the string size change?");
        // We have to NULL terminate the output because WideCharToMultiByte didn't
        buffer.SetAt(0, '\0');
        return E_FAIL;
    }
    return S_OK;
}

Judging from the machine code, the crash occurs somewhere around the VSFAIL() assert; I can see the return E_FAIL statement. The code was changed from the version I posted, however: the if() statement was modified, and it looks like VSFAIL() was replaced by RETAILVERIFY(). Something broke when they made those changes, probably in UnicodeToUTF8(), which is now named UTF16ToUTF8(). To re-emphasize, the version I posted does not in fact crash; you can see for yourself by running C:\Windows\Microsoft.NET\Framework\v2.0.50727\csc.exe. Only the v4 version of csc.exe has this bug.
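The measure-then-convert pattern used by WideToConsole(), together with a check that the conversion produced what was expected, can be illustrated in C#. This is a rough analogue using the .NET Encoding class, not the compiler's code; ConversionCheck and ToUtf8Checked are hypothetical names.

```csharp
using System;
using System.Text;

static class ConversionCheck
{
    // Two-pass conversion: first measure the UTF-8 length of the UTF-16 string,
    // then convert into a buffer of exactly that size and verify the result.
    public static byte[] ToUtf8Checked(string text)
    {
        Encoding utf8 = new UTF8Encoding(false);
        int expected = utf8.GetByteCount(text);
        byte[] buffer = new byte[expected];
        int written = utf8.GetBytes(text, 0, text.Length, buffer, 0);
        if (written != expected)
        {
            // Comparable in spirit to the VSFAIL()/RETAILVERIFY() check above.
            throw new InvalidOperationException("Converted length differs from measured length.");
        }
        return buffer;
    }

    static void Main()
    {
        byte[] bytes = ToUtf8Checked("\u10F5");
        Console.WriteLine(BitConverter.ToString(bytes)); // prints E1-83-B5
    }
}
```

A failed check here surfaces as an exception that a caller can handle, whereas the compiler's check ends in OnCriticalInternalError(), as the stack trace shows.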

The actual bug is hard to dig out of the machine code; best to let Microsoft worry about that. You can file the bug at connect.microsoft.com. I don't see a report that resembles it, which is fairly remarkable, by the way. The workaround for this bug is to use CHCP to change the code page back.

