Using PrintWriter, I get Chinese junk characters in the browser

Problem description

I am using PrintWriter as follows to write output to the browser:

PrintWriter pw = response.getWriter();
StringBuffer sb = getTextFromDatabase();
pw.print(sb);

However, this prints Chinese junk characters (mojibake) instead of the expected text.

I tried using String instead of StringBuffer, but that didn't help. I also tried setting the content type header as follows:

 response.setContentType("text/html;charset=UTF-8");

before getting the response writer, but that did not help either.

There are no issues with the data in the DB, as I use the same data for two different purposes: in one I get correct output, but in the other I get the junk above. The code above runs in a JSP using scriptlets, and I have also set the content type for the JSP.

Solution

Getting Chinese characters as mojibake indicates that bytes in one encoding are being decoded as another: here, single-byte UTF-8 (ASCII) data is being read as UTF-16LE. UTF-16LE stores each character of the Basic Multilingual Plane in 2 bytes, so every pair of ASCII bytes is fused into one 16-bit code unit, and those code units mostly fall in ranges occupied by CJK (Chinese/Japanese/Korean) characters.

To fix this, the encodings have to match: either display the data in the encoding the DB layer uses, or store the data in the DB as UTF-8 from the beginning. Since you're attempting to display the data as UTF-8, I think that your DB has to be reconfigured/converted to use UTF-8 instead of UTF-16LE.
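
The effect can be reproduced without a servlet or a database. The following is a minimal sketch, assuming the HTML reaches the decoding step as UTF-8 bytes; the class `MojibakeDemo`, the method `misreadAsUtf16le`, and the sample string are hypothetical names chosen for illustration. It decodes those bytes as UTF-16LE and prints the first resulting code unit:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names and the sample text are illustrative only.
class MojibakeDemo {

    // Decode the UTF-8 bytes of the text as if they were UTF-16LE.
    // Every two bytes are fused into one 16-bit code unit.
    static String misreadAsUtf16le(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        String html = "<h3>Comments</h3> "; // 18 ASCII characters, 18 bytes
        String garbled = misreadAsUtf16le(html);
        // '<' is 0x3C and 'h' is 0x68; read little-endian they form U+683C.
        System.out.printf("first code unit: U+%04X%n", (int) garbled.charAt(0));
        // Nine code units come out of eighteen bytes.
        System.out.println("length: " + garbled.length());
    }
}
```

The first two bytes, 0x3C and 0x68, combine into U+683C, which is a CJK ideograph. The same happens to every following pair of bytes, which is why plain HTML shows up as a run of Chinese-looking characters.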


Unrelated to the concrete problem, storing HTML (which is what those characters originally represented) in a database is really a bad idea ;) This was the original content:

<h3>Comments</h3> <table><tr bgcolor='#E7E7EF'><td>Posted On: 10-27-2010 14:03:51
, By: Yeshwant Nayak
(ynayak@cisco.com)
Excellent</td></tr><tr bgcolor='#E7E7EF'><td></td></tr><tr bgcolor='#E7E7EF'><td>Posted On: 10-27-2010 14:04:11
, By: Yeshwant Nayak
(ynayak@cisco.com)
very good</td></tr><tr bgcolor='#E7E7EF'><td></td></tr><tr bgcolor='#E7E7EF'><td>Posted On: 10-27-2010 14:17:36
, By: Yeshwant Nayak
(ynayak@cisco.com)
This is to test</td></tr></table><br /> <h3>Post Your Comment</h3> <form action="CommentsServlet" method="get" name="commentForm" onsubmit=" return ValidateForm();"> <table   width="300" height="300">    <tr><td><label for="name">Comment:<span class="mandTClass">*</span></label><br/><textarea name="content" id="commentTxtArea" class="textarea large" cols="28" rows="6" ></textarea></td></tr><tr><td><label for="name">Name:<span class="mandTClass">*</span></label><br/><input id="name" type="text" name="name" class="name" value="" maxlength="255"  size="36"/></td></tr><tr><td><label for="email">E-Mail:<span class="mandTClass">*</span></label><br/><input id="email" type="text" name="email" class="email" value="" maxlength="255"  size="36"/></td></tr><tr><td><input  type="submit"  name="post" value="Post"/></td></tr></table></form

Here's how you can turn this incorrectly encoded Chinese back to normal characters:

String incorrect = "格㸳潃浭湥獴⼼㍨‾琼扡敬㰾牴戠捧汯";
String original = new String(incorrect.getBytes("UTF-16LE"), "UTF-8");

Note that this should not be used as a solution! It is posted only as evidence of the root cause of the problem.
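
One concrete reason the conversion is not a safe fix is that it is lossy. The sketch below is an illustration under assumptions, not part of the original code: the class `RepairLimitsDemo` and its methods `garble` and `repair` are hypothetical names. It runs the same round trip on text with an even and an odd number of bytes:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names and sample strings are illustrative only.
class RepairLimitsDemo {

    // What the faulty pipeline effectively does: UTF-8 bytes read as UTF-16LE.
    static String garble(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_16LE);
    }

    // The reverse conversion, written with Charset constants so that no
    // checked UnsupportedEncodingException has to be handled.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.UTF_16LE), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String even = "<h3>";  // 4 bytes: every byte has a partner
        String odd = "<h3>!";  // 5 bytes: the last byte has no partner
        System.out.println(even.equals(repair(garble(even)))); // true
        System.out.println(odd.equals(repair(garble(odd))));   // false
    }
}
```

With an odd number of bytes, the last byte cannot form a 16-bit code unit, so it does not survive the round trip. This may also be why the recovered content above stops at `</form` without the closing `>`, although that is only a guess.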
