Problem description
I'm reading an HTML document that contains UTF-8 characters, but when I access the document's innerHTML, all the "bad" characters show up as 0xfffd. I've tried it in all the major browsers, and they all behave the same way. When I alert() the innerHTML, those characters appear as a diamond with a question mark.
Surprisingly, the following works perfectly and correctly displays the UTF-8 character in the alert box, so it's not alert() that is malfunctioning:
alert("Doppelg\u00e4nger!");
Why can't I access the UTF-8 characters using innerHTML? Or is there another way to access them in JavaScript?
Solution
First, check whether the document head contains a charset declaration like this:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
You can also read the meta tags with JavaScript:
var metaTags = document.getElementsByTagName("META");
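As a rough sketch of that check, the helper below scans a collection of meta elements for a declared charset. The name findDeclaredCharset is hypothetical and not part of the original answer; it relies only on getAttribute, so in a browser you would pass it the result of document.getElementsByTagName("meta"). Browsers also report the encoding they actually used as document.characterSet.

```javascript
// Hypothetical helper: returns the charset declared by <meta> tags, or null.
// Accepts any array-like of elements that expose getAttribute().
function findDeclaredCharset(metaTags) {
  for (var i = 0; i < metaTags.length; i++) {
    var tag = metaTags[i];
    // HTML5 short form: <meta charset="UTF-8">
    var direct = tag.getAttribute('charset');
    if (direct) {
      return direct.toLowerCase();
    }
    // Legacy form: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    var httpEquiv = tag.getAttribute('http-equiv') || '';
    var content = tag.getAttribute('content') || '';
    var match = /charset\s*=\s*["']?([^\s;"']+)/i.exec(content);
    if (httpEquiv.toLowerCase() === 'content-type' && match) {
      return match[1].toLowerCase();
    }
  }
  return null;
}

// In a browser:
// findDeclaredCharset(document.getElementsByTagName('meta'));
```

A null result means no charset was declared in the markup, in which case the encoding comes from the HTTP header or from browser defaults.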
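If the document does declare UTF-8, that is the likely explanation for the behavior: when the file's bytes are actually in another encoding, sequences that are not valid UTF-8 are decoded as U+FFFD, the replacement character. You can try changing the declared charset from UTF-8 to ISO-8859-1: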
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
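To see where 0xfffd comes from, here is a small sketch that decodes the same bytes two ways. It assumes the file was saved as ISO-8859-1, where ä is the single byte 0xE4; the byte array and the decodeLatin1 helper are illustrative and not taken from the original page. TextDecoder is available in current browsers and in Node.js.

```javascript
// "Doppelgänger!" as ISO-8859-1 bytes: ä is the single byte 0xE4.
var latin1Bytes = new Uint8Array([
  0x44, 0x6f, 0x70, 0x70, 0x65, 0x6c, 0x67, 0xe4,
  0x6e, 0x67, 0x65, 0x72, 0x21
]);

// Decoding as UTF-8: 0xE4 announces a three-byte sequence, but the next
// byte (0x6E, "n") is not a continuation byte, so the decoder emits U+FFFD.
var asUtf8 = new TextDecoder('utf-8').decode(latin1Bytes);

// Hypothetical helper: ISO-8859-1 maps each byte to the code point with
// the same value, so a manual decode is enough for this illustration.
function decodeLatin1(bytes) {
  var out = '';
  for (var i = 0; i < bytes.length; i++) {
    out += String.fromCharCode(bytes[i]);
  }
  return out;
}
var asLatin1 = decodeLatin1(latin1Bytes);

console.log(asUtf8.charCodeAt(7).toString(16)); // fffd
console.log(asLatin1);                          // Doppelgänger!
```

The bytes are identical in both cases; only the declared encoding decides whether innerHTML ends up holding ä or the replacement character.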
A better approach is to HTML-encode all extended characters in your HTML, like this:
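function encodeHTML(str) {
  var aStr = str.split(''),
      i = aStr.length,
      aRet = [];
  // Walk backwards; i-- (rather than --i) makes sure index 0 is processed too.
  while (i--) {
    var iC = aStr[i].charCodeAt(0);
    // Encode everything outside A-Z (65-90) and a-z (97-122).
    if (iC < 65 || iC > 122 || (iC > 90 && iC < 97)) {
      aRet.push('&#' + iC + ';');
    } else {
      aRet.push(aStr[i]);
    }
  }
  // Characters were collected in reverse order, so flip them back.
  return aRet.reverse().join('');
}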
Mind you, this function will encode everything that is not [a-zA-Z]. For example, it encodes Doppelgänger as Doppelg&#228;nger.
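One caveat with splitting on '' and using charCodeAt: characters outside the Basic Multilingual Plane (most emoji, for instance) are stored as two UTF-16 code units, and encoding each half separately produces invalid character references. A possible variant, sketched here on the assumption that an ES2015+ engine is available, iterates by code point instead. The name encodeHTMLByCodePoint is made up for this example.

```javascript
// Hypothetical variant of encodeHTML that works per code point.
function encodeHTMLByCodePoint(str) {
  var out = [];
  // for...of iterates a string by code point, keeping surrogate pairs together.
  for (var ch of str) {
    var cp = ch.codePointAt(0);
    var isLetter = (cp >= 65 && cp <= 90) || (cp >= 97 && cp <= 122);
    out.push(isLetter ? ch : '&#' + cp + ';');
  }
  return out.join('');
}

console.log(encodeHTMLByCodePoint('Doppelg\u00e4nger')); // Doppelg&#228;nger
console.log(encodeHTMLByCodePoint('a\uD83D\uDE00'));     // a&#128512;
```

The second call passes U+1F600 as a surrogate pair and gets back a single numeric reference rather than two broken ones.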