How do I... correctly extract Norwegian-language text (with special characters) from a website?

Problem description

Hello,

I want to extract the Norwegian-language text available on the websites below:

Edit (Rohan Leuva): [Links to website removed]

I have prepared a small Windows application with the following code:

    WebClient web = new WebClient();
    System.IO.Stream stream = web.OpenRead("http://www.xyz.com/");
    using (System.IO.StreamReader reader = new System.IO.StreamReader(stream, Encoding.Default, true))
    {
        text = reader.ReadToEnd();
    }
    richTextBox1.Text = text;

With this code, the special characters that appear in Norwegian do not all come through correctly, even though I am using Encoding.Default.

Norwegian special characters: « » … å æ é ø à æ

Kindly suggest how I should proceed. My ultimate objective is to get plain text (without any tags) from these websites, so any additional help on that point would also be welcome.

Thanks in advance for taking the time to read my question. I hope to get a suitable solution.

Regards,
Ankit

Recommended answer

Try Encoding.UTF8 or Encoding.Unicode:

    using (System.IO.StreamReader reader = new System.IO.StreamReader(stream, Encoding.UTF8, true))
    {
        text = reader.ReadToEnd();
    }
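If guessing between Encoding.UTF8 and Encoding.Unicode does not settle it, one option is to decode the page with whatever charset the server declares. On .NET Framework, Encoding.Default is the machine's ANSI code page, which need not match the encoding the page was served in. The sketch below is only an illustration under a few assumptions: the class PageText and its methods are hypothetical names, the server is assumed to send a usable charset in its Content-Type header (a charset declared only in a meta tag is not handled), and the tag stripping is a naive regex pass rather than a real HTML parser. The URL is the placeholder from the question.

```csharp
using System;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original question or answer.
static class PageText
{
    // Decode raw response bytes using the charset named in the Content-Type
    // header. Falls back to UTF-8 when no charset is declared or the name
    // is not recognised (an assumption; pick the fallback that suits the site).
    public static string Decode(byte[] body, string contentType)
    {
        Encoding encoding = Encoding.UTF8;
        if (!string.IsNullOrEmpty(contentType))
        {
            Match m = Regex.Match(contentType, @"charset\s*=\s*""?([\w\-]+)",
                                  RegexOptions.IgnoreCase);
            if (m.Success)
            {
                try { encoding = Encoding.GetEncoding(m.Groups[1].Value); }
                catch (ArgumentException) { /* unknown charset name: keep UTF-8 */ }
            }
        }
        return encoding.GetString(body);
    }

    // Naive plain-text extraction: drop script/style blocks, replace the
    // remaining tags with spaces, decode entities, collapse whitespace.
    public static string StripTags(string html)
    {
        string noScripts = Regex.Replace(html,
            @"<(script|style)\b[^>]*>.*?</\1\s*>", " ",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        string noTags = Regex.Replace(noScripts, @"<[^>]+>", " ");
        string decoded = WebUtility.HtmlDecode(noTags);
        return Regex.Replace(decoded, @"\s+", " ").Trim();
    }

    // Download a page as bytes, then decode and strip it.
    public static string FetchPlainText(string url)
    {
        using (var web = new WebClient())
        {
            byte[] body = web.DownloadData(url);
            string contentType = web.ResponseHeaders[HttpResponseHeader.ContentType];
            return StripTags(Decode(body, contentType));
        }
    }
}
```

In the application from the question this could be called as richTextBox1.Text = PageText.FetchPlainText("http://www.xyz.com/");. Downloading bytes first, instead of wrapping the stream in a StreamReader with a fixed encoding, is what lets the encoding be chosen after the response headers are known. For anything beyond simple pages, a proper HTML parser is likely a safer choice than the regex pass.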