This article covers why Files.lines can throw a MalformedInputException on a file that Files.readAllBytes reads without error, and how to handle it.

Problem description

I would have thought that the following two approaches to reading a file would behave identically, but they don't. The second approach throws a MalformedInputException.

public static void main(String[] args) {    
    try {
        String content = new String(Files.readAllBytes(Paths.get("_template.txt")));
        System.out.println(content);
    } catch (IOException e) {
        e.printStackTrace();
    }

    try(Stream<String> lines = Files.lines(Paths.get("_template.txt"))) {
        lines.forEach(System.out::println);
    } catch (IOException e) {
        e.printStackTrace();
    }
}

This is the stack trace:

Exception in thread "main" java.io.UncheckedIOException: java.nio.charset.MalformedInputException: Input length = 1
    at java.io.BufferedReader$1.hasNext(BufferedReader.java:574)
    at java.util.Iterator.forEachRemaining(Iterator.java:115)
    at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
    at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
    at Test.main(Test.java:19)
Caused by: java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:161)
    at java.io.BufferedReader.readLine(BufferedReader.java:324)
    at java.io.BufferedReader.readLine(BufferedReader.java:389)
    at java.io.BufferedReader$1.hasNext(BufferedReader.java:571)
    ... 4 more

What is the difference here, and how do I fix it?

Solution

This has to do with character encoding. Computers only deal with numbers. To store text, the characters in the text have to be converted to and from numbers, using some scheme. That scheme is called the character encoding. There are many different character encodings; some of the well-known standard character encodings are ASCII, ISO-8859-1 and UTF-8.
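To make the "characters to numbers" idea concrete, here is a minimal sketch that encodes the same short string with the three charsets named above. The class name `EncodingSizes`, the helper `encodedLength`, and the sample text are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: one string maps to different bytes depending on the charset.
class EncodingSizes {

    // Returns how many bytes the text occupies when encoded in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café"; the last character is U+00E9

        // ISO-8859-1 stores U+00E9 as one byte, UTF-8 needs two bytes for it.
        System.out.println("ISO-8859-1: " + encodedLength(text, StandardCharsets.ISO_8859_1) + " bytes");
        System.out.println("UTF-8:      " + encodedLength(text, StandardCharsets.UTF_8) + " bytes");

        // US-ASCII cannot represent U+00E9; getBytes substitutes '?' for it.
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        System.out.println("US-ASCII:   " + new String(ascii, StandardCharsets.US_ASCII));
    }
}
```

The same four characters take four bytes in ISO-8859-1 and five in UTF-8, which is why a reader has to know which scheme was used to write the file.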

In the first example, you read all the bytes (numbers) in the file and then convert them to characters by passing them to the constructor of class String. This will use the default character encoding of your system (whatever it is on your operating system) to convert the bytes to characters.
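If you want to see which charset that is on your machine, a small check along these lines should do. `Charset.defaultCharset()` is the standard API for this; the class name `DefaultCharsetCheck` and the helper `decodeWithDefault` are only illustrative.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows which charset new String(byte[]) relies on.
class DefaultCharsetCheck {

    // Decodes with the platform default charset, spelled out explicitly.
    static String decodeWithDefault(byte[] bytes) {
        return new String(bytes, Charset.defaultCharset());
    }

    public static void main(String[] args) {
        // The value printed here depends on the JVM, OS and locale settings.
        System.out.println("Default charset: " + Charset.defaultCharset());

        byte[] bytes = "plain ASCII text".getBytes(StandardCharsets.US_ASCII);
        // new String(bytes) decodes with the same default charset as the helper above.
        System.out.println(new String(bytes).equals(decodeWithDefault(bytes)));
    }
}
```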

In the second example, where you use Files.lines(...), the UTF-8 character encoding will be used, according to the documentation. When a sequence of bytes is found in the file that is not a valid UTF-8 sequence, you'll get a MalformedInputException.

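The default character encoding of your system may or may not be UTF-8, which can explain the difference in behaviour. (Since Java 18 the default charset is UTF-8 unless you override it.) Note also that the String constructor does not report malformed bytes: in practice it substitutes the replacement character U+FFFD instead of throwing, so the first example can silently print garbled text where Files.lines fails loudly.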
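The difference can be reproduced with a small file that is valid ISO-8859-1 but not valid UTF-8. The sketch below is only an illustration: it writes its own temporary file rather than using your `_template.txt`, and the class and method names are invented.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical demo class reproducing the behaviour described in the question.
class MalformedInputDemo {

    // Writes "café" encoded as ISO-8859-1, so 'é' is the single byte 0xE9.
    static Path writeLatin1Sample() throws IOException {
        Path file = Files.createTempFile("template", ".txt");
        Files.write(file, "caf\u00e9\n".getBytes(StandardCharsets.ISO_8859_1));
        return file;
    }

    // Returns true if Files.lines (which decodes as UTF-8) fails with MalformedInputException.
    static boolean linesFailsAsUtf8(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            lines.forEach(line -> { }); // consume the stream; decoding happens lazily
            return false;
        } catch (UncheckedIOException e) {
            return e.getCause() instanceof MalformedInputException;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = writeLatin1Sample();
        try {
            // Decoding the raw bytes as UTF-8 does not throw; the bad byte becomes U+FFFD.
            String lenient = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            System.out.println("String contains U+FFFD: " + lenient.contains("\uFFFD"));
            System.out.println("Files.lines failed: " + linesFailsAsUtf8(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The exception surfaces as an UncheckedIOException because the stream decodes lazily, inside `forEach`, which is also why the `catch (IOException e)` in the question never sees it.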

You'll have to find out what character encoding is used for the file, and then explicitly use that. For example:

String content = new String(Files.readAllBytes(Paths.get("_template.txt")),
        StandardCharsets.ISO_8859_1);

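Second example (the stream holds an open file, so keep it in try-with-resources):

try (Stream<String> lines = Files.lines(Paths.get("_template.txt"),
        StandardCharsets.ISO_8859_1)) {
    lines.forEach(System.out::println);
}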
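If you are not sure which encoding the file uses, one rough way to narrow it down is to test whether the bytes decode strictly in a candidate charset. The following is a sketch under that assumption, not a real charset detector; `CharsetProbe` and `decodesCleanly` are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: checks whether bytes are valid in a candidate charset.
class CharsetProbe {

    // Returns true if the bytes decode without malformed or unmappable input.
    static boolean decodesCleanly(byte[] bytes, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Stand-in for Files.readAllBytes(...) on the real file.
        byte[] bytes = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);

        System.out.println("Valid UTF-8:      " + decodesCleanly(bytes, StandardCharsets.UTF_8));
        System.out.println("Valid ISO-8859-1: " + decodesCleanly(bytes, StandardCharsets.ISO_8859_1));
    }
}
```

A clean decode is only a hint. ISO-8859-1 accepts every possible byte sequence, so passing that check does not prove the file was written in it; a failed UTF-8 check, on the other hand, does rule UTF-8 out.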


10-15 14:24