This post covers a problem parsing JSON with Jackson in a Hadoop Java MapReduce job, together with a suggested answer.

Problem description

I am using the Jackson JSON parser (1.9.5) in a Hadoop Java M/R program (0.20.205). Given the JSON example below:

{"id":23423423, "name":"abc", "location":{"displayName":"Florida, Rosario","objectType":"place"}, "price":1234.55}

Now, let's say I only want to parse out id, location.displayName, and price, so I created the following Java object and omitted the unwanted fields.

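import java.io.IOException;
import org.codehaus.jackson.JsonNode;
import org.codehaus.jackson.annotate.JsonIgnoreProperties;
import org.codehaus.jackson.map.ObjectMapper;

// Unknown JSON properties (e.g. "name") are skipped instead of causing an error
@JsonIgnoreProperties(ignoreUnknown = true)
public class Transaction {
  private long id;
  private Location location;
  private double price;

  // One shared mapper for all calls to fromJsonDoc
  private static final ObjectMapper mapper = new ObjectMapper();

  public long getId() { return id; }
  public void setId(long id) { this.id = id; }
  public Location getLocation() { return location; }
  public void setLocation(Location location) { this.location = location; }
  public double getPrice() { return price; }
  public void setPrice(double price) { this.price = price; }

  // Only displayName is kept; other location fields (e.g. "objectType") are ignored
  @JsonIgnoreProperties(ignoreUnknown = true)
  public static class Location {
     private String displayName;

     public String getDisplayName() { return displayName; }
     public void setDisplayName(String displayName) { this.displayName = displayName; }
  }

  // Parses one JSON document into a Transaction
  public static final Transaction fromJsonDoc(String jsonDoc) throws IOException {
     JsonNode rootNode = mapper.readTree(jsonDoc);
     return mapper.treeToValue(rootNode, Transaction.class);
  }
}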

When I run this program in standalone mode (not in Hadoop distributed mode), all the fields that I want are parsed out correctly. However, as soon as I try to parse the data in a Hadoop map-only job, I only get the id field and not location.displayName or price (they are not deserialized and are null). It seems that the @JsonIgnoreProperties(ignoreUnknown = true) annotation is somehow not working properly when running in MapReduce, and the fields that I want don't get deserialized (everything after id is null). If I add all the fields, getters, and setters to my Transaction object and remove @JsonIgnoreProperties, then everything works fine.

Does anyone have a suggestion as to why this is happening? I just gave a simple example, but in reality my JSON document is very complex and I don't want to deserialize all of its fields. Am I doing something wrong here?

This is how I am using Jackson in both the main method and the Java MapReduce program.

Transaction tran = Transaction.fromJsonDoc(jsonRec);
System.out.println("id: " + tran.getId());  //works in both
System.out.println("location: " + tran.getLocation().getDisplayName());  //works only in standalone execution but not in Map/Reduce

Solution

This might be due to class loading problems, such as an old version of Jackson core on the classpath. The tricky part with regard to class loading and annotations is that the VM is apparently allowed to just drop annotations it does not recognize. I don't know if this could be causing the problem you have, but it may be worth checking. Hadoop used to bundle a rather old version of Jackson (1.1?), and since @JsonIgnoreProperties was added in 1.4, this might explain the problem.

How could this occur? You must be compiling against a more recent version (to see the annotation), but perhaps the runtime environment is using the old (1.1) version. Because you do not actively use the annotation class from your code (it is "only" associated with the class), the class loader would then drop this annotation, as it cannot find it in the jar.
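
One way to check this theory is to log, from inside the map task, which jar a class was actually loaded from and whether an annotation is still visible on a class at runtime. The following is only a minimal sketch using plain JDK reflection; the class name ClasspathDiagnostics and its two methods are hypothetical helpers introduced here for illustration, not part of Hadoop or Jackson.

```java
import java.lang.annotation.Annotation;
import java.security.CodeSource;

// Hypothetical helper (not part of Hadoop or Jackson): reports where a class
// was loaded from and whether a given annotation survived class loading.
final class ClasspathDiagnostics {

    private ClasspathDiagnostics() {}

    // Returns the jar or directory the class came from, or a marker string
    // for classes without a code source (e.g. JDK bootstrap classes).
    static String loadedFrom(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    // Returns true if an annotation with the given fully qualified class name
    // is visible on the target class at runtime. The annotation is looked up
    // by name so this helper compiles without the annotation's jar.
    static boolean hasAnnotation(Class<?> target, String annotationClassName) {
        for (Annotation annotation : target.getAnnotations()) {
            if (annotation.annotationType().getName().equals(annotationClassName)) {
                return true;
            }
        }
        return false;
    }
}
```

In the mapper, one could print ClasspathDiagnostics.loadedFrom(ObjectMapper.class) to see which Jackson jar wins on the task classpath, and ClasspathDiagnostics.hasAnnotation(Transaction.class, "org.codehaus.jackson.annotate.JsonIgnoreProperties") to see whether the annotation is still present. If the first call points at a jar bundled with Hadoop and the second returns false, that would be consistent with the explanation above, though it would not prove it.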


05-21 06:02