Writing to HBase from a Spark job: a puzzle with existential types

Problem description

I am trying to write a Spark job that should put its output into HBase. As far as I can tell, the right way to do this is to use the method saveAsHadoopDataset on org.apache.spark.rdd.PairRDDFunctions, which requires that my RDD is composed of pairs.

The method saveAsHadoopDataset requires a JobConf, and this is what I am trying to construct. According to this link, one thing I have to set on my JobConf is the output format (in fact it doesn't work without it), like

jobConfig.setOutputFormat(classOf[TableOutputFormat])

The problem is that apparently this does not compile, because TableOutputFormat is generic, even though it ignores its type parameter. So I have tried various combinations, such as

jobConfig.setOutputFormat(classOf[TableOutputFormat[Unit]])
jobConfig.setOutputFormat(classOf[TableOutputFormat[_]])

but in any case I get an error

required: Class[_ <: org.apache.hadoop.mapred.OutputFormat[_, _]]

Now, as far as I can tell, Class[_ <: org.apache.hadoop.mapred.OutputFormat[_, _]] translates to Class[T] forSome { type T <: org.apache.hadoop.mapred.OutputFormat[_, _] }. Here is where I think I have a problem, because:


  • Class is invariant
  • TableOutputFormat[T] <: OutputFormat[T, Mutation], but
  • I am not sure how existential types interact with subtyping in the requirement T <: OutputFormat[_, _]
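The three points above can be checked in isolation with a minimal sketch. It assumes Scala 2 and uses hypothetical stand-in types (OutputFormat, Mutation and TableOutputFormat are declared locally here and are not the Hadoop or HBase classes), so it only illustrates how the existential bound behaves, not what the real libraries do.

```scala
object ExistentialSketch {
  // Hypothetical stand-ins, declared locally; NOT the Hadoop/HBase classes.
  trait OutputFormat[K, V]
  class Mutation
  class TableOutputFormat[K] extends OutputFormat[K, Mutation]

  // Same shape as the required parameter type:
  // Class[T] forSome { type T <: OutputFormat[_, _] }
  def accept(c: Class[_ <: OutputFormat[_, _]]): Class[_ <: OutputFormat[_, _]] = c

  def main(args: Array[String]): Unit = {
    // TableOutputFormat[Unit] <: OutputFormat[Unit, Mutation] <: OutputFormat[_, _],
    // so the existential bound is satisfied although Class itself is invariant.
    val a = accept(classOf[TableOutputFormat[Unit]])
    val b = accept(classOf[TableOutputFormat[_]])

    // Type arguments are erased, so both expressions denote the same runtime class.
    assert(a == b)

    // Invariance of Class: the following line does not compile if uncommented.
    // val c: Class[OutputFormat[_, _]] = classOf[TableOutputFormat[Unit]]

    println(a.getName)
  }
}
```

Under these assumptions the existential bound alone does not reject a generic subclass, which suggests looking at whether TableOutputFormat really extends the OutputFormat named in the error message.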

Is there a way to obtain a subtype of OutputFormat[_, _] from TableOutputFormat? It seems the problem arises from the differences between generics in Java and in Scala. What can I do about this?

Edit:

It turns out this is even subtler. I tried defining a method in the REPL

def foo(x: Class[_ <: OutputFormat[_, _]]) = x

and I can actually call it

foo(classOf[TableOutputFormat[Unit]])

or even

foo(classOf[TableOutputFormat[_]])

for that matter. But I cannot call

jobConf.setOutputFormat(classOf[TableOutputFormat[_]])

The original signature of setOutputFormat in Java is void setOutputFormat(Class<? extends OutputFormat> theClass). How can I call it from Scala?

Recommended answer

That's very strange. Are you 100% sure you have your imports correct (edit: yes, this was the problem, see comments), and that you have the correct versions of the artifacts in your build file? Maybe it will help if I provide a code snippet from my working project:

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.mapred.JobConf
// Old "mapred" API: this TableOutputFormat is not generic, so classOf needs no type argument
import org.apache.hadoop.hbase.mapred.TableOutputFormat

val conf = HBaseConfiguration.create()

val jobConfig: JobConf = new JobConf(conf, this.getClass)
jobConfig.setOutputFormat(classOf[TableOutputFormat])
jobConfig.set(TableOutputFormat.OUTPUT_TABLE, outputTable)
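Since the imports turned out to be the culprit, the following sketch shows how two classes with the same simple name in different packages can produce exactly this kind of error. HBase does ship a TableOutputFormat in both org.apache.hadoop.hbase.mapred and org.apache.hadoop.hbase.mapreduce, and only the former extends the old org.apache.hadoop.mapred.OutputFormat. The nested objects mapred and mapreduce below are hypothetical stand-ins for those packages, and setOutputFormat merely mirrors the shape of the JobConf method; none of it is the real API.

```scala
object ImportSketch {
  // Hypothetical stand-in for the old API (org.apache.hadoop.mapred and hbase.mapred).
  object mapred {
    trait OutputFormat[K, V]
    class TableOutputFormat extends OutputFormat[String, String] // not generic
  }

  // Hypothetical stand-in for the new API (org.apache.hadoop.mapreduce and hbase.mapreduce).
  object mapreduce {
    abstract class OutputFormat[K, V]
    class TableOutputFormat[K] extends OutputFormat[K, String] // generic in its key type
  }

  // Mirrors the shape of JobConf.setOutputFormat: it expects the OLD OutputFormat.
  def setOutputFormat(c: Class[_ <: mapred.OutputFormat[_, _]]): Class[_] = c

  def main(args: Array[String]): Unit = {
    // Compiles: the old-API class is a subtype of the old-API OutputFormat.
    val ok = setOutputFormat(classOf[mapred.TableOutputFormat])
    println(ok.getName)

    // Does not compile if uncommented: the new-API class extends a different
    // OutputFormat, so no choice of type argument can satisfy the bound.
    // setOutputFormat(classOf[mapreduce.TableOutputFormat[_]])
  }
}
```

If this is what happened in the question, it would also explain why the REPL experiment with foo succeeded: foo was presumably declared against whichever OutputFormat was in scope there, while JobConf is fixed to the old mapred one.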

and some of the dependencies I have:

"org.apache.hadoop" % "hadoop-client" % "2.3.0-mr1-cdh5.0.0",
"org.apache.hbase" % "hbase-client" % "0.96.1.1-cdh5.0.0", 
"org.apache.hbase" % "hbase-common" % "0.96.1.1-cdh5.0.0", 

"org.apache.hbase" % "hbase-hadoop-compat" % "0.96.1.1-cdh5.0.0",
"org.apache.hbase" % "hbase-it" % "0.96.1.1-cdh5.0.0",
"org.apache.hbase" % "hbase-hadoop2-compat" % "0.96.1.1-cdh5.0.0",

"org.apache.hbase" % "hbase-prefix-tree" % "0.96.1.1-cdh5.0.0", 
"org.apache.hbase" % "hbase-protocol" % "0.96.1.1-cdh5.0.0", 
"org.apache.hbase" % "hbase-server" % "0.96.1.1-cdh5.0.0",
"org.apache.hbase" % "hbase-shell" % "0.96.1.1-cdh5.0.0", 

"org.apache.hbase" % "hbase-testing-util" % "0.96.1.1-cdh5.0.0", 
"org.apache.hbase" % "hbase-thrift" % "0.96.1.1-cdh5.0.0",


10-20 01:19