Problem description
I am trying to write a Spark job that should put its output into HBase. As far as I can tell, the right way to do this is to use the method saveAsHadoopDataset
on org.apache.spark.rdd.PairRDDFunctions
- this requires that my RDD
is composed of pairs.
The method saveAsHadoopDataset
requires a JobConf
, and this is what I am trying to construct. According to this link, one thing I have to set on my JobConf
is the output format (in fact, it doesn't work without it), like
jobConfig.setOutputFormat(classOf[TableOutputFormat])
The problem is that apparently this does not compile, because TableOutputFormat
is generic, even though it ignores its type parameter. So I have tried various combinations, such as
jobConfig.setOutputFormat(classOf[TableOutputFormat[Unit]])
jobConfig.setOutputFormat(classOf[TableOutputFormat[_]])
but in any case I get an error
required: Class[_ <: org.apache.hadoop.mapred.OutputFormat[_, _]]
Now, as far I can tell, Class[_ <: org.apache.hadoop.mapred.OutputFormat[_, _]]
translates to Class[T] forSome { type T <: org.apache.hadoop.mapred.OutputFormat[_, _] }
. Here is where I think I have a problem, because:
- Class is invariant
- TableOutputFormat[T] <: OutputFormat[T, Mutation], but
- I am not sure how existential types interact with subtyping in the requirement T <: OutputFormat[_, _]
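The invariance worry above can be checked with plain Scala. This is a minimal sketch using hypothetical stand-in types, not the real Hadoop/HBase classes: even though Class is invariant, an upper-bounded existential parameter accepts classOf of any subclass.

```scala
// Hypothetical stand-ins mirroring the shape of the real hierarchy:
// TableOutputFormat[T] <: OutputFormat[T, Mutation] (Mutation replaced by String here).
trait OutputFormat[K, V]
class TableOutputFormat[T] extends OutputFormat[T, String]

// Class is invariant, so Class[TableOutputFormat[Unit]] is NOT a
// Class[OutputFormat[Unit, String]]. But the existential upper bound
// only asks for *some* T <: OutputFormat[_, _], which the subclass satisfies:
def foo(x: Class[_ <: OutputFormat[_, _]]) = x

println(foo(classOf[TableOutputFormat[Unit]]))
println(foo(classOf[TableOutputFormat[_]]))
```

Both calls compile and return the same runtime Class object, since the type argument is erased.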
Is there a way to obtain a subtype of OutputFormat[_, _]
from TableOutputFormat
? It seems the problem arises from the differences between generics in Java and in Scala - what can I do about this?
EDIT:
It turns out this is even subtler. I have tried defining a method myself in the REPL
def foo(x: Class[_ <: OutputFormat[_, _]]) = x
and I can actually call it with
foo(classOf[TableOutputFormat[Unit]])
or even
foo(classOf[TableOutputFormat[_]])
for that matter. But I cannot call
jobConf.setOutputFormat(classOf[TableOutputFormat[_]])
The original signature of setOutputFormat
in Java is void setOutputFormat(Class<? extends OutputFormat> theClass)
. How can I call it from Scala?
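One common workaround for raw Java signatures (not from the original post, and only an assumption that it applies here) is an explicit cast on the Class value; since all type arguments are erased at runtime, the cast cannot fail. Sketched with hypothetical stand-in types rather than the real Hadoop classes:

```scala
// Hypothetical stand-ins, not the real Hadoop/HBase classes.
trait OutputFormat[K, V]
class TableOutputFormat[T] extends OutputFormat[T, String]

// A method whose bound matches what Scala infers for the Java signature.
def setOutputFormat(c: Class[_ <: OutputFormat[_, _]]): Class[_] = c

// Force the Class value to the expected bound with an explicit cast;
// erasure makes this safe at runtime.
val cls = classOf[TableOutputFormat[Unit]].asInstanceOf[Class[_ <: OutputFormat[_, _]]]
println(setOutputFormat(cls))
```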
Answer
That's very strange; are you 100% sure you have your imports correct (EDIT: yes, this was the problem, see comments), and that you have the correct versions of the artefacts in your build file? Maybe it could help if I provide a code snippet from my working project:
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.mapred.JobConf
import org.apache.hadoop.hbase.mapred.TableOutputFormat
val conf = HBaseConfiguration.create()
val jobConfig: JobConf = new JobConf(conf, this.getClass)
jobConfig.setOutputFormat(classOf[TableOutputFormat])
jobConfig.set(TableOutputFormat.OUTPUT_TABLE, outputTable)
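The import in the snippet above is the crucial part: HBase ships two classes named TableOutputFormat. org.apache.hadoop.hbase.mapred.TableOutputFormat (old mapred API) takes no type parameter, while org.apache.hadoop.hbase.mapreduce.TableOutputFormat[KEY] (new mapreduce API) is generic, which is what produced the compile errors in the question. A sketch, with hypothetical stand-in names, of why the non-generic class composes cleanly:

```scala
trait OutputFormat[K, V]

// Stand-ins (hypothetical names) for the two HBase classes:
class MapredTableOutputFormat extends OutputFormat[Array[Byte], String]  // old API: non-generic
class MapreduceTableOutputFormat[K] extends OutputFormat[K, String]      // new API: generic

def setOutputFormat(c: Class[_ <: OutputFormat[_, _]]): String = c.getSimpleName

// With the non-generic class, classOf needs no type argument at all:
println(setOutputFormat(classOf[MapredTableOutputFormat]))
```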
And some of the deps I have:
"org.apache.hadoop" % "hadoop-client" % "2.3.0-mr1-cdh5.0.0",
"org.apache.hbase" % "hbase-client" % "0.96.1.1-cdh5.0.0",
"org.apache.hbase" % "hbase-common" % "0.96.1.1-cdh5.0.0",
"org.apache.hbase" % "hbase-hadoop-compat" % "0.96.1.1-cdh5.0.0",
"org.apache.hbase" % "hbase-it" % "0.96.1.1-cdh5.0.0",
"org.apache.hbase" % "hbase-hadoop2-compat" % "0.96.1.1-cdh5.0.0",
"org.apache.hbase" % "hbase-prefix-tree" % "0.96.1.1-cdh5.0.0",
"org.apache.hbase" % "hbase-protocol" % "0.96.1.1-cdh5.0.0",
"org.apache.hbase" % "hbase-server" % "0.96.1.1-cdh5.0.0",
"org.apache.hbase" % "hbase-shell" % "0.96.1.1-cdh5.0.0",
"org.apache.hbase" % "hbase-testing-util" % "0.96.1.1-cdh5.0.0",
"org.apache.hbase" % "hbase-thrift" % "0.96.1.1-cdh5.0.0",