This article describes how to call JDBC for Impala/Hive from within a Spark job and create a table. It should be a useful reference for anyone facing the same problem.

Problem Description

I am trying to write a Spark job in Scala that opens a JDBC connection to Impala and lets me create a table and perform other operations.

How do I do this? Any example would be of great help. Thank you!

Recommended Answer

import java.sql.DriverManager

val JDBCDriver = "com.cloudera.impala.jdbc41.Driver"
val ConnectionURL = "jdbc:impala://url.server.net:21050/default;auth=noSasl"

// Register the Impala JDBC driver and open a connection
Class.forName(JDBCDriver).newInstance
val con = DriverManager.getConnection(ConnectionURL)
val stmt = con.createStatement()
val rs = stmt.executeQuery(query) // query: the SQL string you want to run

// Walk the ResultSet and convert each row into a spark.sql.Row
val resultSetList = Iterator.continually((rs.next(), rs)).takeWhile(_._1).map(r => {
    getRowFromResultSet(r._2) // (ResultSet) => (spark.sql.Row)
}).toList

// Turn the collected rows into an RDD
sc.parallelize(resultSetList)
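
The answer relies on a user-supplied getRowFromResultSet helper and only shows running a query, while the question also asks about creating a table. Below is a minimal sketch of both pieces, not part of the original answer: the helper simply reads every column as a generic object, and the CREATE TABLE statement uses a placeholder table name (my_test_table) and schema that you would replace with your own.

import java.sql.ResultSet
import org.apache.spark.sql.Row

// Hypothetical helper: build a spark.sql.Row from the ResultSet's current
// cursor position by reading every column as a generic object.
def getRowFromResultSet(rs: ResultSet): Row = {
  val columnCount = rs.getMetaData.getColumnCount
  Row.fromSeq((1 to columnCount).map(i => rs.getObject(i)))
}

// DDL such as CREATE TABLE goes through the same Statement; the table
// name and columns below are placeholders, not from the original answer.
val ddl =
  """CREATE TABLE IF NOT EXISTS my_test_table (
    |  id   INT,
    |  name STRING
    |) STORED AS PARQUET""".stripMargin
stmt.execute(ddl)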

This concludes the article on calling JDBC for Impala/Hive from a Spark job and creating a table. We hope the recommended answer is helpful.
