This article walks through how to deal with Spark failing to find a JDBC driver. It should be a useful reference for anyone hitting the same problem; read on to learn more.

Problem Description

So I've been using sbt with assembly to package all of my dependencies into a single jar for my Spark jobs. I've got several jobs where I use c3p0 to set up connection pool information, broadcast that out, and then use foreachPartition on the RDD to grab a connection and insert the data into the database. In my sbt build script, I include

"mysql" % "mysql-connector-java" % "5.1.33"

This makes sure the JDBC connector is packaged up with the job. Everything works great.
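For reference, that pattern looks roughly like the sketch below. This is only an illustration under assumptions: the data, table, columns, and connection settings are placeholders, not the asker's actual job.

import com.mchange.v2.c3p0.ComboPooledDataSource
import org.apache.spark.{SparkConf, SparkContext}

object InsertJob {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("insert-job"))
    val rdd = sc.parallelize(Seq((1, "a"), (2, "b"), (3, "c")))

    rdd.foreachPartition { rows =>
      // Build a c3p0 pool on the executor; every setting here is a placeholder.
      val ds = new ComboPooledDataSource()
      ds.setDriverClass("com.mysql.jdbc.Driver") // Connector/J 5.x class name
      ds.setJdbcUrl("jdbc:mysql://some.domain.com/myschema")
      ds.setUser("user")
      ds.setPassword("password")

      val conn = ds.getConnection()
      val stmt = conn.prepareStatement("INSERT INTO mytable (id, value) VALUES (?, ?)")
      try {
        rows.foreach { case (id, value) =>
          stmt.setInt(1, id)
          stmt.setString(2, value)
          stmt.executeUpdate()
        }
      } finally {
        stmt.close()
        conn.close()
        ds.close()
      }
    }
    sc.stop()
  }
}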

So recently I started playing around with SparkSQL and realized that, with the new features in 1.3.0, it's much easier to simply take a DataFrame and save it to a JDBC source.
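In 1.3.0 that looks something like the following sketch. The URL and table name are placeholders, and createJDBCTable was the DataFrame API of that era (later versions replaced it with df.write.jdbc).

import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)
import sqlContext.implicits._

val df = sc.parallelize(Seq((1, "a"), (2, "b"))).toDF("id", "value")

// Spark 1.3.0 API for saving a DataFrame to a JDBC source;
// the URL and table name are placeholders.
df.createJDBCTable(
  "jdbc:mysql://some.domain.com/myschema?user=user&password=password",
  "mytable",
  false) // allowExisting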

When I try to save the DataFrame this way, I get the following exception:

java.sql.SQLException: No suitable driver found for jdbc:mysql://some.domain.com/myschema?user=user&password=password
        at java.sql.DriverManager.getConnection(DriverManager.java:596)
        at java.sql.DriverManager.getConnection(DriverManager.java:233)

When I was running this locally, I got around it by setting

SPARK_CLASSPATH=/path/where/mysql-connector-is.jar
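That is, the connector jar is put on the classpath when the job is launched; for example (the jar path and main class below are placeholders):

SPARK_CLASSPATH=/path/where/mysql-connector-is.jar bin/spark-submit --class MyJob my-job-assembly.jar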

Ultimately, what I want to know is: why is the job not capable of finding the driver when it should be packaged up with it? My other jobs never had this problem. From what I can tell, both c3p0 and the DataFrame code make use of java.sql.DriverManager (which handles importing everything for you, from what I can tell), so it should work just fine. If there is something that prevents the assembly method from working, what do I need to do to make it work?

Recommended Answer

Someone ran into a similar problem here: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-use-DataFrame-with-MySQL-td22178.html

Have you updated your connector driver to the most recent version? And did you specify the driver class when you called load()?

import java.util.HashMap;
import java.util.Map;
import org.apache.spark.sql.DataFrame;

Map<String, String> options = new HashMap<String, String>();
options.put("url", "jdbc:mysql://localhost:3306/video_rcmd?user=root&password=123456");
options.put("dbtable", "video");
// Name the driver class explicitly: com.mysql.jdbc.Driver for Connector/J 5.x
// (it is com.mysql.cj.jdbc.Driver for Connector/J 8.x).
options.put("driver", "com.mysql.jdbc.Driver");
DataFrame jdbcDF = sqlContext.load("jdbc", options);

In spark/conf/spark-defaults.conf, you can also set spark.driver.extraClassPath and spark.executor.extraClassPath to the path of your MySQL driver .jar.
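For example, the two entries in spark-defaults.conf would look like this (the jar path is a placeholder):

spark.driver.extraClassPath   /path/to/mysql-connector-java-5.1.33.jar
spark.executor.extraClassPath /path/to/mysql-connector-java-5.1.33.jar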

That concludes this article on Spark failing to find a JDBC driver. We hope the recommended answer above is helpful; thanks for reading!
